How to Replace a Faulty Cisco Switch from a Stack and Add a New Switch
Introduction
Cisco StackWise technology allows multiple physical switches to operate as a single logical unit, sharing a common configuration, management IP, and forwarding table. In a production environment, a switch within a stack can sometimes fail due to hardware faults, power issues, or software corruption.
This article provides a complete, step-by-step guide to safely identifying and removing a faulty Cisco switch from an existing stack, and then adding a new replacement switch — all while minimising network downtime. The procedure applies primarily to Cisco Catalyst series switches supporting StackWise, StackWise Plus, and StackWise-480 (e.g., Catalyst 3750, 3850, 9300 series).
This guide assumes you have console or SSH access to the stack master (active switch) before beginning. Always maintain a backup of your running configuration before making any hardware changes.
Prerequisites
- Physical access to the switch stack in the rack
- Console cable or SSH access to the stack master
- A replacement Cisco switch of the same model and series
- StackWise stacking cables (same type as existing stack)
- The correct IOS/IOS-XE software version for the replacement switch
- A copy of the current running configuration (exported to TFTP or local flash)
- Basic understanding of Cisco IOS CLI commands
Removing the active stack master will trigger a re-election and may cause a brief traffic disruption. Identify the master before proceeding and plan maintenance accordingly.
Understanding Cisco Stack Roles
Before replacing any switch, it is important to understand the roles each member plays in the stack. Each switch in a Cisco stack is assigned a role based on priority and election criteria.
| Role | Description | Impact if Removed |
|---|---|---|
| Active Master | Controls the entire stack, holds the running config and routing table | Triggers master re-election; brief traffic interruption possible |
| Standby | Ready to take over as master if active fails; syncs state | Minimal impact; another member becomes standby |
| Member | Forwards traffic; operates under master's control | Only ports on that switch go down; stack continues operating |
Best practice: always ensure the faulty switch is a member (not master) before physical removal. If it is the master, reload it first so another switch takes over the master role.
Step 1 — Verify the Stack and Identify the Faulty Switch
Log in to the stack master via console or SSH and run the following commands to view the current stack topology and identify which member is faulty.
Check all stack members and their status:
show switch
Sample output showing a faulty member:
Switch/Stack Mac Address : 0011.2233.4455
                                             H/W   Current
Switch#  Role     Mac Address     Priority Version  State
--------------------------------------------------------------------
*1       Active   0011.2233.4401     15     V04     Ready
 2       Standby  0011.2233.4402     10     V04     Ready
 3       Member   0011.2233.4403     1      V04     Provisioned
 4       Member   0011.2233.4404     1      V04     Removed
In the above output, Switch 4 shows a state of Removed, indicating it is
faulty or has lost stack connectivity. Switch 3 shows Provisioned which
means the configuration slot exists but the physical switch is not responding.
For detailed hardware and power status of a specific member:
show switch 4 detail
show platform
Note down the Switch Number of the faulty switch — you will need it during the removal and renumbering steps.
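When several stacks have to be checked, the same inspection can be scripted. The following is a minimal Python sketch, not a Cisco tool: it assumes the show switch output has already been captured as text and that each member row has the six columns shown in the sample above. The names parse_show_switch and not_ready are hypothetical helpers introduced here for illustration.

```python
# Illustrative capture; the rows mirror the sample output in this article.
SAMPLE_SHOW_SWITCH = """\
Switch/Stack Mac Address : 0011.2233.4455
H/W Current
Switch# Role Mac Address Priority Version State
--------------------------------------------------------------------
*1 Active 0011.2233.4401 15 V04 Ready
2 Standby 0011.2233.4402 10 V04 Ready
3 Member 0011.2233.4403 1 V04 Provisioned
4 Member 0011.2233.4404 1 V04 Removed
"""


def parse_show_switch(output):
    """Return one dict per member row found in captured 'show switch' text."""
    members = []
    for line in output.splitlines():
        stripped = line.strip()
        # A leading asterisk flags one row (the active switch in the sample).
        marked = stripped.startswith("*")
        fields = stripped.lstrip("*").split()
        # Skip headers, separators and anything that is not a member row.
        if len(fields) < 6 or not fields[0].isdigit() or not fields[3].isdigit():
            continue
        members.append({
            "number": int(fields[0]),
            "role": fields[1],
            "mac": fields[2],
            "priority": int(fields[3]),
            "hw_version": fields[4],
            # States can contain spaces, so join whatever is left.
            "state": " ".join(fields[5:]),
            "marked": marked,
        })
    return members


def not_ready(members):
    """Members whose state is anything other than Ready."""
    return [m for m in members if m["state"] != "Ready"]


for member in not_ready(parse_show_switch(SAMPLE_SHOW_SWITCH)):
    print(f"Switch {member['number']} ({member['role']}): {member['state']}")
```

Real output varies between platforms and releases, so treat the column positions as an assumption and check them against your own capture before relying on the result.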
Step 2 — Back Up the Running Configuration
Before making any changes, always export the current running configuration. This ensures you can restore the stack state if anything goes wrong.
Save the running config to the startup config:
copy running-config startup-config
Export to a TFTP server for an off-device backup:
copy running-config tftp://192.168.1.100/stack-backup.cfg
Alternatively, save to flash:
copy running-config flash:stack-backup.cfg
Keep a printed or text copy of critical interface configurations, VLANs, and IP addressing in case the replacement switch needs manual configuration.
Step 3 — Gracefully Remove the Faulty Switch from Stack
The removal path depends on the faulty switch's role and on whether it still responds. A regular member can simply be pulled, whereas an Active Master should first hand the master role to another member to reduce traffic impact.
If the faulty switch is a regular member (not master or standby), you can safely power it off and disconnect the stacking cables without issuing any CLI commands first. The stack will automatically remove it from the topology.
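Optionally, remove the provisioned slot from the configuration. Do this only if no replacement will take over member number 4, because the command also deletes the interface configuration stored for that slot. Enter it in global configuration mode after the switch has left the stack: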
no switch 4 provision
If the faulty switch is the active master, first change the priority of another member to make it the preferred new master, then reload the faulty master:
switch 2 priority 15
reload slot 1
After reload, Switch 2 will become the new active master. Verify using show switch.
Now the old master (Switch 1) is just a member and can be physically removed safely.
If the faulty switch is completely unresponsive (no console, no ping, no stack heartbeat), proceed directly to physical removal. Power off the faulty unit from the power strip or pull the power cable, then disconnect both stacking cables. Reconnect remaining stack members with a stacking cable to maintain the ring topology.
Breaking the stack ring without reconnecting the remaining members will degrade stack performance. Always reconnect cables promptly.
Step 4 — Physical Removal of the Faulty Switch
Once the CLI steps are complete, proceed with the physical removal from the rack. Follow these steps carefully:
- Power off the faulty switch by disconnecting its power cable(s), including any redundant power supply
- Disconnect both StackWise stacking cables from the faulty switch's stack ports
- Label and disconnect all network patch cables from the faulty switch's ports
- Slide the switch out of the rack and set it aside
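- Close the ring between the remaining members: connect the stack ports left vacant on the two neighbouring switches with a single stacking cable (two cable ends cannot be plugged into each other)
- Verify the stack with show switch (all remaining members should show Ready) and confirm the ring is closed with show switch neighbors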
Never leave the stack in a broken ring (open chain) state for longer than necessary. A broken ring means the stack is operating in half-bandwidth mode and is vulnerable to a complete split-brain failure if another cable fails.
Step 5 — Prepare the New Replacement Switch
Before inserting the new switch into the stack, you must ensure it is running the correct IOS/IOS-XE version and is configured with the correct stack member number.
Boot the new switch standalone (not connected to the stack) and check its IOS version:
show version
Compare the IOS version with the running stack members. All members must run the same major software version for Auto-Upgrade to work reliably.
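The comparison can also be done on captured show version text. The sketch below is illustrative only: it assumes the output contains a line with the word Version followed by the release string, and the two sample lines are made-up examples of that shape rather than captures from a real device. extract_version and versions_match are hypothetical helper names.

```python
import re

# First "Version <release>" token in the captured text, e.g. 16.12.4 or 12.2(55)SE10.
VERSION_RE = re.compile(r"Version\s+([0-9][0-9A-Za-z.()]*)")


def extract_version(show_version_output):
    """Return the first release string found, or None if there is none."""
    match = VERSION_RE.search(show_version_output)
    return match.group(1) if match else None


def normalise(version):
    """Split a release string into parts and drop leading zeros (04 -> 4)."""
    parts = [p for p in re.split(r"[.()]", version) if p]
    return tuple(p.lstrip("0") or "0" for p in parts)


def versions_match(stack_output, new_switch_output):
    """True only when both outputs yield a release string and they agree."""
    stack = extract_version(stack_output)
    new = extract_version(new_switch_output)
    if stack is None or new is None:
        return False
    return normalise(stack) == normalise(new)


# Assumed example lines, for illustration only.
STACK_LINE = "Cisco IOS XE Software, Version 16.12.04"
NEW_SWITCH_LINE = "Cisco IOS Software, Version 16.12.4, RELEASE SOFTWARE (fc5)"

print(versions_match(STACK_LINE, NEW_SWITCH_LINE))
```

A matching release string is only part of the picture; this sketch does not compare licence level or feature set, which you should still check by hand.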
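If the software does not match, either upgrade the new switch manually or rely on Cisco's Auto-Upgrade feature. Default behaviour differs by platform and release; on 3850/9300 series stacks it generally has to be enabled on the stack in global configuration mode, so confirm this for your software version:
software auto-upgrade enable
Assign the new switch a stack member number (e.g., number 4) before connecting it. The general syntax is:
switch stack-member-number renumber new-stack-member-number
For example, to change a standalone switch from member 1 to member 4:
switch 1 renumber 4
reload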
The renumber command takes effect after a reload. If you skip this step, the stack master will automatically assign the lowest available number to the new member when it joins.
Step 6 — Add the New Switch into the Stack
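The new switch goes between the two former neighbours of the faulty unit. If you closed the ring with a direct cable between those neighbours, re-cable first so that each neighbour again has one stacking cable with a free end. You can then insert the new switch into the ring.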
- Slide the new switch into the rack in the position vacated by the faulty unit
- Connect the first free stacking cable end into Stack Port 1 of the new switch
- Connect the second free stacking cable end into Stack Port 2 of the new switch
- Connect the power cable to the new switch
- Power on the new switch
The new switch will boot, detect the stack, and begin the join process. The stack master will push the IOS software to the new member if Auto-Upgrade is enabled and versions differ. This process can take 5–15 minutes depending on IOS image size.
Do not power cycle the stack master or interrupt the stack during the software upgrade process. Interrupting the image copy can leave the new member in an unbootable state.
Step 7 — Verify the New Switch Has Joined the Stack
Once the new switch has completed its boot cycle and software upgrade (if applicable), verify it has successfully joined the stack and is in a Ready state.
Check all stack members:
show switch
Expected output after successful replacement:
Switch/Stack Mac Address : 0011.2233.4455
                                             H/W   Current
Switch#  Role     Mac Address     Priority Version  State
--------------------------------------------------------------------
*1       Active   0011.2233.4401     15     V04     Ready
 2       Standby  0011.2233.4402     10     V04     Ready
 3       Member   0011.2233.4403     1      V04     Ready
 4       Member   0011.2233.4405     1      V04     Ready
Verify the stacking ring is intact and bandwidth is full:
show switch stack-ring speed
show switch neighbors
Verify the new member's interfaces are visible:
show interfaces status | include Gi4
All members showing Ready and the ring showing full speed confirms a successful stack replacement. The new switch will automatically inherit VLAN, interface, and port-channel configurations that were pre-provisioned for its slot number.
Step 8 — Restore Interface Configurations
If the replacement switch was assigned the same stack member number as the faulty one, all interface configurations (access VLAN, trunk, port-channel) from the startup-config will be automatically applied. Verify critical interfaces are up:
show interfaces GigabitEthernet4/0/1 status
If the new switch received a different member number, manually re-apply configurations for its interfaces, replacing the leading 4 in each interface name with the new member number. First view the relevant section from your backup config, then apply it from global configuration mode:
configure terminal
interface GigabitEthernet4/0/1
 description SERVER-01
 switchport mode access
 switchport access vlan 100
 spanning-tree portfast
 no shutdown
end
Save the updated configuration:
copy running-config startup-config
Reconnect all patch cables to their original ports on the new switch. Update your network documentation to reflect the new switch's MAC address and serial number.
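If the replacement ended up with a different member number, the interface sections from the backup can be rewritten before they are pasted back in. The following Python sketch shows one way to do that. It assumes interface names follow the member/module/port pattern used in this article (for example GigabitEthernet4/0/1); remap_member is a hypothetical helper, and member 5 is just an example target.

```python
import re

# Interface names of the form <type><member>/<module>/<port>,
# e.g. GigabitEthernet4/0/1, TenGigabitEthernet4/1/1 or the short form Gi4/0/1.
IFACE_RE = re.compile(
    r"\b(?P<type>[A-Za-z]+Ethernet|Gi|Te|Fa)"
    r"(?P<member>\d+)/(?P<module>\d+)/(?P<port>\d+)\b"
)


def remap_member(config_text, old_member, new_member):
    """Rewrite interface names from old_member to new_member, leaving others alone."""
    def swap(match):
        if int(match.group("member")) != old_member:
            return match.group(0)
        return (f"{match.group('type')}{new_member}/"
                f"{match.group('module')}/{match.group('port')}")
    return IFACE_RE.sub(swap, config_text)


BACKUP_SNIPPET = """\
interface GigabitEthernet4/0/1
 description SERVER-01
 switchport mode access
 switchport access vlan 100
 spanning-tree portfast
 no shutdown
interface GigabitEthernet1/0/4
 description UPLINK
"""

print(remap_member(BACKUP_SNIPPET, old_member=4, new_member=5))
```

Only the member digit of matching interface names changes; VLAN IDs, descriptions and interfaces on other members are left untouched. Review the rewritten text before applying it, since references such as port-channel membership may need attention that this sketch does not handle.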
Troubleshooting Common Issues
If the new switch boots but does not appear in show switch, check the
stacking cables are firmly seated in the correct stack ports. Try swapping the
cable connections between Port 1 and Port 2. Also verify the switch model is
compatible with the existing stack (same Catalyst series).
show switch detail
show log | include STACKMGR
If Auto-Upgrade fails, manually copy the correct IOS image to the new switch's flash before connecting it to the stack. Boot the switch standalone, then:
copy tftp://192.168.1.100/cat3k_caa-universalk9.bin flash:
boot system flash:cat3k_caa-universalk9.bin
reload
After the switch is on the correct version, connect it to the stack.
A broken ring causes the stack to operate as a chain. Confirm the ring topology by checking neighbours. If a port shows as not connected, reseat or replace the stacking cable between those two members.
show switch neighbors
show switch stack-ports summary
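For a quick scripted check, the neighbour listing can be scanned for stack ports that have no neighbour. This is a rough Python sketch under a stated assumption: the captured show switch neighbors text has one row per member with three columns (switch number, Port 1 neighbour, Port 2 neighbour) and a non-numeric entry such as None where nothing is connected. The sample text is illustrative, and open_stack_ports is a hypothetical helper name.

```python
# Assumed layout of a capture from a three-member stack with an open ring.
SAMPLE_NEIGHBORS = """\
  Switch #    Port 1       Port 2
  --------    ------       ------
      1         2           None
      2         3             1
      3        None           2
"""


def open_stack_ports(neighbors_output):
    """Return (switch, port) pairs whose neighbour column is not a switch number."""
    open_ports = []
    for line in neighbors_output.splitlines():
        fields = line.split()
        # Member rows have exactly three columns and start with the switch number.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        for port, neighbour in enumerate(fields[1:], start=1):
            if not neighbour.isdigit():
                open_ports.append((int(fields[0]), port))
    return open_ports


gaps = open_stack_ports(SAMPLE_NEIGHBORS)
if gaps:
    for switch, port in gaps:
        print(f"Switch {switch} stack port {port} has no neighbour")
else:
    print("Every stack port reports a neighbour")
```

An empty result suggests a closed ring; any reported pair points at the cable to reseat or replace.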
If the new switch is assigned a member number already in use, the stack will automatically assign it the lowest available number instead. To manually resolve a conflict, renumber one of the members while standalone before re-connecting to the stack:
switch 1 renumber 5
reload
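To predict which number a conflicting switch is likely to receive, the lowest-available rule can be expressed in a few lines of Python. This is only a sketch of that rule as described in this article. The default limit of 9 members is an assumption and differs by platform, so pass the limit that applies to your stack. lowest_free_member is a hypothetical helper name.

```python
def lowest_free_member(in_use, max_members=9):
    """Return the lowest member number not in use, or None if the stack is full.

    max_members is an assumed platform limit; check it for your hardware.
    """
    taken = set(in_use)
    for number in range(1, max_members + 1):
        if number not in taken:
            return number
    return None


# Members 1-3 are present, so a switch arriving with a duplicate number
# would be expected to become member 4.
print(lowest_free_member([1, 2, 3]))
```

If the predicted number is not the one you want, renumber the switch while it is standalone as shown above.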
Quick Reference — Key Commands
| Command | Purpose |
|---|---|
| show switch | List all stack members, roles, and states |
| show switch detail | Detailed hardware and uptime per member |
| show switch neighbors | Display stack ring topology and cable connections |
| show switch stack-ring speed | Confirm ring is operating at full bandwidth |
| switch X priority Y | Set master election priority for a member |
| reload slot X | Reload a specific stack member only |
| switch X renumber Y | Assign a new member number to a switch |
| no switch X provision | Remove a provisioned (absent) member slot |
| show version | Check IOS version and uptime per switch |
| copy running-config startup-config | Save configuration to NVRAM |
Conclusion
Replacing a faulty Cisco switch in a stack is a structured process that, when followed carefully, can be completed with minimal or zero impact to the rest of the network. The key steps are: identify the faulty member, back up your configuration, gracefully transfer roles if needed, physically swap the hardware, and verify the new member has joined with a Ready state.
Cisco StackWise technology is designed to make this kind of maintenance straightforward — the stack master automatically provisions the new member with the correct software and configuration for its slot, provided Auto-Upgrade is enabled and the stack member number is correctly assigned.
After a successful replacement, update your network inventory records with the new switch's serial number, MAC address, and installation date. This helps with future maintenance planning and RMA tracking.
For large enterprise environments managing multiple stacks, consider using Cisco DNA Center or Cisco Prime Infrastructure to automate switch provisioning and reduce manual steps during replacements.