The short version: When changing to an "inferior" root bridge (higher priority than the current root bridge), what is the expected downtime?

We shortly need to upgrade a Sup2T to 15.2(1)SY1a to be able to use C6800-16P10G modules. It functions as one of two gateways for a smallish datacenter and is currently root bridge for a lot of VLANs. The topology is generally like this:

+------------+   +------------+___  ___+------------+   +------------+
| Access (a) |---| Access (b) |   \/   | Access (c) |---| Access (d) |
+------------+   +------------+   /\   +------------+   +------------+
       |                         /  \                         |
       |           +-----------+/    \+-----------+           |
       +-----------|           |      |           |-----------+
        - - - -----| Sup2T (a) |==()==| Sup2T (b) |----- - - -
            - - ---|           |      |           |--- - -
                   +-----------+      +-----------+

And then with a few dozen access switches in similar "square" configurations. The access switches are generally HP 5900AF and Cisco 3560/3760/3560X. There is no VSS/stacking/IRF in play.

Every time we have had to service one side we have moved the root away from that side by configuring a lower priority on the other side. This means almost no downtime, on the order of less than 50 ms. The problem is that many VLANs now have a configured priority of 0 since we have had to move the root that many times (we have "extend system-id" configured, and need to). So we now have to switch to an inferior root.

We had expected an election of such an inferior root to take around 50 seconds (2 * fwd_delay + max_age), but when testing we see much faster convergence. We typically see 3-4 seconds of complete loss and a few more seconds with "some" loss. In every test everything has converged after 8 seconds.

The longer question is then: Are we possibly not testing this right? Can we actually expect just 4-8 seconds convergence time? Is there some trick to electing a new inferior root bridge? I can think of several ways this can work but cannot seem to find any documentation about it.
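For reference, here is how the 50-second estimate and the "inferior" comparison work out, as a minimal sketch. It assumes the default 802.1D timers (forward delay 15 s, max age 20 s); the function names, the VLAN number and the MAC addresses are made up for illustration and are not taken from our switches.

```python
# Default 802.1D timers in seconds (assumption: nothing has been tuned).
FORWARD_DELAY = 15
MAX_AGE = 20


def legacy_worst_case(forward_delay=FORWARD_DELAY, max_age=MAX_AGE):
    """Max age expiry plus listening and learning: max_age + 2 * fwd_delay."""
    return max_age + 2 * forward_delay


def bridge_id(priority, vlan, mac):
    """Comparable bridge ID with extended system-id; the lower tuple wins."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    # The 12-bit system ID extension carries the VLAN number.
    return (priority + vlan, int(mac.replace(":", ""), 16))


# Hypothetical MAC addresses and VLAN, for illustration only.
current_root = bridge_id(0, 10, "00:00:00:00:00:0a")
new_root = bridge_id(4096, 10, "00:00:00:00:00:01")

print(legacy_worst_case())      # 50
print(new_root > current_root)  # True: the new root is "inferior"
```

With the current root already at priority 0 there is no lower multiple of 4096 left, which is why the next move has to be to a numerically higher (inferior) bridge ID.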
One way this could work: if neighbors start seeing BPDUs with the current root bridge MAC address but another bridge priority, they might just start a new election without waiting for the current root bridge to time out. The switchover happens with the current root bridge online, so we're not talking about the scenario where the current root bridge dies.

The testing has so far been with a single host connected to a VLAN where we change priority, running a 100 pps "fping" to another host on another VLAN that isn't affected. Typical downtime (switchover happened just before 11:19:57):

[11:19:55]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.26/0.47
[11:19:56]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.26/0.36
[11:19:57]: 10.83.65.10 : xmt/rcv/%loss = 101/75/25%, min/avg/max = 0.00/0.26/0.33
[11:19:58]: 10.83.65.10 : xmt/rcv/%loss = 101/0/100%
[11:19:59]: 10.83.65.10 : xmt/rcv/%loss = 101/0/100%
[11:20:00]: 10.83.65.10 : xmt/rcv/%loss = 101/51/49%, min/avg/max = 0.00/0.26/0.32
[11:20:01]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.26/0.34
[11:20:02]: 10.83.65.10 : xmt/rcv/%loss = 100/97/3%, min/avg/max = 0.00/0.31/0.78
[11:20:03]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.29/0.35
[11:20:04]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.30/0.40
[11:20:05]: 10.83.65.10 : xmt/rcv/%loss = 101/99/1%, min/avg/max = 0.00/0.29/0.53
[11:20:06]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.26/0.34
[11:20:07]: 10.83.65.10 : xmt/rcv/%loss = 101/101/0%, min/avg/max = 0.00/0.26/0.40

The Sup2T that takes over is running 15.2(1)SY1a and the current root is running 15.0(1)SY6. The Cisco access switches typically run either a 12.2(55)SE or 15.0(1)SE flavour. A very few have not been upgraded for years and run 12.2(35)SE5.

Thank you in advance!
-- Peter

_______________________________________________
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/