Hello list,

I'm currently a little stuck and could use some help deciding how to improve our current setup. We run a network in which all customer VLANs are bridged, because the same VLAN is usually required in different areas of the network. This is the setup:



Room A:      +--------+
             | SX1600 |--------> [ 2nd SuperX not installed yet ]
             +--------+
                  |                 |
                  |                 |
             +--------+        +--------+
             | MX480  |--------|  MX80  |
             +--------+        +--------+
                  |                 |
                  |                 |
Room B:      +--------+        +--------+
             | SX400  |--------| SX400  |
             +--------+        +--------+



Both MX routers are connected to each other by a 10G link with RSTP active, and so are the two SuperXes in Room B. These are the priorities:

MX480: 0 (root bridge)
MX80: 4k (backup root)
SX400: both 16k
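
To make the intended election explicit, here is a minimal Python sketch of how the root and backup root fall out of these priorities. It assumes "4k" and "16k" mean 4096 and 16384, uses made-up MAC addresses from the documentation range as tie-breakers, and ignores the RSTP/MSTP boundary towards the SX1600, so treat it as an illustration rather than a model of the real protocol exchange.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    name: str
    priority: int  # configured bridge priority
    mac: str       # hypothetical MAC, only used as a tie-breaker

    @property
    def bridge_id(self):
        # Lower priority wins; on equal priority the lower MAC wins.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges):
    """Return the bridge with the numerically lowest bridge ID."""
    return min(bridges, key=lambda b: b.bridge_id)


bridges = [
    Bridge("MX480", 0, "00:00:5e:00:53:01"),
    Bridge("MX80", 4096, "00:00:5e:00:53:02"),
    Bridge("SX400-1", 16384, "00:00:5e:00:53:03"),
    Bridge("SX400-2", 16384, "00:00:5e:00:53:04"),
    Bridge("SX1600", 16384, "00:00:5e:00:53:05"),
]

root = elect_root(bridges)
backup = elect_root([b for b in bridges if b is not root])
print(root.name, backup.name)
```

With these values the MX480 is elected root and the MX80 takes over if the MX480 disappears. The three SuperXes share one priority, so among them only the MAC address would decide.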


Because topology changes caused some minor packet loss in Room B, I installed the SX1600 with MSTP instead of RSTP to see whether that performs better. In tests before connecting customers to the SX1600, the results looked fine. We proceeded with the setup, replaced the old Cisco 6509/sup32 with the SX1600, and turned all VLANs that had been routed on the Cisco into bridged VLANs.

I'm running just one instance of MSTP (CIST) on the SX1600 with the following configuration:


mstp scope all
mstp instance 0 vlan 1
mstp instance 0 vlan 19
...
mstp instance 0 priority 16384
mstp edge-port-auto-detect
mstp start


On this SX1600, most uplinks go to individual switches, usually HP ProCurve 2600 or 2800 series. Although we manage those switches, customers can install cables themselves. And this is where the problem actually starts: a rack with two ProCurve switches receives two uplinks from the same SX1600, and the two switches are also connected to each other, which creates a loop.

No matter what I did, the loop kept causing trouble for the whole network, because the MX routers saw topology changes all the time (at intervals between a few seconds and about 200 seconds) and flushed the whole ARP cache each time. With about 90,000 active ARP entries, this had a fairly heavy impact on the servers behind them. Although STP was active on both HP switches, the problem did not go away, and the topology change itself apparently was not visible on the SX1600.

To resolve the issue we had to remove the cable causing the loop, but that can't be the solution: customers may create a new loop at any time, and what is the point of running STP if you have to take care of that yourself?
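
For anyone who wants the loop spelled out, here is a small Python sketch that finds the link closing the cycle in that rack. The names `procurve-1` and `procurve-2` are placeholders of mine, and the union-find approach is only a way to show which link is redundant; real STP picks the blocked port from path cost and port IDs, not from the order of a list.

```python
def find_redundant_links(links):
    """Return the links that close a loop, using union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected, so this link forms a loop.
            redundant.append((a, b))
        else:
            parent[root_a] = root_b
    return redundant


links = [
    ("SX1600", "procurve-1"),      # first uplink
    ("SX1600", "procurve-2"),      # second uplink
    ("procurve-1", "procurve-2"),  # customer-installed cable
]
print(find_redundant_links(links))
```

Exactly one of the three links has to end up blocked for the rack to be loop-free. In this sketch it is the cross-connect, simply because it comes last in the list.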

The question now is how to proceed and how to improve the setup in general. Does it make sense to change RSTP to MSTP on the MX routers in the first place? Is there any configuration I should apply on any of the devices involved? Since many of you are most likely from the Cisco world, here is a list of the available commands on the SuperX running in MST mode:

SSH@A.cs0 (config)#mst
  admin-edge-port         Define this port to be an edge port
  admin-pt2pt-mac         Define this port to be a point-to-point link
  disable                 Disable MSTP on this interface
  edge-port-auto-detect   Enable/Disable auto-detect edge port
  force-migration-check   Trigger port's migration state machine check
  force-version           Configure MSTP force version
  forward-delay           Configure bridge parameter forward-delay
  hello-time              Configure bridge parameter hello-time
  instance                Configure MSTP instance VLAN membership
  max-age                 Configure bridge parameter max-age
  max-hops                Configure MSTP max-hops
  name                    Configure MSTP configuration name
  revision                Configure MSTP revision level
  scope                   Configure MSTP scope
  start                   Start/stop MSTP operation


Inside the interface configuration there is no way to configure e.g. bpdu-protect on a port, but root-protect is configured on every port facing customer switches.


I would be grateful for any hints. I am aware that some of you might declare the setup broken, but for colocation services where the same VLAN might be required campus-wide, it is hard to improve on it without installing tons of cables. Furthermore, we want to eliminate the dependency on a single big core switch. Both rooms are equally important. In the past we had a big core in Room A with downlinks to smaller core switches in Room B, but whenever the big core had a problem, everything went down.


Thanks for reading this far; hopefully some great ideas will follow. Any help will be rewarded with a cold beer in Frankfurt, Germany, anytime! ;-)



Best regards,
Jeff
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
