Hi!

We have a very strange problem on two chassis clusters running Junos 10.0R3.10 (we will try updating to R4.7 today).

One chassis cluster (2x J6350) is our main system.
The other (2x J4350) is located at our customer's site.

The two clusters speak BGP with each other. For the customer system, this is the only BGP session. Our main system has a full BGP mesh to our other locations and edge systems. To understand the problem, this can be reduced to three BGP sessions:

A) BGP session to AMS-IX over VLAN 1
B) BGP session to ECIX over VLAN 1
C) BGP session to ECIX over VLAN 2

Two switches are involved. VLAN 1 is configured on both switches so that it is available in Amsterdam and Düsseldorf. VLAN 2 is configured only on the switch facing Düsseldorf, as a backup in case the first switch fails.

The day before yesterday, I started two pings to the ECIX router: one from my local workstation, the other from the main cluster.

If I configure something on the redundant interfaces, then as soon as I commit, the first ping stays normal while the second jumps to +30ms (normally around 6ms). Two to three minutes later, both pings stop and the BGP session drops. It is the only BGP session that drops, and it does so because of hold time expiration. After a few minutes, the pings and the BGP session come back. Every other BGP session, even the one to Düsseldorf over VLAN 2, stays up.

I then switched the main load to Düsseldorf over to VLAN 2. This time, that BGP session was dropped while the other stayed up. The session to Düsseldorf that takes the main load carries around 260000 prefixes.

Matthias
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp