Thank you all !!! Regards, Swamy
--- On Sat, 9/20/08, Harry Reynolds <[EMAIL PROTECTED]> wrote:

From: Harry Reynolds <[EMAIL PROTECTED]>
Subject: RE: [j-nsp] BGP Hold time expiry
To: "Richard A Steenbergen" <[EMAIL PROTECTED]>, juniper-nsp@puck.nether.net
Cc: [EMAIL PROTECTED]
Date: Saturday, September 20, 2008, 12:12 AM

One comment. As I understand it, the system-level setting is for all TCP *except* BGP, and is on by default. The BGP-level setting is for BGP only, and is off by default. The table below came from a developer once. I never tested the last combination, but note that the system-level setting is on by default, so one would have to specifically disable it to test. I always thought BGP and the rest of TCP were independent:

path-mtu-discovery   mtu-discovery under BGP   result
configured           no                        pmtu off for BGP session
configured           yes                       pmtu on for BGP session
no                   no                        pmtu off for BGP session
no                   yes                       pmtu off for BGP session

Regards

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Richard A Steenbergen
Sent: Friday, September 19, 2008 6:36 AM
To: juniper-nsp@puck.nether.net
Cc: [EMAIL PROTECTED]
Subject: Re: [j-nsp] BGP Hold time expiry

On Fri, Sep 19, 2008 at 10:20:47AM -0700, Kevin Oberman wrote:
> Looks a lot like an MTU mismatch. BGP does not do PMTU and sets do not
> fragment, so the MTUs need to be the same on both ends. Things like
> VLAN or tunneling can mess this up.
>
> You can try capturing the traffic to confirm this. I use tcpdump (in
> the shell as root) and move the output file to a box where I can run
> wireshark on it. You will see the big frame being re-transmitted
> several times.

This is probably not really required, though. You can enable path MTU discovery for the entire box with "set system internet-options path-mtu-discovery", or under BGP with "set protocols bgp mtu-discovery", but of course that won't help you if you don't have your MTUs configured correctly on both sides.
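[Editor's note: the developer's table above can be condensed into a one-line predicate. This is a sketch of that quoted table only, not a model of any actual Junos API; the function name and booleans are illustrative.]

```python
# Sketch of the quoted truth table: per the developer's claim, PMTUD is
# active on a BGP session only when BOTH the system-level knob
# (on by default) and the BGP-level mtu-discovery knob (off by default)
# are enabled. Names are hypothetical, not Junos identifiers.

def bgp_pmtud_active(system_pmtud: bool, bgp_mtu_discovery: bool) -> bool:
    """True if PMTUD would be on for the BGP session, per the quoted table."""
    return system_pmtud and bgp_mtu_discovery

# The four rows of the table, in order:
rows = [(True, False), (True, True), (False, False), (False, True)]
results = ["on" if bgp_pmtud_active(s, b) else "off" for s, b in rows]
```

Only the second row ("configured" + "yes") comes out "on", matching the table.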
If your L3 devices are not both configured to a value which can safely pass between them (and any L2 devices in the middle), fragmentation (or the ICMP needfrag message) will not function, thus defeating PMTUD and causing blackholing.

The big confusion I typically see people running into is that Juniper and Cisco mean different things when you configure an interface MTU. Juniper includes all L2 overhead; Cisco does not. So, for example, a Juniper with an interface MTU of 9192 (the max) would only correctly talk to a Cisco with its L3 interface configured to 9178 (or 9174 if the Juniper is vlan-tagged, or 9170 if the Juniper is flexible/stacking vlan-tagged). And of course, with 6500/7600 SVIs, you have to configure the physical interface to 9216 and then the interface Vlan to 9178/9174/9170 (the default is still 1500 even with the physical port MTU bumped).

A quick and dirty test is to force the tcp-mss on the BGP session lower, for example with "set protocols bgp group blah neighbor x.x.x.x tcp-mss 536". If this stops the flapping, you probably have an MTU issue. You can also ping across the link with the do-not-fragment bit set to verify these issues, but remember that Cisco and Juniper also disagree about what the ping "size" means. Cisco takes it to be the size of the entire packet; Juniper takes it to be the size of the ping payload. So in the case of IPv4 you would need to subtract 28 (20 bytes IP, 8 bytes ICMP) from the "size" param to match the Cisco side. Between that and the MTU issue above, Cisco and Juniper have created a real mess for inter-provider MTU negotiation.

Of course it could be any number of other things too, not just MTU. For example, in my experience Cisco control-plane policing on 6500/7600 is absolutely horrific at applying fair rate limits.
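[Editor's note: the MTU and ping-size arithmetic above can be captured in two small helpers. The overhead constants are standard Ethernet/IP framing sizes stated in the message; the function names are illustrative, assuming untagged, single-tagged, or double-tagged Ethernet.]

```python
# Juniper-vs-Cisco MTU arithmetic from the message above.
# Juniper interface MTU includes L2 overhead; Cisco L3 MTU does not.

ETH_HEADER = 14        # dst MAC (6) + src MAC (6) + ethertype (2)
VLAN_TAG = 4           # one 802.1Q tag; two for flexible/stacking vlan-tags
IP_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header

def cisco_l3_mtu(juniper_mtu: int, vlan_tags: int = 0) -> int:
    """Cisco L3 MTU equivalent to a given Juniper interface MTU."""
    return juniper_mtu - ETH_HEADER - vlan_tags * VLAN_TAG

def juniper_ping_size(cisco_ping_size: int) -> int:
    """Juniper ping 'size' (ICMP payload) matching a Cisco 'size' (whole packet)."""
    return cisco_ping_size - IP_ICMP_OVERHEAD
```

This reproduces the figures in the message: 9192 maps to 9178 untagged, 9174 vlan-tagged, 9170 double-tagged; a Cisco ping of size 1500 matches a Juniper ping of size 1472.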
If you do bump into your CoPP rate limit (by, say, bouncing a BGP session, doing a soft clear, etc.), rather than simply causing TCP to back off and slow down the transfer of data, more often than not one stream will monopolize the bandwidth and cause the other sessions to not exchange keepalives. Being careful with said rate limits has resolved almost all of the problems that initially looked like a poor scheduler (though that's not to say that IOS doesn't have a poor scheduler anyways :P).

Oh, and while we're on the subject, am I the only one who is "concerned" by Juniper's configurable range of tcp-mss on BGP neighbors?

[EMAIL PROTECTED] set protocols bgp tcp-mss ?
Possible completions:
  <tcp-mss>            Maximum TCP segment size (1..4096)

Setting a TCP MSS to 1 and then trying to exchange a large amount of data makes for an excellent DoS, and many operating systems now include a minimum acceptable MSS setting as protection against this.

--
Richard A Steenbergen <[EMAIL PROTECTED]> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

_______________________________________________
juniper-nsp mailing list
juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
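[Editor's note: the MSS-of-1 DoS concern above is pure arithmetic, sketched here. The transfer size is an illustrative assumption; no Junos or TCP stack behavior is modeled.]

```python
# Why a tiny tcp-mss is a DoS vector: the number of TCP segments (each
# carrying 40+ bytes of IP/TCP header overhead) for a transfer grows
# inversely with the MSS. At mss=1, every payload byte becomes its own
# packet plus its own ACK processing.
import math

def segments_needed(payload_bytes: int, mss: int) -> int:
    """Minimum number of TCP segments to carry payload_bytes at a given MSS."""
    return math.ceil(payload_bytes / mss)

transfer = 1_000_000  # illustrative: ~1 MB of BGP UPDATE data
worst = segments_needed(transfer, 1)      # one packet per byte
sane = segments_needed(transfer, 536)     # the quick-and-dirty test value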