MTU is 1500 on all links:

Core 1:

#sh int te9/1 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
#sh int te9/2 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
#sh int te8/1 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

Core 2:

#sh int te4/1 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

Core 3:

#sh int te4/1 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

Core 4:

#sh int te4/1 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

Core 1 is physically connected to 2, 3 and 4 (star topology).
BGP is fully meshed - no route reflector.

Andy

On Fri, Jan 22, 2010 at 11:00 AM, roy <bandwidth.u...@gmail.com> wrote:
> We had a somewhat similar problem with ospf/bgp which was eventually
> resolved by making link mtu uniform across the links. Let me know if this
> helps.
>
> On Friday, 22 January, 2010 04:07 PM, Gergely Antal wrote:
>>
>> just a thought :
>> sh ip bgp neighbors | i Datagrams
>>
>> maybe one router tries to negotiate the session with low datagram size
>> and the update storm floods the connection.
>>
>>
>> On Fri, 22 Jan 2010 02:06:53 +0100
>> "Andy B."<globic...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> here we go:
>>>
>>> Core router that is causing headaches:
>>>
>>> interface Loopback0
>>>  ip address x.x.x.130 255.255.255.255
>>>
>>> interface TenGigabitEthernet9/1
>>>  ip address y.y.y.1 255.255.255.252
>>>  no ip redirects
>>>  no ip proxy-arp
>>>  no cdp enable
>>>
>>> router ospf 1
>>>  router-id x.x.x.130
>>>  log-adjacency-changes
>>>  redistribute connected subnets
>>>  redistribute static subnets
>>>  passive-interface default
>>>  no passive-interface TenGigabitEthernet8/1
>>>  no passive-interface TenGigabitEthernet9/1
>>>  no passive-interface TenGigabitEthernet9/2
>>>  network y.y.y.0 0.0.0.3 area 0
>>>  network y.y.y.4 0.0.0.3 area 0
>>>  network y.y.y.8 0.0.0.3 area 0
>>>
>>>
>>> Adjacent router (one of them):
>>>
>>> interface Loopback0
>>>  ip address x.x.x.131 255.255.255.255
>>>
>>> interface TenGigabitEthernet4/1
>>>  ip address y.y.y.2 255.255.255.252
>>>  no ip redirects
>>>  no ip proxy-arp
>>>
>>> router ospf 1
>>>  router-id x.x.x.131
>>>  log-adjacency-changes
>>>  redistribute connected subnets
>>>  redistribute static subnets
>>>  passive-interface default
>>>  no passive-interface TenGigabitEthernet4/1
>>>  network y.y.y.0 0.0.0.3 area 0
>>>
>>>
>>> I hope this helps...
>>>
>>> Andy
>>>
>>>
>>> On Fri, Jan 22, 2010 at 1:53 AM, Jason LeBlanc
>>> <jasonlebl...@gmail.com> wrote:
>>>>
>>>> Can you send your <snipped> OSPF config?
>>>>
>>>> On Jan 21, 2010, at 5:28 PM, Andy B. wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just fell over this thread while doing a little research to solve a
>>>>> similar situation.
>>>>>
>>>>> Hardware:
>>>>>
>>>>> - 6509 with SUP720-3BXL on both ends
>>>>> - SXF15a
>>>>> - Uptime: 46 weeks
>>>>>
>>>>> Problem:
>>>>>
>>>>> - OSPF (for the loopback between cores) and BGP (mostly customers
>>>>> whom we send the full table) going up and down all the time:
>>>>>
>>>>> %OSPF-5-ADJCHG: Process 1, Nbr x.x.x.130 on TenGigabitEthernet4/1
>>>>> from FULL to DOWN, Neighbor Down: Dead timer expired
>>>>> %OSPF-5-ADJCHG: Process 1, Nbr x.x.x.131 on TenGigabitEthernet9/1
>>>>> from LOADING to FULL, Loading Done
>>>>> %BGP-5-ADJCHANGE: neighbor y.y.y.14 Down BGP Notification sent
>>>>> %BGP-3-NOTIFICATION: sent to neighbor y.y.y.14 4/0 (hold time
>>>>> expired) 0 bytes
>>>>> %BGP-5-ADJCHANGE: neighbor y.y.y.14 Up
>>>>>
>>>>> This keeps going on for several hours, and suddenly it stabilizes
>>>>> itself.
>>>>>
>>>>> Furthermore I use cacti to generate graphs from the core router via
>>>>> SNMP. I have one VLAN that has around 15 GBPS traffic at peak times,
>>>>> and as soon as I hit more than 15 GBPS, no more graphs are drawn,
>>>>> core router console becomes rather unresponsive and OSPF starts to
>>>>> behave strangely.
>>>>>
>>>>> What I can rule out is the fiber capacity. I have multiple circuits
>>>>> and different paths and operators. The OSPF issue happens on all
>>>>> circuits, not just a specific one. No 10 GE link is used more than
>>>>> 60%. In fact, traffic from inside my backbone to any place outside
>>>>> remains unaffected (thank God), but the core router itself is pretty
>>>>> useless. Pinging the core's loopback or any ip loaded on that box
>>>>> results in a 40-60% packet loss.
>>>>>
>>>>> CPU usage is not high, it's stable. No unusual processes, just IP
>>>>> Input and BGP Scanner. More than 50% memory is still free at that
>>>>> time.
>>>>>
>>>>> I've had this many times recently, but it really just happens when
>>>>> my core goes beyond +- 15 GBPS of traffic (outbound).
>>>>> We've been below 15 GBPS for 2 years and it never happened at that
>>>>> time. Now all this mess happens almost daily, rendering important
>>>>> billing graphs useless and annoying full table BGP customers.
>>>>>
>>>>> Is this a memory issue, due to the router's long uptime? Would
>>>>> reloading the router help in this case? That's the last thing I
>>>>> would want to do, but if it helps...
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Andy
>>>>>
>>>>> On Fri, Dec 11, 2009 at 5:22 PM, Drew Weaver
>>>>> <drew.wea...@thenap.com> wrote:
>>>>>>
>>>>>> Howdy all,
>>>>>>
>>>>>> Last night I had an interesting encounter on one of my 6509s w/
>>>>>> SUP720-3BXL.
>>>>>>
>>>>>> This switch has 3x iBGP sessions with full internet tables and is
>>>>>> also running OSPF.
>>>>>>
>>>>>> Two of the three iBGP sessions randomly dropped with:
>>>>>>
>>>>>> %BGP-3-NOTIFICATION: sent to neighbor x.x.x.3 4/0 (hold time
>>>>>> expired) 0 bytes
>>>>>>
>>>>>> I also noticed that during this period OSPF dropped with
>>>>>> Neighbor Down: Dead timer expired
>>>>>>
>>>>>> and then re-established, and then failed again, and
>>>>>> re-established, and failed again, and so-on, and so-on.
>>>>>>
>>>>>> I checked the physical interfaces between this 6500 and the two
>>>>>> GSR 12000s it peers with and there were no errors. There was also
>>>>>> no obvious spike in traffic that would account for latency that
>>>>>> might cause the hold timers to expire. I remember that when this
>>>>>> system first came online it took a really long time for it to
>>>>>> download the full internet tables from the upstream GSRs, and
>>>>>> during that time a lot of CPU time was being eaten up. I am
>>>>>> wondering if maybe the first session failing caused a sort of
>>>>>> 'performance' domino effect which then caused everything else to
>>>>>> fail. The issue eventually corrected itself and stabilized.
>>>>>>
>>>>>> This particular box is running 12.2(18)SXF17 so I am less likely
>>>>>> to believe it is a software bug.
>>>>>>
>>>>>> Does anyone have any tips on both how I can avoid the hold timer
>>>>>> issue altogether and also how I can make it so that if a session
>>>>>> does go down and re-establish it doesn't totally nail the CPU
>>>>>> while it's trying to re-establish/download the routes? A long time
>>>>>> ago I also read that increasing the MTU on both ends of a circuit
>>>>>> can make BGP tables download faster. I don't know if that's true
>>>>>> or not; has anyone else found that?
>>>>>>
>>>>>> thanks,
>>>>>> -Drew
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cisco-nsp mailing list cisco-...@puck.nether.net
>>>>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>>>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
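
A side note on the MTU comparison at the top of this message: with more links than can be eyeballed, the same check could be scripted. The following is only a rough sketch in Python, not something anyone in this thread ran. The function names (parse_mtus, mtu_mismatches) and the router labels core1..core4 are made up for illustration, and the input is just the "sh int ... | i MTU" lines quoted above, pasted in by hand. Comparing every interface against the most common value is a simplification; strictly, only the two ends of each link have to agree.

```python
import re
from collections import Counter

# Matches the MTU field in a line such as
# "MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,"
MTU_RE = re.compile(r"MTU\s+(\d+)\s+bytes")


def parse_mtus(outputs):
    """Map each (router, interface) key to the MTU found in its output line."""
    mtus = {}
    for link, line in outputs.items():
        match = MTU_RE.search(line)
        if match is None:
            raise ValueError(f"no MTU value found for {link}")
        mtus[link] = int(match.group(1))
    return mtus


def mtu_mismatches(mtus):
    """Return the entries whose MTU differs from the most common MTU."""
    if not mtus:
        return {}
    common, _count = Counter(mtus.values()).most_common(1)[0]
    return {link: mtu for link, mtu in mtus.items() if mtu != common}


# Output lines as quoted at the top of this message (labels are hypothetical).
LINE = "MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,"
OUTPUTS = {
    ("core1", "te9/1"): LINE,
    ("core1", "te9/2"): LINE,
    ("core1", "te8/1"): LINE,
    ("core2", "te4/1"): LINE,
    ("core3", "te4/1"): LINE,
    ("core4", "te4/1"): LINE,
}

if __name__ == "__main__":
    odd = mtu_mismatches(parse_mtus(OUTPUTS))
    print("MTU uniform" if not odd else f"MTU mismatch: {odd}")
```

An empty result from mtu_mismatches means every listed interface reports the same MTU, which is what the output above shows by inspection.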