Can you send your <snipped> OSPF config?

On Jan 21, 2010, at 5:28 PM, Andy B. wrote:
> Hi,
>
> I just came across this thread while doing a little research to solve a
> similar situation.
>
> Hardware:
>
> - 6509 with SUP720-3BXL on both ends
> - SXF15a
> - Uptime: 46 weeks
>
> Problem:
>
> - OSPF (for the loopback between cores) and BGP (mostly customers to whom
>   we send the full table) going up and down all the time:
>
> %OSPF-5-ADJCHG: Process 1, Nbr x.x.x.130 on TenGigabitEthernet4/1 from FULL to DOWN, Neighbor Down: Dead timer expired
> %OSPF-5-ADJCHG: Process 1, Nbr x.x.x.131 on TenGigabitEthernet9/1 from LOADING to FULL, Loading Done
> %BGP-5-ADJCHANGE: neighbor y.y.y.14 Down BGP Notification sent
> %BGP-3-NOTIFICATION: sent to neighbor y.y.y.14 4/0 (hold time expired) 0 bytes
> %BGP-5-ADJCHANGE: neighbor y.y.y.14 Up
>
> This keeps going on for several hours, and then it suddenly stabilizes.
>
> Furthermore, I use Cacti to generate graphs from the core router via
> SNMP. I have one VLAN that carries around 15 Gbps of traffic at peak
> times, and as soon as I exceed 15 Gbps, no more graphs are drawn, the
> core router console becomes rather unresponsive, and OSPF starts to
> behave strangely.
>
> What I can rule out is the fiber capacity. I have multiple circuits
> with different paths and operators. The OSPF issue happens on all
> circuits, not just a specific one. No 10 GE link is used more than
> 60%. In fact, traffic from inside my backbone to any place outside
> remains unaffected (thank God), but the core router itself is pretty
> useless. Pinging the core's loopback or any IP configured on that box
> results in 40-60% packet loss.
>
> CPU usage is not high; it's stable. No unusual processes, just IP
> Input and BGP Scanner. More than 50% of memory is still free at that
> time.
>
> I've had this many times recently, but it only happens when my core
> goes beyond roughly 15 Gbps of traffic (outbound). We were below
> 15 Gbps for 2 years and it never happened during that time.
> Now all this mess happens almost daily, rendering important billing
> graphs useless and annoying full-table BGP customers.
>
> Is this a memory issue, due to the router's long uptime? Would
> reloading the router help in this case? That's the last thing I would
> want to do, but if it helps...
>
> Cheers,
>
> Andy
>
> On Fri, Dec 11, 2009 at 5:22 PM, Drew Weaver <drew.wea...@thenap.com> wrote:
>> Howdy all,
>>
>> Last night I had an interesting encounter on one of my 6509s w/ SUP720-3BXL.
>>
>> This switch has 3x iBGP sessions with full internet tables and is also
>> running OSPF.
>>
>> Two of the three iBGP sessions randomly dropped with:
>>
>> %BGP-3-NOTIFICATION: sent to neighbor x.x.x.3 4/0 (hold time expired) 0 bytes
>>
>> I also noticed that during this period OSPF dropped with "Neighbor
>> Down: Dead timer expired", then re-established, then failed again,
>> then re-established, and so on.
>>
>> I checked the physical interfaces between this 6500 and the two GSR
>> 12000s it peers with and there were no errors. There was also no
>> obvious spike in traffic that would account for latency that might
>> cause the hold timers to expire. I remember that when this system
>> first came online it took a really long time to download the full
>> internet tables from the upstream GSRs, and during that time a lot of
>> CPU time was being eaten up. I am wondering if the first session
>> failing caused a sort of 'performance' domino effect which then caused
>> everything else to fail. The issue eventually corrected itself and
>> stabilized.
>>
>> This particular box is running 12.2(18)SXF17, so I am less likely to
>> believe it is a software bug.
>>
>> Does anyone have any tips on how I can avoid the hold timer issue
>> altogether, and also on how I can make it so that if a session does go
>> down and re-establish, it doesn't totally nail the CPU while it's
>> re-establishing and downloading the routes?
>> A long time ago I also read that increasing the MTU on both ends of a
>> circuit can make BGP tables download faster. I don't know if that's
>> true or not; has anyone else found that?
>>
>> thanks,
>> -Drew
>>
>> _______________________________________________
>> cisco-nsp mailing list cisco-nsp@puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/