On Wed, Dec 09, 2009 at 09:13:28AM -0700, David Ball wrote:
> Do your KRT queues eventually flush though? Is it just a slow
> control->fwding thing when large route updates occur? I've done 2
> upgrades in as many years to resolve a KRT related bug, but that
> resulted in the queue NEVER emptying. It's apparently related to a
> residual variable being set after an RPD restart (caused by another
> bug) resulting in a kernel/rpd inconsistency. I'm told mine is
> resolved in 9.5R3 (PR291407), but I got nervous when I read Richard's
> earlier post.
The behavior we've always seen (from mid-7.x until today) is that something seems to "block" the KRT queue while the pending changes keep piling up; then eventually whatever is causing the blockage clears and all the routes install within seconds. I saw the exact same behavior last night during an upgrade to 9.5R3: 263k routes stuck in Pending state for a hair over 10 minutes, then they all synced in just a few seconds.

But I think it's actually getting worse, because in older versions the routes that were stuck in Pending state didn't seem to be advertised to peers. This time it seemed to advertise the routes even though it didn't have them installed in hardware, resulting in blackholing of traffic.

> 2009/12/9 Mark Tinka <mti...@globaltransit.net>:
> >
> > I'd be willing to help if we can offline this to a
> > reproduction in my lab.
> >
> > I have a case that will have been open for 1 year, if
> > February 2010 comes and we still haven't fixed it. So I know
> > what it's like :-).

I've personally never had any luck reproducing it in the lab, so I understand Juniper's frustration. It seems to require a complexity of routes, ports, and/or protocols which we simply don't have the time or money to reproduce in the lab, but I can reproduce it in the field (with undesired customer impact, of course) nearly every time I reboot a router. Maybe we just need to provide them with an example configuration that they can try to reproduce themselves.

As for the cases which have been open for over 1 year... I had quite a few a year ago, when JTAC was really dropping the ball in absolutely abysmal ways. IMHO it has gotten significantly better over the last year, both in JTAC and code quality, but they're still pretty hit or miss in the initial stages.
The sad part was that none of my issues were complex problems that actually took a year to resolve. They were all relatively simple problems that took JTAC a year and several escalations just to find someone competent who could actually read, understand, and follow my instructions to reproduce the problem.

--
Richard A Steenbergen <r...@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp