On Fri, 18 Aug 2023, Matthew Petach wrote:

Hi Robert,

Without naming any names, I will note that at some point in the
not-too-distant past, I was part of a new-years-eve-holiday-escalation to
$BACKBONE_ROUTER_PROVIDER when the global network I was involved with started
seeing excessive convergence times (greater than one hour from BGP update
message received to FIB being updated).

After tracking down a development engineer from $RTR_PROVIDER on the New
Year's Eve holiday, it was determined that the problem lay in assumptions made
about how communities were stored in memory.  Think hashed buckets, with linked
lists within each bucket.  If the communities all happened to hash to the same
bucket, the linked list in that bucket became extremely long; and if every
prefix coming in, say from multiple sessions with a major transit provider,
happened to be adding one more community to the very long linked list in that
one hash bucket, well, it ended up slowing down the processing to the point
where updates to the FIB were still trickling in an hour after the BGP neighbor
had finished sending updates across.

A new hash function was developed on New Year's day, and a new version of code 
was built for us to deploy under relatively painful circumstances. 
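
To make the failure mode concrete, here is a toy sketch in Python of a chained hash table, not the vendor's code. Both hash functions, the bucket count, and the 65000:N communities are assumptions picked for illustration. The weak hash only looks at the ASN half of a 32-bit community, so every community from one ASN lands in the same bucket and each insert walks the whole chain.

```python
class ChainedSet:
    """Toy hash set using separate chaining; counts chain comparisons."""

    def __init__(self, nbuckets, hash_fn):
        self.buckets = [[] for _ in range(nbuckets)]
        self.hash_fn = hash_fn
        self.comparisons = 0

    def add(self, value):
        chain = self.buckets[self.hash_fn(value) % len(self.buckets)]
        for existing in chain:  # linear walk of the bucket's chain
            self.comparisons += 1
            if existing == value:
                return
        chain.append(value)


def weak_hash(community):
    # Hypothetical weak hash: uses only the high 16 bits (the ASN half),
    # so all 65000:N communities collide.
    return community >> 16


def mixing_hash(community):
    # Hypothetical replacement: multiplicative hash over all 32 bits,
    # keeping the top 10 bits (enough for 1024 buckets).
    return ((community * 2654435761) & 0xFFFFFFFF) >> 22


def comparisons_for(hash_fn, communities, nbuckets=1024):
    table = ChainedSet(nbuckets, hash_fn)
    for community in communities:
        table.add(community)
    return table.comparisons


# 2000 distinct communities of the form 65000:N
communities = [(65000 << 16) | n for n in range(2000)]
weak = comparisons_for(weak_hash, communities)
mixed = comparisons_for(mixing_hash, communities)
print(weak, mixed)
```

With the weak hash the comparison count grows quadratically with the number of communities; with a hash that spreads the same keys across buckets, it stays close to linear.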

This reminds me of two things.

First, some code I wrote more than 20 years ago to track and bill for overlapping dial-up sessions (i.e. dial-up account sharing). While processing the RADIUS accounting data, I built a binary tree of users, with each node holding a linked list of session data. In testing, I found that the program got slower as the amount of data fed in grew. I solved it by converting the session data linked lists to doubly linked lists, which let me add session data by jumping directly to the end of the list, checking whether that's where the current session belonged, and walking back the list only if necessary. It usually wasn't necessary, since the input data was generally in chronological order. That made it super fast again.
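
A rough sketch of that tail-first insert, in Python rather than whatever the original was written in. The class and field names are made up for illustration, and a session is reduced to just its start time.

```python
class Session:
    def __init__(self, start):
        self.start = start
        self.prev = None
        self.next = None


class SessionList:
    """Doubly linked list kept sorted by start time, filled from the tail."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.steps = 0  # nodes walked backward, kept only for illustration

    def insert(self, start):
        node = Session(start)
        cur = self.tail
        # Start at the end and walk back only while existing sessions are later.
        while cur is not None and cur.start > start:
            cur = cur.prev
            self.steps += 1
        if cur is None:
            # Empty list, or the new session is the earliest: becomes the head.
            node.next = self.head
            if self.head is not None:
                self.head.prev = node
            else:
                self.tail = node
            self.head = node
        else:
            # Splice the new node in right after cur.
            node.prev = cur
            node.next = cur.next
            if cur.next is not None:
                cur.next.prev = node
            else:
                self.tail = node
            cur.next = node

    def starts(self):
        out = []
        cur = self.head
        while cur is not None:
            out.append(cur.start)
            cur = cur.next
        return out


sessions = SessionList()
for t in [100, 200, 300, 250, 400]:  # mostly chronological input
    sessions.insert(t)
print(sessions.starts(), sessions.steps)
```

For input that is mostly in order, nearly every insert is a constant-time append at the tail; only the out-of-order records pay for a walk back.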

Second, we ran into an issue with Arista some time ago and a peer on AMS-IX that set a ridiculous number of communities on their routes. Arista uses (used?) a fixed-length buffer for communities in route-map processing. When doing "match community" in a route-map, if the set of communities on the route is longer than that buffer and the communities you're trying to match fall off the end, the match statement will fail to match, even though a show ip bgp... will show you that the communities you're trying to match are there.
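
As a toy model of that behavior (not Arista's implementation; the buffer size, function names, and community values are all assumptions for illustration), the mismatch between what the match sees and what the show command reports looks roughly like this:

```python
BUFFER_SLOTS = 4  # hypothetical buffer capacity, chosen only for the example


def match_community_truncated(route_communities, wanted, slots=BUFFER_SLOTS):
    # Model of the match path: only the first `slots` communities are copied
    # into the buffer; anything past that is silently dropped.
    buffer = route_communities[:slots]
    return wanted in buffer


def match_community_full(route_communities, wanted):
    # Model of the display path: sees the complete community list.
    return wanted in route_communities


route = ["65000:1", "65000:2", "65000:3", "65000:4", "65000:5", "65000:666"]

print(match_community_full(route, "65000:666"))       # community is on the route
print(match_community_truncated(route, "65000:666"))  # but the match misses it
```

The community is present on the route, but because it sits past the end of the buffer, the truncated match never sees it.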

----------------------------------------------------------------------
 Jon Lewis, MCP :)           |  I route
 StackPath, Sr. Neteng       |  therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
