Some implementations also offer an automatic session restart at a specified interval after the limit has been exceeded and the session torn down. It is vendor dependent, of course, but the option exists in IOS XE at least, and most likely with other vendors too. This allows the session to recover on its own after too many prefixes have been received.

That probably would have saved a lot of site visits for Optus once the root cause of the excess prefixes was fixed at the edge.
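
For anyone who has not used it, a rough sketch of the restart option in IOS XE style syntax follows. The AS numbers, the neighbour address, the limit of 1000 prefixes, the 80% warning threshold and the 30 minute restart interval are all made-up illustration values, not anything known about the Optus configuration, and the exact syntax may differ between releases and vendors.

```
router bgp 64500
 neighbor 192.0.2.1 remote-as 64501
 !
 address-family ipv4
  neighbor 192.0.2.1 activate
  ! Assumed values: tear the session down above 1000 prefixes,
  ! log a warning at 80% of that limit, and attempt to
  ! re-establish the session automatically after 30 minutes.
  neighbor 192.0.2.1 maximum-prefix 1000 80 restart 30
  !
  ! Alternative (left commented out): log a warning only and
  ! keep the session up when the limit is exceeded.
  ! neighbor 192.0.2.1 maximum-prefix 1000 80 warning-only
 exit-address-family
```

Without the restart keyword, my understanding is that the session stays down until someone clears it manually, which is where the site visits come in if there is no other way onto the box.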

Then there is the unanswered question of where an out-of-band management network fitted into this picture, which likely would also have provided a get-out-of-jail-free card much earlier in the day.

Reuben


On 14/11/2023 1:27 pm, John Edwards wrote:
The default behaviour of the "maximum prefix" BGP feature is to bring down the BGP session with the peer.

The alternative behaviour is to log a warning and continue accepting prefixes.

I am not aware of an implementation that just allows "Accept up to X routes and then don't accept any more".

That sounds logical but in reality would lead to inconsistent behaviour that is more readily addressed with existing routing policy tools.

It appears that a failure of routing policy was a major contributor to the Optus outage: the policy assumed that internal peers could be trusted, and the fault was exacerbated by some mechanism that allowed a policy failure to impact other logical networks on the same device (assuming there is/was more than one logical network).

Or maybe someone just leaked full routes into OSPF 🫠

John


_______________________________________________
AusNOG mailing list
AusNOG@lists.ausnog.net
https://lists.ausnog.net/mailman/listinfo/ausnog
