Some implementations can also reset the session and then restart it
automatically at a configured interval after the limit has been exceeded.
Vendor dependent of course, but the option exists in IOS XE at least, and
most likely with other vendors too. This allows the session to recover
by itself after too many prefixes have been received.
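As a rough illustration of that reset-then-restart behaviour, here is a toy Python model. It is only a sketch, not vendor code: the class, field names and the numbers are made up for the example, and time is counted in whole minutes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerSession:
    """Toy model of a BGP session with a max-prefix limit and restart timer."""
    max_prefixes: int
    restart_after: Optional[int]  # minutes; None = stay down until cleared by hand
    up: bool = True
    down_since: Optional[int] = None

    def receive(self, now: int, prefix_count: int) -> None:
        # Over the limit: tear the session down and note when it happened.
        if self.up and prefix_count > self.max_prefixes:
            self.up = False
            self.down_since = now

    def tick(self, now: int) -> None:
        # With a restart interval set, retry once the interval has elapsed.
        if (not self.up and self.restart_after is not None
                and now - self.down_since >= self.restart_after):
            self.up = True
            self.down_since = None


auto = PeerSession(max_prefixes=1000, restart_after=5)
manual = PeerSession(max_prefixes=1000, restart_after=None)
for s in (auto, manual):
    s.receive(now=0, prefix_count=900_000)  # the leak: both sessions drop
    s.tick(now=5)                           # five minutes later
print(auto.up, manual.up)
```

The session with a restart interval comes back without anyone touching it, while the other stays down until someone clears it. If the leak is still present when the session restarts, the next over-limit update simply drops it again, so the timer only helps once the source of the prefixes has been fixed.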
That probably would have saved a lot of site visits for Optus once the
root cause of the excess prefixes was fixed at the edge.
Then there is the unanswered question of where an out-of-band management
network fitted into this picture, which likely would also have provided a
get-out-of-jail-free card much earlier in the day.
Reuben
On 14/11/2023 1:27 pm, John Edwards wrote:
The default behaviour of the "maximum prefix" BGP feature is to bring
down the BGP session with the peer.
The alternate behaviour is to log a warning and keep accepting prefixes.
I am not aware of an implementation that just allows "Accept up to X
routes and then don't accept any more".
That sounds logical but in reality would lead to inconsistent behaviour
that is more readily addressed with existing routing policy tools.
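To make the inconsistency concrete, here is a minimal Python sketch of such a hypothetical "accept the first X, ignore the rest" rule. The function name and the prefixes are invented for the example and do not correspond to any particular implementation.

```python
def capped_accept(announcements, limit):
    """Hypothetical 'accept the first `limit` prefixes, ignore the rest' rule."""
    accepted = []
    for prefix in announcements:
        if len(accepted) < limit:
            accepted.append(prefix)
    return set(accepted)


routes = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "198.51.100.0/24"]

# Two routers hear the same four prefixes, in a different order.
router_a = capped_accept(routes, limit=2)
router_b = capped_accept(list(reversed(routes)), limit=2)

print(sorted(router_a))
print(sorted(router_b))
print(router_a == router_b)
```

Both routers are under the limit and both look healthy, yet they hold different routes purely because of arrival order. Which prefixes survive is effectively random, whereas a prefix filter or route policy makes that choice explicit.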
It appears that a failure of routing policy was a major contributor to
the Optus outage, where that policy assumed that internal peers could be
trusted, and the fault was exacerbated by some mechanism that let a
policy failure impact other logical networks on the same device
(assuming there is/was more than 1 logical network).
Or maybe someone just leaked full routes into OSPF 🫠
John
_______________________________________________
AusNOG mailing list
AusNOG@lists.ausnog.net
https://lists.ausnog.net/mailman/listinfo/ausnog