On Mon, Dec 31, 2012 at 12:54 PM, Jakob Heitz <jakob.he...@ericsson.com> wrote:
> I don't think treat-as-withdraw is trying to fix a single session reset. 
> Graceful restart can fix that. It's the rolling resets that need a human to 
> remove a buggy router or a config that triggered the bug. That takes several 
> hours. Treat-as-withdraw limits the damage during those hours.
>
> Could we please settle on that without trying to solve the impossible?

If you read my posts to IDR on this topic, you'll see where I explain
how it is possible to solve the "impossible."

Specifically, you can ignore just about any bad update, or bad message
of any kind, as long as you can figure out where the next message
starts.  This fixes the rolling resets.
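
To make the framing point concrete, here is a minimal Python sketch, not
taken from any router implementation.  It relies only on the fixed 19-byte
BGP message header from RFC 4271 (16-byte marker, 2-byte length, 1-byte
type).  The names split_messages, make_message and is_valid_body are
hypothetical, and is_valid_body stands in for whatever attribute checks a
real speaker performs.

```python
import struct

HEADER_LEN = 19           # RFC 4271: 16-byte marker + 2-byte length + 1-byte type
MAX_LEN = 4096            # RFC 4271 maximum message size
MARKER = b"\xff" * 16     # marker field is all ones


def make_message(msg_type, body):
    """Build a framed message (helper for the example only)."""
    return MARKER + struct.pack("!HB", HEADER_LEN + len(body), msg_type) + body


def split_messages(stream, is_valid_body):
    """Yield (msg_type, body) for every well-framed message that passes
    is_valid_body; silently skip well-framed messages that fail it.

    Raises ValueError only when the header itself is unusable, because
    then the start of the next message cannot be located.
    """
    offset = 0
    while offset + HEADER_LEN <= len(stream):
        marker, length, msg_type = struct.unpack_from("!16sHB", stream, offset)
        if marker != MARKER or not HEADER_LEN <= length <= MAX_LEN:
            raise ValueError("framing lost at offset %d" % offset)
        if offset + length > len(stream):
            break  # incomplete message; a real speaker would wait for more bytes
        body = stream[offset + HEADER_LEN:offset + length]
        offset += length  # the length field tells us where the next message starts
        if is_valid_body(msg_type, body):
            yield msg_type, body
        # otherwise: ignore the bad message and carry on with the next one
```

The only unrecoverable case in this sketch is a corrupt header; a bad body
is dropped and the session carries on.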

You may know that a lot of businesses suffered multi-hour outages in
October simply because of five DFZ routes announced by LANL that had
illegal attributes.  This kind of failure is very hard for operators to
troubleshoot on most routers.  The routers that experienced rolling
resets were buggy, but if the operators had simply had a panic button,
"ignore bad messages," their networks would have stayed up and they
would not have been losing money by the minute.

This is not the only time bad updates have propagated through the DFZ
and caused big problems.  It has happened repeatedly.

I believe it will begin to happen more often inside datacenter
networks, because BGP is being used for more and more things, like
EVPN.  Operators are going to need BGP to become more robust.

You can make it a lot more robust just by deciding to ignore
everything in a bad message.  This is not good, but it is a lot better
than session-reset, in most cases.

Please, read my posts on this topic, and do not treat this problem as
an unsolvable one.  It can be largely solved in a way that gives a
very useful fallback option to operators.

This whole draft is about fallback options, and it is pretty stupid to
have a large amount of complexity to solve a small set of potential
bugs, when you could ALTERNATIVELY or IN ADDITION to that, have a very
low-complexity option that solves more problems.

-- 
Jeff S Wheeler <j...@inconcepts.biz>
Sr Network Operator  /  Innovative Network Concepts
_______________________________________________
GROW mailing list
GROW@ietf.org
https://www.ietf.org/mailman/listinfo/grow
