On Wed, 15 Aug 2007, Stephen Wilcox wrote:
> (Check slide 4) - the simple fact was that with something like 7 of 9
> cables down, the redundancy is useless. Even if operators maintained
> N+1 redundancy, which is unlikely for many operators, that would imply
> 50% of capacity in use with 50% spare; however, we see around 78% of
> capacity was lost. There was simply too much traffic and not enough
> capacity. IP backbones fail pretty badly when faced with extreme
> congestion.
Remember the end-to-end principle. IP backbones don't fail with extreme
congestion; IP applications fail with extreme congestion.
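A toy sketch of why that is (this is a made-up AIMD-style sender, not TCP itself; the loss rates and round counts are illustrative assumptions): the backbone keeps forwarding packets, but every loss halves the sender's window, so under extreme loss the application's effective throughput collapses.

```python
import random

def aimd_throughput(loss_rate, rounds=10000, seed=1):
    """Toy AIMD sender: +1 segment per round on success, halve the
    window on loss. Returns the mean window size (a throughput proxy)."""
    random.seed(seed)
    cwnd, total = 1.0, 0.0
    for _ in range(rounds):
        if random.random() < loss_rate:
            cwnd = max(1.0, cwnd / 2)   # multiplicative decrease on loss
        else:
            cwnd += 1.0                 # additive increase on success
        total += cwnd
    return total / rounds

light = aimd_throughput(0.01)   # mild congestion: window stays large
heavy = aimd_throughput(0.50)   # extreme congestion: window pinned near 1
```

Under mild loss the sender sustains a large average window; at 50% loss the window hovers near its floor, which is the application-level failure the text describes.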
Should IP applications respond to extreme congestion conditions better?
Or should IP backbones have methods to predictably control which IP
applications receive the remaining IP bandwidth, similar to the telephone
network's special information tone -- "All Circuits Are Busy"? Maybe we've
found a new use for ICMP Source Quench.
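For reference, Source Quench is ICMP type 4, code 0 (RFC 792): four unused bytes followed by the offending IP header plus the first 8 payload bytes. A minimal sketch of constructing one (illustrative only; actually emitting it would need a raw socket, and many hosts ignore the message):

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """Standard Internet checksum (RFC 1071) over the ICMP message."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_source_quench(original_datagram: bytes) -> bytes:
    """ICMP Source Quench: type 4, code 0, checksum, 4 unused bytes,
    then the offending IP header + first 8 payload bytes."""
    payload = original_datagram[:28]           # IP header (20) + 8 bytes
    header = struct.pack("!BBHI", 4, 0, 0, 0)  # checksum zeroed first
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHI", 4, 0, csum, 0) + payload

msg = build_source_quench(b"\x45" + b"\x00" * 27)
```

A receiver honoring it would throttle its send rate, which is exactly the "all circuits are busy" signal being wished for here.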
Even if the IP protocols recover "as designed," does human impatience mean
there is a maximum recovery timeout period before humans start making the
problem worse?
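The impatience effect can be sketched as a toy fixed-point model (all parameters here are assumptions, not measurements): each unserved request is retried up to some patience limit, so the more congested the network, the more the offered load inflates beyond real demand.

```python
def offered_load(base_demand, capacity, patience_retries=3):
    """Toy model: requests that aren't served get retried, up to
    `patience_retries` times. Iterates to a fixed point and returns
    offered load as a multiple of the underlying demand."""
    load = float(base_demand)
    for _ in range(100):  # iterate toward the fixed point
        served_fraction = min(1.0, capacity / load)
        failed = 1.0 - served_fraction
        # each failed attempt spawns another try, up to the patience limit
        retries = sum(failed ** k for k in range(1, patience_retries + 1))
        load = base_demand * (1 + retries)
    return load / base_demand

uncongested = offered_load(100, 1000)  # plenty of capacity: no inflation
congested = offered_load(100, 20)      # 80% shortfall: retries pile up
```

With ample capacity the ratio stays at 1.0; with a severe shortfall, impatient retries nearly quadruple the offered load, making the congestion worse than the original demand warranted.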