Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)

George, Wesley Mon, 12 Sep 2011 11:28:23 -0700

-----Original Message-----
From: christopher.mor...@gmail.com [mailto:christopher.mor...@gmail.com] On 
Behalf Of Christopher Morrow
Sent: Sunday, September 11, 2011 11:26 PM
To: Randy Bush; George, Wesley
Cc: Russ White; sidr@ietf.org
Subject: Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)


Maybe what Wes is asking here is really:
"Could someone model the load on a router doing bgpsec, in a world of
bgpsec-speaking devices?"

Something like, for a core network edge device (say, an edge connecting
device in the worst metro of Sprint, C&W, TWTC, UU/VZB, or ATT):
  o number of updates today/second (steady state and 'worst case')
  o projected growth of update stream (given historical data)
  o projected 'cost' (cpu cycles) of un-assisted bgpsec
  o projected RIB RAM size (use historical data to project forward)
  o projected beacons/second (which really just look like updates in
the update stream)
  o routing table size (projected forward from historical data)

It seems most of that data exists in one form or another, and running
the math isn't "hard". There's a question of the validity of the
model... but that's always the case.
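
A minimal sketch of how the inputs listed above could be combined, in
Python. Every name and number in it is a hypothetical placeholder, not
measured data. It assumes compound yearly growth of the update stream, a
beacon rate held constant (a simplification), and roughly one signature
check per AS hop on each update. RIB RAM and routing table size are left
out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class EdgeRouterModel:
    """Toy load model for one edge router; all fields are assumed inputs."""
    updates_per_sec: float           # today's rate (steady state or worst case)
    annual_growth: float             # fractional yearly growth of the update stream
    beacons_per_sec: float           # beacons, counted as extra updates
    verifications_per_update: float  # assumed ~ one signature check per AS hop
    cycles_per_verification: float   # CPU cycles per unassisted signature check

    def updates_in(self, years: int) -> float:
        """Compound the update rate forward, then add the beacon load."""
        grown = self.updates_per_sec * (1 + self.annual_growth) ** years
        return grown + self.beacons_per_sec

    def cycles_per_sec_in(self, years: int) -> float:
        """CPU cycles per second spent on signature verification."""
        return (self.updates_in(years)
                * self.verifications_per_update
                * self.cycles_per_verification)

    def cpu_fraction_in(self, years: int, cpu_hz: float) -> float:
        """Share of one core of speed cpu_hz consumed by verification."""
        return self.cycles_per_sec_in(years) / cpu_hz


# Placeholder inputs only; substitute measured values before drawing conclusions.
model = EdgeRouterModel(
    updates_per_sec=50.0,
    annual_growth=0.15,
    beacons_per_sec=5.0,
    verifications_per_update=4.0,
    cycles_per_verification=2.0e6,
)
for year in (0, 5, 10):
    share = model.cpu_fraction_in(year, cpu_hz=2.0e9)
    print(f"year {year}: {share:.1%} of one core")
```

Running it once with steady-state inputs and once with worst-case inputs
gives the two ends of the range; the validity of each input is where the
argument would be.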

Wes, is this sort of thing what you're asking for?

[WEG] Yes, to some extent, but you're right that the model is the hard part, 
not the math. In trying to unwind a similar problem of how to characterize 
steady-state and peak CPU load on an L3VPN PE router so that there are real 
rules of thumb for capacity management and scaling, we discovered a couple of 
things:
1) (some) Vendors are quite bad at providing reasonably accurate 
multi-dimensional scaling models based on testing or real-world results. They 
tend to give a lot of single-dimension scale limits (e.g., with this knob 
turned to 11, you can get this value), but they are very conservative and 
mumbly when it comes to what the actual real-life limits are, YMMV, etc. As a 
result, sometimes you end up finding out about the scaling cliff as you're 
falling over it, or you pay for hardware that you can never fully use because 
you stick to very conservative limitations.
2) a corollary: behavior at scale becomes increasingly non-deterministic the 
more variables you're working with simultaneously. Even worse, it's difficult 
to account in a model for things that work well enough at moderate scale, but 
are not efficient enough for high scale, or suffer some sort of secondary 
impact due to dependencies, etc.
3) some routers are very bad at providing useful data about critical scaling 
vectors (updates per sec, changes in multicast state, etc). Coupled with the 
fact that each router's numbers can be wildly different, it's difficult to 
characterize a "common" router, let alone a common network.
4) there are widely varying opinions among vendors and operators as to what is 
an acceptable level of performance at scale, i.e., time to convergence of last 
route, steady-state CPU utilization (how much headroom is enough), and 
stability during system or network events.

I think that what is coming up here are concerns in a few different 
categories:
1) Short-term hardware scale - is BGPSec supportable with what is realistically 
available today? For how long? Is that long enough?
2) Long-term hardware scale (5+ years) - What's the next breakthrough? How long 
does that buy us? Is that long enough? What does it do to our time remaining 
before we have to redesign the routing system to make it keep scaling?
This is where we should be considering RFC 4984 and either updating or 
affirming the guidance there.
3) Cost for both - what is an acceptable assumption of the cost premium for 
BGPSec, in both capital and personnel?

On the hardware side, we're in a discussion that sounds a lot like predicting 
peak oil - when do we run out of scale growth on Moore's law with the current 
overall Internet architecture, and will BGPSec be just "one more gas-guzzler on 
the road" or the straw that broke the camel's back?
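
As a rough illustration of the "when do we run out" question, the 
calculation below treats it as two compound growth rates racing each other. 
This is only a sketch under stated assumptions: the function name, the 
parameters, and the numbers are all hypothetical, and real load and hardware 
growth are unlikely to be this smooth.

```python
import math


def years_until_exhausted(headroom: float,
                          load_growth: float,
                          capacity_growth: float) -> float:
    """Years until load catches up with capacity.

    headroom        -- capacity divided by load today (above 1 means spare room)
    load_growth     -- fractional yearly growth of load (e.g. 0.25)
    capacity_growth -- fractional yearly growth of hardware capacity
    Assumes both grow at a constant compound rate (a simplification).
    """
    if headroom <= 1:
        return 0.0          # already at or past the limit
    if load_growth <= capacity_growth:
        return math.inf     # capacity keeps pace, the gap never closes
    ratio = (1 + load_growth) / (1 + capacity_growth)
    # Solve headroom == ratio ** years for years.
    return math.log(headroom) / math.log(ratio)


# Placeholder scenario: 3x headroom today, with and without an assumed
# one-time BGPSec cost that doubles the load (halving the headroom).
without_bgpsec = years_until_exhausted(3.0, 0.30, 0.20)
with_bgpsec = years_until_exhausted(3.0 / 2.0, 0.30, 0.20)
print(f"without: {without_bgpsec:.1f} years, with: {with_bgpsec:.1f} years")
```

Framed this way, BGPSec shows up as a reduction in today's headroom, and the 
open question becomes how many years of growth that reduction costs.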

I don't know that we're going to get a definitive answer from modeling, and I'm 
not trying to bring on analysis paralysis either. Randy's guess (and mine, and 
everyone else's) may be BS, but even making a gut check based on what info we 
have available, and documenting the assumptions we're basing our decision on, 
would be a good thing.

Wes
