On Mon, Sep 12, 2011 at 2:28 PM, George, Wesley
<wesley.geo...@twcable.com> wrote:
> -----Original Message-----
> From: christopher.mor...@gmail.com [mailto:christopher.mor...@gmail.com] On 
> Behalf Of Christopher Morrow
> Sent: Sunday, September 11, 2011 11:26 PM
> To: Randy Bush; George, Wesley
> Cc: Russ White; sidr@ietf.org
> Subject: Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)
> maybe what Wes is asking here is really:
> "Could someone model the load on a router doing bgpsec, in a world of
> bgpsec speaking devices?"
> Something like, for a core network edge device (say sprint, C&W, TWTC,
> UU/vzb,ATT an edge connecting device in their worst metro):
>  o number of updates today/second (steady state and 'worst case')
>  o projected growth of update stream (given historical data)
>  o projected 'cost' (cpu cycles) of un-assisted bgpsec
>  o projected RIB RAM size (use historical data to project forward)
>  o projected beacons/second (which really just look like updates in
> the update stream)
>  o routing table size (projected forward from historical data)
> It seems most of that data exists in one form or another, it seems
> that running the math isn't "hard". There's a question of the validity
> of the model... but that's always the case.
> Wes, is this sort of thing what you're asking for?
> [WEG] Yes, to some extent, but you're right that the model is the hard part, 
> not the math. In trying to unwind a similar problem of how to characterize 
> steady-state and peak CPU load on a L3VPN PE router so that there are real 
> rules of thumb for capacity management and scaling, we discovered a couple of 
> things -
> 1) (some) Vendors are quite bad at providing reasonably accurate 
> multi-dimensional scaling models based on testing or real-world results. They 
> tend to give a lot of single-dimension scale limits (eg with this knob turned 
> to 11, you can get this value), but are very conservative and mumbly when it 
> comes to what the actual real-life limits are, YMMV, etc. As a result, 
> sometimes you end up finding out about the scaling cliff as you're falling 
> over it, or you pay for hardware that you can never fully use because you 
> stick to very conservative limitations.
> 2) a corollary: behavior at scale becomes increasingly non-deterministic the 
> more variables you're working with simultaneously. Even worse, it's difficult 
> to account in a model for things that work well enough at moderate scale, but 
> are not efficient enough for high scale, or suffer some sort of secondary 
> impact due to dependencies, etc.
> 3) some routers are very bad at providing useful data about critical scaling 
> vectors (updates per sec, changes in multicast state, etc). Coupled with the 
> fact that each router's numbers can be wildly different, it's difficult to 
> characterize a "common" router, let alone a common network.
> 4) there are widely varying opinions among vendors and operators as to what 
> is an acceptable level of performance at scale i.e. time to convergence of 
> last route, steady-state CPU utilization (how much headroom is enough), 
> stability during system or network events.
> I think that what is coming up here are concerns in a couple of different 
> categories:
> 1) Short-term hardware scale - is BGPSec supportable with what is 
> realistically available today? For how long? Is that long enough?
> 2) Long-term hardware scale (5+ years) - What's the next breakthrough? How 
> long does that buy us? Is that long enough? What does it do to our time 
> remaining before we have to redesign the routing system to make it keep 
> scaling?
> This is where we should be considering RFC4984 and either updating or 
> affirming the guidance there.
> 3) Cost for both - what is an acceptable assumption of the cost premium for 
> BGPSec, in both capital and personnel?
> On the hardware side, we're in a discussion that sounds a lot like predicting 
> peak oil - when do we run out of scale growth on Moore's law with the current 
> overall Internet architecture, and will BGPSec be just "one more gas-guzzler 
> on the road" or the straw that broke the camel's back?
> I don't know that we're going to get a definitive answer from modeling, and 
> I'm not trying to bring on analysis paralysis either. Randy's (and mine, and 
> everyone else's) guess may be BS, but even making a gut check based on what 
> info we have available and documenting the assumptions we're basing our 
> decision on would be a good thing.

I agree with the above, and the last comment really was what I was
aiming at. If someone were to model the 6-ish items I outlined and
properly document their test harness (and maybe publish it so
folks could test with their favorite settings?), that would help us get
around this paralysis problem. At least we'd feel a bit more
comfortable having something to check against.
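
As a rough illustration of what such a documented model might look like,
here is a minimal Python sketch. It only does the arithmetic for the items
listed above (update rate, growth, un-assisted verification cost, RIB size,
beacons). Every name in it (RouterModel, project, cpu_load, rib_bytes) is
hypothetical, and every input is a value the person running it has to
supply from their own measurements; nothing here is measured data, and the
compound-growth assumption is itself part of the model that would need
checking.

```python
from dataclasses import dataclass


@dataclass
class RouterModel:
    """Inputs for one edge router. All values are supplied by the user."""
    updates_per_sec: float   # update rate today (steady state or worst case)
    beacons_per_sec: float   # beacons, treated as extra updates in the stream
    update_growth: float     # assumed yearly growth of the stream (0.1 = 10%)
    sigs_per_update: float   # assumed signatures verified per update
    verify_cost_sec: float   # assumed CPU seconds per un-assisted verification
    rib_paths: float         # paths held in the RIB today
    rib_growth: float        # assumed yearly growth of the RIB
    bytes_per_path: float    # assumed RAM per path, signatures included


def project(value: float, yearly_growth: float, years: int) -> float:
    """Compound a value forward at a fixed yearly growth rate."""
    return value * (1.0 + yearly_growth) ** years


def cpu_load(m: RouterModel, years: int) -> float:
    """CPU seconds of signature verification needed per wall-clock second.

    A result above the number of usable cores means the router cannot
    keep up with the stream under these assumptions.
    """
    stream = project(m.updates_per_sec + m.beacons_per_sec,
                     m.update_growth, years)
    return stream * m.sigs_per_update * m.verify_cost_sec


def rib_bytes(m: RouterModel, years: int) -> float:
    """Projected RIB memory in bytes after the given number of years."""
    return project(m.rib_paths, m.rib_growth, years) * m.bytes_per_path
```

Running cpu_load and rib_bytes for years 0 through 5, once with steady-state
inputs and once with worst-case inputs, and writing down where each input
came from, would give the kind of documented gut check discussed above.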
