Jeffrey Haas wrote 2018-11-07 20:56:

I guess my question to those who live in IGP land is how often is this a problem? In the case of an IGP, the backpressure means you have databases
that are out of sync and end up with bad forwarding.

As discussed below, if you have multiple flooding paths, and not all of
them are congested or throttled, when at least one copy of the LSP makes
it across, convergence will be fast.
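
To make the multiple-paths argument concrete, here is a minimal sketch. It assumes each flooding path can be summarized by a single delivery delay, which is a simplification. The function name and the numbers are hypothetical, not taken from any implementation.

```python
def arrival_time(path_delays):
    """Return when the first copy of an LSP arrives, in seconds.

    path_delays: one entry per flooding path; None marks a path that is
    stalled and never delivers its copy (an assumption of this sketch).
    """
    delivered = [d for d in path_delays if d is not None]
    if not delivered:
        return None  # every path is stalled: the LSDBs stay out of sync
    return min(delivered)  # the fastest copy decides convergence


# Hypothetical numbers: one throttled path (7 s) and one healthy path (50 ms).
print(arrival_time([7.0, 0.05]))
# All paths stalled: nothing gets through.
print(arrival_time([None, None]))
```

The receiver only needs the first copy, so a slow or throttled path matters only when every path is slow.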

Both iBGP and eBGP. The two general issues are slow receivers (scale) or
responses to dropped packets.

Slow receivers are a problem for native flooding too.
Although I suspect you mean: once the congestion clears and the situation improves, native flooding will recover quickly, while TCP might intentionally
keep things slow for a longer period of time. Correct?
Yes, that is an issue.

The general experiment I recommend to people trying to do this sort of thing is take your TCP stream of choice, pace it according to your transmission
needs, and then drop 5-30% of the packets.  Observe what happens.

Don't we have DSCP for this?
And remember, the TCP connection will (almost) always be between two
directly connected endpoints.

TCP recovers fine. But the hiccups can do bad things when timeliness is expected. For example, 3 second hold times for aggressive BGP peering may
time out.
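
A rough way to see how such a hiccup compares with a 3 second hold time is to add up the retransmission timeouts when the same segment is lost several times in a row. The sketch below assumes an initial RTO of 1 second that doubles after every timeout and is capped at 60 seconds. Real stacks derive the RTO from the measured RTT and often allow a smaller minimum, so treat these parameters as assumptions and not as the behaviour of any particular implementation.

```python
def stall_after_losses(consecutive_losses, initial_rto=1.0, max_rto=60.0):
    """Seconds a segment is delayed if it is lost `consecutive_losses` times
    in a row, with the RTO doubling after each timeout (capped at max_rto)."""
    stall = 0.0
    rto = initial_rto
    for _ in range(consecutive_losses):
        stall += rto                  # wait one full RTO before retransmitting
        rto = min(rto * 2, max_rto)   # exponential backoff
    return stall


HOLD_TIME = 3.0  # the aggressive hold time mentioned above

for losses in range(1, 5):
    stall = stall_after_losses(losses)
    verdict = "exceeds" if stall > HOLD_TIME else "within"
    print(f"{losses} consecutive losses: stalled {stall:.0f} s ({verdict} hold time)")
```

Under these assumptions, two back-to-back losses of the same segment already use up the whole 3 seconds, and a third goes well past it.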

Our proposal is to do only flooding over TCP.
Adjacency management is still done based on native IIHs (and BFD).
Even if TCP stalls the flooding, the adjacency should stay up.
With flooding over multiple paths, it should not be a fatal event.

I guess I'd restate my concern as "for this application, ensure that you're okay with the results of stalled transmission". Effectively, see the answer
to the question I asked above about native behaviors.

[Flooding happens over multiple paths. As long as one path is quick, convergence
 will be quick too].

This, I think, is a better point addressing my concerns.  Thanks.

I expect there will be more issues that need to be addressed.
E.g. an old rule of thumb is: don't generate a routing update (packet) unless you are pretty sure you can send it right away, and the receiver can receive it.
Otherwise a lot of the actual communication might be stale information.

I'm not sure if people find this rule of thumb still relevant these days. (I know people who do not). With an abundance of CPU cores, memory and bandwidth, it seems many problems of the past are not visible anymore, unless you start pushing beyond what most people do. But if you do care, it is advisable to keep your TCP window sizes small. Maybe at the default 16KB, maybe even 8KB or 4KB. With a window size of 4KB you might still be able to send a dozen average-size LSPs, and those might get stuck and go stale in TCP. But I think that's a good trade-off to get syncing of large LSDBs, as long as you don't set the window size to 64KB or larger. And maybe even then, stale LSPs might be less of a problem than old-timers think.
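
To put numbers on the window sizes above, here is a small sketch. The 340 byte average LSP size is an assumption, picked only because it makes a 4KB window hold about a dozen LSPs as described. The helper names are hypothetical. Capping the socket buffers is one possible way to bound how much data can sit in TCP, not necessarily how an implementation would do it.

```python
import socket

AVG_LSP_BYTES = 340  # assumption: chosen so that 4 KB holds about a dozen LSPs


def lsps_in_window(window_bytes, avg_lsp_bytes=AVG_LSP_BYTES):
    """Upper bound on whole LSPs that can be queued in a window of this size."""
    return window_bytes // avg_lsp_bytes


def make_flooding_socket(buffer_bytes=4096):
    """Create a TCP socket with small send and receive buffers.

    Set the sizes before connecting. The kernel may round, double or clamp
    the requested values, so read them back with getsockopt() if the exact
    size matters.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


for kb in (4, 8, 16, 64):
    count = lsps_in_window(kb * 1024)
    print(f"{kb:>2} KB window: up to {count} LSPs may be sitting in TCP")
```

With these assumed sizes, going from 4KB to 64KB raises the number of LSPs that can go stale inside TCP from about a dozen to almost two hundred.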

henk.

_______________________________________________
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr
