Re: [Lsr] Why only a congestion-avoidance algorithm on the sender isn't enough

Henk Smit Mon, 04 May 2020 02:48:10 -0700


On Friday I wrote:

> I still think we'll end up re-implementing a new (and weaker) TCP.


Christian Hopps wrote 2020-05-04 01:27:

> Let's not be too cynical at the start though! :)


I wasn't trying to be cynical.
Let me explain my line of reasoning two years ago.

When reading about the proposals for limiting the flooding topology
in IS-IS, I read a requirement doc. It said that the goal was to
support areas (flooding domains) of 10k routers. Or maybe even 100k
routers. My immediate thought was: "how are you gonna sync the LSDB
when a router boots up ? That takes 300 to 3000 seconds !?".
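
As a rough back-of-the-envelope check of those numbers, here is a minimal sketch. It assumes one LSP per router and a fixed pacing rate of about 33 LSPs per second towards the booting router; both are assumptions made for illustration and are not stated in the thread.

```python
# Back-of-the-envelope LSDB sync time.
# Assumptions (not from the thread): one LSP per router, and a sender
# paced at roughly 33 LSPs per second on the link to the booting router.

def sync_seconds(num_lsps: int, lsps_per_second: float) -> float:
    """Time to send num_lsps at a fixed pacing rate, ignoring retransmits."""
    return num_lsps / lsps_per_second

ASSUMED_RATE = 33.0  # hypothetical pacing rate, in LSPs per second

for routers in (10_000, 100_000):
    print(f"{routers} routers: ~{sync_seconds(routers, ASSUMED_RATE):.0f} s")
```

Under these assumptions the result lands in the same range as the 300 to 3000 seconds quoted above.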

This is the problem I wanted to solve. I hadn't even thought of
routers in dense topologies that have 1k+ neighbors.


There are currently heathens that use BGP as IGP in their data-centers.
There's even a cult that is developing a new IGP on top of BGP (LSVR).
If they think BGP/BGP-LS/LSVR are good choices for an IGP, why is that ?
One reason is that people claim that BGP is more scalable. Note that when
doing "Internet-routing" with a large number of prefixes, routers, or at
least some implementations of BGP, still sometimes need minutes, or dozens
of minutes, to process and re-advertise all those prefixes. So when we
talk about minutes, why do people think BGP is so much more wonderful ?

I think it's TCP. TCP can transport lots of info quickly and efficiently.

And conceptually TCP is easy to understand for the user ("you write
into a socket, you read from a socket on the other box. done").

If TCP is good enough for BGP bulk-transport, it should be good
enough for IS-IS bulk-transport.

If there are issues with using TCP for routing-protocols, I'm sure
we've solved those by now (in our implementations). We can use those
same solutions/tweaks we use for BGP's TCP in ISIS's TCP. Or am I
too naive now ?

BTW, all the implementations I've worked with used regular TCP. All
the Open Source BGPs seem to be using the regular TCP in their
kernels. Can someone explain why TCP is good for BGP but not for IS-IS ?

Almost 24 years ago, I sat on a bench in Santa Cruz discussing protocols
with an engineer who had a lot more experience than I had, and still have.

He was designing LDP at the time (with Yakov). LDP also uses TCP.
He said "if we had to design IS-IS now, of course we'd use TCP as
transport now". I never forgot that.


The goal here is not to make IS-IS transport optimal. We don't need to
use maximum available bandwidth. I just happen to think we need the
same 2 elements that TCP has: sender-side congestion-avoidance and
receiver-side flow-control. I hope I have explained why sender-side
congestion-control in IS-IS is not enough (you don't get the feedback
you need to make it work). Les and others have tried to explain
why receiver-side flow-control is hard to implement (the receiving
IS-IS might not know about the state of its interfaces, linecards, etc).

That's why I think we need both.
And when we implement both, it'll start to look like TCP.
So why not use TCP itself ?
Or Quic ? Or another transport that's already implemented ?
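
To make the "both elements" point concrete, here is a minimal sketch of a sender that is limited by the smaller of a sender-side congestion window and a receiver-advertised window, in the style of TCP. All names (`Sender`, `cwnd`, `rwnd`, `on_ack`, `on_loss`) and the additive-increase / halve-on-loss policy are assumptions for illustration; this is not a proposal from the thread and not any implementation's actual behaviour.

```python
# Sketch: sender-side congestion avoidance plus receiver-side flow control.
# Every name and constant here is hypothetical.

class Sender:
    def __init__(self, cwnd: int = 10, rwnd: int = 50):
        self.cwnd = cwnd      # sender's own estimate (congestion avoidance)
        self.rwnd = rwnd      # last window advertised by the receiver
        self.in_flight = 0    # LSPs sent but not yet acknowledged

    def can_send(self) -> int:
        """How many more LSPs may be sent right now."""
        return max(0, min(self.cwnd, self.rwnd) - self.in_flight)

    def on_send(self, n: int) -> None:
        self.in_flight += n

    def on_ack(self, acked: int, advertised_rwnd: int) -> None:
        """An ack arrived: grow cwnd additively and refresh rwnd."""
        self.in_flight = max(0, self.in_flight - acked)
        self.cwnd += 1
        self.rwnd = advertised_rwnd

    def on_loss(self) -> None:
        """Loss inferred (e.g. a retransmit timer fired): halve cwnd."""
        self.cwnd = max(1, self.cwnd // 2)
```

The point of the sketch is only that `can_send()` needs both inputs: `cwnd` reacts to loss the sender can infer, while `rwnd` carries what only the receiver can know.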

> I'd note that our environment is a bit more controlled than the
> end-to-end internet environment. In IS-IS we are dealing with single
> link (logical) so very simple solutions (CTS/RTS, ethernet PAUSE)
> could be viable.


Les's argument is that it's often not so controlled.

Let me ask you one question:
In your algorithm, the receiving IS-IS will send a "pause signal" when
it is overrun. How does IS-IS know it is overrun ? The router is dropping
IS-IS pdu's on the interface, on the linecard, on the queue between
linecards and Control Plane, on the IS-IS process's input-queue. When
queues are full, you can't send a message up saying "we didn't have space
for an IS-IS message, but we're sending you this message that we've just
dropped an IS-IS message".

How do you envision this works ?
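
The difficulty can be illustrated with a toy model, assuming a single bounded queue between the point of drop and the IS-IS process. The class name, the capacity, and the idea of a "drop notice" travelling through the same queue are all hypothetical; real routers have several queues and possibly other signalling paths.

```python
from collections import deque

class BoundedQueue:
    """Toy bounded queue that silently drops when full (hypothetical model)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def offer(self, item: str) -> bool:
        """Enqueue item if there is room; otherwise count a drop."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

q = BoundedQueue(capacity=3)
for i in range(5):
    if not q.offer(f"LSP-{i}"):
        # The drop notice needs space in the same full queue, so it is lost too.
        q.offer(f"notice: dropped LSP-{i}")
```

In this model the consumer only ever sees the first three LSPs and never learns that anything was dropped.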

Imho receiver-side flow-control can only send a rough upper-bound on how
many pdu's it can receive normally.

A solution with a "pause signal" is basically the same as a receiver-side
flow-control, where the receive-window is either 0 or infinite.
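
That equivalence can be written down as a short sketch. The function names are hypothetical, and the model assumes the sender tracks how many pdu's are in flight.

```python
import math

def window_from_pause(paused: bool) -> float:
    """A pause signal viewed as a receive window: 0 if paused, else unbounded."""
    return 0 if paused else math.inf

def allowed_to_send(in_flight: int, window: float) -> float:
    """Remaining budget under a receive window (may be math.inf)."""
    return max(0, window - in_flight)
```

A receiver that advertises a finite number, for example `allowed_to_send(5, 20)`, gives the sender a usable bound; the pause signal can only express the two extremes.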

> Thus our choice of algorithms may well be less restricted.


I'm looking forward to seeing (an outline of) your algorithm.

Again, I'm not pushing for TCP (anymore). I'm not pushing for anything.
I'm just trying to explain the problems that I see with solutions
that are, imho, a bit too simple to really help. Maybe I'm wrong, and
the problem is simpler than I think. Experimentation would be nice.

henk.

_______________________________________________
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr
