On 2018-07-19 02:29, Paul Koning wrote:


On Jul 18, 2018, at 8:22 PM, Johnny Billquist <[email protected]> wrote:

On 2018-07-19 02:07, Paul Koning wrote:
On Jul 18, 2018, at 7:18 PM, Johnny Billquist <[email protected]> wrote:

...

It's probably worth pointing out that the reason I implemented that was not 
because of hardware problems, but because of software problems. DECnet can 
degenerate pretty badly when packets are lost. And if you shove packets fast 
enough at the interface, the interface will (obviously) eventually run out of 
buffers, at which point packets will be dropped.
This is especially noticeable in DECnet/RSX at least. I think I know how to 
improve that software, but I have not had enough time to actually try fixing 
it. And it is especially noticeable when doing file transfers over DECnet.
All ARQ protocols suffer dramatically with packet loss.  The other day I was reading a 
recent paper about high speed long distance TCP.  It showed a graph of throughput vs. 
packet loss rate.  I forgot the exact numbers, but it was something like 0.01% packet 
loss rate causes a 90% throughput drop.  Compare that with the old (1970s) ARPAnet rule 
of thumb that 1% packet loss means 90% loss of throughput.  Those both make sense; the 
old one was for "high speed" links running at 56 kbps, rather than the 
multi-Gbps of current links.
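A rough way to see why tiny loss rates hurt so much is the well-known Mathis approximation for steady-state TCP throughput, BW ~ MSS / (RTT * sqrt(p)). A quick sketch (the formula is standard; the MSS and RTT numbers are illustrative, not from the paper mentioned above):

```python
from math import sqrt

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: BW ~ (MSS * 8) / (RTT * sqrt(p))."""
    return (mss_bytes * 8) / (rtt_s * sqrt(loss_rate))

# Illustrative numbers: 1460-byte MSS, 100 ms RTT.
# A 100x increase in loss rate (0.01% -> 1%) cuts throughput by 10x,
# since throughput scales with 1/sqrt(p).
bw_low_loss = tcp_throughput_bps(1460, 0.1, 0.0001)
bw_high_loss = tcp_throughput_bps(1460, 0.1, 0.01)
```

The square-root dependence is why the rule of thumb keeps reappearing at every link speed: the absolute numbers change, but small loss rates always translate into large throughput drops.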
The other issue with nontrivial packet loss is that in any protocol whose 
congestion control is triggered by packet loss (such as recent versions of 
DECnet), the flow control machinery will severely throttle the link under 
such conditions.
So yes, anything you can do in the infrastructure to keep the packet loss well 
under 1% is going to be very helpful indeed.

Right. That said, TCP behaves much better than DECnet here. At least if we 
talk about TCP with the ability to deal with out of order packets (which most 
implementations have) versus DECnet under RSX. The problem with DECnet under 
RSX is that recovering from a packet lost to congestion essentially guarantees 
that congestion will happen again, while TCP pretty quickly settles into a 
steady working state.

Out of order packet handling isn't involved in that.  Congestion doesn't 
reorder packets.  If you drop a packet, TCP and DECnet both force the 
retransmission of all packets starting with the dropped one.  (At least, I 
don't think selective ACK is used in TCP.)  DECnet describes out of order 
packet caching for the same reason TCP does: to work efficiently in network 
topologies that have multiple paths in which the routers do equal cost path 
splitting.  In DECnet, that support is optional; it's not in DECnet/E and I 
wouldn't expect it in other 16-bit platforms either.

This is maybe getting too technical, so let me know if we should take this off list.

Yes, congestion does not reorder packets. However, if you cannot handle out of order packets, you have to retransmit everything from the point where a packet was lost. If you can deal with packets out of order, you can keep the packets you received, even though there is a hole, and once that hole is plugged, you can ACK everything. And this is pretty normal in TCP, even without selective ACK.
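That cumulative-ACK jump can be sketched as a toy receiver (hypothetical class and names; real TCP stacks are of course far more involved):

```python
class OoOReceiver:
    """Toy receiver: buffers out-of-order segments and ACKs
    cumulatively. Illustrative sketch, not any real implementation."""
    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet received
        self.buffered = {}       # out-of-order segments held for later

    def receive(self, seq, data):
        self.buffered[seq] = data
        # Advance the cumulative ACK point past any hole just plugged.
        while self.next_expected in self.buffered:
            del self.buffered[self.next_expected]  # delivered to application
            self.next_expected += 1
        return self.next_expected  # cumulative ACK value

r = OoOReceiver()
r.receive(0, "a")          # in order: ACK advances to 1
r.receive(2, "c")          # hole at 1: segment kept, ACK stays at 1
ack = r.receive(1, "b")    # hole plugged: ACK jumps straight to 3
```

The key line is the `while` loop: filling one hole releases everything buffered behind it in a single step, so only the missing segment itself ever needs retransmission.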

So, in TCP, what normally happens is that a node is spraying packets as fast as it can. Some packets are lost, but not all of them, leaving holes in the sequence of received packets. After some time, or based on other heuristics, TCP starts retransmitting from the point where packets were lost, and as soon as the receiving end has plugged the hole, it jumps forward with the ACKs, meaning the sender does not need to retransmit everything. Even better, if the sender does retransmit everything, losing some of those retransmitted packets does not matter, since the receiver already has them anyway. At some point you get to a state where the receiver has no window open, so the transmitter is blocked, and every time the receiver opens up a window, which is usually just a packet or two in size, the transmitter can send that much data. That much data is usually less than the number of buffers the hardware has, so there is no problem receiving those packets, and TCP settles into a steady state where the transmitter sends packets as fast as the receiver can consume them. Apart from a few lost packets in the early stages, no packets are lost.

DECnet (at least in RSX), on the other hand, will transmit a whole bunch of packets. The first few will get through, but at some point one or several are lost. After some time, DECnet decides that packets were lost, backs up, and starts transmitting again from the point where the packets were lost. Once more it will soon blast out more packets than the receiver can process, and you get another timeout. DECnet backs off on the timeout every time this happens, and soon you are at a horrendous 127 s timeout for pretty much every other packet sent, meaning in effect you only manage to send one packet every 127 seconds. This is worsened, I think, by something that looks like a bug in the NFT/FAL code in RSX, where the code assumes it is faster than the packet transfer rate and can manage to do a few things before two packets have been received. How much is to blame on DECnet in general, and how much on NFT/FAL, I'm not entirely clear on. Like I said, I have not had time to really test this. But the problem is very easy to demonstrate: just set up an old PDP-11 and a simh (or similar) machine on the same DECnet, try to transfer a larger file to the real PDP-11, check the network counters, and observe how things immediately grind to a standstill.
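The backoff spiral described above can be sketched in a few lines (the doubling is my assumption about the mechanism; the 127-second cap is the value observed in RSX):

```python
def next_timeout(current_s, cap_s=127):
    """Double the retransmission timeout on each loss, capped at the
    127-second ceiling RSX DECnet reportedly ends up at."""
    return min(current_s * 2, cap_s)

t = 1
timeouts = []
for _ in range(10):          # ten consecutive loss/timeout rounds
    t = next_timeout(t)
    timeouts.append(t)
# timeouts: 2, 4, 8, 16, 32, 64, 127, 127, 127, 127
```

After a handful of consecutive losses the timer is pinned at the cap, which is why the transfer degrades to roughly one packet per 127 seconds rather than recovering.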

Which is why I implemented the throttling in the bridge, which Mark mentioned.
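That kind of throttling can be approximated by a token bucket; this is purely an illustrative sketch, not the bridge's actual code:

```python
import time

class TokenBucket:
    """Simple token-bucket rate limiter: admit a packet only if a
    token is available; tokens refill at `rate` per second, up to
    `burst`. Illustrative sketch only."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # hold or drop instead of flooding the slow host

tb = TokenBucket(rate=0, burst=3)      # rate=0 just to show the burst limit
results = [tb.admit() for _ in range(4)]   # three admitted, fourth refused
```

Pacing at the bridge keeps the slow host's receive buffers from overflowing in the first place, trading a short queueing delay for avoiding the timeout spiral above.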

As far as path splitting goes, it is implemented in RSX-11M-PLUS, but disabled. I tried enabling it once, but the system crashed. The manuals have it documented, but I'm wondering if DEC never actually completed the work.

I have not analyzed other DECnet implementation enough to tell for sure if they 
also exhibit the same problem.

Another consideration is that TCP has seen another 20 years of work on 
congestion control since DECnet Phase IV.  But in any case, it may well be that 
VMS handles these things better.  It's also possible that DECnet/OSI does, 
since it is newer and was designed right around the time that DEC very 
seriously got into congestion control algorithm research.  Phase IV isn't so 
well developed; it largely predates that work.

Well, this isn't really about congestion control so much as just being able to handle out of order packets. Although congestion control could certainly also be applied to alleviate the problem.

I know that OSI originally made the same basic assumption DECnet does: links are 100% reliable and never drop or reorder packets. A very bad assumption to build protocols on, and OSI eventually also defined links and operations for technologies where these assumptions were not true. So I would hope/assume that DECnet/OSI eventually got better, but I strongly suspect that was not the case from the start.

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: [email protected]             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
_______________________________________________
Simh mailing list
[email protected]
http://mailman.trailing-edge.com/mailman/listinfo/simh