Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Johnny Billquist

This is a bit offtopic, but anyway...

On 2024-05-28 05:02, Mouse wrote:

Where do we attach 3 priority levels to data?

[I]n the context of poll itself, it's undefined.  But it's easy to
think that the TCP urgent data would be something usable in this
context.  But as you note, the urgent data is a somewhat broken thing
that no one ever really figured out how it was meant to be used or
anything about it at all.


TCP's urgent pointer is well defined.  It is not, however, an
out-of-band data stream, nor, despite Berkeley's attempts, can it
really be twisted and bent into one, unless you are on a network which
is high bandwidth, low latency, and low loss (as compared to the
"out-of-band" data rate).  Even then, the receiving process has to
handle data promptly.  Which, probably not coincidentally, describes
Berkeley's network and most of their network programs at the time.

However, the urgent pointer is close to useless in today's network, in
that there are few-to-no use cases that it is actually useful for.


It was always useless. The original design clearly had an idea that they 
wanted to get something, but it was never clear exactly what that 
something was, and even less clear how the urgent pointer would provide it.


It's not a BSD problem, but a problem with the whole design. Which is 
why there is an RFC that basically says that it should all go away (RFC 
6093). But then BSD (and others) did try to make it somehow OOB, which 
makes it even more strange.


Oh well. As for poll, it tries to be generic, but at the same time, why 
did it then divide incoming data into three priority brackets? SysV 
stuff was never something that I got into.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Valery Ushakov
On Tue, May 28, 2024 at 02:33:48 +, David Holland wrote:

> anything other than the same set of vague descriptions we had in the
> older poll(2).

poll(2) is ... ok, I'm not even sure what adjective to use here.  I
had to write some async TCP poll code that needed to work on Linux,
Solaris and MacOS, and I also tested it on NetBSD - the behavior was
fairly noticeably different (half-close was half the fun).  Yet all
behaviors were conforming to what the vague descriptions in (various)
poll(2) manpages said.

-uwe


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Robert Elz
Date:        Tue, 28 May 2024 11:03:02 +0200
From:        Johnny Billquist
Message-ID:  <3853e930-4e77-4f6d-8a73-ec826a067...@softjar.se>

  | This is a bit offtopic, but anyway...

So it is, but anyway...

[Quoting Mouse:]
  | > TCP's urgent pointer is well defined.  It is not, however, an
  | > out-of-band data stream,

That's correct.

  | > However, the urgent pointer is close to useless in today's network, in
  | > that there are few-to-no use cases that it is actually useful for.

That's probably correct too.  It is however still used (and still works)
in telnet - though that is not a frequently used application any more.

[end Mouse quotes]

  | It was always useless. The original design clearly had an idea that they 
  | wanted to get something, but it was never clear exactly what that 
  | something was, and even less clear how the urgent pointer would provide it.

That's incorrect.   It is quite clear what was wanted, and aside from a
possible off by one in the original wording, was quite clear in how it
worked, and it did work.

The U bit in the header simply tells the receiver that there is some
data in the data stream (which is not sent out of band) that it probably
should see as soon as it can, and (perhaps, this depends upon the application)
that temporarily suspending any time consuming processing of the intervening
data (such as passing commands to a shell to be executed) would be a good
idea, until the "urgent" data has been processed.

The urgent pointer simply indicates where in the data stream the receiver
needs to have processed to have encountered the urgent data.  It does not
(and never did) "point to" the urgent data.   [That's where the off by one
occurred, there were two references to it, one suggested that the urgent
pointer would reference the final byte of what is considered urgent, the
other that it would reference one beyond that, that is, the first byte beyond
the urgent data.   This was corrected in the Hosts Requirements RFCs, somewhere
in the mid 80's if I remember them, roughly.]   The actual data considered
as urgent could be any number of bytes leading up to that, depending upon
the application protocol.   The application was expected to be able to
detect that, provided it actually saw it in the stream - the U bit (which
would remain set in every packet until one was sent containing no data
that included or preceded any of the urgent data) just allows the receiver
to know that something is coming which it might want to look for - but it
is entirely up to the application protocol design to decide how it is to
be recognised, and what should be done because of it ("nothing" could be
a reasonable answer in some cases).
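
A minimal sketch of that receiver behaviour, in Python.  Everything named
here is an assumption made for illustration: the in-band marker byte
(0x03), the newline-terminated command framing, and the names INTERRUPT
and consume.  The urgent_end argument stands in for what the TCP layer
derives from the urgent pointer, namely the stream offset up to which the
U bit stays set.  Discarding commands read in urgent mode is only one
possible policy (it is roughly what telnet's Synch does); the application
protocol decides.

```python
INTERRUPT = 0x03  # hypothetical in-band marker chosen by the application protocol


def consume(stream, urgent_end):
    """Read `stream` byte by byte, in sequence, and return (executed, interrupted).

    Bytes at offsets below `urgent_end` are read in "urgent mode": complete
    command lines are read but discarded instead of being executed.
    The marker itself is recognised purely by its content in the stream.
    """
    executed = []
    interrupted = False
    line = bytearray()
    for offset, byte in enumerate(stream):
        urgent = offset < urgent_end
        if byte == INTERRUPT:
            # The "stop that now" message: found by content, not by position.
            interrupted = True
            line.clear()
        elif byte == ord("\n"):
            if not urgent:
                executed.append(bytes(line))  # normal, possibly slow, processing
            line.clear()
        else:
            line.append(byte)
    return executed, interrupted


stream = b"sleep 1000\nls\n\x03date\n"
urgent_end = stream.index(b"\x03") + 1  # U bit stays set through the marker
print(consume(stream, urgent_end))
```

Nothing here arrives out of band or ahead of earlier data.  The U bit only
changes how eagerly the receiver reads what was already queued in front of
the marker.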


That is all very simple, and works very well, particularly on high
latency or lossy networks, as long as you're not expecting "urgent"
to mean "out of band" or "arrive quickly" or anything else like that.

It is (was) mostly used with telnet to handle things like interrupts,
where the telnet server would have received a command line, sent that
to the shell (command interpreter) to be processed, and is now waiting
for that to be complete before reading the next command - essentially
using the network, and the sender, as buffering so that it does not need
to grow indefinitely big buffers if the sender just keeps on sending
more and more.

In this situation, if the sender tries to abort a command, when someone
or something realises that it will never finish by itself, then (given that
TCP has no out of band data, which vastly decreases its complexity, and
by so doing increases its reliability) there's no way for the sender to
communicate with the server to convey a "stop that now" message.   And do
remember that all this was designed before unix existed (before RFC's existed,
you need to go back to the original IEN's) when operating systems didn't
work like unix does - it was possible that only one telnet connection
could be made to a destination host (not a TCP or telnet restriction, but
imposed by the OS not providing any kind of parallel processing or 
multi-tasking), so simply connecting again and killing the errant process
wasn't necessarily possible.   Character echo was often done by the
client, not by sending the echoed characters back from the server.
A very different world to the one we're used to.

The U bit (and the urgent pointer which is just a necessary accessory,
not the principal feature) allowed this to be handled.   When the client
had something that needed attention to send, it would send that as "urgent"
data.  But that would just go in sequence with previously sent data (which
in the case of telnet, where the receive window doesn't often fill, was
probably already in the network somewhere) - however the U bit can be set
in the header of every packet transmitted, including retransmits of earlier
data, or even in an in sequence, no data, packet, and will be - with the
sender sending a duplicate, or empty, packet if needed to g

Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Mouse
>>> However, the urgent pointer is close to useless in today's network,
>>> in that there are few-to-no use cases that it is actually useful
>>> for.
> That's probably correct too.  It is however still used (and still
> works) in telnet - though that is not a frequently used application
> any more.

I question whether it actually works except by accident; see RFC 6093.

> [That's where the off by one occurred, there were two references to
> it, one suggested that the urgent pointer would reference the final
> byte of what is considered urgent, the other that it would reference
> one beyond that, that is, the first byte beyond the urgent data.
> This was corrected in the Hosts Requirements RFCs, somewhere in the
> mid 80's if I remember them, roughly.]

But only a few implementors paid any attention, it appears.

> That is all very simple, and works very well, particularly on high
> latency or lossy networks, as long as you're not expecting "urgent"
> to mean "out of band" or "arrive quickly" or anything else like that.

But the facility it provides is of little-to-no use.  I can't recall
anything other than TELNET that actually uses it, though I am by no
stretch familiar with more than some of the commonest protocols out
there.

Furthermore, given that probably the most popular API to TCP, sockets,
botched it by trying to turn it into an out-of-band data stream, then
botched it further by pointing the urgent sequence number to the wrong
place, I'd say it is questionable whether it is good for _anything_ any
longer.
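
For anyone who wants to see what the sockets API does with this, here is
a small Python sketch over loopback TCP.  It assumes a Unix-like system
where select.poll is available and where the stack reports TCP urgent
data as POLLPRI and hands it back via MSG_OOB.  That is what I would
expect on Linux and the BSDs, but it is an assumption, and results may
differ by platform.  The function name urgent_poll_demo is made up for
this example.

```python
import select
import socket


def urgent_poll_demo(timeout_ms=2000):
    """Send b"a" normally and b"!" with MSG_OOB over loopback TCP.

    Returns (poll events seen on the receiver, byte read with MSG_OOB,
    data read with a normal recv).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        sender.sendall(b"a")
        sender.send(b"!", socket.MSG_OOB)  # sockets' "out-of-band" byte

        poller = select.poll()
        poller.register(receiver, select.POLLPRI)
        ready = poller.poll(timeout_ms)
        events = ready[0][1] if ready else 0

        # Only ask for the "OOB" byte if poll reported priority data.
        oob = receiver.recv(1, socket.MSG_OOB) if events & select.POLLPRI else b""
        normal = receiver.recv(16)  # ordinary in-sequence data
        return events, oob, normal
    finally:
        for s in (sender, receiver, listener):
            s.close()


if __name__ == "__main__":
    print(urgent_poll_demo())
```

Note that the API pulls a single byte out of the stream and presents it
separately, which is exactly the out-of-band reinterpretation at issue.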

> If an application needs a mechanism like this, it works well.

That's a bit like saying that car hand crank starter handles are useful
if you need them: strictly true, but to a first and even second
approximation both the statement and the thing stated about are
irrelevant to everyone.

Also, it's true only provided you don't use sockets for your API (or
fixed sockets - has anyone done a TCP socket interface that exposes the
urgent pointer _properly_?), and provided your and the peer's
implementations agree on which sequence number goes in the urgent
field.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML         mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Robert Elz
Date:        Tue, 28 May 2024 22:46:09 -0400 (EDT)
From:        Mouse
Message-ID:  <202405290246.waa17...@stone.rodents-montreal.org>

  | I question whether it actually works except by accident; see RFC 6093.

I hadn't seen that one before, I stopped seriously following the IETF
around the end of the last millennium, it was becoming way too commercially
based (decisions no longer based purely on technical merit) with way way
too much bureaucracy.

Aside from the middlebox problem (of which I wasn't aware -- and IMO
anything in the middle of the internet which touches anything above the
IP layer is simply broken - I know NAT forces a little of that,
but NAT is also simply broken) there is nothing new in there, except
that their change from the Hosts Requirement solution for the off-by-one
issue was the wrong way to go.   The HR group discussed that at length,
using the "last byte of the urgent data" is safe, using that + 1 is not,
in that a system receiving urgent data which believes it should be +0
will be waiting for one more byte to arrive, which might never be sent,
if the transmitter is using +1.   On the other hand, if the transmitter
uses +0 and the receiver is expecting it to be +1, all that happens is
that the U bit turns off one byte sooner, all the urgent data is still
there and available to be read (of course, if anything is pretending this
is one byte of out of band data they fail either way - but that, as that RFC
says, is simply broken).   (The uninteresting cases when both transmitter and
receiver use the same concept aren't worthy of mention).
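
The two mismatch cases can be checked with a few lines of arithmetic.
This is only a sketch in Python: stream offsets are counted from 0, the
function and parameter names are invented for the example, and it assumes
the urgent data is the last thing the transmitter sends.

```python
def bytes_receiver_waits_for(last_urgent, sender_plus, receiver_plus):
    """Stream bytes the receiver wants to have read before leaving urgent mode.

    last_urgent   -- 0-based stream offset of the final byte of urgent data
    sender_plus   -- 0 if the sender puts that offset in the urgent field,
                     1 if it puts the offset of the byte after it
    receiver_plus -- which of those two conventions the receiver assumes
    """
    field = last_urgent + sender_plus      # value carried in the header
    believed_last = field - receiver_plus  # where the receiver thinks urgent data ends
    return believed_last + 1


def receiver_stalls(last_urgent, sender_plus, receiver_plus):
    """True if the receiver waits for a byte that was never sent."""
    sent = last_urgent + 1  # assumption: nothing follows the urgent data
    return bytes_receiver_waits_for(last_urgent, sender_plus, receiver_plus) > sent


for sender_plus in (0, 1):
    for receiver_plus in (0, 1):
        print(sender_plus, receiver_plus,
              bytes_receiver_waits_for(9, sender_plus, receiver_plus),
              receiver_stalls(9, sender_plus, receiver_plus))
```

With ten bytes sent (offsets 0 to 9), only the combination of a +1
transmitter and a +0 receiver asks for an eleventh byte.  The opposite
mismatch leaves urgent mode after nine bytes, with the tenth still there
to be read.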

Of course, if essentially the whole internet has settled on the +1 version
(the original specification, instead of the example code) then perhaps
that change may have been warranted - I certainly haven't surveyed anything
to see which way various systems actually do it, and I expect a lot of
the original systems are long gone by now.

  | But only a few implementors paid any attention, it appears.

Does the BSD stack not do this the way that HR defined things?   I thought
that was changed way way back, before CSRG stopped generating the code.

  | But the facility it provides is of little-to-no use.  I can't recall
  | anything other than TELNET that actually uses it,

TELNET and those protocols based upon it (SMTP and FTP command at least).
SMTP has no actual use for urgent data, and never sends any, but FTP can
in some circumstances I believe (very ancient unreliable memory).

  | Furthermore, given that probably the most popular API to TCP, sockets,
  | botched it by trying to turn it into an out-of-band data stream,

Yes, that was broken.

  | then botched it further by pointing the urgent sequence number to
  | the wrong place,

In fairness, when that was done, it wasn't clear it was wrong - that
all long predated anyone even being aware that there were two different
meanings in the TCP spec, people just used whichever of them was most
convenient (in terms of how it was expressed, not which is easier to
implement) and ignored the other completely.   That's why it took
decades to get fixed - no-one knew that the spec was broken for a long
time.

Further, if used properly, it really doesn't matter much, the application
is intended to recognise the urgent data by its content in the data stream,
all the U bit (& urgent pointer) should be doing is giving it a boot up
the read stream to suggest that it should consume more quickly than it
otherwise would.  Whether that indication stops one byte earlier or later
should not really matter.

The text in that RFC about multiple urgent sequences also misses that I
think - all that matters is that as long as there is urgent data coming,
the application should be aware of that and modify its behaviour to read
more rapidly than it otherwise might (if it never delays reading from the
network, always receives & processes packets as soon as they arrive, which,
for example, systems which do remote end echo need to do, then it doesn't
need to pay attention to the U bit at all).

If there are multiple sequences that demand speedy processing, each should
be processed when it is encountered, and if that affects what is done
with other, "normal" data that is also being read quickly, that's just an
aspect of the application protocol.

kre

ps: I am not suggesting that anyone go design new protocols to use urgent
data, just that the system isn't nearly as broken as some people like to
claim.