Date: Wed, 23 Sep 2020 21:47:10 -0700
From: Vito Caputo <vcap...@pengaru.com>
Message-ID: <20200924044710.xpltp22bpxoxi...@shells.gnugeneration.com>
| It's useful if you're doing something like say, aggregating data from
| multiple piped sources into a single bytestream. With the default
| pipe behavior, you'd have the output interleaved at random boundaries.

If that's happening, then either the pipe implementation is badly broken,
or the applications using it aren't doing what you'd like them to do.

Writes (<= the pipe buffer size) have always (since ancient unix, probably
since pipes were first created) been atomic - nothing will randomly split
the data.

What the new option is offering (as best I can tell from the discussion
here, I am not a Linux user) is passing those boundaries through the pipe
to the reader - that hasn't been a pipe feature, but it is exactly what a
unix domain datagram socket provides. (These days pipes are sometimes
implemented using unix domain connection-oriented sockets ... I'm guessing
that the option simply changes the transport protocol used with an
implementation that works that way.)

| With packetized pipes, if your sources write say, newline-delimited
| text records, kept under PIPE_BUF length, the aggregated output would
| always interleave between the lines, never in the middle of them.

That happens with regular pipes.

| If we added this to the shell, I suppose the next thing to explore
| would be how to get all the existing core shell utilities to detect a
| packetized pipe on stdout and switch to a line-buffered mode instead
| of block-buffered, assuming they're using stdio.

I suspect that is really all you need - a mechanism to request
line-buffered output rather than blocksize-buffered. You don't need to go
fiddling with pipes for that, and abusing the pipe interface as a way to
pass a "line buffer this output please" request to the application seems
like the wrong way to achieve that to me.
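[A sketch of the atomicity point above, assuming a Linux-like system: several
writers share one pipe, each write() is at most PIPE_BUF bytes, and no record
ever comes out split. The tags and record sizes are illustrative only.]

```python
import os

# Demonstration: POSIX guarantees write()s of <= PIPE_BUF bytes to a pipe
# are atomic, so records from multiple writers interleave only at write()
# boundaries, never in the middle of a record.
r, w = os.pipe()
pipe_buf = os.pathconf(w, "PC_PIPE_BUF")    # typically 4096 on Linux

children = []
for tag in (b"A", b"B", b"C"):
    pid = os.fork()
    if pid == 0:                            # child: one writer per tag
        os.close(r)
        record = tag * 100 + b"\n"          # newline-delimited record
        assert len(record) <= pipe_buf      # must stay under PIPE_BUF
        for _ in range(200):
            os.write(w, record)             # one record per write() call
        os._exit(0)
    children.append(pid)

os.close(w)                                 # parent keeps only the read end
data = b""
while chunk := os.read(r, 65536):
    data += chunk
os.close(r)
for pid in children:
    os.waitpid(pid, 0)

# Every line is 100 copies of a single tag: nothing was split mid-record.
for line in data.splitlines():
    assert len(line) == 100 and len(set(line)) == 1
print("all", len(data.splitlines()), "records arrived intact")
```

With writes larger than PIPE_BUF the kernel may split them, and the
interleaving the quoted text complains about becomes possible.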
This isn't a criticism of the datagram packet pipe idea - there are
applications for that (a pipe is easier to use than manually setting up a
pair of unix domain datagram sockets) - but that is for specialised
applications, where for whatever reason the receiver needs to read just
one packet at a time (usually because of a desire to have multiple reading
applications, each taking the next request, and then processing it).

If there is just one receiving process, all that is needed is to stick a
record length before each packet sent to a normal pipe, and let the
receiver process the records from the aggregations it receives.

kre
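[A minimal sketch of the single-receiver alternative described above: prefix
each record with its length on an ordinary pipe and let the reader recover
the boundaries itself. The 4-byte big-endian header is an arbitrary choice
for illustration.]

```python
import os
import struct

def send_record(fd, payload):
    # Hypothetical framing: 4-byte big-endian length, then the payload.
    os.write(fd, struct.pack(">I", len(payload)) + payload)

def read_exact(fd, n):
    # Pipes may deliver fewer bytes than asked for; loop until we have n.
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("pipe closed mid-record")
        buf += chunk
    return buf

def recv_record(fd):
    # Read the length header first; clean EOF between records returns None.
    header = b""
    while len(header) < 4:
        chunk = os.read(fd, 4 - len(header))
        if not chunk:
            if header:
                raise EOFError("pipe closed mid-header")
            return None
        header += chunk
    (length,) = struct.unpack(">I", header)
    return read_exact(fd, length)

r, w = os.pipe()
for msg in (b"first", b"second record", b"third"):
    send_record(w, msg)
os.close(w)

records = []
while (rec := recv_record(r)) is not None:
    records.append(rec)
os.close(r)
print(records)
```

Unlike the packetized-pipe option, this needs no kernel support and works
through any byte stream, but it only helps when a single reader sees the
whole stream - with multiple competing readers the length headers and
payloads can be claimed by different processes.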