Re: RFE: enable buffering on null-terminated data

2024-03-20 Thread Carl Edquist via GNU coreutils General Discussion



On Tue, 19 Mar 2024, Zachary Santer wrote:


> On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku  wrote:
>
>> But what tee does is set up _IONBF on its output streams,
>> including stdout.
>
> So it doesn't buffer at all. Awesome. Nevermind.


Yay!  :D

And since tee uses fwrite to copy whatever input is available, that will 
mean 'records' are output on the same boundaries as the input (whether 
that be newlines, nuls, or just block boundaries).  So putting tee in the 
middle of a pipeline shouldn't itself interfere with whatever else you're 
up to.  (AND it's still relatively efficient, compared to some tools like 
cut that putchar a byte at a time.)
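
(If you want to see those write boundaries for yourself, you can count
them under strace - just a sketch, echoing the strace suggestion
elsewhere in this thread:

$ printf 'a b c\n' | strace -e trace=write tee /dev/null > /dev/null
$ printf 'a b c\n' | stdbuf -o0 strace -e trace=write cut -d' ' -f1- > /dev/null

The first should show tee writing each input chunk whole, while the
unbuffered cut should make one write(2) call per byte.)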


My note about pipelines like this though:

$ ./build.sh | sed s/what/ever/ | tee build.log

is that with the default stdio buffering, while all the commands in 
build.sh will be implicitly self-flushing, the sed in the middle will end 
up batching its output into blocks, so tee will also repeat them in 
blocks.


However, if stdbuf's magic env vars are exported in your shell (either by 
doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply 
by first starting a new shell with 'stdbuf -oL bash'), then every command 
in your pipelines will start with the new default line-buffered stdout. 
That way your line-items from build.sh should get passed all the way 
through the pipeline as they are produced.
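
For concreteness, both approaches just put stdbuf's two magic variables
into the environment that your pipeline commands will inherit:

$ export $(env -i stdbuf -oL env)   # export them into the current shell
$ stdbuf -oL bash                   # or: start a new shell with them preset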



(But, proof's in the pudding, so whatever works for you :D )


Happy putting all the way!

Carl


Re: RFE: enable buffering on null-terminated data

2024-03-19 Thread Zachary Santer
On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku  wrote:
>
> But what tee does is set up _IONBF on its output streams,
> including stdout.

So it doesn't buffer at all. Awesome. Nevermind.



Re: RFE: enable buffering on null-terminated data

2024-03-18 Thread Kaz Kylheku
On 2024-03-17 17:12, Zachary Santer wrote:
> On Thu, Mar 14, 2024 at 11:14 AM Carl Edquist  wrote:
> 
>> Where things get sloppy is if you add some stuff in a pipeline after your
>> build script, which results in things getting block-buffered along the
>> way:
>>
>> $ ./build.sh | sed s/what/ever/ | tee build.log
>>
>> And there you will definitely see a difference.
> 
> Sadly, the man page for stdbuf specifically calls out tee as being
> unaffected by stdbuf, because it adjusts the buffering of its standard
> streams itself. The script I mentioned pipes everything through tee,
> and I don't think I'm willing to refactor it not to. Ah well.

But what tee does is set up _IONBF on its output streams,
including stdout.



Re: RFE: enable buffering on null-terminated data

2024-03-17 Thread Zachary Santer
On Thu, Mar 14, 2024 at 11:14 AM Carl Edquist  wrote:

> Where things get sloppy is if you add some stuff in a pipeline after your
> build script, which results in things getting block-buffered along the
> way:
>
> $ ./build.sh | sed s/what/ever/ | tee build.log
>
> And there you will definitely see a difference.

Sadly, the man page for stdbuf specifically calls out tee as being
unaffected by stdbuf, because it adjusts the buffering of its standard
streams itself. The script I mentioned pipes everything through tee,
and I don't think I'm willing to refactor it not to. Ah well.

> Oh, I imagine "undefined operation" means something more like
> "unspecified" here.  stdbuf(1) uses setbuf(3), so the behavior you'll get
> should be whatever the setbuf(3) from the libc on your system does.
>
> I think all this means is that the C/POSIX standards are a bit loose about
> what is required of setbuf(3) when a buffer size is specified, and there
> is room in the standard for it to be interpreted as only a hint.

> Works for me (on glibc-2.23)

Thanks for setting me straight here.

> What may not be obvious is that the shell does not need to get involved
> with writing input for a coprocess or reading its output - the shell can
> start other (very fast) programs with input/output redirected to/from the
> coprocess pipes to do that processing.

Gosh, I'd like to see an example of that, too.

> My point though earlier was that a null-terminated record buffering mode,
> as useful as it sounds on the surface (for null-terminated paths), may
> actually be something _nobody_ has ever actually needed for an actual (not
> contrived) workflow.

It struck me years ago that this seemed like something people could need,
and I only thought to email the mailing lists about it last weekend.
Maybe there are all sorts of people out there who have been using
'stdbuf --output=0' on null-terminated data for years and never
thought to raise the issue. I know that's not a very strong argument,
though.



Re: RFE: enable buffering on null-terminated data

2024-03-14 Thread Carl Edquist via GNU coreutils General Discussion



On Mon, 11 Mar 2024, Zachary Santer wrote:

> On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist  wrote:
>
>> (In my coprocess management library, I effectively run every coproc
>> with --output=L by default, by eval'ing the output of 'env -i stdbuf
>> -oL env', because most of the time for a coprocess, that's what's
>> wanted/necessary.)
>
> Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.


Ah, no - I use the 'VAR=VAL command line' syntax so that it's specific to 
the command (it's not left exported to the shell).


Effectively the coprocess commands are run with

LD_PRELOAD=... _STDBUF_O=L command line

This allows running shell functions for the command line, which will all 
get the desired stdbuf behavior.  Because you can't pass a shell function 
(within the context of the current shell) as the command to stdbuf.


As far as I can tell, the stdbuf tool sets LD_PRELOAD (to point to 
libstdbuf.so) and your custom buffering options in _STDBUF_{I,O,E}, in the 
environment for the program it runs.  The double-env thing there is just a 
way to cleanly get exactly the env vars that stdbuf sets.  The values 
don't change, but since they are an implementation detail of stdbuf, it's 
a bit more portable to grab the values this way rather than hard code 
them.  This is done only once per shell session to extract the values and 
save them to a private variable, and then they are used for the command 
line as shown above.
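
(A minimal sketch of that mechanism - the variable and function names
here are made up, and the real library does more bookkeeping:

# once per shell session: grab exactly the env vars stdbuf would set
_stdbuf_vars=$(env -i stdbuf -oL env | tr '\n' ' ')

# shell functions work fine as the coprocess command line
my_filter () { sed s/what/ever/; }

# per coprocess: prefix the command line with those assignments
eval "$_stdbuf_vars my_filter" <in_fifo >out_fifo &

where in_fifo and out_fifo stand for the coprocess FIFOs.)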


Of course, if "command line" starts with "stdbuf --output=0" or whatever, 
that will override the new line-buffered default.



You can definitely export it to your shell though, either with 'set -a' 
like you said, or with the export command.  After that everything you run 
should get line-buffered stdio by default.



> I just added that to a script I have that prints lines output by another
> command that it runs, generally a build script, to the command line, but
> updating the same line over and over again. I want to see if it updates
> more continuously like that.


So, a lot of times build scripts run a bunch of individual commands. 
Each of those commands has an implied flush when it terminates, so you 
will get the output from each of them promptly (as each command 
completes), even without using stdbuf.


Where things get sloppy is if you add some stuff in a pipeline after your 
build script, which results in things getting block-buffered along the 
way:


$ ./build.sh | sed s/what/ever/ | tee build.log

And there you will definitely see a difference.


sloppy () {
    for x in {1..10}; do sleep .2; echo $x; done |
        sed s/^/:::/ | cat
}

{
    echo before:
    sloppy
    echo

    export $(env -i stdbuf -oL env)

    echo after:
    sloppy
}

> Yeah, there's really no way to break what I'm doing into a standard
> pipeline.


I admit I'm curious what you're up to  :)


> Of course, using line-buffered or unbuffered output in this situation
> makes no sense. Where it might be useful in a pipeline is when an
> earlier command in a pipeline might only print things occasionally, and
> you want those things transformed and printed to the command line
> immediately.


Right ... And in that case, losing the performance benefit of a larger 
block buffer is a smaller price to pay.


> My assumption is that line-buffering through setbuf(3) was implemented
> for printing to the command line, so its availability to stdbuf(1) is
> just a useful side effect.


Right, stdbuf(1) leverages setbuf(3).

setbuf(3) tweaks the buffering behavior of stdio streams (stdin, stdout, 
stderr, and anything else you open with, eg, fopen(3)).  It's not really 
limited to terminal applications, but yeah it makes it easier to ensure 
that your calls to printf(3) actually get output after each line (whether 
that's to a file or a pipe or a tty), without having to call an explicit 
fflush(3) of stdout every time.
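
(A quick way to watch line buffering matter at a pipe - just a sketch:

$ { echo one; sleep 2; echo two; } | grep . | cat
$ { echo one; sleep 2; echo two; } | stdbuf -oL grep . | cat

In the first, grep block-buffers because its stdout is a pipe, so "one"
and "two" arrive together at the end; in the second, "one" appears
immediately.)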


stdbuf(1) sets LD_PRELOAD to libstdbuf.so for your program, causing it to 
call setbuf(3) at program startup based on the values of _STDBUF_* in the 
environment (which stdbuf(1) also sets).


(That's my read of it anyway.)
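
You can also just look at what it sets; the LD_PRELOAD path below is
only an example and varies by system:

$ env -i stdbuf -oL env
LD_PRELOAD=/usr/libexec/coreutils/libstdbuf.so
_STDBUF_O=L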

> In the BUGS section in the man page for stdbuf(1), we see: On GLIBC
> platforms, specifying a buffer size, i.e., using fully buffered mode
> will result in undefined operation.


Eheh xD

Oh, I imagine "undefined operation" means something more like 
"unspecified" here.  stdbuf(1) uses setbuf(3), so the behavior you'll get 
should be whatever the setbuf(3) from the libc on your system does.


I think all this means is that the C/POSIX standards are a bit loose about 
what is required of setbuf(3) when a buffer size is specified, and there 
is room in the standard for it to be interpreted as only a hint.


> If I'm not mistaken, then buffer modes other than 0 and L don't
> actually work. Maybe I should count my blessings here. I don't know
> what's going on in the background that would explain glibc not
> supporting any of that, or stdbuf(1) implementing features that aren't
> supported on the vast majority of systems where it will be installed.

Re: RFE: enable buffering on null-terminated data

2024-03-11 Thread Zachary Santer
On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist  wrote:
>
> (In my coprocess management library, I effectively run every coproc with
> --output=L by default, by eval'ing the output of 'env -i stdbuf -oL env',
> because most of the time for a coprocess, that's what's wanted/necessary.)

Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.
I just added that to a script I have that prints lines output by
another command that it runs, generally a build script, to the command
line, but updating the same line over and over again. I want to see if
it updates more continuously like that.

> ... Although, for your example coprocess use, where the shell both
> produces the input for the coproc and consumes its output, you might be
> able to simplify things by making the producer and consumer separate
> processes.  Then you could do a simpler 'producer | filter | consumer'
> without having to worry about buffering at all.  But if the producer and
> consumer need to be in the same process (eg they share state and are
> logically interdependent), then yeah that's where you need a coprocess for
> the filter.

Yeah, there's really no way to break what I'm doing into a standard pipeline.

> (Although given your time output, you might say the performance hit for
> unbuffered is not that huge.)

We see a somewhat bigger difference, at least proportionally, if we
get bash more or less out of the way. See command-buffering, attached.

Standard:
real    0m0.202s
user    0m0.280s
sys     0m0.076s
Line-buffered:
real    0m0.497s
user    0m0.374s
sys     0m0.545s
Unbuffered:
real    0m0.648s
user    0m0.544s
sys     0m0.702s
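
[command-buffering is not included in the archive; presumably it is
something of this general shape - the same sort of filters run as a
straight pipeline in each buffering mode, details hypothetical:

time { seq 100000 | sed s/0/x/ | expand > /dev/null; }
time { seq 100000 | stdbuf -oL sed s/0/x/ | stdbuf -oL expand > /dev/null; }
time { seq 100000 | stdbuf -o0 sed s/0/x/ | stdbuf -o0 expand > /dev/null; }

]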

In coproc-buffering, unbuffered output was 21.7% slower than
line-buffered output, whereas here it's 30.4% slower.

Of course, using line-buffered or unbuffered output in this situation
makes no sense. Where it might be useful in a pipeline is when an
earlier command in a pipeline might only print things occasionally,
and you want those things transformed and printed to the command line
immediately.

> So ... again in theory I also feel like a null-terminated buffering mode
> for stdbuf(1) (and setbuf(3)) is kind of a missing feature.

My assumption is that line-buffering through setbuf(3) was implemented
for printing to the command line, so its availability to stdbuf(1) is
just a useful side effect.

In the BUGS section in the man page for stdbuf(1), we see:
On GLIBC platforms, specifying a buffer size, i.e., using fully
buffered mode will result in undefined operation.

If I'm not mistaken, then buffer modes other than 0 and L don't
actually work. Maybe I should count my blessings here. I don't know
what's going on in the background that would explain glibc not
supporting any of that, or stdbuf(1) implementing features that aren't
supported on the vast majority of systems where it will be installed.

> It may just
> be that nobody has actually had a real need for it.  (Yet?)

I imagine if anybody has, they just set --output=0 and moved on. Bash
scripts aren't the fastest thing in the world, anyway.


command-buffering
Description: Binary data


Re: RFE: enable buffering on null-terminated data

2024-03-11 Thread Carl Edquist via GNU coreutils General Discussion

On Sun, 10 Mar 2024, Zachary Santer wrote:


> On Sun, Mar 10, 2024 at 4:36 PM Carl Edquist  wrote:
>
>> Out of curiosity, do you have an example command line for your use case?
>
> My use for 'stdbuf --output=L' is to be able to run a command within a
> bash coprocess.


Oh, cool, now you're talking!  ;)


> (Really, a background process communicating with the parent process
> through FIFOs, since Bash prints a warning message if you try to run
> more than one coprocess at a time. Shouldn't make a difference here.)


(Kind of a side-note ... bash's limited coprocess handling was a long 
standing annoyance for me in the past, to the point that I wrote a bash 
coprocess management library to handle multiple active coprocesses and give 
convenient methods for interaction.  Perhaps the trickiest bit about 
multiple coprocesses open at once (which I suspect is the reason support 
was never added to bash) is that you don't want the second and subsequent 
coprocesses to inherit the pipe fds of prior open coprocesses.  This can 
result in deadlock if, for instance, you close your write end to coproc1, 
but coproc1 continues to wait for input because coproc2 also has a copy of 
a write end of the pipe to coproc1's input.  So you need to be smart about 
subsequent coprocesses first closing all fds associated with other 
coprocesses.


Word to the wise: you might encounter this issue (coproc2 prevents coproc1 
from seeing its end-of-input) even though you are rigging this up yourself 
with FIFOs rather than bash's coproc builtin.)
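
(A bare-bones illustration of that hazard, with explicit FIFOs - all
names made up:

mkfifo in1 out1 in2 out2
sed 's/^/1:/' <in1 >out1 &       # coproc1
exec {w1}>in1 {r1}<out1          # shell's handles to coproc1

sed 's/^/2:/' <in2 >out2 &       # coproc2 - inherits fd $w1!
exec {w2}>in2 {r2}<out2

exec {w1}>&-   # close our write end to coproc1 ...

... but coproc1 never sees EOF, because coproc2 still holds a copy of
the write end of coproc1's input pipe.  The fix is to start each new
coprocess with the others' fds explicitly closed, along the lines of
'sed 's/^/2:/' <in2 >out2 {w1}>&- {r1}<&- &'.)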




> See coproc-buffering, attached.


Thanks!

> Without making the command's output either line-buffered or unbuffered,
> what I'm doing there would deadlock. I feed one line in and then expect
> to be able to read a transformed line immediately. If that transformed
> line is stuck in a buffer that's still waiting to be filled, then
> nothing happens.
>
> I swear doing this actually makes sense in my application.


Yeah makes sense!  I am familiar with the problem you're describing.

(In my coprocess management library, I effectively run every coproc with 
--output=L by default, by eval'ing the output of 'env -i stdbuf -oL env', 
because most of the time for a coprocess, that's what's wanted/necessary.)



... Although, for your example coprocess use, where the shell both 
produces the input for the coproc and consumes its output, you might be 
able to simplify things by making the producer and consumer separate 
processes.  Then you could do a simpler 'producer | filter | consumer' 
without having to worry about buffering at all.  But if the producer and 
consumer need to be in the same process (eg they share state and are 
logically interdependent), then yeah that's where you need a coprocess for 
the filter.


... On the other hand, if the issue is that someone is producing one line 
at a time _interactively_ (that is, inputting text or commands from a 
terminal), then you might argue that the performance hit for unbuffered 
output will be insignificant compared to time spent waiting for terminal 
input.




> $ ./coproc-buffering 10
> Line-buffered:
> real    0m17.795s
> user    0m6.234s
> sys     0m11.469s
> Unbuffered:
> real    0m21.656s
> user    0m6.609s
> sys     0m14.906s


Yeah, this makes sense in your particular example.

It looks like expand(1) uses putchar(3), so in unbuffered mode this 
translates to one write(2) call for every byte.  sed(1) is not quite as 
bad - in unbuffered mode it appears to output the line and the newline 
terminator separately, so two write(2) calls for every line.


So in both cases (but especially for expand), line buffering reduces the 
number of write(2) calls.


(Although given your time output, you might say the performance hit for 
unbuffered is not that huge.)



> When I initially implemented this thing, I felt lucky that the data I
> was passing in were lines ending in newlines, and not null-terminated,
> since my script gets to benefit from 'stdbuf --output=L'.


:thumbsup:



> Truth be told, I don't currently have a need for --output=N.


Mmm-hmm  :)


> Of course, sed and all sorts of other Linux command-line tools can
> produce or handle null-terminated data.


Definitely.  So in the general case, theoretically it seems as useful to 
buffer output on nul bytes.


Note that for gnu sed in particular, there is a -u/--unbuffered option, 
which will effectively give you line buffered output, including buffering 
on nul bytes with -z/--null-data .
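
So, for instance, a coprocess filter like "sed -z -u 's|/.*||'" (keep
the first path component of each null-terminated record) should flush
each record as it is produced - just a sketch, but that's the idea.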


... I'll be honest though, I am having trouble imagining a realistic 
pipeline that filters filenames with embedded newlines using expand(1) 
;)


...

But, I want to be a good sport here and contrive an actual use case.

So for fun, say I want to use cut(1) (which performs poorly when 
unbuffered) in a coprocess that takes null-terminated file paths on input 
and outputs the first directory component (which possibly contains 
embedded newlines).


The basic command in the coprocess would be:

cut -d/ -f1 -z

but with the default 

Re: RFE: enable buffering on null-terminated data

2024-03-10 Thread Zachary Santer
On Sun, Mar 10, 2024 at 4:36 PM Carl Edquist  wrote:
>
> Hi Zack,
>
> This sounds like a potentially useful feature (it'd probably belong with a
> corresponding new buffer mode in setbuf(3)) ...
>
> > Filenames should be passed between utilities in a null-terminated
> > fashion, because the null byte is the only byte that can't appear within
> > one.
>
> Out of curiosity, do you have an example command line for your use case?

My use for 'stdbuf --output=L' is to be able to run a command within a
bash coprocess. (Really, a background process communicating with the
parent process through FIFOs, since Bash prints a warning message if
you try to run more than one coprocess at a time. Shouldn't make a
difference here.) See coproc-buffering, attached. Without making the
command's output either line-buffered or unbuffered, what I'm doing
there would deadlock. I feed one line in and then expect to be able to
read a transformed line immediately. If that transformed line is stuck
in a buffer that's still waiting to be filled, then nothing happens.

I swear doing this actually makes sense in my application.
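
[coproc-buffering is not included in the archive; its shape is roughly
the following sketch, with hypothetical names, and expand standing in
for the filter:

mkfifo to_cmd from_cmd
stdbuf --output=L expand <to_cmd >from_cmd &
exec {w}>to_cmd {r}<from_cmd

while IFS= read -r line; do
    printf '%s\n' "$line" >&"$w"     # feed one line in
    IFS= read -r transformed <&"$r"  # expect the result back immediately
done < input.txt

]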

$ ./coproc-buffering 10
Line-buffered:
real    0m17.795s
user    0m6.234s
sys     0m11.469s
Unbuffered:
real    0m21.656s
user    0m6.609s
sys     0m14.906s

When I initially implemented this thing, I felt lucky that the data I
was passing in were lines ending in newlines, and not null-terminated,
since my script gets to benefit from 'stdbuf --output=L'. Truth be
told, I don't currently have a need for --output=N. Of course, sed and
all sorts of other Linux command-line tools can produce or handle
null-terminated data.

> > If I want to buffer output data on null bytes, the closest I can get is
> > 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> > inefficient.
>
> I'm just thinking that find(1), for instance, will end up calling write(2)
> exactly once per filename (-print or -print0) if run under stdbuf
> unbuffered, which is the same as you'd get with a corresponding stdbuf
> line-buffered mode (newline or null-terminated).
>
> It seems that where line buffering improves performance over unbuffered is
> when there are several calls to (for example) printf(3) in constructing a
> single line.  find(1), and some filters like grep(1), will write a line at
> a time in unbuffered mode, and thus don't seem to benefit at all from line
> buffering.  On the other hand, cut(1) appears to putchar(3) a byte at a
> time, which in unbuffered mode will (like you say) be pretty inefficient.
>
> So, depending on your use case, a new null-terminated line buffered option
> may or may not actually improve efficiency over unbuffered mode.

I hadn't considered that.

> You can run your commands under strace like
>
>  stdbuf --output=X  strace -c -ewrite  command ... | ...
>
> to count the number of actual writes for each buffering mode.

I'm running bash in MSYS2 on a Windows machine, so hopefully that
doesn't invalidate any assumptions. Setting up strace around the
things within the coprocess, and only passing in one line, I now have
coproc-buffering-strace, attached. Giving the argument 'L', both sed
and expand call write() once. Giving the argument 0, sed calls write()
twice and expand calls it a bunch of times, seemingly once for each
character it outputs. So I guess that's it.

$ ./coproc-buffering-strace L
|Line with tabs   why?|

$ grep -c -F 'write:' sed-trace.txt expand-trace.txt
sed-trace.txt:1
expand-trace.txt:1

$ ./coproc-buffering-strace 0
|Line with tabs   why?|

$ grep -c -F 'write:' sed-trace.txt expand-trace.txt
sed-trace.txt:2
expand-trace.txt:30

> Carl
>
>
> PS, "find -printf" recognizes a '\c' escape to flush the output, in case
> that helps.  So "find -printf '%p\0\c'" would, for instance, already
> behave the same as "stdbuf --output=N  find -print0" with the new stdbuf
> output mode you're suggesting.
>
> (Though again, this doesn't actually seem to be any more efficient than
> running "stdbuf --output=0  find -print0")
>
> On Sun, 10 Mar 2024, Zachary Santer wrote:
>
> > Was "stdbuf feature request - line buffering but for null-terminated data"
> >
> > See below.
> >
> > On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady  wrote:
> >>
> >> On 09/03/2024 16:30, Zachary Santer wrote:
> >>> 'stdbuf --output=L' will line-buffer the command's output stream.
> >>> Pretty useful, but that's looking for newlines. Filenames should be
> >>> passed between utilities in a null-terminated fashion, because the
> >>> null byte is the only byte that can't appear within one.
> >>>
> >>> If I want to buffer output data on null bytes, the closest I can get
> >>> is 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> >>> inefficient.
> >>>
> >>> 0 means unbuffered, and Z is already taken for, I guess, zebibytes.
> >>> --output=N, then?
> >>>
> >>> Would this require a change to libc implementations, or is it possible 
> >>> now?
> >>
> >> This does seem like useful functionality,
> >> but it would require support for libc implementations first.

Re: RFE: enable buffering on null-terminated data

2024-03-10 Thread Carl Edquist via GNU coreutils General Discussion

Hi Zack,

This sounds like a potentially useful feature (it'd probably belong with a 
corresponding new buffer mode in setbuf(3)) ...


> Filenames should be passed between utilities in a null-terminated
> fashion, because the null byte is the only byte that can't appear
> within one.


Out of curiosity, do you have an example command line for your use case?

> If I want to buffer output data on null bytes, the closest I can get is
> 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> inefficient.


I'm just thinking that find(1), for instance, will end up calling write(2) 
exactly once per filename (-print or -print0) if run under stdbuf 
unbuffered, which is the same as you'd get with a corresponding stdbuf 
line-buffered mode (newline or null-terminated).


It seems that where line buffering improves performance over unbuffered is 
when there are several calls to (for example) printf(3) in constructing a 
single line.  find(1), and some filters like grep(1), will write a line at 
a time in unbuffered mode, and thus don't seem to benefit at all from line 
buffering.  On the other hand, cut(1) appears to putchar(3) a byte at a 
time, which in unbuffered mode will (like you say) be pretty inefficient.


So, depending on your use case, a new null-terminated line buffered option 
may or may not actually improve efficiency over unbuffered mode.



You can run your commands under strace like

stdbuf --output=X  strace -c -ewrite  command ... | ...

to count the number of actual writes for each buffering mode.


Carl


PS, "find -printf" recognizes a '\c' escape to flush the output, in case 
that helps.  So "find -printf '%p\0\c'" would, for instance, already 
behave the same as "stdbuf --output=N  find -print0" with the new stdbuf 
output mode you're suggesting.


(Though again, this doesn't actually seem to be any more efficient than 
running "stdbuf --output=0  find -print0")


On Sun, 10 Mar 2024, Zachary Santer wrote:


Was "stdbuf feature request - line buffering but for null-terminated data"

See below.

On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady  wrote:


On 09/03/2024 16:30, Zachary Santer wrote:

'stdbuf --output=L' will line-buffer the command's output stream.
Pretty useful, but that's looking for newlines. Filenames should be
passed between utilities in a null-terminated fashion, because the
null byte is the only byte that can't appear within one.

If I want to buffer output data on null bytes, the closest I can get
is 'stdbuf --output=0', which doesn't buffer at all. This is pretty
inefficient.

0 means unbuffered, and Z is already taken for, I guess, zebibytes.
--output=N, then?

Would this require a change to libc implementations, or is it possible now?


This does seem like useful functionality,
but it would require support for libc implementations first.

cheers,
Pádraig





RFE: enable buffering on null-terminated data

2024-03-10 Thread Zachary Santer
Was "stdbuf feature request - line buffering but for null-terminated data"

See below.

On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady  wrote:
>
> On 09/03/2024 16:30, Zachary Santer wrote:
> > 'stdbuf --output=L' will line-buffer the command's output stream.
> > Pretty useful, but that's looking for newlines. Filenames should be
> > passed between utilities in a null-terminated fashion, because the
> > null byte is the only byte that can't appear within one.
> >
> > If I want to buffer output data on null bytes, the closest I can get
> > is 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> > inefficient.
> >
> > 0 means unbuffered, and Z is already taken for, I guess, zebibytes.
> > --output=N, then?
> >
> > Would this require a change to libc implementations, or is it possible now?
>
> This does seem like useful functionality,
> but it would require support for libc implementations first.
>
> cheers,
> Pádraig