Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-28 Thread Zachary Santer
> On Sun, 21 Apr 2024, wrotycz wrote:
> >
> > It seems that it's 'interleaved' when buffer is written to a file or
> > pipe, and because stdout is buffered it waits until buffer is full or
> > flushed, while stderr is not and it doesn't wait and write immediately.
>
> Right; my point was just that stdout and stderr are still separate streams
> (with distinct buffers & buffering modes), even if fd 1 & 2 refer to the
> same pipe.

As I guess I should've expected, the behavior differs between a bash
script and a compiled program.

$ cat ./abc123
#!/bin/bash

printf '%s' 'a' >&2
printf '%s' '1'
printf '%s' 'b' >&2
printf '%s' '2'
printf '%s' 'c' >&2
printf '%s' '3'
printf '\n' >&2
printf '\n'

exit 0;
$ ./abc123
a1b2c3

$ ./abc123 2>&1 | cat
a1b2c3

$ cat ./abc123.c
#include <stdio.h>

int main()
{
  putc('a', stderr);
  putc('1', stdout);
  putc('b', stderr);
  putc('2', stdout);
  putc('c', stderr);
  putc('3', stdout);
  putc('\n', stderr);
  putc('\n', stdout);

  return 0;
}
$ gcc -o abc123.exe abc123.c
$ ./abc123.exe
a1b2c3

$ ./abc123.exe 2>&1 | cat
123
abc
$ stdbuf --output=0 --error=0 -- ./abc123.exe 2>&1 | cat
123
abc
$
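
A minimal sketch of why the bash version stays interleaved, assuming (as
far as I understand bash's behavior) that the printf builtin writes its
output out before it returns, so no stdout data is left sitting in a
buffer while a later write to stderr goes out. The function name emit is
only for illustration.

```shell
#!/bin/bash
# emit is a hypothetical helper that mirrors the ./abc123 script above.
# Every printf is a shell builtin; its output reaches the file descriptor
# before the next command starts, so nothing waits in a stdio buffer.
emit() {
  set -- a 1 b 2 c 3
  while [ "$#" -ge 2 ]; do
    printf '%s' "$1" >&2   # letter goes to stderr (fd 2)
    printf '%s' "$2"       # digit goes to stdout (fd 1)
    shift 2
  done
  printf '\n' >&2
  printf '\n'
}

# fd 1 and fd 2 refer to the same pipe; cat sees the bytes in call order.
emit 2>&1 | cat
```

The compiled program differs because its stdout goes through a stdio
buffer inside the process, which is only written out when it fills or
the program exits.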

I probably shouldn't go around assuming that things are smart. I'll
accept that adding logic to glibc to test if any given set of file
descriptors are pointing to the same file or pipe and ensuring that
anything written to any one of those file descriptors is always
actually written to the stream for the first one, for instance, would
probably be overkill.



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-27 Thread Carl Edquist



On Sat, 20 Apr 2024, Zachary Santer wrote:

> I don't know how buffering works when stdout and stderr get redirected
> to the same pipe. You'd think, whatever it is, it would have to be smart
> enough to keep them interleaved in the same order they were printed to
> in. That in mind, I would assume they both get placed into the same
> block buffer by default.

On Sun, 21 Apr 2024, wrotycz wrote:

> Sat, Apr 20, 2024 at 16:45 Carl Edquist wrote:
> >
> > However, stdout and stderr are still separate streams even if they
> > refer to the same output file/pipe/device, so partial lines are not
> > interleaved in the order that they were printed.
> >
> > will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as
> > $ ./abc123 2>&1 | cat
>
> It seems that it's 'interleaved' when buffer is written to a file or
> pipe, and because stdout is buffered it waits until buffer is full or
> flushed, while stderr is not and it doesn't wait and write immediately.


Right; my point was just that stdout and stderr are still separate streams 
(with distinct buffers & buffering modes), even if fd 1 & 2 refer to the 
same pipe.


Carl



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-20 Thread wrotycz
Sat, Apr 20, 2024 at 16:45 Carl Edquist wrote:
> However, stdout and stderr are still separate streams even if they refer
> to the same output file/pipe/device, so partial lines are not interleaved
> in the order that they were printed.
>
> will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as
> $ ./abc123 2>&1 | cat

It seems that it's 'interleaved' when buffer is written to a file or
pipe, and because stdout is buffered it waits until buffer is full or
flushed, while stderr is not and it doesn't wait and write immediately.


Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-20 Thread Carl Edquist

On Sat, 20 Apr 2024, Zachary Santer wrote:

> This was actually in RHEL 7.

Oh.  In that case it might be worth looking into ...

> I don't know how buffering works when stdout and stderr get redirected
> to the same pipe. You'd think, whatever it is, it would have to be smart
> enough to keep them interleaved in the same order they were printed to
> in. That in mind, I would assume they both get placed into the same
> block buffer by default.


My take is always to try it and find out.  Though in this case I think the 
default (without using stdbuf) is that the program's stderr is output to 
the pipe immediately (ie, unbuffered) on each library call (fprintf(3), 
fputs(3), putc(3), fwrite(3)), while stdout is written to the pipe at 
block boundaries - even though fd 1 and 2 refer to the same pipe.


If you force line buffering for stdout and stderr, that is likely what you 
want, and it will interleave _lines_ in the order that they were printed.


However, stdout and stderr are still separate streams even if they refer 
to the same output file/pipe/device, so partial lines are not interleaved 
in the order that they were printed.


For example:

#include <stdio.h>

int main()
{
putc('a', stderr);
putc('1', stdout);
putc('b', stderr);
putc('2', stdout);
putc('c', stderr);
putc('3', stdout);
putc('\n', stderr);
putc('\n', stdout);

return 0;
}

will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as

$ ./abc123 2>&1 | cat
or
$ stdbuf -oL -eL ./abc123 2>&1 | cat


...

Not that that's relevant for what you're doing :)

Carl




Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-20 Thread Zachary Santer
On Sat, Apr 20, 2024 at 11:58 AM Carl Edquist wrote:
>
> On Thu, 18 Apr 2024, Zachary Santer wrote:
> >
> > Finally had a chance to try to build with 'stdbuf --output=L --error=L
> > --' in front of the build script, and it caused some crazy problems.
>
> For what it's worth, when I was trying that out msys2 (since that's what
> you said you were using), I also ran into some very weird errors when just
> trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets.  It was
> weird because I didn't see issues when just running a command (including
> bash) directly under stdbuf.  I didn't get to the bottom of it though and
> I don't have access to a windows laptop any more to experiment.

This was actually in RHEL 7.

stdbuf --output=L --error=L -- "${@}" 2>&1 |
  tee log-file |
while IFS='' read -r line; do
  # do stuff
done
#

And then obviously the arguments to this script give the command I
want it to run.

> Also I might ask, why are you setting "--error=L" ?
>
> Not that this is the problem you're seeing, but in any case stderr is
> unbuffered by default, and you might mess up the output a bit by line
> buffering it, if it's expecting to output partial lines for progress or
> whatever.

I don't know how buffering works when stdout and stderr get redirected
to the same pipe. You'd think, whatever it is, it would have to be
smart enough to keep them interleaved in the same order they were
printed to in. That in mind, I would assume they both get placed into
the same block buffer by default.



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-20 Thread Carl Edquist via GNU coreutils General Discussion

On Thu, 18 Apr 2024, Zachary Santer wrote:

> On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
> >
> > However, if stdbuf's magic env vars are exported in your shell (either
> > by doing a trick like 'export $(env -i stdbuf -oL env)', or else more
> > simply by first starting a new shell with 'stdbuf -oL bash'), then
> > every command in your pipelines will start with the new default
> > line-buffered stdout. That way your line-items from build.sh should get
> > passed all the way through the pipeline as they are produced.
>
> Finally had a chance to try to build with 'stdbuf --output=L --error=L
> --' in front of the build script, and it caused some crazy problems.


For what it's worth, when I was trying that out msys2 (since that's what 
you said you were using), I also ran into some very weird errors when just 
trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets.  It was 
weird because I didn't see issues when just running a command (including 
bash) directly under stdbuf.  I didn't get to the bottom of it though and 
I don't have access to a windows laptop any more to experiment.
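
The variables in question can be inspected directly. A minimal sketch,
assuming GNU stdbuf is on PATH; stdbuf_vars is a hypothetical helper
name, and PATH is passed through (then filtered out of the listing) only
so that stdbuf and env can be found from the otherwise empty environment.

```shell
#!/bin/bash
# stdbuf_vars is a hypothetical helper: it prints the environment variables
# that stdbuf adds for the given options, starting from an empty environment.
stdbuf_vars() {
  env -i PATH="$PATH" stdbuf "$@" env | grep -v '^PATH='
}

if command -v stdbuf >/dev/null 2>&1; then
  # With GNU coreutils this should list an LD_PRELOAD entry pointing at
  # libstdbuf.so plus _STDBUF_O=L.
  stdbuf_vars -oL
else
  echo 'stdbuf not found; nothing to show' >&2
fi
```

Exporting exactly these values is what the 'export $(env -i stdbuf -oL
env)' trick amounts to, so every dynamically linked child process then
has libstdbuf.so preloaded.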


Also I might ask, why are you setting "--error=L" ?

Not that this is the problem you're seeing, but in any case stderr is 
unbuffered by default, and you might mess up the output a bit by line 
buffering it, if it's expecting to output partial lines for progress or 
whatever.


Carl


Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-19 Thread Zachary Santer
On Fri, Apr 19, 2024 at 8:26 AM Pádraig Brady wrote:
>
> Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice,
> noting that it's also supported on FreeBSD.

Alternatively, if glibc were modified to act on these hypothetical
environment variables, it would be trivial to have stdbuf simply set
those, to ensure backwards compatibility.

> I'm surprised that the LD_PRELOAD setting is breaking your ada build,
> and it would be interesting to determine the reason for that.

If I had that kind of time...



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-19 Thread Pádraig Brady

On 19/04/2024 12:36, Zachary Santer wrote:

> On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady wrote:
> >
> > env variables are what I proposed 18 years ago now:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=2457
>
> And the "resistance to that" from the Red Hat people 24 years ago is
> listed on a website that doesn't exist anymore.
>
> If I'm to argue with a guy from 18 years ago...
>
> Ulrich Drepper wrote:
> > Hell, no.  Programs expect a certain buffer mode and perhaps would work
> > unexpectedly if this changes.  By setting a mode to unbuffered, for
> > instance, you can easily DoS a system.  I can think about enough other
> > reasons why this is a terrible idea.  Programs explicitly must request
> > a buffering scheme so that it matches the way the program uses the
> > stream.
>
> If buffering were set according to the env vars before the program
> configures buffers on its end, if it chooses to, then the env vars
> have no effect. This is how the stdbuf util works, right now. Would
> programs that expect a certain buffer mode not set that mode
> explicitly themselves? Are you allowing untrusted users to set env
> vars for important daemons or something? How is this a valid concern?
>
> This is specific to the standard streams, 0-2. Buffering of stdout and
> stderr is already configured dynamically by libc. If it's going to a
> terminal, it's line-buffered. If it's not, it's fully buffered.


Playing devil's advocate, I guess programs may be depending
on the automatic buffering modes set.
I guess the thinking is that it was too easy to perturb
the system with env vars, though you can already do that with LD_PRELOAD.

Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice,
noting that it's also supported on FreeBSD.
I'm surprised that the LD_PRELOAD setting is breaking your ada build,
and it would be interesting to determine the reason for that.

cheers,
Pádraig



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-19 Thread Zachary Santer
On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady wrote:
>
> env variables are what I proposed 18 years ago now:
> https://sourceware.org/bugzilla/show_bug.cgi?id=2457

And the "resistance to that" from the Red Hat people 24 years ago is
listed on a website that doesn't exist anymore.

If I'm to argue with a guy from 18 years ago...

Ulrich Drepper wrote:
> Hell, no.  Programs expect a certain buffer mode and perhaps would work
> unexpectedly if this changes.  By setting a mode to unbuffered, for instance,
> you can easily DoS a system.  I can think about enough other reasons why this
> is a terrible idea.  Programs explicitly must request a buffering scheme so
> that it matches the way the program uses the stream.

If buffering were set according to the env vars before the program
configures buffers on its end, if it chooses to, then the env vars
have no effect. This is how the stdbuf util works, right now. Would
programs that expect a certain buffer mode not set that mode
explicitly themselves? Are you allowing untrusted users to set env
vars for important daemons or something? How is this a valid concern?

This is specific to the standard streams, 0-2. Buffering of stdout and
stderr is already configured dynamically by libc. If it's going to a
terminal, it's line-buffered. If it's not, it's fully buffered.
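
The terminal test described here can be mimicked from the shell. A rough
sketch, not glibc's actual code: default_stdout_mode is a hypothetical
helper that uses 'test -t' (the shell's counterpart to isatty(3)) to
report the mode a C program's stdout would be expected to start in on a
given file descriptor. As noted elsewhere in this thread, stderr is the
exception and starts out unbuffered either way.

```shell
#!/bin/bash
# default_stdout_mode is a hypothetical helper. It reports the buffering
# mode a C program's stdout is expected to get by default, judged only by
# whether the given file descriptor (default: 1) is a terminal.
default_stdout_mode() {
  if [ -t "${1:-1}" ]; then
    echo 'line-buffered'
  else
    echo 'fully buffered'
  fi
}

default_stdout_mode          # result depends on where this script's stdout goes
default_stdout_mode | cat    # stdout is a pipe here: prints "fully buffered"
```

This is why a program that looks responsive when run interactively can
appear to stall as soon as its output is piped into tee or a while-read
loop.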



Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-19 Thread Pádraig Brady

On 19/04/2024 01:16, Zachary Santer wrote:

> Was "RFE: enable buffering on null-terminated data"
>
> On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
> >
> > However, if stdbuf's magic env vars are exported in your shell (either by
> > doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply
> > by first starting a new shell with 'stdbuf -oL bash'), then every command
> > in your pipelines will start with the new default line-buffered stdout.
> > That way your line-items from build.sh should get passed all the way
> > through the pipeline as they are produced.
>
> Finally had a chance to try to build with 'stdbuf --output=L --error=L
> --' in front of the build script, and it caused some crazy problems. I
> was building Ada, though, so pretty good chance that part of the build
> chain doesn't link against libc at all.
>
> I got a bunch of
> ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from
> LD_PRELOAD cannot be preloaded: ignored.
>
> And then it somehow caused compiler errors relating to the size of
> what would be pointer types. Cleared out all the build products and
> tried again without stdbuf and everything was fine.
>
> From the original thread just within the coreutils email list, "stdbuf
> feature request - line buffering but for null-terminated data":
> On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku wrote:
> >
> > I would say that if it is implemented, the programs which require
> > it should all make provisions to set it up themselves.
> >
> > stdbuf is a hack/workaround for programs that ignore the
> > issue of buffering. Specifically, programs which send information
> > to one of the three standard streams, such that the information
> > is required in a timely way.  Those streams become fully buffered
> > when not connected to a terminal.
>
> I think I've partially come around to this point of view. However,
> instead of expecting all sorts of individual programs to implement
> their own buffering mode command-line options, could this be handled
> with environment variables, but without LD_PRELOAD? I don't know if
> libc itself can check for those environment variables and adjust each
> program's buffering on its own, but if so, that would be a much
> simpler solution.
>
> You could compare this to the various locale environment variables,
> though I think a lot of commands whose behavior differ from locale to
> locale do have to implement their own handling of that internally, at
> least to some extent.
>
> This seems like somewhat less of a hack, and if no part of a program
> looks for those environment variables, it isn't going to find itself
> getting broken by the dynamic linker. It's just not going to change
> its buffering.
>
> Additionally, things that don't link against libc could still honor
> these environment variables, if the developers behind them care to put
> in the effort.


env variables are what I proposed 18 years ago now:
https://sourceware.org/bugzilla/show_bug.cgi?id=2457

cheers,
Pádraig



Modify buffering of standard streams via environment variables (not LD_PRELOAD)?

2024-04-18 Thread Zachary Santer
Was "RFE: enable buffering on null-terminated data"

On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
>
> However, if stdbuf's magic env vars are exported in your shell (either by
> doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply
> by first starting a new shell with 'stdbuf -oL bash'), then every command
> in your pipelines will start with the new default line-buffered stdout.
> That way your line-items from build.sh should get passed all the way
> through the pipeline as they are produced.

Finally had a chance to try to build with 'stdbuf --output=L --error=L
--' in front of the build script, and it caused some crazy problems. I
was building Ada, though, so pretty good chance that part of the build
chain doesn't link against libc at all.

I got a bunch of
ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from
LD_PRELOAD cannot be preloaded: ignored.

And then it somehow caused compiler errors relating to the size of
what would be pointer types. Cleared out all the build products and
tried again without stdbuf and everything was fine.

From the original thread just within the coreutils email list, "stdbuf
feature request - line buffering but for null-terminated data":
On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku wrote:
>
> I would say that if it is implemented, the programs which require
> it should all make provisions to set it up themselves.
>
> stdbuf is a hack/workaround for programs that ignore the
> issue of buffering. Specifically, programs which send information
> to one of the three standard streams, such that the information
> is required in a timely way.  Those streams become fully buffered
> when not connected to a terminal.

I think I've partially come around to this point of view. However,
instead of expecting all sorts of individual programs to implement
their own buffering mode command-line options, could this be handled
with environment variables, but without LD_PRELOAD? I don't know if
libc itself can check for those environment variables and adjust each
program's buffering on its own, but if so, that would be a much
simpler solution.

You could compare this to the various locale environment variables,
though I think a lot of commands whose behavior differ from locale to
locale do have to implement their own handling of that internally, at
least to some extent.

This seems like somewhat less of a hack, and if no part of a program
looks for those environment variables, it isn't going to find itself
getting broken by the dynamic linker. It's just not going to change
its buffering.

Additionally, things that don't link against libc could still honor
these environment variables, if the developers behind them care to put
in the effort.

Zack