Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
> On Sun, 21 Apr 2024, wrotycz wrote:
>
> > It seems that it's 'interleaved' when buffer is written to a file or
> > pipe, and because stdout is buffered it waits until buffer is full or
> > flushed, while stderr is not and it doesn't wait and write immediately.
>
> Right; my point was just that stdout and stderr are still separate streams
> (with distinct buffers & buffering modes), even if fd 1 & 2 refer to the
> same pipe.

As I guess I should've expected, the behavior differs between a bash script and a compiled program.

$ cat ./abc123
#!/bin/bash

printf '%s' 'a' >&2
printf '%s' '1'
printf '%s' 'b' >&2
printf '%s' '2'
printf '%s' 'c' >&2
printf '%s' '3'
printf '\n' >&2
printf '\n'

exit 0;

$ ./abc123
a1b2c3

$ ./abc123 2>&1 | cat
a1b2c3

$ cat ./abc123.c
#include <stdio.h>

int main()
{
    putc('a', stderr);
    putc('1', stdout);
    putc('b', stderr);
    putc('2', stdout);
    putc('c', stderr);
    putc('3', stdout);
    putc('\n', stderr);
    putc('\n', stdout);

    return 0;
}

$ gcc -o abc123.exe abc123.c
$ ./abc123.exe
a1b2c3

$ ./abc123.exe 2>&1 | cat
123
abc
$ stdbuf --output=0 --error=0 -- ./abc123.exe 2>&1 | cat
123
abc
$

I probably shouldn't go around assuming that things are smart. I'll accept that adding logic to glibc to test whether any given set of file descriptors point to the same file or pipe, and then ensuring that anything written to any one of those file descriptors is always actually written to the stream for the first one, for instance, would probably be overkill.
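For illustration only, the "do these descriptors point to the same file or pipe" test mentioned above can be sketched with fstat(2) by comparing device and inode numbers. This is not something glibc is known to do for stdout and stderr; the function names `fds_share_target` and `stdout_and_stderr_share_target` are hypothetical, chosen just for this sketch.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns true when both descriptors refer to the same file, pipe, or
 * device, judged by device and inode number. Returns false if either
 * fstat() call fails (for example, on an invalid descriptor). */
bool fds_share_target(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) != 0 || fstat(fd_b, &b) != 0)
        return false;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Convenience wrapper for the case discussed in the thread: fd 1 and fd 2. */
bool stdout_and_stderr_share_target(void)
{
    return fds_share_target(STDOUT_FILENO, STDERR_FILENO);
}
```

Run under `2>&1 | cat`, `stdout_and_stderr_share_target()` would be expected to report true, since both descriptors are duplicates of the same pipe end. Knowing that is the cheap part; making two independently buffered streams preserve write order is the part that would need extra machinery.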
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Sat, 20 Apr 2024, Zachary Santer wrote:

> I don't know how buffering works when stdout and stderr get redirected to the same pipe. You'd think, whatever it is, it would have to be smart enough to keep them interleaved in the same order they were printed to in. That in mind, I would assume they both get placed into the same block buffer by default.

On Sun, 21 Apr 2024, wrotycz wrote:

> Sat, Apr 20, 2024 at 16:45 Carl Edquist wrote:
>
> > However, stdout and stderr are still separate streams even if they refer to the same output file/pipe/device, so partial lines are not interleaved in the order that they were printed.
> >
> > will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as
> >
> >     $ ./abc123 2>&1 | cat
>
> It seems that it's 'interleaved' when buffer is written to a file or pipe, and because stdout is buffered it waits until buffer is full or flushed, while stderr is not and it doesn't wait and write immediately.

Right; my point was just that stdout and stderr are still separate streams (with distinct buffers & buffering modes), even if fd 1 & 2 refer to the same pipe.

Carl
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
Sat, Apr 20, 2024 at 16:45 Carl Edquist wrote:

> However, stdout and stderr are still separate streams even if they refer to the same output file/pipe/device, so partial lines are not interleaved in the order that they were printed.
>
> will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as
>
>     $ ./abc123 2>&1 | cat

It seems that it's 'interleaved' when buffer is written to a file or pipe, and because stdout is buffered it waits until buffer is full or flushed, while stderr is not and it doesn't wait and write immediately.
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Sat, 20 Apr 2024, Zachary Santer wrote:

> This was actually in RHEL 7.

Oh. In that case it might be worth looking into ...

> I don't know how buffering works when stdout and stderr get redirected to the same pipe. You'd think, whatever it is, it would have to be smart enough to keep them interleaved in the same order they were printed to in. That in mind, I would assume they both get placed into the same block buffer by default.

My take is always to try it and find out. Though in this case I think the default (without using stdbuf) is that the program's stderr is output to the pipe immediately (i.e., unbuffered) on each library call (fprintf(3), fputs(3), putc(3), fwrite(3)), while stdout is written to the pipe at block boundaries - even though fd 1 and 2 refer to the same pipe.

If you force line buffering for stdout and stderr, that is likely what you want, and it will interleave _lines_ in the order that they were printed.

However, stdout and stderr are still separate streams even if they refer to the same output file/pipe/device, so partial lines are not interleaved in the order that they were printed.

For example:

    #include <stdio.h>

    int main()
    {
        putc('a', stderr);
        putc('1', stdout);
        putc('b', stderr);
        putc('2', stdout);
        putc('c', stderr);
        putc('3', stdout);
        putc('\n', stderr);
        putc('\n', stdout);

        return 0;
    }

will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as

    $ ./abc123 2>&1 | cat

or

    $ stdbuf -oL -eL ./abc123 2>&1 | cat

...

Not that that's relevant for what you're doing :)

Carl
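A way to try the explanation above without involving the real stdout and stderr is sketched below. It assumes a POSIX system; the function name `interleave_demo` and the temporary-file setup are inventions of this sketch, not part of the thread. Two stdio streams are opened on duplicates of one descriptor (as with `2>&1`), the "stderr-like" one is left unbuffered, and the "stdout-like" one gets whichever mode the caller passes to setvbuf(3).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Alternates single characters between an unbuffered "stderr-like" stream
 * and a "stdout-like" stream using out_mode (_IOFBF, _IOLBF or _IONBF).
 * Both streams write through duplicates of one descriptor, so they share a
 * file offset just like fd 1 and fd 2 after 2>&1.
 * Copies what reached the file into result (NUL-terminated).
 * Returns 0 on success, -1 on error. */
int interleave_demo(int out_mode, char *result, size_t result_size)
{
    if (result == NULL || result_size == 0)
        return -1;

    FILE *backing = tmpfile();
    if (backing == NULL)
        return -1;
    int fd = fileno(backing);

    FILE *out = fdopen(dup(fd), "w");
    FILE *err = fdopen(dup(fd), "w");
    if (out == NULL || err == NULL) {
        if (out != NULL) fclose(out);
        if (err != NULL) fclose(err);
        fclose(backing);
        return -1;
    }

    /* Buffering must be chosen before the first write to each stream. */
    setvbuf(out, NULL, out_mode, BUFSIZ);
    setvbuf(err, NULL, _IONBF, 0);

    const char *err_text = "abc\n";
    const char *out_text = "123\n";
    for (int i = 0; i < 4; i++) {
        putc(err_text[i], err);
        putc(out_text[i], out);
    }
    fclose(err);
    fclose(out); /* flushes whatever is still sitting in out's buffer */

    /* Read the file back through the raw descriptor. */
    ssize_t n = -1;
    if (lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, result, result_size - 1);
    fclose(backing);
    if (n < 0)
        return -1;
    result[n] = '\0';
    return 0;
}
```

With `_IOFBF` or `_IOLBF` for the stdout-like stream the file should end up holding "abc\n123\n"; only with `_IONBF` on both streams do the partial lines come out as "a1b2c3\n\n". Line buffering keeps whole lines in order, but it cannot interleave characters within a line.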
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Sat, Apr 20, 2024 at 11:58 AM Carl Edquist wrote:
>
> On Thu, 18 Apr 2024, Zachary Santer wrote:
> >
> > Finally had a chance to try to build with 'stdbuf --output=L --error=L
> > --' in front of the build script, and it caused some crazy problems.
>
> For what it's worth, when I was trying that out msys2 (since that's what
> you said you were using), I also ran into some very weird errors when just
> trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets. It was
> weird because I didn't see issues when just running a command (including
> bash) directly under stdbuf. I didn't get to the bottom of it though and
> I don't have access to a windows laptop any more to experiment.

This was actually in RHEL 7.

stdbuf --output=L --error=L -- "${@}" 2>&1 |
  tee log-file |
  while IFS='' read -r line; do
    # do stuff
  done
#

And then obviously the arguments to this script give the command I want it to run.

> Also I might ask, why are you setting "--error=L" ?
>
> Not that this is the problem you're seeing, but in any case stderr is
> unbuffered by default, and you might mess up the output a bit by line
> buffering it, if it's expecting to output partial lines for progress or
> whatever.

I don't know how buffering works when stdout and stderr get redirected to the same pipe. You'd think, whatever it is, it would have to be smart enough to keep them interleaved in the same order they were printed to in. That in mind, I would assume they both get placed into the same block buffer by default.
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Thu, 18 Apr 2024, Zachary Santer wrote:

> On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
>
> > However, if stdbuf's magic env vars are exported in your shell (either by doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply by first starting a new shell with 'stdbuf -oL bash'), then every command in your pipelines will start with the new default line-buffered stdout. That way your line-items from build.sh should get passed all the way through the pipeline as they are produced.
>
> Finally had a chance to try to build with 'stdbuf --output=L --error=L --' in front of the build script, and it caused some crazy problems.

For what it's worth, when I was trying that out msys2 (since that's what you said you were using), I also ran into some very weird errors when just trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets. It was weird because I didn't see issues when just running a command (including bash) directly under stdbuf. I didn't get to the bottom of it though and I don't have access to a windows laptop any more to experiment.

Also I might ask, why are you setting "--error=L" ?

Not that this is the problem you're seeing, but in any case stderr is unbuffered by default, and you might mess up the output a bit by line buffering it, if it's expecting to output partial lines for progress or whatever.

Carl
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Fri, Apr 19, 2024 at 8:26 AM Pádraig Brady wrote:
>
> Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice,
> noting that it's also supported on FreeBSD.

Alternatively, if glibc were modified to act on these hypothetical environment variables, it would be trivial to have stdbuf simply set those, to ensure backwards compatibility.

> I'm surprised that the LD_PRELOAD setting is breaking your ada build,
> and it would be interesting to determine the reason for that.

If I had that kind of time...
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On 19/04/2024 12:36, Zachary Santer wrote:

> On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady wrote:
>
> > env variables are what I proposed 18 years ago now:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=2457
>
> And the "resistance to that" from the Red Hat people 24 years ago is listed on a website that doesn't exist anymore.
>
> If I'm to argue with a guy from 18 years ago...
>
> Ulrich Drepper wrote:
>
> > Hell, no. Programs expect a certain buffer mode and perhaps would work unexpectedly if this changes. By setting a mode to unbuffered, for instance, you can easily DoS a system. I can think about enough other reasons why this is a terrible idea. Programs explicitly must request a buffering scheme so that it matches the way the program uses the stream.
>
> If buffering were set according to the env vars before the program configures buffers on its end, if it chooses to, then the env vars have no effect. This is how the stdbuf util works, right now. Would programs that expect a certain buffer mode not set that mode explicitly themselves? Are you allowing untrusted users to set env vars for important daemons or something? How is this a valid concern?
>
> This is specific to the standard streams, 0-2. Buffering of stdout and stderr is already configured dynamically by libc. If it's going to a terminal, it's line-buffered. If it's not, it's fully buffered.

Playing devil's advocate, I guess programs may be depending on the automatic buffering modes set. I guess the thinking is that it was too easy to perturb the system with env vars, though you can already do that with LD_PRELOAD.

Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice, noting that it's also supported on FreeBSD.

I'm surprised that the LD_PRELOAD setting is breaking your ada build, and it would be interesting to determine the reason for that.

cheers,
Pádraig
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady wrote:
>
> env variables are what I proposed 18 years ago now:
> https://sourceware.org/bugzilla/show_bug.cgi?id=2457

And the "resistance to that" from the Red Hat people 24 years ago is listed on a website that doesn't exist anymore.

If I'm to argue with a guy from 18 years ago...

Ulrich Drepper wrote:
> Hell, no. Programs expect a certain buffer mode and perhaps would work
> unexpectedly if this changes. By setting a mode to unbuffered, for instance,
> you can easily DoS a system. I can think about enough other reasons why this
> is a terrible idea. Programs explicitly must request a buffering scheme so
> that it matches the way the program uses the stream.

If buffering were set according to the env vars before the program configures buffers on its end, if it chooses to, then the env vars have no effect. This is how the stdbuf util works, right now. Would programs that expect a certain buffer mode not set that mode explicitly themselves? Are you allowing untrusted users to set env vars for important daemons or something? How is this a valid concern?

This is specific to the standard streams, 0-2. Buffering of stdout and stderr is already configured dynamically by libc. If it's going to a terminal, it's line-buffered. If it's not, it's fully buffered.
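The dynamic default described above can be written down as a small decision function. This is only a sketch of the conventional behavior, assuming a POSIX system, not glibc's actual code; the names `default_mode_for` and `buffer_mode_name` are hypothetical. It also folds in the point made elsewhere in this thread that stderr is unbuffered by default, so the terminal test really only decides the mode for stdout.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns the buffering mode a C library conventionally picks by default:
 * stderr is unbuffered; any other descriptor is line-buffered when it is a
 * terminal and fully buffered otherwise (file, pipe, socket, ...). */
int default_mode_for(int fd)
{
    if (fd == STDERR_FILENO)
        return _IONBF;
    return isatty(fd) ? _IOLBF : _IOFBF;
}

/* Human-readable name for a setvbuf() mode constant. */
const char *buffer_mode_name(int mode)
{
    switch (mode) {
    case _IONBF: return "unbuffered";
    case _IOLBF: return "line-buffered";
    case _IOFBF: return "fully buffered";
    default:     return "unknown";
    }
}
```

A program that cares about timely output can call `setvbuf(stdout, NULL, _IOLBF, 0)` before its first write and stop depending on this default, which is the "programs should set it up themselves" position quoted in the thread.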
Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
On 19/04/2024 01:16, Zachary Santer wrote:

> Was "RFE: enable buffering on null-terminated data"
>
> On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
>
> > However, if stdbuf's magic env vars are exported in your shell (either by doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply by first starting a new shell with 'stdbuf -oL bash'), then every command in your pipelines will start with the new default line-buffered stdout. That way your line-items from build.sh should get passed all the way through the pipeline as they are produced.
>
> Finally had a chance to try to build with 'stdbuf --output=L --error=L --' in front of the build script, and it caused some crazy problems. I was building Ada, though, so pretty good chance that part of the build chain doesn't link against libc at all. I got a bunch of
>
> ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from LD_PRELOAD cannot be preloaded: ignored.
>
> And then it somehow caused compiler errors relating to the size of what would be pointer types. Cleared out all the build products and tried again without stdbuf and everything was fine.
>
> From the original thread just within the coreutils email list, "stdbuf feature request - line buffering but for null-terminated data":
>
> On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku wrote:
>
> > I would say that if it is implemented, the programs which require it should all make provisions to set it up themselves.
> >
> > stdbuf is a hack/workaround for programs that ignore the issue of buffering. Specifically, programs which send information to one of the three standard streams, such that the information is required in a timely way. Those streams become fully buffered when not connected to a terminal.
>
> I think I've partially come around to this point of view. However, instead of expecting all sorts of individual programs to implement their own buffering mode command-line options, could this be handled with environment variables, but without LD_PRELOAD?
>
> I don't know if libc itself can check for those environment variables and adjust each program's buffering on its own, but if so, that would be a much simpler solution. You could compare this to the various locale environment variables, though I think a lot of commands whose behavior differ from locale to locale do have to implement their own handling of that internally, at least to some extent.
>
> This seems like somewhat less of a hack, and if no part of a program looks for those environment variables, it isn't going to find itself getting broken by the dynamic linker. It's just not going to change its buffering.
>
> Additionally, things that don't link against libc could still honor these environment variables, if the developers behind them care to put in the effort.

env variables are what I proposed 18 years ago now:
https://sourceware.org/bugzilla/show_bug.cgi?id=2457

cheers,
Pádraig
Modify buffering of standard streams via environment variables (not LD_PRELOAD)?
Was "RFE: enable buffering on null-terminated data"

On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist wrote:
>
> However, if stdbuf's magic env vars are exported in your shell (either by
> doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply
> by first starting a new shell with 'stdbuf -oL bash'), then every command
> in your pipelines will start with the new default line-buffered stdout.
> That way your line-items from build.sh should get passed all the way
> through the pipeline as they are produced.

Finally had a chance to try to build with 'stdbuf --output=L --error=L --' in front of the build script, and it caused some crazy problems. I was building Ada, though, so pretty good chance that part of the build chain doesn't link against libc at all. I got a bunch of

ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from LD_PRELOAD cannot be preloaded: ignored.

And then it somehow caused compiler errors relating to the size of what would be pointer types. Cleared out all the build products and tried again without stdbuf and everything was fine.

From the original thread just within the coreutils email list, "stdbuf feature request - line buffering but for null-terminated data":

On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku wrote:
>
> I would say that if it is implemented, the programs which require
> it should all make provisions to set it up themselves.
>
> stdbuf is a hack/workaround for programs that ignore the
> issue of buffering. Specifically, programs which send information
> to one of the three standard streams, such that the information
> is required in a timely way. Those streams become fully buffered
> when not connected to a terminal.

I think I've partially come around to this point of view. However, instead of expecting all sorts of individual programs to implement their own buffering mode command-line options, could this be handled with environment variables, but without LD_PRELOAD?

I don't know if libc itself can check for those environment variables and adjust each program's buffering on its own, but if so, that would be a much simpler solution. You could compare this to the various locale environment variables, though I think a lot of commands whose behavior differ from locale to locale do have to implement their own handling of that internally, at least to some extent.

This seems like somewhat less of a hack, and if no part of a program looks for those environment variables, it isn't going to find itself getting broken by the dynamic linker. It's just not going to change its buffering.

Additionally, things that don't link against libc could still honor these environment variables, if the developers behind them care to put in the effort.

Zack
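To make the proposal concrete, here is a rough sketch of what honoring such a variable could look like if a program (or, hypothetically, libc at startup) did it directly instead of via LD_PRELOAD. Everything here is an assumption for illustration: the function names `parse_buffer_mode` and `apply_buffer_env` are made up, and the value syntax ("L", "0", or a byte count) is simply borrowed from stdbuf's MODE argument. No existing libc is claimed to implement this.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parses a stdbuf-style mode string: "L" means line-buffered, "0" means
 * unbuffered, and a positive decimal byte count means fully buffered.
 * Returns 0 and fills in *mode and *size on success, -1 otherwise. */
int parse_buffer_mode(const char *value, int *mode, size_t *size)
{
    if (value == NULL || *value == '\0')
        return -1;
    if (strcmp(value, "L") == 0) { *mode = _IOLBF; *size = 0; return 0; }
    if (strcmp(value, "0") == 0) { *mode = _IONBF; *size = 0; return 0; }

    /* Reject signs and leading whitespace that strtoul() would accept. */
    if (!isdigit((unsigned char)value[0]))
        return -1;

    char *end;
    errno = 0;
    unsigned long n = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || n == 0)
        return -1;

    *mode = _IOFBF;
    *size = (size_t)n;
    return 0;
}

/* Applies the mode named by environment variable var_name to stream.
 * Must be called before any other operation on the stream.
 * Returns 1 if a mode was applied, 0 if the variable is unset, and -1 on
 * an unrecognized value or a setvbuf() failure. */
int apply_buffer_env(FILE *stream, const char *var_name)
{
    const char *value = getenv(var_name);
    int mode;
    size_t size;

    if (value == NULL)
        return 0;
    if (parse_buffer_mode(value, &mode, &size) != 0)
        return -1;
    /* With a NULL buffer, the C standard only says size "may" determine
     * how much setvbuf() allocates, so treat the size as a hint here. */
    return setvbuf(stream, NULL, mode, size) == 0 ? 1 : -1;
}
```

A program would call something like `apply_buffer_env(stdout, "_STDBUF_O")` as the first statement of `main`, and a later explicit `setvbuf` in the program would not be possible on an already-used stream, so the ordering question raised in this thread (environment first, program's own choice after) still has to be settled by whoever implements it.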