Re: [Toybox] [PATCH] xputs: Do flush

enh via Toybox Mon, 20 May 2024 05:36:46 -0700

On Sun, May 19, 2024 at 10:56 PM Rob Landley <r...@landley.net> wrote:
>
> On 5/18/24 21:53, Yi-Yo Chiang wrote:
> > What I wanted to address with this patch are:
> >
> > 1. Fix this line of
> > xputs() 
> > https://github.com/landley/toybox/blob/master/toys/net/microcom.c#L113
> > The prompt text is not flushed immediately, so it is not shown to the user 
> > until
> > the escape char is entered (which defeats the purpose of the prompt, that 
> > is to
>
> I agree you've identified two problems (unflushed prompt, comment not matching
> code) that both need to be fixed. I'm just unhappy with the solutions, and am
> concerned about a larger design problem.
>
> I implemented TOYFLAG_NOBUF and applied it to this command. The result 
> compiles
> but I'm not near serial hardware at the moment, does it fix the problem for 
> you?
>
> Trying to fix it via micromanagement (adding more flushing and switching some
> but not all output contexts in the same command between FILE * and file
> descriptor) makes my head hurt...
>
> Adding flushing to xputs() is an Elliott question is because he's the one who
> can presumably keep stdio buffer semantics in his head. They make zero sense 
> to
> me. I added a flush to xputsn() because in line buffering mode it was the
> "dangerous" one that had no newline thus no flush, but then when we go to 
> block
> buffering mode xputs() needs a flush just like xputsn() would, and MAYBE it's
> good to have the flush because in line buffer mode it would be a NOP? Except 
> the
> command selected block buffering mode precisely BECAUSE it didn't want to 
> flush
> each line, so why should xputs() do it when the command asked not to? And if
> xputs() is doing it, why is xprintf() _not_ doing it? And if xprintf() _is_
> doing it, then we're back to basically not buffering the output...


this to me was exactly why it should be "everything flushes" or
"nothing flushes". not "some subset of calls for some subset of
inputs", because no-one can keep all the special cases in their heads.
and "everything flushes" is problematic from a performance
perspective, so "nothing flushes" wins by default. (but, yes, when we
have our own kernel, have a time-based component to buffering layer's
flushing is an interesting idea :-) )

> > tell the user what the escape char is) and stdout is flushed by handle_esc.
> > To fix this we either make xputs() flush automatically, or we just add a 
> > single
> > line of manual flush() after xputs() in microcom.c.
> > Either is fine with me.
>
> When I searched for the first xputs in microcom I got:
>
>   xputsn("\r\n[b]reak, [p]aste file, [q]uit: ");
>   if (read(0, &input, 1)<1 || input == CTRL('D') || input == 'q') {
>
> Which is a separate function (the n version is no newline, it does not add the
> newline the way libc puts() traditionally does), with its own flushing
> semantics: xputsn() doesn't call xputs(), and neither calls or is called by
> xprintf(). "Some functions flush, some functions don't" is a bit of a design
> sharp edge.

(yeah, exactly.)

> The larger problem is manual flushing of the output buffer is a terrible
> interface, and leads to missing error checking without which a command won't
> reliably exit when its output terminal closes because the whole SIGPIPE thing
> was its own can of worms that even bionic used to manually mess with. Which is
> why I originally made toybox not ever do that (systemic fix) but I got
> complaints about performance.
>
> Your other patch changes a bunch of xprintf() to dprintf() which is even 
> _more_
> fun because dprintf() writes directly to the file descriptor (bypassing the
> buffer in the libc global FILE * instance "stdio"), which means in the absence
> of manual flushing the dprintf() data will go out first and then the stdio 
> data
> will go out sometime later, in the wrong order. Mixing the two output formats
> tends to invert the order that way, unless you disable output buffering.
>
> I like systematic solutions that mean the problem never comes up again. 
> Elliott
> has been advocating the whack-a-mole "fix them as we find them" approach here
> which I am not comfortable with. I've been leaning towards adding a
> TOYFLAG_NOBUF so we can select a third buffering type, and go back to "do not
> buffer output at all, flush after every ANSI write function" for at least some
> commands I'd like to be able to reason about. Especially for commands like 
> this
> where output performance really doesn't seem like it should be an issue.

+1 --- an inherently interactive program like this seems reasonable to
be unbuffered. (except for file transfer?)

> And OTHER implementations can't consistently get this right, which is why 
> whether:
>
>   for i in {1..100}; do echo -n .; sleep .1; done | less
>
> Produces any output before 10 seconds have elapsed is potluck, and varies from
> release to release of the same distro.
>
> Oh, and the gnu/crazies just came up with a fourth category of write as a
> gnu/extension: flush after NUL byte.
>
> https://lists.gnu.org/archive/html/coreutils/2024-03/msg00016.html

(fwiw, i think that was just some internet rando asking for it, no?
and they didn't actually implement it?)

> It's very gnu to fix "this is too complicated to be reliable" by adding MORE
> complication. Note how the problem WE hit here was 1) we didn't ask for 
> LINEBUF
> mode, 2) \r doesn't count as a line for buffer flushing purposes anyway, 3) 
> the
> new feature making it trigger on NUL instead _still_ wouldn't make \r count 
> as a
> line for buffer flushing purposes.
>
> My suggestion for a "proper fix" to the problem _category_ of small writes 
> being
> too expensive was to have either libc or the kernel use nagle's algorithm for
> writes from userspace, like it does for network connections. (There was a fix 
> to
> this category of issue decades ago, it just never got applied HERE.)
>
> But that hasn't been popular, and it's a pain to implement in userspace 
> (because
> we don't have access to mulitple cheap timers like the kernel does, we need to
> take a signal and there's both a limited number of signals).

do you run on anything that doesn't have real-time signals? i was
going to say that -- since toybox is a closed world -- you could just
use SIGUSR2, but i see that dhcp is already using that! but if you can
assume real-time signals, there are plenty of them...

> So that's not what
> the ANSI committee decided to do in 1988, instead overlaying terrible 
> semantics
> on Ken/Dennis/Doug's elegant original design.
>
> Now that the kernel has vdso, this seems to me like an obvious use for it: 
> have
> a vdso write() function detect fd 1 and write the bytes(s) into a page size 
> ring
> buffer with incremented output position indicator, maybe with the buffer 
> taking
> a soft fault on first write to enable the kernel timer to flush it 1/10 second
> later (whatever the existing nagle timeout is), and big writes after small
> writes could even become an iovec or something (maybe using cmpxchg to rewind
> the buffer position indicator if the timer expiration hasn't advanced it yet) 
> to
> avoid extra syscalls when flushing the to buffer full instead of timer 
> expiration.
>
> Then userspace does the obvious thing and the kernel makes it fast. As
> Ken/Dennis/Doug intended.
>
> > 2. The comment string before xputs() is misleading and I wanted to fix that.
> > I was misled by the comment line, and incorrectly thought xputs() was 
> > flushing.
>
> Yeah, my bad. It's changed multiple times because of arguments about
> performance. It's xputsn() and xputsl() that are currently flushing. (Which 
> made
> sense when "line buffer vs no buffer" were the options, vs block buffer vs 
> line
> buffer and you couldn't ask for NOT buffered. Commit 3e0e8c687eee back in 
> January.)
>
> Oh, and I've had similar arguments on the READ FILE * side, wanting getline() 
> to
> pass along the leftover readahead data after \n to child processes, ala "read 
> i;
> zcat > $i" which has the problem of getline() reading a block of data into the
> FILE * buffer which then isn't available to zcat.
>
> I eventually (https://landley.net/notes-2023.html#24-06-2023) worked out that 
> on
> pipes you can use O_DIRECT to prevent collating and thus most instances of
> readahead, and on seekable files you can fseeko() on the file pointer to force
> it to back up to where it THINKS it is and thus make the readahead data
> available through the filehandle again, and that SHOULD cover the majority of
> cases the shell cares about?
>
> But that was AFTER I complained at the posix guys that there's no portable way
> to ask an input FILE * how much data is in its buffer and wound up with SIX
> different contexts (two standards committees, three linux library maintainers,
> plus BSD) pointing fingers at each other, which I ranted about in my blog:
>
> https://landley.net/notes-2022.html#19-04-2022
>
> Thus the old saying that there are two hard problems in computer science:
> naming things, cache invalidation, and off by one errors.
>
> > I only discovered otherwise after iterations of trial-and-error.
> > To fix this is to either make xputs() flush automatically, or just update 
> > the
> > comment string to clarify that xputs() does _not_ flush.
> > Again either path is fine with me. I just merged the fixes of the two 
> > problems
> > into one because they share the former solution. Plus xputsl() and xputsn() 
> > are
> > also flushing, so I thought / believed xputs() was mistakenly made 
> > non-flushing.
>
> Even xprintf() _used_ to flush. All the xout() functions did. And then grep 
> was
> too slow... (See "historical performance complaints", above.)
>
> > Since the intention _is_ to keep xputs() non-flushing, how about we just 
> > update
> > the comment of xputs() (2), and add the line of flush in microcom.c (1)?
>
> As I said, I did TOYFLAG_NOBUF and hope it works for you?
>
> I don't know if xputs() should have the flushing added or the comment removed,
> that's a design question for Elliott.

comment removed makes sense to me?

i like the x<foo>() => "it's <foo>, but exits on error" pattern. super
consistent and easy to reason about, and easy to read when you know
it, and easy to fix code that should/shouldn't error in either
direction, by adding or removing the 'x'. (one of the biggest problems
with the old xflush() was that it mostly _didn't_ flush! xferror() is
a weird name, but at least it does what it says on the tin :-) )

> Rob
> _______________________________________________
> Toybox mailing list
> Toybox@lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net

Re: [Toybox] [PATCH] xputs: Do flush

Reply via email to