On Wed, Aug 10, 2022 at 02:23:08PM -0600, Theo de Raadt wrote:
> Scott Cheloha <[email protected]> wrote:
>
> > On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
> > > Scott Cheloha <[email protected]> wrote:
> > >
> > > > We're sorta-kinda circling around adding the missing (?) stdio error
> > > > checking to other utilities in bin/ and usr.bin/, no? I want to be
> > > > sure I understand how to do the next patch, because if we do that it
> > > > will probably be a bunch of programs all at once.
> > >
> > > This specific program has not checked for this condition since at least
> > > 2 AT&T UNIX.
> > >
> > > Your change does not just add a new warning. It adds a new exit code
> > > condition.
> > >
> > > Some scripts using echo, which accepted the condition because echo would
> > > exit 0 and not check for this condition, will now see this exit 1. Some
> > > scripts will abort, because they use "set -o errexit" or similar.
> > >
> > > You are changing the exit code for a command which is used a lot.
> > >
> > > POSIX does not require or specify exit 1 for this condition. If you
> > > disagree, please show where it says so.
> >
> > It's the usual thing. >0 if "an error occurred".
>
> The 40 year old code base says otherwise.
>
> > Here is my thinking:
> >
> > echo(1) has ONE job: print the arguments given.
> >
> > If it fails to print those arguments, shouldn't we signal that to the
> > program that forked echo(1)?
>
> Only if you validate all callers can handle this change in behaviour.
>
> > How is echo(1) supposed to differentiate between a write(2) that is
> > allowed to fail, e.g. a diagnostic printout from fw_update to the
> > user's stderr, and one that is not allowed to fail?
>
> Perhaps it is not supposed to validate this problem in 2022, because it
> didn't validate it for 40 years.
>
> > Consider this scenario:
> >
> > 1. A shell script uses echo(1) to write something to a file.
> >
> > /bin/echo foo.dat >> /var/workerd/data-processing-list
> >
> > 2. The bytes don't arrive on disk because the file system is full.
> >
> > 3. The shell script succeeds because echo(1) can't fail, even if
> > it fails to do what it was told to do.
> >
> > Isn't that bad?
> >
> > And it isn't necessarily true that some other thing will fail later
> > and the whole interlocking system will fail. ENOSPC is a transient
> > condition. One write(2) can fail and the next write(2) can succeed.
>
> Yet, for 40 years noone complained.
>
> Consider the situation you break and change the behaviour of 1000's of
> shell scripts, and haven'd lifted your finger once to review all the
> shell scripts that call echo.
>
> Have you even compared this behaviour to the echo built-ins in all
> the shells?
I assume what you mean to say is, roughly:
Gee, this seems risky.
What do other echo implementations do?
1. Our ksh(1) already checks for stdout errors in the echo builtin.
2. FreeBSD's /bin/echo has checked for writev(2) errors in /bin/echo
since 2003:
https://cgit.freebsd.org/src/commit/bin/echo/echo.c?id=91b7d6dc5871f532b1a86ee76389a9bc348bdf58
3. NetBSD's /bin/echo has checked for stdout errors with ferror(3)
since 2008:
http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/echo/echo.c?rev=1.18&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
4. NetBSD's /bin/sh echo builtin has checked for write errors since
2008:
http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/sh/bltin/echo.c?rev=1.14&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
5. OpenSolaris has checked for fflush(3) errors in /usr/bin/echo since
2005 (OpenSolaris launch):
https://github.com/illumos/illumos-gate/blob/7c478bd95313f5f23a4c958a745db2134aa03244/usr/src/cmd/echo/echo.c#L144
6. Looking forward, illumos inherited and retains the behavior in
their /usr/bin/echo.
7. Extrapolating backward, we can assume Solaris did that checking in
/usr/bin/echo prior to 2005.
8. GNU Coreutils echo has checked for fflush(3) and fclose(3) errors on
stdout since 2000:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/src/echo.c?id=d3683509b3953beb014e540f6d6194658ede1dea
They use close_stdout() in an atexit(3) hook. close_stdout() is a
convenience function provided by gnulib since 1998 that does what I
described:
https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=23928550db5d400f27fa67de29c738ca324a31ea;hp=f76477e515b36a1e10f7734aac3c5478ccf75989
Maybe of note is that they do this atexit(3) stdout flush/close
error checking for many of their utilities.
9. The GNU Bash echo builtin has checked for write errors since v2.04,
in 2000:
https://git.savannah.gnu.org/cgit/bash.git/commit/builtins/echo.def?id=bb70624e964126b7ac4ff085ba163a9c35ffa18f
They even noted it in the CHANGES file for that release:
https://git.savannah.gnu.org/cgit/bash.git/commit/CHANGES?id=bb70624e964126b7ac4ff085ba163a9c35ffa18f
--
I don't think that we are first movers in this case.