Scott Cheloha <[email protected]> wrote: > On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote: > > Scott Cheloha <[email protected]> wrote: > > > > > We're sorta-kinda circling around adding the missing (?) stdio error > > > checking to other utilities in bin/ and usr.bin/, no? I want to be > > > sure I understand how to do the next patch, because if we do that it > > > will probably be a bunch of programs all at once. > > > > This specific program has not checked for this condition since at least > > 2 AT&T UNIX. > > > > Your change does not just add a new warning. It adds a new exit code > > condition. > > > > Some scripts using echo, which accepted the condition because echo would > > exit 0 and not check for this condition, will now see this exit 1. Some > > scripts will abort, because they use "set -o errexit" or similar. > > > > You are changing the exit code for a command which is used a lot. > > > > POSIX does not require or specify exit 1 for this condition. If you > > disagree, please show where it says so. > > It's the usual thing. >0 if "an error occurred".
The 40 year old code base says otherwise. > Here is my thinking: > > echo(1) has ONE job: print the arguments given. > > If it fails to print those arguments, shouldn't we signal that to the > program that forked echo(1)? Only if you validate all callers can handle this change in behaviour. > How is echo(1) supposed to differentiate between a write(2) that is > allowed to fail, e.g. a diagnostic printout from fw_update to the > user's stderr, and one that is not allowed to fail? Perhaps it is not supposed to validate this problem in 2022, because it didn't validate it for 40 years. > Consider this scenario: > > 1. A shell script uses echo(1) to write something to a file. > > /bin/echo foo.dat >> /var/workerd/data-processing-list > > 2. The bytes don't arrive on disk because the file system is full. > > 3. The shell script succeeds because echo(1) can't fail, even if > it fails to do what it was told to do. > > Isn't that bad? > > And it isn't necessarily true that some other thing will fail later > and the whole interlocking system will fail. ENOSPC is a transient > condition. One write(2) can fail and the next write(2) can succeed. Yet, for 40 years noone complained. Consider the situation you break and change the behaviour of 1000's of shell scripts, and haven'd lifted your finger once to review all the shell scripts that call echo. Have you even compared this behaviour to the echo built-ins in all the shells? Isn't that bad? Your proposal is dead in the water until you assess *the code that calls echo*. If you find one script that breaks, the proposal is even more dead.
