On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
> Scott Cheloha <scottchel...@gmail.com> wrote:
>
> > We're sorta-kinda circling around adding the missing (?) stdio error
> > checking to other utilities in bin/ and usr.bin/, no? I want to be
> > sure I understand how to do the next patch, because if we do that it
> > will probably be a bunch of programs all at once.
>
> This specific program has not checked for this condition since at least
> 2 AT&T UNIX.
>
> Your change does not just add a new warning. It adds a new exit code
> condition.
>
> Some scripts using echo, which accepted the condition because echo would
> exit 0 and not check for this condition, will now see this exit 1. Some
> scripts will abort, because they use "set -o errexit" or similar.
>
> You are changing the exit code for a command which is used a lot.
>
> POSIX does not require or specify exit 1 for this condition. If you
> disagree, please show where it says so.

It's the usual thing. >0 if "an error occurred".

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html

EXIT STATUS

    The following exit values shall be returned:

     0    Successful completion.
    >0    An error occurred.

CONSEQUENCES OF ERRORS

    Default.

> So my question is: What will be broken by this change?
>
> Nothing isn't an answer. I can write a 5 line shell script that will
> observe the change in behaviour. Many large shell scripts could break
> from this. I am thinking of fw_update and the installer, but it could
> also be a problem in Makefiles.

Here is my thinking:

echo(1) has ONE job: print the arguments given.

If it fails to print those arguments, shouldn't we signal that to the
program that forked echo(1)?

How is echo(1) supposed to differentiate between a write(2) that is
allowed to fail, e.g. a diagnostic printout from fw_update to the
user's stderr, and one that is not allowed to fail?

> > I want to be sure I understand how to do the next patch, because if we
> > do that it will probably be a bunch of programs all at once.
>
> If you cannot speak to the exit code command changing for this one
> simple program, I think there is no case for adding to to hundreds of
> other programs. Unless POSIX specifies the requirement, I'd like to see
> some justification.
>
> There will always be situations that UNIX didn't anticipate or handle,
> and then POSIX failed to specify. Such things are now unhandled, probably
> forever, and have become defacto standards.
>
> On the balance, is your diff improving on some dangerous problem, or is
> it introducing a vast number of dangerous new risks which cannot be
> identified (and which would require an audit of every known script
> calling echo). Has such an audit been started?

Consider this scenario:

1. A shell script uses echo(1) to write something to a file.

    /bin/echo foo.dat >> /var/workerd/data-processing-list

2. The bytes don't arrive on disk because the file system is full.

3. The shell script succeeds because echo(1) can't fail, even if
   it fails to do what it was told to do.

Isn't that bad?

And it isn't necessarily true that some other thing will fail later
and the whole interlocking system will fail. ENOSPC is a transient
condition. One write(2) can fail and the next write(2) can succeed.
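
To make the scenario concrete, here is a rough sketch of what a caller
could do if echo(1) reported the failure. It is only an illustration:
append_entry is a made-up helper name, and the check is only meaningful
on a system where /bin/echo exits >0 when its write fails, which is the
behaviour under discussion and not something to take for granted.

```shell
#!/bin/sh
# Hypothetical helper (not from any existing script): append one entry
# to a work list and tell the caller whether that worked.
#
# Assumption: /bin/echo exits >0 when it cannot write its output.
# With an echo that always exits 0, a failed write(2) (e.g. ENOSPC)
# goes unnoticed and this function reports success anyway.
append_entry() {
	entry=$1
	list=$2

	if ! /bin/echo "$entry" >> "$list"; then
		# Diagnostic only; nothing is done if this write fails too.
		echo "append_entry: could not record $entry in $list" >&2
		return 1
	fi
	return 0
}
```

A caller would then write something like
`append_entry foo.dat /var/workerd/data-processing-list || exit 1`.
A script running under "set -o errexit" would not need the explicit
test at all: the failed echo would abort it, which is the
compatibility concern raised in the quoted text.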