> It is possible for write() calls to fail partway through, after
> already having written some data.

It is. As you note later, it's also possible for read().

The rightest thing to do, it seems to me, would be to return the error
indication along with how much was successfully written (or read). But
that, of course, requires a completely new API, which I gather is more
intrusive than you want to get into here.

> Basically, it is not feasible to check for and report all possible
> errors ahead of time,

In some cases - such as EIO - it is not possible even in theory.

> nor in general is it possible or even desirable to unwind portions of
> a write that have already been completed,

Agreed. In some cases, by the time the error is detected, the bits may
not even exist on the local machine any longer.

> which means that if a failure occurs partway through a write there
> are two reasonable choices for proceeding:
>    (a) return success with a short count reporting how much data has
>        already been written;
>    (b) return failure.

Right. Personally, my own preference is for (a), with the error
remembered and returned on the next write (resp. read) even if there is
nothing (else) erroneous about that next operation.

> It seems to me that for most errors (a) is preferable, since
> correctly written user software will detect the short count, retry
> with the rest of the data, and hit the error case directly, but it
> seems not everyone agrees with me.

Well, if it _will_ "hit the error case directly", maybe. It is not
clear to me that it will. Except for EPIPE (which will rarely be
returned; most writers will die on SIGPIPE instead), none of those is
guaranteed to repeat on the next write - though admittedly some are
more likely to than others, and some of them (eg, EFAULT) definitely
will recur unless something in the writing process intervenes.
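
For concreteness, the "detect the short count, retry with the rest of
the data" pattern from the quoted text might look roughly like the
sketch below. The name write_all and its out-parameter are hypothetical,
not an existing interface, and the sketch assumes a blocking descriptor.
It hands back both the error and the byte count, which is about as
close as userland can get to the combined result discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an existing API: write all of buf to fd,
 * retrying after short counts.  Returns 0 if no error was seen, or an
 * errno value on failure.  In both cases *done is set to the number
 * of bytes that write() reported as written.
 */
int
write_all(int fd, const void *buf, size_t len, size_t *done)
{
	const char *p = buf;
	size_t off = 0;

	assert(done != NULL);
	while (off < len) {
		ssize_t n = write(fd, p + off, len - off);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written */
			*done = off;		/* report partial progress */
			return errno;
		}
		if (n == 0)
			break;	/* no progress and no error: stop, don't spin */
		off += (size_t)n;	/* short count: retry with the rest */
	}
	*done = off;
	return 0;
}
```

The caveat raised above still applies: after a short count, the loop
reports an error only if the retried write() fails again. If the kernel
returned the short count because of a condition that does not recur,
the caller never learns that anything went wrong.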

> [test with deliberately mprotect()ed part of buffer]
> - for regular files on ffs and probably most things that use
>   uiomove_ubc, the data in the accessible part of the buffer is
>   written, the call fails with EFAULT, and the size of the file is
>   reverted to what it was at the start.

!! That, I would say, strongly violates POLA. It is not behaviour I
would have been likely to guess.

> Anyhow, if you've made it this far, the actual question is: is the
> current behavior really what we want?

It is not what _I_ would prefer.

If we _had_ a more elaborate API, one that could return partial success
followed by an error, then I'd say we could ignore the question of what
write() and read() do on the grounds that code that really cares can
always use the more detailed call. If adding that is an option, great.
If not, well, I think returning a short count and remembering the error
for the next call is about the best option available.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B