I apologize for throwing around words like "stupid". Whether or not the current semantics can be improved, that's not a constructive way to characterize them. I'm sorry.
As three people have ably pointed out :-), the particular case of a pipe/FIFO isn't seekable and doesn't need the f_pos member anyway (it's effectively always O_APPEND). That's what I get for checking against standards documents at 3AM.

Of course, this has nothing to do with the point that led me to comment on pipes/FIFOs (which was that there exist file types that never return 0<ret<nbytes). And it was in the context of a very explicit aside that f_pos is not _interesting_ on a pipe/FIFO, except as an indicator of total bytes written. You could only peek at this with an (admittedly non-portable) llseek(fd, 0, SEEK_CUR) anyway -- which you would only do for diagnostic purposes. But diagnosis of odd corner cases (rarely in my code, usually in other people's) is what I do day in and day out, so for me it would be worth having.

In any case, you're all right that the standard doesn't require you to do anything useful with f_pos on a pipe/FIFO. But you're permitted to make it useful if you want to:

<1003.1 lseek()>
The behavior of lseek() on devices which are incapable of seeking is implementation-defined. The value of the file offset associated with such a device is undefined.
</1003.1>

Tracking f_pos accurately when writes from multiple threads hit the same fd (pipe or not) isn't portable, but I recall situations where it would have been useful. And if f_pos has to be kept at all in the uncontended case, it costs you little or nothing to do it in a thread-safe manner -- as long as you don't overconstrain the semantics such that you forbid the transient overshoot associated with a short write. In fact, unless there's something I've missed, increasing f_pos before entering vfs_write() happens to be _faster_ than the current code for common load patterns, both single- and multi-threaded (although getting the full benefit in the multi-threaded case will take some fiddling with f_count placement).
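To make the "bump f_pos first, correct after a short write" idea concrete, here is a rough user-space analogy. It is not kernel code and not a patch against read_write.c: struct sketch_file and sketch_write() are hypothetical names invented for illustration, pwrite() stands in for the call into vfs_write(), and a C11 atomic stands in for whatever locking or atomic scheme the kernel would actually use.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for struct file; only the shared offset matters. */
struct sketch_file {
    int fd;
    _Atomic int64_t pos;    /* plays the role of f_pos */
};

/*
 * Reserve [start, start + len) by advancing pos before the write, then
 * write at the reserved offset.  If the write comes up short (or fails),
 * hand back the unused tail.  Between the fetch_add and the fetch_sub,
 * pos transiently overshoots the number of bytes actually written.
 */
static ssize_t sketch_write(struct sketch_file *f, const void *buf, size_t len)
{
    int64_t start = atomic_fetch_add(&f->pos, (int64_t)len);
    ssize_t ret = pwrite(f->fd, buf, len, (off_t)start);
    size_t done = ret > 0 ? (size_t)ret : 0;

    if (done < len)
        atomic_fetch_sub(&f->pos, (int64_t)(len - done));
    return ret;
}
```

In the common case (full write, no contention) the only extra work is a single atomic add. Note that if another writer reserves its range between a short write and the correction, the data ends up non-contiguous; this sketch does not try to settle what the right semantics are there.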
I say it costs "little or nothing" only because altering an loff_t atomically is not free. But even on x86, with its inability to atomically modify any 64-bit entity in memory, an uncontended spinlock on a cacheline already in L1 is so cheap that making the f_pos changes atomic will (I think) be lost in the noise.

In any case, rewriting read_write.c is proving interesting. I'll let you all know if anything comes of it. In the meantime, thanks for your (really quite friendly under the circumstances) comments.

Cheers,
- Michael