On Fri, Mar 09, 2007 at 04:19:55AM -0800, Michael K. Edwards wrote:
> On 3/8/07, Benjamin LaHaise <[EMAIL PROTECTED]> wrote:
> >Any number of things can cause a short write to occur, and rewinding the
> >file position after the fact is just as bad.  A sane app has to either
> >serialise the writes itself or use a thread safe API like pwrite().
>
> Not on a pipe/FIFO.  Short writes there are flat out verboten by
> 1003.1 unless O_NONBLOCK is set.  (Not that f_pos is interesting on a
> pipe except as a "bytes sent" indicator -- and in the multi-threaded
> scenario, if you do the speculative update that I'm suggesting, you
> can't 100% trust it unless you ensure that you are not in
> mid-read/write in some other thread at the moment you sample f_pos.
> But that doesn't make it useless.)
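
For illustration, the pwrite() suggestion quoted above can be sketched in C roughly as follows. This is a minimal sketch, not code from the thread: pwrite_all and pwrite_demo are hypothetical helper names, the 100-byte record size is arbitrary, and the target is assumed to be a regular (seekable) file, since pwrite() fails with ESPIPE on a pipe or FIFO. Each call carries its own offset, so the shared f_pos is never read or updated, and a short write or EINTR is retried rather than treated as fatal.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { REC_SIZE = 100, REC_COUNT = 10 };

/* Hypothetical helper: write all `len` bytes at offset `off`,
 * continuing after short writes and retrying on EINTR.
 * Returns 0 on success, -1 on any other failure. */
int pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted before any byte was written */
        if (n <= 0)
            return -1;              /* real error, or no progress at all */
        p += n;                     /* short write: advance and try the rest */
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical demo: store REC_COUNT records, deliberately out of order,
 * each at its own explicit offset.  Returns the final file size or -1;
 * *pos_out receives the descriptor's file offset after all the writes. */
long pwrite_demo(const char *path, long *pos_out)
{
    char rec[REC_SIZE];
    long size = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    assert(pos_out != NULL);
    if (fd < 0)
        return -1;

    for (int i = REC_COUNT - 1; i >= 0; i--) {
        memset(rec, 'a' + i, sizeof rec);
        if (pwrite_all(fd, rec, sizeof rec, (off_t)i * REC_SIZE) != 0)
            goto out;
    }
    *pos_out = (long)lseek(fd, 0, SEEK_CUR);   /* offset untouched by pwrite() */
    size = (long)lseek(fd, 0, SEEK_END);
out:
    close(fd);
    unlink(path);
    return size;
}
```

Called on a scratch path, pwrite_demo() is expected to return 1000 and leave *pos_out at 0, because pwrite() does not move the file offset. That is why there is no shared position for concurrent callers to race on.
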

Writes to a pipe/FIFO are atomic, so long as they fit within the pipe buffer
size, while f_pos on a pipe is undefined -- what exactly is the issue here?
The semantics you're assuming are not defined by POSIX.  Heck, even looking
at a man page for one of the *BSDs states "Some devices are incapable of
seeking.  The value of the pointer associated with such a device is
undefined."  What part of undefined is problematic?

> As to what a "sane app" has to do: it's just not that unusual to write
> application code that treats a short read/write as a catastrophic
> error, especially when the fd is of a type that is known never to
> produce a short read/write unless something is drastically wrong.  For
> instance, I bomb on short write in audio applications where the driver
> is known to block until enough bytes have been read/written, period.
> When switching from reading a stream of audio frames from thread A to
> reading them from thread B, I may be willing to omit app
> serialization, because I can tolerate an imperfect hand-off in which
> thread A steals one last frame after thread B has started reading --
> as long as the fd doesn't get screwed up.  There is no reason for the
> generic sys_read code to leave a race open in which the same frame is
> read by both threads and a hardware buffer overrun results later.

I hope I don't have to run any of your software.  Short writes can and do
happen because of a variety of reasons: signals, memory allocation failures,
quota being exceeded....  These are all error conditions the kernel has to
provide well defined semantics for, as well behaved applications will try to
handle them gracefully.

> In short, I'm not proposing that the kernel perfectly serialize
> concurrent reads and writes to arbitrary fd types.
> I'm proposing that it not do something blatantly stupid and easily
> avoided in generic code that makes it impossible for any fd type to
> guarantee that, after 10 successful pipelined 100-byte reads or
> writes, f_pos will have advanced by 1000.

The semantics you're looking for are defined for regular files with O_APPEND.
Anything else is asking for synchronization that other applications do not
require and do not desire.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
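
A rough C sketch of the O_APPEND behaviour mentioned above, under stated assumptions: interleaved_write_size is a hypothetical helper, the file is a regular file on a local filesystem, and the two descriptors come from separate open() calls so that each has its own file offset. It alternates ten 100-byte writes between the two descriptors and reports the resulting file size.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK = 100, ROUNDS = 10 };

/* Hypothetical helper: alternate ROUNDS writes of CHUNK bytes between two
 * independently opened descriptors for the same file.  extra_flags is
 * either 0 or O_APPEND.  Returns the final file size, or -1 on error
 * (a short write is treated as an error in this sketch). */
long interleaved_write_size(const char *path, int extra_flags)
{
    char buf[CHUNK];
    struct stat st;
    long size = -1;
    int fd[2];

    assert(extra_flags == 0 || extra_flags == O_APPEND);

    fd[0] = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0600);
    if (fd[0] < 0)
        return -1;
    fd[1] = open(path, O_WRONLY | extra_flags);   /* second, separate offset */
    if (fd[1] < 0) {
        close(fd[0]);
        unlink(path);
        return -1;
    }

    for (int i = 0; i < ROUNDS; i++) {
        memset(buf, 'A' + i, sizeof buf);
        if (write(fd[i % 2], buf, sizeof buf) != (ssize_t)sizeof buf)
            goto out;
    }
    if (stat(path, &st) == 0)
        size = (long)st.st_size;
out:
    close(fd[0]);
    close(fd[1]);
    unlink(path);
    return size;
}
```

With O_APPEND every write is positioned at end-of-file, so the ten writes should land end to end and the size comes out as 1000. With extra_flags set to 0, each descriptor advances only its own offset, the two overwrite each other, and the file ends at 500 bytes.
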