On Tue, Sep 8, 2009 at 10:20 PM, Nathan Stratton Treadway <natha...@ontko.com> wrote:
> At that time the discussion was focused on common-src/stream.c, which
> hadn't changed significantly between those versions, but it would be
> interesting to know if there were any changes in the sendbackup code
> path after 2.5.0p1 that seem like they might be related to this bug....

Sigh, that reaches way back into the CVS days and Jean-Louis' memory --
he's not around right now. There were *bunches* of changes between
2.5.0 and 2.5.2, not least of which was the introduction of the Backup
API (which became the Application API).

If we could reproduce the behavior in a test program, exploring the
history might be a more fruitful avenue for pursuit, since we'd know
what to look for.

One thing we could do is remove all of the other patches and replace
the "fixing" fcntl() calls with something that only introduces a
delay -- maybe just sleep(1)? If this "fixes" the problem, then that's
evidence this was a race condition.

Alternatively, remove all of the other patches and add the g_debug()
call shown here, just before the full_write():

    ptr = buffer;
    bytes_written = 0;
    g_debug("fcntl(%d) => %#x (O_NONBLOCK is %#x)", fileno(pipe_fp),
            fcntl(fileno(pipe_fp), F_GETFL, 0), O_NONBLOCK);
    just_written = full_write(fileno(pipe_fp), ptr, (size_t)bytes_read);
    if (just_written < (size_t)bytes_read) {

and try to trigger the same error. I suspect you'll find that fcntl()
is returning a value with the O_NONBLOCK bit set.

Dustin

--
Open Source Storage Engineer
http://www.zmanda.com
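
For anyone who wants to try the "test program" idea outside of Amanda, here is a minimal sketch of the same check on a plain pipe. It is not Amanda code: the helper names (fd_is_nonblocking, fd_set_nonblocking, fill_until_error) are made up for illustration, and it assumes a POSIX system. It shows how F_GETFL reveals O_NONBLOCK, and that a write to a full non-blocking pipe fails with EAGAIN instead of waiting for the reader, which is one way a write loop could come up short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not,
 * -1 if fcntl() itself fails. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper: set (enable != 0) or clear (enable == 0)
 * O_NONBLOCK on fd. Returns 0 on success, -1 on error. */
int fd_set_nonblocking(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    flags = enable ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: keep writing to a non-blocking fd that nobody
 * is reading until write() fails, then return that errno.
 * Must not be used on a blocking fd, where it would hang once the
 * pipe buffer is full. */
int fill_until_error(int fd)
{
    char chunk[512];

    assert(fd_is_nonblocking(fd) == 1);
    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return errno;       /* pipe full (or some other error) */
        }
    }
}
```

To use it, create a pipe with pipe(), call fd_set_nonblocking() on the write end, and then call fill_until_error() on it without reading from the other end. If the sendbackup pipe turns out to have O_NONBLOCK set, the same failure mode would be a candidate explanation for the short write, though that still needs to be confirmed against the real code path.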