Hi David,

I've also seen short writes on local file systems -- can't even count the number of times I've modified codes to use wrappers that handle short reads/writes. Not at all surprised you see them when suspending the app.
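A minimal sketch of that kind of wrapper, assuming plain POSIX read()/write() on a blocking descriptor. The names write_all and read_all are made up for illustration and are not taken from any particular code base; treating a zero-byte write as an error is likewise just one defensive choice.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call write() repeatedly until all `count` bytes
 * have been written. Returns 0 on success, -1 on error (errno is set). */
static int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;          /* real error */
        }
        if (n == 0) {
            errno = EIO;        /* assumption: no progress is treated as an error */
            return -1;
        }
        p += n;                 /* short write: advance and write the rest */
        count -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: call read() repeatedly until `count` bytes have
 * been read or end-of-file is reached. Returns the number of bytes read
 * (less than `count` only at EOF), or -1 on error (errno is set). */
static ssize_t read_all(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;          /* real error */
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;      /* short read: keep going */
    }
    return (ssize_t)done;
}
```

Callers then check only for -1 (or, for read_all, a count smaller than requested meaning EOF) instead of treating every short return as a failure.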
http://www.opengroup.org/onlinepubs/000095399/functions/write.html

"If write() is interrupted by a signal after it successfully writes some data, it shall return the number of bytes written."

Similar language exists for read() as well. I always thought libc should handle the retry for you by default, but I didn't write the spec. Signals are relatively rare, and the window is a bit smaller for a local file system, which may be why the application developers haven't seen it or properly dealt with it yet.

Kevin

David Singleton wrote:
> The POSIX standard pretty clearly allows short writes to occur (number of
> bytes written less than requested in a successful call to write) but it's
> not something you see very often and I don't think many users/applications
> expect it to occur when writing to disk based files. We are seeing it
> fairly regularly and just wanted to confirm that we (rather our users)
> should expect this behaviour from Lustre.
>
> We are seeing the issue with the infamous Gaussian quantum chem code
> which spends literally days constantly writing and reading to scratch files
> in roughly 1GB chunks as part of out-of-core solvers. We manage jobs using
> simple SIGSTOP/SIGCONT based suspend/resume and occasionally jobs will flag
> a short write immediately after a SIGCONT. The application incorrectly
> treats this as an error and aborts. Adding code to complete the write
> appears to fix the problem (as you'd hope). Now we are at the stage of
> "debating" with the application developers whether it's their problem or
> Lustre's.
>
> Is this considered normal Lustre behaviour?
>
> This is with 1.8.3 clients on 2.6.27.46.
>
> Thanks,
> David
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss