Hello again,

Sorry to resend this, but I'm suffering from a bit of Warnock's Dilemma. Perhaps there is an obvious solution that I'm overlooking, or maybe I need to give more detail?
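In case more detail helps, here is a minimal local sketch (no Slurm involved) of the pipe-write atomicity I mention below. It only demonstrates the POSIX kernel-side guarantee that a single write() of at most PIPE_BUF bytes to a pipe is never interleaved with other writers; it is not a claim about what srun itself does, and since PIPE_BUF is typically 4096 on Linux, the 1024-byte boundary I observe is presumably a Slurm-internal buffer rather than PIPE_BUF:

```python
import os

NWRITERS = 50
LINE_LEN = 1024  # bytes including the trailing newline; well under PIPE_BUF

r, w = os.pipe()
pids = []
for i in range(NWRITERS):
    pid = os.fork()
    if pid == 0:
        os.close(r)
        # One write() per child. POSIX guarantees that pipe writes of at
        # most PIPE_BUF bytes are atomic, so concurrent writers cannot
        # interleave within this line. Writes larger than PIPE_BUF carry
        # no such guarantee and may be split arbitrarily.
        os.write(w, b'1' * (LINE_LEN - 1) + b'\n')
        os._exit(0)
    pids.append(pid)

os.close(w)  # parent keeps only the read end, so EOF arrives when children exit
data = b''
while True:
    chunk = os.read(r, 65536)
    if not chunk:
        break
    data += chunk
os.close(r)
for pid in pids:
    os.waitpid(pid, 0)

lines = data.split(b'\n')[:-1]          # drop the element after the final newline
lengths = {len(line) + 1 for line in lines}  # +1 counts the newline, like my pipeline
print(len(lines), sorted(lengths))      # prints: 50 [1024]
```

With LINE_LEN at or below PIPE_BUF this is deterministic: every one of the 50 lines comes back intact at exactly 1024 bytes, mirroring my first srun example.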
Thanks,

- Dan Boorstein

On Tue, Dec 10, 2013 at 2:34 PM, Dan Boorstein <[email protected]> wrote:
> Hello,
>
> I have an application that streams results from the STDOUT of compute
> nodes back to the parent for further processing. I've encountered an
> issue where lines longer than 1024 bytes appear to collide.
>
> In this first case, I get what I expect: starting 1000 tasks that each
> return a 1024-byte string (including the newline) results in 1000
> 1024-byte strings:
>
> srun --ntasks=1000 perl -E 'say 1 x 1023' | perl -nE 'say length' | sort | uniq -c
>    1000 1024
>
> When I increase the total length to 1025 bytes, it appears to exceed
> some buffer size, resulting in lines of various sizes:
>
> srun --ntasks=1000 perl -E 'say 1 x 1024' | perl -nE 'say length' | sort | uniq -c
>     802 1
>       2 10241
>      92 1025
>       3 11265
>       2 12289
>     ...
>
> The documentation for --output states that STDOUT from the nodes is
> line buffered. Does my observation contradict that statement, or is
> this corruption happening elsewhere? Is this an issue with writes
> longer than 1024 bytes not being atomic?
>
> It is interesting to note that the -l/--label option changes the
> behavior. Though it appears to synchronize the data (resulting in
> deterministic line sizes), it still splits lines longer than 1024
> bytes:
>
> srun -l --ntasks=1000 perl -E 'say 1 x 1024' | perl -nE 'say length' | sort | uniq -c
>    1000 1030
>    1000 6
>
> Is there a way to coax srun to keep the long lines intact while
> streaming results from the nodes?
>
> Thanks,
>
> - Dan Boorstein
