Pádraig Brady wrote:
> Jim Meyering wrote:
>> Pádraig Brady wrote:
>>> Pádraig Brady wrote:
>>>> You wouldn't want multiple threads/processes fighting over
>>>> the disk head so you would do something like:
>>>>
>>>> find /disk1 | xargs md5sum & find /disk2 | xargs md5sum
>>>>
>>>> Note that if we're piping/redirecting the output of the above,
>>>> we must be careful to line buffer the output from md5sum
>>>> so that it's not interspersed. Hmm, I wonder whether we should
>>>> line buffer the output from *sum by default.
>>> In the attached patch, I've changed the default buffering
>>> to line buffered to address the above issue. For standard
>>> size files there is a 2% performance drop.
>>
>> Good catch.
>> It sounds like this fixes a real (albeit obscure) bug, so this
>> might deserve a NEWS item, though I admit it is borderline.
>
> Well, it would easily be hit when one tries to parallelize the processes.
> So I'll add a NEWS item and a test along the lines of:

Thanks!

> (mkdir t && cd t && seq 100 | xargs touch)
> (find t t t t -type f | xargs -n100 -P4 md5sum) \
> | sed -n '/[0-9a-f]\{32\} /!p' | grep . >/dev/null && fail=1

Odd... that doesn't fail on any of the systems where I tried it:
Rawhide, Fedora 11, Debian unstable.
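
For anyone who wants to try this, here is a self-contained sketch of
that check. The mktemp scratch directory, the fail=0 initialization, the
cleanup and the final echo are assumptions added for illustration; the
pipeline itself is the one quoted above, unchanged.

```sh
#!/bin/sh
# Sketch only: run the proposed interleaving check in a scratch directory.
fail=0
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Create 100 empty files named 1..100 under t/.
(mkdir t && cd t && seq 100 | xargs touch)

# List the same 100 files four times, hand them to up to four parallel
# md5sum processes (100 names each), and print only the output lines that
# do NOT contain a 32-digit hex checksum followed by a space.
# If any non-empty line survives the filter, flag a failure.
(find t t t t -type f | xargs -n100 -P4 md5sum) \
  | sed -n '/[0-9a-f]\{32\} /!p' | grep . >/dev/null && fail=1

# Clean up and report.
cd / && rm -rf "$dir"
echo "fail=$fail"
```

A result of fail=0 only means no malformed line was seen on that run, not
that the output can never be interleaved.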