Jim Meyering wrote, On 02/09/2011 12:29 PM:
> Jim Meyering wrote:
>> Running "make -j25 check" on a nominal-12-core F14 system would
>> cause serious difficulty leading to an OOM kill -- and this is brand new.
>
> If I revert my earlier patch and instead simply
> insist that sort not do anything in parallel,
>
>     ASSORT = LC_ALL=C sort --parallel=1
>
> then there is no hang, and things finish in relatively good time.

I previously noticed the memory issue as a surprising (for me, at least)
side effect of parallel sort:

  http://lists.gnu.org/archive/html/coreutils/2010-12/msg00079.html
  http://lists.gnu.org/archive/html/coreutils/2010-12/msg00084.html

But I'm noticing another side effect of the new default behavior.
On shared Linux systems (both a server with many cores and an SGE
cluster) where users run 'sort' as part of their scripts, memory usage
and load average sometimes peak beyond what is available, because each
sort process now uses up to 8 cores by default.

So while users implicitly assume they are using one core (at least with
SGE, where you can specify how many threads your job will require), in
practice they use many more.

Globally setting OMP_NUM_THREADS=1 restores the old behavior, so that
sort uses more than one thread only when "--parallel=X" is given
explicitly. But I'm wondering whether it would be better to default back
to one core and require an explicit "--parallel" for multi-threaded sort.

-gordon
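
For reference, the two workarounds mentioned above could be sketched roughly as follows. This is only an illustration: the function names sort_env_single and sort_explicit are hypothetical, and the fallback branch assumes that a sort without "--parallel" support is single-threaded anyway.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical, not part of coreutils.

# Workaround 1: cap the thread count through the environment.
# OMP_NUM_THREADS is the variable named in the message above.
sort_env_single() {
    OMP_NUM_THREADS=1 LC_ALL=C sort "$@"
}

# Workaround 2: request the thread count explicitly on the command line.
# If this sort does not accept --parallel, fall back to plain sort
# (assumption: such a sort does not run multi-threaded in the first place).
sort_explicit() {
    threads=$1
    shift
    if sort --parallel=1 </dev/null >/dev/null 2>&1; then
        LC_ALL=C sort --parallel="$threads" "$@"
    else
        LC_ALL=C sort "$@"
    fi
}

# Both calls sort the same three lines using a single thread.
printf 'pear\napple\nfig\n' | sort_env_single
printf 'pear\napple\nfig\n' | sort_explicit 1
```

On a shared machine or under SGE, something like the first form could be exported once in a job script, while the second keeps multi-threaded sorting an explicit, per-call decision.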