Pádraig Brady wrote:
> # SUBTHREAD_LINES_HEURISTIC = 4
> $ for i in $(seq 22); do
>     j=$((2<<$i))
>     yes | head -n$j > t.sort
>     strace -f -c -e clone ./sort --parallel=16 t.sort -o /dev/null 2>&1 |
>      join --nocheck-order -a1 -o1.4,1.5 - /dev/null |
>      sed -n "s/\([0-9]*\) clone/$j\t\1/p"
>   done
> 4         1
> 8         3
> 16        7
> 32        15
> 64        15
> 128       15
> 256       15
> 512       15
> 1024      15
> 2048      15
> 4096      15
> 8192      15
> 16384     15
> 32768     15
> 65536     15
> 131072    15
> 262144    15
> 524288    15
> 1048576   15
> 2097152   15
> 4194304   30
> 8388608   45
>
> # As above, but add -S1M option to sort
>
> 4         1
> 8         3
> 16        7
> 32        15
> 64        15
> 128       15
> 256       15
> 512       15
> 1024      15
> 2048      15
> 4096      15
> 8192      15
> 16384     30
> 32768     45
> 65536     90
> 131072    165
> 262144    315
> 524288    622
> 1048576   1245
> 2097152   2475
> 4194304   4935
> 8388608   9855
>
> With SUBTHREAD_LINES_HEURISTIC=128k and -S1M option to sort we get no threads as
> nlines never gets above 12787 (there looks to be around 80 bytes overhead per line?).
> Only when -S >= 12M do we get nlines high enough to create threads.
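
As a rough cross-check of the "around 80 bytes overhead per line" guess above: if -S1M gives a buffer of about 1 MiB and nlines tops out at 12787, each line costs about 82 bytes, of which 2 are the "y\n" that yes produces. The sketch below only does that arithmetic. The helper names are hypothetical (nothing like them exists in sort.c), and it assumes the whole -S budget goes to line storage, which is probably not quite true.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of sort.c: average number of buffer
   bytes consumed per line, given the buffer size and the largest
   nlines value observed.  */
double
bytes_per_line (size_t buf_bytes, size_t nlines)
{
  assert (nlines > 0);
  return (double) buf_bytes / (double) nlines;
}

/* Hypothetical helper: bytes needed to hold NLINES lines at
   COST_PER_LINE bytes each.  Assumes the product fits in size_t.  */
size_t
buffer_for_lines (size_t nlines, size_t cost_per_line)
{
  return nlines * cost_per_line;
}
```

Under those assumptions, bytes_per_line (1024 * 1024, 12787) comes out just above 82, and buffer_for_lines (128 * 1024, 82) is 10747904 bytes, about 10.25 MiB. That is somewhat below the 12M threshold reported above, so treat it as a lower bound rather than a prediction.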

Thanks for pursuing this.
Here's a proposed patch to address the other problem.
It doesn't have much of an effect (any?) on your issue when using
very little memory, but when a sort user specifies -S1M, I think
they probably want to avoid the expense (memory) of going multi-threaded.

What do you think?

From 4f591fdd0bb78f621d2b72021de883fc4df1e179 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyer...@redhat.com>
Date: Wed, 16 Mar 2011 16:09:31 +0100
Subject: [PATCH] sort: avoid memory pressure of 130MB/thread when reading from pipe

* src/sort.c (INPUT_FILE_SIZE_GUESS): Decrease initial allocation
factor used to size buffer used when reading a non-regular file.
For motivation, see discussion here:
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/878/focus=887
---
 src/sort.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 9b8666a..07d6765 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -319,8 +319,12 @@ static size_t merge_buffer_size = MAX (MIN_MERGE_BUFFER_SIZE, 256 * 1024);
    specified by the user. Zero if the user has not specified a size. */
 static size_t sort_size;

-/* The guessed size for non-regular files. */
-#define INPUT_FILE_SIZE_GUESS (1024 * 1024)
+/* The initial allocation factor for non-regular files.
+   This is used, e.g., when reading from a pipe.
+   Don't make it too big, since it is multiplied by ~130 to
+   obtain the size of the actual buffer sort will allocate.
+   Also, there may be 8 threads all doing this at the same time. */
+#define INPUT_FILE_SIZE_GUESS (128 * 1024)

 /* Array of directory names in which any temporary files are to be created. */
 static char const **temp_dirs;
--
1.7.4.1.430.g5aa4d
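
To put numbers on the comment in the patch: taking the "~130" multiplier at face value, the old guess of 1024 * 1024 works out to about 130 MiB per thread, and the new 128 * 1024 to about 16 MiB. A minimal sketch of that arithmetic follows. ALLOC_MULTIPLIER and both function names are made up for illustration, and the real factor in sort.c is only approximately 130.

```c
#include <stddef.h>

/* Assumed multiplier, taken from the "~130" in the patch comment.
   This is an approximation, not a constant defined in sort.c.  */
enum { ALLOC_MULTIPLIER = 130 };

/* Hypothetical helper: approximate buffer size, in bytes, that one
   thread would allocate for a non-regular input, given the value of
   INPUT_FILE_SIZE_GUESS.  */
size_t
per_thread_buffer (size_t size_guess)
{
  return size_guess * ALLOC_MULTIPLIER;
}

/* Hypothetical helper: worst case when NTHREADS threads all make
   that allocation at the same time.  */
size_t
total_buffer (size_t size_guess, unsigned int nthreads)
{
  return per_thread_buffer (size_guess) * nthreads;
}
```

With the old value, per_thread_buffer (1024 * 1024) is 136314880 bytes (130 MiB), and total_buffer (1024 * 1024, 8) is 1090519040 bytes, roughly 1 GiB. With the new value the same calls give 17039360 bytes (16.25 MiB) per thread and 136314880 bytes (130 MiB) for 8 threads.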