Jim Meyering wrote:
> Jim Meyering wrote:
>> Running "make -j25 check" on a nominal-12-core F14 system would
>> cause serious difficulty leading to an OOM kill -- and this is brand new.
>> It worked fine yesterday.  I tracked it down to all of the make processes
>> working on the "built_programs.list" (in src/Makefile.am) rule
>>
>> built_programs.list:
>> 	@echo $(bin_PROGRAMS) $(bin_SCRIPTS) | tr ' ' '\n' \
>> 	  | sed -e 's,$(EXEEXT)$$,,' | $(ASSORT) -u | tr '\n' ' '
>>
>> which made me realize we were running that submake over 400 times,
>> once per test script (including skipped ones).  That's well worth
>> avoiding, even if it means a new temporary file.
>>
>> I don't know the root cause of the OOM kill (preceded by interminable
>> minutes of a seemingly hung and barely responsive system) or why it
>> started happening today (afaics, none of the programs involved was
>> updated), but this does fix it...
>
> FYI,
> I've tracked this down a little further.
> The horrid performance (hung system and eventual OOM kill)
> is related to the use of sort above.  This is the definition:
>
>     ASSORT = LC_ALL=C sort
>
> If I revert my earlier patch and instead simply
> insist that sort not do anything in parallel,
>
>     ASSORT = LC_ALL=C sort --parallel=1
>
> then there is no hang, and things finish in relatively good time.
>
> I don't have a good stand-alone reproducer yet
> and am out of time for today.
Well, right after writing that, I realized the key to the misbehavior:
sort was reading from a *pipe*:

    # This finishes right away, reading from input file "k":
    seq 99 >k && for i in $(seq 33); do LC_ALL=C timeout 1 sort k > /dev/null & done

    # When reading from a pipe, it's a very different story:
    # Without the "timeout 7" prefix, the following would render an N-core
    # system (5 < N) unusable for many minutes.  As it is, be prepared:
    # my system goes unresponsive after 1 second, and doesn't return until
    # the timeout.
    for i in $(seq 33); do seq 88 | LC_ALL=C timeout 7 sort --para=5 > /dev/null & done

Occasionally, the above jobs all complete quickly.

My first question was why *any* processes were being spawned to handle
such a small input file.  The first answer is in the first hunk:

diff --git a/src/sort.c b/src/sort.c
index 13954cb..b9316e7 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -112,9 +112,8 @@ struct rlimit { size_t rlim_cur; };
 /* Heuristic value for the number of lines for which it is worth creating
    a subthread, during an internal merge sort, on a machine that has
    processors galore.  Currently this number is just a guess.
-   This value must be at least 4.  We don't know of any machine where
-   this number has any practical effect.  */
-enum { SUBTHREAD_LINES_HEURISTIC = 4 };
+   This value must be at least 4.  */
+enum { SUBTHREAD_LINES_HEURISTIC = 32 * 1024 };
 
 /* The number of threads after which there are
    diminishing performance gains.  */

The old definition of SUBTHREAD_LINES_HEURISTIC meant that any group of
5 or more lines would be split in two and sorted via two (or more)
separate threads.  Thus, with just 40 lines, you could get the maximum
of 8 threads working.  That is obviously not efficient, unless lines are
so pathologically long that the cost of comparing two of them approaches
the cost of creating a new thread.  With the above, sort uses a more
reasonable number.
Tests on high-end hardware and using very short lines suggest that a
value like 200,000 would still be conservative.

But even with that change, the above for-loop would sometimes hang.
Pursuing it further, I discovered that sort was allocating 130MB
(INPUT_FILE_SIZE_GUESS * 129, with the former being 1MiB) of space for
its input buffer, simply because it was reading from a pipe.

diff --git a/src/sort.c b/src/sort.c
index 13954cb..92c8d4e 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -316,7 +315,7 @@ static size_t merge_buffer_size = MAX (MIN_MERGE_BUFFER_SIZE, 256 * 1024);
 static size_t sort_size;
 
 /* The guessed size for non-regular files.  */
-#define INPUT_FILE_SIZE_GUESS (1024 * 1024)
+#define INPUT_FILE_SIZE_GUESS (64 * 1024)
 
 /* Array of directory names in which any temporary files are to be created.  */
 static char const **temp_dirs;

With that, the for-loop does not hang any more.  Note again that this
is on a multi-core system.  I'll investigate more tomorrow.