Jim Meyering wrote:
> Running "make -j25 check" on a nominal-12-core F14 system would
> cause serious difficulty leading to an OOM kill -- and this is brand new.
> It worked fine yesterday.  I tracked it down to all of the make processes
> working on the "built_programs.list" (in src/Makefile.am) rule
>
> built_programs.list:
>       @echo $(bin_PROGRAMS) $(bin_SCRIPTS) | tr ' ' '\n' \
>         | sed -e 's,$(EXEEXT)$$,,' | $(ASSORT) -u | tr '\n' ' '
>
> Which made me realize we were running that submake over 400 times,
> once per test script (including skipped ones).  That's well worth
> avoiding, even if it means a new temporary file.
>
> I don't know the root cause of the OOM-kill (preceded by interminable
> minutes of a seemingly hung and barely responsive system) or why it started
> happening today (afaics, none of the programs involved was updated),
> but this does fix it...

FYI,
I've tracked this down a little further.
The horrid performance (hung system and eventual OOM-kill)
is related to the use of sort above.  This is the definition:

    ASSORT = LC_ALL=C sort

If I revert my earlier patch and instead simply
insist that sort not do anything in parallel,

    ASSORT = LC_ALL=C sort --parallel=1

then there is no hang, and things finish in relatively good time.
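
For anyone who wants to poke at the pipeline outside of make, here is a
minimal sketch of the same rule as a shell function.  It is not a
reproducer for the hang.  The function name list_programs and the sample
program names are made up for illustration, and the feature test is only
an assumption about how one might cope with a sort that lacks --parallel:

```shell
# Use --parallel=1 only when the installed sort accepts it;
# otherwise fall back to plain sort.
if LC_ALL=C sort --parallel=1 </dev/null >/dev/null 2>&1; then
  ASSORT='sort --parallel=1'
else
  ASSORT='sort'
fi

# list_programs EXT NAME...  (hypothetical helper)
# Mirrors the built_programs.list rule: strip the executable suffix,
# sort in the C locale, drop duplicates, and print on one line.
list_programs() {
  ext=$1; shift
  echo "$@" | tr ' ' '\n' \
    | sed -e "s,$ext\$,," | LC_ALL=C $ASSORT -u | tr '\n' ' '
}

# Stand-ins for $(EXEEXT) and $(bin_PROGRAMS) $(bin_SCRIPTS).
list_programs .exe ls.exe cat.exe sort.exe ls.exe
```

Running many copies of that at once, with and without --parallel=1,
may be a starting point for a stand-alone test case.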

I don't have a good stand-alone reproducer yet
and am out of time for today.
