http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

            Bug ID: 60035
           Summary: [PATCH] make it possible to use OMP on both sides of a
                    fork (without violating standard)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: njs at pobox dot com
                CC: jakub at gcc dot gnu.org

Created attachment 32019
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32019&action=edit
patch to make openmp -> quiesce -> fork -> openmp work

This is a re-open of #52303 and #58378, with more arguments, and a proposed
patch that fixes the problem without violating the OpenMP standard.

Background: Almost all scientific/numerical code delegates linear algebra
operations to some optimized BLAS library. Currently, the main contenders for
this library are:
1) ATLAS: free software, but uses extensive build-time configuration, which
means it must be re-compiled from source by every user to achieve competitive
performance.
2) MKL: proprietary, but technically excellent.
3) OpenBLAS: free software, but uses OpenMP for threading, which means that any
program which does linear algebra and also expects fork() to work is screwed
[1], at least when using GCC.

This means that for projects like numpy, which are used in a very large range
of downstream products, we are pretty much screwed too. Many of our users use
fork(), for various good reasons that I can elaborate on if desired, so we can't
just recommend OpenBLAS in general -- ATLAS or MKL are superior for . But
recompiling ATLAS is difficult, so we can't recommend that as a general
solution, or provide it in pre-compiled downloads. So what we end up doing is
shipping slow, unoptimized BLAS, while all the major "scientific python"
distros ship MKL; and we also get constantly pressured by users to either ship
binaries with MKL or with OpenBLAS built with icc; and we field a new bug
report every week or two from people who use OpenBLAS without realizing it and
are experiencing mysterious hangs. (Or sometimes other projects get caught in
the crossfire, e.g. [2] which is someone trying to figure out why their web-app
can't generate plot graphics when using the celery job queue manager.)
Meanwhile people are waiting with bated breath for clang to get an OpenMP
implementation so that they can shift their whole stack over there, solely
because of this one bug.

Basically the current situation is causing ongoing pain for a wide range of
people and makes free software uncompetitive with proprietary software for
scientific code using Python in general. But it doesn't have to be this way! In
actual practice on real implementations -- regardless of what POSIX says --
it's perfectly safe to use arbitrary POSIX APIs after fork, so long as all
threads are in a known, quiescent state when the fork occurs.

The attached patch has essentially no impact on compliant OpenMP-using
programs; in particular, and unlike the patch in #58378, it has no effect on
the behavior of the parent process, and in the child process it does nothing
that violates POSIX unless the user has violated POSIX first. But it makes it
safe in practice to use OpenMP encapsulated within a serial library API,
without mysterious breakage depending on far away parts of the program, and in
particular should fix the OpenBLAS issue.

Test case included in patch is by Olivier Grisel, from #58378. Patch is against
current gcc svn trunk (r206297).

[1] https://github.com/xianyi/OpenBLAS/issues/294
[2] https://github.com/celery/celery/issues/1842
