On Mon, 10 May 2010, N.M. Maclaren wrote:

As explained by Sylvain, the current Open MPI implementation always returns
MPI_THREAD_SINGLE as the provided thread level if neither --enable-mpi-threads
nor --enable-progress-threads was specified at configure time (v1.4).

That is definitely the correct action. Unless an application or library has been built with thread support, or can be guaranteed to be called only from a single thread, using threads is catastrophic.
I personally see that as a bug, but I certainly lack some knowledge of non-Linux OSes. From my point of view, any normal library should be THREAD_SERIALIZED, and a thread-safe library should be THREAD_MULTIPLE. I don't see other libraries that claim to be "totally incompatible with the use of threads". They may not be thread-safe, in which case the programmer must ensure locking and memory coherency to use them in conjunction with threads, but that is precisely what THREAD_SERIALIZED is about, IMO.
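
To illustrate, here is a minimal sketch (pthreads; the worker function and lock are hypothetical, not from Open MPI) of what THREAD_SERIALIZED asks of the programmer: one lock serializes the MPI calls, and locking/unlocking it also provides the memory synchronisation POSIX requires:

   #include <mpi.h>
   #include <pthread.h>
   #include <stddef.h>

   /* Hypothetical global lock guarding every MPI call. */
   static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;

   void *worker(void *arg)
   {
       int rank;
       /* Any thread may call MPI, but never two at once; POSIX
        * guarantees the lock/unlock also synchronise memory. */
       pthread_mutex_lock(&mpi_lock);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       pthread_mutex_unlock(&mpi_lock);
       return NULL;
   }

   int main(int argc, char **argv)
   {
       pthread_t t1, t2;
       int provided;
       /* provided must come back >= MPI_THREAD_SERIALIZED for this
        * pattern to be legal. */
       MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
       pthread_create(&t1, NULL, worker, NULL);
       pthread_create(&t2, NULL, worker, NULL);
       pthread_join(t1, NULL);
       pthread_join(t2, NULL);
       MPI_Finalize();
       return 0;
   }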

And, regrettably,
given modern approaches to building software and the **** configure
design, configure is where the test has to go.
configure is where the test is. And configure indeed reports "We have threads" (OMPI_HAVE_THREADS = 1). Given this, I don't see why we wouldn't be MPI_THREAD_SERIALIZED, or at least MPI_THREAD_FUNNELED.
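
For reference, this is the handshake in question as the application sees it; the requested level is only a request, and the code must check what actually comes back:

   #include <mpi.h>
   #include <stdio.h>

   int main(int argc, char **argv)
   {
       int provided;
       /* Ask for SERIALIZED; the library may legally return less,
        * e.g. MPI_THREAD_SINGLE in today's default build. */
       MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
       if (provided < MPI_THREAD_SERIALIZED)
           fprintf(stderr, "got thread level %d, wanted %d\n",
                   provided, MPI_THREAD_SERIALIZED);
       MPI_Finalize();
       return 0;
   }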

On some systems, there are certain actions that require thread affinity
(sometimes including I/O, and often undocumented).  z/OS is one, but I
have seen it under a few Unices, too.

On others, they use a completely different (and seriously incompatible,
at both the syntactic and semantic levels) set of libraries.  E.g. AIX.

If we use OpenMP with MPI, we need at least MPI_THREAD_FUNNELED even
if MPI functions are called only outside of omp parallel regions,
like below.

   #pragma omp parallel for
   for (...) {
       /* computation */
   }
   MPI_Allreduce(...);
   #pragma omp parallel for
   for (...) {
       /* computation */
   }

I don't think that's correct.  That would call MPI_Allreduce once for
each thread, in parallel, on the same process - which wouldn't work.
I think the idea is precisely _not_ to call MPI_Allreduce within parallel sections, i.e. only have the master thread call MPI.
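
To make that concrete, here is a complete sketch of the pattern (the reduction is hypothetical, just to have something to compute): MPI is initialized with MPI_THREAD_FUNNELED, and every MPI call happens outside the parallel regions, on the thread that called MPI_Init_thread:

   #include <mpi.h>
   #include <stdio.h>

   #define N 1000

   int main(int argc, char **argv)
   {
       int provided, i;
       double sum = 0.0, global = 0.0;

       MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

       /* Parallel computation: no MPI calls inside. */
       #pragma omp parallel for reduction(+:sum)
       for (i = 0; i < N; i++)
           sum += (double)i;

       /* MPI called once, by the thread that initialized it,
        * outside any parallel region. */
       MPI_Allreduce(&sum, &global, 1, MPI_DOUBLE, MPI_SUM,
                     MPI_COMM_WORLD);

       printf("global sum = %f\n", global);
       MPI_Finalize();
       return 0;
   }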

This means Open MPI users must specify --enable-mpi-threads or
--enable-progress-threads to use OpenMP. Is that true?
But these two configure options, i.e. the OMPI_HAVE_THREAD_SUPPORT macro,
lead to a performance penalty through mutex lock/unlock.

That's unavoidable, in general, with one niggle.  If the programmer
guarantees BOTH to call MPI on the global master thread AND to ensure
that all memory is synchronised before doing so, there is no need
for mutexes. The MPI specification lacks some of the necessary
paranoia in this respect.
In my understanding of MPI_THREAD_SERIALIZED, memory coherency is guaranteed. If not, the programmer has to ensure it.
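
And the no-mutex case described above can be written down. A sketch (the kernel is hypothetical, and it assumes MPI_Init_thread was called with MPI_THREAD_FUNNELED by the same thread that enters the parallel region) where OpenMP barriers supply the memory synchronisation, so no mutex is involved:

   #include <mpi.h>
   #include <omp.h>

   #define N 1000
   static double data[N];   /* hypothetical shared input */

   void step(double *result)
   {
       int i;
       double sum = 0.0, global = 0.0;
       #pragma omp parallel
       {
           #pragma omp for reduction(+:sum)
           for (i = 0; i < N; i++)
               sum += data[i];
           /* implicit barrier (and flush) at the end of omp for:
            * sum is complete and visible before master reads it */
           #pragma omp master
           MPI_Allreduce(&sum, &global, 1, MPI_DOUBLE, MPI_SUM,
                         MPI_COMM_WORLD);
           /* no implicit barrier after master: wait (and flush)
            * before any thread reads global */
           #pragma omp barrier
       }
       *result = global;
   }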

I believe OMPI_HAVE_THREADS (not OMPI_HAVE_THREAD_SUPPORT!) is sufficient
to support MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED, and therefore
OMPI_HAVE_THREAD_SUPPORT should be OMPI_HAVE_THREADS in the following
part of the ompi_mpi_init function, as suggested by Sylvain.

I can't comment on that, though I doubt it's quite that simple.  There's
a big difference between MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED
in implementation impact.
I don't see the relationship between THREAD_SERIALIZED/FUNNELED and OMPI_HAVE_THREAD_SUPPORT. Actually, OMPI_HAVE_THREAD_SUPPORT seems to have no relationship with how the OS supports threads (that's why I think it is misleading).

But I don't see a big difference between THREAD_SERIALIZED and THREAD_FUNNELED anyway. Do you have more information on systems where the caller's thread id makes a difference in MPI?
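
The only portable handle on that difference I know of is MPI_Is_thread_main: under FUNNELED, only the thread for which it returns true may call MPI, while SERIALIZED drops that restriction. A hypothetical guard to catch misuse:

   #include <mpi.h>
   #include <assert.h>

   /* Hypothetical guard: under MPI_THREAD_FUNNELED only the thread
    * that called MPI_Init_thread may make MPI calls. */
   static void assert_funneled_caller(void)
   {
       int is_main;
       MPI_Is_thread_main(&is_main);
       assert(is_main && "FUNNELED: call MPI from the main thread only");
   }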

Just for the record, we (at Bull) patched our MPI library and have had no problems so far with applications using MPI + threads or MPI + OpenMP, given that they don't call MPI within parallel sections. But of course, we only use Linux, so your mileage may vary.

Sylvain
