Re: [hwloc-devel] Graceful abort for non-C99 compilers

2010-05-10 Thread Pavan Balaji


Darn. The patch is incorrect. Sorry, too sleepy. It should be checking 
for $ac_prog_cc_c99 instead. But you get the idea.
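For reference, a minimal sketch of such a configure-time guard. This assumes autoconf's AC_PROG_CC_C99 macro (autoconf >= 2.60), whose cache variable is ac_cv_prog_cc_c99; the exact variable name used in the MPICH2 patch may differ:

```shell
# Sketch only: AC_PROG_CC_C99 tries to put the compiler into C99 mode
# and caches the result. If no C99 mode was found, abort configure
# with a clear message instead of failing later at make time.
AC_PROG_CC_C99
if test "x$ac_cv_prog_cc_c99" = "xno" ; then
    AC_MSG_ERROR([hwloc requires a C99-capable C compiler])
fi
```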


 -- Pavan

On 05/10/2010 08:56 PM, Pavan Balaji wrote:
I understand that hwloc requires C99 support. However, for compilers 
that don't support C99, would you be willing to gracefully abort during 
configure instead of failing at make time?


https://trac.mcs.anl.gov/projects/mpich2/changeset/6634

I agree that most compilers today probably support C99, but whether to 
drop C89-only support is a policy decision we have yet to make in MPICH2, 
and we will need the above patch to get hwloc integrated cleanly into all 
our supported environments.


Thanks,

  -- Pavan



--
Pavan Balaji
http://www.mcs.anl.gov/~balaji


[hwloc-devel] Graceful abort for non-C99 compilers

2010-05-10 Thread Pavan Balaji


I understand that hwloc requires C99 support. However, for compilers 
that don't support C99, would you be willing to gracefully abort during 
configure instead of failing at make time?


https://trac.mcs.anl.gov/projects/mpich2/changeset/6634

I agree that most compilers today probably support C99, but whether to 
drop C89-only support is a policy decision we have yet to make in MPICH2, 
and we will need the above patch to get hwloc integrated cleanly into all 
our supported environments.


Thanks,

 -- Pavan

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji


Re: [OMPI devel] Thread safety levels

2010-05-10 Thread Sylvain Jeaugey

On Mon, 10 May 2010, N.M. Maclaren wrote:


As explained by Sylvain, current Open MPI implementation always returns
MPI_THREAD_SINGLE as provided thread level if neither --enable-mpi-threads
nor --enable-progress-threads was specified at configure (v1.4).


That is definitely the correct action.  Unless an application or library 
has been built with thread support, or can be guaranteed to be called only 
from a single thread, using threads is catastrophic.

I personally see that as a bug, but I certainly lack some knowledge of 
non-Linux OSes. From my point of view, any normal library should be 
THREAD_SERIALIZED, and a thread-safe library should be THREAD_MULTIPLE. I 
don't see other libraries that claim to be "totally incompatible with 
the use of threads". They may not be thread-safe, in which case the 
programmer must ensure locking and memory coherency to use them in 
conjunction with threads, but that is essentially what THREAD_SERIALIZED 
means, IMO.



And, regrettably, given modern approaches to building software and the 
configure design, configure is where the test has to go.

configure is where the test is. And configure indeed reports "We have 
threads" (OMPI_HAVE_THREADS = 1). Given this, I don't see why we 
wouldn't be MPI_THREAD_SERIALIZED, or at least MPI_THREAD_FUNNELED.



On some systems, there are certain actions that require thread affinity
(sometimes including I/O, and often undocumented).  zOS is one, but I
have seen it under a few Unices, too.

On others, they use a completely different (and seriously incompatible,
at both the syntactic and semantic levels) set of libraries.  E.g. AIX.


If we use OpenMP with MPI, we need at least MPI_THREAD_FUNNELED even
if MPI functions are called only outside of omp parallel region,
like below.

   #pragma omp parallel for
   for (...) {
       /* computation */
   }
   MPI_Allreduce(...);
   #pragma omp parallel for
   for (...) {
       /* computation */
   }


I don't think that's correct.  That would call MPI_Allreduce once for
each thread, in parallel, on the same process - which wouldn't work.
I think the idea is precisely _not_ to call MPI_Allreduce within parallel 
sections, i.e. only have the master thread call MPI.



This means Open MPI users must specify --enable-mpi-threads or
--enable-progress-threads to use OpenMP. Is that true?
But these two configure options, i.e. the OMPI_HAVE_THREAD_SUPPORT macro,
lead to a performance penalty from mutex lock/unlock.


That's unavoidable, in general, with one niggle.  If the programmer
guarantees BOTH to call MPI on the global master thread AND to ensure
that all memory is synchronised before it does so, there is no need
for mutexes. The MPI specification lacks some of the necessary
paranoia in this respect.
In my understanding of MPI_THREAD_SERIALIZED, memory coherency is 
guaranteed. If not, the programmer has to ensure it.



I believe OMPI_HAVE_THREADS (not OMPI_HAVE_THREAD_SUPPORT!) is sufficient
to support MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED, and therefore
OMPI_HAVE_THREAD_SUPPORT should be OMPI_HAVE_THREADS at the following
part of the ompi_mpi_init function, as suggested by Sylvain.


I can't comment on that, though I doubt it's quite that simple.  There's
a big difference between MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED
in implementation impact.
I don't see the relationship between THREAD_SERIALIZED/FUNNELED and 
OMPI_HAVE_THREAD_SUPPORT. Actually, OMPI_HAVE_THREAD_SUPPORT seems to have 
no relationship with how the OS supports threads (that's why I think it is 
misleading).


But I don't see a big difference between THREAD_SERIALIZED and 
THREAD_FUNNELED anyway. Do you have more information on systems where the 
calling thread's id makes a difference to MPI?


Just for the record, we (at Bull) patched our MPI library and have had no 
problems so far with applications using MPI + threads or MPI + OpenMP, 
given that they don't call MPI within parallel sections. But of course, we 
only use Linux, so your mileage may vary.


Sylvain