Much appreciated! Per some of my other comments on this thread and on the referenced ticket, can you tell me what kernel you have on that machine? I assume you have NUMA support enabled, given that chipset?
Thanks!
Ralph

On Wed, Jun 10, 2009 at 10:29 AM, Sylvain Jeaugey <sylvain.jeau...@bull.net> wrote:

> Hum, very glad that padb works with Open MPI; I couldn't live without it.
> In my opinion it is the best debug tool for parallel applications and, more
> importantly, the only one that scales.
>
> About the issue, I couldn't reproduce it on my platform (tried 2 nodes with
> 2 to 8 processes each; nodes are twin 2.93 GHz Nehalem, IB is Mellanox QDR).
>
> So my feeling is that it may be very hardware related. Especially if you
> use the hierarch component, some transactions will be done through RDMA on
> one side and read directly through shared memory on the other side, which
> can, depending on the hardware, produce very different timings and bugs.
> Did you try with a different collective component (i.e. not hierarch)? Or
> with another interconnect? [Yes, of course, if it is a race condition, we
> might well avoid the bug because the timings will be different, but that's
> still information.]
>
> Perhaps everything I'm saying makes no sense or you have already thought
> about this; anyway, if you want me to try different things, just let me
> know.
>
> Sylvain
>
>
> On Wed, 10 Jun 2009, Ralph Castain wrote:
>
>> Hi Ashley
>>
>> Thanks! I would definitely be interested and will look at the tool.
>> Meantime, I have filed a bunch of data on this in ticket #1944, so
>> perhaps you might take a glance at that and offer some thoughts?
>>
>> https://svn.open-mpi.org/trac/ompi/ticket/1944
>>
>> Will be back after I look at the tool.
>>
>> Thanks again
>> Ralph
>>
>>
>> On Wed, Jun 10, 2009 at 8:51 AM, Ashley Pittman <ash...@pittman.co.uk>
>> wrote:
>>
>> Ralph,
>>
>> If I may say, this is exactly the type of problem the tool I have been
>> working on recently aims to help with, and I'd be happy to help you
>> through it.
>>
>> Firstly, of the three collectives you mention, MPI_Allgather exhibits a
>> many-to-many, MPI_Reduce a many-to-one and MPI_Bcast a one-to-many
>> communication pattern. The scenario of a root process falling behind and
>> getting swamped in comms is a plausible one for MPI_Reduce only, but it
>> doesn't hold water for the other two. You also don't mention whether the
>> loop is over a single collective or whether you have a loop calling a
>> number of different collectives each iteration.
>>
>> padb, the tool I've been working on, has the ability to look at parallel
>> jobs and report on the state of collective comms, and should help narrow
>> you down on erroneous processes and those simply blocked waiting for
>> comms. I'd recommend using it to look at maybe four or five instances
>> where the application has hung and look for any common features between
>> them.
>>
>> Let me know if you are willing to try this route and I'll talk you
>> through it. The code is downloadable from http://padb.pittman.org.uk and
>> if you want the full collective functionality you'll need to patch
>> Open MPI with the patch from http://padb.pittman.org.uk/extensions.html
>>
>> Ashley.
>>
>> --
>>
>> Ashley Pittman, Bath, UK.
>>
>> Padb - A parallel job inspection tool for cluster computing
>> http://padb.pittman.org.uk
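P.S. On Sylvain's suggestion of ruling out the hierarch component: one way to do that, assuming the hang reproduces with your normal command line, is to exclude hierarch from the coll framework via the usual MCA "^" negation syntax (the application name and process count below are just placeholders):

    mpirun --mca coll ^hierarch -np 16 ./my_app

If the hang goes away with hierarch excluded, that would point fairly strongly at the RDMA/shared-memory interaction Sylvain describes.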
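P.P.S. To make Ashley's question concrete: the second pattern he asks about would look roughly like the hypothetical sketch below, where each iteration mixes all three collectives. The function name, buffer sizes and iteration counts are invented for illustration and are not taken from the actual application.

    #include <stdlib.h>
    #include <mpi.h>

    /* Hypothetical inner loop mixing the three collectives discussed above.
     * Names and sizes are invented for illustration only. */
    static void solver_loop(int n, int niters, MPI_Comm comm)
    {
        int size;
        MPI_Comm_size(comm, &size);

        double *local    = calloc((size_t)n, sizeof(double));
        double *gathered = malloc((size_t)n * size * sizeof(double)); /* n doubles per rank */
        double *sum      = malloc((size_t)n * sizeof(double));

        for (int i = 0; i < niters; i++) {
            /* many-to-many: every rank receives every rank's block */
            MPI_Allgather(local, n, MPI_DOUBLE, gathered, n, MPI_DOUBLE, comm);

            /* many-to-one: only the root (rank 0) can get swamped here */
            MPI_Reduce(local, sum, n, MPI_DOUBLE, MPI_SUM, 0, comm);

            /* one-to-many: the root pushes the reduced result back out */
            MPI_Bcast(sum, n, MPI_DOUBLE, 0, comm);
        }

        free(local);
        free(gathered);
        free(sum);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        solver_loop(1024, 1000, MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }

Whether the real code looks like this or loops over a single collective makes a difference to which of the hang scenarios above are even plausible, which is why the distinction matters.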