I am pretty sure MTLs and BTLs are very different, but just as a note, this user's code (CRASH) hangs at MPI_Allreduce() in:

openib

But runs on:

tcp
psm (an MTL, different hardware)

Putting it out there in case it has any bearing. Otherwise ignore.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On May 12, 2011, at 10:20 AM, Brock Palen wrote:

> On May 12, 2011, at 10:13 AM, Jeff Squyres wrote:
>
>> On May 11, 2011, at 3:21 PM, Dave Love wrote:
>>
>>> We can reproduce it with IMB. We could provide access, but we'd have to
>>> negotiate with the owners of the relevant nodes to give you interactive
>>> access to them. Maybe Brock's would be more accessible? (If you
>>> contact me, I may not be able to respond for a few days.)
>>
>> Brock has replied off-list that he, too, is able to reliably reproduce the
>> issue with IMB, and is working to get access for us. Many thanks for your
>> offer; let's see where Brock's access takes us.
>
> I should also note that as far as I know I have three codes (CRASH, NAMD
> (some cases), and another user code) that lock up on a collective on OpenIB
> but run with the same library on Gig-E.
>
> So I am not sure it is limited to IMB, or I could be conflating errors;
> normally I would assume unmatched eager recvs for this sort of problem.
>
>>
>>>> -- we have not closed this issue,
>>>
>>> Which issue? I couldn't find a relevant-looking one.
>>
>> https://svn.open-mpi.org/trac/ompi/ticket/2714
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users