[OMPI devel] non-blocking barrier

2012-07-06 Thread Eugene Loh
Either there is a problem with MPI_Ibarrier or I don't understand the semantics. The following example is with openmpi-1.9a1r26747. (Thanks for the fix in 26757. I tried with that as well with same results.) I get similar results for different OSes, compilers, bitness, etc. % cat ibarrier

Re: [OMPI devel] non-blocking barrier

2012-07-06 Thread George Bosilca
No, it is not right. With the ibarrier usage you're making below, the output should be similar to the first case (all should leave at the earliest at 6.0). The ibarrier is still a synchronizing point; all processes MUST reach it before anyone is allowed to leave. However, if you move the ibarrier on

Re: [OMPI devel] SM component init unload

2012-07-06 Thread George Bosilca
You're right, the code was overzealous. I fixed it by removing the parsing of the modex data completely. In any case, the collective module has another chance of deselecting itself, upon creation of a new communicator (thus, after the modex was completed). George On Jul 6, 2012, at 2:20, Ral

Re: [OMPI devel] openib max_cqe

2012-07-06 Thread TERRY DONTJE
On 7/5/2012 5:47 PM, Shamis, Pavel wrote: I mentioned on the call that for Mellanox devices (+OFA verbs) this resource is really cheap. Do you run mellanox hca + OFA verbs ? (I'll reply because I know Terry is offline for the rest of the day) Yes, he does. I asked because SUN used to have o

[OMPI devel] VampirTrace: time not increasing

2012-07-06 Thread Fluder, Eugene
I got the following error running a VT enabled run of AMBER. This was reported in December of 2009 under almost identical conditions but the thread does not contain a resolution. I reran the test with VT_UNIFY=no and it completed normally. The same error occurred when I ran vtunify separately.

Re: [OMPI devel] VampirTrace: time not increasing

2012-07-06 Thread Holger Mickler
Hi Gene, this error is often caused by insufficiently synchronized TSCs (time stamp counter) of different processors/cores. When VT uses the TSC for timing the events (it does that by default), and the processes switch to another core during execution, it may well happen that the next recorded tim
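
The "time not increasing" condition named in the thread title can be illustrated with a small check over one process's recorded timestamps. This is only a sketch and not VampirTrace code: the function name `first_non_increasing` and the sample values are hypothetical, and it assumes that a timestamp equal to its predecessor also counts as "not increasing".

```python
def first_non_increasing(timestamps):
    """Return the index of the first timestamp that is not strictly
    greater than its predecessor, or None if the sequence is increasing.

    Hypothetical helper for illustration; not part of VampirTrace.
    """
    for i in range(1, len(timestamps)):
        if timestamps[i] <= timestamps[i - 1]:
            return i
    return None


# Made-up counter values: the third event carries a lower value than the
# second, as could happen if it were stamped from a counter that lags
# behind the one used for the previous event.
events = [100, 150, 140, 200]
bad_index = first_non_increasing(events)
```

Here `bad_index` points at the event whose timestamp went backwards relative to the previous one.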

Re: [OMPI devel] non-blocking barrier

2012-07-06 Thread Richard Graham
Don't agree here - the only synchronization point is the completion. Ibarrier can't be completed until all have entered the barrier, but each process can leave the ibarrier() call as soon as they want to. Rich -Original Message- From: devel-boun...@open-mpi.org [mailto:devel-boun...@o

Re: [OMPI devel] non-blocking barrier

2012-07-06 Thread Richard Graham
Forget what I just posted - I looked at George's words, and not the code - wait() is the synchronization point, so George's response is correct. Rich -Original Message- From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf Of George Bosilca Sent: Friday, July 06
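
The semantics the thread converges on (entering the barrier never blocks, while completion through wait is the synchronization point) can be modeled with a toy sketch. This is not MPI code and says nothing about the Open MPI implementation: it uses Python threads in place of processes, and `NonBlockingBarrier`, its methods, and `run_demo` are hypothetical names chosen for illustration.

```python
import threading
import time


class NonBlockingBarrier:
    """Toy thread-based model of a non-blocking barrier (not real MPI)."""

    def __init__(self, parties):
        self._parties = parties
        self._entered = 0
        self._lock = threading.Lock()
        self._all_entered = threading.Event()

    def ibarrier(self):
        # Entering never blocks: record the arrival and hand back a
        # "request" (here simply the shared completion event).
        with self._lock:
            self._entered += 1
            if self._entered == self._parties:
                self._all_entered.set()
        return self._all_entered

    @staticmethod
    def wait(request, timeout=None):
        # Completion is the synchronization point: this returns True only
        # once every party has entered, or False if the timeout expires.
        return request.wait(timeout)


def run_demo(parties=3, delay=0.2):
    """Stagger arrivals and record when each rank enters and completes."""
    barrier = NonBlockingBarrier(parties)
    start = time.monotonic()
    entered_at = [None] * parties
    completed_at = [None] * parties

    def worker(rank):
        time.sleep(rank * delay)  # ranks arrive at different times
        entered_at[rank] = time.monotonic() - start
        request = barrier.ibarrier()  # returns immediately
        NonBlockingBarrier.wait(request)  # blocks until all have entered
        completed_at[rank] = time.monotonic() - start

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return entered_at, completed_at
```

In this model no rank completes its wait before the last rank has entered, which matches the statement that all processes must reach the barrier before anyone is allowed to leave.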

Re: [OMPI devel] VampirTrace: time not increasing

2012-07-06 Thread Holger Mickler
Oh, I just realized that you are probably using the Open MPI version of VT which builds as part of the Open MPI build. I'm not 100% sure if the modification of config.h works as laid out, but it should... you need to look out for VT's config.h then, not Open MPI's. Holger On 07/06/2012 04:54 PM,

Re: [OMPI devel] [EXTERNAL] Re: non-blocking barrier

2012-07-06 Thread Barrett, Brian W
Yeah, there was a bug in the code. Fixed now. Brian On 7/6/12 10:47 AM, "Richard Graham" wrote: >Forget what I just posted - I looked at George's words, and not the code >- wait() is the synchronization point, so George's response is correct. > >Rich > >-Original Message- >From: devel-

Re: [OMPI devel] VampirTrace: time not increasing

2012-07-06 Thread Fluder, Eugene
Holger, Thanks. I appreciate the detail. Gene -- Eugene M Fluder, Jr, PhD Computational Scientist Scientific Computing The Mt. Sinai School of Medicine One Gustave L. Levy Place, Box 1498 New York, NY 10029-6574 T: 212 659 8608 F: 646

[OMPI devel] reduce_scatter_block failing on v1.7

2012-07-06 Thread Eugene Loh
The new reduce_scatter_block test is segfaulting with v1.7 but not with the trunk. When we drop down into MPI_Reduce_scatter_block and attempt to call comm->c_coll.coll_reduce_scatter_block() it's NULL. (So is comm->c_coll.coll_reduce_scatter_block_module.) Is there some work on the trunk t

Re: [OMPI devel] [EXTERNAL] reduce_scatter_block failing on v1.7

2012-07-06 Thread Barrett, Brian W
On 7/6/12 2:31 PM, "Eugene Loh" wrote: >The new reduce_scatter_block test is segfaulting with v1.7 but not with >the trunk. When we drop down into MPI_Reduce_scatter_block and attempt >to call > >comm->c_coll.coll_reduce_scatter_block() > >it's NULL. (So is comm->c_coll.coll_reduce_scatter_bloc

Re: [OMPI devel] VampirTrace: time not increasing

2012-07-06 Thread Fluder, Eugene
Thanks. Now that I know what to look for, I should be able to figure it out. BTW, I switched the script that ultimately runs the mpiexec from tcsh to bash and the problem went away. Not complaining but do you have any idea why that might be? Gene -- Eugene M Fluder, Jr, PhD Computat

[OMPI devel] ibm/collective/bcast_f08.f90

2012-07-06 Thread Eugene Loh
I assume this is an orphaned file that should be removed? (It looks like a draft version of ibcast_f08.f90.)