Re: [OMPI devel] barrier problem

2012-03-28 Thread Pavel Mezentsev
I took the best result from each version, which is why different algorithm numbers were chosen. I've studied the matter a bit further and here's what I got. With openmpi 1.5.4 these are the average times: /opt/openmpi-1.5.4/intel12/bin/mpirun -x OMP_NUM_THREADS=1 -hostfile hosts_all2all_4 -npernode

Re: [OMPI devel] barrier problem

2012-03-27 Thread Jeffrey Squyres
FWIW: 1. There were definitely some issues with binding to cores and process layouts on Opterons that should be fixed in the 1.5.5 that was finally released today. 2. It is strange that the performance of barrier is so much different between 1.5.4 and 1.5.5. Is there a reason you were choosing

Re: [OMPI devel] barrier problem

2012-03-23 Thread Shamis, Pavel
Pavel, MVAPICH implements multicore-optimized collectives, which perform substantially better than the default algorithms. FYI, the ORNL team is working on a new high-performance collectives framework for OMPI. The framework provides a significant boost in collectives performance. Regards, Pavel (Pasha) Shami

Re: [OMPI devel] barrier problem

2012-03-23 Thread Pavel Mezentsev
I've been comparing 1.5.4 and 1.5.5rc3 with the same parameters, which is why I didn't use --bind-to-core. I checked, and using --bind-to-core improved the result compared to 1.5.4: #repetitions t_min[usec] t_max[usec] t_avg[usec] 1000 84.96 85.08 85.02 So I gu

Re: [OMPI devel] barrier problem

2012-03-23 Thread Ralph Castain
I don't see where you told OMPI to --bind-to-core. We don't automatically bind, so you have to explicitly tell us to do so. On Mar 23, 2012, at 6:20 AM, Pavel Mezentsev wrote: > Hello > > I'm doing some testing with IMB and discovered a strange thing: > > Since I have a system with new AMD opte

[OMPI devel] barrier problem

2012-03-23 Thread Pavel Mezentsev
Hello I'm doing some testing with IMB and discovered a strange thing: Since I have a system with new AMD Opteron 6276 processors I'm using 1.5.5rc3 since it supports binding to cores. But when I run the barrier test from the Intel MPI Benchmarks, the best I get is: #repetitions t_min[usec] t_max[us