I've been comparing 1.5.4 and 1.5.5rc3 with the same parameters, which is why
I didn't use --bind-to-core. I checked, and using --bind-to-core improves the
result compared to 1.5.4:
#repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
         1000        84.96        85.08        85.02
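
For reference, this is roughly the command for that run: the same 1.5.5rc3
invocation as below, just with --bind-to-core added (the exact position of the
flag is from memory, so treat it as a sketch):
(/opt/openmpi-1.5.5rc3/intel12/bin/mpirun -x OMP_NUM_THREADS=1 --bind-to-core
-hostfile hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca
coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 1 -np 256
openmpi-1.5.5rc3/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256
barrier)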

So I guess that with 1.5.5 the processes move from core to core within a node
even though I use all the cores, right? Then why does 1.5.4 behave differently?
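
To check that, I can rerun with --report-bindings; something like this should
print where each rank is placed (assuming --report-bindings behaves the same in
1.5.5rc3 as in 1.5.4):
(/opt/openmpi-1.5.5rc3/intel12/bin/mpirun --report-bindings --bind-to-core
-hostfile hosts_all2all_2 -npernode 32 -np 256 hostname)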

I need --bind-to-core in some cases, and that's why I need 1.5.5rc3 instead
of the more stable 1.5.4. I know that I can use numactl explicitly, but
--bind-to-core is more convenient :)
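
For completeness, the explicit numactl variant I mean is something like a small
wrapper that pins each local rank to one core via the OMPI_COMM_WORLD_LOCAL_RANK
variable that mpirun exports (the script name and the simple 1:1 rank-to-core
mapping are just an illustration; the core numbering on the 6276 nodes may need
adjusting):

#!/bin/sh
# bind.sh: pin this process to the core matching its local rank
exec numactl --physcpubind=$OMPI_COMM_WORLD_LOCAL_RANK "$@"

and then launch IMB through it, e.g. "mpirun ... ./bind.sh IMB-MPI1 ... barrier".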

2012/3/23 Ralph Castain <r...@open-mpi.org>

> I don't see where you told OMPI to --bind-to-core. We don't automatically
> bind, so you have to explicitly tell us to do so.
>
> On Mar 23, 2012, at 6:20 AM, Pavel Mezentsev wrote:
>
> > Hello
> >
> > I'm doing some testing with IMB and discovered a strange thing:
> >
> > Since I have a system with the new AMD Opteron 6276 processors, I'm using
> > 1.5.5rc3, since it supports binding to cores.
> >
> > But when I run the barrier test from the Intel MPI Benchmarks, the best I
> > get is:
> > #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
> >           598     15159.56     15211.05     15184.70
> > (/opt/openmpi-1.5.5rc3/intel12/bin/mpirun -x OMP_NUM_THREADS=1
> > -hostfile hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca
> > coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 1 -np 256
> > openmpi-1.5.5rc3/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256
> > barrier)
> >
> > And with Open MPI 1.5.4 the result is much better:
> > #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
> >          1000       113.23       113.33       113.28
> >
> > (/opt/openmpi-1.5.4/intel12/bin/mpirun -x OMP_NUM_THREADS=1 -hostfile
> > hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca
> > coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 3 -np 256
> > openmpi-1.5.4/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256
> > barrier)
> >
> > And still I couldn't come close to the result I got with MVAPICH2:
> > #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
> >          1000        17.51        17.53        17.53
> >
> > (/opt/mvapich2-1.8/intel12/bin/mpiexec.hydra -env OMP_NUM_THREADS 1
> > -hostfile hosts_all2all_2 -np 256 mvapich2-1.8/intel12/IMB-MPI1 -mem 2
> > -off_cache 16,64 -msglog 1:16 -npmin 256 barrier)
> >
> > I don't know whether this is a bug or whether I'm doing something wrong.
> > Is there a way to improve my results?
> >
> > Best regards,
> > Pavel Mezentsev
> >
