Great! Would you mind showing the revised table? I'm curious as to the relative
performance.
On Jun 11, 2013, at 4:53 PM, eblo...@1scom.net wrote:
Problem solved. I did not configure with --with-mxm=/opt/mellanox/mcm and
this location was not auto-detected. Once I rebuilt with this option,
everything worked fine. It scaled better than MVAPICH out to 800. The MVAPICH
configure log showed that it had found this component of the OFED stack.
Ed
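For reference, a configure invocation of the kind described above might look
roughly like this (a sketch only; the MXM path is the one quoted in this thread,
the install prefix is a placeholder, and any other site-specific options would
stay as before):

  ./configure --with-mxm=/opt/mellanox/mcm --prefix=<install-dir>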
Couple of things stand out. You should remove the following configure options:
--enable-mpi-thread-multiple
--with-threads
--enable-heterogeneous
Thread multiple is not ready yet in OMPI (and openib doesn't support threaded
operations anyway), and the support for heterogeneous systems really isn't working.
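For example, the configure line quoted below would then reduce to something like
this (a sketch; the remaining flags and prefix are the ones from the original
message):

  ./configure --with-sge --with-openib --with-hwloc --disable-vt \
      --enable-openib-dynamic-sl --prefix=/home/jescudero/opt/openmpi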
In fact, I have also tried to configure OpenMPI with this:
./configure --with-sge --with-openib --enable-mpi-thread-multiple
--with-threads --with-hwloc --enable-heterogeneous --disable-vt
--enable-openib-dynamic-sl --prefix=/home/jescudero/opt/openmpi
And the problem is still present.
If you run at 224 and things look okay, then I would suspect something in the
upper level switch that spans cabinets. At that point, I'd have to leave it to
Mellanox to advise.
On Jun 11, 2013, at 6:55 AM, "Blosch, Edwin L" wrote:
I tried adding "-mca btl openib,sm,self" but it did not make any difference.
Jesus' e-mail this morning has got me thinking. In our system, each cabinet
has 224 cores, and we are reaching a different level of the system architecture
when we go beyond 224. I got an additional data point at 256
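For clarity, the command line being described would look roughly like this (a
sketch; the process count and application name are placeholders):

  mpirun -np 256 -mca btl openib,sm,self ./my_app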
The --mca btl_openib_ib_path_record_service_level 1 flag controls the openib BTL;
you need to remove --mca mtl mxm from the command line.
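In other words, something along these lines (a sketch; process count and
application name are placeholders), with no "-mca mtl mxm" on the command line
so the openib BTL is actually used:

  mpirun -np 256 -mca btl openib,sm,self \
      -mca btl_openib_ib_path_record_service_level 1 ./my_app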
Have you compiled OpenMPI with the RHEL 6.4 inbox OFED driver? AFAIK, MOFED
2.x does not have XRC, and you mentioned the "--enable-openib-connectx-xrc" flag
in configure.
I have a 16-node Mellanox cluster built with Mellanox ConnectX3 cards.
Recently I updated MLNX_OFED to version 2.0.5. The reason for this e-mail
to the OpenMPI users list is that I am not able to run MPI applications
using the service levels (SLs) feature of OpenMPI.