Re: [OMPI users] OpenMPI + InfiniBand

2016-12-26 Thread Sergei Hrushev
Hi Gilles!


> this looks like a very different issue, orted cannot be remotely started.
> ...
>
> a better option (as long as you do not plan to relocate Open MPI install
> dir) is to configure with
>
> --enable-mpirun-prefix-by-default
>

Yes, that was a problem with orted.
I checked the PATH and LD_LIBRARY_PATH variables and both are set, but that
was not enough!

So I added --enable-mpirun-prefix-by-default to configure, and the recompiled
version now works properly even when --prefix isn't specified.

When the Ethernet transport is used, everything works both with and without
--enable-mpirun-prefix-by-default.
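
For the archives, the rebuild was roughly as follows (the install prefix here
is only an example, not our real path):

$ ./configure --prefix=/opt/openmpi-1.10.2 --with-verbs \
    --enable-mpirun-prefix-by-default
$ make -j4 && make install

With this option, mpirun adds its own installation's bin/ and lib/ directories
to PATH and LD_LIBRARY_PATH on the remote nodes before launching orted, so the
environment of a non-interactive login no longer matters.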

Thank you!

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-22 Thread Sergei Hrushev
Hi All!

As there were no positive changes with the "UDCM + IPoIB" problem since my
previous post, we installed IPoIB on the cluster, and the "No OpenFabrics
connection..." error no longer appears.
But now Open MPI reports another problem:

In the app's error output stream:

[node2:14142] [[37935,0],0] ORTE_ERROR_LOG: Data unpack had inadequate
space in file base/plm_base_launch_support.c at line 1035

In the app's output stream:

--
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp
(--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

*  compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).
--

When I run the task on a single node, everything works properly.
But when I run it on 2 nodes, the problem appears.

I tried ping between the IPoIB addresses: all hostnames resolve properly, and
ping requests and replies go over IB without any problems.
So all nodes (including the head node) see each other via IPoIB, but the MPI
app still fails.

The same test task works perfectly on all nodes when run with the Ethernet
transport instead of InfiniBand.
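
For reference, the failing 2-node case can be reproduced roughly like this
(host names and the test binary are placeholders; plm_base_verbose just adds
extra daemon-launch tracing):

$ mpirun -np 2 --host node1,node2 --mca plm_base_verbose 10 ./mpi_test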

P.S. We use the Torque resource manager to enqueue MPI tasks.

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-02 Thread Sergei Hrushev
Hi Nathan!

> UDCM does not require IPoIB. It should be working for you. Can you build
> Open MPI with --enable-debug and run with -mca btl_base_verbose 100 and
> create a gist with the output.
>
Ok, done:

https://gist.github.com/hsa-online/30bb27a90bb7b225b233cc2af11b3942
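
In case it is useful, after rebuilding with --enable-debug the output in the
gist was captured roughly like this (the test binary name is a placeholder):

$ mpirun --mca btl_base_verbose 100 ./mpi_test 2>&1 | tee btl_verbose.log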


Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
>
> I actually just filed a Github issue to ask this exact question:
>
> https://github.com/open-mpi/ompi/issues/2326
>
>
Good idea, thanks!

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
>
>
> I haven't worked with InfiniBand for years, but I do believe that yes: you
> need IPoIB enabled on your IB devices to get the RDMA CM support to work.
>
>
Yes, I also saw that RDMA CM requires IP, but in my case Open MPI reports
that UDCM can't be used either.
Does UDCM also require IPoIB?

Is there somewhere I can read more about UDCM?

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi John!

I'm currently experimenting with just the head node and a single compute
node; the rest of the cluster is switched off.

> can you run:
>
> ibhosts
>

# ibhosts
Ca  : 0x7cfe900300bddec0 ports 1 "MT25408 ConnectX Mellanox Technologies"
Ca  : 0xe41d2d030050caf0 ports 1 "MT25408 ConnectX Mellanox Technologies"


>
> ibstat
>

# ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.35.5100
    Hardware version: 0
    Node GUID: 0xe41d2d030050caf0
    System image GUID: 0xe41d2d030050caf3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Base lid: 1
        LMC: 0
        SM lid: 3
        Capability mask: 0x0251486a
        Port GUID: 0xe41d2d030050caf1
        Link layer: InfiniBand


>
> ibdiagnet
>
# ibdiagnet
# cat ibdiagnet.log
-W- Topology file is not specified.
Reports regarding cluster links will use direct routes.
-I- Using port 1 as the local port.
-I- Discovering ... 3 nodes (1 Switches & 2 CA-s) discovered.


-I---
-I- Bad Guids/LIDs Info
-I---
-I- No bad Guids were found

-I---
-I- Links With Logical State = INIT
-I---
-I- No bad Links (with logical state = INIT) were found

-I---
-I- General Device Info
-I---

-I---
-I- PM Counters Info
-I---
-I- No illegal PM counters values were found

-I---
-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)
-I---
-I-PKey:0x7fff Hosts:2 full:2 limited:0

-I---
-I- IPoIB Subnets Check
-I---
-I- Subnet: IPv4 PKey:0x7fff QKey:0x0b1b MTU:2048Byte rate:10Gbps
SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

-I---
-I- Bad Links Info
-I- No bad link were found
-I---

-I- Done. Run time was 2 seconds.


>
> Lord help me for being so naive, but do you have a subnet manager running?
>

It seems so, yes (I even have a standby SM):

# service --status-all | grep opensm
 [ + ]  opensm

# cat ibdiagnet.sm

ibdiagnet fabric SM report

  SM - master
MT25408/P1 lid=0x0003 guid=0x7cfe900300bddec1 dev=4099 priority:0

  SM - standby
The Local Device : MT25408/P1 lid=0x0001 guid=0xe41d2d030050caf1
dev=4099 priority:0
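
If another check would help, I can also query the master SM directly with
sminfo (from infiniband-diags):

# sminfo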

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi Jeff!

> What does "ompi_info | grep openib" show?
>
$ ompi_info | grep openib
 MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)

> Additionally, Mellanox provides alternate support through their MXM
> libraries, if you want to try that.
>

Yes, I know.
But we already have a hybrid cluster with Open MPI, OpenMP, CUDA, Torque and
many other libraries installed, and because it works perfectly over the
Ethernet interconnect, my idea was to add InfiniBand support with a minimum of
changes, mainly because we already have some custom-written software for
Open MPI.


> If that shows that you have the openib BTL plugin loaded, try running with
> "mpirun --mca btl_base_verbose 100 ..."  That will provide additional
> output about why / why not each point-to-point plugin is chosen.
>
>
Yes, I have already tried to get this info.
And I saw in the log that rdmacm wants an IP address on the port.
So my question in the thread-starting message was:

Is it enough for Open MPI to have RDMA only, or should IPoIB also be
installed?
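
For reference, the verbose output below came from a run along these lines; the
explicit btl list and the test binary name are reconstructions, not the exact
command line:

$ mpirun --mca btl openib,sm,self --mca btl_base_verbose 100 ./mpi_test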

The mpirun output is:

[node1:02674] mca: base: components_register: registering btl components
[node1:02674] mca: base: components_register: found loaded component openib
[node1:02674] mca: base: components_register: component openib register
function successful
[node1:02674] mca: base: components_register: found loaded component sm
[node1:02674] mca: base: components_register: component sm register
function successful
[node1:02674] mca: base: components_register: found loaded component self
[node1:02674] mca: base: components_register: component self register
function successful
[node1:02674] mca: base: components_open: opening btl components
[node1:02674] mca: base: components_open: found loaded component openib
[node1:02674] mca: base: components_open: component openib open function
successful
[node1:02674] mca: base: components_open: found loaded component sm
[node1:02674] mca: base: components_open: component sm open function
successful
[node1:02674] mca: base: components_open: found loaded component self
[node1:02674] mca: base: components_open: component self open function
successful
[node1:02674] select: initializing btl component openib
[node1:02674] openib BTL: rdmacm IP address not found on port
[node1:02674] openib BTL: rdmacm CPC unavailable for use on mlx4_0:1;
skipped
[node1:02674] select: init of component openib returned failure
[node1:02674] mca: base: close: component openib closed
[node1:02674] mca: base: close: unloading component openib
[node1:02674] select: initializing btl component sm
[node1:02674] select: init of component sm returned failure
[node1:02674] mca: base: close: component sm closed
[node1:02674] mca: base: close: unloading component sm
[node1:02674] select: initializing btl component self
[node1:02674] select: init of component self returned success
[node1:02674] mca: bml: Using self btl to [[16642,1],0] on node node1
[node1:02674] mca: base: close: component self closed
[node1:02674] mca: base: close: unloading component self

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
Hi Gilles!


> is there any reason why you configure with --with-verbs-libdir=/usr/lib ?
> as far as i understand, --with-verbs should be enough, and /usr/lib
> nor /usr/local/lib should ever be used in the configure command line
> (and btw, are you running on a 32 bits system ? should the 64 bits
> libs be in /usr/lib64 ?)
>

I'm on Ubuntu 16.04 x86_64, which has /usr/lib and /usr/lib32.
As I understand it, on Ubuntu the 64-bit libraries live in /usr/lib (there is
no separate /usr/lib64), so the library path is correct.
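
For what it's worth, a quick way to double-check where libibverbs actually
resolves from (on Ubuntu it typically lives under /usr/lib/x86_64-linux-gnu):

$ ldconfig -p | grep libibverbs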


>
> make sure you
> ulimit -l unlimited
> before you invoke mpirun, and this value is correctly propagated to
> the remote nodes
> /* the failure could be a side effect of a low ulimit -l */
>

Yes, ulimit -l returns "unlimited".
So this is also correct.
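
To check that the value also propagates, a quick test from the head node (the
node name is just an example, and limits inside a Torque job may still differ
from an ssh login):

$ ssh node2 'ulimit -l'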

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
>
> Sorry - shoot down my idea. Over to someone else (me hides head in shame)
>
>
No problem, thanks for trying!

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
>
> Sergei,   what does the command  "ibv_devinfo" return please?
>
> I had a recent case like this, but on Qlogic hardware.
> Sorry if I am mixing things up.
>
>
The output of ibv_devinfo on the cluster's first node is:

$ ibv_devinfo -d mlx4_0
hca_id: mlx4_0
    transport:          InfiniBand (0)
    fw_ver:             2.35.5100
    node_guid:          7cfe:9003:00bd:dec0
    sys_image_guid:     7cfe:9003:00bd:dec3
    vendor_id:          0x02c9
    vendor_part_id:     4099
    hw_ver:             0x0
    board_id:           MT_1100120019
    phys_port_cnt:      1
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         3
            port_lid:       3
            port_lmc:       0x00
            link_layer:     InfiniBand

[OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
Hello, All!

We have a problem with Open MPI version 1.10.2 on a cluster with newly
installed Mellanox InfiniBand adapters.
Open MPI was reconfigured and recompiled using: --with-verbs
--with-verbs-libdir=/usr/lib
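
For completeness, the full build was along these lines (the install prefix is
a placeholder):

$ ./configure --prefix=/opt/openmpi-1.10.2 --with-verbs \
    --with-verbs-libdir=/usr/lib
$ make -j4 && make install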

Our test MPI task returns correct results, but it seems Open MPI continues to
use the existing 1 Gbit Ethernet network instead of InfiniBand.

The output file contains these lines:
--
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:   node1
  Local device: mlx4_0
  Local port:   1
  CPCs attempted:   rdmacm, udcm
--

InfiniBand network itself seems to be working:

$ ibstat mlx4_0 shows:

CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.35.5100
    Hardware version: 0
    Node GUID: 0x7cfe900300bddec0
    System image GUID: 0x7cfe900300bddec3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Base lid: 3
        LMC: 0
        SM lid: 3
        Capability mask: 0x0251486a
        Port GUID: 0x7cfe900300bddec1
        Link layer: InfiniBand

ibping also works.
ibnetdiscover shows the correct topology of the IB network.

The cluster runs Ubuntu 16.04 and we use the drivers shipped with the OS
(OFED is not installed).

Is it enough for Open MPI to have RDMA only, or should IPoIB also be
installed?
What else can be checked?
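
In case it is relevant, this is how I can check that the verbs/RDMA kernel
modules are loaded (module names as used by the in-kernel mlx4 stack):

$ lsmod | grep -E 'mlx4_ib|ib_uverbs|rdma_ucm'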

Thanks a lot for any help!