
On one node, ./IOR runs with Open MPI, but on two nodes it fails with
"][connect/btl_openib_connect_udcm.c:1575:udcm_wait_for_send_completion] send
failed with verbs status 2".
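
For readers decoding that message: assuming the number printed by udcm is the status field of the failed work completion (enum ibv_wc_status in libibverbs), value 2 corresponds to IBV_WC_LOC_QP_OP_ERR, a local QP operation error. The lookup below is only a sketch covering the first six enum values, and the helper name wc_status_name is hypothetical:

```shell
# Map a numeric ibv_wc_status value to its enum name.
# Only the first six values of the enum are listed here.
wc_status_name() {
    case "$1" in
        0) echo IBV_WC_SUCCESS ;;
        1) echo IBV_WC_LOC_LEN_ERR ;;
        2) echo IBV_WC_LOC_QP_OP_ERR ;;
        3) echo IBV_WC_LOC_EEC_OP_ERR ;;
        4) echo IBV_WC_LOC_PROT_ERR ;;
        5) echo IBV_WC_WR_FLUSH_ERR ;;
        *) echo "unknown ($1)" ;;
    esac
}

# The status reported in this thread.
wc_status_name 2
```

A local QP error on the send side would be consistent with the "error modifing QP to RTR" lines in the logs, but that link is an inference, not something the output states.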

One-node run

[root@vcn03 C]# mpirun --allow-run-as-root -np 1 -host vcn03 ./IOR
WARNING: No preset parameters were found for the device that Open MPI

Local host: vcn03
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
error modifing QP to RTR errno says Invalid argument
IOR-2.10.3: MPI Coordinated Test of Parallel I/O

Run began: Tue Mar 13 11:50:15 2018
Command line used: ./IOR
Machine: Linux vcn03

api = POSIX
test filename = testFile
access = single-shared-file
ordering in a file = sequential offsets
ordering inter file= no tasks offsets
clients = 1 (1 per node)
repetitions = 1
xfersize = 262144 bytes
blocksize = 1 MiB
aggregate filesize = 1 MiB

Operation  Max (MiB)  Min (MiB)  Mean (MiB)  Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)  Std Dev  Mean (s)
---------  ---------  ---------  ----------  -------  ---------  ---------  ----------  -------  --------
write         312.36     312.36      312.36     0.00    1249.44    1249.44     1249.44     0.00   0.00320  EXCEL
read          996.42     996.42      996.42     0.00    3985.69    3985.69     3985.69     0.00   0.00100  EXCEL

Max Write: 312.36 MiB/sec (327.53 MB/sec)
Max Read: 996.42 MiB/sec (1044.82 MB/sec)

Run finished: Tue Mar 13 11:50:15 2018

Two-node run

[root@vcn03 C]# mpirun --allow-run-as-root -np 2 -host vcn03,vcn04 ./IOR
WARNING: No preset parameters were found for the device that Open MPI

Local host: vcn04
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
error modifing QP to RTR errno says Invalid argument
error modifing QP to RTR errno says Invalid argument
mlx5: vcn04: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 78006802 0a00016f 00005bd2
 send failed with verbs status 2
[vcn04:28705] *** An error occurred in MPI_Send
[vcn04:28705] *** reported by process [2204631041,1]
[vcn04:28705] *** on communicator MPI_COMM_WORLD
[vcn04:28705] *** MPI_ERR_OTHER: known error not in list
[vcn04:28705] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
[vcn04:28705] *** and potentially your MPI job)
[vcn03:05349] 1 more process has sent help message help-mpi-btl-openib.txt / no device params found
[vcn03:05349] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[root@vcn03 C]#
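
The log above names two MCA parameters: btl_openib_warn_no_device_params_found and orte_base_help_aggregate. As a sketch, they could be set through the environment before calling mpirun, assuming Open MPI's usual convention of reading MCA parameters from variables named OMPI_MCA_<name>:

```shell
# Same effect as passing "--mca <name> <value>" to mpirun; Open MPI reads
# MCA parameters from environment variables named OMPI_MCA_<name>.

# Silence the "No preset parameters were found" warning.
export OMPI_MCA_btl_openib_warn_no_device_params_found=0

# Print every help / error message instead of the aggregated summary.
export OMPI_MCA_orte_base_help_aggregate=0
```

These settings only change what is printed. They do not address the "error modifing QP to RTR" message or the failed send.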
From: devel [devel-boun...@lists.open-mpi.org] on behalf of Pharthiphan Asokan 
Sent: Tuesday, March 13, 2018 9:13 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] How to Build OpenMPI to support FDR over SR-IOV

Hi Jeff,

After adding Open MPI to PATH and LD_LIBRARY_PATH, I no longer see the "orted: command not found" issue.

[root@vcn03 pasokan]# mpirun --allow-run-as-root -np 4 -host vcn03,vcn03,vcn04,vcn04 /mnt/lustre_client/pasokan/a.out
WARNING: No preset parameters were found for the device that Open MPI

Local host: vcn03
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
error modifing QP to RTR errno says Invalid argument
error modifing QP to RTR errno says Invalid argument
error modifing QP to RTR errno says Invalid argument
error modifing QP to RTR errno says Invalid argument
Hello world from processor vcn03, rank 0 out of 4 processors
Hello world from processor vcn03, rank 1 out of 4 processors
Hello world from processor vcn04, rank 2 out of 4 processors
Hello world from processor vcn04, rank 3 out of 4 processors
[vcn03:05070] 3 more processes have sent help message help-mpi-btl-openib.txt / no device params found
[vcn03:05070] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[root@vcn03 pasokan]#

However, IOR still does not run when compiled with Open MPI; it fails with a
segmentation fault. This used to be very straightforward on bare metal, but not in KVM +

From: Pharthiphan Asokan
Sent: Tuesday, March 13, 2018 8:42 PM
To: Open MPI Developers
Subject: RE: [OMPI devel] How to Build OpenMPI to support FDR over SR-IOV

Thanks Jeff,

Open MPI is installed here:

[root@vcn03 C]# cd /mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/
bin/ etc/ include/ lib/ share/
[root@vcn03 C]#

Why does exporting these variables not take effect?

export PATH=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/bin:$PATH

But, as you said, providing --prefix
/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/ works:

[root@vcn03 C]# mpirun --prefix /mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/ --allow-run-as-root -np 2 -host vcn03,vcn04 hostname
[root@vcn03 C]#

My actual issue, though, is that IOR does not run when compiled with Open MPI on SR-IOV:

[root@vcn03 C]# pwd
[root@vcn03 C]#
[root@vcn03 C]# export 
[root@vcn03 C]# export 
[root@vcn03 C]# export 
[root@vcn03 C]#
[root@vcn03 C]# gmake posix mpiio
mpicc -o IOR IOR.o utilities.o parse_options.o \
aiori-POSIX.o aiori-noMPIIO.o aiori-noHDF5.o aiori-noNCMPI.o \
mpicc -o IOR IOR.o utilities.o parse_options.o \
aiori-POSIX.o aiori-MPIIO.o aiori-noHDF5.o aiori-noNCMPI.o \
[root@vcn03 C]# ./IOR
WARNING: No preset parameters were found for the device that Open MPI

Local host: vcn03
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
error modifing QP to RTR errno says Invalid argument
Segmentation fault
[root@vcn03 C]#

Please help!

From: devel [devel-boun...@lists.open-mpi.org] on behalf of Jeff Squyres 
(jsquyres) [jsquy...@cisco.com]
Sent: Tuesday, March 13, 2018 8:20 PM
To: Open MPI Developers List
Subject: Re: [OMPI devel] How to Build OpenMPI to support FDR over SR-IOV

On Mar 13, 2018, at 2:08 AM, Pharthiphan Asokan <paso...@ddn.com> wrote:
> [root@vcn03 C]# mpirun --allow-run-as-root -np 2 -host vcn03,vcn04 hostname
> bash: orted: command not found

This is the key ^^

These FAQ items may help:

* https://www.open-mpi.org/faq/?category=running#run-prereqs
* https://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
* https://www.open-mpi.org/faq/?category=running#mpirun-prefix

Jeff Squyres
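
The PATH and LD_LIBRARY_PATH setup that the FAQ items above refer to could be sketched as follows. The install prefix is the one quoted elsewhere in this thread. The variable name OMPI_PREFIX and the choice of startup file are assumptions for illustration.

```shell
# Install prefix taken from the thread; adjust for your site.
OMPI_PREFIX=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0

# Prepend Open MPI's bin and lib directories. For a multi-node run these
# lines need to live in a startup file that non-interactive shells read on
# every node (for example ~/.bashrc), so the remote shell can find orted.
export PATH="$OMPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

An export typed only into the interactive shell on the launching node is not seen by the shell that starts orted on the other node. That is a likely reason the export alone appeared not to take effect, while passing --prefix to mpirun worked.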

devel mailing list