I am using Mellanox OFED 2.4-1.0.0.

I downloaded the HPCX tarball
hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5.tar and extracted
it. It has an mxm directory.

[root@JARVICE ~]# ls hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
archive      fca    hpcx-init-ompi-mellanox-v1.8.sh  ibprof  modulefiles  ompi-mellanox-v1.8  sources  VERSION
bupc-master  hcoll  hpcx-init.sh                     knem    mxm          README.txt          utils

I tried using LD_PRELOAD for libmxm, but now I am getting a different
error stack, as follows:

[root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun --allow-run-as-root --mca mtl mxm \
    -x LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2" \
    -n 1 ./backend localhost : \
    -x LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2 ./libci.so" \
    -n 1 ./app2
 i am backend
[JARVICE:00564] mca: base: components_open: component pml / cm open
function failed
[JARVICE:564  :0] Caught signal 11 (Segmentation fault)
[JARVICE:00565] mca: base: components_open: component pml / cm open
function failed
[JARVICE:565  :0] Caught signal 11 (Segmentation fault)
==== backtrace ====
 2 0x000000000005640c mxm_handle_error()
/scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:641
 3 0x000000000005657c mxm_error_signal_handler()
/scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:616
 4 0x00000000000329a0 killpg()  ??:0
 5 0x0000000000045491 mca_base_components_close()  ??:0
 6 0x000000000004e99a mca_base_framework_close()  ??:0
 7 0x0000000000045431 mca_base_component_close()  ??:0
 8 0x000000000004515c mca_base_framework_components_open()  ??:0
 9 0x00000000000a0de9 mca_pml_base_open()  pml_base_frame.c:0
10 0x000000000004eb1c mca_base_framework_open()  ??:0
11 0x0000000000043eb3 ompi_mpi_init()  ??:0
12 0x0000000000067cb0 PMPI_Init_thread()  ??:0
13 0x0000000000404fdf main()  /root/rain_ib/backend/backend.c:1237
14 0x000000000001ed1d __libc_start_main()  ??:0
15 0x0000000000402db9 _start()  ??:0
===================
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      JARVICE
Framework: mtl
Component: mxm
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 564 on node JARVICE exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[JARVICE:00562] 1 more process has sent help message help-mca-base.txt /
find-available:not-valid
[JARVICE:00562] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages
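
For reference, here is the LD_LIBRARY_PATH-based variant I intend to try
next instead of LD_PRELOAD (just a sketch: the HPCX path is assumed from
where I extracted the tarball under /root, and ./libci.so is still
preloaded only for app2 as before):

export HPCX_DIR=/root/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
# point the runtime linker at mxm and at my ompi install instead of preloading them
export LD_LIBRARY_PATH=$HPCX_DIR/mxm/lib:/root/openmpi-1.8.4/openmpinstall/lib:$LD_LIBRARY_PATH
./openmpi-1.8.4/openmpinstall/bin/mpirun --allow-run-as-root \
    --mca pml cm --mca mtl mxm \
    -x LD_LIBRARY_PATH -n 1 ./backend localhost : \
    -x LD_LIBRARY_PATH -x LD_PRELOAD=./libci.so -n 1 ./app2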


Subhra


On Sun, Apr 12, 2015 at 10:48 PM, Mike Dubman <mi...@dev.mellanox.co.il>
wrote:

> Seems like mxm was not found in your LD_LIBRARY_PATH.
>
> Which MOFED version do you use?
> Does it have /opt/mellanox/mxm in it?
> You could also just run mpirun from the HPCX package, which finds mxm
> internally, and recompile OMPI as described in its README.
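>
> Roughly something like this (just a sketch; adjust the paths to where you
> extracted HPCX, and check the README for the exact steps):
>
> # load the HPCX environment from the extracted tarball (path assumed)
> export HPCX_HOME=/root/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
> source $HPCX_HOME/hpcx-init.sh
> hpcx_load
> # either use the bundled, mxm-enabled ompi ...
> $HPCX_HOME/ompi-mellanox-v1.8/bin/mpirun --mca pml cm --mca mtl mxm -np 1 ./backend
> # ... or rebuild your own ompi against the bundled mxm and reinstall
> ./configure --with-mxm=$HPCX_HOME/mxm --prefix=/root/openmpi-1.8.4/openmpinstall
> make install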
>
> On Mon, Apr 13, 2015 at 3:24 AM, Subhra Mazumdar <
> subhramazumd...@gmail.com> wrote:
>
>> Hi,
>>
>> I used the mxm MTL as follows but am getting a segfault. It says the mxm
>> component was not found, even though I compiled Open MPI with mxm. Any idea
>> what I might be missing?
>>
>> [root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun
>> --allow-run-as-root --mca pml cm --mca mtl mxm -n 1 -x
>> LD_PRELOAD=./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./backend
>> localhosst : -n 1 -x LD_PRELOAD="./libci.so
>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1" ./app2
>>  i am backend
>> [JARVICE:08398] *** Process received signal ***
>> [JARVICE:08398] Signal: Segmentation fault (11)
>> [JARVICE:08398] Signal code: Address not mapped (1)
>> [JARVICE:08398] Failing at address: 0x10
>> [JARVICE:08398] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7ff8d0ddb710]
>> [JARVICE:08398] [ 1]
>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_components_close+0x21)[0x7ff8cf9ae491]
>> [JARVICE:08398] [ 2]
>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_close+0x6a)[0x7ff8cf9b799a]
>> [JARVICE:08398] [ 3]
>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_component_close+0x21)[0x7ff8cf9ae431]
>> [JARVICE:08398] [ 4]
>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_components_open+0x11c)[0x7ff8cf9ae15c]
>> [JARVICE:08398] [ 5]
>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(+0xa0de9)[0x7ff8d1089de9]
>> [JARVICE:08398] [ 6]
>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_open+0x7c)[0x7ff8cf9b7b1c]
>> [JARVICE:08398] [ 7] [JARVICE:08398] mca: base: components_open:
>> component pml / cm open function failed
>>
>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(ompi_mpi_init+0x4b3)[0x7ff8d102ceb3]
>> [JARVICE:08398] [ 8]
>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(PMPI_Init_thread+0x100)[0x7ff8d1050cb0]
>> [JARVICE:08398] [ 9] ./backend[0x404fdf]
>> [JARVICE:08398] [10]
>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7ff8cfeded1d]
>> [JARVICE:08398] [11] ./backend[0x402db9]
>> [JARVICE:08398] *** End of error message ***
>> --------------------------------------------------------------------------
>> A requested component was not found, or was unable to be opened.  This
>> means that this component is either not installed or is unable to be
>> used on your system (e.g., sometimes this means that shared libraries
>> that the component requires are unable to be found/loaded).  Note that
>> Open MPI stopped checking at the first component that it did not find.
>>
>> Host:      JARVICE
>> Framework: mtl
>> Component: mxm
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 8398 on node JARVICE exited
>> on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>>
>> Subhra.
>>
>>
>> On Fri, Apr 10, 2015 at 12:12 AM, Mike Dubman <mi...@dev.mellanox.co.il>
>> wrote:
>>
>>> No need for IPoIB; mxm uses native IB.
>>>
>>> Please see the HPCX README (pre-compiled OMPI, integrated with MXM and FCA)
>>> for details on how to compile and select it.
>>>
>>> The default transport is UD for inter-node communication and
>>> shared memory for intra-node.
>>>
>>> http://bgate.mellanox.com/products/hpcx/
>>>
>>> Also, mxm is included in Mellanox OFED.
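>>>
>>> For example, a quick sketch (paths as mentioned elsewhere in this thread;
>>> adjust to your install, and ./your_app is a placeholder):
>>>
>>> # MOFED installs mxm under /opt/mellanox/mxm; HPCX bundles its own copy too
>>> ls /opt/mellanox/mxm/lib/libmxm.so*
>>> # run with the defaults: UD between nodes, shared memory within a node
>>> mpirun --mca pml cm --mca mtl mxm -np 2 ./your_app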
>>>
>>> On Fri, Apr 10, 2015 at 5:26 AM, Subhra Mazumdar <
>>> subhramazumd...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Does IPoIB need to be configured on the IB cards for mxm (I have a
>>>> separate Ethernet connection too)? Also, are there special flags in mpirun
>>>> to select among UD/RC/DC? What is the default?
>>>>
>>>> Thanks,
>>>> Subhra.
>>>>
>>>>
>>>> On Tue, Mar 31, 2015 at 9:46 AM, Mike Dubman <mi...@dev.mellanox.co.il>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> mxm uses IB RDMA/RoCE technologies. One can select which of the UD/RC/DC
>>>>> transports mxm uses.
>>>>>
>>>>> By selecting mxm, all MPI point-to-point routines are mapped to the
>>>>> appropriate mxm functions.
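>>>>>
>>>>> For example (a sketch; MXM_TLS is the mxm environment variable for
>>>>> transport selection, see the MXM README for the exact values, and
>>>>> ./your_app is a placeholder):
>>>>>
>>>>> # switch from the default UD to RC (or dc), keeping shm/self for local ranks
>>>>> mpirun --mca pml cm --mca mtl mxm -x MXM_TLS=rc,shm,self -np 2 ./your_app
>>>>> # no source changes needed; MPI_Send/MPI_Recv and friends go through mxm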
>>>>>
>>>>> M
>>>>>
>>>>> On Mon, Mar 30, 2015 at 7:32 PM, Subhra Mazumdar <
>>>>> subhramazumd...@gmail.com> wrote:
>>>>>
>>>>>> Hi MIke,
>>>>>>
>>>>>> Does the mxm MTL use InfiniBand RDMA? Also, from a programming
>>>>>> perspective, do I need to use anything other than MPI_Send/MPI_Recv?
>>>>>>
>>>>>> Thanks,
>>>>>> Subhra.
>>>>>>
>>>>>>
>>>>>> On Sun, Mar 29, 2015 at 11:14 PM, Mike Dubman <
>>>>>> mi...@dev.mellanox.co.il> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> The openib BTL does not support this thread model.
>>>>>>> You can use OMPI with mxm (-mca mtl mxm) and the multiple-thread mode in
>>>>>>> the 1.8.x series, or with -mca pml yalla in the master branch.
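>>>>>>>
>>>>>>> Something along these lines (a sketch, not a tested command line; it
>>>>>>> also assumes your OMPI build has thread-multiple support enabled, e.g.
>>>>>>> configured with --enable-mpi-thread-multiple):
>>>>>>>
>>>>>>> # 1.8.x: cm PML + mxm MTL, with the app calling
>>>>>>> # MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...)
>>>>>>> mpirun --mca pml cm --mca mtl mxm -np 2 ./your_threaded_app
>>>>>>> # master branch: the yalla PML instead
>>>>>>> mpirun --mca pml yalla -np 2 ./your_threaded_app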
>>>>>>>
>>>>>>> M
>>>>>>>
>>>>>>> On Mon, Mar 30, 2015 at 9:09 AM, Subhra Mazumdar <
>>>>>>> subhramazumd...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Can MPI_THREAD_MULTIPLE and the openib BTL work together in Open MPI
>>>>>>>> 1.8.4? If so, are there any command-line options needed at run time?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Subhra.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>>
>>>>>>> M.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Kind Regards,
>>>>>
>>>>> M.
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Kind Regards,
>>>
>>> M.
>>>
>>>
>>
>>
>>
>
>
>
> --
>
> Kind Regards,
>
> M.
>
>
