Hello Charles

You are heading in the right direction.

First you might want to run the libfabric fi_info command to see what
capabilities you picked up from the libfabric RPMs.

Next you may well not actually be using the OFI  mtl.

Could you run your app with

export OMPI_MCA_mtl_base_verbose=100

and post the output?

It would also help if you described the system you are using :  OS
interconnect cpu type etc.

Howard

Charles A Taylor <chas...@ufl.edu> schrieb am Do. 14. Juni 2018 um 06:36:

> Because of the issues we are having with OpenMPI and the openib BTL
> (questions previously asked), I’ve been looking into what other transports
> are available.  I was particularly interested in OFI/libfabric support but
> cannot find any information on it more recent than a reference to the usNIC
> BTL from 2015 (Jeff Squyres, Cisco).  Unfortunately, the openmpi-org
> website FAQ’s covering OpenFabrics support don’t mention anything beyond
> OpenMPI 1.8.  Given that 3.1 is the current stable version, that seems odd.
>
> That being the case, I thought I’d ask here. After laying down the
> libfabric-devel RPM and building (3.1.0) with —with-libfabric=/usr, I end
> up with an “ofi” MTL but nothing else.   I can run with OMPI_MCA_mtl=ofi
> and OMPI_MCA_btl=“self,vader,openib” but it eventually crashes in
> libopen-pal.so.   (mpi_waitall() higher up the stack).
>
> GIZMO:9185 terminated with signal 11 at PC=2b4d4b68a91d SP=7ffcfbde9ff0.
> Backtrace:
>
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libopen-pal.so.40(+0x9391d)[0x2b4d4b68a91d]
>
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libopen-pal.so.40(opal_progress+0x24)[0x2b4d4b632754]
>
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libmpi.so.40(ompi_request_default_wait_all+0x11f)[0x2b4d47be2a6f]
>
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libmpi.so.40(PMPI_Waitall+0xbd)[0x2b4d47c2ce4d]
>
> Questions: Am I using the OFI MTL as intended?   Should there be an “ofi”
> BTL?   Does anyone use this?
>
> Thanks,
>
> Charlie Taylor
> UF Research Computing
>
> PS - If you could use some help updating the FAQs, I’d be willing to put
> in some time.  I’d probably learn a lot.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Reply via email to