Martin
Sorry for the late reply. I forget to check the users list, as most of the
discussions take place on IRC (which I also forget to check these days...).
Anyway, when using HPX in distributed mode you need to compile HPX with
networking enabled (HPX_WITH_NETWORKING=ON), and then you have two options:
use the MPI parcelport (we assume MPI is already built and installed on your
system as usual), or use the new, still slightly experimental, libfabric
parcelport. There is a libfabric provider for PSM2, which is the one used on
Omni-Path (OPA) networks.
See https://github.com/ofiwg/libfabric/wiki/Provider-Feature-Matrix-master
for the capabilities of each provider.
HPX currently runs on the sockets and GNI providers. It uses endpoint type
FI_EP_RDM and makes use of FI_SEND, FI_RECV, FI_RMA, and the secondary
capability FI_SOURCE, plus a few others I can't remember off the top of my head.
Looking at the chart, the PSM2 provider supports everything we need, so it
ought to be possible to run the libfabric network layer on an Omni-Path machine.
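To make that check concrete, here is a rough sketch of comparing a provider's
feature set against what the parcelport needs. It only models the features
named in this mail (the list is incomplete, as noted above), and the provider
entries are hypothetical stand-ins that you would fill in from the feature
matrix. It does not call libfabric itself.

```python
# Features the HPX libfabric parcelport relies on, as named in this mail.
# Incomplete: "a few others" are also needed but not listed here.
REQUIRED = {"FI_EP_RDM", "FI_SEND", "FI_RECV", "FI_RMA", "FI_SOURCE"}


def missing_features(provider_features):
    """Return the required features that a provider does not offer, sorted."""
    return sorted(REQUIRED - set(provider_features))


# Hypothetical entries reflecting only what this mail states:
# psm2 appears to support everything needed; verbs lacks FI_SOURCE.
psm2 = set(REQUIRED)
verbs = REQUIRED - {"FI_SOURCE"}

for name, features in (("psm2", psm2), ("verbs", verbs)):
    missing = missing_features(features)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{name}: {status}")
```

An empty result means the provider covers the listed features; anything else
names what would have to be worked around.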
However, the master branch of HPX currently doesn't support this, and the code
you'd need is in another branch that needs a bit of work to merge in. I have it
on my todo list to get the network layer running on Summit (InfiniBand verbs,
where the lack of FI_SOURCE is a problem), but I'm not sure when I'll be able
to start work on it.
You should probably just use the MPI parcelport in HPX for now. But if you
are interested in getting the libfabric parcelport running for improved
distributed performance, it ought to be straightforward to get working, since
everything we need appears to be supported. It would probably need a bit of
tweaking and experimenting to get running, though. Is there any way I can get
access to your machine to log in and try a build/test?
If you're more interested in simply using MPI in your existing code and not
using HPX as a distributed tasking layer, then just set
HPX_WITH_NETWORKING=OFF and use HPX for tasks on a node and your existing
MPI between nodes.
HTH
JB
From: hpx-users-boun...@stellar.cct.lsu.edu
on behalf of Ohlerich, Martin
Sent: 10 December 2019 10:58:27
To: hpx-users@stellar.cct.lsu.edu
Subject: [hpx-users] Request for Experience
Dear Colleagues,
My name is Martin Ohlerich. I work at the Leibniz Super-Computing Center
near Munich (LRZ) and am currently testing the capabilities of HPX. On our Linux
cluster with an InfiniBand network, the tests have been successful so far. On
SuperMUC-NG, we have an Intel OPA network, which seems to have some peculiarities
(which we also observed when trying to employ GPI (GASPI)). Is there any
experience with this type of network for HPX in the community?
I would welcome any hints on where to find out about the startup mechanism and
debugging possibilities. So far I have tried the easy approach of installing HPX
via Spack. On SNG that might not be the correct way to go.
Many thanks in advance! Also in the name of our users!
Best regards,
Martin Ohlerich
___
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users