Dear Sean,

You gave me a lot of info! I am now going to set up RHEL7 with Mellanox OFED to test it. My previous setup was on Ubuntu without Mellanox OFED. Do you think that might cause issues?

Also, please let me know which Open MPI version you used. Do I understand correctly that UCX is installed and configured separately, and Open MPI is then configured to use it?

Thanks again for your help!
Harutyun

On Sun, Sep 4, 2022, 09:20 Sean Crosby <scro...@unimelb.edu.au> wrote:

> Hi Harutyun,
>
> We use RoCE v2 using OpenMPI on our cluster, and it works great. We used
> to use the openib BTL, but have moved completely across to UCX.
>
> You have to configure RoCE on your switches and NICs (we use a mixture of
> Mellanox CX-4, CX-5 and CX-6 NICs, with Mellanox switches running Cumulus).
> We use DSCP and priority 3 for RoCE traffic tagging, and all our nodes run
> Mellanox OFED on RHEL7.
>
> Once RoCE is configured and tested (using things like ib_send_bw -d
> mlx5_bond_0 -x 7 -R -T 106 -D 10), getting UCX to use RoCE is quite easy,
> and compiling OpenMPI to use UCX is also very easy.
>
> Sean
> ------------------------------
> *From:* users <users-boun...@lists.open-mpi.org> on behalf of Harutyun
> Umrshatyan via users <users@lists.open-mpi.org>
> *Sent:* Sunday, 4 September 2022 04:28
> *To:* users@lists.open-mpi.org <users@lists.open-mpi.org>
> *Cc:* Harutyun Umrshatyan <harutyun...@grovf.com>
> *Subject:* [EXT] [OMPI users] MPI with RoCE
>
> * External email: Please exercise caution *
> ------------------------------
> Hi everyone
>
> Could someone please share any experience using MPI with RoCE?
> I am trying to set up infiniband adapters (Mellanox cards for example) and
> run MPI applications with RoCE (instead of TCP).
> As I understand, there might be some environment requirements or
> restrictions like kernel version, installed drivers, etc.
> I have tried a lot of versions of mpi libs and could not succeed. Would
> highly appreciate any hint or experience shared.
>
> Best regards,
> Harutyun Umrshatyan
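
For anyone following the thread, here is a rough sketch of the last step Sean mentions: launching an MPI job through UCX once RoCE itself has been tested. It only assembles an mpirun command line. It assumes Open MPI was built against a separately installed UCX (its configure script accepts --with-ucx for this). The helper name ucx_roce_mpirun is hypothetical. The device name and GID index are borrowed from the ib_send_bw example above, and port 1 and ./a.out are placeholders, so all of these values are assumptions that will differ per site.

```python
import shlex


def ucx_roce_mpirun(program, nprocs, device="mlx5_bond_0", port=1, gid_index=7):
    """Return an argv list for mpirun that selects the UCX PML on one device.

    Hypothetical helper: the defaults mirror the ib_send_bw example in the
    thread (-d mlx5_bond_0 -x 7); port 1 is an assumption.
    """
    # UCX settings are passed to the ranks as environment variables.
    env = {
        "UCX_NET_DEVICES": f"{device}:{port}",  # restrict UCX to this device:port
        "UCX_IB_GID_INDEX": str(gid_index),     # GID table entry used for RoCE
    }
    # "--mca pml ucx" asks Open MPI to use UCX for point-to-point messaging.
    argv = ["mpirun", "-np", str(nprocs), "--mca", "pml", "ucx"]
    for name, value in env.items():
        # "-x NAME=value" exports the variable to every launched process.
        argv += ["-x", f"{name}={value}"]
    argv.append(program)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(ucx_roce_mpirun("./a.out", 4)))
```

Before launching anything, running ucx_info -d on a node is a reasonable way to check that UCX actually lists the RoCE device; the right GID index depends on how the NIC's GID table is populated on your hosts.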