Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Masoud Hemmatpour via devel
Hi Tomislav,

Thank you very much for your answer! Sure, I will ask my question on the UCX
mailing list.

Thanks,

Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Tomislav Janjusic via devel
Hi Masoud,

> How can I see a complete list of such factors, like message size, memory
> map, etc.?
For UCX, depending on where you have it installed, you'll find the 'ucx_info'
utility, which will list all available tuning parameters.

For general Open MPI tuning, I would start with 'ompi_info -a' and just look
through the parameters.
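
For example (a sketch, assuming both tools are on your PATH; exact flags vary
by version):

    # UCX: list every tunable UCX_* environment variable, with descriptions
    ucx_info -c -f

    # UCX: show the devices and transports detected on this node
    ucx_info -d

    # Open MPI: list all MCA parameters
    ompi_info -a

    # Open MPI: narrow to one framework/component, e.g. the UCX PML
    ompi_info --param pml ucx --level 9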

If you need further clarification regarding UCX, this mailing list is probably
not your best choice. I would direct questions to the UCX mailing list at
ucx-gr...@elist.ornl.gov; you can register here:
https://elist.ornl.gov/mailman/listinfo/ucx-group

Best, Tommy

--
Tomislav Janjusic
Staff Eng., Mellanox, HPC SW
+1 (512) 598-0386
NVIDIA


Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Jeff Squyres (jsquyres) via devel
In UCX's case, the choice is almost entirely driven by the UCX library.  You'll 
need to look at the UCX code and/or ask NVIDIA.
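
One practical way to constrain or observe UCX's choices from the command line,
as a sketch (the environment variables below are real UCX knobs, but accepted
values and defaults vary across UCX releases; './my_app' stands in for your
MPI binary):

    # Force the UCX PML and restrict UCX to RDMA-capable transports
    # (plus shared memory and loopback); startup fails if none is usable:
    mpirun --mca pml ucx -x UCX_TLS=rc,self,sm ./my_app

    # Raise UCX's log level to see which transports it actually selects:
    mpirun --mca pml ucx -x UCX_LOG_LEVEL=info ./my_app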

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Masoud Hemmatpour via devel
Thanks again for your answer, and I hope I am not bothering you with my
questions! If I may ask one last question here: how can I see a complete list
of such factors, like message size, memory map, etc.? Is there any reading
material, or should I look at the code? If the latter, could you please give
me a starting point? In the case of UCX and UCX-enabled network interfaces
(such as IB), is it a UCX decision or an Open MPI decision whether to use
RDMA?

Sorry for my long question, and thank you again!

Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Jeff Squyres (jsquyres) via devel
It means that your underlying network transport supports RDMA.

To be clear, if you built Open MPI with UCX support, and you run on a system 
with UCX-enabled network interfaces (such as IB), Open MPI should automatically 
default to using those UCX interfaces.  This means you'll get all the benefits 
of an HPC-class networking transport (low latency, hardware offload, ... etc.).
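
If you want to double-check that the UCX PML was actually selected, one sketch
(pml_base_verbose is a standard Open MPI MCA verbosity knob; output format
varies by release, and './my_app' is a placeholder):

    # Print PML component selection during startup:
    mpirun -np 2 --mca pml_base_verbose 10 ./my_app 2>&1 | grep -i pml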

For any given send/receive in your MPI application, in the right circumstances 
(message size, memory map, ... etc.), Open MPI will use RDMA to effect a 
network transfer.  There are many different run-time issues that will drive the 
choice of whether any individual network transfer actually uses RDMA or not.

--
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Masoud Hemmatpour via devel
Thank you very much for your description! Actually, I read this issue on
GitHub:

Is OpenMPI supporting RDMA? <https://github.com/open-mpi/ompi/issues/5789>

If I have IB and I install and use UCX, does that guarantee that I am using
RDMA, or is it still not guaranteed?


Thanks again,

-- 
Best Regards,
Masoud Hemmatpour, PhD


Re: [OMPI devel] RDMA and OMPI implementation

2022-04-20 Thread Jeff Squyres (jsquyres) via devel
Let me add a little more color to William's response.  The general theme is: it 
depends on what the underlying network provides.

Some underlying networks natively support one-sided operations like PUT / WRITE 
and GET / READ (e.g., IB/RDMA, RoCE/RDMA, ... etc.).  Some don't (like TCP).

Open MPI will adapt to use whatever transports the underlying network supports.

Additionally, the determination of whether Open MPI uses a "two sided" or "one
sided" type of network transport operation depends on a bunch of other
factors. The most efficient method to get a message from sender to receiver
may depend on issues such as the size of the message, the memory map of the
message, the current network resource utilization, the specific MPI
operation, ... etc.
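
To make the message-size factor concrete with UCX underneath: UCX switches to
a rendezvous protocol (typically RDMA-based on capable networks) above a size
threshold. A sketch, assuming the UCX PML (UCX_RNDV_THRESH is a real UCX
variable, but its default and exact effect vary by version and transport;
'./my_app' is a placeholder):

    # Lower the rendezvous threshold to 8 KB for experimentation:
    mpirun -x UCX_RNDV_THRESH=8192 ./my_app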

Also, be aware that "RDMA" commonly refers to InfiniBand-style one-sided 
operations.  So if you want to use "RDMA", you may need to use an NVIDIA-based 
network (e.g., IB or RoCE).  That's not the only type of network one-sided 
operations available, but it's common.

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] RDMA and OMPI implementation

2022-04-20 Thread Zhang, William via devel
Hello Masoud,

Responded inline

Thanks,
William

From: Masoud Hemmatpour via devel
Reply-To: Open MPI Developers
Date: Wednesday, April 20, 2022 at 5:29 AM
To: Open MPI Developers
Cc: Masoud Hemmatpour
Subject: [EXTERNAL] [OMPI devel] RDMA and OMPI implementation




Hello Everyone,

Sorry, MPI is quite new for me, in particular the implementation. If you don't 
mind, I have some very basic questions regarding the OMPI implementation.

If I use one-sided MPI operations (MPI_Get and MPI_Put), am I necessarily
using RDMA? – It depends, but it’s not guaranteed. For example, in Open MPI
4.0.x there was the osc/pt2pt component, which implemented one-sided (osc)
operations using send/receive. Or, for example, with calls to libfabric’s
one-sided API, it depends on the implementation of the underlying provider.
Is it possible to have one-sided without RDMA? – Yes
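
For reference, a minimal MPI-3 one-sided sketch (standard MPI API; the file
name is hypothetical, and whether the MPI_Put below becomes an RDMA write is
decided by the osc component and transport, per the answer above):

    /* osc_demo.c -- rank 0 puts a value into rank 1's window.
     * Run with at least two ranks, e.g.: mpirun -np 2 ./osc_demo */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        int val = 42;  /* origin buffer; must stay valid until the epoch closes */
        int buf = -1;  /* window memory, exposed on every rank */
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Win_create(&buf, sizeof(buf), sizeof(buf),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);            /* open the access epoch */
        if (rank == 0)
            /* Write val into rank 1's window at displacement 0. */
            MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_fence(0, win);            /* close the epoch; buf is now updated */

        if (rank == 1)
            printf("rank 1 received %d\n", buf);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }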

In general, are other types of MPI operations, like Send/Receive or collective
operations, implemented using RDMA, or not necessarily? – Not necessarily. For
example, using TCP won’t use RDMA. The underlying communication protocol could
very well implement send/receive using RDMA, though.

How can I be sure that I am using RDMA for a specific operation? – I’m not
sure there’s an easy way to do this; I think you have to have some
understanding of which communication protocol you’re using and what that
protocol is doing.
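
One indirect check on an InfiniBand system, as a sketch (the sysfs counter is
standard for IB HCAs, but the device name 'mlx5_0' is an assumption;
substitute your own, and note this shows fabric traffic, which does not by
itself distinguish RDMA from IB send/receive verbs):

    # Read the HCA port traffic counter before and after the run; if it
    # grows by roughly the volume you transferred, the traffic bypassed
    # TCP and went over the IB fabric:
    cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data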

Thank you very much in advance for your help!
Best Regards,