Re: [OMPI devel] RDMA and OMPI implementation
Hi Tomislav,

Thank you very much for your answer! Sure, I'll ask my question on the UCX mailing list.

Thanks,
Re: [OMPI devel] RDMA and OMPI implementation
Hi Masoud,

> I would say how can I see a complete list of such factors like message size, memory map, ... etc

For UCX, depending on where you have it installed, you'll find 'ucx_info', which will list all available tuning parameters. For general OMPI tuning, I would start with ompi_info -a and just look through the parameters.

If you need further clarification regarding UCX, this mailing list is probably not your best choice. I would direct my questions to the UCX mailing list here: ucx-gr...@elist.ornl.gov, and you can register here: https://elist.ornl.gov/mailman/listinfo/ucx-group

Best,
Tommy

--
Tomislav Janjusic
Staff Eng., Mellanox, HPC SW
+1 (512) 598-0386
NVIDIA
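The lookups Tommy describes are one-liners. These are sketches of how I understand the two tools are typically invoked; exact flags and output depend on your UCX and Open MPI versions:

```shell
# Full list of UCX tuning parameters, with descriptions and defaults:
ucx_info -f

# Transports and devices UCX has detected on this node
# (e.g., rc_verbs, ud_verbs, tcp):
ucx_info -d

# All Open MPI MCA parameters; grep narrows the list to UCX-related ones:
ompi_info -a | grep -i ucx
```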
Re: [OMPI devel] RDMA and OMPI implementation
In UCX's case, the choice is almost entirely driven by the UCX library. You'll need to look at the UCX code and/or ask NVIDIA.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RDMA and OMPI implementation
Thanks again for your answer, and I hope I don't bother you with my questions! If I can ask my last question here: how can I see a complete list of such factors like *message size, memory map, ... etc*? Is there any reading, or should I look at the code? If the latter, could you please give me a starting point? In the case of UCX and UCX-enabled network interfaces (such as IB), is it a UCX decision or an Open MPI decision whether or not to use RDMA?

Sorry for my long question, and thank you again!
Re: [OMPI devel] RDMA and OMPI implementation
It means that your underlying network transport supports RDMA.

To be clear, if you built Open MPI with UCX support, and you run on a system with UCX-enabled network interfaces (such as IB), Open MPI should automatically default to using those UCX interfaces. This means you'll get all the benefits of an HPC-class networking transport (low latency, hardware offload, ... etc.).

For any given send/receive in your MPI application, in the right circumstances (message size, memory map, ... etc.), Open MPI will use RDMA to effect a network transfer. There are many different run-time issues that will drive the choice of whether any individual network transfer actually uses RDMA or not.

--
Jeff Squyres
jsquy...@cisco.com
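One concrete run-time knob of this kind, as I understand UCX's behavior, is the eager/rendezvous threshold: below it, messages take a copy-based eager path; above it, UCX typically switches to a zero-copy rendezvous protocol that can use RDMA. The environment variables below are real UCX/Open MPI names to the best of my knowledge (UCX_PROTO_INFO requires a newer UCX release), and `./my_app` is just a placeholder for your program:

```shell
# Ask UCX to report which protocol it selects per message-size range;
# -x exports the variable to all ranks:
mpirun -np 2 -x UCX_PROTO_INFO=y ./my_app

# Lower the eager/rendezvous switchover to 8 KB for one run
# (a tuning sketch -- the right value is workload-dependent):
mpirun -np 2 -x UCX_RNDV_THRESH=8192 ./my_app
```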
Re: [OMPI devel] RDMA and OMPI implementation
Thank you very much for your description! Actually, I read this issue on GitHub:

Is OpenMPI supporting RDMA? <https://github.com/open-mpi/ompi/issues/5789>

If I have IB and I install and use UCX, does this guarantee that I am using RDMA, or is it still not guaranteed?

Thanks again,
Re: [OMPI devel] RDMA and OMPI implementation
Let me add a little more color to William's response. The general theme is: it depends on what the underlying network provides.

Some underlying networks natively support one-sided operations like PUT / WRITE and GET / READ (e.g., IB/RDMA, RoCE/RDMA, ... etc.). Some don't (like TCP).

Open MPI will adapt to use whatever transports the underlying network supports.

Additionally, the determination of whether Open MPI uses a "two sided" or "one sided" type of network transport operation depends on a bunch of other factors. The most efficient method to get a message from sender to receiver may depend on issues such as the size of the message, the memory map of the message, the current network resource utilization, the specific MPI operation, ... etc.

Also, be aware that "RDMA" commonly refers to InfiniBand-style one-sided operations. So if you want to use "RDMA", you may need to use an NVIDIA-based network (e.g., IB or RoCE). That's not the only type of network one-sided operation available, but it's common.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RDMA and OMPI implementation
Hello Masoud,

Responded inline.

Thanks,
William

> Hello Everyone,
>
> Sorry, MPI is quite new for me, in particular the implementation. If you don't mind, I have some very basic questions regarding the OMPI implementation.
>
> If I use one-sided MPI operations (Get and Put), am I necessarily using RDMA?

It depends, but it's not guaranteed. For example, in Open MPI 4.0.x, there was the osc/pt2pt component that implemented OSC operations using send/receive. Or, for example, with calls to libfabric's OSC API, it depends on the implementation of the underlying provider.

> Is it possible to have one-sided without RDMA?

Yes.

> In general, are other types of MPI operations, like Send/Receive or collective operations, implemented using RDMA?

Not necessarily. For example, using TCP won't use RDMA. The underlying communication protocol could very well implement send/receive using RDMA, though.

> How can I be sure that I am using RDMA for a specific operation?

I'm not sure there's an easy way to do this; I think you have to have some understanding of what communication protocol you're using and what that protocol is doing.

> Thank you very much in advance for your help!
> Best Regards,
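For reference, the one-sided operations discussed in this thread look like this in code. Below is a minimal, self-contained MPI_Put sketch (an illustrative program, not taken from any OMPI component); whether the Put actually travels over RDMA is decided by the transport Open MPI/UCX selects at run time, exactly as described above:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank exposes one int through a window; one-sided traffic
     * targeting this memory may use RDMA if the transport supports it. */
    int buf = -1;
    MPI_Win win;
    MPI_Win_create(&buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);                    /* open an access epoch   */
    if (rank == 0 && size > 1) {
        int value = 42;
        /* Write into rank 1's window; rank 1 posts no matching receive. */
        MPI_Put(&value, 1, MPI_INT, /* target rank */ 1,
                /* target disp */ 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);                    /* complete the epoch     */

    if (rank == 1)
        printf("rank 1 received %d via MPI_Put\n", buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and run with, e.g., `mpirun -np 2 ./a.out`; per William's answer, the same program may be serviced by true RDMA (osc/ucx on IB) or by send/receive emulation, depending on the OSC component chosen.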