[OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Filippo Spiga
Dear Open MPI developers,

I wonder if there is a way to address this particular scenario using MPI_T or 
other strategies in Open MPI. I saw a similar discussion a few days ago, and I 
assume the same challenges apply in this case, but I just want to check. Here 
is the scenario:

We have a system with dual-rail Mellanox IB: two distinct Connect-IB 
cards per node, each one sitting on a different PCI-E lane out of two distinct 
sockets. We are seeking a way to control MPI traffic through each one of them 
directly from within the application. Specifically, we have a single MPI rank 
per node that goes multi-threaded using OpenMP. MPI_THREAD_MULTIPLE is used, so 
each OpenMP thread may initiate MPI communication. We would like to assign IB-0 
to thread 0 and IB-1 to thread 1.

Via mpirun or environment variables we can control which IB interface to use by 
binding it to a specific MPI rank (or by applying a policy that relates IB 
interfaces to MPI ranks). But if there is only one MPI rank active, how can we 
differentiate the traffic across multiple IB cards?

Thanks in advance for any suggestion about this matter.

Regards,
Filippo

--
Mr. Filippo SPIGA, M.Sc.
http://filippospiga.info ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert

*
Disclaimer: "Please note this message and any attachments are CONFIDENTIAL and 
may be privileged or otherwise protected from disclosure. The contents are not 
to be disclosed to anyone other than the addressee. Unauthorized recipients are 
requested to preserve this confidentiality and to advise the sender immediately 
of any error in transmission."




Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Rolf vandeVaart
It is my belief that you cannot do this, at least with the openib BTL.  The IB 
card to be used for communication is selected during the MPI_Init() phase 
based on which CPUs the process is bound to.  You can see some of this selection 
by using the --mca btl_base_verbose 1 flag.  There is a bunch of output (which 
I have deleted), but you will see a few lines like this:

[ivy5] [rank=1] openib: using port mlx5_0:1
[ivy5] [rank=1] openib: using port mlx5_0:2
[ivy4] [rank=0] openib: using port mlx5_0:1
[ivy4] [rank=0] openib: using port mlx5_0:2

And if you have multiple NICs, you may also see some messages like this:
 "[rank=%d] openib: skipping device %s; it is too far away"
(This was lifted from the code. I do not have a configuration right now where 
I can generate the second message.)
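
To see at a glance which ports each rank ended up with, the "using port" 
lines quoted above can be grouped per rank with a short script. This is only 
a minimal sketch: it assumes the verbose output keeps exactly the shape of 
the four sample lines, which may differ between Open MPI versions, and the 
names PORT_LINE and ports_by_rank are made up for this illustration.

```python
import re
from collections import defaultdict

# Matches lines shaped like the sample above (format assumed from that sample):
#   [ivy5] [rank=1] openib: using port mlx5_0:1
PORT_LINE = re.compile(
    r"\[(?P<host>[^\]]+)\]\s+\[rank=(?P<rank>\d+)\]\s+openib: using port "
    r"(?P<device>[^:\s]+):(?P<port>\d+)"
)


def ports_by_rank(log_text):
    """Return {(host, rank): [(device, port), ...]} for every matching line."""
    selected = defaultdict(list)
    for line in log_text.splitlines():
        match = PORT_LINE.search(line)
        if match:
            key = (match["host"], int(match["rank"]))
            selected[key].append((match["device"], int(match["port"])))
    return dict(selected)


# The four lines quoted in the message, used here as sample input.
SAMPLE = """\
[ivy5] [rank=1] openib: using port mlx5_0:1
[ivy5] [rank=1] openib: using port mlx5_0:2
[ivy4] [rank=0] openib: using port mlx5_0:1
[ivy4] [rank=0] openib: using port mlx5_0:2
"""

if __name__ == "__main__":
    for (host, rank), ports in sorted(ports_by_rank(SAMPLE).items()):
        print(host, rank, ports)
```

In practice the captured stderr of the mpirun job would be passed in place 
of SAMPLE; lines that do not match the pattern are simply ignored.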

I cannot see how we can make this specific to a thread.  Maybe others have a 
different opinion.
Rolf


---
This email message is for the sole use of the intended recipient(s) and may 
contain confidential information.  Any unauthorized review, use, disclosure or 
distribution is prohibited.  If you are not the intended recipient, please 
contact the sender by reply email and destroy all copies of the original 
message.
---


Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Ralph Castain
I’m afraid Rolf is correct. We can only define the binding pattern at time of 
initial process execution, which is well before you start spinning up 
individual threads. At that point, we no longer have the ability to do binding.

That said, you can certainly have your application specify a thread-level 
binding. You’d have to do the heavy lifting yourself in the app, I’m afraid, 
instead of relying on us to do it for you.
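
As a rough illustration of what doing the thread-level binding in the 
application can look like, here is a minimal sketch that pins each worker 
thread to one CPU. It assumes Linux, where sched_setaffinity operates on a 
thread id, and Python 3.8 or later for threading.get_native_id(); the 
function names are made up for the example. Note that this only controls 
where each thread runs; it does not change which HCA Open MPI selects for 
the process.

```python
import os
import threading


def pin_current_thread(cpu):
    """Pin the calling thread to one CPU and return its resulting affinity set.

    Linux-only assumption: sched_setaffinity(2) acts on a thread id, so passing
    the native thread id restricts this thread rather than the whole process.
    """
    tid = threading.get_native_id()
    os.sched_setaffinity(tid, {cpu})
    return os.sched_getaffinity(tid)


def run_pinned_workers(num_threads):
    """Start num_threads workers, pin worker i to the i-th allowed CPU
    (wrapping around), and return {worker index: affinity set}."""
    allowed = sorted(os.sched_getaffinity(0))
    results = {}

    def worker(index):
        cpu = allowed[index % len(allowed)]
        results[index] = pin_current_thread(cpu)
        # The application work for this thread would go here.

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for index, cpus in sorted(run_pinned_workers(2).items()):
        print(f"worker {index} pinned to CPUs {sorted(cpus)}")
```

An OpenMP code in C would do the equivalent with its own affinity calls; 
the point of the sketch is only that the binding happens inside the 
application, after the threads exist.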





Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Filippo Spiga
Thanks Rolf and Ralph for the replies!

On Apr 6, 2015, at 10:37 PM, Ralph Castain wrote:
> That said, you can certainly have your application specify a thread-level 
> binding. You’d have to do the heavy lifting yourself in the app, I’m afraid, 
> instead of relying on us to do it for you.

OK, my application must do it, and I am fine with that. But how? I mean, does 
Open MPI expose some API that allows such fine-grained control?

F





Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Rolf vandeVaart
I still do not believe there is a way for you to steer your traffic based on 
the thread that is calling into Open MPI. While you can spawn your own threads, 
Open MPI is going to figure out which interfaces to use based on the 
characteristics of the process during MPI_Init.  Even if Open MPI decides to 
use two interfaces, they will be used on a per-process basis: it will 
alternate between them regardless of which thread happens to be doing the 
sends or receives.  There is no way of doing this with something like 
MPI_T_cvar_write, which I think is what you were looking for.
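
The consequence of alternating per process rather than per thread can be 
shown with a toy model. This is not Open MPI code and makes no claim about 
its internals; it is a sketch that assumes a plain round-robin over two 
rails, with the labels IB-0 and IB-1 borrowed from the original question 
and a made-up function name.

```python
import itertools


def assign_rails(send_order, rails=("IB-0", "IB-1")):
    """Pair each send with a rail using one process-wide round-robin cursor.

    send_order lists the id of the thread issuing each send, in issue order.
    The thread id is recorded but never consulted when choosing the rail.
    """
    cursor = itertools.cycle(rails)  # shared by every thread in the process
    return [(thread_id, next(cursor)) for thread_id in send_order]


if __name__ == "__main__":
    # Thread 0 sends, then thread 1 twice, then thread 0 again.
    for thread_id, rail in assign_rails([0, 1, 1, 0]):
        print(f"thread {thread_id} -> {rail}")
```

With that send order, both threads end up using both rails, which is why 
a thread cannot be tied to one interface under this kind of scheme.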

Rolf  



Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Abdul Rahman Riza
How do I unsubscribe?



Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Abdul Rahman Riza
How do I unsubscribe?



Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Ralph Castain
The easiest way is to follow the link at the bottom of the message:

http://www.open-mpi.org/mailman/listinfo.cgi/users 


