Re: [OMPI devel] 1.10.3rc MTT failures

2016-04-26 Thread Gilles Gouaillardet

Jeff,

fwiw, it is possible to save some bandwidth (~4x) with the --depth option


full clone:

git clone https://ggouaillar...@github.com/open-mpi/ompi-tests.git
Cloning into 'ompi-tests'...
remote: Counting objects: 32016, done.
remote: Total 32016 (delta 0), reused 0 (delta 0), pack-reused 32016
Receiving objects: 100% (32016/32016), 61.31 MiB | 645.00 KiB/s, done.
Resolving deltas: 100% (20719/20719), done.
Checking out files: 100% (9221/9221), done.


last commit only:

git clone --depth=1 https://ggouaillar...@github.com/open-mpi/ompi-tests.git
Cloning into 'ompi-tests'...
remote: Counting objects: 10687, done.
remote: Compressing objects: 100% (4667/4667), done.
remote: Total 10687 (delta 4972), reused 9595 (delta 4477), pack-reused 0
Receiving objects: 100% (10687/10687), 13.29 MiB | 673.00 KiB/s, done.
Resolving deltas: 100% (4972/4972), done.

Cheers,

Gilles

On 4/26/2016 12:03 AM, Jeff Squyres (jsquyres) wrote:

On Apr 25, 2016, at 9:50 AM, Gilles Gouaillardet wrote:

and fwiw, Jeff uses an internally mirrored repo for ompi-tests, so the Cisco 
clusters should use the latest test suites.

Correct.  My local git mirrors update nightly.

FWIW: This made a *huge* difference when we were using SVN for ompi-tests.  An 
individual SVN checkout across the network was really slow; it was 
*significantly* faster to do a local SVN checkout.

I'm sure it's still faster to do a local git clone, but I don't know offhand 
what the speedup is compared to a github.com clone of ompi-tests.





[OMPI devel] Process affinity detection

2016-04-26 Thread Sylvain Jeaugey
Within the BTL code (and surely elsewhere), we can use those convenient 
OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
endpoint is located compared to us.


The problem is that this only works when ORTE defines those flags. The NODE 
case almost always works, since ORTE always sets it. But for NUMA, SOCKET, 
or CORE to work, we need to use Open MPI's binding/mapping capabilities. 
If the process affinity was set by something else (custom scripts 
using taskset, cpusets, ...), it doesn't work.


How hard do you think it would be to detect the affinity and set 
those flags using hwloc to figure out if we're on the same {SOCKET, 
CORE, ...}? Where would it be simpler to do this?
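
To make it a bit more concrete, here is a rough, untested sketch of the kind of
check I have in mind, using plain hwloc 1.x calls only. The helper below and the
idea of shipping the peer's cpuset around as a string are purely illustrative,
not existing Open MPI code (if I remember correctly, opal/mca/hwloc/base already
has a helper that turns two cpuset strings into locality flags):

#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>

/* Illustrative only: check whether two bindings both fall under a single
 * object of the given type (SOCKET, CORE, ...). */
static int share_obj(hwloc_topology_t topo, hwloc_obj_type_t type,
                     hwloc_const_cpuset_t a, hwloc_const_cpuset_t b)
{
    hwloc_obj_t obj = NULL;
    while (NULL != (obj = hwloc_get_next_obj_by_type(topo, type, obj))) {
        if (hwloc_bitmap_isincluded(a, obj->cpuset) &&
            hwloc_bitmap_isincluded(b, obj->cpuset)) {
            return 1;
        }
    }
    return 0;
}

int main(void)
{
    hwloc_topology_t topo;
    hwloc_cpuset_t mine = hwloc_bitmap_alloc();
    hwloc_cpuset_t peer = hwloc_bitmap_alloc();
    char *str = NULL;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* Our own binding, whoever set it (mpirun, taskset, cpusets, ...). */
    hwloc_get_cpubind(topo, mine, HWLOC_CPUBIND_PROCESS);

    /* Serialize it; this string is what we would have to send to (and
     * receive from) the peer through some exchange. */
    hwloc_bitmap_asprintf(&str, mine);
    printf("my binding: %s\n", str);

    /* Here we just parse our own string back as a stand-in for the peer. */
    hwloc_bitmap_sscanf(peer, str);

    printf("same socket: %d, same core: %d\n",
           share_obj(topo, HWLOC_OBJ_SOCKET, mine, peer),
           share_obj(topo, HWLOC_OBJ_CORE, mine, peer));

    free(str);
    hwloc_bitmap_free(mine);
    hwloc_bitmap_free(peer);
    hwloc_topology_destroy(topo);
    return 0;
}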


Thanks.
Sylvain



Re: [OMPI devel] Process affinity detection

2016-04-26 Thread Ralph Castain
Hmmm…you mean for procs on the same node? I’m not sure how you can do it 
without introducing another data exchange, and that would require the app to 
execute it since otherwise we have no idea when they set the affinity.

If we assume they set the affinity prior to calling MPI_Init, then we could do 
it - but at the cost of forcing a modex. You can only detect your own affinity, 
so to get the relative placement, you have to do an exchange if we can’t pass 
it to you. Perhaps we could offer it as an option?


> On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey  wrote:
> 
> Within the BTL code (and surely elsewhere), we can use those convenient 
> OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
> endpoint is located compared to us.
> 
> The problem is that it only works when ORTE defines it. The NODE works almost 
> always since ORTE is always doing it. But for the NUMA, SOCKET, or CORE to 
> work, we need to use Open MPI binding/mapping capabilities. If the process 
> affinity was set with something else (custom scripts using taskset, cpusets, 
> ...), it doesn't work.
> 
> How hard do you think it would be to detect the affinity and set those 
> flags using hwloc to figure out if we're on the same {SOCKET, CORE, ...}? 
> Where would it be simpler to do this?
> 
> Thanks.
> Sylvain
> 



Re: [OMPI devel] Process affinity detection

2016-04-26 Thread Sylvain Jeaugey
Indeed, I implied that affinity was set before MPI_Init (usually even 
before the process is launched).


And yes, that would require a modex ... but I thought there was one 
already and maybe we could pack the affinity information inside the 
existing one.


On 04/26/2016 02:56 PM, Ralph Castain wrote:

Hmmm…you mean for procs on the same node? I’m not sure how you can do it 
without introducing another data exchange, and that would require the app to 
execute it since otherwise we have no idea when they set the affinity.

If we assume they set the affinity prior to calling MPI_Init, then we could do 
it - but at the cost of forcing a modex. You can only detect your own affinity, 
so to get the relative placement, you have to do an exchange if we can’t pass 
it to you. Perhaps we could offer it as an option?



On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey  wrote:

Within the BTL code (and surely elsewhere), we can use those convenient 
OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
endpoint is located compared to us.

The problem is that it only works when ORTE defines it. The NODE works almost 
always since ORTE is always doing it. But for the NUMA, SOCKET, or CORE to 
work, we need to use Open MPI binding/mapping capabilities. If the process 
affinity was set with something else (custom scripts using taskset, cpusets, 
...), it doesn't work.

How hard do you think it would be to detect the affinity and set those flags 
using hwloc to figure out if we're on the same {SOCKET, CORE, ...}? Where 
would it be simpler to do this?

Thanks.
Sylvain





Re: [OMPI devel] Process affinity detection

2016-04-26 Thread Ralph Castain

> On Apr 26, 2016, at 3:35 PM, Sylvain Jeaugey  wrote:
> 
> Indeed, I implied that affinity was set before MPI_Init (usually even before 
> the process is launched).
> 
> And yes, that would require a modex ... but I thought there was one already 
> and maybe we could pack the affinity information inside the existing one.

If the BTLs et al don’t require the modex, then we don’t perform it (e.g., when 
launched by mpirun or via a PMIx-enabled RM). So when someone does as you 
describe, then we would have to force the modex to exchange the info. Doable, 
but results in a scaling penalty, and so definitely not something we want to do 
by default.


> 
> On 04/26/2016 02:56 PM, Ralph Castain wrote:
>> Hmmm…you mean for procs on the same node? I’m not sure how you can do it 
>> without introducing another data exchange, and that would require the app to 
>> execute it since otherwise we have no idea when they set the affinity.
>> 
>> If we assume they set the affinity prior to calling MPI_Init, then we could 
>> do it - but at the cost of forcing a modex. You can only detect your own 
>> affinity, so to get the relative placement, you have to do an exchange if we 
>> can’t pass it to you. Perhaps we could offer it as an option?
>> 
>> 
>>> On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey  wrote:
>>> 
>>> Within the BTL code (and surely elsewhere), we can use those convenient 
>>> OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
>>> endpoint is located compared to us.
>>> 
>>> The problem is that it only works when ORTE defines it. The NODE works 
>>> almost always since ORTE is always doing it. But for the NUMA, SOCKET, or 
>>> CORE to work, we need to use Open MPI binding/mapping capabilities. If the 
>>> process affinity was set with something else (custom scripts using taskset, 
>>> cpusets, ...), it doesn't work.
>>> 
>>> How hard do you think it would be to detect the affinity and set those 
>>> flags using hwloc to figure out if we're on the same {SOCKET, CORE, ...}? 
>>> Where would it be simpler to do this?
>>> 
>>> Thanks.
>>> Sylvain
>>> 



Re: [OMPI devel] Process affinity detection

2016-04-26 Thread Sylvain Jeaugey

Oh, I see. No, we don't want to add a full modex if there isn't one already.

Now, if we restrict this to the intra-node case (we don't care which 
socket/core a distant process is on), is there any simple way to do an 
intra-node-only modex?


On 04/26/2016 04:28 PM, Ralph Castain wrote:

On Apr 26, 2016, at 3:35 PM, Sylvain Jeaugey  wrote:

Indeed, I implied that affinity was set before MPI_Init (usually even before 
the process is launched).

And yes, that would require a modex ... but I thought there was one already and 
maybe we could pack the affinity information inside the existing one.

If the BTLs et al don’t require the modex, then we don’t perform it (e.g., when 
launched by mpirun or via a PMIx-enabled RM). So when someone does as you 
describe, then we would have to force the modex to exchange the info. Doable, 
but results in a scaling penalty, and so definitely not something we want to do 
by default.



On 04/26/2016 02:56 PM, Ralph Castain wrote:

Hmmm…you mean for procs on the same node? I’m not sure how you can do it 
without introducing another data exchange, and that would require the app to 
execute it since otherwise we have no idea when they set the affinity.

If we assume they set the affinity prior to calling MPI_Init, then we could do 
it - but at the cost of forcing a modex. You can only detect your own affinity, 
so to get the relative placement, you have to do an exchange if we can’t pass 
it to you. Perhaps we could offer it as an option?



On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey  wrote:

Within the BTL code (and surely elsewhere), we can use those convenient 
OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
endpoint is located compared to us.

The problem is that it only works when ORTE defines it. The NODE works almost 
always since ORTE is always doing it. But for the NUMA, SOCKET, or CORE to 
work, we need to use Open MPI binding/mapping capabilities. If the process 
affinity was set with something else (custom scripts using taskset, cpusets, 
...), it doesn't work.

How hard do you think it would be to detect the affinity and set those flags 
using hwloc to figure out if we're on the same {SOCKET, CORE, ...}? Where 
would it be simpler to do this?

Thanks.
Sylvain





Re: [OMPI devel] Process affinity detection

2016-04-26 Thread Ralph Castain

> On Apr 26, 2016, at 4:33 PM, Sylvain Jeaugey  wrote:
> 
> Oh, I see. No, we don't want to add a full modex if there isn't one already.
> 
> Now, if we restrict this to the intra-node case (we don't care which 
> socket/core a distant process is on), is there any simple way to do an 
> intra-node-only modex?

Sure - we can “pmix.put” the data with “local” scope to avoid it going 
anywhere, and then add an option in “fence” to do only a local fence (i.e., 
across the procs on the same node) to ensure the data was ready. Or we could do 
a non-blocking “get” to retrieve it and let the “fence” be done in the MPI 
layer by blocking until the data is returned.

Either way, it's not something we would want to do by default. Still, since the 
user knows they are doing this, it should be easy enough for them to provide an 
option telling us to perform the extra maneuver.
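
To sketch what that could look like from the application side (raw PMIx 1.x
calls for illustration only - the key name and helpers below are made up, PMIx
is assumed to already be initialized, and inside OMPI this would of course go
through the opal/pmix glue rather than direct calls):

#include <string.h>
#include <pmix.h>

/* Hypothetical key name, just for illustration. */
#define AFFINITY_KEY "my.cpuset.string"

/* Publish our own cpuset string so that local peers can fetch it.
 * PMIX_LOCAL scope keeps the data on-node, which is all we need here. */
static pmix_status_t publish_cpuset(const char *cpuset_str)
{
    pmix_value_t val;
    pmix_status_t rc;

    PMIX_VALUE_CONSTRUCT(&val);
    val.type = PMIX_STRING;
    val.data.string = strdup(cpuset_str);

    rc = PMIx_Put(PMIX_LOCAL, AFFINITY_KEY, &val);
    PMIX_VALUE_DESTRUCT(&val);
    if (PMIX_SUCCESS != rc) {
        return rc;
    }
    return PMIx_Commit();
}

/* Fetch a local peer's cpuset string (caller frees the result). The
 * ordering guarantee is the open question: the local-only fence option
 * mentioned above would go before this call, or the get itself would
 * have to wait until the peer has committed its data. */
static char *fetch_peer_cpuset(const pmix_proc_t *peer)
{
    pmix_value_t *val = NULL;
    char *result = NULL;

    if (PMIX_SUCCESS == PMIx_Get(peer, AFFINITY_KEY, NULL, 0, &val) &&
        NULL != val && PMIX_STRING == val->type) {
        result = strdup(val->data.string);
        PMIX_VALUE_RELEASE(val);
    }
    return result;
}

The put/get side is straightforward; the synchronization between the two is the
part that would need the new fence option (or a get that waits for the data).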


> 
> On 04/26/2016 04:28 PM, Ralph Castain wrote:
>>> On Apr 26, 2016, at 3:35 PM, Sylvain Jeaugey  wrote:
>>> 
>>> Indeed, I implied that affinity was set before MPI_Init (usually even 
>>> before the process is launched).
>>> 
>>> And yes, that would require a modex ... but I thought there was one already 
>>> and maybe we could pack the affinity information inside the existing one.
>> If the BTLs et al don’t require the modex, then we don’t perform it (e.g., 
>> when launched by mpirun or via a PMIx-enabled RM). So when someone does as 
>> you describe, then we would have to force the modex to exchange the info. 
>> Doable, but results in a scaling penalty, and so definitely not something we 
>> want to do by default.
>> 
>> 
>>> On 04/26/2016 02:56 PM, Ralph Castain wrote:
 Hmmm…you mean for procs on the same node? I’m not sure how you can do it 
 without introducing another data exchange, and that would require the app 
 to execute it since otherwise we have no idea when they set the affinity.
 
 If we assume they set the affinity prior to calling MPI_Init, then we 
 could do it - but at the cost of forcing a modex. You can only detect your 
 own affinity, so to get the relative placement, you have to do an exchange 
 if we can’t pass it to you. Perhaps we could offer it as an option?
 
 
> On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey  wrote:
> 
> Within the BTL code (and surely elsewhere), we can use those convenient 
> OPAL_PROC_ON_LOCAL_{NODE,SOCKET, ...} macros to figure out where another 
> endpoint is located compared to us.
> 
> The problem is that it only works when ORTE defines it. The NODE works 
> almost always since ORTE is always doing it. But for the NUMA, SOCKET, or 
> CORE to work, we need to use Open MPI binding/mapping capabilities. If 
> the process affinity was set with something else (custom scripts using 
> taskset, cpusets, ...), it doesn't work.
> 
> How hard do you think it would be to detect the affinity and set those 
> flags using hwloc to figure out if we're on the same {SOCKET, CORE, 
> ...}? Where would it be simpler to do this?
> 
> Thanks.
> Sylvain
> 