Re: [OMPI devel] Cuda build break

2017-10-04 Thread r...@open-mpi.org
Fix is here: https://github.com/open-mpi/ompi/pull/4301 


> On Oct 4, 2017, at 11:19 AM, Jeff Squyres (jsquyres) wrote:
> 
> Thanks Ralph.
> 
>> On Oct 4, 2017, at 2:07 PM, r...@open-mpi.org wrote:
>> 
>> I’ll fix
>> 
>>> On Oct 4, 2017, at 10:57 AM, Sylvain Jeaugey  wrote:
>>> 
>>> See my last comment on #4257 :
>>> 
>>> https://github.com/open-mpi/ompi/pull/4257#issuecomment-332900393
>>> 
>>> We should completely disable CUDA in hwloc. It is breaking the build, but 
>>> more importantly, it creates an extra dependency on the CUDA runtime that 
>>> Open MPI doesn't have, even when compiled with --with-cuda (we load symbols 
>>> dynamically).
>>> 
>>> On 10/04/2017 10:42 AM, Barrett, Brian via devel wrote:
>>>> All -
>>>> 
>>>> It looks like nVidia’s MTT started failing on 9/26, due to not finding 
>>>> Cuda.  There’s a suspicious commit given the error message in the hwloc 
>>>> cuda changes.  Jeff and Brice, it’s your patch, can you dig into the build 
>>>> failures?
>>>> 
>>>> Brian
>>>> ___
>>>> devel mailing list
>>>> devel@lists.open-mpi.org
>>>> https://lists.open-mpi.org/mailman/listinfo/devel
>>> 
>>> ---
>>> This email message is for the sole use of the intended recipient(s) and may 
>>> contain
>>> confidential information.  Any unauthorized review, use, disclosure or 
>>> distribution
>>> is prohibited.  If you are not the intended recipient, please contact the 
>>> sender by
>>> reply email and destroy all copies of the original message.
>>> ---
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 

Re: [OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread r...@open-mpi.org
Thanks! Fix is here: https://github.com/open-mpi/ompi/pull/4301 


> On Oct 4, 2017, at 11:10 AM, Brice Goglin  wrote:
> 
> Looks like you're using a hwloc < 1.11. If you want to support this old
> API while using the 1.11 names, you can add this to OMPI after #include
> 
> #if HWLOC_API_VERSION < 0x00010b00
> #define HWLOC_OBJ_NUMANODE HWLOC_OBJ_NODE
> #define HWLOC_OBJ_PACKAGE HWLOC_OBJ_SOCKET
> #endif
> 
> Brice
> 
> 
> 
> 
> Le 04/10/2017 19:54, Barrett, Brian via devel a écrit :
>> It looks like a change in either HWLOC or the rmaps ppr component is causing 
>> Cisco build failures on master for the last couple of days:
>> 
>>  https://mtt.open-mpi.org/index.php?do_redir=2486
>> 
>> rmaps_ppr.c:665:17: error: ‘HWLOC_OBJ_NUMANODE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_NODE’?
>>          level = HWLOC_OBJ_NUMANODE;
>>                  ^~~~~~~~~~~~~~~~~~
>>                  HWLOC_OBJ_NODE
>> rmaps_ppr.c:665:17: note: each undeclared identifier is reported only once for each function it
>> 
>> Can someone take a look?

Re: [OMPI devel] Cuda build break

2017-10-04 Thread Jeff Squyres (jsquyres)
Thanks Ralph.

> On Oct 4, 2017, at 2:07 PM, r...@open-mpi.org wrote:
> 
> I’ll fix
> 
>> On Oct 4, 2017, at 10:57 AM, Sylvain Jeaugey  wrote:
>> 
>> See my last comment on #4257 :
>> 
>> https://github.com/open-mpi/ompi/pull/4257#issuecomment-332900393
>> 
>> We should completely disable CUDA in hwloc. It is breaking the build, but 
>> more importantly, it creates an extra dependency on the CUDA runtime that 
>> Open MPI doesn't have, even when compiled with --with-cuda (we load symbols 
>> dynamically).
>> 
>> On 10/04/2017 10:42 AM, Barrett, Brian via devel wrote:
>>> All -
>>> 
>>> It looks like nVidia’s MTT started failing on 9/26, due to not finding 
>>> Cuda.  There’s a suspicious commit given the error message in the hwloc 
>>> cuda changes.  Jeff and Brice, it’s your patch, can you dig into the build 
>>> failures?
>>> 
>>> Brian


-- 
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread Brice Goglin
Looks like you're using a hwloc < 1.11. If you want to support this old
API while using the 1.11 names, you can add this to OMPI after #include

#if HWLOC_API_VERSION < 0x00010b00
#define HWLOC_OBJ_NUMANODE HWLOC_OBJ_NODE
#define HWLOC_OBJ_PACKAGE HWLOC_OBJ_SOCKET
#endif

Brice
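
A minimal sketch of how the shim above could sit in a compatibility header. hwloc packs its API version as (major << 16) | (minor << 8) | patch, so 0x00010b00 corresponds to 1.11.0. To keep the sketch compilable without hwloc installed, the version macro and the enum below are stand-ins for what an older hwloc header would provide (in real code they come from the hwloc header itself), and api_major, api_minor, numa_level and package_level are hypothetical helpers, not hwloc or Open MPI functions.

```c
/* --- Stand-in for an old (pre-1.11) hwloc header, for illustration only. ---
 * The enumerator values are placeholders and are not meant to match
 * hwloc's real enum values. */
#define HWLOC_API_VERSION 0x00010a00   /* pretend we build against 1.10 */
typedef enum { HWLOC_OBJ_NODE, HWLOC_OBJ_SOCKET } demo_obj_type_t;

/* --- Compatibility shim from the message above. ---
 * Only active when the installed API predates the 1.11 names. */
#if HWLOC_API_VERSION < 0x00010b00
#define HWLOC_OBJ_NUMANODE HWLOC_OBJ_NODE
#define HWLOC_OBJ_PACKAGE  HWLOC_OBJ_SOCKET
#endif

/* Decode the packed version: (major << 16) | (minor << 8) | patch. */
unsigned api_major(unsigned v) { return (v >> 16) & 0xffu; }
unsigned api_minor(unsigned v) { return (v >> 8) & 0xffu; }

/* Hypothetical callers written against the 1.11 names only. With the shim
 * in place they compile unchanged against the older API. */
demo_obj_type_t numa_level(void)    { return HWLOC_OBJ_NUMANODE; }
demo_obj_type_t package_level(void) { return HWLOC_OBJ_PACKAGE; }
```

With a 1.11 or newer header the #if branch is skipped and the native names are used directly, so the calling code does not need its own version checks.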




Le 04/10/2017 19:54, Barrett, Brian via devel a écrit :
> It looks like a change in either HWLOC or the rmaps ppr component is causing 
> Cisco build failures on master for the last couple of days:
>
>   https://mtt.open-mpi.org/index.php?do_redir=2486
>
> rmaps_ppr.c:665:17: error: ‘HWLOC_OBJ_NUMANODE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_NODE’?
>          level = HWLOC_OBJ_NUMANODE;
>                  ^~~~~~~~~~~~~~~~~~
>                  HWLOC_OBJ_NODE
> rmaps_ppr.c:665:17: note: each undeclared identifier is reported only once for each function it
>
> Can someone take a look?


Re: [OMPI devel] Cuda build break

2017-10-04 Thread r...@open-mpi.org
I’ll fix

> On Oct 4, 2017, at 10:57 AM, Sylvain Jeaugey  wrote:
> 
> See my last comment on #4257 :
> 
> https://github.com/open-mpi/ompi/pull/4257#issuecomment-332900393
> 
> We should completely disable CUDA in hwloc. It is breaking the build, but 
> more importantly, it creates an extra dependency on the CUDA runtime that 
> Open MPI doesn't have, even when compiled with --with-cuda (we load symbols 
> dynamically).
> 
> On 10/04/2017 10:42 AM, Barrett, Brian via devel wrote:
>> All -
>> 
>> It looks like nVidia’s MTT started failing on 9/26, due to not finding Cuda. 
>>  There’s a suspicious commit given the error message in the hwloc cuda 
>> changes.  Jeff and Brice, it’s your patch, can you dig into the build 
>> failures?
>> 
>> Brian

Re: [OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread r...@open-mpi.org
Hmmm...I suspect this is a hwloc v2 vs v1 issue. I’ll fix it

> On Oct 4, 2017, at 10:54 AM, Barrett, Brian via devel wrote:
> 
> It looks like a change in either HWLOC or the rmaps ppr component is causing 
> Cisco build failures on master for the last couple of days:
> 
>  https://mtt.open-mpi.org/index.php?do_redir=2486
> 
> rmaps_ppr.c:665:17: error: ‘HWLOC_OBJ_NUMANODE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_NODE’?
>          level = HWLOC_OBJ_NUMANODE;
>                  ^~~~~~~~~~~~~~~~~~
>                  HWLOC_OBJ_NODE
> rmaps_ppr.c:665:17: note: each undeclared identifier is reported only once for each function it
> 
> Can someone take a look?

Re: [OMPI devel] Cuda build break

2017-10-04 Thread Sylvain Jeaugey

See my last comment on #4257 :

https://github.com/open-mpi/ompi/pull/4257#issuecomment-332900393

We should completely disable CUDA in hwloc. It is breaking the build, 
but more importantly, it creates an extra dependency on the CUDA runtime 
that Open MPI doesn't have, even when compiled with --with-cuda (we load 
symbols dynamically).
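
For readers unfamiliar with the pattern, here is a rough sketch of what loading symbols dynamically looks like with the POSIX dlopen/dlsym interface. It is an illustration of the general technique, not Open MPI's actual loader: load_optional_symbol and cuda_driver_available are hypothetical names, and "libcuda.so.1" / "cuInit" are assumed here as the usual Linux driver library and entry point.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Try to resolve `symbol` from shared library `libname` at run time.
 * Returns NULL if either the library or the symbol is unavailable, so the
 * caller can fall back to a non-CUDA path instead of failing at link or
 * load time. */
void *load_optional_symbol(const char *libname, const char *symbol)
{
    void *handle = dlopen(libname, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        return NULL;            /* library not installed on this machine */
    }
    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        dlclose(handle);        /* library present but symbol missing */
    }
    /* On success the handle is intentionally left open: closing it could
     * unload the library and invalidate the returned pointer. */
    return sym;
}

/* Illustrative use: report whether the (assumed) CUDA driver entry point
 * can be found, without the binary ever being linked against CUDA. */
int cuda_driver_available(void)
{
    return load_optional_symbol("libcuda.so.1", "cuInit") != NULL;
}
```

Because nothing here is resolved at link time, the resulting binary carries no hard dependency on a CUDA library, which is the property a link-time CUDA dependency in hwloc would break. On older glibc versions this needs -ldl when linking.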


On 10/04/2017 10:42 AM, Barrett, Brian via devel wrote:

> All -
> 
> It looks like nVidia’s MTT started failing on 9/26, due to not finding Cuda.  
> There’s a suspicious commit given the error message in the hwloc cuda changes.  
> Jeff and Brice, it’s your patch, can you dig into the build failures?
> 
> Brian

[OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread Barrett, Brian via devel
It looks like a change in either HWLOC or the rmaps ppr component is causing 
Cisco build failures on master for the last couple of days:

  https://mtt.open-mpi.org/index.php?do_redir=2486

rmaps_ppr.c:665:17: error: ‘HWLOC_OBJ_NUMANODE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_NODE’?
         level = HWLOC_OBJ_NUMANODE;
                 ^~~~~~~~~~~~~~~~~~
                 HWLOC_OBJ_NODE
rmaps_ppr.c:665:17: note: each undeclared identifier is reported only once for each function it

Can someone take a look?

[OMPI devel] Cuda build break

2017-10-04 Thread Barrett, Brian via devel
All -

It looks like nVidia’s MTT started failing on 9/26, due to not finding Cuda.  
There’s a suspicious commit given the error message in the hwloc cuda changes.  
Jeff and Brice, it’s your patch, can you dig into the build failures?

Brian