Re: [OMPI devel] [PATCH] Fix for xlc-13.1.0 ICE (hwloc)

2016-05-09 Thread Josh Hursey
(Just to follow up for the list)
I merged in the master PR, and created a PR for v2.0.0:
  https://github.com/open-mpi/ompi-release/pull/1149

Thanks Paul and Brice!


Re: [OMPI devel] [PATCH] Fix for xlc-13.1.0 ICE (hwloc)

2016-05-08 Thread Brice Goglin
Thanks, applied to hwloc. And PR for OMPI master at
https://github.com/open-mpi/ompi/pull/1657
Brice



Re: [OMPI devel] [PATCH] Fix for xlc-13.1.0 ICE (hwloc)

2016-05-06 Thread Brice Goglin
Thanks.
I think I would be fine with that fix. Unfortunately, I won't have good
internet access until Sunday night, so I won't be able to test anything
properly before then :/



[OMPI devel] [PATCH] Fix for xlc-13.1.0 ICE (hwloc)

2016-05-05 Thread Paul Hargrove
I have some good news:  I have a fix!!

FWIW: I too can build w/ xlc 12.1 (also BG/Q).
It is just the 13.1.0 on Power7 that crashes building hwloc.
Meanwhile, 13.1.2 on Power8 little-endian does not crash (but is a
different front-end than big-endian if I understand correctly).

I started "bisecting" the file topology-xml-nolibxml.c and found that xlc
is crashing on "__hwloc_attribute_may_alias".
Simply disabling use of that attribute resolves the problem.

So, here is the fix, which simply changes the check for this attribute to
match the way in which hwloc uses it.
It disqualifies the buggy compiler version(s) based on behavior, rather
than us trying to list affected versions.

--- config/hwloc_check_attributes.m4~   2016-05-05 17:18:10.380479303 -0500
+++ config/hwloc_check_attributes.m4    2016-05-05 17:21:30.399799031 -0500
@@ -322,9 +322,10 @@
 # Attribute may_alias: No suitable cross-check available, that works for non-supporting compilers
 # Ignored by intel-9.1.045 -- turn off with -wd1292
 # Ignored by PGI-6.2.5; ignore not detected due to missing cross-check
+# The test case is chosen to match hwloc's usage, and reproduces an xlc-13.1.0 bug.
 #
 _HWLOC_CHECK_SPECIFIC_ATTRIBUTE([may_alias],
-[int * p_value __attribute__ ((__may_alias__));],
+[struct { int i; } __attribute__ ((__may_alias__)) * p_value;],
 [],
 [])
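
For readers following along, here is a minimal C sketch of how the result of
this kind of configure check is typically consumed. It is not hwloc's actual
code: DEMO_HAVE_ATTRIBUTE_MAY_ALIAS, DEMO_ATTRIBUTE_MAY_ALIAS, demo_state_t
and demo_read_i are hypothetical names, and the compiler-macro test standing
in for the configure result is an assumption made only so the sketch builds
on its own. The attribute sits on a struct type, the same shape as the new
test case, and a memcpy fallback is used when the attribute is unavailable
(as it would be once the check rejects the buggy compiler).

```c
#include <assert.h>
#include <string.h>

/* Assumption: stand-in for the configure result. A real build would take
 * this value from the generated config header, since compiler version
 * macros cannot reliably single out the buggy compiler. */
#ifndef DEMO_HAVE_ATTRIBUTE_MAY_ALIAS
#  if defined(__GNUC__) || defined(__clang__)
#    define DEMO_HAVE_ATTRIBUTE_MAY_ALIAS 1
#  else
#    define DEMO_HAVE_ATTRIBUTE_MAY_ALIAS 0
#  endif
#endif

#if DEMO_HAVE_ATTRIBUTE_MAY_ALIAS
#  define DEMO_ATTRIBUTE_MAY_ALIAS __attribute__ ((__may_alias__))
#else
#  define DEMO_ATTRIBUTE_MAY_ALIAS /* expands to nothing */
#endif

/* Hypothetical type: the attribute is attached to the struct type itself,
 * the same shape as the new configure test case. */
typedef struct demo_state_s {
    int i;
} DEMO_ATTRIBUTE_MAY_ALIAS * demo_state_t;

/* Read the leading int from a buffer that is suitably aligned for int. */
int demo_read_i(void *buffer)
{
    assert(buffer != NULL);
#if DEMO_HAVE_ATTRIBUTE_MAY_ALIAS
    /* A may_alias type is allowed to view storage of another type. */
    demo_state_t state = (demo_state_t) buffer;
    return state->i;
#else
    /* Without the attribute, copy the bytes instead of type-punning. */
    int value;
    memcpy(&value, buffer, sizeof value);
    return value;
#endif
}
```

Calling demo_read_i on a buffer whose first bytes hold an int returns that
value in either configuration; only the access path differs.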


-Paul [proving that I am good for more than just *breaking* other people's
software - I can fix things too]

On Thu, May 5, 2016 at 2:28 PM, Jeff Squyres (jsquyres) wrote:

> On May 5, 2016, at 5:27 PM, Josh Hursey wrote:
> >
> > Since this also happens with hwloc 1.11.3 standalone maybe hwloc folks
> > can take point on further investigation?
>
> I think Brice would love your assistance in figuring this out, since I'm
> guessing he doesn't have access to these platforms, either.  :-)
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/05/18917.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900