Re: [hwloc-devel] Pgcc issues fixed?
On Nov 12, 2009, at 9:05 AM, Samuel Thibault wrote:

> > The *only* weird possibility would be if RH (or Suse) patched their
> > old glibcs to fix this problem but didn't update the minor number.
>
> Ok but in that case we'd just use the PLPA implementation that should
> work fine. In the long run RH/Suse will eventually really upgrade
> their glibc and get the bumped minor.

Sounds good.

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
Jeff Squyres wrote on Thu 12 Nov 2009 08:58:19 -0800:

> The *only* weird possibility would be if RH (or Suse) patched their
> old glibcs to fix this problem but didn't update the minor number.

Ok but in that case we'd just use the PLPA implementation that should work fine. In the long run RH/Suse will eventually really upgrade their glibc and get the bumped minor.

Samuel
Re: [hwloc-devel] Pgcc issues fixed?
On Nov 12, 2009, at 8:48 AM, Samuel Thibault wrote:

> > On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:
> > > Maybe what we can do is use PLPA's functions if __GLIBC__ is <=
> > > 2 and __GLIBC_MINOR__ is < the first version which is known to be
> > > correct or if CPU_SET can't be compiled, and rely on the glibc
> > > functions otherwise. Of course we have to rely on glibc in any case for
> > > pthread_setaffinity_np().
> >
> > That sounds good. Even after glibc was fixed, "bad" versions of it
> > were still in many already-installed machines for many years
>
> And these had a minor number earlier than the fixed glibc, right?

Yes -- that's why I'm saying your plan sounds good. :-)

The *only* weird possibility would be if RH (or Suse) patched their old glibcs to fix this problem but didn't update the minor number. Things like this have happened before; it's why I always prefer testing for behavior rather than version numbers. But I don't quite know how to probe for this in the running glibc -- you *may or may not* encounter a problem if you have a size mismatch. Version number might be the best that we can do here. :-\

--
Jeff Squyres
jsquy...@cisco.com
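On testing behavior rather than version numbers: one thing that can be observed at run time is the kernel side of the interface. What follows is a minimal sketch, assuming Linux, that calls the raw `sched_getaffinity` system call and so bypasses the glibc wrapper; on success the raw call returns the number of bytes the kernel wrote into the mask. The helper name `kernel_cpumask_bytes` is hypothetical and is not taken from PLPA or hwloc. It only reports what the kernel expects; it does not show whether the installed glibc wrapper agrees, which is the part described above as hard to probe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from PLPA or hwloc): ask the kernel how many
 * bytes its CPU mask occupies by issuing the raw system call directly.
 * Returns the byte count on success, or -1 with errno set on failure. */
static long kernel_cpumask_bytes(void)
{
    /* Generously sized buffer: the kernel rejects buffers that are too
     * small to hold its CPU mask. */
    unsigned long mask[1024];

    memset(mask, 0, sizeof(mask));
    /* pid 0 means "the calling thread". */
    return syscall(SYS_sched_getaffinity, 0, sizeof(mask), mask);
}
```

A caller could compare the returned value with `sizeof(cpu_set_t)` to see how the library's idea of the mask size relates to the kernel's, but that comparison is an illustration, not a test the thread agreed on.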
Re: [hwloc-devel] Pgcc issues fixed?
Jeff Squyres wrote on Thu 12 Nov 2009 08:34:24 -0800:

> On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:
> > Maybe what we can do is use PLPA's functions if __GLIBC__ is <=
> > 2 and __GLIBC_MINOR__ is < the first version which is known to be
> > correct or if CPU_SET can't be compiled, and rely on the glibc
> > functions otherwise. Of course we have to rely on glibc in any case for
> > pthread_setaffinity_np().
>
> That sounds good. Even after glibc was fixed, "bad" versions of it
> were still in many already-installed machines for many years

And these had a minor number earlier than the fixed glibc, right?

Samuel
Re: [hwloc-devel] Pgcc issues fixed?
On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:

> Maybe what we can do is use PLPA's functions if __GLIBC__ is <= 2
> and __GLIBC_MINOR__ is < the first version which is known to be
> correct or if CPU_SET can't be compiled, and rely on the glibc
> functions otherwise. Of course we have to rely on glibc in any case for
> pthread_setaffinity_np().

That sounds good. Even after glibc was fixed, "bad" versions of it were still in many already-installed machines for many years (there are still lots of rhel4 machines out there; do we know if RH patched their rhel4 glibc to fix this problem?).

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
Jeff Squyres wrote on Mon 09 Nov 2009 08:05:47 -0500:

> Fair enough. What about if we have an AC check for
> pthread_setaffinity_np and use that if it exists, and if it doesn't
> use the PLPA way?

Err, remember that pthread_setaffinity_np alone doesn't allow binding another process, and suffers from the same size parameter kludge (it was introduced in 2003).

> BTW, how does pthread_setaffinity_np() work? Does it check the
> running kernel and ensure to do the Right Thing?

Like sched_setaffinity does, yes.

> That was definitely a problem in the past -- kernel and glibc would
> mismatch in terms of set/getaffinity (which was included in many
> distros).

They were fixed at the same time, 2004-03-18.

Maybe what we can do is use PLPA's functions if __GLIBC__ is <= 2 and __GLIBC_MINOR__ is < the first version which is known to be correct or if CPU_SET can't be compiled, and rely on the glibc functions otherwise. Of course we have to rely on glibc in any case for pthread_setaffinity_np().

Samuel
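The version gate proposed above could be sketched in the preprocessor roughly as follows. This is only an illustration under stated assumptions: `FIRST_GOOD_GLIBC_MINOR` is a placeholder, because the thread never names the first glibc minor known to be correct, and `defined(CPU_SET)` is a rough stand-in for "CPU_SET can't be compiled", which in practice would be a configure-time compile test. The names `USE_GLIBC_AFFINITY` and `use_glibc_affinity` are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>  /* on glibc this also makes __GLIBC__ and __GLIBC_MINOR__ visible */

/* PLACEHOLDER value: the thread does not say which glibc minor first
 * shipped the fixed affinity interface. Override with -D as needed. */
#ifndef FIRST_GOOD_GLIBC_MINOR
#define FIRST_GOOD_GLIBC_MINOR 4
#endif

/* Trust glibc only when it is new enough and CPU_SET is available;
 * otherwise a PLPA-style implementation would be selected. */
#if defined(__GLIBC__) && defined(CPU_SET) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= FIRST_GOOD_GLIBC_MINOR))
#define USE_GLIBC_AFFINITY 1
#else
#define USE_GLIBC_AFFINITY 0
#endif

/* Hypothetical accessor: 1 if the glibc functions were selected at
 * compile time, 0 if the PLPA-style fallback was selected. */
static int use_glibc_affinity(void)
{
    return USE_GLIBC_AFFINITY;
}
```

Note that this decides at compile time from the headers, so it shares the weakness raised elsewhere in the thread: a distribution that patched an old glibc without bumping the minor would simply be treated as old.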
Re: [hwloc-devel] Pgcc issues fixed?
On Nov 9, 2009, at 5:12 AM, Samuel Thibault wrote:

> What I dislike in that approach is that it means we'd have to closely
> follow future changes in the kernel ABI, while the API is not supposed
> to change (even if it has in the past). Also, now that glibc provides
> pthread_setaffinity_np, we should take advantage of it to implement
> hwloc_set_thread_cpubind, and there is no way we can re-implement it
> ourselves (the missing piece is the pthread_t -> tid translation).

Fair enough. What about if we have an AC check for pthread_setaffinity_np and use that if it exists, and if it doesn't use the PLPA way?

So if the timeline looks like this:

  - way in the past (time flows down)
  |
  -> "bad" setaffinity days of kernel/glibc mixing
  |  PLPA method is known to work here
  |
  -> pthread_setaffinity_np is introduced, fixes problems
  |
 \|/
  - present

Then if AC causes hwloc to prefer pthread_setaffinity_np(), we're covered for all the old systems with old kernels and/or old glibc where problems occur.

BTW, how does pthread_setaffinity_np() work? Does it check the running kernel and ensure to do the Right Thing? That was definitely a problem in the past -- kernel and glibc would mismatch in terms of set/getaffinity (which was included in many distros).

--
Jeff Squyres
jsquy...@cisco.com
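For illustration, the "AC check, then fall back" idea might look like this on the C side. It is a sketch under assumptions: `HAVE_PTHREAD_SETAFFINITY_NP` is the macro an `AC_CHECK_FUNCS([pthread_setaffinity_np])` test would define, the helper name `bind_self` is hypothetical, and the fallback branch uses plain `sched_setaffinity` as a stand-in for PLPA's implementation, which is not reproduced here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#ifdef HAVE_PTHREAD_SETAFFINITY_NP
#include <pthread.h>
#endif

/* Hypothetical helper: bind the calling thread to the CPUs in *set.
 * Returns 0 on success or an errno-style error number on failure. */
static int bind_self(const cpu_set_t *set)
{
#ifdef HAVE_PTHREAD_SETAFFINITY_NP
    /* Preferred path when configure found the function; it already
     * returns an error number rather than setting errno. */
    return pthread_setaffinity_np(pthread_self(), sizeof(*set), set);
#else
    /* Stand-in for the PLPA-style path; pid 0 is the calling thread. */
    return sched_setaffinity(0, sizeof(*set), set) == 0 ? 0 : errno;
#endif
}
```

Both branches only cover the calling thread; binding another process still needs the `sched_setaffinity` family, whichever way the check goes.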
Re: [hwloc-devel] Pgcc issues fixed?
Jeff Squyres wrote on Thu 05 Nov 2009 07:58:58 -0500:

> This problem may go away if we adapt PLPA's approach to sched_[set|
> get]affinity.

What I dislike in that approach is that it means we'd have to closely follow future changes in the kernel ABI, while the API is not supposed to change (even if it has in the past). Also, now that glibc provides pthread_setaffinity_np, we should take advantage of it to implement hwloc_set_thread_cpubind, and there is no way we can re-implement it ourselves (the missing piece is the pthread_t -> tid translation).

Samuel
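To make the pthread_t point concrete, here is a small sketch of binding a thread through its `pthread_t` handle, which is what glibc's `pthread_setaffinity_np` offers and a raw-syscall approach cannot do without the pthread_t -> tid translation. The helper names `pin_thread` and `first_allowed_cpu` are hypothetical; this is not how hwloc_set_thread_cpubind is actually written.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: restrict `thread` to the single CPU `cpu`.
 * Returns 0 on success or an error number (e.g. EINVAL) on failure. */
static int pin_thread(pthread_t thread, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Hypothetical helper: lowest-numbered CPU that `thread` is currently
 * allowed to run on, or -1 if the mask could not be read. */
static int first_allowed_cpu(pthread_t thread)
{
    cpu_set_t set;
    int cpu;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Any `pthread_t` works as the first argument, not just `pthread_self()`, so one thread can bind another without ever knowing its kernel tid.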
Re: [hwloc-devel] Pgcc issues fixed?
This problem may go away if we adapt PLPA's approach to sched_[set|get]affinity.

On Nov 4, 2009, at 10:34 PM, Chris Samuel wrote:

> - "Chris Samuel" wrote:
>
> > - "Jeff Squyres" wrote:
> >
> > > K. Clear for a final rc / release?
> >
> > Go for it, am just about to go run a training course
> > now so won't be available until this arvo Melbourne
> > time..
>
> Seems fine with PGI, Intel and GCC on AMD64, so I thought I'd give it
> a whirl on our old SLES9 PPC64 cluster with XLC, that whinges about
> the usual params unused, but also says:
>
>   "topology-linux.c", line 146.33: 1506-280 (W) Function argument
>   assignment between types "unsigned int" and "struct {...}*" is not
>   allowed.
>
> cheers,
> Chris

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
- "Jeff Squyres" wrote:

> K. Clear for a final rc / release?

Go for it, am just about to go run a training course now so won't be available until this arvo Melbourne time..

cheers!
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
Re: [hwloc-devel] Pgcc issues fixed?
Brice Goglin wrote on Wed 04 Nov 2009 22:03:11 +0100:

> I think pgcc building fine was the last possible problem, so I assume
> everything is ok now.

Same for me.

Samuel
Re: [hwloc-devel] Pgcc issues fixed?
I think pgcc building fine was the last possible problem, so I assume everything is ok now.

Brice

Jeff Squyres wrote:
> K. Clear for a final rc / release?
>
> (due to timezone differences, I'm going to assume yes -- I'll make an
> rc now and if all goes well, let's release tomorrow morning)
>
> On Nov 4, 2009, at 6:41 AM, Samuel Thibault wrote:
>
>> Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:
>> > I'm still seeing a heap with the Intel compilers.
>>
>> Yes, these are mostly pedantic warnings which we do not need to fix for
>> this release (would need tagging parameters as unused etc.)
>>
>> Samuel
Re: [hwloc-devel] Pgcc issues fixed?
K. Clear for a final rc / release?

(due to timezone differences, I'm going to assume yes -- I'll make an rc now and if all goes well, let's release tomorrow morning)

On Nov 4, 2009, at 6:41 AM, Samuel Thibault wrote:

> Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:
> > I'm still seeing a heap with the Intel compilers.
>
> Yes, these are mostly pedantic warnings which we do not need to fix for
> this release (would need tagging parameters as unused etc.)
>
> Samuel

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:

> I'm still seeing a heap with the Intel compilers.

Yes, these are mostly pedantic warnings which we do not need to fix for this release (would need tagging parameters as unused etc.)

Samuel
Re: [hwloc-devel] Pgcc issues fixed?
Thanks for your patience! Samuel just committed a bunch more -- try the next one:

http://www.open-mpi.org/~jsquyres/unofficial/

On Nov 3, 2009, at 6:19 PM, Chris Samuel wrote:

> - "Chris Samuel" wrote:
>
> > Will try PGI 7.0 now.
>
> I can confirm it compiles OK with PGI 7.0, 7.1 and 7.2 with the same
> warnings as for 8.0. These warnings also appear with 9.0.
>
> Lots of warnings from the Intel v11 compilers, I've attached a text
> file for the entire make process.
>
> GCC 4.4.2 compiles it without a warning (using -Werror to catch any).
> Likewise GCC 4.1.2 that comes with CentOS 5.
>
> cheers,
> Chris

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
- "Chris Samuel" wrote:

> Will try PGI 7.0 now.

I can confirm it compiles OK with PGI 7.0, 7.1 and 7.2 with the same warnings as for 8.0. These warnings also appear with 9.0.

Lots of warnings from the Intel v11 compilers, I've attached a text file for the entire make process.

GCC 4.4.2 compiles it without a warning (using -Werror to catch any). Likewise GCC 4.1.2 that comes with CentOS 5.

cheers,
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

[Attachment: makelog-icc]
Re: [hwloc-devel] Pgcc issues fixed?
- "Chris Samuel" wrote:

> Grabbing now, thanks!

Compiled OK with:

  [csamuel@tango hwloc-0.9.1rc3r1276]$ pgcc -V

  pgcc 8.0-6 64-bit target on x86-64 Linux -tp gh-64
  Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
  Copyright 2000-2009, STMicroelectronics, Inc. All Rights Reserved.

Just these warnings:

  /bin/sh ../libtool --tag=CC --mode=compile pgcc -DHAVE_CONFIG_H -I. -I../include/private -I../include/hwloc -I../include -I../include -I/usr/local/openmpi/1.3.3-pgi/include -I/usr/local/mpfr/2.4.1/include -I/usr/local/gmp/4.3.1/include -I/usr/include/libxml2 -g -c -o cpuset.lo cpuset.c
  libtool: compile: pgcc -DHAVE_CONFIG_H -I. -I../include/private -I../include/hwloc -I../include -I../include -I/usr/local/openmpi/1.3.3-pgi/include -I/usr/local/mpfr/2.4.1/include -I/usr/local/gmp/4.3.1/include -I/usr/include/libxml2 -g -c cpuset.c -fpic -DPIC -o .libs/cpuset.o
  PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 453)
  PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 489)
  PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 506)
  PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 507)
  PGC/x86-64 Linux 8.0-6: compilation completed with warnings

Will try PGI 7.0 now.

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
Re: [hwloc-devel] Pgcc issues fixed?
- "Jeff Squyres" wrote:

> Try this tarball:
>
> http://www.open-mpi.org/~jsquyres/unofficial/hwloc-0.9.1rc3r1276.tar.bz2

Grabbing now, thanks!

Sorry for not seeing the email yesterday, it was a public holiday here (Melbourne Cup Day, yes we have a public holiday for a horse race!).

cheers,
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
Re: [hwloc-devel] Pgcc issues fixed?
Try this tarball:

http://www.open-mpi.org/~jsquyres/unofficial/hwloc-0.9.1rc3r1276.tar.bz2

I cut it from the 0.9 branch with Samuel's latest commits (i.e., it's post rc3). It should be fully bootstrapped and not require you to have the latest autotools.

On Nov 3, 2009, at 5:38 PM, Chris Samuel wrote:

> - "Jeff Squyres (jsquyres)" wrote:
>
> > Pgcc issues fixed?
>
> Sorry folks, have not yet got the SVN checkout to configure due to it
> requiring newer tools than I have and am buried trying to get board
> reports out at present. Hopefully will have some time tomorrow 2-3pm
> my time.
>
> If it's any help it's just PGI v8.0 (haven't tried v7 yet) not the
> current v9 release.
>
> Sorry about this..
> Chris

--
Jeff Squyres
jsquy...@cisco.com
Re: [hwloc-devel] Pgcc issues fixed?
- "Jeff Squyres (jsquyres)" wrote:

> Pgcc issues fixed?

Sorry folks, have not yet got the SVN checkout to configure due to it requiring newer tools than I have and am buried trying to get board reports out at present. Hopefully will have some time tomorrow 2-3pm my time.

If it's any help it's just PGI v8.0 (haven't tried v7 yet) not the current v9 release.

Sorry about this..
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency