Re: [hwloc-devel] Pgcc issues fixed?

2009-11-12 Thread Jeff Squyres

On Nov 12, 2009, at 9:05 AM, Samuel Thibault wrote:


> The *only* weird possibility would be if RH (or Suse) patched their
> old glibcs to fix this problem but didn't update the minor number.

Ok but in that case we'd just use the PLPA implementation that should
work fine. On the long run RH/Suse will eventually really upgrade their
glibc and get the bumped minor.




Sounds good.

--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-12 Thread Samuel Thibault
Jeff Squyres wrote on Thu 12 Nov 2009 08:58:19 -0800:
> The *only* weird possibility would be if RH (or Suse) patched their  
> old glibcs to fix this problem but didn't update the minor number.   

Ok but in that case we'd just use the PLPA implementation that should
work fine. On the long run RH/Suse will eventually really upgrade their
glibc and get the bumped minor.

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-12 Thread Jeff Squyres

On Nov 12, 2009, at 8:48 AM, Samuel Thibault wrote:


> On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:
> >Maybe what we can do is using PLPA's functions if __GLIBC__ is <=
> >2 and __GLIBC_MINOR__ is < the first version which is known to be
> >correct or if CPU_SET can't be compiled, and rely on the glibc
> >functions else.  Of course we have to rely on glibc in any case for
> >pthread_setaffinity_np().
>
> That sounds good.  Even after glibc was fixed, "bad" versions of it
> were still in many already-installed machines for many years

And these had a minor number earlier than the fixed glibc, right?




Yes -- that's why I'm saying your plan sounds good.  :-)

The *only* weird possibility would be if RH (or Suse) patched their  
old glibcs to fix this problem but didn't update the minor number.   
Things like this have happened before; it's why I always prefer  
testing for behavior rather than version numbers.


But I don't quite know how to probe for this in the running glibc --  
you *may or may not* encounter a problem if you have a size mismatch.   
Version number might be the best that we can do here.  :-\
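
For the version-number route, one option is to look at the glibc that is
actually loaded at run time rather than the headers used at build time. A
minimal sketch, assuming glibc (gnu_get_libc_version() is a glibc extension
declared in <gnu/libc-version.h>); the example_* helper names are made up
for illustration. As said above, this still cannot detect a distro glibc
that was patched without a version bump.

```c
#include <stdio.h>
#include <gnu/libc-version.h>

/* Parse "major.minor[.patch]" into two ints; returns 0 on success,
 * -1 if the text does not start with two dot-separated numbers. */
static int example_parse_version(const char *text, int *major, int *minor)
{
    if (text == NULL || sscanf(text, "%d.%d", major, minor) != 2)
        return -1;
    return 0;
}

/* Returns 1 if the running glibc is older than major.minor, 0 if it is
 * not, and -1 if the version string could not be parsed. */
static int example_running_glibc_older_than(int major, int minor)
{
    int run_major, run_minor;

    if (example_parse_version(gnu_get_libc_version(),
                              &run_major, &run_minor) != 0)
        return -1;
    return run_major < major || (run_major == major && run_minor < minor);
}
```

A caller would pass whatever minor version is eventually identified as the
first fixed one; the thread does not name it.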


--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-12 Thread Samuel Thibault
Jeff Squyres wrote on Thu 12 Nov 2009 08:34:24 -0800:
> On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:
> >Maybe what we can do is using PLPA's functions if __GLIBC__ is <=
> >2 and __GLIBC_MINOR__ is < the first version which is known to be
> >correct or if CPU_SET can't be compiled, and rely on the glibc
> >functions else.  Of course we have to rely on glibc in any case for
> >pthread_setaffinity_np().
> 
> That sounds good.  Even after glibc was fixed, "bad" versions of it  
> were still in many already-installed machines for many years

And these had a minor number earlier than the fixed glibc, right?

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-12 Thread Jeff Squyres

On Nov 11, 2009, at 4:57 PM, Samuel Thibault wrote:


Maybe what we can do is using PLPA's functions if __GLIBC__ is <=
2 and __GLIBC_MINOR__ is < the first version which is known to be
correct or if CPU_SET can't be compiled, and rely on the glibc
functions else.  Of course we have to rely on glibc in any case for
pthread_setaffinity_np().




That sounds good.  Even after glibc was fixed, "bad" versions of it
were still in many already-installed machines for many years (there's
still lots of rhel4 machines out there; do we know if RH patched their
rhel4 glibc to fix this problem?).


--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-11 Thread Samuel Thibault
Jeff Squyres wrote on Mon 09 Nov 2009 08:05:47 -0500:
> Fair enough.  What about if we have an AC check for  
> pthread_setaffinity_np and use that if it exists, and if it doesn't  
> use the PLPA way?

Err, remember that pthread_setaffinity_np alone cannot bind another
process, and that it suffers from the same size parameter kludge (it
was introduced in 2003).

> BTW, how does pthread_setaffinity_np() work?  Does it check the  
> running kernel and ensure to do the Right Thing?

Like sched_setaffinity does, yes.

> That was definitely a problem in the past -- kernel and glibc would
> mismatch in terms of set/getaffinity (which was included in many
> distros).

They have been fixed at the same time, 2004-03-18.

Maybe what we can do is use PLPA's functions if __GLIBC__ is <= 2 and
__GLIBC_MINOR__ is < the first version which is known to be correct,
or if CPU_SET can't be compiled, and rely on the glibc functions
otherwise.  Of course we have to rely on glibc in any case for
pthread_setaffinity_np().
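
A minimal sketch of that selection rule in C. The cutoff
EXAMPLE_FIRST_GOOD_GLIBC_MINOR is a placeholder: this thread does not say
which glibc minor first shipped the fix, so the value below is purely
illustrative. EXAMPLE_HAVE_CPU_SET stands in for a hypothetical
configure-time result saying that a CPU_SET test program compiled. All
example_* / EXAMPLE_* names are assumptions, not hwloc identifiers.

```c
#include <stdlib.h> /* any glibc header pulls in __GLIBC__ / __GLIBC_MINOR__ */

/* Placeholder cutoff: the first fixed glibc minor is not named in the
 * thread, so this number is an assumption for illustration only. */
#define EXAMPLE_FIRST_GOOD_GLIBC_MINOR 4

/* Hypothetical configure result: 1 if a CPU_SET() test program compiled. */
#ifndef EXAMPLE_HAVE_CPU_SET
#define EXAMPLE_HAVE_CPU_SET 1
#endif

/* Return 1 when the PLPA-style fallback should be used, 0 when the
 * glibc affinity functions are considered trustworthy. */
static int example_use_plpa_fallback(int glibc_major, int glibc_minor,
                                     int have_cpu_set)
{
    if (!have_cpu_set)
        return 1;
    if (glibc_major < 2)
        return 1;
    if (glibc_major == 2 && glibc_minor < EXAMPLE_FIRST_GOOD_GLIBC_MINOR)
        return 1;
    return 0;
}

/* The same rule evaluated by the preprocessor against the build headers.
 * An undefined __GLIBC__ evaluates to 0 here and selects the fallback. */
#if !EXAMPLE_HAVE_CPU_SET || __GLIBC__ < 2 || \
    (__GLIBC__ == 2 && __GLIBC_MINOR__ < EXAMPLE_FIRST_GOOD_GLIBC_MINOR)
#define EXAMPLE_USE_PLPA_FALLBACK 1
#else
#define EXAMPLE_USE_PLPA_FALLBACK 0
#endif
```

The function form only exists to make the rule easy to exercise with
arbitrary version numbers; in a real build the preprocessor form would
pick one code path at compile time.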

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-09 Thread Jeff Squyres

On Nov 9, 2009, at 5:12 AM, Samuel Thibault wrote:


What I dislike in that approach is that it means we'd have to closely
follow future changes in the kernel ABI, while the API is not supposed
to change (even if it has in the past).  Also, now that glibc provides
pthread_setaffinity_np, we should take advantage of it to implement
hwloc_set_thread_cpubind, and there is no way we can re-implement it
ourselves (the missing piece is the pthread_t -> tid translation).




Fair enough.  What about if we have an AC check for  
pthread_setaffinity_np and use that if it exists, and if it doesn't  
use the PLPA way?  So if the timeline looks like this:


- way in the past (time flows down)
  |
  -> "bad" setaffinity days of kernel/glibc mixing
  |  PLPA method is known to work here
  |
  -> pthread_setaffinity_np is introduced, fixes problems
  |
 \|/
- present

Then if AC causes hwloc to prefer pthread_setaffinity_np(), then we're  
covered for all the old systems with either old kernels and/or old  
glibc where problems occur.


BTW, how does pthread_setaffinity_np() work?  Does it check the  
running kernel and ensure to do the Right Thing?  That was definitely  
a problem in the past -- kernel and glibc would mismatch in terms of  
set/getaffinity (which was included in many distros).


--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-09 Thread Samuel Thibault
Jeff Squyres wrote on Thu 05 Nov 2009 07:58:58 -0500:
> This problem may go away if we adapt PLPA's approach to
> sched_[set|get]affinity.

What I dislike in that approach is that it means we'd have to closely
follow future changes in the kernel ABI, while the API is not supposed
to change (even if it has in the past).  Also, now that glibc provides
pthread_setaffinity_np, we should take advantage of it to implement
hwloc_set_thread_cpubind, and there is no way we can re-implement it
ourselves (the missing piece is the pthread_t -> tid translation).

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-05 Thread Jeff Squyres
This problem may go away if we adapt PLPA's approach to
sched_[set|get]affinity.
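
The replies in this thread describe that approach as depending on the
kernel ABI rather than on the glibc API. A minimal, Linux-only sketch of
the general technique (calling the affinity syscall directly and letting
the kernel report its own mask size); this is an illustration under that
assumption, not PLPA's actual code, and the example_* names are made up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* makes sure syscall() is declared by <unistd.h> */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel directly for the calling thread's affinity mask,
 * bypassing the glibc wrapper and its prototype. On success the raw
 * syscall returns the number of bytes the kernel wrote into the buffer;
 * on failure it returns -1 and sets errno. */
static long example_raw_getaffinity(unsigned long *mask, size_t mask_bytes)
{
    return syscall(SYS_sched_getaffinity, 0, mask_bytes, mask);
}

/* Probe how many bytes the running kernel uses for a CPU mask, using a
 * buffer assumed to be large enough (4096 bytes, room for 32768 CPUs). */
static long example_kernel_mask_bytes(void)
{
    unsigned long buffer[4096 / sizeof(unsigned long)];

    return example_raw_getaffinity(buffer, sizeof(buffer));
}
```

Because the size comes from the running kernel, the caller does not need
to agree with whatever size the installed glibc headers assume.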



On Nov 4, 2009, at 10:34 PM, Chris Samuel wrote:



- "Chris Samuel"  wrote:

> - "Jeff Squyres"  wrote:
>
> > K.  Clear for a final rc / release?
>
> Go for it, am just about to go run a training course
> now so won't be available until this arvo Melbourne
> time..

Seems fine with PGI, Intel and GCC on AMD64, so I
thought I'd give it a whirl on our old SLES9 PPC64
cluster with XLC, that whinges about the usual params
unused, but also says:

"topology-linux.c", line 146.33: 1506-280 (W) Function argument  
assignment between types "unsigned int" and "struct {...}*" is not  
allowed.


cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
___
hwloc-devel mailing list
hwloc-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel




--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-04 Thread Chris Samuel

- "Jeff Squyres"  wrote:

> K.  Clear for a final rc / release?

Go for it, am just about to go run a training course
now so won't be available until this arvo Melbourne
time..

cheers!
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-04 Thread Samuel Thibault
Brice Goglin wrote on Wed 04 Nov 2009 22:03:11 +0100:
> I think  pgcc building fine was the last possible problem, so I assume
> everything is ok now.

Same for me.

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-04 Thread Brice Goglin
I think  pgcc building fine was the last possible problem, so I assume
everything is ok now.

Brice



Jeff Squyres wrote:
> K.  Clear for a final rc / release?
>
> (due to timezone differences, I'm going to assume yes -- I'll make an
> rc now and if all goes well, let's release tomorrow morning)
>
>
> On Nov 4, 2009, at 6:41 AM, Samuel Thibault wrote:
>
>> Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:
>> > I'm still seeing a heap with the Intel compilers.
>>
>> Yes, these are mostly pedantism warnings which we do not need to fix for
>> this release (would need tagging parameters as unused etc.)
>>
>> Samuel
>>
>
>



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-04 Thread Jeff Squyres

K.  Clear for a final rc / release?

(due to timezone differences, I'm going to assume yes -- I'll make an  
rc now and if all goes well, let's release tomorrow morning)



On Nov 4, 2009, at 6:41 AM, Samuel Thibault wrote:


Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:
> I'm still seeing a heap with the Intel compilers.

Yes, these are mostly pedantism warnings which we do not need to fix for
this release (would need tagging parameters as unused etc.)

Samuel




--
Jeff Squyres
jsquy...@cisco.com




Re: [hwloc-devel] Pgcc issues fixed?

2009-11-04 Thread Samuel Thibault
Chris Samuel wrote on Wed 04 Nov 2009 16:04:40 +1100:
> I'm still seeing a heap with the Intel compilers.

Yes, these are mostly pedantism warnings which we do not need to fix for
this release (would need tagging parameters as unused etc.)
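
For reference, "tagging parameters as unused" usually takes one of two
forms in C. A minimal sketch; EXAMPLE_UNUSED and the example_* functions
are illustrative names, and nothing here says which style hwloc would
pick.

```c
#include <stddef.h>

/* GCC-compatible compilers understand the "unused" attribute; elsewhere
 * the macro expands to nothing. */
#if defined(__GNUC__)
#define EXAMPLE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_UNUSED
#endif

/* Style 1: tag the parameter in the signature. */
static int example_scale(int value, void *context EXAMPLE_UNUSED)
{
    return value * 2;
}

/* Style 2: plain C, reference the parameter through a void cast. */
static int example_scale_portable(int value, void *context)
{
    (void)context; /* deliberately unused */
    return value * 2;
}
```

The attribute form keeps the intent visible in the prototype, while the
void cast works with any C compiler; whether a given compiler stops
warning with either form should be checked against that compiler.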

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Jeff Squyres
Thanks for your patience!  Samuel just committed a bunch more -- try  
the next one:


http://www.open-mpi.org/~jsquyres/unofficial/


On Nov 3, 2009, at 6:19 PM, Chris Samuel wrote:



- "Chris Samuel"  wrote:

> Will try PGI 7.0 now.

I can confirm it compiles OK with PGI 7.0, 7.1 and 7.2
with the same warnings as for 8.0.

These warnings also appear with 9.0.

Lots of warnings from the Intel v11 compilers,
I've attached a text file for the entire make
process.

GCC 4.4.2 compiles it without a warning (using -Werror
to catch any).  Likewise GCC 4.1.2 that comes with CentOS 5.

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency





--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Chris Samuel

- "Chris Samuel"  wrote:

> Will try PGI 7.0 now.

I can confirm it compiles OK with PGI 7.0, 7.1 and 7.2
with the same warnings as for 8.0.

These warnings also appear with 9.0.

Lots of warnings from the Intel v11 compilers,
I've attached a text file for the entire make
process.

GCC 4.4.2 compiles it without a warning (using -Werror
to catch any).  Likewise GCC 4.1.2 that comes with CentOS 5.

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


makelog-icc
Description: Binary data


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Chris Samuel

- "Chris Samuel"  wrote:

> Grabbing now, thanks!

Compiled OK with:

[csamuel@tango hwloc-0.9.1rc3r1276]$ pgcc -V

pgcc 8.0-6 64-bit target on x86-64 Linux -tp gh-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2009, STMicroelectronics, Inc.  All Rights Reserved.

Just these warnings:

/bin/sh ../libtool --tag=CC   --mode=compile pgcc -DHAVE_CONFIG_H -I. 
-I../include/private -I../include/hwloc  -I../include -I../include 
-I/usr/local/openmpi/1.3.3-pgi/include -I/usr/local/mpfr/2.4.1/include 
-I/usr/local/gmp/4.3.1/include  -I/usr/include/libxml2   -g -c -o cpuset.lo 
cpuset.c


libtool: compile:  pgcc -DHAVE_CONFIG_H -I. -I../include/private 
-I../include/hwloc -I../include -I../include 
-I/usr/local/openmpi/1.3.3-pgi/include -I/usr/local/mpfr/2.4.1/include 
-I/usr/local/gmp/4.3.1/include -I/usr/include/libxml2 -g -c cpuset.c  -fpic 
-DPIC -o .libs/cpuset.o 

   
PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 453)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 489)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 506)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #1 (cpuset.c: 507)
PGC/x86-64 Linux 8.0-6: compilation completed with warnings

Will try PGI 7.0 now.

-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Chris Samuel

- "Jeff Squyres"  wrote:

> Try this tarball:
> 
> http://www.open-mpi.org/~jsquyres/unofficial/hwloc-0.9.1rc3r1276.tar.bz2

Grabbing now, thanks!

Sorry for not seeing the email yesterday, it was a
public holiday here yesterday (Melbourne Cup Day, yes
we have a public holiday for a horse race!).

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Jeff Squyres

Try this tarball:

http://www.open-mpi.org/~jsquyres/unofficial/hwloc-0.9.1rc3r1276.tar.bz2

I cut it from the 0.9 branch with Samuel's latest commits (i.e., it's  
post rc3).


It should be fully bootstrapped and not require you to have the latest  
autotools.



On Nov 3, 2009, at 5:38 PM, Chris Samuel wrote:



- "Jeff Squyres (jsquyres)"  wrote:

> Pgcc issues fixed?

Sorry folks, have not got the SVN checkout to configure
yet due to it requiring newer tools than I have and am buried
trying to get board reports out at present.

Hopefully will have some time tomorrow 2-3pm my time.

If it's any help it's just PGI v8.0 (haven't tried
v7 yet) not the current v9 release.

Sorry about this..

Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency




--
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] Pgcc issues fixed?

2009-11-03 Thread Chris Samuel

- "Jeff Squyres (jsquyres)"  wrote:

> Pgcc issues fixed?

Sorry folks, have not got the SVN checkout to configure
yet due to it requiring newer tools than I have and am buried
trying to get board reports out at present.

Hopefully will have some time tomorrow 2-3pm my time.

If it's any help it's just PGI v8.0 (haven't tried
v7 yet) not the current v9 release.

Sorry about this..

Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency