Re: [hwloc-devel] towards PLPA-like API in 1.0

2009-11-11 Thread Samuel Thibault
Brice Goglin wrote on Mon 09 Nov 2009 15:18:11 +0100:
> I don't think we need SET_CPUBIND since (from what I understand) it
> would be equivalent to SET_PROC_CPUBIND | SET_THREAD_CPUBIND.

Being able to set one's own cpuset is not the same as being able to set
the cpuset of other processes or other threads.

We should also probably expose whether strict binding is available and
whether whole processes and single threads can be bound.

In short, I believe we should just expose exactly which parts of the
binding API will never return ENOSYS.  That would make the interface a
lot easier to understand and more straightforward to use, even though it
is very powerful and thus allows many binding combinations.  If an
application requires A, B and C and would be even happier if D and E
were available, it can prepare arguments for A, B and C, add arguments
for D and E if they are announced as available, and then call the
function.  That saves trying with D and E, then only with D, then only
with E, and finally falling back to using neither.
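
To illustrate the "prepare A, B and C, then add D and E if announced"
pattern, here is a minimal sketch.  Everything named example_* or
EXAMPLE_* is hypothetical and only stands in for whatever hwloc ends up
exposing; it is not the hwloc API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability record: a non-zero field means that this part of
 * the binding API is announced as never failing with ENOSYS here. */
struct example_bind_support {
  unsigned long set_thread;   /* binding a single thread */
  unsigned long set_process;  /* binding a whole process */
  unsigned long strict;       /* strict binding */
};

/* Hypothetical request flags, not the real hwloc policy values. */
enum example_bind_flags {
  EXAMPLE_BIND_THREAD  = 1 << 0,
  EXAMPLE_BIND_PROCESS = 1 << 1,
  EXAMPLE_BIND_STRICT  = 1 << 2
};

/* Start from what the application requires, then add each optional
 * feature only if it is announced as available. */
int example_build_flags(const struct example_bind_support *s,
                        int required, int optional)
{
  int flags = required;
  assert(s != NULL);
  if ((optional & EXAMPLE_BIND_THREAD) && s->set_thread)
    flags |= EXAMPLE_BIND_THREAD;
  if ((optional & EXAMPLE_BIND_PROCESS) && s->set_process)
    flags |= EXAMPLE_BIND_PROCESS;
  if ((optional & EXAMPLE_BIND_STRICT) && s->strict)
    flags |= EXAMPLE_BIND_STRICT;
  return flags;
}
```

The application then makes a single binding call with the resulting
flags instead of retrying with fewer and fewer options.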

> We'd have to keep in mind that 32bits in this flag bitmask is small
> (we'll likely need many other flags in the future, for instance
> IODEVICE_DISCOVERY, SET/GET_MEMBIND, SET/GET_RANGE_MEMBIND).

That's probably a hint that we should find another way :)

How about a function that returns a structure filled with one long
per feature (detection, thread bind, memory bind, etc.), part of
hwloc_topology_t?  That way we can extend it at will.
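
Roughly, something like the sketch below.  The sketch_* names and the
exact field list are made up for illustration; the only point is that a
structure with one long per feature can grow without running out of
bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical support structure: one long per feature. */
struct sketch_support {
  unsigned long discovery;
  unsigned long thread_bind;
  unsigned long memory_bind;
  /* new features append fields here; there is no 32-bit budget */
};

/* Hypothetical topology that embeds the support structure. */
struct sketch_topology {
  struct sketch_support support;
  /* rest of the topology state omitted */
};

/* Return a read-only view of what this topology supports. */
const struct sketch_support *sketch_get_support(const struct sketch_topology *t)
{
  assert(t != NULL);
  return &t->support;
}
```

A caller would test a field directly, e.g. whether
sketch_get_support(&topology)->memory_bind is non-zero.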

Samuel


[hwloc-devel] Create success (hwloc r1.0a1r1335)

2009-11-11 Thread MPI Team
Creating nightly hwloc snapshot SVN tarball was a success.

Snapshot:   hwloc 1.0a1r1335
Start time: Wed Nov 11 21:01:02 EST 2009
End time:   Wed Nov 11 21:02:54 EST 2009

Your friendly daemon,
Cyrador


Re: [hwloc-devel] towards PLPA-like API in 1.0

2009-11-11 Thread Samuel Thibault
Brice Goglin wrote on Thu 05 Nov 2009 17:22:15 +0100:
> + int hwloc_plpa_sched_getaffinity(pid_t pid, hwloc_cpuset_t cpuset);
> 
> It's just a hwloc_get_cpubind(), but we don't have it since it would not
> be supported on all OS. But I think we should add it anyway.

Being discussed in another thread.

> * Then we have all count-spec related API, which lets you look for
> information about all processors, or all online ones, or all offline ones.
> 
> If people are really interested in offline CPUs, they can look at the
> get_offline_cpuset below. There is no topology information about offline
> CPUs on Linux anyway,

And at least on some other OSes as well, but not on Solaris for
instance.

> + hwloc_cpuset_t hwloc_topology_get_offline_cpuset(hwloc_topology_t topology);
> 
> Returns a CPU set of existing CPUs that are offline, disabled by
> administrator, or unavailable to this process if we're restricting the
> topology to the process origin binding for instance. I am not sure we
> actually need to distinguish all these cases.

Mmm, I think there's one more thing that is actually more precise in
some way: "the CPUs that we don't provide topology objects for": some
OSes don't expose unauthorized CPUs even if they are online.  It would
thus also include CPUs which have explicitly been ignored because
HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM is not set.

I think it can be useful to provide the cpuset of allowed CPUs.  One
could imagine a tool that negotiates with the administrator's tools
which CPUs to add to or remove from the allowed set; knowing the whole
topology and which CPUs are allowed would be useful there.  Powering
CPUs up and down might be involved too, hence the offline mask as well.
The current binding is already available from hwloc_get_cpubind.

Actually, that gives me the idea that it'd be probably nice to somehow
show that in the graphical lstopo :)

Samuel


Re: [hwloc-devel] Pgcc issues fixed?

2009-11-11 Thread Samuel Thibault
Jeff Squyres wrote on Mon 09 Nov 2009 08:05:47 -0500:
> Fair enough.  What about if we have an AC check for  
> pthread_setaffinity_np and use that if it exists, and if it doesn't  
> use the PLPA way?

Err, remember that pthread_setaffinity_np alone doesn't allow binding
another process, and suffers from the same size parameter kludge (it was
introduced in 2003).

> BTW, how does pthread_setaffinity_np() work?  Does it check the  
> running kernel and ensure to do the Right Thing?

Like sched_setaffinity does, yes.

> That was definitely a problem in the past -- kernel and glibc would
> mismatch in terms of set/getaffinity (which was included in many
> distros).

They have been fixed at the same time, 2004-03-18.

Maybe what we can do is use PLPA's functions if glibc (as reported by
__GLIBC__ and __GLIBC_MINOR__) is older than the first version which is
known to be correct, or if CPU_SET can't be compiled, and rely on the
glibc functions otherwise.  Of course we have to rely on glibc in any
case for pthread_setaffinity_np().
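
A sketch of what the version test could look like.  The macro name
EXAMPLE_GLIBC_OLDER_THAN is made up, and the threshold is deliberately
left as a parameter since the first correct glibc release still has to
be looked up.  The "CPU_SET can't be compiled" half would be a separate
configure check and is not shown.

```c
#include <assert.h>  /* any libc header; with glibc this defines __GLIBC__ */

/* Hypothetical helper: true when building against a glibc older than
 * maj.min.  On non-glibc systems it is always false, so "do we have glibc
 * at all" must be tested separately with defined(__GLIBC__). */
#if defined(__GLIBC__) && defined(__GLIBC_MINOR__)
# define EXAMPLE_GLIBC_OLDER_THAN(maj, min) \
    (__GLIBC__ < (maj) || (__GLIBC__ == (maj) && __GLIBC_MINOR__ < (min)))
#else
# define EXAMPLE_GLIBC_OLDER_THAN(maj, min) 0
#endif

/* Intended use, with N being the first known-good minor version:
 *
 *   #if !defined(__GLIBC__) || EXAMPLE_GLIBC_OLDER_THAN(2, N)
 *     ... use the PLPA-style syscall wrappers ...
 *   #else
 *     ... use glibc's sched_setaffinity() ...
 *   #endif
 */

/* Sanity check: no glibc can be older than version 0.0. */
void example_self_check(void)
{
  assert(!EXAMPLE_GLIBC_OLDER_THAN(0, 0));
}
```

Comparing the major number first avoids treating a hypothetical 3.0 as
older than 2.N just because its minor number is smaller.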

Samuel


Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333

2009-11-11 Thread Samuel Thibault
Another way to go is to put, in hwloc_plpa_sched_setaffinity, some code
under #ifdef HWLOC_LINUX_SYS that calls the internal
hwloc_linux_set_tid_cpubind (with a strong comment that nobody else
should call it), so that

- existing Linux PLPA users keep the same behavior, but we can document
  that calling hwloc_plpa_sched_setaffinity with a nonzero pid portably
  means something only for single-threaded processes.
- non-Linux PLPA users are restricted to what really is portable.
- we don't have to cripple the hwloc interface (i.e. document that we
  accept non-portable input) just for the Linux-ish interface.
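
In code, the idea would be roughly the following.  This is only a
sketch: the shim_* functions are stubs standing in for the hwloc
internals (they just count calls), and SHIM_LINUX_SYS stands in for
HWLOC_LINUX_SYS.

```c
#include <assert.h>

#ifdef __linux__
# define SHIM_LINUX_SYS 1          /* stand-in for HWLOC_LINUX_SYS */
#endif

typedef unsigned long shim_cpuset_t; /* stand-in for hwloc_cpuset_t */
int shim_calls = 0;                  /* counts backend calls, for the demo */

#ifdef SHIM_LINUX_SYS
/* Stub for the internal Linux hook.  INTERNAL: nobody except the PLPA
 * compatibility wrapper below should ever call this. */
int shim_linux_set_tid_cpubind(long tid, shim_cpuset_t set)
{
  (void)tid;
  assert(set != 0);   /* an empty set cannot be bound */
  shim_calls++;
  return 0;
}
#else
/* Stub for the portable process-binding entry point. */
int shim_set_proc_cpubind(long pid, shim_cpuset_t set)
{
  (void)pid;
  assert(set != 0);
  shim_calls++;
  return 0;
}
#endif

/* PLPA-style wrapper: tid semantics on Linux only, portable elsewhere. */
int shim_plpa_sched_setaffinity(long pid, shim_cpuset_t set)
{
#ifdef SHIM_LINUX_SYS
  return shim_linux_set_tid_cpubind(pid, set);
#else
  return shim_set_proc_cpubind(pid, set);
#endif
}
```

The public hwloc functions never see a tid this way; only the
compatibility wrapper knows about the Linux peculiarity.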

Samuel


Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333

2009-11-11 Thread Samuel Thibault
Brice Goglin wrote on Thu 12 Nov 2009 00:31:48 +0100:
> The problem is that our hwloc/Linux does not implement
> set_proc_cpubind() so far.

But it can implement one that assumes that the target process is
single-threaded, i.e. either distinguish in hwloc_set_proc_cpubind
whether HWLOC_CPUBIND_PROCESS is set or not, or just pass the policy
flag as such to the OS hooks.

> * document in hwloc.h that it may bind a single thread if the
> application (wrongly) passes a tid

I'd really rather avoid even mentioning tids in the hwloc documentation
except saying "don't use that, it's not portable, don't even ask, you'd
be horrified".

> * document that hwloc_plpa_sched_setaffinity now works on processes
> instead of pids and that application should use thread_t and
> set_thread_cpubind for local threads

Or pass 0 to express "the current thread", which was already valid for
plpa_sched_setaffinity, and _is_ portable (and should already have been
the only thing that truly portable applications use).

> * maybe return -ENOSYS on Linux if STRICT is given?

I guess you mean returning 0 if STRICT is not given, with the meaning
"it's not strict because we haven't actually done it for all the
threads, or even not at all"?  I'd really rather not lie like this.

Samuel


Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333

2009-11-11 Thread Brice Goglin
Samuel Thibault wrote:
> bgog...@osl.iu.edu wrote on Wed 11 Nov 2009 11:33:31 -0500:
>   
>> +/** \brief Bind thread given by \p pid to CPU set \p cpuset.
>> + *
>> + * \note This function now manipulates hwloc cpusets.
>> + */
>> +static __inline int
>> +hwloc_plpa_sched_setaffinity(hwloc_topology_t topology, hwloc_pid_t pid, hwloc_cpuset_t cpuset)
>> +{
>> +  /* FIXME: should be set_thread_cpubind with a pid */
>> +  return hwloc_set_proc_cpubind(topology, pid, cpuset, 0);
>> +}
>> 
>
> That's one instance where the Linux interface is odd (it talks about a
> pid, but it's actually a thread, and there is no way to set the affinity
> mask of a whole process...) and I believe we shouldn't try to support
> all the cases.  I'd suggest binding a thread only when pid is 0.  If
> pid is not zero, that means that either the application is calling the
> linux-only gettid() or some other linux-only way to get the tid of a
> specific thread, or it assumes that the target is a single-threaded
> process and thus providing the pid of that process is enough to change
> its cpu affinity.  In that case we can use hwloc_set_proc_cpubind like
> already done above.  Same for getaffinity.
>   

The problem is that our hwloc/Linux does not implement
set_proc_cpubind() so far. That's why PLPA/hwloc reports that binding is
not supported.

What about we add set_proc_cpubind() support to Linux and:
* document in hwloc.h that it may bind a single thread if the
application (wrongly) passes a tid
* document that hwloc_plpa_sched_setaffinity now works on processes
instead of pids and that application should use thread_t and
set_thread_cpubind for local threads
* maybe return -ENOSYS on Linux if STRICT is given?

Brice



Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333

2009-11-11 Thread Samuel Thibault
bgog...@osl.iu.edu wrote on Wed 11 Nov 2009 11:33:31 -0500:
> +/** \brief Bind thread given by \p pid to CPU set \p cpuset.
> + *
> + * \note This function now manipulates hwloc cpusets.
> + */
> +static __inline int
> +hwloc_plpa_sched_setaffinity(hwloc_topology_t topology, hwloc_pid_t pid, hwloc_cpuset_t cpuset)
> +{
> +  /* FIXME: should be set_thread_cpubind with a pid */
> +  return hwloc_set_proc_cpubind(topology, pid, cpuset, 0);
> +}

That's one instance where the Linux interface is odd (it talks about a
pid, but it's actually a thread, and there is no way to set the affinity
mask of a whole process...) and I believe we shouldn't try to support
all the cases.  I'd suggest binding a thread only when pid is 0.  If
pid is not zero, that means that either the application is calling the
linux-only gettid() or some other linux-only way to get the tid of a
specific thread, or it assumes that the target is a single-threaded
process and thus providing the pid of that process is enough to change
its cpu affinity.  In that case we can use hwloc_set_proc_cpubind like
already done above.  Same for getaffinity.
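
A minimal sketch of that dispatch, with stub backends so that it can be
read on its own.  The compat_* names are hypothetical; the real wrapper
would call the hwloc thread and process binding functions instead of
these stubs.

```c
#include <assert.h>

typedef unsigned long compat_cpuset_t;   /* stand-in for hwloc_cpuset_t */

/* Records which backend was chosen, for demonstration only. */
enum compat_target { COMPAT_NONE, COMPAT_CURRENT_THREAD, COMPAT_PROCESS };
enum compat_target compat_last_target = COMPAT_NONE;

/* Stub: bind the calling thread. */
int compat_bind_current_thread(compat_cpuset_t set)
{
  (void)set;
  compat_last_target = COMPAT_CURRENT_THREAD;
  return 0;
}

/* Stub: bind a whole process, assumed single-threaded by the caller. */
int compat_bind_process(long pid, compat_cpuset_t set)
{
  assert(pid != 0);
  (void)set;
  compat_last_target = COMPAT_PROCESS;
  return 0;
}

/* pid == 0 means "self" and binds a thread; anything else is treated as
 * the pid of a process, never as a Linux tid. */
int compat_sched_setaffinity(long pid, compat_cpuset_t set)
{
  if (pid == 0)
    return compat_bind_current_thread(set);
  return compat_bind_process(pid, set);
}
```

The getaffinity side would follow the same split.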

Samuel


Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333

2009-11-11 Thread Samuel Thibault
bgog...@osl.iu.edu wrote on Wed 11 Nov 2009 11:33:31 -0500:
> +  /* FIXME: should be SET_THREAD_CPUBIND given with a pid */
> +  if (flags & HWLOC_SUPPORT_SET_PROC_CPUBIND)
> +    *api_type = HWLOC_PLPA_PROBE_OK;
> +  else
> +    *api_type = HWLOC_PLPA_PROBE_NOT_SUPPORTED;
> +  return 0;
> +}

Just to refine my thoughts: providing the interface but accepting only
pid 0 (i.e. "self") is however portability-wise ok and can already be
mapped to the current hwloc interface.

Samuel


Re: [hwloc-devel] towards PLPA-like API in 1.0

2009-11-11 Thread Samuel Thibault
Brice Goglin wrote on Wed 11 Nov 2009 17:34:30 +0100:
> Would it make sense to add support for the CPUBIND_THREAD policy in
> set_proc_cpubind? (and maybe rename it into set_pid_cpubind, or add
> set_thread_pid_cpubind).

I answered in the ticket itself: Linux 2.4 insanely mixed the notions of
pid and thread, and it's still a mess.  An application can't get a tid
without explicitly calling the Linux-only system call, while it could
just use the portable pthread_self() and such.

Samuel


Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330

2009-11-11 Thread Samuel Thibault
Jeff Squyres wrote on Wed 11 Nov 2009 05:54:00 -0800:
> How about HWLOC_UNSUPPORTED_SYS?

I don't think it's a good idea to make it a compile-time thing rather
than a run-time thing: if we expose to the application the fact that
the OS on which it is being built is not supported, it may disable some
parts of its internals.  If a newer version of hwloc that does support
the system gets installed later, the user would then have to recompile
the application to get that support compiled in.
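
For instance, the application side could look like the sketch below.
The two flag values are copied from the r1330 commit; rt_get_support is
only a stub standing in for hwloc_topology_get_support(), which takes a
topology rather than a flag word.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of two flag values from the r1330 commit. */
enum {
  RT_SUPPORT_DISCOVERY          = 1 << 0,
  RT_SUPPORT_SET_THREAD_CPUBIND = 1 << 2
};

/* Stub for the support query: "installed_flags" plays the role of whatever
 * the hwloc library installed on this machine is able to do. */
int rt_get_support(unsigned long installed_flags, unsigned long *flagsp)
{
  assert(flagsp != NULL);
  *flagsp = installed_flags;
  return 0;
}

/* Application side: the decision is taken at run time, so upgrading the
 * library is enough and the application needs no rebuild. */
int rt_app_can_bind_threads(unsigned long installed_flags)
{
  unsigned long flags = 0;
  if (rt_get_support(installed_flags, &flags) != 0)
    return 0;
  return (flags & RT_SUPPORT_SET_THREAD_CPUBIND) != 0;
}
```

With an #ifdef on a public UNSUPPORTED_SYS macro instead, the "cannot
bind" branch would be frozen into the application binary.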

Samuel


Re: [hwloc-devel] towards PLPA-like API in 1.0

2009-11-11 Thread Brice Goglin
I just pushed the PLPA stuff to trunk. One problem we have is that we
cannot report the binding capability in the PLPA interface. The reason
is that PLPA wants to bind a thread given by a pid (i.e. do
sched_setaffinity), while hwloc only has the ability to bind the current
thread, or to bind an entire process given by a pid.  Would it make
sense to add support for the CPUBIND_THREAD policy in set_proc_cpubind?
(and maybe rename it into set_pid_cpubind, or add
set_thread_pid_cpubind).

Brice



Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330

2009-11-11 Thread Brice Goglin
Jeff Squyres wrote:
> How about HWLOC_UNSUPPORTED_SYS?
>

We currently have a mix of internal LINUX/SOLARIS/../UNSUPPORTED_SYS and
one public HWLOC_LINUX_SYS. I could merge all of them into HWLOC_*_SYS,
sure.

Brice



Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330

2009-11-11 Thread Jeff Squyres

How about HWLOC_UNSUPPORTED_SYS?

On Nov 11, 2009, at 2:38 AM,  wrote:


Author: bgoglin
Date: 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
New Revision: 1330
URL: https://svn.open-mpi.org/trac/hwloc/changeset/1330

Log:
Add hwloc_topology_get_support()

This also defines UNSUPPORTED_SYS on not supported systems.

We should report the supported flags in lstopo -v -
but I don't want to modify all linux/tests/*.output now
since some flags may change in the near future.
So we'll see later.
Text files modified:
   trunk/configure.ac| 1 +
   trunk/doc/Makefile.am | 1 +
   trunk/include/hwloc.h |20 
   trunk/src/topology.c  |27 +++
   4 files changed, 49 insertions(+), 0 deletions(-)

Modified: trunk/configure.ac
===================================================================
--- trunk/configure.ac  (original)
+++ trunk/configure.ac  2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -203,6 +203,7 @@
 ;;
   *)
 AC_MSG_RESULT([Unsupported! ($target)])
+AC_DEFINE(UNSUPPORTED_SYS, 1, [Define to 1 on unsupported systems])
 AC_MSG_WARN([***])
 AC_MSG_WARN([*** hwloc does not support this system.])
 AC_MSG_WARN([*** hwloc will *attempt* to build (but it may not work).])


Modified: trunk/doc/Makefile.am
===================================================================
--- trunk/doc/Makefile.am   (original)
+++ trunk/doc/Makefile.am   2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -299,6 +299,7 @@
 $(DOX_MAN_DIR)/man3/hwloc_topology_export_xml.3 \
 $(DOX_MAN_DIR)/man3/hwloc_topology_flags_e.3 \
 $(DOX_MAN_DIR)/man3/hwloc_topology_get_depth.3 \
+$(DOX_MAN_DIR)/man3/hwloc_topology_get_support.3 \
 $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_all_keep_structure.3 \
 $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_type.3 \
 $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_type_keep_structure.3 \


Modified: trunk/include/hwloc.h
===================================================================
--- trunk/include/hwloc.h   (original)
+++ trunk/include/hwloc.h   2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -360,6 +360,26 @@
  */
 extern int hwloc_topology_set_xml(hwloc_topology_t __hwloc_restrict topology, const char * __hwloc_restrict xmlpath);


+/** \brief Flags describing the actual OS support for this topology.
+ *
+ * Flags are retrieved with hwloc_topology_get_support().
+ */
+enum hwloc_topology_support_flags_e {
+  /* \brief Topology discovery is supported. */
+  HWLOC_SUPPORT_DISCOVERY = (1<<0),
+  /* \brief Binding a process is supported. */
+  HWLOC_SUPPORT_SET_PROC_CPUBIND = (1<<1),
+  /* \brief Binding a thread is supported. */
+  HWLOC_SUPPORT_SET_THREAD_CPUBIND = (1<<2),
+  /* \brief Getting the binding of a process is supported. */
+  HWLOC_SUPPORT_GET_PROC_CPUBIND = (1<<3),
+  /* \brief Getting the binding of a thread is supported. */
+  HWLOC_SUPPORT_GET_THREAD_CPUBIND = (1<<4),
+};
+
+/** \brief Retrieve the OR'ed flags of topology support. */
+extern int hwloc_topology_get_support(hwloc_topology_t __hwloc_restrict topology, unsigned long *flags);
+
 /** @} */



Modified: trunk/src/topology.c
===================================================================
--- trunk/src/topology.c    (original)
+++ trunk/src/topology.c    2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)

@@ -1611,3 +1611,30 @@
 assert(obj->type == HWLOC_OBJ_PROC);
   }
 }
+
+int
+hwloc_topology_get_support(struct hwloc_topology * topology, unsigned long *flagsp)
+{
+  unsigned long flags = 0;
+#ifndef UNSUPPORTED_SYS
+  flags |= HWLOC_SUPPORT_DISCOVERY;
+#endif
+
+  /* if not is_thissystem, set_cpubind is fake
+   * and get_cpubind returns the whole system cpuset,
+   * so don't report that set/get_cpubind as supported
+   */
+  if (topology->is_thissystem) {
+    if (topology->set_proc_cpubind)
+      flags |= HWLOC_SUPPORT_SET_PROC_CPUBIND;
+    if (topology->set_thread_cpubind)
+      flags |= HWLOC_SUPPORT_SET_THREAD_CPUBIND;
+    if (topology->get_proc_cpubind)
+      flags |= HWLOC_SUPPORT_GET_PROC_CPUBIND;
+    if (topology->get_thread_cpubind)
+      flags |= HWLOC_SUPPORT_GET_THREAD_CPUBIND;
+  }
+
+  *flagsp = flags;
+  return 0;
+}
___
hwloc-svn mailing list
hwloc-...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-svn




--
Jeff Squyres
jsquy...@cisco.com