Re: [hwloc-devel] towards PLPA-like API in 1.0
Brice Goglin, on Mon 09 Nov 2009 15:18:11 +0100, wrote:
> I don't think we need SET_CPUBIND since (from what I understand) it
> would be equivalent to SET_PROC_CPUBIND | SET_THREAD_CPUBIND.

Being able to set one's own cpuset is not the same as being able to set the cpuset of other processes or other threads. We should also probably expose whether strict binding is available and whether whole processes and single threads can be bound.

In a word, I believe we should just expose exactly which parts of the binding API will never return ENOSYS. That would make the interface a lot easier to comprehend and much more straightforward to use, even though it is very powerful and can thus lead to a lot of binding possibilities. If an application has requirements A, B and C and would be even happier if D and E were available, it can prepare arguments for A, B and C, add arguments for D and E if they are announced as available, and then call the function. That saves trying with D and E, then only with D, then only with E, and eventually falling back to using neither.

> We'd have to keep in mind that 32bits in this flag bitmask is small
> (we'll likely need many other flags in the future, for instance
> IODEVICE_DISCOVERY, SET/GET_MEMBIND, SET/GET_RANGE_MEMBIND).

That's probably a hint that we should find another way :)

How about a function that returns a structure filled with one long per feature (detection, thread bind, memory bind, etc.), part of hwloc_topology_t? That way we can extend it at will.

Samuel
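To make the proposal above concrete, here is a minimal sketch of what "one long per feature" plus "prepare A, B and C, then add D and E if announced" could look like. Every name in it (struct support_sketch, the BIND_REQ_* values, build_bind_request) is hypothetical and invented for illustration; none of it is an existing hwloc interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure: one long per feature, so new fields can be
 * appended later without running out of bits in a 32-bit mask. */
struct support_sketch {
    long discovery;          /* topology detection works */
    long set_thread_cpubind; /* binding a single thread works */
    long set_proc_cpubind;   /* binding a whole process works */
    long strict_cpubind;     /* strict binding is available */
    long membind;            /* memory binding is available */
};

/* Hypothetical request bits, only meaningful inside this sketch. */
enum bind_request_sketch {
    BIND_REQ_THREAD = 1 << 0, /* hard requirement */
    BIND_REQ_STRICT = 1 << 1, /* optional extra */
    BIND_REQ_MEMORY = 1 << 2  /* optional extra */
};

/* Build the request in one pass from the announced support.
 * Returns -1 when the hard requirement is missing, so the caller never
 * has to try several combinations and fall back on ENOSYS. */
int build_bind_request(const struct support_sketch *s)
{
    int req;

    assert(s != NULL);
    if (!s->set_thread_cpubind)
        return -1;
    req = BIND_REQ_THREAD;
    if (s->strict_cpubind)
        req |= BIND_REQ_STRICT;
    if (s->membind)
        req |= BIND_REQ_MEMORY;
    return req;
}
```

The point of the design is that the application asks once what is available and then makes a single binding call, instead of probing by trial and error.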
[hwloc-devel] Create success (hwloc r1.0a1r1335)
Creating nightly hwloc snapshot SVN tarball was a success.

Snapshot: hwloc 1.0a1r1335
Start time: Wed Nov 11 21:01:02 EST 2009
End time: Wed Nov 11 21:02:54 EST 2009

Your friendly daemon,
Cyrador
Re: [hwloc-devel] towards PLPA-like API in 1.0
Brice Goglin, on Thu 05 Nov 2009 17:22:15 +0100, wrote:
> + int hwloc_plpa_sched_getaffinity(pid_t pid, hwloc_cpuset_t cpuset);
>
> It's just a hwloc_get_cpubind(), but we don't have it since it would not
> be supported on all OS. But I think we should add it anyway.

Being discussed in another thread.

> * Then we have all count-spec related API, which lets you look for
> information about all processors, or all online ones, or all offline ones.
>
> If people are really interested with offline CPUs, they can look at the
> get_offline_cpuset below. There is no topology information about offline
> CPUs on Linux anyway,

And at least on some other OSes as well, but not on Solaris for instance.

> + hwloc_cpuset_t hwloc_topology_get_offline_cpuset(hwloc_topology_t topology);
>
> Returns a CPU set of existing CPUs that are offline, disabled by
> administrator, or unavailable to this process if we're restricting the
> topology to the process origin binding for instance. I am not sure we
> actually need to distinguish all these cases.

Mmm, I think there's one more thing that is actually more precise in some way: "the CPUs that we don't provide topology objects for". Some OSes don't expose unauthorized CPUs even if they are online. It would thus also include CPUs which have explicitly been ignored because HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM is not set.

I think it can be useful to provide the cpuset of allowed CPUs. One could imagine a tool that negotiates with the administrator tools which CPUs are to be added to or removed from the allowed set; knowing the whole topology and which CPUs are allowed would be useful there. Maybe powering up/down could be involved too, thus the offline mask too. The current binding is already available from hwloc_get_cpubind.

Actually, that gives me the idea that it'd probably be nice to somehow show that in the graphical lstopo :)

Samuel
Re: [hwloc-devel] Pgcc issues fixed?
Jeff Squyres, on Mon 09 Nov 2009 08:05:47 -0500, wrote:
> Fair enough. What about if we have an AC check for
> pthread_setaffinity_np and use that if it exists, and if it doesn't
> use the PLPA way?

Err, remember that pthread_setaffinity_np alone doesn't allow binding another process, and suffers from the same size parameter kludge (it was introduced in 2003).

> BTW, how does pthread_setaffinity_np() work? Does it check the
> running kernel and ensure to do the Right Thing?

Like sched_setaffinity does, yes.

> That was definitely a problem in the past -- kernel and glibc would
> mismatch in terms of set/getaffinity (which was included in many
> distros).

They were fixed at the same time, on 2004-03-18.

Maybe what we can do is use PLPA's functions if __GLIBC__ is <= 2 and __GLIBC_MINOR__ is < the first version which is known to be correct, or if CPU_SET can't be compiled, and rely on the glibc functions otherwise. Of course we have to rely on glibc in any case for pthread_setaffinity_np().

Samuel
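The selection rule in the last paragraph can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name use_plpa_fallback and its parameters are made up, and the first glibc version known to be correct is deliberately left as an argument because the thread does not name it. In a real build, the first two arguments would come from __GLIBC__ and __GLIBC_MINOR__, and cpu_set_compiles from a configure-time test.

```c
#include <assert.h>

/* Decide whether to use PLPA's own affinity wrappers instead of glibc's.
 *
 * maj, min            glibc version being built against
 * good_maj, good_min  first glibc version known to be correct
 *                     (not stated in the thread, so passed in by the caller)
 * cpu_set_compiles    nonzero if a test program using CPU_SET compiled
 *
 * Returns 1 to use the PLPA way, 0 to rely on the glibc functions. */
int use_plpa_fallback(int maj, int min, int good_maj, int good_min,
                      int cpu_set_compiles)
{
    assert(good_maj >= 0 && good_min >= 0);

    /* glibc without a usable CPU_SET: do not trust its affinity API. */
    if (!cpu_set_compiles)
        return 1;

    /* Version strictly older than the first known-good release. */
    if (maj < good_maj || (maj == good_maj && min < good_min))
        return 1;

    return 0;
}
```

The comparison is written as "older major, or same major and older minor" so that it stays correct if the major number ever changes.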
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333
Another way to go is to put, in hwloc_plpa_sched_setaffinity, some code under #ifdef HWLOC_LINUX_SYS that calls the internal hwloc_linux_set_tid_cpubind (with a strong comment that nobody else should call it), so that:

- existing Linux PLPA users can have the same behavior, but we can document here that calling hwloc_plpa_sched_setaffinity with a non-zero pid portably means something only for single-threaded processes.
- non-Linux PLPA users are restricted to what really is portable.
- we don't have to cripple the hwloc interface (i.e. document that we accept non-portable input) just for the linuxish interface.

Samuel
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333
Brice Goglin, on Thu 12 Nov 2009 00:31:48 +0100, wrote:
> The problem is that our hwloc/Linux does not implement
> set_proc_cpubind() so far.

But it can implement one that assumes that the target process is single-threaded, i.e. by distinguishing, in hwloc_set_proc_cpubind, between HWLOC_CPUBIND_PROCESS being set or not, or by just passing the policy flag as such to the OS hooks.

> * document in hwloc.h that it may bind a single thread if the
> application (wrongly) passes a tid

I'd really rather avoid even mentioning tids in the hwloc documentation except saying "don't use that, it's not portable, don't even ask, you'd be horrified".

> * document that hwloc_plpa_sched_setaffinity now works on processes
> instead of pids and that application should use thread_t and
> set_thread_cpubind for local threads

Or pass 0 to express "the current thread", which was already valid for plpa_sched_setaffinity, and _is_ portable (and should already have been the only thing that truly portable applications use).

> * maybe return -ENOSYS on Linux if STRICT is given?

I guess you mean returning 0 if STRICT is not given, meaning "it's not strict because we haven't actually done it for all the threads, or even not at all"? I'd really rather not lie like this.

Samuel
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333
Samuel Thibault wrote:
> bgog...@osl.iu.edu, on Wed 11 Nov 2009 11:33:31 -0500, wrote:
>
>> +/** \brief Bind thread given by \p pid to CPU set \p cpuset.
>> + *
>> + * \note This function now manipulates hwloc cpusets.
>> + */
>> +static __inline int
>> +hwloc_plpa_sched_setaffinity(hwloc_topology_t topology, hwloc_pid_t pid, hwloc_cpuset_t cpuset)
>> +{
>> +  /* FIXME: should be set_thread_cpubind with a pid */
>> +  return hwloc_set_proc_cpubind(topology, pid, cpuset, 0);
>> +}
>>
>
> That's one instance where the Linux interface is odd (it talks about a
> pid, but it's actually a thread, and there is no way to set the affinity
> mask of a whole process...) and I believe we shouldn't try to support
> all the cases. I'd suggest to bind a thread only when pid is 0. If
> pid is not zero, that means that either the application is calling the
> linux-only gettid() or some other linux-only way to get the tid of a
> specific thread, or it assumes that the target is a single-threaded
> process and thus providing the pid of that process is enough to change
> its cpu affinity. In that case we can use hwloc_set_proc_cpubind like
> already done above. Same for getaffinity.
>

The problem is that our hwloc/Linux does not implement set_proc_cpubind() so far. That's why PLPA/hwloc reports that binding is not supported.

What if we add set_proc_cpubind() support to Linux and:
* document in hwloc.h that it may bind a single thread if the application (wrongly) passes a tid
* document that hwloc_plpa_sched_setaffinity now works on processes instead of pids and that applications should use thread_t and set_thread_cpubind for local threads
* maybe return -ENOSYS on Linux if STRICT is given?

Brice
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333
bgog...@osl.iu.edu, on Wed 11 Nov 2009 11:33:31 -0500, wrote:
> +/** \brief Bind thread given by \p pid to CPU set \p cpuset.
> + *
> + * \note This function now manipulates hwloc cpusets.
> + */
> +static __inline int
> +hwloc_plpa_sched_setaffinity(hwloc_topology_t topology, hwloc_pid_t pid, hwloc_cpuset_t cpuset)
> +{
> +  /* FIXME: should be set_thread_cpubind with a pid */
> +  return hwloc_set_proc_cpubind(topology, pid, cpuset, 0);
> +}

That's one instance where the Linux interface is odd (it talks about a pid, but it's actually a thread, and there is no way to set the affinity mask of a whole process...) and I believe we shouldn't try to support all the cases. I'd suggest binding a thread only when pid is 0. If pid is not zero, that means that either the application is calling the Linux-only gettid() or some other Linux-only way to get the tid of a specific thread, or it assumes that the target is a single-threaded process and thus that providing the pid of that process is enough to change its CPU affinity. In that case we can use hwloc_set_proc_cpubind as already done above. Same for getaffinity.

Samuel
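The dispatch suggested above (thread binding only for pid 0, process binding otherwise) could be sketched roughly as follows. To keep the example compilable on its own, the hwloc types and binding calls are replaced by stand-ins prefixed with sketch_; they are not the real hwloc functions and only record which path was taken.

```c
#include <assert.h>

/* Stand-ins for hwloc_pid_t and hwloc_cpuset_t (assumptions for the sketch). */
typedef int sketch_pid_t;
typedef unsigned long sketch_cpuset_t;

enum sketch_path { PATH_NONE, PATH_THREAD, PATH_PROC };
static enum sketch_path last_path = PATH_NONE;

/* Stand-in for binding the calling thread. */
static int sketch_set_thread_cpubind(sketch_cpuset_t set)
{
    (void)set;
    last_path = PATH_THREAD;
    return 0;
}

/* Stand-in for binding a whole process given by its pid. */
static int sketch_set_proc_cpubind(sketch_pid_t pid, sketch_cpuset_t set)
{
    (void)pid;
    (void)set;
    last_path = PATH_PROC;
    return 0;
}

/* pid == 0 means "the current thread", which is portable.
 * Any other pid is treated as a (single-threaded) process. */
int sketch_plpa_sched_setaffinity(sketch_pid_t pid, sketch_cpuset_t set)
{
    assert(pid >= 0);
    if (pid == 0)
        return sketch_set_thread_cpubind(set);
    return sketch_set_proc_cpubind(pid, set);
}
```

With this shape, a caller that passes a Linux tid gets process semantics rather than silently binding one thread, which matches the "don't support all the cases" position taken in the message.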
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1333
bgog...@osl.iu.edu, on Wed 11 Nov 2009 11:33:31 -0500, wrote:
> +  /* FIXME: should be SET_THREAD_CPUBIND given with a pid */
> +  if (flags & HWLOC_SUPPORT_SET_PROC_CPUBIND)
> +    *api_type = HWLOC_PLPA_PROBE_OK;
> +  else
> +    *api_type = HWLOC_PLPA_PROBE_NOT_SUPPORTED;
> +  return 0;
> +}

Just to refine my thoughts: providing the interface but accepting only pid 0 (i.e. "self") is, however, OK portability-wise and can already be mapped to the current hwloc interface.

Samuel
Re: [hwloc-devel] towards PLPA-like API in 1.0
Brice Goglin, on Wed 11 Nov 2009 17:34:30 +0100, wrote:
> Would it make sense to add support for the CPUBIND_THREAD policy in
> set_proc_cpubind? (and maybe rename it into set_pid_cpubind, or add
> set_thread_pid_cpubind).

I answered in the ticket itself: Linux 2.4 insanely mixed the notions of pid and thread, and it's still a mess. An application can't get a tid without explicitly calling the Linux-only system call, while it could just use the portable pthread_self() and such.

Samuel
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330
Jeff Squyres, on Wed 11 Nov 2009 05:54:00 -0800, wrote:
> How about HWLOC_UNSUPPORTED_SYS?

I don't think it's a good idea to make it a compile-time thing rather than a run-time thing: if we expose to the application the fact that the OS on which it is being built is not supported, it may disable some parts of its internals. If a newer version of hwloc that does support the system gets installed later, the user would then have to recompile the application to get that support compiled in.

Samuel
Re: [hwloc-devel] towards PLPA-like API in 1.0
I just pushed the PLPA stuff to trunk.

One problem we have is that we cannot report the binding capability in the PLPA interface. The reason is that PLPA wants to bind a thread given by a pid (i.e. do sched_setaffinity). hwloc only has the ability to bind the current thread, or to bind an entire process given by a pid.

Would it make sense to add support for the CPUBIND_THREAD policy in set_proc_cpubind? (and maybe rename it into set_pid_cpubind, or add set_thread_pid_cpubind).

Brice
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330
Jeff Squyres wrote:
> How about HWLOC_UNSUPPORTED_SYS?
>

We currently have a mix of internal LINUX/SOLARIS/../UNSUPPORTED_SYS and one public HWLOC_LINUX_SYS. I could merge all of them into HWLOC_*_SYS, sure.

Brice
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r1330
How about HWLOC_UNSUPPORTED_SYS?

On Nov 11, 2009, at 2:38 AM, wrote:

Author: bgoglin
Date: 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
New Revision: 1330
URL: https://svn.open-mpi.org/trac/hwloc/changeset/1330

Log:
Add hwloc_topology_get_support()

This also defines UNSUPPORTED_SYS on not supported systems.

We should report the supported flags in lstopo -v - but I don't want to modify all linux/tests/*.output now since some flags may change in the near future. So we'll see later.

Text files modified:
   trunk/configure.ac    |  1 +
   trunk/doc/Makefile.am |  1 +
   trunk/include/hwloc.h | 20
   trunk/src/topology.c  | 27 +++
   4 files changed, 49 insertions(+), 0 deletions(-)

Modified: trunk/configure.ac
==============================================================================
--- trunk/configure.ac (original)
+++ trunk/configure.ac 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -203,6 +203,7 @@
       ;;
     *)
       AC_MSG_RESULT([Unsupported! ($target)])
+      AC_DEFINE(UNSUPPORTED_SYS, 1, [Define to 1 on unsupported systems])
       AC_MSG_WARN([***])
       AC_MSG_WARN([*** hwloc does not support this system.])
       AC_MSG_WARN([*** hwloc will *attempt* to build (but it may not work).])

Modified: trunk/doc/Makefile.am
==============================================================================
--- trunk/doc/Makefile.am (original)
+++ trunk/doc/Makefile.am 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -299,6 +299,7 @@
         $(DOX_MAN_DIR)/man3/hwloc_topology_export_xml.3 \
         $(DOX_MAN_DIR)/man3/hwloc_topology_flags_e.3 \
         $(DOX_MAN_DIR)/man3/hwloc_topology_get_depth.3 \
+        $(DOX_MAN_DIR)/man3/hwloc_topology_get_support.3 \
         $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_all_keep_structure.3 \
         $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_type.3 \
         $(DOX_MAN_DIR)/man3/hwloc_topology_ignore_type_keep_structure.3 \

Modified: trunk/include/hwloc.h
==============================================================================
--- trunk/include/hwloc.h (original)
+++ trunk/include/hwloc.h 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -360,6 +360,26 @@
  */
 extern int hwloc_topology_set_xml(hwloc_topology_t __hwloc_restrict topology, const char * __hwloc_restrict xmlpath);

+/** \brief Flags describing the actual OS support for this topology.
+ *
+ * Flags are retrieved with hwloc_topology_get_support().
+ */
+enum hwloc_topology_support_flags_e {
+  /* \brief Topology discovery is supported. */
+  HWLOC_SUPPORT_DISCOVERY = (1<<0),
+  /* \brief Binding a process is supported. */
+  HWLOC_SUPPORT_SET_PROC_CPUBIND = (1<<1),
+  /* \brief Binding a thread is supported. */
+  HWLOC_SUPPORT_SET_THREAD_CPUBIND = (1<<2),
+  /* \brief Getting the binding of a process is supported. */
+  HWLOC_SUPPORT_GET_PROC_CPUBIND = (1<<3),
+  /* \brief Getting the binding of a thread is supported. */
+  HWLOC_SUPPORT_GET_THREAD_CPUBIND = (1<<4),
+};
+
+/** \brief Retrieve the OR'ed flags of topology support. */
+extern int hwloc_topology_get_support(hwloc_topology_t __hwloc_restrict topology, unsigned long *flags);
+
 /** @} */

Modified: trunk/src/topology.c
==============================================================================
--- trunk/src/topology.c (original)
+++ trunk/src/topology.c 2009-11-11 05:38:26 EST (Wed, 11 Nov 2009)
@@ -1611,3 +1611,30 @@
     assert(obj->type == HWLOC_OBJ_PROC);
   }
 }
+
+int
+hwloc_topology_get_support(struct hwloc_topology * topology, unsigned long *flagsp)
+{
+  unsigned long flags = 0;
+#ifndef UNSUPPORTED_SYS
+  flags |= HWLOC_SUPPORT_DISCOVERY;
+#endif
+
+  /* if not is_thissystem, set_cpubind is fake
+   * and get_cpubind returns the whole system cpuset,
+   * so don't report that set/get_cpubind as supported
+   */
+  if (topology->is_thissystem) {
+    if (topology->set_proc_cpubind)
+      flags |= HWLOC_SUPPORT_SET_PROC_CPUBIND;
+    if (topology->set_thread_cpubind)
+      flags |= HWLOC_SUPPORT_SET_THREAD_CPUBIND;
+    if (topology->get_proc_cpubind)
+      flags |= HWLOC_SUPPORT_GET_PROC_CPUBIND;
+    if (topology->get_thread_cpubind)
+      flags |= HWLOC_SUPPORT_GET_THREAD_CPUBIND;
+  }
+
+  *flagsp = flags;
+  return 0;
+}
___
hwloc-svn mailing list
hwloc-...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-svn

--
Jeff Squyres
jsquy...@cisco.com
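As a rough illustration of how an application might consume the OR'ed flags at run time, the sketch below checks a wanted set of features against the returned mask and reports which ones are missing. The enum values are copied from the r1330 diff quoted in this message so that the example compiles on its own; the helper missing_support is hypothetical and not part of hwloc, and the flags argument is assumed to be the value filled in by hwloc_topology_get_support().

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the flag values from the r1330 diff, for a standalone build. */
enum hwloc_topology_support_flags_e {
    HWLOC_SUPPORT_DISCOVERY          = (1 << 0),
    HWLOC_SUPPORT_SET_PROC_CPUBIND   = (1 << 1),
    HWLOC_SUPPORT_SET_THREAD_CPUBIND = (1 << 2),
    HWLOC_SUPPORT_GET_PROC_CPUBIND   = (1 << 3),
    HWLOC_SUPPORT_GET_THREAD_CPUBIND = (1 << 4)
};

/* Hypothetical helper: compare the wanted features with the reported ones.
 * Stores the wanted-but-unsupported bits in *missing.
 * Returns 0 when everything wanted is supported, -1 otherwise. */
int missing_support(unsigned long flags, unsigned long wanted,
                    unsigned long *missing)
{
    assert(missing != NULL);
    *missing = wanted & ~flags;
    return *missing == 0 ? 0 : -1;
}
```

Because the check happens at run time, an application built against one hwloc would pick up newly supported features from a later hwloc without being recompiled.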