Re: [hwloc-devel] Using hwloc to detect Hard Disks
On 24/09/14 00:57, Ralph Castain wrote:

> Memory info is available from lshw, though they are a GPL code:

FWIW on this laptop (Intel Haswell) lshw only reports DIMM info when run as root, which I suspect points to it accessing DMI information via /dev/mem. Using strace supports this:

3405 open("/dev/mem", O_RDONLY) = -1 EACCES (Permission denied)

FWIW dmidecode does the same:

samuel@haswell:~$ dmidecode
# dmidecode 2.12
/dev/mem: Permission denied

All the best,
Chris
--
Christopher Samuel        Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/      http://twitter.com/vlsci
Re: [hwloc-devel] GIT: hwloc branch master updated. 0e6fe307c10d47efee3fb95c50aee9c0f01bc8ec
On 30/03/14 02:04, Ralph Castain wrote:

> turns out that some linux distro's automatically set LS_COLORS in
> your environment when running old versions of csh/tcsh via their
> default dot files

For example, RHEL6 does this.

--
Christopher Samuel
Re: [hwloc-devel] lstopo - please add the information about the kernel to the graphical output
On 03/09/13 02:03, Jiri Hladky wrote:

> I vote for --append-legend

I like that too, though the idea of an additional undocumented --jirka option also appeals. :-)

--
Christopher Samuel
Re: [hwloc-devel] lstopo - please add the information about the kernel to the graphical output
On 28/08/13 02:19, Brice Goglin wrote:

> The problem I have while playing with this is that it takes a lot
> of space. Putting the entire uname on a single line will be
> truncated when the topology drawing isn't large (on machines with 2
> cores for instance). And using multiple lines would make the legend
> huge.

Would there be any benefit in inserting it into the EXIF information for the image (every time) instead? That way it would be accessible for those who need it (now and in the future) whilst not cluttering up the image.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] hwloc-distrib - please add the option to distribute the jobs in the reverse direction
On 27/08/13 00:07, Brice Goglin wrote:

> But there's a more general problem here, some people may want
> something similar for other cases. I need to think about it.

Something like a sort order perhaps, combined with some method to exclude or weight PUs based on some metrics (including a user defined weight)?

I had a quick poke around looking at /proc/irq/*/ and it would appear you can gather info about which CPUs are eligible to handle IRQs from the smp_affinity bitmask (or smp_affinity_list).

The node file there just "shows the node to which the device using the IRQ reports itself as being attached. This hardware locality information does not include information about any possible driver locality preference."

cheers,
Chris
--
Christopher Samuel
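As a rough illustration of the smp_affinity point in the message above, here is a minimal Python sketch that turns the text of an smp_affinity bitmask, or of an smp_affinity_list, into a list of CPU numbers. The function names are made up for this example, and it assumes the usual kernel formatting: comma-separated 32-bit hex groups with the most significant group first for the mask, and comma-separated numbers or ranges for the list. It is not hwloc code.

```python
def cpus_from_affinity_mask(mask_text):
    """Convert an smp_affinity-style hex bitmask into a sorted list of CPU numbers."""
    # Wide masks are assumed to be printed as comma-separated 32-bit hex
    # groups, most significant first, so dropping the commas leaves one
    # large hexadecimal number.
    value = int(mask_text.strip().replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus


def cpus_from_affinity_list(list_text):
    """Convert an smp_affinity_list-style string such as '0-3,8' into CPU numbers."""
    cpus = []
    for part in list_text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


if __name__ == "__main__":
    # Hypothetical IRQ number; reading the file may need appropriate permissions.
    with open("/proc/irq/1/smp_affinity") as handle:
        print(cpus_from_affinity_mask(handle.read()))
```

Both helpers take plain strings, so they can be tried on pasted values without touching /proc at all.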
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r5324 - branches/libpciaccess/doc
On 17/02/13 01:22, Jeff Squyres (jsquyres) wrote:

> No, it's not. RHEL6, for example, does have libpciaccess, but
> does not have a libpciaccess-dev (or devel). Ergo, you have to
> get externally.

It's in the server-optional repo:

[root@imgr ~]# yum info libpciaccess-devel
Loaded plugins: downloadonly, etckeeper, product-id, rhnplugin, security
Available Packages
Name        : libpciaccess-devel
Arch        : i686
Version     : 0.12.1
Release     : 1.el6
Size        : 11 k
Repo        : rhel-x86_64-server-optional-6
Summary     : PCI access library development package
License     : MIT
Description : Development package for libpciaccess.

Name        : libpciaccess-devel
Arch        : x86_64
Version     : 0.12.1
Release     : 1.el6
Size        : 11 k
Repo        : rhel-x86_64-server-optional-6
Summary     : PCI access library development package
License     : MIT
Description : Development package for libpciaccess.

If the optional repos are not enabled then you won't see it.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] libpci: GPL
/*
 * Trying to catch up with email, but I've not seen the question of
 * whether or not linking proprietary->BSD->GPL was OK or not addressed
 * yet.
 */

On 06/02/13 08:50, Jeff Squyres (jsquyres) wrote:

> It was just pointed out to me that libpci is licensed under the GPL
> (not the LGPL).
>
> Hence, even though hwloc is BSD, if it links to libpci.*, it's
> tainted.

I wouldn't say hwloc is tainted, more that you were tainting the GPL'd code by linking the proprietary code to it, but that's just a case of perspective. ;-)

After a brief search of the GPL FAQs I'd say the closest I can get is:

http://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html#GPLWrapper

# I'd like to incorporate GPL-covered software in my proprietary
# system. Can I do this by putting a “wrapper” module, under a
# GPL-compatible lax permissive license (such as the X11 license)
# in between the GPL-covered part and the proprietary part?
#
# No. The X11 license is compatible with the GPL, so you can add a
# module to the GPL-covered program and put it under the X11 license.
# But if you were to incorporate them both in a larger program, that
# whole would include the GPL-covered part, so it would have to be
# licensed as a whole under the GNU GPL.
#
# The fact that proprietary module A communicates with GPL-covered
# module C only through X11-licensed module B is legally irrelevant;
# what matters is the fact that module C is included in the whole.

So yes, if you want to permit proprietary code to link to hwloc then you need to stick to permissive licenses in hwloc's dependencies.

Disclaimer: IANAL, I don't play a lawyer on TV (or the Internet), batteries not included, caveat emptor, dates in calendar are closer than they appear, etc, etc, etc...

Of course it might be possible to ask the pciutils maintainer to split out libpci from pciutils and LGPL it.

Interestingly, Steam for Linux appears to have linked to libpci:

http://steamcommunity.com/app/221410/discussions/1/846938351130480716/

cheers!
Chris
--
Christopher Samuel
Re: [hwloc-devel] Cgroup resource limits
On 03/11/12 09:05, Ralph Castain wrote:

> System resource managers don't usually provide this capability, so we
> will do it at the ORTE level.

Interestingly one of the Torque developers posted this overnight:

http://www.supercluster.org/pipermail/torqueusers/2012-November/015183.html

# We are interested in incorporated cgroups into TORQUE. One
# of the things that is delaying it is that we haven't found
# a good library to manage the cgroups - it is obviously a much
# larger project if we have to write such a library ourselves,
# and also much harder to maintain. Does anyone know of a good
# library for cgroups?

So I've pointed them at this thread and strongly encouraged them to get involved.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] Cgroup resource limits
On 06/11/12 13:01, Ralph Castain wrote:

> Depends on the use-case. If you are going to direct-launch the
> processes (e.g., using srun), then you are correct.

Yup.

> However, that isn't the case in other scenarios. For example, if
> you get an allocation and then use mpirun to launch your job, you
> definitely do *not* want the RM setting the cgroup constraints as
> the RM only launches the orteds - it never sees the MPI procs. The
> constraints are to apply to the individual procs as separate
> entities - if you apply them to the orteds, then all procs will be
> constrained to the same container. Ick.

That's not been my experience recently; for instance Torque currently creates a cpuset for your job containing all the procs you've been given there and then you can use mpirun/mpiexec to launch orted across all the nodes you've been given. Those processes are then constrained to the allocation set up on each node. They are free to bind themselves to the cores present within that cpuset should they so wish.

In the very beginning (when I was at VPAC and when we were using MVAPICH2 rather than Open MPI) Torque would bind processes to a core within the allocation, which worked fine for that, but of course broke in the way you explain when we moved to Open MPI. I fixed that bug up very quickly.. ;-)

We've only ever run Slurm on BlueGene where this isn't an issue, so I don't know if that does things differently.

> Similarly, if you are running MapReduce, your application has to
> figure out what nodes to run on, how much memory will be required,
> etc. All that goes into the allocation request (made by the
> equivalent of mpirun in that scenario) sent to the RM. Again, the
> orteds need to set those constraints on a per-process basis.

But for the scheduler to be able to plan workload well I believe that once your job has started the best you can do is ask for less than you have been given, otherwise you're free to game the system by queuing a short small job and, once it's started, asking for many more cores or RAM.. :-)

> So we need the capability in ORTE to support the non-direct-launch
> cases.

I'm pretty sure we're agreeing here, just in different ways of expressing ourselves.. :-)

cheers!
Chris
--
Christopher Samuel
Re: [hwloc-devel] Cgroup resource limits
On 06/11/12 01:43, Ralph Castain wrote:

> On Nov 4, 2012, at 7:28 PM, Christopher Samuel
> <sam...@unimelb.edu.au> wrote:
>
>> I would argue that the resource managers *should* be doing it
>
> No argument from me - I would love for them to provide me with an
> easy API that mpirun can use to specify the requirements for a
> given application.

Wouldn't it be the other way around, with the resource manager setting limits and then having the job run inside it? Basically like the current cpuset support in Torque, et al., but on steroids.

That way mpirun and/or orted could learn from the kernel the details of the cgroup it is in and arrange itself appropriately.

I believe that Slurm has some support for cgroups already:

http://www.schedmd.com/slurmdocs/cgroups.html

[memcg performance]

> Yick! However, I would expect the community to reduce that impact
> over time. If systems don't want that capability, then they can
> and should disable it. On the other hand, if they do want it, then
> we want to support it.

Indeed!

cheers,
Chris
--
Christopher Samuel
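To make "learn from the kernel the details of the cgroup it is in" a little more concrete, here is a minimal Python sketch that parses the contents of /proc/self/cgroup into a controller-to-path mapping. It assumes the "hierarchy-ID:controller-list:cgroup-path" line format; the function name and the sample paths are invented for illustration, and this is not how ORTE, Torque or Slurm actually do it.

```python
def parse_proc_cgroup(text):
    """Map each controller name to its cgroup path, given /proc/<pid>/cgroup content."""
    membership = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is assumed to look like "hierarchy-ID:controller-list:cgroup-path".
        # Split at most twice so that any ':' inside the path is preserved.
        _hierarchy_id, controllers, path = line.split(":", 2)
        # One hierarchy can carry several comma-separated controllers.
        for controller in controllers.split(","):
            membership[controller] = path
    return membership


# Hypothetical content, as a job launched inside a per-job cpuset might see it.
SAMPLE = (
    "4:memory:/torque/1234.server\n"
    "3:cpuset:/torque/1234.server\n"
    "2:cpu,cpuacct:/\n"
)

if __name__ == "__main__":
    with open("/proc/self/cgroup") as handle:
        for controller, path in sorted(parse_proc_cgroup(handle.read()).items()):
            print(controller or "(unnamed)", path)
```

A launcher could use such a mapping to find its cpuset or memory cgroup directory and read the limits set there by the resource manager, rather than setting them itself.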
Re: [hwloc-devel] Cgroup resource limits
On 03/11/12 09:05, Ralph Castain wrote:

> System resource managers don't usually provide this capability, so
> we will do it at the ORTE level.

I would argue that the resource managers *should* be doing it - however, I will also argue that the resource managers should be doing it via hwloc (so I'm afraid it's not an out for you folks :-) ).

It's also worth remembering that the memcg code has an appalling reputation with the kernel developers in terms of performance overhead. For instance, at the recent Kernel Summit numbers were reported showing a substantial impact from just having the code present, even when not used.

Following that, a patch set was sent out trying to avoid that impact when memcg is not in use, which doesn't help here but does give a measure of the performance hit:

http://lwn.net/Articles/517562/

# So as one can see, the difference between base and nomemcg in terms
# of both system time and elapsed time is quite drastic, and consistent
# with the figures shown by Mel Gorman in the Kernel summit. This is a
# ~7 % drop in performance, just by having memcg enabled. memcg
# functions appear heavily in the profiles, even if all tasks lives in
# the root memcg.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] backends and plugins
On 07/08/12 19:02, Brice Goglin wrote:

> Aside from the main "discover" callback, backends may also define
> some callbacks to be invoked when new object are created. The main
> example is Linux creating "OS devices" when a new "PCI device" is
> added by the PCI backend.

That could also be useful to some folks for non-PCI devices, say if a CPU gets hotplugged in/out (or more likely added/removed from a cpuset/cgroup you're in).

--
Christopher Samuel
Re: [hwloc-devel] [PATCH] Use plain "inline" in C++
On 10/05/12 07:40, Jeff Squyres wrote:

> Huh -- really? I always thought that the C++ language itself
> included the keyword "inline".

I asked via Twitter and got these responses..

# Inline was part of C++98 - the first c++ standard, and
# the inline kwd is in the cfront 1.0 ('86) source. So
# functionally, yes.

...and...

# This may be a different question than "have all C++
# compilers always accepted inline?"

I note that autoconf has an inline test for C:

http://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/C-Compiler.html

But not for C++:

http://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/C_002b_002b-Compiler.html

So perhaps the fact that they've never needed to implement such a test is in itself a good guide?

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] lstopo-nox strikes back
On 26/04/12 02:35, Brice Goglin wrote:

> I think I would vote for lstopo (no X/cairo) and lstopo so
> that completion helps.

Not sure if that's an option with Debian given the policy; the hwloc package would have to have lstopo with X enabled and then a nox package would install that variant of lstopo and use the alternatives system to select which to use.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] lstopo-nox strikes back
On 25/04/12 23:44, Jeffrey Squyres wrote:

> FWIW: Having lstopo plugins for output would obviate the need for
> having two executable names.

IIRC that's generally handled via the alternatives system (or diversions if you don't like alternatives) in Debian/Ubuntu.

--
Christopher Samuel
Re: [hwloc-devel] interoperability with X displays
On 30/03/12 01:08, Brice Goglin wrote:

> * The code uses NVIDIA's apparently-open-source nvctrl library. The
> lib is unfortunately only built as a static lib in at least debian
> and ubuntu (without -fPIC), which is annoying.

I don't see that reported as a bug in the BTS, so I'd suggest reporting it and seeing what happens.

http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=nvidia-settings

cheers!
Chris
--
Christopher Samuel
Re: [hwloc-devel] Fwd: BGQ empty topology with MPI
On 26/03/12 17:14, Brice Goglin wrote:

> Thanks, that would explain such a strange behavior.

Not a problem.

> For the record, you can run "lstopo -v" or even "lstopo -.xml" to
> get more info, especially machine attributes.

OK, please find attached both lstopo -v (with debug enabled) and also the XML file requested. This is BG/P, not BG/Q of course!

cheers!
Chris
--
Christopher Samuel

could not open /proc/cpuinfo

 * CPU cpusets *

cpu 0 (os 0) has cpuset 0x0001
cpu 1 (os 1) has cpuset 0x0002
cpu 2 (os 2) has cpuset 0x0004
cpu 3 (os 3) has cpuset 0x0008
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0xf...f complete 0x000f online 0xf...f allowed 0xf...f nodeset 0x0 completeN 0x0 allowedN 0xf...f
  PU#0 cpuset 0x0001
  PU#1 cpuset 0x0002
  PU#2 cpuset 0x0004
  PU#3 cpuset 0x0008

Restrict topology cpusets to existing PU and NODE objects

Propagate offline and disallowed cpus down and up

Propagate nodesets
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f
  PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001
  PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002
  PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004
  PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008

Removing unauthorized and offline cpusets from all cpusets

Removing disallowed memory according to nodesets
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f
  PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001
  PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002
  PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004
  PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008

Removing ignored objects
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f
  PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001
  PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002
  PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004
  PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008

Removing empty objects except numa nodes and PCI devices
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f
  PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001
  PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002
  PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004
  PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008

Removing objects whose type has HWLOC_IGNORE_TYPE_KEEP_STRUCTURE and have only one child or are the only child
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f
  PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001
  PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002
  PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004
  PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008

Add default object sets

Ok, finished tweaking, now connect
Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n02.pcf.vlsci.unimelb.edu.au Architecture=BGP) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f nodeset 0xf...f completeN 0xf...f allowedN
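The "Restrict topology cpusets to existing PU and NODE objects" step in the debug output above can be pictured with a few lines of Python. This is only an illustration of the bitmask arithmetic, using the mask values printed in the log; the function name is invented and it is not a description of how hwloc implements the step internally.

```python
# Per-PU cpusets as printed in the debug output above.
PU_CPUSETS = {0: 0x0001, 1: 0x0002, 2: 0x0004, 3: 0x0008}

# Python's -1 has every bit set, which stands in for the "0xf...f" (infinite) mask.
ALL_CPUS = -1


def restrict_to_existing(machine_cpuset, pu_cpusets):
    """Keep only the bits of machine_cpuset that belong to some known PU."""
    union = 0
    for mask in pu_cpusets.values():
        union |= mask
    return machine_cpuset & union


restricted = restrict_to_existing(ALL_CPUS, PU_CPUSETS)
print(hex(restricted))  # 0xf
```

The four single-bit PU masks OR together to 0xf, which matches the 0x000f machine cpuset reported after the restriction step.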
Re: [hwloc-devel] Fwd: BGQ empty topology with MPI
On 25/03/12 17:43, Brice Goglin wrote:

> But it'd be good to understand what's going on in /sys on this
> machine. And I still don't understand why MPI changes things here.

My guess (looking at the BG/P CNK kernel code) is that /sys is not present on a BG/Q compute node, only on its I/O nodes (which run a Linux kernel), and so the code is only picking them up when the I/O is being redirected via an I/O node (i.e. when MPI is in play).

Now I'd have thought that would happen with or without MPI, but who knows..

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] Fwd: BGQ empty topology with MPI
On 25/03/12 09:04, Daniel Ibanez wrote:

> Additional printfs confirm that with MPI in the code,
> hwloc_accessat succeeds on the various /sys/ directories, but the
> overall procedure for getting PUs from these fails. Without MPI,
> access to /sys/ directories fails but the fallback
> hwloc_setup_pu_level works.

Sounds like your I/O with MPI is getting redirected to the I/O node (and hence finding /sys from the Linux kernel there) but when you're running without MPI it's trying to open files on the compute node and the CNK isn't presenting the /sys directories, causing it to fall back.

I've run lstopo on our BG/P and I get to see the 4 cores there whether it's the stock code or if I add an MPI_Init() to the start.

The output from lstopo when built with --enable-debug confirms it's reporting kernel and hostname info from the I/O node associated with the block:

Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.16.60-304 OSVersion=1 HostName=r00-m1-n04.pcf.vlsci.unimelb.edu.au Architecture=BGP)
[...]

It might be interesting to build something like ls with the BG/Q compilers to see if you can run it on a compute node to see what /proc or /sys look like in each case.

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] PCI device name question
On 21/03/12 08:07, Brice Goglin wrote:

> New patch attached, it doesn't add port numbers for non-IB
> devices.

Extract from lstopo on SGI XE270 box with Mellanox dual port IB card:

PCIBridge
  PCI 15b3:673c
    Net L#2 "ib1"
    Net L#3 "ib0"
    OpenFabrics L#4 "mlx4_0"

Looks OK to me.

--
Christopher Samuel
Re: [hwloc-devel] BGQ empty topology with MPI
On 22/03/12 20:58, Brice Goglin wrote:

> So there's something strange going on when MPI is added. Which MPI
> are using? Is this a derivative of MPICH that embeds hwloc? (MPICH
> >= 1.2.1 if I remember correctly)

Not sure about BG/Q, but BG/P uses code derived from MPICH2 according to:

http://wiki.bg.anl-external.org/index.php/Main_Page

Our BG/P seems to claim it's from MPICH2 1.1:

samuel@tambo:~> mpicc -v
mpicc for 1.1

cheers,
Chris
--
Christopher Samuel
Re: [hwloc-devel] BGQ empty topology with MPI
On 22/03/12 01:08, Daniel Ibanez wrote:

> Attached is the stderr and stdout from lstopo compiled as you
> said.

Interesting, so it's not correctly detecting the topology, as BG/Q is 16 compute cores, each with 4 hardware threads. Instead it's detecting all 64 hardware threads and treating them as cores, if I'm reading that right.

I was puzzled by the OS info output too, it says:

Machine#0(Backend=Linux OSName=CNK OSRelease=2.6.32-220.el6.bgq110_20120104.ppc64 OSVersion=1 HostName=R00-ID-J04.i2b.cetus Architecture=) cpuset 0xf...f complete 0x,0x online 0xf...f allowed 0xf...f nodeset 0x0 completeN 0x0 allowedN 0xf...f

However, looking at the (open) source code for the CNK [1] (at least for BG/P) the uname info seems to be derived from the I/O nodes when it's running in CIOD mode, so I suspect that's what's happening here (looks like a RHEL6 derived kernel from that).

> I can't run hwloc-gather-topology.sh on the compute nodes since its
> a script, but I can run it on the front end node.

For those unfamiliar with BlueGene (at least P, and I suspect the same is true for Q), this is because the CNK doesn't implement fork() or execve(); the compute nodes are designed to start your code and just keep running it until it dies.

[1] - http://wiki.bg.anl-external.org/index.php/Cnk

cheers!
Chris
--
Christopher Samuel
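For readers who want to see the expected shape of the topology described above (16 cores, each with 4 hardware threads, rather than 64 flat cores), here is a small Python sketch that groups 64 hardware thread numbers into cores. It assumes that the threads of one core are numbered consecutively, so that thread N belongs to core N // 4; that numbering is an assumption made for this illustration, not something confirmed in the thread, and the function name is invented.

```python
CORES = 16            # from the message: BG/Q has 16 compute cores
THREADS_PER_CORE = 4  # from the message: 4 hardware threads per core


def group_pus_into_cores(pu_indexes, threads_per_core=THREADS_PER_CORE):
    """Group hardware thread numbers into cores.

    Assumes (for illustration only) that the threads of a core are numbered
    consecutively, i.e. thread N sits on core N // threads_per_core.
    """
    cores = {}
    for pu in sorted(pu_indexes):
        cores.setdefault(pu // threads_per_core, []).append(pu)
    return cores


cores = group_pus_into_cores(range(CORES * THREADS_PER_CORE))
for core, pus in cores.items():
    print("Core#%d: PUs %s" % (core, pus))
```

A correct detection would report something like this two-level grouping, whereas the output discussed in the message has all 64 threads at a single level.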
Re: [hwloc-devel] BGQ empty topology with MPI
On 21/03/12 13:37, Daniel Ibanez wrote:

> Please let me know if theres a hint of what could be causing it,
> where to post, and what info to provide.

Are you running Linux or CNK on the compute nodes for this?

cheers!
Chris
--
Christopher Samuel
Re: [hwloc-devel] hwloc-1.3.1 assertion failures on Linux/POWER7
On 02/02/12 10:38, Paul H. Hargrove wrote:

> With that out of the way, I am please to say that when configuring
> hwloc-1.3.1 with "CFLAGS=-qhalt=e" the correct variant of
> sched_setaffinity() is detected. This gets rid of the messages
> regarding sched_setaffinity() at build time, and the related test
> failures w/ SIGSEGV.

That fixes up the segmentation faults for me too with 1.4.

--
Christopher Samuel
Re: [hwloc-devel] hwloc-1.3.1 assertion failures on Linux/POWER7
On 01/02/12 12:56, Paul H. Hargrove wrote:

> When running "make check" in hwloc-1.3.1 on a Linux/POWER7 system I see:

Doesn't seem to happen on Power6 (SLES10SP4) with GCC for hwloc 1.3.1 or 1.4.

With XLC, both 1.3.1 and 1.4 give plenty of warnings whilst compiling (compile logs for both attached) and then 4 failures in make check (accompanied by segmentation faults):

samuel@tambo:~/HWLOC/hwloc-1.3.1> grep -B1 FAIL: log
/bin/sh: line 1: 5267 Segmentation fault ${dir}$tst
FAIL: hwloc_bind
/bin/sh: line 1: 5285 Segmentation fault ${dir}$tst
FAIL: hwloc_get_last_cpu_location
--
/bin/sh: line 1: 5335 Segmentation fault ${dir}$tst
FAIL: hwloc_is_thissystem
--
/bin/sh: line 1: 5481 Segmentation fault ${dir}$tst
FAIL: glibc-sched

samuel@tambo:~/HWLOC/hwloc-1.4> grep -B1 FAIL: log
/bin/sh: line 1: 16973 Segmentation fault ${dir}$tst
FAIL: hwloc_bind
/bin/sh: line 1: 16991 Segmentation fault ${dir}$tst
FAIL: hwloc_get_last_cpu_location
--
/bin/sh: line 1: 17073 Segmentation fault ${dir}$tst
FAIL: hwloc_is_thissystem
--
/bin/sh: line 1: 17229 Segmentation fault ${dir}$tst
FAIL: glibc-sched

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

Making all in src
make[1]: Entering directory `/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/src'
CC topology.lo
"/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute.
CC traversal.lo
"/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute.
CC distances.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. "distances.c", line 62.42: 1506-404 (W) restrict can only qualify a pointer type. "distances.c", line 84.50: 1506-404 (W) restrict can only qualify a pointer type. "distances.c", line 226.40: 1506-404 (W) restrict can only qualify a pointer type. CC topology-synthetic.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. CC topology-xml.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. CC bind.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. CC cpuset.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. CC misc.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. CC topology-linux.lo "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28: 1506-1385 (W) The attribute "pure" is not a valid type attribute. "topology-linux.c", line 303.33: 1506-280 (W) Function argument assignment between types "unsigned int" and "struct {...}*" is not allowed. "topology-linux.c", line 303.27: 1506-098 (E) Missing argument(s). "topology-linux.c", line 391.32: 1506-280 (W) Function argument assignment between types "unsigned int" and "struct {...}*" is not allowed. "topology-linux.c", line 391.26: 1506-098 (E) Missing argument(s). "topology-linux.c", line 715.40: 1506-280 (W) Function argument assignment between types "unsigned int" and "struct {...}*" is not allowed. "topology-linux.c", line 715.34: 1506-098 (E) Missing argument(s). 
"topology-linux.c", line 807.40: 1506-280 (W) Function argument assignment between types "unsigned int" and "struct {...}*" is not allowed. "topology-linux.c", line 807.34: 1506-098 (E) Missing argument(s). CCLD libhwloc.la make[1]: Leaving directory `/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/src' Making all in include make[1]: Entering directory `/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include' make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include' Ma
Re: [hwloc-devel] [hwloc-announce] hwloc v1.4rc2 released
On 23/01/12 20:34, Brice Goglin wrote:

> Please test and let us know if you find any issue.

I will hopefully get a chance to test this out tomorrow night on Arch Linux (so generally newer tools than in usual distros, plus 3.2.1 kernel) if that's a help? Don't want to hold things up unnecessarily.

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] Something lighter-weight than XML?
On 02/09/11 19:54, Jeff Squyres wrote:

> Fail enough,

Nice Freudian slip. :-)

> but do the back-end nodes have libxml?

Apparently so..

rpm --root /bgsys/drivers/ppcfloor/linux/OS -qa | grep -i xml
libxml2-2.6.23-22

That's the I/O node filesystem, which is what the compute node kernel maps I/O back to, I believe. Mind you, most people on BG link statically, as dynamic linking is rather new there.

> For us to do what we want, it would need to be available on
> all nodes because the OMPI orted processes would be querying
> hwloc for the local topology and then sending it to the "head"
> node process (usually mpirun) for further analysis and process
> mapping.

Umm, not sure that'll work on a BG because you can't fork() or execve() there; the IBM mpirun runs on the login node and talks to an mpirund on the service node, which then launches the user's code on the compute nodes via the Navigator API.

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] Something lighter-weight than XML?
On 02/09/11 02:01, Jeff Squyres wrote:

> Blue Gene?

Well BG/P doesn't support Open-MPI, but the service (management) node and the front end (login) nodes are PPC SLES10 and libxml2 is there..

tambo-m:~ # rpm -q libxml2
libxml2-2.6.23-15.25.5

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] Something lighter-weight than XML?
On 02/09/11 01:30, Jeff Squyres wrote:

> Is there any chance that a lighter-weight, simple string
> parsing module could be added to hwloc?

What about something based on YAML?

http://www.yaml.org/spec/1.2/spec.html

Designed to be easy to read by a human..

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 04/07/11 19:18, Brice Goglin wrote:

> Christopher, it should work starting with trunk r3535.

Looks good to me with hwloc-1.3a1r3537. :-)

Thanks!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 17/06/11 17:14, Brice Goglin wrote:

> Is there anyway to tell autoconf to ignore/drop some cached results?
> Resetting the value of ac_cv_lib_pci_pci_lookup_name may work but I
> don't know if we can do that (if we can, we should do it for
> pci_init/pci_cleanup vs -lz too).

Good guess Brice! This patch to configure from that nightly snapshot gets it working again:

--- configure.old 2011-06-17 11:01:25.0 +1000
+++ configure 2011-06-17 17:59:23.0 +1000
@@ -10321,6 +10321,7 @@ if test "x$ac_cv_lib_resolv_inet_ntoa" = x""yes; then :
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pci_lookup_name in -lpci" >&5
 $as_echo_n "checking for pci_lookup_name in -lpci... " >&6; }
+unset ac_cv_lib_pci_pci_lookup_name
 if test "${ac_cv_lib_pci_pci_lookup_name+set}" = set; then :
 $as_echo_n "(cached) " >&6
 else

thus:

checking pci/pci.h usability... yes
checking pci/pci.h presence... yes
checking for pci/pci.h... yes
checking for pci_init in -lpci... yes
checking for pci_lookup_name in -lpci... no
checking for inet_ntoa in -lresolv... yes
checking for pci_lookup_name in -lpci... yes
checking whether PCI_LOOKUP_NO_NUMBERS is declared... yes
checking for pci_find_cap in -lpci... yes
checking whether struct pci_dev has a device_class field... yes
checking whether struct pci_dev has a domain field... yes

and it compiles and builds an lstopo which includes PCI info!

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
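The one-line patch works because the generated check looks at a shell cache variable before it runs the link test: if ac_cv_lib_pci_pci_lookup_name is already set (here to "no", from the first attempt without -lresolv), the second check is skipped and the stale answer is reused. The snippet below is only a rough sketch of that "${var+set}" idiom outside of autoconf. The names check_cached and my_cv_demo are invented for illustration, and the "probe" is just an assignment standing in for a real link test.

```shell
# check_cached VARNAME
# Mimics the shape of an autoconf-generated cache test (illustrative
# only, not code taken from configure):
#   - if VARNAME is already set, reuse it and report "(cached) <value>";
#   - otherwise "run" the probe (here: simply assign "yes") and report
#     the fresh result.
check_cached() {
    if eval "test \"\${$1+set}\" = set"; then
        eval "echo \"(cached) \$$1\""
    else
        eval "$1=yes"
        eval "echo \"\$$1\""
    fi
}

# Example use:
#   my_cv_demo=no               # stale result from an earlier check
#   check_cached my_cv_demo     # reuses the stale value
#   unset my_cv_demo            # what the patch does
#   check_cached my_cv_demo     # probe runs again
```

Whether unsetting cache variables by hand is the right long-term fix is a separate question; the sketch only shows why the second "checking for pci_lookup_name" line can change its answer once the variable is cleared.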
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 14/06/11 00:17, Jeff Squyres wrote:

> Heh; next time you might want to compress. :-)

Er, I think I forgot - it was late here.. ;-)

> I got the logs; I'm looking at them now...

Cool.

> Are you sure that the config.log you sent matches
> the makelog.txt? I see PCI checks in the configure
> stdout, but nothing about that in config.log...

Sigh, I think I ended up building in two different locations somehow - as I said, it was late. :-) I'll fix that up and send it on again..

Mea culpa, mea maxima culpa.. :-)

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 13/06/11 23:11, Jeff Squyres wrote:

> Chris -- could you send your config.log?

Posted but currently held for moderation due to size..

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 13/06/11 23:11, Jeff Squyres wrote:

> Chris -- could you send your config.log?

Sure - attached to this email, along with a "makelog" which contains the output of configure && make V=1.

> I'd like to see how it's deciding that libpci is ok
> in configure if we cannot possibly ever link properly...

I suspect that this symbol is getting pulled in only when it is being incorporated into the shared library, or perhaps when pci_lookup_name() is being called?

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake.

It was created by hwloc configure 1.3a1, which was generated by GNU Autoconf 2.65. Invocation command line was

$ ./configure

## - ## ## Platform. ## ## - ##

hostname = bruce-m.vlsci.unimelb.edu.au
uname -m = x86_64
uname -r = 2.6.18-238.5.1.el5
uname -s = Linux
uname -v = #1 SMP Fri Apr 1 18:41:58 EDT 2011

/usr/bin/uname -p = unknown
/bin/uname -X = unknown

/bin/arch = x86_64
/usr/bin/arch -k = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo = unknown
/bin/machine = unknown
/usr/bin/oslevel = unknown
/bin/universe = unknown

PATH: /usr/kerberos/sbin
PATH: /usr/kerberos/bin
PATH: /usr/local/sbin
PATH: /usr/local/bin
PATH: /sbin
PATH: /bin
PATH: /usr/sbin
PATH: /usr/bin
PATH: /opt/xcat/bin
PATH: /opt/xcat/sbin
PATH: /root/bin

## --- ## ## Core tests.
## ## --- ## configure:2841: checking build system type configure:2855: result: x86_64-unknown-linux-gnu configure:2875: checking host system type configure:2888: result: x86_64-unknown-linux-gnu configure:2908: checking target system type configure:2921: result: x86_64-unknown-linux-gnu configure:2965: checking for a BSD-compatible install configure:3033: result: /usr/bin/install -c configure:3044: checking whether build environment is sane configure:3094: result: yes configure:3235: checking for a thread-safe mkdir -p configure:3274: result: /bin/mkdir -p configure:3287: checking for gawk configure:3303: found /bin/gawk configure:3314: result: gawk configure:3325: checking whether make sets $(MAKE) configure:3347: result: yes configure:3422: checking how to create a ustar tar archive configure:3435: tar --version tar (GNU tar) 1.15.1 configure:3438: $? = 0 configure:3478: tardir=conftest.dir && eval tar --format=ustar -chf - "$tardir" >conftest.tar configure:3481: $? = 0 configure:3485: tar -xf - &5 gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. configure:3914: $? = 0 configure:3903: gcc -v >&5 Using built-in specs. Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --disable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux Thread model: posix gcc version 4.1.2 20080704 (Red Hat 4.1.2-50) configure:3914: $? 
= 0 configure:3903: gcc -V >&5 gcc: '-V' option must have argument configure:3914: $? = 1 configure:3903: gcc -qversion >&5 gcc: unrecognized option '-qversion' gcc: no input files configure:3914: $? = 1 configure:3934: checking whether the C compiler works configure:3956: gccconftest.c >&5 configure:3960: $? = 0 configure:4009: result: yes configure:4012: checking for C compiler default output file name configure:4014: result: a.out configure:4020: checking for suffix of executables configure:4027: gcc -o conftestconftest.c >&5 configure:4031: $? = 0 configure:4053: result: configure:4075: checking whether we are cross compiling configure:4083: gcc -o conftestconftest.c >&5 configure:4087: $? = 0 configure:4094: ./conftest configure:4098: $? = 0 configure:4113: result: no configure:4118: checking for suffix of object files configure:4140: gcc -c conftest.c >&5 configure:4144: $? = 0 configure:4165: result: o configure:4169: checking whether we
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 13/06/11 22:45, Jeff Squyres wrote:

> Ah, that might explain it, then.

Yeah.. :-(

> I guess this means we need to add a few configure tests
> to figure out the dependencies of libpci (if any). Yuck.

Indeed.

> Do we have any idea what function in libpci is calling the resolver
> functionality?

Complete guesswork from nm -l /usr/lib64/libpci.a, this is the object module where __res_query() is referenced:

names-net.o:
     r .LC0
000b r .LC1
0072 r .LC10
     r .LC11
0087 r .LC12
0030 r .LC13
0010 r .LC2
001a r .LC3
002e r .LC4
003a r .LC5
0041 r .LC6
004d r .LC7
005e r .LC8
0064 r .LC9
     U _GLOBAL_OFFSET_TABLE_
     U __h_errno_location
     U __memcpy_chk
     U __res_init
     U __res_query
     U __sprintf_chk
     U __stack_chk_fail
     U __strdup
     t dns_skip_name
     U pci_get_param
0060 T pci_id_net_lookup
     b resolver_inited.6042

Now that defines pci_id_net_lookup, and that appears to be referenced here:

names.o:
     r .LC0
0003 r .LC1
0028 r .LC10
004f r .LC11
0054 r .LC12
005b r .LC13
0062 r .LC14
006c r .LC15
007b r .LC16
0081 r .LC17
0086 r .LC18
008c r .LC19
000e r .LC2
0092 r .LC20
0098 r .LC21
009e r .LC22
00a6 r .LC23
00b1 r .LC24
001a r .LC3
0029 r .LC4
002f r .LC5
0039 r .LC6
     r .LC7
0046 r .LC8
004e r .LC9
     U _GLOBAL_OFFSET_TABLE_
     U __snprintf_chk
     U __sprintf_chk
     U __stack_chk_fail
0160 t format_name
     t format_name_pair
0250 t id_lookup
0380 t id_lookup_subsys
     U pci_id_cache_dirty
     U pci_id_cache_load
     U pci_id_insert
     U pci_id_lookup
     U pci_id_net_lookup
     U pci_load_name_list
0460 T pci_lookup_name
     U pci_mfree
     U snprintf

That defines pci_lookup_name() and that is called from hwloc here:

$ grep -R pci_lookup_name .
./src/topology-libpci.c:/* starting from pciutils 2.2, pci_lookup_name() takes a variable number
./src/topology-libpci.c:resname = pci_lookup_name(pciaccess, name, sizeof(name),
./src/topology-libpci.c:resname = pci_lookup_name(pciaccess, name, sizeof(name),
./src/topology-libpci.c:resname = pci_lookup_name(pciaccess, name, sizeof(name),

So my guess is that it's the fact that we only have a static library that's causing the linker to pull in all the symbols, whether needed or not. :-(

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
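The manual walk through the nm listings (which object needs __res_query, which object provides pci_id_net_lookup, and so on) can be partly scripted. The following is a minimal sketch, assuming the default BSD-style nm output where undefined symbols appear as "U name" and defined ones as "address type name"; the helper names undefined_syms and defined_syms are invented here and are not part of pciutils or hwloc.

```shell
# undefined_syms: read "nm" output on stdin and print the names of
# undefined ("U") symbols, one per line. Member headers such as
# "names-net.o:" are ignored because their first field is not "U".
undefined_syms() {
    awk '$1 == "U" { print $2 }'
}

# defined_syms: read "nm" output on stdin and print the names of symbols
# defined in the text section (type "T" or "t"), one per line.
defined_syms() {
    awk 'NF >= 3 && ($2 == "T" || $2 == "t") { print $3 }'
}

# Example use (path as quoted in the message above):
#   nm /usr/lib64/libpci.a | undefined_syms | sort -u | grep res_
#   nm /usr/lib64/libpci.a | defined_syms | grep pci_lookup_name
```

Comparing the undefined list against the symbols exported by the other libraries on the link line is one way to guess which extra -l flag a static archive needs.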
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 13/06/11 22:22, Jeff Squyres wrote:

> Chris -- does nm on your libpci not show this?

Nope, there is no libpci.so* on RHEL5.6, just a libpci.a. (and no, I've no idea why either!)

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 13/06/11 19:41, Brice Goglin wrote:

> libpci needs -lz in some cases, maybe it needs more.
> However, in the case of -lz, we failed to configure
> if we didn't add -lz to AC_CHECK_LIB. Your configure
> works fine, right?

Yup, configure works just fine and does add -lz. Very odd!

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 12/06/11 15:45, Christopher Samuel wrote:

> I *suspect* it's being pulled in by libpci - here:
>
> $ nm /usr/lib/libpci.a | grep res_query
>          U __res_query

OK, looks like libpci may well be the culprit. Linking with the default libtool command line includes the symbol:

$ /bin/sh ../libtool --tag=CC --mode=link gcc -g -fvisibility=hidden -I/usr/include/libxml2 -fvisibility=hidden -I/usr/include/libxml2 -I/tmp/HWLOC/hwloc-1.3a1r3511/include -no-undefined -version-number 0:0:0 -lxml2 -lz -lm -lpci -o libhwloc.la -rpath /usr/local/lib topology.lo traversal.lo distances.lo topology-synthetic.lo bind.lo cpuset.lo misc.lo topology-xml.lo topology-libpci.lo topology-linux.lo topology-x86.lo
libtool: link: rm -fr .libs/libhwloc.la .libs/libhwloc.lai .libs/libhwloc.so .libs/libhwloc.so.0 .libs/libhwloc.so.0.0.0
libtool: link: gcc -g -shared .libs/topology.o .libs/traversal.o .libs/distances.o .libs/topology-synthetic.o .libs/bind.o .libs/cpuset.o .libs/misc.o .libs/topology-xml.o .libs/topology-libpci.o .libs/topology-linux.o .libs/topology-x86.o -lxml2 -lz -lm -lpci -Wl,-soname -Wl,libhwloc.so.0 -o .libs/libhwloc.so.0.0.0
libtool: link: (cd ".libs" && rm -f "libhwloc.so.0" && ln -s "libhwloc.so.0.0.0" "libhwloc.so.0")
libtool: link: (cd ".libs" && rm -f "libhwloc.so" && ln -s "libhwloc.so.0.0.0" "libhwloc.so")
libtool: link: ( cd ".libs" && rm -f "libhwloc.la" && ln -s "../libhwloc.la" "libhwloc.la" )

$ grep -R __res_query .
Binary file ./.libs/libhwloc.so matches
Binary file ./.libs/libhwloc.so.0.0.0 matches
Binary file ./.libs/libhwloc.so.0 matches

Removing the -lpci results in a library which no longer references it..

$ /bin/sh ../libtool --tag=CC --mode=link gcc -g -fvisibility=hidden -I/usr/include/libxml2 -fvisibility=hidden -I/usr/include/libxml2 -I/tmp/HWLOC/hwloc-1.3a1r3511/include -no-undefined -version-number 0:0:0 -lxml2 -lz -lm -o libhwloc.la -rpath /usr/local/lib topology.lo traversal.lo distances.lo topology-synthetic.lo bind.lo cpuset.lo misc.lo topology-xml.lo topology-libpci.lo topology-linux.lo topology-x86.lo
libtool: link: rm -fr .libs/libhwloc.la .libs/libhwloc.lai .libs/libhwloc.so .libs/libhwloc.so.0 .libs/libhwloc.so.0.0.0
libtool: link: gcc -g -shared .libs/topology.o .libs/traversal.o .libs/distances.o .libs/topology-synthetic.o .libs/bind.o .libs/cpuset.o .libs/misc.o .libs/topology-xml.o .libs/topology-libpci.o .libs/topology-linux.o .libs/topology-x86.o -lxml2 -lz -lm -Wl,-soname -Wl,libhwloc.so.0 -o .libs/libhwloc.so.0.0.0
libtool: link: (cd ".libs" && rm -f "libhwloc.so.0" && ln -s "libhwloc.so.0.0.0" "libhwloc.so.0")
libtool: link: (cd ".libs" && rm -f "libhwloc.so" && ln -s "libhwloc.so.0.0.0" "libhwloc.so")
libtool: link: ( cd ".libs" && rm -f "libhwloc.la" && ln -s "../libhwloc.la" "libhwloc.la" )

$ grep -R __res_query .
$

So it's a system library issue - over to you folks! :-)

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 12/06/11 11:42, Christopher Samuel wrote:

> CCLD lstopo
> /tmp/hwloc-1.3a1r3511/src/.libs/libhwloc.so: undefined reference to
> `__res_query'

For the record this happens with system GCC & GCC 4.4, Intel compilers and PGI compilers on RHEL 5.6 and CentOS 5.6 (both on our SGI CentOS system and a CentOS box run by another HPC center).

I've even tried with ld from binutils 2.20.1 which was on our CentOS system with the same result.

Links OK on SLES9, SLES10 and Ubuntu 11.04.

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc trunk nightly 1.3a1r3511 fails to build on CentOS 5.6 & RHEL 5.6
On 12/06/11 11:55, Samuel Thibault wrote:

> Could you look for this in your /usr/include?

$ grep -R __res_query /usr/include/
/usr/include/resolv.h:#define res_query __res_query
/usr/include/resolv.h:#define res_querydomain __res_querydomain

$ grep -wR res_query /usr/include/
/usr/include/resolv.h:#define res_query __res_query
/usr/include/resolv.h:int res_query (const char *, int, int, u_char *, int) __THROW;

The symbol itself is only in the .so files, not in any of the object modules, so it looks like it's something being pulled in as a link-time dependency by libtool/ld.

$ grep -R res_query .
Binary file ./src/.libs/libhwloc.so.0.0.0 matches
Binary file ./src/.libs/libhwloc.so matches
Binary file ./src/.libs/libhwloc.so.0 matches

> I fail to see how that symbol can ever get into
> libhwloc.so, as we don't do any network thing at all...

I *suspect* it's being pulled in by libpci - here:

$ nm /usr/lib/libpci.a | grep res_query
         U __res_query

$ rpm -qf /usr/lib/libpci.a
pciutils-devel-3.1.7-3.el5

Oddly that's listed as being in the library on Ubuntu 11.04 too, but it's not ending up in the libhwloc.so on that platform.

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
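The grep output from resolv.h explains why the link error names __res_query even though source code would call res_query(): the header renames it with a macro. As a rough sketch, such simple renaming macros can be listed from any header text; the helper name macro_aliases is made up for this example, and it deliberately only handles plain "#define name target" lines (it will also print ordinary constants, so filter the output).

```shell
# macro_aliases: read C header text on stdin and print "name target" for
# every macro definition consisting of exactly "#define name target".
# Illustrative helper only; multi-token or multi-line definitions are
# skipped rather than parsed.
macro_aliases() {
    awk '$1 == "#define" && NF == 3 { print $2, $3 }'
}

# Example use (header path as quoted in the message above):
#   macro_aliases < /usr/include/resolv.h | grep res_query
```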
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v1.2rc1 released
On 07/04/11 03:17, Brice Goglin wrote:

> The Hardware Locality (hwloc) team is pleased to
> announce the first release candidate of v1.2:

I get the following warnings when doing a "make check" on SuSE SLES10 on PPC64 and RHEL5.6 on x86-64:

CC libhwloc_ports_la-topology-windows.lo
topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc':
topology-windows.c:323: warning: assignment from incompatible pointer type
topology-windows.c:328: warning: assignment from incompatible pointer type
CC libhwloc_ports_la-topology-darwin.lo
CC libhwloc_ports_la-topology-freebsd.lo
topology-freebsd.c: In function 'hwloc_freebsd_set_thread_cpubind':
topology-freebsd.c:126: warning: passing argument 3 of 'pthread_setaffinity_np' from incompatible pointer type
topology-freebsd.c: In function 'hwloc_freebsd_get_thread_cpubind':
topology-freebsd.c:150: warning: passing argument 3 of 'pthread_getaffinity_np' from incompatible pointer type

On Ubuntu 10.04 I get the same set (slightly different diagnostics):

CC libhwloc_ports_la-topology-windows.lo
topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc':
topology-windows.c:323: warning: assignment from incompatible pointer type
topology-windows.c:328: warning: assignment from incompatible pointer type
CC libhwloc_ports_la-topology-darwin.lo
CC libhwloc_ports_la-topology-freebsd.lo
topology-freebsd.c: In function 'hwloc_freebsd_set_thread_cpubind':
topology-freebsd.c:126: warning: passing argument 3 of 'pthread_setaffinity_np' from incompatible pointer type
/usr/include/pthread.h:448: note: expected 'const struct cpu_set_t *' but argument is of type 'cpuset_t *'
topology-freebsd.c: In function 'hwloc_freebsd_get_thread_cpubind':
topology-freebsd.c:150: warning: passing argument 3 of 'pthread_getaffinity_np' from incompatible pointer type
/usr/include/pthread.h:453: note: expected 'struct cpu_set_t *' but argument is of type 'cpuset_t *'

Are these ignorable?

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] AMD fusion & hwloc
On 07/04/11 07:05, Brice Goglin wrote:

> Actually, I am not so familiar with all these marketing
> names, so I don't really know if the "Fusion" name applies
> to Bulldozer chips, or only to the already available
> laptop-like Bobcat chips.

According to the Wikipedia page:

http://en.wikipedia.org/wiki/AMD_Fusion

currently Fusion (14h) is Bobcat only, with possible Bulldozer-based systems in 2012 (and with an "enhanced Bobcat" somewhere in between).

Linux support for these systems may still be in progress: MCE & Oprofile support was merged in 2.6.36, but support for the 14h and 12h family temperature sensors was only merged in 2.6.38.

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] upcoming releases
On 31/03/11 07:13, Brice Goglin wrote:

> Comments?

Sounds reasonable to me.

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
[hwloc-devel] Fwd: Fw: How to get cache sizes on AIX 6.1 ?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

Hi folks,

I spotted the comment recently about not knowing how to determine cache sizes on AIX and so I pinged (pung?) a friend of mine at IBM in case he could find out.

Looks like he had some success (attached).. :-)

cheers!
Chris

- Original Message
Subject: Fw: How to get cache sizes on AIX 6.1 ?
Date: Thu, 17 Feb 2011 15:28:24 +1100
From: Peter Zutenis <zute...@au1.ibm.com>
To: sam...@unimelb.edu.au

Chris,

The attached bit of code may help you. I hope it is what you are after.

P.S. Must curry soon..

*Peter Zutenis *
Senior I/T Specialist - pSeries, powerVM and AIX
Systems and Technology Group
IBM Australia Ltd
60 City Road Southgate, 3006
Ph: 61-3-9626-6841 Mobile: 0413 274 855
Email: zute...@au1.ibm.com <http://www.ibm.com/systems/power/>

- - Forwarded by Peter Zutenis/Australia/IBM on 17/02/2011 15:26 -
From: Peter Farrell/Australia/IBM
To: Peter Zutenis/Australia/IBM@IBMAU
Date: 17/02/2011 12:24
Subject: Re: Fw: How to get cache sizes on AIX 6.1 ?
-
Peter,

It's not all my own work, a modification of something I found on the Internet. Tested on Power4 but I see no reason why it shouldn't work on the other boxes.

Peter F.

From: Peter Zutenis/Australia/IBM
To: Peter Farrell/Australia/IBM@IBMAU
Date: 16/02/2011 04:01 PM
Subject: Fw: How to get cache sizes on AIX 6.1 ?
-
Hi Pete,

Hope all is well. Got this interesting request from a client of mine. Do you have any idea how I can pursue this within IBM ?

Cheers,

*Peter Zutenis *
Senior I/T Specialist - pSeries, powerVM and AIX
Systems and Technology Group
IBM Australia Ltd
60 City Road Southgate, 3006
Ph: 61-3-9626-6841 Mobile: 0413 274 855
Email: zute...@au1.ibm.com <http://www.ibm.com/systems/power/>

- - Forwarded by Peter Zutenis/Australia/IBM on 16/02/2011 16:00 -
From: Christopher Samuel <sam...@unimelb.edu.au>
To: Peter Zutenis/Australia/IBM@IBMAU
Date: 15/02/2011 14:04
Subject: How to get cache sizes on AIX 6.1 ?
-
Hiya Peter!

Long time no speak (or tweet!).

The hwloc folks were asked:

$ btw: are there any plans to fully support POWER6
$ and/or POWER7 running AIX6.1 for the future?
$ Actually we can get the topology right but cache
$ sizes are missing.

Their response was:

# obj->attr->cache.size = 0; /* TODO: ? */
#
# :)
#
# I don't know which AIX API can provide it.

Would you have any idea how to get the cache size programmatically from user space on AIX ?

cheers!
Chris

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1cudIACgkQO2KABBYQAh9QnACfafgXOxJcs2PlTBCQmIG2rBKG
uKQAniWz1qWHkLwBi+dBHj4FnmqZ7cNl
=z5mY
-END PGP SIGNATURE-

qcpu.c Description: Binary data
Re: [hwloc-devel] get cpu where a process/thread is executing
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 15/02/11 00:56, Jeff Squyres wrote: > Maybe get_current_cpuset? get_recent_cpuset ? - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk1Z6/EACgkQO2KABBYQAh/gOQCfUgYq/ECuUpEuTWAjraGSWBV8 2IIAniW1ZNp4lQTuo5sr/y8zU69eHhmL =nXpo -END PGP SIGNATURE-
Re: [hwloc-devel] CMake instead of m4
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 14/02/11 05:04, LdK wrote: > Why don't you use CMake instead of autoconf/automake > old couple as build system ? Any extra dependency that's needed for a piece of software to be installed decreases its attractiveness to potential users and (very important to hwloc) integrators. The benefit to sysadmins such as myself of packages using autotools is that they should work with whatever the system has already and not require another package to be installed. Whenever we come across a piece of software we need to install here that uses CMake there's a collective sigh of "oh no, not again".. Think of using autotools as a way of increasing your karma by taking a little bit more pain in return for decreasing a whole lot more sysadmins' pain.. ;-) cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk1YpbAACgkQO2KABBYQAh8kjwCeJEpjJ+qEX2nvWewyfryvoAIg MsMAoJLDPS9aGcNkNoFzS/OcLpwvi6YV =SI2F -END PGP SIGNATURE-
[hwloc-devel] Images from lstopo slightly truncated width wise when in cpuset
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 An edge case and purely cosmetic, but I just ran lstopo on a compute node through Torque in a single CPU job with X11 forwarding and found that the width of the image is not quite sufficient to fit all the text in, specifically the RAM info on the right hand socket is partly truncated. I've attached an OK and a truncated image to this for info generated with v1.1. cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0KplkACgkQO2KABBYQAh894gCeP9AZq1APX+38xPqwiSb83Nv6 MusAoJamUkdkehd3/7heNWhN/iUZ0LD8 =p4ks -END PGP SIGNATURE-
Re: [hwloc-devel] 1.1rc4 released
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 16/12/10 18:12, Brice Goglin wrote: > Le 16/12/2010 02:29, Christopher Samuel a écrit : >> make check fails on our CentOS 5.4 box: > > We can safely ignore this failure. Can you change the exit line at the > end of tests/linux/gather/test-gather-topology.sh.in into "exit 0", > reconfigure and rerun make check to see if anything else fails? Looks fine, other than the expected "XFAILED (38, Function not implemented)" stuff. > In your case, the problem is related to gather-topology.sh not gathering > all the Linux cpuset/cgroup info properly (we would need to parse > /proc/mounts in gather-topology.sh). I will try to fix this in 1.2 (I am > opening a ticket about it). We don't need to delay 1.1 because of this, > so I will ignore the failure. Makes good sense! cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0Kk4UACgkQO2KABBYQAh+yXQCbBtnumpt0N0AnglL0d9o2lt5T YJ0AoJJFuKiuWYXvUFOIAf9ozDR1Vv4B =m7Bb -END PGP SIGNATURE-
Re: [hwloc-devel] 1.1rc4 released
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 14/12/10 05:42, Jeff Squyres wrote: > Please test! > > http://www.open-mpi.org/projects/hwloc/ > http://www.open-mpi.org/software/hwloc/v1.1/ > make check fails on our CentOS 5.4 box: Saving current system topology to XML... Saving current system topology to a tarball... Hierarchy gathered in /tmp/tmp.pXNgB29823/save.tar.bz2 and kept in /tmp/tmp.PUwJU29842/save/ Expected topology output stored in /tmp/tmp.pXNgB29823/save.output Extracting tarball... Saving tarball topology to XML... Comparing XML outputs... - --- save.xml2010-12-16 12:25:39.0 +1100 +++ save2.xml 2010-12-16 12:25:42.0 +1100 @@ -3,7 +3,6 @@ - - FAIL: test-gather-topology.sh 1 of 1 test failed Please report to http://www.open-mpi.org/community/help/ Passes on RHEL5.5/x86-64, SLES10/PPC and Ubuntu 10.10/x86-64. cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0Ja2IACgkQO2KABBYQAh8dMwCffRDOx5ERnDpCer8dHdKpueTf 3lQAnRlg8G56W7ULR/0m9QPGbCNdOHiD =tneb -END PGP SIGNATURE-
[hwloc-devel] hwloc 1.1rc4r2838 warnings
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

Not sure if these were present beforehand, but Ubuntu 10.04 warns about the following issues on x86-64 when doing a "make check":

/home/samuel/Downloads/HWLOC/v1.1/tests/ports/include/windows.h:23: warning: function declaration isn't a prototype
/home/samuel/Downloads/HWLOC/v1.1/tests/ports/include/windows.h:23: warning: function declaration isn't a prototype
topology-windows.c:209: warning: ISO C forbids assignment between function pointer and 'void *'
topology-windows.c:210: warning: ISO C forbids assignment between function pointer and 'void *'
topology-windows.c:214: warning: assignment from incompatible pointer type
topology-windows.c:219: warning: assignment from incompatible pointer type
topology-windows.c:280: warning: ISO C forbids assignment between function pointer and 'void *'
topology-windows.c:281: warning: ISO C forbids assignment between function pointer and 'void *'
topology-windows.c:282: warning: ISO C forbids assignment between function pointer and 'void *'
topology-windows.c:283: warning: ISO C forbids assignment between function pointer and 'void *'
topology-freebsd.c:125: warning: passing argument 3 of 'pthread_setaffinity_np' from incompatible pointer type
topology-freebsd.c:149: warning: passing argument 3 of 'pthread_getaffinity_np' from incompatible pointer type
xmlbuffer.c:39: warning: format not a string literal and no format arguments
xmlbuffer.c:42: warning: format not a string literal and no format arguments

SLES10 has these additional warnings:

topology-linux.c:1002: warning: implicit declaration of function 'migrate_pages'
lstopo.c:186:5: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
lstopo.c:517:5: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
lstopo-cairo.c:22:5: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
lstopo-cairo.c:50:104: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
lstopo-cairo.c:87:79: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
lstopo-cairo.c:443:5: warning: "CAIRO_HAS_SVG_SURFACE" is not defined
hwloc-distrib.c:199: warning: comparison between signed and unsigned

Any use ?

cheers,
Chris

- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzt5VoACgkQO2KABBYQAh8XQACglU3LKKDMuPk3Pywm/0ZdRh8o
Pu0An0bu4feXwQctibrBoKP2UBCZu4Ay
=UyJP
-END PGP SIGNATURE-
Re: [hwloc-devel] hwloc to be included in RHEL 6.1
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 24/11/10 23:28, Samuel Thibault wrote: > Again, we are not talking about the ordering :) > We are talking about the labels that we put on the > figure, whether we want to show the actual physical > IDs, or just the logical IDs Apologies - I do mean the labelling - I guess what I was trying to say was that my preference is that the labelling should depict the logical ordering. Caffeine++; /* :-) */ cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzt0zsACgkQO2KABBYQAh/sIACeNd7XPlu+8xkPYfNSWrOnhllM 9c4An1f6TRnbIJNqStXoVcpqJM4LxImL =t2qo -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 24/11/10 17:47, Christopher Samuel wrote: > I can get the free(fullmask); to not fail if I comment out > the memset() and migrate_pages() calls. If I just comment > out the migrate_pages() then it still fails so there's > something wrong in that calculation from the look of it. I can now duplicate the error on my Ubuntu x86-64 laptop by installing the libnuma-dev package and running hwloc_bind with the environment variable MALLOC_CHECK_ set to 3. Brice, I'll check your suggestion now. cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzst6UACgkQO2KABBYQAh9oRwCdHiVKjX8w0+3z5+L9YhXk6fSZ y6UAn341AMsUTwr5uF6G4CwSsC3r0oDg =j5I3 -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

On 24/11/10 17:47, Christopher Samuel wrote:
> I can get the free(fullmask); to not fail if I comment out
> the memset() and migrate_pages() calls. If I just comment
> out the migrate_pages() then it still fails so there's
> something wrong in that calculation from the look of it.

Turns out that is correlated with a Valgrind error:

==6259== Invalid write of size 4
==6259==    at 0xFFBA650: memset (mc_replace_strmem.c:626)
==6259==    by 0x10016867: hwloc_linux_set_thisthread_membind (topology-linux.c:1001)
==6259==    by 0x1000DBCB: hwloc_set_membind_nodeset (bind.c:243)
==6259==    by 0x1000DC7B: hwloc_set_membind (bind.c:259)
==6259==    by 0x10002E2F: testmem (hwloc_bind.c:79)
==6259==    by 0x1000361B: testmem2 (hwloc_bind.c:113)
==6259==    by 0x10003723: testmem3 (hwloc_bind.c:128)
==6259==    by 0x100039CF: main (hwloc_bind.c:187)
==6259== Address 0x69fd354 is 0 bytes after a block of size 4 alloc'd
==6259==    at 0xFFB86C8: malloc (vg_replace_malloc.c:236)
==6259==    by 0x10016837: hwloc_linux_set_thisthread_membind (topology-linux.c:999)
==6259==    by 0x1000DBCB: hwloc_set_membind_nodeset (bind.c:243)
==6259==    by 0x1000DC7B: hwloc_set_membind (bind.c:259)
==6259==    by 0x10002E2F: testmem (hwloc_bind.c:79)
==6259==    by 0x1000361B: testmem2 (hwloc_bind.c:113)
==6259==    by 0x10003723: testmem3 (hwloc_bind.c:128)
==6259==    by 0x100039CF: main (hwloc_bind.c:187)

- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzsteAACgkQO2KABBYQAh9z6gCfUrJ0IM/WZRNr58Qtlgt9YhBO
Kv0AnAyaAGfH6Y2HRqaZ8E8CHrEMMtYS
=p1lq
-END PGP SIGNATURE-
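[Editor's sketch, not part of the original message.] The Valgrind report above shows memset() writing just past a 4-byte malloc'd block, which is the usual symptom of the allocation and the fill using different size calculations. The following is a generic illustration only, assuming a bitmask stored as an array of unsigned long; the names mask_nlongs and alloc_filled_mask are hypothetical and are not hwloc functions, and the thread does not establish that this is the actual defect at topology-linux.c:999-1001.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of unsigned longs needed to hold
 * nbits bits, rounded up so a partial word still gets storage. */
size_t mask_nlongs(size_t nbits)
{
    return (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/* Hypothetical helper: allocate a mask covering nbits bits and set
 * every byte to `fill`. The byte count is computed once and shared
 * by malloc() and memset(), so the two cannot disagree. */
unsigned long *alloc_filled_mask(size_t nbits, int fill)
{
    size_t bytes = mask_nlongs(nbits) * sizeof(unsigned long);
    unsigned long *mask;

    if (bytes == 0)
        return NULL;
    mask = malloc(bytes);
    if (mask != NULL)
        memset(mask, fill, bytes);
    return mask;
}
```

Note that a mask meant to cover indexes 0 through max inclusive needs max + 1 bits, so a caller would pass max + 1 as nbits.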
Re: [hwloc-devel] Valgrind errors for hwloc_bind in 1.1rc4r2825
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 24/11/10 17:21, Christopher Samuel wrote: > This is valgrind's summary of errors on PPC: I meant to include the link to interpreting Valgrind output: http://www.valgrind.org/docs/manual/quick-start.html#quick-start.interpret More details on Valgrind's memcheck tool error messages: http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.errormsgs cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzssD0ACgkQO2KABBYQAh/KnACePe/famchZljAQajp4tlG/dA0 EiEAn2K38Pz4DB42LM91FlGuT4t5w5FX =Eg34 -END PGP SIGNATURE-
Re: [hwloc-devel] Next 1.0/1.1 RC's
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

On 23/11/10 19:31, Samuel Thibault wrote:
> Could you try the latest 1.1rc snapshot? There's an
> additional free fixup.

Still fails with 1.1rc4r2825 I'm afraid. :-(

(gdb) bt full
#0  0x0fb8d460 in raise () from /lib/power6x/libc.so.6
No symbol table info available.
#1  0x0fb8eed4 in abort () from /lib/power6x/libc.so.6
No symbol table info available.
#2  0x0fbcaa14 in __libc_message () from /lib/power6x/libc.so.6
No symbol table info available.
#3  0x0fbd2304 in malloc_printerr () from /lib/power6x/libc.so.6
No symbol table info available.
#4  0x0fbd3f98 in free () from /lib/power6x/libc.so.6
No symbol table info available.
#5  0x1001689c in hwloc_linux_set_thisthread_membind (topology=0x10030018, nodeset=0x10031038, policy=HWLOC_MEMBIND_BIND, flags=8) at topology-linux.c:1003
        fullmask = 0x10038248
        max_os_index = 32
        linuxmask = 0x10036af8
        linuxpolicy = 1
        err = -1
#6  0x1000dbcc in hwloc_set_membind_nodeset (topology=0x10030018, nodeset=0x10031038, policy=HWLOC_MEMBIND_BIND, flags=8) at bind.c:243
No locals.
#7  0x1000dc7c in hwloc_set_membind (topology=0x10030018, set=0x10035e88, policy=HWLOC_MEMBIND_BIND, flags=8) at bind.c:259
        nodeset = 0x100381e8
        ret = 268656936
#8  0x10002e30 in testmem (nodeset=0x10035e88, policy=HWLOC_MEMBIND_BIND, flags=8, expected=1) at hwloc_bind.c:79
        new_nodeset = 0x10036128
        newpolicy = HWLOC_MEMBIND_FIRSTTOUCH
        area = 0x0
        area_size = 1024
#9  0x1000361c in testmem2 (set=0x10035e88, flags=8) at hwloc_bind.c:113
No locals.
#10 0x10003724 in testmem3 (set=0x10035e88) at hwloc_bind.c:128
No locals.
#11 0x100039d0 in main () at hwloc_bind.c:187
        set = 0x10035e88
        obj = 0x10031180
        str = 0x100365c0 "0x00ff"

- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzsn4EACgkQO2KABBYQAh+74QCdEQ/KtODfCoIHFkVFrKHhAaNY
WxsAniGEi62T2uZ6gjyyxue0c+hr+QZo
=MjIg
-END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/11/10 15:51, Christopher Samuel wrote: > Sadly I'm getting the same issue with hwloc-1.1rc2r2806, > Valgrind hasn't been of any help either. :-( I should say the problem does not exhibit itself with valgrind. It also does not exhibit itself if I link with the debugging malloc "dmalloc" from dmalloc.com. I'm starting to wonder whether this is some weird issue with the glibc malloc/free on SLES10/PPC.. cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzqCDkACgkQO2KABBYQAh9r5QCfeE6VysNV8hYJZwLVmkgCpzg0 ti4AnRE2x3UNdOtYNcCduB6dbNV9/R3o =yKvD -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/11/10 10:54, Samuel Thibault wrote: > I believe I know why. I've commited a fix, could you > test this night's snapshot of v1.1? Sadly I'm getting the same issue with hwloc-1.1rc2r2806, Valgrind hasn't been of any help either. :-( I'm in the Qantas lounge for an hour or two with very slow wifi (scored an upgrade on points) and then I'll be offline for quite a while flying and then with jetlag. cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzp9ucACgkQO2KABBYQAh+ilACdGHOcckHOHDHrSEyt7PRTkRIR xJEAnR2EzDdZQufbwlPPmQOWKVSE8C/H =KxGk -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/11/10 08:11, Brice Goglin wrote: > You should try checkouting from svn branch v1.1, there > are several other fixes in there anyway, maybe I am just > confusing some bug reports. OK, tried that (and did --disable-debug so I could see what was happening) but I'm still seeing that same failure: bound *** glibc detected *** /tmp/hwloc/v1.1/tests/.libs/hwloc_bind: free(): invalid next size (fast): 0x1001c240 *** === Backtrace: = /lib/power6x/libc.so.6[0xfe73304] /lib/power6x/libc.so.6(__libc_free+0xc8)[0xfe74f98] /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1[0xff9cd38] /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1(hwloc_set_membind_nodeset+0x16c)[0xff93908] /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1(hwloc_set_membind+0x80)[0xff939d0] /tmp/hwloc/v1.1/tests/.libs/hwloc_bind[0x10002490] /tmp/hwloc/v1.1/tests/.libs/hwloc_bind[0x10002c7c] /tmp/hwloc/v1.1/tests/.libs/hwloc_bind[0x10002d84] /tmp/hwloc/v1.1/tests/.libs/hwloc_bind[0x10003030] /lib/power6x/libc.so.6[0xfe1690c] /lib/power6x/libc.so.6[0xfe16c44] === Memory map: 0010-00103000 r-xp 0010 00:00 0 0fb14000-0fb17000 r-xp fd:03 6772 /lib/libdl-2.4.so 0fb17000-0fb26000 ---p 3000 fd:03 6772 /lib/libdl-2.4.so 0fb26000-0fb27000 r--p 2000 fd:03 6772 /lib/libdl-2.4.so 0fb27000-0fb28000 rw-p 3000 fd:03 6772 /lib/libdl-2.4.so 0fb38000-0fb3e000 r-xp fd:06 41707 /usr/lib/libnuma.so.1 0fb3e000-0fb4d000 ---p 6000 fd:06 41707 /usr/lib/libnuma.so.1 0fb4d000-0fb4e000 rw-p 5000 fd:06 41707 /usr/lib/libnuma.so.1 0fb4e000-0fb5 rw-p 0fb4e000 00:00 0 0fb6-0fc0b000 r-xp fd:03 6876 /lib/power6x/libm-2.4.so 0fc0b000-0fc1a000 ---p 000ab000 fd:03 6876 /lib/power6x/libm-2.4.so 0fc1a000-0fc1e000 r--p 000aa000 fd:03 6876 /lib/power6x/libm-2.4.so 0fc1e000-0fc1f000 rw-p 000ae000 fd:03 6876 /lib/power6x/libm-2.4.so 0fc2f000-0fc43000 r-xp fd:03 431 /lib/libz.so.1.2.3 0fc43000-0fc52000 ---p 00014000 fd:03 431 /lib/libz.so.1.2.3 0fc52000-0fc53000 rw-p 00013000 fd:03 431 /lib/libz.so.1.2.3 0fc63000-0fdcb000 r-xp fd:06 28900 
/usr/lib/libxml2.so.2.6.23 0fdcb000-0fdda000 ---p 00168000 fd:06 28900 /usr/lib/libxml2.so.2.6.23 0fdda000-0fde7000 rw-p 00167000 fd:06 28900 /usr/lib/libxml2.so.2.6.23 0fde7000-0fde8000 rw-p 0fde7000 00:00 0 0fdf8000-0ff5c000 r-xp fd:03 6877 /lib/power6x/libc-2.4.so 0ff5c000-0ff6b000 ---p 00164000 fd:03 6877 /lib/power6x/libc-2.4.so 0ff6b000-0ff6d000 r--p 00163000 fd:03 6877 /lib/power6x/libc-2.4.so 0ff6d000-0ff71000 rw-p 00165000 fd:03 6877 /lib/power6x/libc-2.4.so 0ff71000-0ff74000 rw-p 0ff71000 00:00 0 0ff84000-0ffa6000 r-xp fd:05 12424 /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1.0.1 0ffa6000-0ffb6000 ---p 00022000 fd:05 12424 /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1.0.1 0ffb6000-0ffb7000 rw-p 00022000 fd:05 12424 /tmp/hwloc/v1.1/src/.libs/libhwloc.so.1.0.1 0ffc7000-0ffdd000 r-xp fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffdd000-0ffec000 ---p 00016000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffec000-0ffed000 r--p 00015000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffed000-0ffee000 rw-p 00016000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffee000-0fff rw-p 0ffee000 00:00 0 1000-10004000 r-xp fd:05 13553 /tmp/hwloc/v1.1/tests/.libs/hwloc_bind 10013000-10014000 rw-p 3000 fd:05 13553 /tmp/hwloc/v1.1/tests/.libs/hwloc_bind 10014000-10035000 rwxp 10014000 00:00 0 [heap] f7fb3000-f7fb6000 rw-p f7fb3000 00:00 0 f7fdd000-f7fdf000 rw-p f7fdd000 00:00 0 f7fdf000-f7fff000 r-xp fd:03 175 /lib/ld-2.4.so f800e000-f800f000 r--p 0001f000 fd:03 175 /lib/ld-2.4.so f800f000-f801 rw-p 0002 fd:03 175 /lib/ld-2.4.so ff8c6000-ff8da000 rw-p ff8c6000 00:00 0 ff8da000-ff8db000 rw-p ff8da000 00:00 0 [stack] ff8db000-ff8dc000 rw-p ff8db000 00:00 0 /bin/sh: line 1: 19260 Aborted ${dir}$tst FAIL: hwloc_bind cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG 
with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzpmngACgkQO2KABBYQAh+DMQCfbuniwf93NgSAXmhl9Wrwq9HE BN8An06sVC8UMOaBcSk/b/mGONd/J38J =1UIM -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/11/10 07:33, Brice Goglin wrote: > This patch (on top of the previous patch) should make > hwloc-gather-topology.sh work again (and make check too, > hopefully). Well that fixed that part, "patch" whinged that it was reversed but -R made it happy. However, I've now got another failure later on: Bind area : OK Get area : FAILED (38, Function not implemented) (expected) bound *** glibc detected *** /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind: free(): invalid next size (fast): 0x1001c240 *** === Backtrace: = /lib/power6x/libc.so.6[0xfb8f304] /lib/power6x/libc.so.6(__libc_free+0xc8)[0xfb90f98] /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1[0xff9cd2c] /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1(hwloc_set_membind_nodeset+0x16c)[0xff938d8] /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1(hwloc_set_membind+0x80)[0xff939a0] /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind[0x100024d0] /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind[0x10002cbc] /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind[0x10002dc4] /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind[0x10003070] /lib/power6x/libc.so.6[0xfb3290c] /lib/power6x/libc.so.6[0xfb32c44] === Memory map: 0010-00103000 r-xp 0010 00:00 0 0fb14000-0fc78000 r-xp fd:03 6877 /lib/power6x/libc-2.4.so 0fc78000-0fc87000 ---p 00164000 fd:03 6877 /lib/power6x/libc-2.4.so 0fc87000-0fc89000 r--p 00163000 fd:03 6877 /lib/power6x/libc-2.4.so 0fc89000-0fc8d000 rw-p 00165000 fd:03 6877 /lib/power6x/libc-2.4.so 0fc8d000-0fc9 rw-p 0fc8d000 00:00 0 0fca-0fca6000 r-xp fd:06 41707 /usr/lib/libnuma.so.1 0fca6000-0fcb5000 ---p 6000 fd:06 41707 /usr/lib/libnuma.so.1 0fcb5000-0fcb6000 rw-p 5000 fd:06 41707 /usr/lib/libnuma.so.1 0fcb6000-0fcb8000 rw-p 0fcb6000 00:00 0 0fcc8000-0fd73000 r-xp fd:03 6876 /lib/power6x/libm-2.4.so 0fd73000-0fd82000 ---p 000ab000 fd:03 6876 /lib/power6x/libm-2.4.so 0fd82000-0fd86000 r--p 000aa000 fd:03 6876 /lib/power6x/libm-2.4.so 0fd86000-0fd87000 rw-p 000ae000 fd:03 6876 
/lib/power6x/libm-2.4.so 0fd97000-0fdab000 r-xp fd:03 431 /lib/libz.so.1.2.3 0fdab000-0fdba000 ---p 00014000 fd:03 431 /lib/libz.so.1.2.3 0fdba000-0fdbb000 rw-p 00013000 fd:03 431 /lib/libz.so.1.2.3 0fdcb000-0fdce000 r-xp fd:03 6772 /lib/libdl-2.4.so 0fdce000-0fddd000 ---p 3000 fd:03 6772 /lib/libdl-2.4.so 0fddd000-0fdde000 r--p 2000 fd:03 6772 /lib/libdl-2.4.so 0fdde000-0fddf000 rw-p 3000 fd:03 6772 /lib/libdl-2.4.so 0fdef000-0ff57000 r-xp fd:06 28900 /usr/lib/libxml2.so.2.6.23 0ff57000-0ff66000 ---p 00168000 fd:06 28900 /usr/lib/libxml2.so.2.6.23 0ff66000-0ff73000 rw-p 00167000 fd:06 28900 /usr/lib/libxml2.so.2.6.23 0ff73000-0ff74000 rw-p 0ff73000 00:00 0 0ff84000-0ffa6000 r-xp fd:05 908 /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1.0.0 0ffa6000-0ffb6000 ---p 00022000 fd:05 908 /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1.0.0 0ffb6000-0ffb7000 rw-p 00022000 fd:05 908 /tmp/hwloc/hwloc-1.1rc2/src/.libs/libhwloc.so.1.0.0 0ffc7000-0ffdd000 r-xp fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffdd000-0ffec000 ---p 00016000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffec000-0ffed000 r--p 00015000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffed000-0ffee000 rw-p 00016000 fd:03 6875 /lib/power6x/libpthread-2.4.so 0ffee000-0fff rw-p 0ffee000 00:00 0 1000-10004000 r-xp fd:05 1492 /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind 10013000-10014000 rw-p 3000 fd:05 1492 /tmp/hwloc/hwloc-1.1rc2/tests/.libs/hwloc_bind 10014000-10035000 rwxp 10014000 00:00 0 [heap] f7fb3000-f7fb6000 rw-p f7fb3000 00:00 0 f7fdd000-f7fdf000 rw-p f7fdd000 00:00 0 f7fdf000-f7fff000 r-xp fd:03 175 /lib/ld-2.4.so f800e000-f800f000 r--p 0001f000 fd:03 175 /lib/ld-2.4.so f800f000-f801 rw-p 0002 fd:03 175 /lib/ld-2.4.so ffe5-ffe63000 rw-p ffe5 00:00 0 ffe63000-ffe64000 rw-p ffe63000 00:00 0 ffe64000-ffe65000 rw-p ffe64000 00:00 0 [stack] /bin/sh: line 1: 453 Aborted ${dir}$tst FAIL: hwloc_bind - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative 
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzphZ8ACgkQO2KABBYQAh+0JQCbBWQXf1S5jPwwtgsNme97TfJo mgcAoIz/D3AGho2tiMkXoagBySPBqh8R =W92O -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

On 22/11/10 06:51, Brice Goglin wrote:
> Here's the patch

OK, so that makes lstopo do the right thing in both cases, brilliant!

make check still fails though; I'm now seeing an error when it gathers the topology:

Saving current system topology to a tarball...
/tmp/hwloc/hwloc-1.1rc2/tests/linux/hwloc-gather-topology.sh: line 54: /tmp/tmp.WPnBao1790/save//proc/cpuinfo: No such file or directory
/tmp/hwloc/hwloc-1.1rc2/tests/linux/hwloc-gather-topology.sh: line 54: /tmp/tmp.WPnBao1790/save//proc/meminfo: No such file or directory
/tmp/hwloc/hwloc-1.1rc2/tests/linux/hwloc-gather-topology.sh: line 54: /tmp/tmp.WPnBao1790/save//proc/stat: No such file or directory

Just did a quick make check with the vanilla 1.1rc2 and can confirm those errors don't happen there.

cheers!
Chris

- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzpe+4ACgkQO2KABBYQAh8xGgCdGz9J5GKZjVku5UxWT8DmG3W6
e5sAnAnf/AiTr7CkHTX4uvCkuGz7xqlH
=ShP2
-END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

On 22/11/10 06:12, Brice Goglin wrote:
> You might want to add some printf in hwloc_opendir like I did with
> hwloc_access in my debug patch.

In look_powerpc_device_tree() I did similar and found that it never proceeds past this loop:

    if (('.' == dirent->d_name[0]) || (0 == (dirent->d_type & DT_DIR)))
        continue;

Adding some debugging to print the name and type and whether they were used or skipped I see that when it fails the dirent->d_type is always '0', but when it works it's '4'.

The manual page for readdir(3) says:

# Currently, only some file systems (among them: Btrfs,
# ext2, ext3, and ext4) have full support returning the file
# type in d_type. All applications must properly handle
# a return of DT_UNKNOWN.

So I'm guessing that reiserfs and GPFS (both of which are available on this PPC64 box) are returning DT_UNKNOWN (0).

So the above loop will need to catch that and, if it is DT_UNKNOWN, do a stat or lstat on the entry to find out what it is. :-(

cheers,
Chris

- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzpdEsACgkQO2KABBYQAh/9hwCcDNWfn7cAjpioLdDeQfCG9Nnr
k/8AmwZ9X4nMLZNimH2djc+P19f7M2Ll
=0C8H
-END PGP SIGNATURE-
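[Editor's sketch, not part of the original message.] The fallback described above (trust d_type when the filesystem fills it in, otherwise ask lstat) could look roughly like the following. This is a minimal sketch under stated assumptions, not the patch that went into hwloc: entry_is_dir and count_subdirs are hypothetical names, the fixed 4096-byte path buffer is an arbitrary choice, and lstat is picked over stat so that symlinks are not followed.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if the entry is a directory, 0 if it
 * is not, and -1 if that could not be determined. */
int entry_is_dir(const char *parent, const struct dirent *ent)
{
    char path[4096];
    struct stat st;
    int n;

#ifdef DT_DIR /* d_type is an extension, so only use it when available */
    if (ent->d_type == DT_DIR)
        return 1;
    if (ent->d_type != DT_UNKNOWN)
        return 0;
#endif
    /* The filesystem did not report a type (DT_UNKNOWN): fall back to lstat(). */
    n = snprintf(path, sizeof(path), "%s/%s", parent, ent->d_name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    if (lstat(path, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: count the non-dot subdirectories of `parent`,
 * or return -1 if the directory cannot be opened. */
int count_subdirs(const char *parent)
{
    DIR *dir = opendir(parent);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue; /* same dot-entry skip as the quoted loop */
        if (entry_is_dir(parent, ent) == 1)
            count++;
    }
    closedir(dir);
    return count;
}
```

With a check like this, a directory walk would behave the same on filesystems that return DT_UNKNOWN as on ext3, at the cost of one extra lstat call per such entry.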
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/11/10 05:45, Christopher Samuel wrote: > I'll try applying the same patch to the x86-64 build and doing > a --enable-debug build there and compare them. Looking at the strace and the source seems to show that both builds enter the look_powerpc_device_tree() function (they both access proc/device-tree/cpus) but then on the PPC system it doesn't look any further, whilst on x86-64 it continues happily to look through the rest of the contents of the extracted tar file. Still prodding.. - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzpbPgACgkQO2KABBYQAh8RoQCdEEuLZOSVM9KWpK+SVV1WtPfa NY4AnRwViiXaRR2IWe+pXhRdIZ6FCBY4 =G57/ -END PGP SIGNATURE-
Re: [hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
On 21/11/10 21:38, Brice Goglin wrote:
> It looks like it's using /proc/cpuinfo instead of /sys. It may be a
> problem with accessat. Could you apply the attached debug patch and
> rerun the above command line?

Not a problem, here you go!

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

numa distance indexes: 0 1
os node 0 has cpuset 0x000f
os node 1 has cpuset 0x00f0
node distance matrix:
   0  1
0 10 20
1 20 10
trying to group NUMANode objects into misc objects according to physical distances
* Topology extraction from /proc/cpuinfo *
processor 0
cpu 0 (os 0) has cpuset 0x0001
processor 1
cpu 1 (os 1) has cpuset 0x0002
processor 2
cpu 2 (os 2) has cpuset 0x0004
processor 3
cpu 3 (os 3) has cpuset 0x0008
processor 4
cpu 4 (os 4) has cpuset 0x0010
processor 5
cpu 5 (os 5) has cpuset 0x0020
processor 6
cpu 6 (os 6) has cpuset 0x0040
processor 7
cpu 7 (os 7) has cpuset 0x0080
8 online processors found, with id max 8
online processor cpuset: 0x00ff
* Topology summary *
8 processors (8 max id)
0 sockets, but some missing socket
No cores and L2 cache were found in /proc/device-tree/cpus, exiting
0 cores, but some missing core
Machine#0(Backend=Linux) cpuset 0xf...f complete 0x00ff online 0x00ff allowed 0xf...f nodeset 0x0 completeN 0x0003 allowedN 0xf...f
  NUMANode#0(local=0KB total=7864320KB) cpuset 0x000f nodeset 0x0001
    PU#0 cpuset 0x0001
    PU#1 cpuset 0x0002
    PU#2 cpuset 0x0004
    PU#3 cpuset 0x0008
  NUMANode#1(local=0KB total=8192000KB) cpuset 0x00f0 nodeset 0x0002
    PU#4 cpuset 0x0010
    PU#5 cpuset 0x0020
    PU#6 cpuset 0x0040
    PU#7 cpuset 0x0080
Restrict topology cpusets to existing PU and NODE objects
Propagate offline and disallowed cpus down and up
Propagate nodesets
Machine#0(Backend=Linux) cpuset 0x00ff complete 0x00ff online 0x00ff allowed 0x00ff nodeset 0x0003 completeN 0x0003 allowedN 0x0003
  NUMANode#0(local=0KB total=7864320KB) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x0008 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
  NUMANode#1(local=0KB total=8192000KB) cpuset 0x00f0 complete 0x00f0 online 0x00f0 allowed 0x00f0 nodeset 0x0002 completeN 0x0002 allowedN 0x0002
    PU#4 cpuset 0x0010 complete 0x0010 online 0x0010 allowed 0x0010 nodeset 0x0002 completeN 0x0002 allowedN 0x0002
    PU#5 cpuset 0x0020 complete 0x0020 online 0x0020 allowed 0x0020 nodeset 0x0002 completeN 0x0002 allowedN 0x0002
    PU#6 cpuset 0x0040 complete 0x0040 online 0x0040 allowed 0x0040 nodeset 0x0002 completeN 0x0002 allowedN 0x0002
    PU#7 cpuset 0x0080 complete 0x0080 online 0x0080 allowed 0x0080 nodeset 0x0002 completeN 0x0002 allowedN 0x0002
Removing unauthorized and offline cpusets from all cpusets
Removing disallowed memory according to nodesets
Machine#0(Backend=Linux) cpuset 0x00ff complete 0x00ff online 0x00ff allowed 0x00ff nodeset 0x0003 completeN 0x0003 allowedN 0x0003
  NUMANode#0(local=0KB total=7864320KB) cpuset 0x000f complete 0x000f online 0x000f allowed 0x000f nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#0 cpuset 0x0001 complete 0x0001 online 0x0001 allowed 0x0001 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#1 cpuset 0x0002 complete 0x0002 online 0x0002 allowed 0x0002 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#2 cpuset 0x0004 complete 0x0004 online 0x0004 allowed 0x0004 nodeset 0x0001 completeN 0x0001 allowedN 0x0001
    PU#3 cpuset 0x0008 complete 0x0008 online 0x0008 allowed 0x000
Re: [hwloc-devel] PCI device location in hwloc
On 19/11/10 17:01, Christopher Samuel wrote:
> Just tried this on our BlueGene/P management node

Now my brain is working again, attached is the output from this system, with 3 PCI bridges. Ignore the weird ethernet numbering, this is just the consequence of YAST doing odd things when it had its network interfaces renumbered at one point.

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
[hwloc-devel] hwloc 1.1 rc2 make check fails on SLES10SP1 on PPC64
Hi there,

I'm getting a failure doing a make check on SLES10 SP1 on our BG/P service (management) node:

FAIL: test-gather-topology.sh

I've attached the output of the "make check" in case that helps.

On the plus side it passes on Ubuntu 10.04 and RHEL 5.5.

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

Making check in src
make[1]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/src'
  CC     topology.lo
  CC     traversal.lo
  CC     topology-synthetic.lo
  CC     bind.lo
  CC     cpuset.lo
  CC     misc.lo
  CC     topology-xml.lo
  CC     topology-linux.lo
topology-linux.c: In function 'hwloc_linux_set_thisthread_membind':
topology-linux.c:1010: warning: implicit declaration of function 'migrate_pages'
  CCLD   libhwloc.la
make[1]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/src'
Making check in include
make[1]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/include'
make[1]: Nothing to be done for `check'.
make[1]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/include'
Making check in utils
make[1]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/utils'
  CC     lstopo-lstopo.o
  CC     lstopo-lstopo-color.o
  CC     lstopo-lstopo-text.o
  CC     lstopo-lstopo-draw.o
  CC     lstopo-lstopo-fig.o
  CC     lstopo-lstopo-cairo.o
  CC     lstopo-lstopo-xml.o
  CCLD   lstopo
  CC     hwloc-calc.o
  CCLD   hwloc-calc
  CC     hwloc-bind.o
  CCLD   hwloc-bind
  CC     hwloc-distrib.o
  CCLD   hwloc-distrib
  CC     hwloc-ps.o
  CCLD   hwloc-ps
Creating hwloc.7 man page...
Creating lstopo.1 man page...
Creating hwloc-bind.1 man page...
Creating hwloc-distrib.1 man page...
Creating hwloc-calc.1 man page...
Creating hwloc-ps.1 man page...
make check-TESTS
make[2]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/utils'
PASS: test-hwloc-distrib.sh
=============
1 test passed
=============
make[2]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/utils'
make[1]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/utils'
Making check in tests
make[1]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests'
Making check in ports
make[2]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/ports'
make libhwloc-ports.la
make[3]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/ports'
  CC     libhwloc_ports_la-topology.lo
  CC     libhwloc_ports_la-traversal.lo
  CC     libhwloc_ports_la-topology-synthetic.lo
  CC     libhwloc_ports_la-topology-solaris.lo
  CC     libhwloc_ports_la-topology-aix.lo
  CC     libhwloc_ports_la-topology-osf.lo
  CC     libhwloc_ports_la-topology-windows.lo
topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc':
topology-windows.c:214: warning: assignment from incompatible pointer type
topology-windows.c:219: warning: assignment from incompatible pointer type
  CC     libhwloc_ports_la-topology-darwin.lo
  CC     libhwloc_ports_la-topology-freebsd.lo
topology-freebsd.c: In function 'hwloc_freebsd_set_thread_cpubind':
topology-freebsd.c:125: warning: passing argument 3 of 'pthread_setaffinity_np' from incompatible pointer type
topology-freebsd.c: In function 'hwloc_freebsd_get_thread_cpubind':
topology-freebsd.c:149: warning: passing argument 3 of 'pthread_getaffinity_np' from incompatible pointer type
  CC     libhwloc_ports_la-topology-hpux.lo
  CCLD   libhwloc-ports.la
make[3]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/tests/ports'
make[2]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/tests/ports'
Making check in xml
make[2]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/xml'
make check-TESTS
make[3]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/xml'
PASS: 16amd64-8n2c-cpusets.xml
PASS: 16em64t-4s2c2t.xml
PASS: 16em64t-4s2c2t-offlines.xml
PASS: 8em64t-2mi2ma2c.xml
==================
All 4 tests passed
==================
make[3]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/tests/xml'
make[2]: Leaving directory `/tmp/hwloc/hwloc-1.1rc2/tests/xml'
Making check in linux
make[2]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/linux'
Making check in gather
make[3]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/linux/gather'
make check-TESTS
make[4]: Entering directory `/tmp/hwloc/hwloc-1.1rc2/tests/linux/gather'
Saving current system topology to XML...
Saving current system topology to a tarball...
Hierarchy gathered in /tmp/tmp.HznAQg9426/save.tar.bz2 and kept in /tmp/tmp.rnuQJn9445/save/
Expected topology output stored in /tmp/tmp.HznAQg9426/save.output
Extracting tarball...
Saving tarball topology to XML...
Comparing XML outputs...
--- save.xml 2010-11-20 14:09:39.0 +1100
+++ save2.xml
Re: [hwloc-devel] PCI device location in hwloc
On 19/11/10 01:42, Brice Goglin wrote:
> It's not in trunk yet. Try this branch instead:
> https://svn.open-mpi.org/svn/hwloc/branches/libpci

Just tried this on our BlueGene/P management node (Power6 with SLES10 SP1) and it fails to configure with:

configure:10657: checking for cpuid
configure:10682: gcc -c -I/tmp/hwloc/libpci/include conftest.c >&5
/tmp/hwloc/libpci/include/private/cpuid.h: In function 'hwloc_cpuid':
/tmp/hwloc/libpci/include/private/cpuid.h:54: error: impossible constraint in 'asm'

tambo-m:/tmp/libpci # gcc -v
Using built-in specs.
Target: powerpc64-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib --libexecdir=/usr/lib --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.1.2 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --program-suffix= --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=default32 --enable-secureplt --with-long-double-128 --host=powerpc64-suse-linux
Thread model: posix
gcc version 4.1.2 20070115 (SUSE Linux)

Any ideas?

cheers!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
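One plausible reading of "impossible constraint in 'asm'" on a powerpc64 compiler is that x86-only inline assembly (register constraints such as "a" or "b") is being compiled on a non-x86 target. The sketch below shows the usual way to guard against that; it is an assumption about the cause, not the hwloc code or its eventual fix, and do_cpuid(), cpu_vendor() and HAVE_X86_CPUID are hypothetical names.

```c
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#define HAVE_X86_CPUID 1
/* x86 only: the "a", "b", "c", "d" constraints name x86 registers and
 * cannot be satisfied by a PowerPC compiler. Assumes a CPU that has the
 * cpuid instruction (every x86-64 CPU does). */
static void do_cpuid(unsigned *eax, unsigned *ebx, unsigned *ecx, unsigned *edx)
{
    __asm__ volatile("cpuid"
                     : "+a"(*eax), "=b"(*ebx), "+c"(*ecx), "=d"(*edx));
}
#else
#define HAVE_X86_CPUID 0
#endif

/* Hypothetical helper: copy the 12-byte CPU vendor string into out and
 * return 0, or return -1 with an empty string when cpuid is unavailable. */
int cpu_vendor(char out[13])
{
#if HAVE_X86_CPUID
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    do_cpuid(&eax, &ebx, &ecx, &edx);   /* leaf 0 returns the vendor id */
    memcpy(out, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 0;
#else
    out[0] = '\0';
    return -1;
#endif
}
```

On the Power6 box this would compile cleanly and simply report that cpuid is not available, instead of failing inside the asm statement.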
Re: [hwloc-devel] PCI device location in hwloc
On 19/11/10 10:04, Brice Goglin wrote:
> Many dual-nehalem-EP boxes have a single I/O hub which is connected to
> both sockets, so no I/O affinity there. If your machine wasn't designed
> to have many big PCIe slots (e.g. to plug multiple GPUs), that's
> probably what's happening here.

That would make sense, these are their SuperMicro based systems, nothing unusual on these.

cheers!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] 1.0.2rc2 posted
On 21/07/10 19:29, Samuel Thibault wrote:
> To make distcheck from an svn export, you need to be able
> to provide a complete distribution tarball, i.e. have tools
> to build docs. You should rather use the released tarball.

I was using the 1.0.2rc2 tarball.

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] 1.0.2rc2 posted
On 20/07/10 22:09, Jeff Squyres wrote:
> Including the extern "C" stuff Brice just put in:

A simple ./configure && make distcheck fails on a SuSE SLES 9 PPC64 box I have access to:

make[1]: Entering directory `/tmp/chris/hwloc-1.0.2rc2'
ERROR: Did not build both of the doxygen docs and README.
ERROR: This tarball is not complete!
ERROR: Cowardly refusing to complete successfully...

Is that just meant to work from an svn export? I've only run across distcheck recently with Torque, and there I am working from an svn export.

If I do a "make check" it appears OK, though it reports at one stage:

*** Printing overall tree
Machine#0(3597MB)
  PU#0
  PU#1
*** The number of sockets is unknown
*** Logical processor 0 has 0 caches totaling 0KB
PASS: hwloc-hello

I presume that's just because it's got an ancient kernel? 2.6.5-7.244-pseries64

cheers!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] 1.0.1rc1
On 02/06/10 04:03, Jeff Squyres wrote: > So do we like 1.0.1rc1? Looks OK to me. -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/
[hwloc-devel] hwloc-announce archives not working ?
Hi folks, Was just going to point the other Torque developers towards hwloc now 1.0 has been released and found I couldn't link to the announcement of it as the archives for the announce list don't work. :-( http://www.open-mpi.org/MailArchives/hwloc-announce/ cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] 1.0rc7
On 18/05/10 06:45, Jeff Squyres wrote: > Can you guys do some basic testing on the rc7 tarball? Looks good to me, no warnings from Coccicheck, a full build then make distclean is identical to a newly unpacked tarball and lstopo seems to work. Make check seems OK both on my laptop and on a dual socket Nehalem box. cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] hwloc 1.0-rc5 cannot recreate hwloc.7 after make distclean
On 07/05/10 04:45, Christopher Samuel wrote: > make[1]: *** No rule to make target `hwloc.7', needed by `all-am'. Stop. Confirming fixed in rc6, thanks! -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r2083
On 07/05/10 08:29, Jeff Squyres wrote:
> So if hwloc_snprintf() returns 0, how do you know if you
> got a zero-length string or an error?

The GNU libc manual page for snprintf() et al. says:

    If an output error is encountered, a negative value is returned.

So I'd have thought that a negative value should be returned if we abort due to not being able to allocate enough memory.

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
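To make the return convention concrete, here is a minimal sketch of a formatting helper that keeps "empty string" and "error" apart in the way the snprintf() manual describes. It is not hwloc_snprintf(): format_alloc() is a hypothetical name, and it assumes a C99 vsnprintf() that reports the required length when given a NULL buffer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format into a freshly allocated buffer.
 * Returns the string length (0 means a valid empty string) and stores the
 * buffer in *out, or returns -1 on a formatting or allocation error. */
int format_alloc(char **out, const char *fmt, ...)
{
    va_list ap;
    int len;
    char *buf;

    *out = NULL;

    /* First pass: ask how many characters the result needs. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0)
        return -1;              /* output error */

    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return -1;              /* allocation failure is reported as negative, never as 0 */

    /* Second pass: actually format into the buffer. */
    va_start(ap, fmt);
    len = vsnprintf(buf, (size_t)len + 1, fmt, ap);
    va_end(ap);
    if (len < 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return len;
}
```

A caller can then treat any negative value as failure and still accept a return of 0 as a legitimate zero-length string.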
Re: [hwloc-devel] Requirements for building hwloc with XML support ?
On 06/05/10 13:37, Brice Goglin wrote:
> Configure uses the pkgconfig file

Ahh, I was missing pkg-config being installed, wondered why it wasn't finding the libxml2 dev stuff I had!

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
[hwloc-devel] hwloc 1.0-rc5 cannot recreate hwloc.7 after make distclean
This bug seems easy to reproduce:

../configure
make
make distclean
../configure
make
[...]
make[1]: *** No rule to make target `hwloc.7', needed by `all-am'. Stop.
make[1]: Leaving directory `/home/samuel/Downloads/HWLOC/hwloc-1.0rc5/utils'
make: *** [all-recursive] Error 1

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-devel] want 1.0rc4?
On 03/05/10 09:57, Jeff Squyres wrote:
> 1.0rc4 is up.

Running coccicheck on 1.0rc4 flags up this construct, I presume as an ambiguous construction:

    if (!topology->flags & HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM) {

That's at line 1518 of src/topology.c in hwloc_discover(). The check file simply says:

    // !x combines boolean negation with bitwise and

It's also flagged some unchecked malloc()'s in various bits:

line 41 of src/misc.c in hwloc_snprintf():
    str = malloc(size);

line 321 of src/topology-linux.c in hwloc_linux_get_proc_tids():
    tids = malloc(max_tids*sizeof(pid_t));

line 328 of src/topology-linux.c in hwloc_linux_get_proc_tids():
    tids = realloc(tids, max_tids*sizeof(pid_t));

line 1561 of src/topology.c in hwloc_discover():
    objs = malloc(n_objs * sizeof(objs[0]));

Hope these are helpful!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
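For anyone wondering why the first construct gets flagged: unary ! binds tighter than &, so the test negates the whole flags word before masking it. The sketch below illustrates that, plus a checked realloc() of the kind the other warnings point at. The flag values and the function names are hypothetical stand-ins rather than hwloc's definitions.

```c
#include <stdlib.h>

/* Hypothetical flag values, chosen only for the illustration. */
#define FLAG_WHOLE_SYSTEM (1UL << 0)
#define FLAG_OTHER        (1UL << 1)

/* How the compiler parses "!flags & FLAG_WHOLE_SYSTEM": the negation is
 * applied to the whole flags word first, then the result (0 or 1) is masked. */
int flag_unset_buggy(unsigned long flags)
{
    return ((!flags) & FLAG_WHOLE_SYSTEM) != 0;
}

/* What was presumably intended: mask first, then negate. */
int flag_unset_fixed(unsigned long flags)
{
    return !(flags & FLAG_WHOLE_SYSTEM);
}

/* Checked growth of an array: realloc() into a temporary so the original
 * pointer is neither lost nor leaked if the allocation fails. */
int grow_array(int **arr, size_t *capacity)
{
    size_t new_cap = *capacity ? *capacity * 2 : 8;
    int *tmp = realloc(*arr, new_cap * sizeof(**arr));

    if (tmp == NULL)
        return -1;              /* *arr is still valid and still owned by the caller */
    *arr = tmp;
    *capacity = new_cap;
    return 0;
}
```

With these stand-in values, a flags word that has only FLAG_OTHER set makes the buggy test claim the flag is present while the fixed one correctly reports it as unset.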