We're interested in getting BG/Q support, but we're not familiar with CNK at all. So if you know how to manipulate binding there, maybe it would be easier for us to help you add CNK support? By the way, does lstopo work on the compute nodes?

Brice
Jeff Hammond <jhamm...@alcf.anl.gov> wrote:

Hi,

I'm interested in seeing hwloc supported on Blue Gene/Q. As it is not listed on http://www.open-mpi.org/projects/hwloc/ and does not use a standard operating system (although CNK is POSIX and very Linux-like in general), I didn't have a reasonable expectation that it would work without some development, but I verified that some of the tests fail, just to be sure.

Fortunately, almost all of the tests passed, except for glibc-sched and hwloc_bind. I suspect this is due to the various incompatibilities between glibc in CNK vs. Linux. The failure of both tests occurs with XLC and GCC, although I report the details below for XLC only.

Should I report this as a bug? Is it sufficient to port to Blue Gene/Q if I provide the kernel API calls related to thread location, etc., or would any hwloc developers be interested in Blue Gene/Q access for the purposes of development and testing?

Thanks,

Jeff

[jhammond@cetuslac1 tests]$ ../configure --prefix=/home/jhammond/HWLOC/hwloc-1.4.2rc1/install --enable-static --disable-shared CC=bgxlc_r --host=powerpc64-bgq-linux

==> 16927.output <==

==> 16927.cobaltlog <==
/usr/bin/qsub.py -t 15 -n 1 --mode=c1 ./glibc-sched submitted with cwd set to: /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests
Command: '/bgsys/drivers/ppcfloor/bin/runjob' '--np' '1' '--ranks-per-node' '1' '--cwd' '/veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests' '--block' 'EAS-20040-31371-128' '--corner' 'R00-M1-N04-J09' '--shape' '1x1x1x1x1' '--envs' 'COBALT_JOBID=16927' '--envs' 'BG_SHAREDMEMSIZE=32' '--verbose' '4' ':' '/veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/./glibc-sched'
Info: stdin received from /dev/null
Info: stdout sent to /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/16927.output
Info: stderr sent to /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/16927.error
Job 16927/jhammond/19558: Block EAS-20040-31371-128 for location EAS-31151-31151-1 already booted.
Starting task for job 16927.
(APG) Info: task completed normally with an exit code of 134; initiating job cleanup and removal

==> 16927.error <==
2012-05-03 14:06:58.888 (INFO ) [0xfffac848a40] EAS-20040-31371-128:1498:ibm.runjob.client.options.Parser: set local socket to runjob_mux from properties file
2012-05-03 14:07:00.892 (INFO ) [0xfffac848a40] EAS-20040-31371-128:34615:ibm.runjob.client.Job: job 34615 started
glibc-sched: ../../tests/glibc-sched.c:43: main: Assertion `!err' failed.
2012-05-03 14:07:02.772 (WARN ) [0xfffac848a40] EAS-20040-31371-128:34615:ibm.runjob.client.Job: terminated by signal 6
2012-05-03 14:07:02.772 (WARN ) [0xfffac848a40] EAS-20040-31371-128:34615:ibm.runjob.client.Job: abnormal termination by signal 6 from rank 0

==> 16912.output <==
system set is 0xffffffff,0xffffffff
Bind this singlethreaded process : FAILED (3, No such process)

==> 16912.cobaltlog <==
/usr/bin/qsub.py -t 15 -n 1 --mode=c1 ./hwloc_bind submitted with cwd set to: /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests
Command: '/bgsys/drivers/ppcfloor/bin/runjob' '--np' '1' '--ranks-per-node' '1' '--cwd' '/veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests' '--block' 'EAS-20040-31371-128' '--corner' 'R00-M1-N04-J27' '--shape' '1x1x1x1x1' '--envs' 'COBALT_JOBID=16912' '--envs' 'BG_SHAREDMEMSIZE=32' '--verbose' '4' ':' '/veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/./hwloc_bind'
Info: stdin received from /dev/null
Info: stdout sent to /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/16912.output
Info: stderr sent to /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/16912.error
Job 16912/jhammond/19536: Block EAS-20040-31371-128 for location EAS-20141-20141-1 already booted.
Starting task for job 16912.
(APG) Info: task completed normally with an exit code of 134; initiating job cleanup and removal

==> 16912.error <==
2012-05-03 14:02:09.516 (INFO ) [0xfffb35c8a40] EAS-20040-31371-128:31883:ibm.runjob.client.options.Parser: set local socket to runjob_mux from properties file
2012-05-03 14:02:11.526 (INFO ) [0xfffb35c8a40] EAS-20040-31371-128:34593:ibm.runjob.client.Job: job 34593 started
*** glibc detected *** /veas_home/jhammond/HWLOC/hwloc-1.4.2rc1/build/tests/./hwloc_bind: free(): invalid next size (fast): 0x00000019c5016b00 ***
======= Backtrace: =========
[0x103d398]
[0x1042ad8]
[0x1049800]
[0x101e178]
[0x101e220]
[0x1024520]
[0x1024974]
[0x1024ccc]
[0x1024eb0]
[0x1016478]
[0x10014a4]
[0x1001958]
[0x10260d8]
[0x10263d4]
======= Memory map: ========
10000000-10060000 r-xp 00000000 00:0f 1504118 /sbin/sysiod
10060000-10070000 rw-p 00050000 00:0f 1504118 /sbin/sysiod
10070000-10100000 rw-p 00000000 00:00 0 [heap]
fff90000000-fff90030000 rw-p 00000000 00:00 0
fff90030000-fff94000000 ---p 00000000 00:00 0
fff94c00000-fff94d00000 rw-p 00000000 00:00 0
fff95a00000-fff95a30000 rw-p 00000000 00:00 0
fff95a30000-fff95a40000 -w-s 3fcc48f0000 00:0e 11206 /dev/infiniband/uverbs0
fff95a40000-fff95a50000 -w-s 3fcc08f0000 00:0e 11206 /dev/infiniband/uverbs0
fff95a50000-fff95a60000 r-xp 00000000 00:0f 3369690 /usr/lib64/libbgvrnic-rdmav2.so
fff95a60000-fff95a70000 rw-p 00000000 00:0f 3369690 /usr/lib64/libbgvrnic-rdmav2.so
fff95a70000-fff95a80000 ---p 00000000 00:00 0
fff95a80000-fff96480000 rw-p 00000000 00:00 0
fff96480000-fff964b0000 r-xp 00000000 00:0f 3350116 /lib64/libselinux.so.1
fff964b0000-fff964c0000 r--p 00020000 00:0f 3350116 /lib64/libselinux.so.1
fff964c0000-fff964d0000 rw-p 00030000 00:0f 3350116 /lib64/libselinux.so.1
fff964d0000-fff96540000 r-xp 00000000 00:0f 2638150 /lib64/libfreebl3.so
fff96540000-fff96550000 r--p 00060000 00:0f 2638150 /lib64/libfreebl3.so
fff96550000-fff96560000 rw-p 00070000 00:0f 2638150 /lib64/libfreebl3.so
fff96560000-fff96570000 r-xp 00000000 00:0f 2939812 /lib64/libkeyutils.so.1.3
fff96570000-fff96580000 r--p 00000000 00:0f 2939812 /lib64/libkeyutils.so.1.3
fff96580000-fff96590000 rw-p 00010000 00:0f 2939812 /lib64/libkeyutils.so.1.3
fff96590000-fff965a0000 r-xp 00000000 00:0f 2924424 /lib64/libkrb5support.so.0.1
fff965a0000-fff965b0000 r--p 00000000 00:0f 2924424 /lib64/libkrb5support.so.0.1
fff965b0000-fff965c0000 rw-p 00010000 00:0f 2924424 /lib64/libkrb5support.so.0.1
fff965c0000-fff97510000 r-xp 00000000 00:0f 914026 /usr/lib64/libicudata.so.42.1
fff97510000-fff97520000 rw-p 00f40000 00:0f 914026 /usr/lib64/libicudata.so.42.1
fff97520000-fff97550000 r-xp 00000000 00:0f 1277798 /usr/lib64/libsasl2.so.2.0.23
fff97550000-fff97560000 r--p 00020000 00:0f 1277798 /usr/lib64/libsasl2.so.2.0.23
fff97560000-fff97570000 rw-p 00030000 00:0f 1277798 /usr/lib64/libsasl2.so.2.0.23
fff97570000-fff975d0000 r-xp 00000000 00:0f 1327117 /lib64/libnspr4.so
fff975d0000-fff975e0000 r--p 00050000 00:0f 1327117 /lib64/libnspr4.so
fff975e0000-fff975f0000 rw-p 00060000 00:0f 1327117 /lib64/libnspr4.so
fff975f0000-fff97600000 r-xp 00000000 00:0f 1250566 /lib64/libplc4.so
fff97600000-fff97610000 r--p 00000000 00:0f 1250566 /lib64/libplc4.so
fff97610000-fff97620000 rw-p 00010000 00:0f 1250566 /lib64/libplc4.so
fff97620000-fff97630000 r-xp 00000000 00:0f 2447916 /lib64/libplds4.so
fff97630000-fff97640000 r--p 00000000 00:0f 2447916 /lib64/libplds4.so
fff97640000-fff97650000 rw-p 00010000 00:0f 2447916 /lib64/libplds4.so
fff97650000-fff97680000 r-xp 00000000 00:0f 2121934 /usr/lib64/libnssutil3.so
fff97680000-fff97690000 r--p 00020000 00:0f 2121934 /usr/lib64/libnssutil3.so
fff97690000-fff976a0000 rw-p 00030000 00:0f 2121934 /usr/lib64/libnssutil3.so
fff976a0000-fff976b0000 rw-p 00000000 00:00 0
fff976b0000-fff97840000 r-xp 00000000 00:0f 811615 /usr/lib64/libnss3.so
fff97840000-fff97850000 r--p 00180000 00:0f 811615 /usr/lib64/libnss3.so
fff97850000-fff97860000 rw-p 00190000 00:0f 811615 /usr/lib64/libnss3.so
fff97860000-fff97870000 rw-p 00000000 00:00 0
fff97870000-fff978b0000 r-xp 00000000 00:0f 1001793 /usr/lib64/libsmime3.so
fff978b0000-fff978c0000 r--p 00030000 00:0f 1001793 /usr/lib64/libsmime3.so
fff978c0000-fff978d0000 rw-p 00040000 00:0f 1001793 /usr/lib64/libsmime3.so
fff978d0000-fff97920000 r-xp 00000000 00:0f 2769018 /usr/lib64/libssl3.so
fff97920000-fff97930000 r--p 00040000 00:0f 2769018 /usr/lib64/libssl3.so
fff97930000-fff97940000 rw-p 00050000 00:0f 2769018 /usr/lib64/libssl3.so
fff97940000-fff97960000 r-xp 00000000 00:0f 3219112 /lib64/libresolv-2.12.so
fff97960000-fff97970000 r--p 00010000 00:0f 3219112 /lib64/libresolv-2.12.so
fff97970000-fff97980000 rw-p 00020000 00:0f 3219112 /lib64/libresolv-2.12.so
fff97980000-fff97990000 r-xp 00000000 00:0f 822832 /lib64/libcrypt-2.12.so
fff97990000-fff979a0000 r--p 00000000 00:0f 822832 /lib64/libcrypt-2.12.so
fff979a0000-fff979b0000 rw-p 00010000 00:0f 822832 /lib64/libcrypt-2.12.so
fff979b0000-fff979d0000 rw-p 00000000 00:00 0
fff979d0000-fff979e0000 r-xp 00000000 00:0f 1327124 /lib64/libuuid.so.1.3.0
fff979e0000-fff979f0000 rw-p 00000000 00:0f 1327124 /lib64/libuuid.so.1.3.0
fff979f0000-fff97a10000 r-xp 00000000 00:0f 2029697 /lib64/libz.so.1.2.3
fff97a10000-fff97a20000 r--p 00010000 00:0f 2029697 /lib64/libz.so.1.2.3
fff97a20000-fff97a30000 rw-p 00020000 00:0f 2029697 /lib64/libz.so.1.2.3
fff97a30000-fff97c50000 r-xp 00000000 00:0f 2098005 /usr/lib64/libcrypto.so.1.0.0
fff97c50000-fff97c70000 r--p 00210000 00:0f 2098005 /usr/lib64/libcrypto.so.1.0.0
fff97c70000-fff97c90000 rw-p 00230000 00:0f 2098005 /usr/lib64/libcrypto.so.1.0.0
fff97c90000-fff97ca0000 rw-p 00000000 00:00 0
fff97ca0000-fff97ce0000 r-xp 00000000 00:0f 1250555 /lib64/libk5crypto.so.3.1
fff97ce0000-fff97cf0000 r--p 00030000 00:0f 1250555 /lib64/libk5crypto.so.3.1
fff97cf0000-fff97d00000 rw-p 00040000 00:0f 1250555 /lib64/libk5crypto.so.3.1
fff97d00000-fff97d10000 r-xp 00000000 00:0f 1327131 /lib64/libcom_err.so.2.1
fff97d10000-fff97d20000 r--p 00000000 00:0f 1327131 /lib64/libcom_err.so.2.1
fff97d20000-fff97d30000 rw-p 00010000 00:0f 1327131 /lib64/libcom_err.so.2.1
fff97d30000-fff97e40000 r-xp 00000000 00:0f 3350124 /lib64/libkrb5.so.3.3
fff97e40000-fff97e50000 r--p 00100000 00:0f 3350124 /lib64/libkrb5.so.3.3
fff97e50000-fff97e60000 rw-p 00110000 00:0f 3350124 /lib64/libkrb5.so.3.3
fff97e60000-fff97eb0000 r-xp 00000000 00:0f 1327132 /lib64/libgssapi_krb5.so.2.2
fff97eb0000-fff97ec0000 r--p 00040000 00:0f 1327132 /lib64/libgssapi_krb5.so.2.2
fff97ec0000-fff97ed0000 rw-p 00050000 00:0f 1327132 /lib64/libgssapi_krb5.so.2.2
fff97ed0000-fff980f0000 r-xp 00000000 00:0f 159290 /usr/lib64/libicui18n.so.42.1
fff980f0000-fff98120000 rw-p 00210000 00:0f 159290 /usr/lib64/libicui18n.so.42.1
fff98120000-fff982c0000 r-xp 00000000 00:0f 2142969 /usr/lib64/libicuuc.so.42.1
fff982c0000-fff982e0000 rw-p 001a0000 00:0f 2142969 /usr/lib64/libicuuc.so.42.1
fff982e0000-fff982f0000 r-xp 00000000 00:0f 1250562 /lib64/libdl-2.12.so
fff982f0000-fff98300000 r--p 00000000 00:0f 1250562 /lib64/libdl-2.12.so
fff98300000-fff98310000 rw-p 00010000 00:0f 1250562 /lib64/libdl-2.12.so
fff98310000-fff98350000 r-xp 00000000 00:0f 78455 /usr/lib64/libapr-1.so.0.3.9
fff98350000-fff98360000 rw-p 00040000 00:0f 78455 /usr/lib64/libapr-1.so.0.3.9
fff98360000-fff98530000 r-xp 00000000 00:0f 3221811 /lib64/libdb-4.7.so
fff98530000-fff98550000 rw-p 001d0000 00:0f 3221811 /lib64/libdb-4.7.so
fff98550000-fff98590000 r-xp 00000000 00:0f 3219122 /lib64/libexpat.so.1.5.2
fff98590000-fff985a0000 rw-p 00030000 00:0f 3219122 /lib64/libexpat.so.1.5.2
fff985a0000-fff985c0000 r-xp 00000000 00:0f 2029706 /lib64/liblber-2.4.so.2.5.6
fff985c0000-fff985d0000 r--p 00010000 00:0f 2029706 /lib64/liblber-2.4.so.2.5.6
fff985d0000-fff985e0000 rw-p 00020000 00:0f 2029706 /lib64/liblber-2.4.so.2.5.6
fff985e0000-fff98650000 r-xp 00000000 00:0f 1327127 /lib64/libldap-2.4.so.2.5.6
fff98650000-fff98660000 r--p 00060000 00:0f 1327127 /lib64/libldap-2.4.so.2.5.6
fff98660000-fff98670000 rw-p 00070000 00:0f 1327127 /lib64/libldap-2.4.so.2.5.6
fff98670000-fff986b0000 r-xp 00000000 00:0f 571721 /usr/lib64/libaprutil-1.so.0.3.9
fff986b0000-fff986c0000 rw-p 00030000 00:0f 571721 /usr/lib64/libaprutil-1.so.0.3.9
fff986c0000-fff986d0000 r-xp 00000000 00:0f 2632456 /lib64/librt-2.12.so
fff986d0000-fff986e0000 r--p 00000000 00:0f 2632456 /lib64/librt-2.12.so
fff986e0000-fff986f0000 rw-p 00010000 00:0f 2632456 /lib64/librt-2.12.so
fff986f0000-fff98760000 r-xp 00000000 00:0f 2078098 /usr/lib64/libssl.so.1.0.0
fff98760000-fff98770000 r--p 00060000 00:0f 2078098 /usr/lib64/libssl.so.1.0.0
fff98770000-fff98780000 rw-p 00070000 00:0f 2078098 /usr/lib64/libssl.so.1.0.0
fff98780000-fff988b0000 r-xp 00000000 00:0f 3397951 /usr/lib64/libboost_regex-mt.so.5
fff988b0000-fff988c0000 rw-p 00130000 00:0f 3397951 /usr/lib64/libboost_regex-mt.so.5
fff988c0000-fff988e0000 r-xp 00000000 00:0f 3135862 /usr/lib64/libboost_thread-mt.so.5
fff988e0000-fff988f0000 rw-p 00010000 00:0f 3135862 /usr/lib64/libboost_thread-mt.so.5
fff988f0000-fff98980000 r-xp 00000000 00:0f 784193 /usr/lib64/libboost_serialization-mt.so.5
fff98980000-fff98990000 rw-p 00080000 00:0f 784193 /usr/lib64/libboost_serialization-mt.so.5
fff98990000-fff989a0000 r-xp 00000000 00:0f 2912814 /usr/lib64/libmlx4-rdmav2.so
fff989a0000-fff989b0000 rw-p 00000000 00:0f 2912814 /usr/lib64/libmlx4-rdmav2.so
fff989b0000-fff989c0000 r-xp 00000000 00:0f 2213992 /usr/lib64/librdmacm.so.1.0.0
fff989c0000-fff989d0000 rw-p 00000000 00:0f 2213992 /usr/lib64/librdmacm.so.1.0.0
fff989d0000-fff989f0000 r-xp 00000000 00:0f 948700 /usr/lib64/libibverbs.so.1.0.0
fff989f0000-fff98a00000 rw-p 00010000 00:0f 948700 /usr/lib64/libibverbs.so.1.0.0
fff98a00000-fff98d60000 r-xp 00000000 00:0f 2989947 /lib/liblog4cxx.so.10.0.0
fff98d60000-fff98db0000 rw-p 00360000 00:0f 2989947 /lib/liblog4cxx.so.10.0.0
fff98db0000-fff98dc0000 rw-p 00000000 00:00 0
fff98dc0000-fff98f80000 r-xp 00000000 00:0f 76138 /lib64/libc-2.12.so
fff98f80000-fff98f90000 r--p 001b0000 00:0f 76138 /lib64/libc-2.12.so
fff98f90000-fff98fa0000 rw-p 001c0000 00:0f 76138 /lib64/libc-2.12.so
fff98fa0000-fff98fb0000 rw-p 00000000 00:00 0
fff98fb0000-fff98fd0000 r-xp 00000000 00:0f 3350121 /lib64/libpthread-2.12.so
fff98fd0000-fff98fe0000 r--p 00010000 00:0f 3350121 /lib64/libpthread-2.12.so
fff98fe0000-fff98ff0000 rw-p 00020000 00:0f 3350121 /lib64/libpthread-2.12.so
fff98ff0000-fff99010000 r-xp 00000000 00:0f 822834 /lib64/libgcc_s-4.4.6-20110824.so.1
fff99010000-fff99020000 rw-p 00010000 00:0f 822834 /lib64/libgcc_s-4.4.6-20110824.so.1
fff99020000-fff99100000 r-xp 00000000 00:0f 3219113 /lib64/libm-2.12.so
fff99100000-fff99110000 r--p 000d0000 00:0f 3219113 /lib64/libm-2.12.so
fff99110000-fff99120000 rw-p 000e0000 00:0f 3219113 /lib64/libm-2.12.so
fff99120000-fff99280000 r-xp 00000000 00:0f 2869896 /usr/lib64/libstdc++.so.6.0.13
fff99280000-fff99290000 r--p 00150000 00:0f 2869896 /usr/lib64/libstdc++.so.6.0.13
fff99290000-fff992a0000 rw-p 00160000 00:0f 2869896 /usr/lib64/libstdc++.so.6.0.13
fff992a0000-fff992c0000 rw-p 00000000 00:00 0
fff992c0000-fff99320000 r-xp 00000000 00:0f 1455532 /lib64/libbgcios.so.1.0.0
fff99320000-fff99330000 rw-p 00060000 00:0f 1455532 /lib64/libbgcios.so.1.0.0
fff99330000-fff99390000 r-xp 00000000 00:0f 2565847 /usr/lib64/libboost_program_options-mt.so.5
fff99390000-fff993a0000 rw-p 00060000 00:0f 2565847 /usr/lib64/libboost_program_options-mt.so.5
fff993a0000-fff993b0000 r-xp 00000000 00:0f 3263171 /usr/lib64/libboost_system-mt.so.5
fff993b0000-fff993c0000 rw-p 00000000 00:0f 3263171 /usr/lib64/libboost_system-mt.so.5
fff993c0000-fff99860000 r-xp 00000000 00:0f 2989944 /lib/libbgutility.so.1.0.0
fff99860000-fff99890000 rw-p 00490000 00:0f 2989944 /lib/libbgutility.so.1.0.0
fff99890000-fff998b0000 rw-p 00000000 00:00 0
fff998b0000-fff998d0000 r-xp 00000000 00:00 0 [vdso]
fff998d0000-fff99900000 r-xp 00000000 00:0f 2939827 /lib64/ld-2.12.so
fff99900000-fff99910000 r--p 00020000 00:0f 2939827 /lib64/ld-2.12.so
fff99910000-fff99920000 rw-p 00030000 00:0f 2939827 /lib64/ld-2.12.so
fffca310000-fffca460000 rw-p 00000000 00:00 0 [stack]
2012-05-03 14:02:13.391 (WARN ) [0xfffb35c8a40] EAS-20040-31371-128:34593:ibm.runjob.client.Job: terminated by signal 6
2012-05-03 14:02:13.392 (WARN ) [0xfffb35c8a40] EAS-20040-31371-128:34593:ibm.runjob.client.Job: abnormal termination by signal 6 from rank 0

--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhamm...@alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond (in-progress)
https://wiki.alcf.anl.gov/old/index.php/User:Jhammond (deprecated)
https://wiki-old.alcf.anl.gov/index.php/User:Jhammond (deprecated)
_____________________________________________
hwloc-users mailing list
hwloc-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users