Brice,

I confirm that your patch solves the issue I reported earlier for OMPI. I did not try it on a standalone HWLOC, so I am not sure that it maintains the coherency of the output. If you want, I can give it a try.

Thanks,
George.

On Thu, Sep 10, 2015 at 6:08 PM, Brice Goglin <brice.gog...@inria.fr> wrote:

> Try this patch (it applies to hwloc v1.9-v1.11, it should be OK against
> OMPI's tree).
> Your bridge 22:00.0 says it contains the master bus 00. It causes a cycle
> in hwloc's insert algorithm, caught by the assertion. The patch just
> removes this invalid bridge entirely.
>
> Brice
>
>
>
> On 10/09/2015 21:23, George Bosilca wrote:
>
> It used to work. Now I don't know exactly when I last updated the trunk
> version on the cluster, but not more than 10 days ago.
>
> lstopo complains with the same assert. Interestingly enough, the same
> binary succeeds on the other nodes of the same cluster ...
>
> George.
>
>
> On Thu, Sep 10, 2015 at 3:20 PM, Brice Goglin <brice.gog...@inria.fr>
> wrote:
>
>> Did it work on the same machine before? Or did OMPI enable hwloc's PCI
>> discovery recently?
>>
>> Does lstopo complain the same?
>>
>> Brice
>>
>>
>>
>> On 10/09/2015 21:10, George Bosilca wrote:
>>
>> With the current trunk version I keep getting an assert deep down in
>> orted.
>>
>> orted: ../../../../../../../ompi/opal/mca/hwloc/hwloc1110/hwloc/src/pci-common.c:177: hwloc_pci_try_insert_siblings_below_new_bridge: Assertion `comp != HWLOC_PCI_BUSID_SUPERSET' failed.
>>
>> The stack looks like this:
>>
>> [dancer18:21100] *** Process received signal ***
>> [dancer18:21100] Signal: Aborted (6)
>> [dancer18:21100] Signal code: (-6)
>> [dancer18:21100] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7fc22ce61710]
>> [dancer18:21100] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x7fc22caf0625]
>> [dancer18:21100] [ 2] /lib64/libc.so.6(abort+0x175)[0x7fc22caf1e05]
>> [dancer18:21100] [ 3] /lib64/libc.so.6(+0x2b74e)[0x7fc22cae974e]
>> [dancer18:21100] [ 4] /lib64/libc.so.6(__assert_perror_fail+0x0)[0x7fc22cae9810]
>> [dancer18:21100] [ 5] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(+0xb0a62)[0x7fc22ddc6a62]
>> [dancer18:21100] [ 6] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(+0xb0b60)[0x7fc22ddc6b60]
>> [dancer18:21100] [ 7] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(opal_hwloc1110_hwloc_insert_pci_device_list+0x8f)[0x7fc22ddc724c]
>> [dancer18:21100] [ 8] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(+0xbf2d6)[0x7fc22ddd52d6]
>> [dancer18:21100] [ 9] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(+0xd22f7)[0x7fc22dde82f7]
>> [dancer18:21100] [10] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(opal_hwloc1110_hwloc_topology_load+0x1a3)[0x7fc22dde8ee1]
>> [dancer18:21100] [11] /home/bosilca/opt/trunk/debug/lib/libopen-pal.so.0(opal_hwloc_base_get_topology+0x80)[0x7fc22ddb6ece]
>> [dancer18:21100] [12] /home/bosilca/opt/trunk/debug/lib/libopen-rte.so.0(orte_ess_base_orted_setup+0x127)[0x7fc22e0b3523]
>> [dancer18:21100] [13] /home/bosilca/opt/trunk/debug/lib/openmpi/mca_ess_env.so(+0xe45)[0x7fc22c6bbe45]
>> [dancer18:21100] [14] /home/bosilca/opt/trunk/debug/lib/libopen-rte.so.0(orte_init+0x2c6)[0x7fc22e06b55a]
>> [dancer18:21100] [15] /home/bosilca/opt/trunk/debug/lib/libopen-rte.so.0(orte_daemon+0x5c1)[0x7fc22e09a895]
>> [dancer18:21100] [16] /home/bosilca/opt/trunk/debug/bin/orted[0x40082a]
>> [dancer18:21100] [17] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fc22cadcd5d]
>> [dancer18:21100] [18] /home/bosilca/opt/trunk/debug/bin/orted[0x4006e9]
>>
>> Any ideas?
>>
>> George.
>>
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/09/17993.php
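
For readers following the diagnosis in this thread (a bridge that claims the master bus 00 below it creates a cycle during insertion, and the patch removes that bridge), here is a minimal sketch of the idea. It is not the actual hwloc patch and does not use hwloc's data structures: `PciObj`, `is_invalid_bridge`, `drop_invalid_bridges`, the `root_bus` parameter and the bus ranges in the example are all hypothetical, chosen only to illustrate why such a bridge cannot be placed in a tree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PciObj:
    """Hypothetical, simplified PCI object (not hwloc's real structure)."""
    bus: int
    dev: int
    func: int
    # (secondary_bus, subordinate_bus) for a bridge, None for a plain device
    downstream: Optional[Tuple[int, int]] = None

    def busid(self) -> str:
        return f"{self.bus:02x}:{self.dev:02x}.{self.func:x}"


def is_invalid_bridge(obj: PciObj, root_bus: int = 0) -> bool:
    """Return True if a bridge claims a bus range that cannot sit below it."""
    if obj.downstream is None:
        return False  # plain devices have no downstream range to validate
    secondary, subordinate = obj.downstream
    if secondary > subordinate:
        return True  # reversed range
    # Assumption of this sketch: the range must contain neither the root
    # (master) bus nor the bridge's own bus. Otherwise the bridge would
    # end up as an ancestor of itself, which is a cycle, not a tree.
    return any(secondary <= b <= subordinate for b in (root_bus, obj.bus))


def drop_invalid_bridges(objs: List[PciObj]) -> List[PciObj]:
    """Filter out invalid bridges before building the tree."""
    return [o for o in objs if not is_invalid_bridge(o)]


if __name__ == "__main__":
    devices = [
        PciObj(0x00, 0x03, 0, downstream=(0x20, 0x2F)),  # sane bridge
        PciObj(0x22, 0x00, 0, downstream=(0x00, 0x22)),  # made-up bad range
        PciObj(0x21, 0x00, 0),                           # ordinary device
    ]
    for o in drop_invalid_bridges(devices):
        print(o.busid())
```

Dropping the offending bridge before insertion, rather than trying to repair its range, mirrors the approach described above: the remaining objects can still be inserted, at the cost of losing that one bridge from the output.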