Re: [hwloc-devel] xml file load incompatibilities

2013-09-21 Thread Ralph Castain
Okay, I found it - it was a sequencing problem in OMPI itself (we "set" the new 
topology too late in the setup sequence). Sorry for the false alarm.

Thanks for the help!
Ralph
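
For readers hitting the same symptom: the root cause reported above was an ordering bug, where code used the topology before it had been set. The following is a minimal sketch of one way to turn that kind of bug into a clean error rather than a segfault. It does not use hwloc or OMPI code; the type `demo_topology` and the functions `demo_set_topology` and `demo_get_nb_levels` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a real topology handle type. */
typedef struct demo_topology {
    int nb_levels;
} demo_topology;

/* Process-wide handle; stays NULL until demo_set_topology() has run. */
static demo_topology *g_topology = NULL;

/* Install the topology. Must be called before any code that queries it. */
void demo_set_topology(demo_topology *topo)
{
    assert(topo != NULL); /* installing "no topology" is a caller bug */
    g_topology = topo;
}

/*
 * Store the number of levels in *out and return 0, or return -1 if the
 * topology has not been set yet (or out is NULL). The early return reports
 * the ordering bug instead of dereferencing an unset pointer.
 */
int demo_get_nb_levels(int *out)
{
    if (g_topology == NULL || out == NULL)
        return -1;
    *out = g_topology->nb_levels;
    return 0;
}
```

A caller that runs too early in the setup sequence then gets -1 back and can print a diagnostic, which is much easier to track down than a corrupted backtrace.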

 ___
 hwloc-devel mailing list
 hwloc-de...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel

Re: [hwloc-devel] xml file load incompatibilities

2013-09-21 Thread Brice Goglin
Strange, the backtrace below looks totally crazy; I don't see how the debug
checks could still pass in that case.
Any chance you could run that under valgrind?

Brice




Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Ralph Castain
Hmmm...nope, not a peep (no extra output at all). Just segfaulted like before.




Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Brice Goglin
Try adding HWLOC_DEBUG_CHECK=1 to your environment; it will enable many
assertions at the end of hwloc_topology_load().

Brice
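
In a shell, the variable can simply be exported before launching the program. When the launcher's environment is hard to control, a sketch like the one below sets it from inside the process instead. It assumes, as the message above describes, that hwloc reads HWLOC_DEBUG_CHECK while hwloc_topology_load() runs, so the helper must be called before that. The function name `demo_enable_debug_check` is hypothetical and not part of hwloc.

```c
#define _POSIX_C_SOURCE 200112L /* exposes setenv() on strict compilers */
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of hwloc): export HWLOC_DEBUG_CHECK=1
 * into the current process so that a later hwloc_topology_load() sees it.
 * Returns 0 on success, -1 if the environment could not be modified.
 */
int demo_enable_debug_check(void)
{
    if (setenv("HWLOC_DEBUG_CHECK", "1", 1 /* overwrite any old value */) != 0)
        return -1;
    /* Sanity check: the variable must now be visible to getenv(). */
    assert(getenv("HWLOC_DEBUG_CHECK") != NULL);
    return 0;
}
```

The same pattern would apply to other hwloc environment variables, provided they are set before the library first consults them.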






Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Ralph Castain
I didn't try loading it with lstopo - I just tried the OMPI trunk. It loads okay, 
but segfaults when you try to find an object by depth:

#0  0x0001005fe5dc in opal_hwloc172_hwloc_get_obj_by_depth (topology=Cannot access memory at address 0xfff7
) at traversal.c:623
#1  0x000100b6dfaa in opal_hwloc172_hwloc_get_root_obj (topology=Cannot access memory at address 0xfff7
) at rmaps_rr_mappers.c:747
#2  0x000100b6e139 in orte_rmaps_rr_byslot (jdata=Cannot access memory at address 0xff77
) at rmaps_rr_mappers.c:774
#3  0x000100b6d6da in orte_rmaps_rr_map (jdata=Cannot access memory at address 0xff17
) at rmaps_rr.c:211
#4  0x000100353098 in orte_rmaps_base_map_job (fd=Cannot access memory at address 0xfe7b
) at base/rmaps_base_map_job.c:320
#5  0x0001005ce28c in event_process_active_single_queue (base=Cannot access memory at address 0xffe7
) at event.c:1367
#6  0x0001005ce500 in event_process_active (base=Cannot access memory at address 0xffe7
) at event.c:1437
#7  0x0001005ceb71 in opal_libevent2021_event_base_loop (base=Cannot access memory at address 0xffb7
) at event.c:1645
#8  0x0001002c5158 in orterun (argc=Cannot access memory at address 0xfd1b
) at orterun.c:3039
#9  0x0001002c32a4 in main (argc=Cannot access memory at address 0xfffb
) at main.c:14

Looks to me like memory may be getting hosed





Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Brice Goglin
I can't see any segfault. Where does the segfault occur for you? In
OMPI only (or in lstopo too)? When loading or when using the topology?

I tried lstopo on that file with and without HWLOC_NO_LIBXML_IMPORT=1
(in case the bug is in one of the XML backends), and it looks OK.

Brice








Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Ralph Castain
Here are the two files I tried - not from the same machine. foo.xml works; 
topo.xml segfaults.




topo.xml
Description: XML document


foo.xml
Description: XML document


One of our users reported it from their machine, but I don't have their topo 
file.




Re: [hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Brice Goglin
Hello,
I don't see any reason for such an incompatibility. But there are
many combinations, and we can't test everything.
I can't reproduce that on my machines. Can you send the XML output of
both versions on one of your machines?
Brice






[hwloc-devel] xml file load incompatibilities

2013-09-20 Thread Ralph Castain
Hi folks

I've run across some rather strange behavior. We have two branches in OMPI - the 
devel trunk (using hwloc v1.7.2) and our feature release series (using hwloc 
1.5.2). I have found the following:

* the feature series can correctly load an XML file generated by lstopo of 
versions 1.5 or greater

* the devel series can correctly load an XML file generated by lstopo of 
versions 1.7 or greater, but not files generated by prior versions. In the 
latter case, I segfault as soon as I try to use the loaded topology.

Any idea why there is a discrepancy? Can I at least detect the version used to 
create a file when loading it, so I can error out instead of segfaulting?

Ralph