That's what we needed to know - i.e., that setting num_sockets=1 generates an
error instead of segfaulting down the road. I can submit a CMR to do so.
thx!
On Feb 22, 2012, at 4:12 PM, Eugene Loh wrote:
> On 02/22/12 14:54, Ralph Castain wrote:
>> That doesn't really address the issue, though.
On 02/22/12 14:54, Ralph Castain wrote:
That doesn't really address the issue, though. What I want to know is:
what happens when you try to bind processes? What about
-bind-to-socket, and -persocket options? Etc. Reason I'm concerned:
I'm not sure what happens if the socket layer isn't present.
On 22/02/2012 20:24, Eugene Loh wrote:
> On 2/22/2012 11:08 AM, Ralph Castain wrote:
>> On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
>>> On 22/02/2012 17:48, Ralph Castain wrote:
On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>> ...
On Feb 22, 2012, at 12:24 PM, Eugene Loh wrote:
> On 2/22/2012 11:08 AM, Ralph Castain wrote:
>> On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
>>> On 22/02/2012 17:48, Ralph Castain wrote:
On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>>
On 2/22/2012 11:08 AM, Ralph Castain wrote:
On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
On 22/02/2012 17:48, Ralph Castain wrote:
On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
On 2/21/2012 10:31 PM, Eugene Loh wrote:
... "sockets" is unknown and hwloc returns 0 for num_sockets and O
On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
> On 22/02/2012 17:48, Ralph Castain wrote:
>> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>>
>>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
pukes on divide by
On 22/02/2012 17:48, Ralph Castain wrote:
> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>
>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
>>> pukes on divide by zero. OS info was listed in the original message
>>> (belo
On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI pukes
>> on divide by zero. OS info was listed in the original message (below).
>> Might we want to do something else? E.g., ass
Much simpler solution - on that platform, you should add "orte_num_sockets=1"
to your default mca param file. Problem solved. It's why that param exists, and
we added it specifically at Terry's request for an earlier, similar problem.
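For reference, a minimal sketch of what that entry might look like. The parameter name orte_num_sockets comes from this thread; the file locations in the comments are the usual Open MPI defaults and are an assumption about this particular install.

```
# System-wide defaults file (assumed location: <prefix>/etc/openmpi-mca-params.conf),
# or per-user in $HOME/.openmpi/mca-params.conf.
# Tell ORTE there is one socket when hwloc reports no socket level.
orte_num_sockets = 1
```

The same value could presumably also be set for a single run through the environment, e.g. OMPI_MCA_orte_num_sockets=1, following the OMPI_MCA_<name> convention used elsewhere in this thread.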
On Feb 22, 2012, at 8:55 AM, Brice Goglin wrote:
> Le 22/02
On 2/21/2012 10:31 PM, Eugene Loh wrote:
... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
pukes on divide by zero. OS info was listed in the original message
(below). Might we want to do something else? E.g., assume
num_sockets==1 when num_sockets==0 (if you know what I
On 22/02/2012 07:36, Eugene Loh wrote:
> On 2/21/2012 5:40 PM, Paul H. Hargrove wrote:
>> Here are the first of the results of the testing I promised.
>> I am not 100% sure how to reach the code that Eugene reported as
>> problematic,
> I don't think you're going to see it. Somehow, hwloc on th
On 2/21/2012 5:40 PM, Paul H. Hargrove wrote:
Here are the first of the results of the testing I promised.
I am not 100% sure how to reach the code that Eugene reported as
problematic,
I don't think you're going to see it. Somehow, hwloc on the config in
question thinks there is no socket leve
On 02/21/12 19:29, Jeffrey Squyres wrote:
What's the output of running lstopo from hwloc 1.3.2? (this is the version
that's in the OMPI trunk and v1.5 branches)
http://www.open-mpi.org/software/hwloc/v1.3/
Is there any difference from v1.4 hwloc?
http://www.open-mpi.org/software/hw
My build with the "2011_sp1.8.273" Intel compilers passes the same tests
as I detailed below for "2011_sp1.7.256".
I don't suspect any longer that the compiler is at fault, but am willing
to try additional/alternate tests to help confirm.
-Paul
On 2/21/2012 5:40 PM, Paul H. Hargrove wrote:
He
Here are the first of the results of the testing I promised.
I am not 100% sure how to reach the code that Eugene reported as
problematic, so I tried just running the ring test with various
-bind-to-* options. I am quite willing to run additional test cases.
All runs are w/ OMPI_MCA_btl=sm,s
I have been testing v1.5 with slightly older Intel
"composerxe-2011.5.220" compilers.
I see a "make check" failure in opal_datatype_test which is not present
with any other compiler (such as gcc on the same node).
This has been seen most recently on the 1.5.5rc2r25990 tarball generated
earlier t
What's the output of running lstopo from hwloc 1.3.2? (this is the version
that's in the OMPI trunk and v1.5 branches)
http://www.open-mpi.org/software/hwloc/v1.3/
Is there any difference from v1.4 hwloc?
http://www.open-mpi.org/software/hwloc/v1.4/
On Feb 21, 2012, at 7:20 PM, Eugen