On 02/21/12 19:29, Jeffrey Squyres wrote:
What's the output of running lstopo from hwloc 1.3.2? (this is the version
that's in the OMPI trunk and v1.5 branches)
http://www.open-mpi.org/software/hwloc/v1.3/
Is there any difference from v1.4 hwloc?
http://www.open-mpi.org/software/hw
On 2/21/2012 5:40 PM, Paul H. Hargrove wrote:
> Here are the first of the results of the testing I promised.
> I am not 100% sure how to reach the code that Eugene reported as
> problematic,
I don't think you're going to see it. Somehow, hwloc on the config in
question thinks there is no socket leve
More notes:
I've tested ompi-1.5.4 and it has the same problem. So, this is NOT a
regression.
Terry D. has observed that Ubuntu is NOT a supported platform for the
Solaris Studio compilers.
So, I've reproduced on a Scientific Linux 5.5 platform (Red Hat
Enterprise Linux 5.5 clone, like Cent
On 22/02/2012 07:36, Eugene Loh wrote:
> On 2/21/2012 5:40 PM, Paul H. Hargrove wrote:
>> Here are the first of the results of the testing I promised.
>> I am not 100% sure how to reach the code that Eugene reported as
>> problematic,
> I don't think you're going to see it. Somehow, hwloc on th
On 2/21/2012 10:31 PM, Eugene Loh wrote:
> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
> pukes on divide by zero. OS info was listed in the original message
> (below). Might we want to do something else? E.g., assume
> num_sockets==1 when num_sockets==0 (if you know what I
Much simpler solution - on that platform, you should add "orte_num_sockets=1"
to your default mca param file. Problem solved. It's why that param exists, and
we added it specifically at Terry's request for an earlier, similar problem.
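A minimal sketch of the fallback floated earlier in the thread: treat an undetected socket count of 0 as 1 so that later divisions are well defined. The helper names `effective_num_sockets` and `ranks_per_socket` are hypothetical and are not Open MPI or hwloc functions. The actual OMPI code paths may differ, and the workaround named above remains the orte_num_sockets MCA parameter.

```c
#include <assert.h>

/* Hypothetical helper (not an OMPI/hwloc API): if the topology layer
 * reports no socket level (count <= 0), assume a single socket. */
static int effective_num_sockets(int detected_sockets)
{
    return detected_sockets > 0 ? detected_sockets : 1;
}

/* Hypothetical helper: ranks to place on each socket, rounded up.
 * The division is safe because the divisor is always at least 1. */
static int ranks_per_socket(int num_ranks, int detected_sockets)
{
    assert(num_ranks >= 0);
    int sockets = effective_num_sockets(detected_sockets);
    return (num_ranks + sockets - 1) / sockets;
}
```

With these helpers, `ranks_per_socket(8, 0)` falls back to one socket and returns 8 instead of dividing by zero. Whether silently assuming one socket is the right policy for options such as -bind-to-socket is the open question in this thread.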
On Feb 22, 2012, at 8:55 AM, Brice Goglin wrote:
> On 22/02
On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI pukes
>> on divide by zero. OS info was listed in the original message (below).
>> Might we want to do something else? E.g., ass
On 22/02/2012 17:48, Ralph Castain wrote:
> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>
>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
>>> pukes on divide by zero. OS info was listed in the original message
>>> (belo
On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
> On 22/02/2012 17:48, Ralph Castain wrote:
>> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>>
>>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and OMPI
>>>> pukes on divide by
On 2/22/2012 11:08 AM, Ralph Castain wrote:
> On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
>> On 22/02/2012 17:48, Ralph Castain wrote:
>>> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>>>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>>>> ... "sockets" is unknown and hwloc returns 0 for num_sockets and O
On Feb 22, 2012, at 12:24 PM, Eugene Loh wrote:
> On 2/22/2012 11:08 AM, Ralph Castain wrote:
>> On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
>>> On 22/02/2012 17:48, Ralph Castain wrote:
>>>> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>>>>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>>>
I think I have the beginning of a fix for this issue.
I had not even noticed earlier that the error in event.h is from the C++
compiler, when compiling file.cxx in the c++ bindings. That makes the
vendor-specific addition of "-library=stlport4" to CXXFLAGS quite
relevant to the problem/soluti
On 22/02/2012 20:24, Eugene Loh wrote:
> On 2/22/2012 11:08 AM, Ralph Castain wrote:
>> On Feb 22, 2012, at 11:59 AM, Brice Goglin wrote:
>>> On 22/02/2012 17:48, Ralph Castain wrote:
>>>> On Feb 22, 2012, at 9:39 AM, Eugene Loh wrote:
>>>>> On 2/21/2012 10:31 PM, Eugene Loh wrote:
>> ...
Please verify this list of supported systems for the v1.5.5 release:
- The run-time systems that are currently supported are:
  - rsh / ssh
  - LoadLeveler
  - PBS Pro, Open PBS, Torque
  - Platform LSF (v7.0.2 and later)
  - SLURM
  - Cray XT-3, XT-4, and XT-5
  - Oracle Grid Engine (OGE) 6.1, 6.
On 02/22/12 14:54, Ralph Castain wrote:
> That doesn't really address the issue, though. What I want to know is:
> what happens when you try to bind processes? What about
> -bind-to-socket, and -persocket options? Etc. Reason I'm concerned:
> I'm not sure what happens if the socket layer isn't present.
That's what we needed to know - i.e., that setting num_sockets=1 generates an
error instead of segfaulting down the road. I can submit a CMR to do so.
thx!
On Feb 22, 2012, at 4:12 PM, Eugene Loh wrote:
> On 02/22/12 14:54, Ralph Castain wrote:
>> That doesn't really address the issue, though.
Terry / Eugene --
Can you comment?
On Feb 22, 2012, at 3:16 PM, Paul H. Hargrove wrote:
> I think I have the beginning of a fix for this issue.
>
> I had not even noticed earlier that the error in event.h is from the C++
> compiler, when compiling file.cxx in the c++ bindings. That makes the
Folks at Oracle should decide, but I suspect "Solaris 10" should be
updated to "Solaris 10 and 11", or just "11".
-Paul
On 2/22/2012 2:44 PM, Jeffrey Squyres wrote:
> Please verify this list of supported systems for the v1.5.5 release:
> - The run-time systems that are currently supported are:
Paul,
Haven't you been running Intel compilers on OS X?
Also, do we have specifics about which gcc's on Mac OS X? I have (OS
X 10.5.8):
savaii:~ baker$ ls -l /usr/bin/gcc*
lrwxr-xr-x 1 root wheel 7 Oct 2 2009 /usr/bin/gcc -> gcc-4.0
-r-xr-xr-x 1 root wheel 258368 Feb 19 2008
I have NOT been running Intel's compilers on Macs, only on Linux.
I *tried* PGI's compilers on MacOS, but that was a flop.
I have used Clang (comes w/ Xcode 4.2) on MacOS, and that works for me
but is not extensively tested.
-Paul
On 2/22/2012 6:13 PM, Larry Baker wrote:
> Paul,
> Haven't you be
I can get exact info from my MacOS 10.7 machine later, but its gcc is
llvm-gcc-4.2 IIRC.
Here are my 10.5 and 10.6:
ProductName: Mac OS X
ProductVersion: 10.5.8
BuildVersion: 9L31a
powerpc
lrwxr-xr-x 1 root wheel 7 Nov 1 2008 /usr/bin/gcc -> gcc-4.0
-r-xr-xr-x 1 root wheel 2583