Ok, so I'm viewing this as a hardware/BIOS/something-else failure; it doesn't
indicate one way or the other whether the new OMPI 1.6 affinity code is working.
I would still very much like to see other people's testing results.
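For anyone able to test: a run along these lines is what I have in mind; --report-bindings should print where each rank actually lands (the executable name is a placeholder, and adjust -np to your machine):

mpirun -np 4 --bind-to-core --report-bindings ./your_mpi_app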
On May 30, 2012, at 2:02 PM, Brice Goglin wrote:
Something is preventing all cores from appearing. The BIOS?
My E5-2650 processors definitely have 8 cores (without counting
hyperthreads) as advertised by Intel.
Brice
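(To cross-check what lstopo reports without eyeballing /proc/cpuinfo, a minimal sketch against the hwloc C API - hwloc 1.x object names, compile with -lhwloc:)

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;

    hwloc_topology_init(&topo);  /* create an empty topology context */
    hwloc_topology_load(topo);   /* probe the machine we run on */

    printf("sockets: %d\n",
           hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET));
    printf("cores:   %d\n",
           hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE));
    printf("threads: %d\n",
           hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU));

    hwloc_topology_destroy(topo);
    return 0;
}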
On 30/05/2012 19:58, Mike Dubman wrote:
no cgroups or cpusets.
On Wed, May 30, 2012 at 4:59 PM, Jeff Squyres wrote:
> On May 30, 2012, at 9:47 AM, Mike Dubman wrote:
>
> > ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT
>
> OMPI 1.6.soon-to-be-1 should handle HT properly, meaning that it
ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT
On Wed, May 30, 2012 at 4:36 PM, Brice Goglin wrote:
Your /proc/cpuinfo output (filtered below) looks like only two sockets
(physical ids 0 and 1), with one core each (cpu cores=1, core id=0),
with hyperthreading (siblings=2). So lstopo looks good.
E5-2650 is supposed to have 8 cores. I assume you use Linux
cgroups/cpusets to restrict the available cores?
Or lstopo lies (I'm not using the latest hwloc, but the one that comes with
the distro).
The machine has two dual-core sockets, 4 physical cores in total:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
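(As a cross-check, a rough sketch in C - not a tool from this thread - that counts what the kernel exposes: each "processor" stanza is one hardware thread, and each unique (physical id, core id) pair is one physical core; if "siblings" is twice "cpu cores", HT is enabled:)

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    char line[256];
    int phys = 0, core, threads = 0, ncores = 0;
    int pairs[256][2];

    if (!f) { perror("/proc/cpuinfo"); return 1; }

    while (fgets(line, sizeof line, f)) {
        int v, i, dup = 0;
        if (sscanf(line, "processor : %d", &v) == 1)
            threads++;                      /* one stanza per hw thread */
        else if (sscanf(line, "physical id : %d", &v) == 1)
            phys = v;                       /* socket of current stanza */
        else if (sscanf(line, "core id : %d", &core) == 1) {
            for (i = 0; i < ncores; i++)    /* already seen this core? */
                if (pairs[i][0] == phys && pairs[i][1] == core)
                    dup = 1;
            if (!dup && ncores < 256) {
                pairs[ncores][0] = phys;
                pairs[ncores][1] = core;
                ncores++;
            }
        }
    }
    fclose(f);
    printf("hw threads: %d, physical cores: %d\n", threads, ncores);
    return 0;
}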
Hmmm... well, from what I see, mpirun was actually giving you the right answer!
I only see TWO cores on each node, yet you told it to bind FOUR processes on
each node, each proc bound to a unique core.
The error message was correct - there are not enough cores on those nodes to do
what you asked.
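(A sketch of the constraint mpirun is enforcing here - not OMPI's actual code; the program below just compares requested procs against detected cores and, when feasible, binds itself to core 0:)

#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>

int main(int argc, char **argv)
{
    int nprocs = (argc > 1) ? atoi(argv[1]) : 4; /* local ranks requested */
    hwloc_topology_t topo;
    int ncores;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);
    ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);

    if (nprocs > ncores) {
        /* exactly the situation above: 4 procs, only 2 visible cores */
        fprintf(stderr, "cannot give %d procs a unique core out of %d\n",
                nprocs, ncores);
        hwloc_topology_destroy(topo);
        return 1;
    }
    /* otherwise binding is possible, e.g. bind ourselves to core 0 */
    hwloc_set_cpubind(topo,
                      hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0)->cpuset,
                      0);
    hwloc_topology_destroy(topo);
    return 0;
}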
On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:
>> $hwloc-ls --of console
>> Machine (32GB)
>>   NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
>>     PU L#0 (P#0)
>>     PU L#1 (P#2)
>>   NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2
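(To make the pairing explicit - PU P#0 and PU P#2 sharing Core L#0 are hyperthread siblings - a small sketch using the hwloc C API's ancestor helper:)

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t pu, core;
    int i, npus;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    npus = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);
    for (i = 0; i < npus; i++) {
        pu = hwloc_get_obj_by_type(topo, HWLOC_OBJ_PU, i);
        /* find the core this hardware thread belongs to */
        core = hwloc_get_ancestor_obj_by_type(topo, HWLOC_OBJ_CORE, pu);
        printf("PU P#%u -> Core L#%u\n",
               pu->os_index, core ? core->logical_index : 0);
    }
    hwloc_topology_destroy(topo);
    return 0;
}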
On May 30, 2012, at 5:05 AM, Mike Dubman wrote:
> Not good:
@#$%@#%@#!! But I guess this is why we test. :-(
Not good:
/labhome/alexm/workspace/openmpi-1.6.1a1hge06c2f2a0859/inst/bin/mpirun
--host
h-qa-017,h-qa-017,h-qa-017,h-qa-017,h-qa-018,h-qa-018,h-qa-018,h-qa-018 -np
8 --bind-to-core -bynode -display-map
/usr/mpi/gcc/mlnx-openmpi-1.6rc4/tests/osu_benchmarks-3.1.1/osu_alltoall
Per ticket #3108, there were still some unfortunate bugs in the affinity code
in 1.6. :-(
These have now been fixed. ...but since this is the 2nd or 3rd time we have
"fixed" the 1.5/1.6 series w.r.t. processor affinity, I'd really like people to
test this stuff before it's committed and we ship