Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Mike Dubman
Not good:

  /labhome/alexm/workspace/openmpi-1.6.1a1hge06c2f2a0859/inst/bin/mpirun --host h-qa-017,h-qa-017,h-qa-017,h-qa-017,h-qa-018,h-qa-018,h-qa-018,h-qa-018 -np 8 --bind-to-core -bynode -display-map /usr/mpi/gcc/mlnx-openmpi-1.6rc4/tests/osu_benchmarks-3.1.1/osu_alltoall

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Jeff Squyres
On May 30, 2012, at 5:05 AM, Mike Dubman wrote:
> Not good:

@#$%@#%@#!! But I guess this is why we test. :-(

> /labhome/alexm/workspace/openmpi-1.6.1a1hge06c2f2a0859/inst/bin/mpirun --host
> h-qa-017,h-qa-017,h-qa-017,h-qa-017,h-qa-018,h-qa-018,h-qa-018,h-qa-018 -np 8
> --bind-to-core -bynode

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Jeff Squyres
On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:

>> $hwloc-ls --of console
>> Machine (32GB)
>>   NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
>>     PU L#0 (P#0)
>>     PU L#1 (P#2)
>>   NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Mike Dubman
attached.

On Wed, May 30, 2012 at 2:32 PM, Jeff Squyres wrote:
> On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:
>
> >> $hwloc-ls --of console
> >> Machine (32GB)
> >>   NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
> >>     PU L#0 (P#0)
> >>

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Ralph Castain
Hmmm...well, from what I see, mpirun was actually giving you the right answer! I only see TWO cores on each node, yet you told it to bind FOUR processes on each node, each proc to be bound to a unique core. The error message was correct - there are not enough cores on those nodes to do what you
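
The check Ralph describes amounts to comparing the number of local processes against the number of cores the topology actually exposes. Below is a minimal sketch of that comparison, assuming the hwloc 1.x C API and hardcoding the 4 processes per node implied by the failing command; it is an editorial illustration, not Open MPI's code (file name and build line are assumptions).

  /* check_cores.c -- with --bind-to-core each local rank needs its own core,
   * so 4 ranks per node cannot be bound on a node that exposes only 2 cores.
   * Build (assumed): gcc check_cores.c -o check_cores -lhwloc
   */
  #include <stdio.h>
  #include <hwloc.h>

  int main(void)
  {
      const int procs_per_node = 4;   /* -np 8 mapped -bynode onto 2 hosts */

      hwloc_topology_t topo;
      hwloc_topology_init(&topo);     /* allocate a topology context */
      hwloc_topology_load(topo);      /* discover the local machine  */

      int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
      if (procs_per_node > ncores)
          fprintf(stderr, "cannot bind %d processes to unique cores: "
                  "only %d core(s) visible\n", procs_per_node, ncores);

      hwloc_topology_destroy(topo);
      return 0;
  }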

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Mike Dubman
Or, lstopo lies (I'm not using the latest hwloc, just the one that comes with the distro). The machine has two dual-core sockets, four physical cores in total:

  processor  : 0
  vendor_id  : GenuineIntel
  cpu family : 6
  model      : 45
  model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  stepping

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Brice Goglin
Your /proc/cpuinfo output (filtered below) looks like only two sockets (physical ids 0 and 1), with one core each (cpu cores=1, core id=0), with hyperthreading (siblings=2). So lstopo looks good. The E5-2650 is supposed to have 8 cores. I assume you use Linux cgroups/cpusets to restrict the available cores
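
Brice's arithmetic can be reproduced straight from /proc/cpuinfo: hyperthreading shows up whenever "siblings" (hardware threads per package) exceeds "cpu cores" (cores per package). A rough sketch of that check, assuming a homogeneous Linux machine; this is editorial, not code from the thread.

  /* cpuinfo_ht.c -- report HT status from the siblings / cpu cores fields. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(void)
  {
      FILE *f = fopen("/proc/cpuinfo", "r");
      if (!f) { perror("/proc/cpuinfo"); return 1; }

      char line[256];
      char *colon;
      int siblings = 0, cores = 0;

      /* The fields repeat for every logical CPU; on a homogeneous machine
       * keeping the last value seen is enough. */
      while (fgets(line, sizeof line, f)) {
          if ((colon = strchr(line, ':')) == NULL)
              continue;
          if (strncmp(line, "siblings", 8) == 0)
              siblings = atoi(colon + 1);
          else if (strncmp(line, "cpu cores", 9) == 0)
              cores = atoi(colon + 1);
      }
      fclose(f);

      printf("siblings=%d, cpu cores=%d -> hyperthreading %s\n",
             siblings, cores, siblings > cores ? "on" : "off");
      return 0;
  }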

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Mike Dubman
ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT

On Wed, May 30, 2012 at 4:36 PM, Brice Goglin wrote:
> Your /proc/cpuinfo output (filtered below) looks like only two sockets
> (physical ids 0 and 1), with one core each (cpu cores=1, core id=0), with
> hyperthreading (s

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Jeff Squyres
On May 30, 2012, at 9:47 AM, Mike Dubman wrote:
> ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT

OMPI 1.6.soon-to-be-1 should handle HT properly, meaning that it will bind to all the HTs in a core and/or socket. Are you using Linux cgroups/cpusets to restrict available cores
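
In hwloc terms, "bind to all the HTs in a core" means binding to a core object's cpuset, which spans every PU (hardware thread) in that core. Here is a minimal sketch of that idea using the public hwloc 1.x API, picking core 0 arbitrarily; it is not Open MPI's actual binding code.

  /* bind_to_core.c -- bind this process to one whole core (both HTs). */
  #include <stdio.h>
  #include <hwloc.h>

  int main(void)
  {
      hwloc_topology_t topo;
      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);

      /* Core 0 is an arbitrary choice; its cpuset covers all of its PUs
       * (hyperthreads) on an HT-enabled machine. */
      hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
      if (core == NULL) {
          fprintf(stderr, "no core objects in this topology\n");
          hwloc_topology_destroy(topo);
          return 1;
      }

      /* Bind the whole process to the core's cpuset. */
      if (hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_PROCESS) != 0)
          perror("hwloc_set_cpubind");

      hwloc_topology_destroy(topo);
      return 0;
  }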

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Mike Dubman
No cgroups or cpusets.

On Wed, May 30, 2012 at 4:59 PM, Jeff Squyres wrote:
> On May 30, 2012, at 9:47 AM, Mike Dubman wrote:
> > ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT
>
> OMPI 1.6.soon-to-be-1 should handle HT properly, meaning that it will bind
> to all the HTs in a core and/or socket.

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Brice Goglin
Something is preventing all cores from appearing. The BIOS? My E5-2650 processors definitely have 8 cores (without counting hyperthreads), as advertised by Intel. Brice

On 30/05/2012 19:58, Mike Dubman wrote:
> no cgroups or cpusets.
>
> On Wed, May 30, 2012 at 4:59 PM, Jeff Squyres

[OMPI devel] Open MPI services migration: part 1

2012-05-30 Thread Jeff Squyres
Be advised that I have received the following message from our Indiana U. hosting provider. SVN/Trac/OpenGrok will be unavailable during the time frame described below. More migrations will follow (e.g., web services) in the coming weeks; stay tuned. - We are planning to move all the serv

Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST

2012-05-30 Thread Jeff Squyres
Ok, so I'm viewing this as a hardware/BIOS/something-else failure; it doesn't indicate one way or the other whether the new OMPI 1.6 affinity code is working. I would still very much like to see other people's testing results.

On May 30, 2012, at 2:02 PM, Brice Goglin wrote:
> Something is

Re: [OMPI devel] Compile-time MPI_Datatype checking

2012-05-30 Thread Jeff Squyres
I've reviewed the patch. Good stuff! I annotated your patch file -- see attached for my comments. Search for "*** JMS" (my initials).

On May 29, 2012, at 11:08 AM, Dmitri Gribenko wrote:
> Hello,
>
> I've implemented a patch for clang that enables compile-time checking
> of arguments to functions

Re: [OMPI devel] Compile-time MPI_Datatype checking

2012-05-30 Thread Dmitri Gribenko
Hi Jeff,

On Thu, May 31, 2012 at 12:57 AM, Jeff Squyres wrote:
> I've reviewed the patch. Good stuff!

Thank you very much for the review. Answers to comments below. Updated patch attached.

*** JMS What do the 3-argument forms of type_tag_for_datatype() do? They aren't described in ht
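
As a toy, self-contained illustration of the kind of checking the patch enables: the attribute spellings below follow clang's type-safety documentation that grew out of this work (type_tag_for_datatype, pointer_with_type_tag, diagnosed under -Wtype-safety), and the MPI-like declarations are made up for the example rather than taken from Open MPI's real headers. The optional third attribute argument shown for the NULL datatype is one of the 3-argument forms asked about above.

  /* toy_mpi.c -- compile with something like: clang -c -Wtype-safety toy_mpi.c */
  struct toy_datatype { int id; };
  typedef struct toy_datatype *MPI_Datatype;

  /* Two-argument form: tie a datatype handle to the C type it describes. */
  static struct toy_datatype toy_int
      __attribute__(( type_tag_for_datatype(mpi, int) ));
  static struct toy_datatype toy_double
      __attribute__(( type_tag_for_datatype(mpi, double) ));
  #define MPI_INT    (&toy_int)
  #define MPI_DOUBLE (&toy_double)

  /* Three-argument form: the extra flag handles special tags, e.g. one that
   * only accepts a NULL buffer pointer. */
  static struct toy_datatype toy_datatype_null
      __attribute__(( type_tag_for_datatype(mpi, void, must_be_null) ));
  #define MPI_DATATYPE_NULL (&toy_datatype_null)

  /* Argument 1 is the buffer, argument 3 is the type tag it must match. */
  int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag)
      __attribute__(( pointer_with_type_tag(mpi, 1, 3) ));

  void example(void)
  {
      double x = 3.14;
      MPI_Send(&x, 1, MPI_DOUBLE, 0, 0);  /* fine: buffer type matches tag  */
      MPI_Send(&x, 1, MPI_INT, 0, 0);     /* -Wtype-safety warns: mismatch  */
  }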