On all but the 2 machines with the newer BIOS (just the first socket):

mach1:~ # lstopo -p --of console
  NUMANode P#0 (12GB) + L3 (5118KB)
    L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
    L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
    L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
    L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12
    L2 (512KB) + L1 (64KB) + Core P#4 + PU P#16
    L2 (512KB) + L1 (64KB) + Core P#5 + PU P#20
  NUMANode P#1 (12GB) + L3 (5118KB)
    L2 (512KB) + L1 (64KB) + Core P#0 + PU P#24
    L2 (512KB) + L1 (64KB) + Core P#1 + PU P#28
    L2 (512KB) + L1 (64KB) + Core P#2 + PU P#32
    L2 (512KB) + L1 (64KB) + Core P#3 + PU P#36
    L2 (512KB) + L1 (64KB) + Core P#4 + PU P#40
    L2 (512KB) + L1 (64KB) + Core P#5 + PU P#44

mach1:~ # lstopo -l --of console
  NUMANode L#0 (P#0 12GB) + L3 L#0 (5118KB)
    L2 L#0 (512KB) + L1 L#0 (64KB) + Core L#0 + PU L#0 (P#0)
    L2 L#1 (512KB) + L1 L#1 (64KB) + Core L#1 + PU L#1 (P#4)
    L2 L#2 (512KB) + L1 L#2 (64KB) + Core L#2 + PU L#2 (P#8)
    L2 L#3 (512KB) + L1 L#3 (64KB) + Core L#3 + PU L#3 (P#12)
    L2 L#4 (512KB) + L1 L#4 (64KB) + Core L#4 + PU L#4 (P#16)
    L2 L#5 (512KB) + L1 L#5 (64KB) + Core L#5 + PU L#5 (P#20)
  NUMANode L#1 (P#1 12GB) + L3 L#1 (5118KB)
    L2 L#6 (512KB) + L1 L#6 (64KB) + Core L#6 + PU L#6 (P#24)
    L2 L#7 (512KB) + L1 L#7 (64KB) + Core L#7 + PU L#7 (P#28)
    L2 L#8 (512KB) + L1 L#8 (64KB) + Core L#8 + PU L#8 (P#32)
    L2 L#9 (512KB) + L1 L#9 (64KB) + Core L#9 + PU L#9 (P#36)
    L2 L#10 (512KB) + L1 L#10 (64KB) + Core L#10 + PU L#10 (P#40)
    L2 L#11 (512KB) + L1 L#11 (64KB) + Core L#11 + PU L#11 (P#44)

Now for the 2 with the BIOS update:

mach2:~ # lstopo -p --of console
  NUMANode P#0 (12GB) + L3 (5118KB)
    L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
    L2 (512KB) + L1 (64KB) + Core P#1 + PU P#1
    L2 (512KB) + L1 (64KB) + Core P#2 + PU P#2
    L2 (512KB) + L1 (64KB) + Core P#3 + PU P#3
    L2 (512KB) + L1 (64KB) + Core P#4 + PU P#4
    L2 (512KB) + L1 (64KB) + Core P#5 + PU P#5
  NUMANode P#1 (12GB) + L3 (5118KB)
    L2 (512KB) + L1 (64KB) + Core P#0 + PU P#6
    L2 (512KB) + L1 (64KB) + Core P#1 + PU P#7
    L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
    L2 (512KB) + L1 (64KB) + Core P#3 + PU P#9
    L2 (512KB) + L1 (64KB) + Core P#4 + PU P#10
    L2 (512KB) + L1 (64KB) + Core P#5 + PU P#11

mach2:~ # lstopo -l --of console
  NUMANode L#0 (P#0 12GB) + L3 L#0 (5118KB)
    L2 L#0 (512KB) + L1 L#0 (64KB) + Core L#0 + PU L#0 (P#0)
    L2 L#1 (512KB) + L1 L#1 (64KB) + Core L#1 + PU L#1 (P#1)
    L2 L#2 (512KB) + L1 L#2 (64KB) + Core L#2 + PU L#2 (P#2)
    L2 L#3 (512KB) + L1 L#3 (64KB) + Core L#3 + PU L#3 (P#3)
    L2 L#4 (512KB) + L1 L#4 (64KB) + Core L#4 + PU L#4 (P#4)
    L2 L#5 (512KB) + L1 L#5 (64KB) + Core L#5 + PU L#5 (P#5)
  NUMANode L#1 (P#1 12GB) + L3 L#1 (5118KB)
    L2 L#6 (512KB) + L1 L#6 (64KB) + Core L#6 + PU L#6 (P#6)
    L2 L#7 (512KB) + L1 L#7 (64KB) + Core L#7 + PU L#7 (P#7)
    L2 L#8 (512KB) + L1 L#8 (64KB) + Core L#8 + PU L#8 (P#8)
    L2 L#9 (512KB) + L1 L#9 (64KB) + Core L#9 + PU L#9 (P#9)
    L2 L#10 (512KB) + L1 L#10 (64KB) + Core L#10 + PU L#10 (P#10)
    L2 L#11 (512KB) + L1 L#11 (64KB) + Core L#11 + PU L#11 (P#11)

We do not use hyperthreading....

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
Sent: Monday, November 10, 2014 2:38 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

So a key point here is that PU in lstopo output equates to hyperthread when hyperthreads are enabled, and those are always uniquely numbered.
On my (admittedly puny by comparison) dual-socket Nehalem box, I get this for physical:

$ lstopo -p --of console
Machine (16GB)
  NUMANode P#0 (8127MB) + Socket P#0 + L3 (12MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0
      PU P#0
      PU P#12
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1
      PU P#2
      PU P#14
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2
      PU P#4
      PU P#16
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8
      PU P#6
      PU P#18
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#9
      PU P#8
      PU P#20
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#10
      PU P#10
      PU P#22
  NUMANode P#1 (8192MB) + Socket P#1 + L3 (12MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0
      PU P#1
      PU P#13
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1
      PU P#3
      PU P#15
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2
      PU P#5
      PU P#17
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8
      PU P#7
      PU P#19
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#9
      PU P#9
      PU P#21
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#10
      PU P#11
      PU P#23

Note that all the cores and hyperthreads are labeled with "P" because I added the -p option to request physical numbering. As you can see, the core numbering is done on a per-socket basis and is not unique. If I then ask for logical numbering:

$ lstopo -l --of console
Machine (16GB)
  NUMANode L#0 (P#0 8127MB) + Socket L#0 + L3 L#0 (12MB)
    L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#12)
    L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
      PU L#2 (P#2)
      PU L#3 (P#14)
    L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
      PU L#4 (P#4)
      PU L#5 (P#16)
    L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
      PU L#6 (P#6)
      PU L#7 (P#18)
    L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
      PU L#8 (P#8)
      PU L#9 (P#20)
    L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
      PU L#10 (P#10)
      PU L#11 (P#22)
  NUMANode L#1 (P#1 8192MB) + Socket L#1 + L3 L#1 (12MB)
    L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
      PU L#12 (P#1)
      PU L#13 (P#13)
    L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
      PU L#14 (P#3)
      PU L#15 (P#15)
    L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
      PU L#16 (P#5)
      PU L#17 (P#17)
    L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
      PU L#18 (P#7)
      PU L#19 (P#19)
    L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
      PU L#20 (P#9)
      PU L#21 (P#21)
    L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
      PU L#22 (P#11)
      PU L#23 (P#23)

You see a unique logical number for every core, as you'd expect. The problem is the core numbering, which is not unique for physical IDs. You might compare your machines to mine using the same commands to see how it looks. The BIOS can indeed change the numbering pattern, so that might indeed be an issue.

On Nov 10, 2014, at 11:27 AM, Tom Wurgler <twu...@goodyear.com> wrote:

If we run

> lstopo --output-format fig

we get a diagram of the socket/numa/core layouts, and all but those 2 machines show "PU P#0", PU P#4, PU P#8... in the smallest boxes, and in the lower left corner it says "physical". If we then add an option:

> lstopo --logical --output-format fig

we get PU L#0, PU L#1, PU L#2 ... and it says "logical". On the 2 boxes with the newer BIOS, both --logical and the default physical show the same NUMBERS, even though one is PU L# and the other is PU P#, and the numbers just go 0, 1, 2, 3.... So is this a BIOS setting that is causing it to report one way on some and a different way on others? And LSF takes what it gets?
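(One quick per-node check, assuming the hwloc installed on these nodes is new enough that hwloc-calc understands the --physical-input/--logical-output options:

  hwloc-calc --physical-input --logical-output -I pu pu:4

On the nodes with the old BIOS, where lstopo reports PU L#1 (P#4), that should print 1; on the 2 updated nodes it should print 4.)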
I am attempting to print the BIOS settings for each so I can compare them.... For a given node, the numbers LSF gives are unique.

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
Sent: Monday, November 10, 2014 2:09 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

Hmmm....and those are, of course, intended to be physical core numbers. I wonder how they are numbering them? The OS index won't be unique, which is what is causing us trouble, so they must have some way of translating them to provide a unique number.

On Nov 10, 2014, at 10:42 AM, Tom Wurgler <twu...@goodyear.com> wrote:

LSF gives this, for example, over which we (LSF users) have no control:

rank 0=mach1 slot=0
rank 1=mach1 slot=4
rank 2=mach1 slot=8
rank 3=mach1 slot=12
rank 4=mach1 slot=16
rank 5=mach1 slot=20
rank 6=mach1 slot=24
rank 7=mach1 slot=28
rank 8=mach1 slot=32
rank 9=mach1 slot=36
rank 10=mach1 slot=40
rank 11=mach1 slot=44
rank 12=mach1 slot=1
rank 13=mach1 slot=5
rank 14=mach1 slot=9
rank 15=mach1 slot=13

I have also filed a service ticket with LSF to see if they can change to logical numbering, etc. In the meantime we have written a translator, but it is cluster-specific (actually node-specific) and should not be called a solution.

Running lstopo on the whole cluster found 2 nodes giving logical numbering and the rest giving physical, which is interesting in itself. We find those 2 nodes have a newer BIOS level. Still investigating this...

thanks
tom

Tom Wurgler
Application Systems Principal
The Goodyear Tire & Rubber Company
200 Innovation Way, Akron, OH 44316
phone: 330.796.1656
twu...@goodyear.com

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
Sent: Monday, November 10, 2014 1:16 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

I've been taking a look at this, and I believe I can get something implemented shortly. However, one problem I've encountered is that physical core indexes are NOT unique on many systems, e.g., x86 when hyperthreads are enabled. So you would have to specify socket:core in order to get a unique location. Alternatively, when hyperthreads are enabled, the physical hyperthread number is unique.

My question, therefore, is whether or not this is going to work for you? I don't know what LSF is giving you - can you provide a socket:core pair, or a physical hyperthread number?

On Nov 6, 2014, at 8:34 AM, Ralph Castain <rhc.open...@gmail.com> wrote:

IIRC, you prefix the core number with a P to indicate physical. I'll see what I can do about getting the physical notation re-implemented - just can't promise when that will happen.

On Nov 6, 2014, at 8:30 AM, Tom Wurgler <twu...@goodyear.com> wrote:

Well, unless we can get mpirun to use physical numbering, we are dead in the water without a translator of some sort. We are trying to figure out how we can automate the translation in the meantime, but we have a mix of clusters and the mapping is different between them (a rough sketch of what we are trying is below). We use openmpi 1.6.4 daily (whereas all this current testing has been with 1.8.3).
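The rough cut (untested as written here) looks something like the following. It assumes the hwloc-calc on each node is new enough to accept --physical-input/--logical-output, that hyperthreading is off (so the logical PU index equals the logical core index), and the file names are just placeholders:

#!/bin/sh
# Sketch: rewrite LSF's physical slot numbers into the logical indexes
# that mpirun 1.7+ expects in a rankfile.
# Input lines look like:  rank 0=mach1 slot=4
while read rank assignment slot; do
    phys=${slot#slot=}
    logical=$(hwloc-calc --physical-input --logical-output -I pu "pu:$phys")
    echo "$rank $assignment slot=$logical"
done < lsf_rankfile > translated_rankfile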
In reading the 1.8.1 man page for mpirun, I see it states that "Starting with Open MPI v1.7, all socket/core slot locations are be specified as logical indexes (the Open MPI v1.6 series used physical indexes)." But testing using rankfiles with 1.6.4, it behaves like 1.8.3, i.e., using logical indexes. Is there maybe a switch in 1.6.4 to use physical indexes? I am not seeing it in the mpirun --help...

thanks

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <rhc.open...@gmail.com>
Sent: Thursday, November 6, 2014 11:08 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

Ugh....we used to have a switch for that purpose, but it became hard to manage the code. I could reimplement it at some point, but it won't be in the immediate future.

I gather the issue is that the system tools report physical numbering, and so you have to mentally translate to create the rankfile? Or is there an automated script you run to do the translation? In other words, is it possible to simplify the translation in the interim? Or is this a show-stopper for you?

On Nov 6, 2014, at 7:21 AM, Tom Wurgler <twu...@goodyear.com> wrote:

So we used lstopo with an arg of "--logical" and the output showed the core numbering 0,1,2,3...47 instead of 0,4,8,12, etc. The multiplying by 4 you speak of falls apart when you get to the second socket, as its physical numbers are 1,5,9,13... and its logical numbers are 12,13,14,15....

So the question is: can we get mpirun to honor the physical numbering?

thanks!
tom

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <rhc.open...@gmail.com>
Sent: Wednesday, November 5, 2014 6:30 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

I suspect the issue may be with physical vs logical numbering. As I said, we use logical numbering in the rankfile, not physical. So I'm not entirely sure how to translate the cpumask in your final table into the numbering shown in your rankfile listings. Is the cpumask showing a physical core number? I ask because it sure looks like the logical numbering we use is getting multiplied by 4 to become the cpumask you show. If they logically number their cores by socket (i.e., core 0 is first core in first socket, core 1 is first core in second socket, etc.), then that would explain the output.
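(If you have hwloc's command-line tools on that box, you could check directly - assuming your hwloc-calc is new enough to have the --logical-input/--physical-output options - with something like:

  hwloc-calc --logical-input --physical-output -I pu core:1

which should report the OS cpu number(s) behind what we call logical core 1.)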
On Nov 5, 2014, at 2:23 PM, Tom Wurgler <twu...@goodyear.com> wrote:

Well, further investigation found this: if I edit the rankfile and change it like this:

before:
rank 0=mach1 slot=0
rank 1=mach1 slot=4
rank 2=mach1 slot=8
rank 3=mach1 slot=12
rank 4=mach1 slot=16
rank 5=mach1 slot=20
rank 6=mach1 slot=24
rank 7=mach1 slot=28
rank 8=mach1 slot=32
rank 9=mach1 slot=36
rank 10=mach1 slot=40
rank 11=mach1 slot=44
rank 12=mach1 slot=1
rank 13=mach1 slot=5
rank 14=mach1 slot=9
rank 15=mach1 slot=13

after:
rank 0=mach1 slot=0
rank 1=mach1 slot=1
rank 2=mach1 slot=2
rank 3=mach1 slot=3
rank 4=mach1 slot=4
rank 5=mach1 slot=5
rank 6=mach1 slot=6
rank 7=mach1 slot=7
rank 8=mach1 slot=8
rank 9=mach1 slot=9
rank 10=mach1 slot=10
rank 11=mach1 slot=11
rank 12=mach1 slot=12
rank 13=mach1 slot=13
rank 14=mach1 slot=14
rank 15=mach1 slot=15

it does what I expect:

PID    COMMAND        CPUMASK  TOTAL   [ N0     N1     N2     N3  N4  N5  N6  N7 ]
12192  my_executable  0        472.0M  [ 472.0M 0      0      0   0   0   0   0  ]
12193  my_executable  4        358.0M  [ 358.0M 0      0      0   0   0   0   0  ]
12194  my_executable  8        450.4M  [ 450.4M 0      0      0   0   0   0   0  ]
12195  my_executable  12       439.1M  [ 439.1M 0      0      0   0   0   0   0  ]
12196  my_executable  16       392.1M  [ 392.1M 0      0      0   0   0   0   0  ]
12197  my_executable  20       420.6M  [ 420.6M 0      0      0   0   0   0   0  ]
12198  my_executable  24       414.9M  [ 0      414.9M 0      0   0   0   0   0  ]
12199  my_executable  28       388.9M  [ 0      388.9M 0      0   0   0   0   0  ]
12200  my_executable  32       452.7M  [ 0      452.7M 0      0   0   0   0   0  ]
12201  my_executable  36       438.9M  [ 0      438.9M 0      0   0   0   0   0  ]
12202  my_executable  40       369.3M  [ 0      369.3M 0      0   0   0   0   0  ]
12203  my_executable  44       440.5M  [ 0      440.5M 0      0   0   0   0   0  ]
12204  my_executable  1        447.7M  [ 0      0      447.7M 0   0   0   0   0  ]
12205  my_executable  5        367.1M  [ 0      0      367.1M 0   0   0   0   0  ]
12206  my_executable  9        426.5M  [ 0      0      426.5M 0   0   0   0   0  ]
12207  my_executable  13       414.2M  [ 0      0      414.2M 0   0   0   0   0  ]

We use hwloc 1.4 to generate a layout of the cores, etc. So either LSF created the wrong rankfile (via my config errors, most likely) or mpirun can't deal with that rankfile. I can try the nightly tarball as well.

The hardware is a 48-core AMD box: 4 sockets, 2 NUMA nodes per socket with 6 cores each.

thanks
tom

________________________________
From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <rhc.open...@gmail.com>
Sent: Wednesday, November 5, 2014 4:27 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] mpirun does not honor rankfile

Hmmm...well, it seems to be working fine in 1.8.4rc1 (I only have 12 cores on my humble machine).
However, I can't test any interactions with LSF, though that shouldn't be an issue:

$ mpirun -host bend001 -rf ./rankfile --report-bindings --display-devel-map hostname

Data for JOB [60677,1] offset 0
Mapper requested: NULL  Last mapper: rank_file  Mapping policy: BYUSER  Ranking policy: SLOT
Binding policy: CPUSET  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
Num new daemons: 0  New daemon starting vpid INVALID
Num nodes: 1

Data for node: bend001  Launch id: -1  State: 2
  Daemon: [[60677,0],0]  Daemon launched: True
  Num slots: 12  Slots in use: 12  Oversubscribed: FALSE
  Num slots allocated: 12  Max slots: 0
  Username on node: NULL
  Num procs: 12  Next node_rank: 12
  Data for proc: [[60677,1],0]  Pid: 0  Local rank: 0  Node rank: 0  App rank: 0  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 0,12
  Data for proc: [[60677,1],1]  Pid: 0  Local rank: 1  Node rank: 1  App rank: 1  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 8,20
  Data for proc: [[60677,1],2]  Pid: 0  Local rank: 2  Node rank: 2  App rank: 2  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 5,17
  Data for proc: [[60677,1],3]  Pid: 0  Local rank: 3  Node rank: 3  App rank: 3  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 9,21
  Data for proc: [[60677,1],4]  Pid: 0  Local rank: 4  Node rank: 4  App rank: 4  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 11,23
  Data for proc: [[60677,1],5]  Pid: 0  Local rank: 5  Node rank: 5  App rank: 5  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 7,19
  Data for proc: [[60677,1],6]  Pid: 0  Local rank: 6  Node rank: 6  App rank: 6  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 3,15
  Data for proc: [[60677,1],7]  Pid: 0  Local rank: 7  Node rank: 7  App rank: 7  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 6,18
  Data for proc: [[60677,1],8]  Pid: 0  Local rank: 8  Node rank: 8  App rank: 8  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 2,14
  Data for proc: [[60677,1],9]  Pid: 0  Local rank: 9  Node rank: 9  App rank: 9  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 4,16
  Data for proc: [[60677,1],10]  Pid: 0  Local rank: 10  Node rank: 10  App rank: 10  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 10,22
  Data for proc: [[60677,1],11]  Pid: 0  Local rank: 11  Node rank: 11  App rank: 11  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 1,13

[bend001:24667] MCW rank 1 bound to socket 0[core 4[hwt 0-1]]: [../../../../BB/..][../../../../../..]
[bend001:24667] MCW rank 2 bound to socket 1[core 8[hwt 0-1]]: [../../../../../..][../../BB/../../..]
[bend001:24667] MCW rank 3 bound to socket 1[core 10[hwt 0-1]]: [../../../../../..][../../../../BB/..]
[bend001:24667] MCW rank 4 bound to socket 1[core 11[hwt 0-1]]: [../../../../../..][../../../../../BB]
[bend001:24667] MCW rank 5 bound to socket 1[core 9[hwt 0-1]]: [../../../../../..][../../../BB/../..]
[bend001:24667] MCW rank 6 bound to socket 1[core 7[hwt 0-1]]: [../../../../../..][../BB/../../../..]
[bend001:24667] MCW rank 7 bound to socket 0[core 3[hwt 0-1]]: [../../../BB/../..][../../../../../..]
[bend001:24667] MCW rank 8 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../..][../../../../../..]
[bend001:24667] MCW rank 9 bound to socket 0[core 2[hwt 0-1]]: [../../BB/../../..][../../../../../..]
[bend001:24667] MCW rank 10 bound to socket 0[core 5[hwt 0-1]]: [../../../../../BB][../../../../../..]
[bend001:24667] MCW rank 11 bound to socket 1[core 6[hwt 0-1]]: [../../../../../..][BB/../../../../..]
[bend001:24667] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../..][../../../../../..]

Can you try with the latest nightly 1.8 tarball? http://www.open-mpi.org/nightly/v1.8/

Note that it is also possible that hwloc isn't correctly identifying the cores here. Can you tell us something about the hardware? Do you have hardware threads enabled? I ask because the binding we report is the cpu numbers as identified by hwloc - which may not be the same as you are expecting from some hardware vendor's map. We are using logical processor assignments, not physical. You can use the --report-bindings option to show the resulting map, as above.

On Nov 5, 2014, at 7:21 AM, twu...@goodyear.com wrote:

I am using openmpi v 1.8.3 and LSF 9.1.3. LSF creates a rankfile that looks like:

RANK_FILE:
======================================================================
rank 0=mach1 slot=0
rank 1=mach1 slot=4
rank 2=mach1 slot=8
rank 3=mach1 slot=12
rank 4=mach1 slot=16
rank 5=mach1 slot=20
rank 6=mach1 slot=24
rank 7=mach1 slot=28
rank 8=mach1 slot=32
rank 9=mach1 slot=36
rank 10=mach1 slot=40
rank 11=mach1 slot=44
rank 12=mach1 slot=1
rank 13=mach1 slot=5
rank 14=mach1 slot=9
rank 15=mach1 slot=13

which really are the cores I want to use, in order. I log on to this machine and type (all on one line):

/apps/share/openmpi/1.8.3.I1217913/bin/mpirun \
  --mca orte_base_help_aggregate 0 \
  -v -display-devel-allocation \
  -display-devel-map \
  --rankfile RANK_FILE \
  --mca btl openib,tcp,sm,self \
  --x LD_LIBRARY_PATH \
  --np 16 \
  my_executable \
  -i model.i \
  -l model.o

And I get the following on the screen:

======================   ALLOCATED NODES   ======================
        mach1: slots=16 max_slots=0 slots_inuse=0 state=UP
=================================================================

Data for JOB [52387,1] offset 0
Mapper requested: NULL  Last mapper: rank_file  Mapping policy: BYUSER  Ranking policy: SLOT
Binding policy: CPUSET  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
Num new daemons: 0  New daemon starting vpid INVALID
Num nodes: 1

Data for node: mach1  Launch id: -1  State: 2
  Daemon: [[52387,0],0]  Daemon launched: True
  Num slots: 16  Slots in use: 16  Oversubscribed: FALSE
  Num slots allocated: 16  Max slots: 0
  Username on node: NULL
  Num procs: 16  Next node_rank: 16
  Data for proc: [[52387,1],0]  Pid: 0  Local rank: 0  Node rank: 0  App rank: 0  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 0
  Data for proc: [[52387,1],1]  Pid: 0  Local rank: 1  Node rank: 1  App rank: 1  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 16
  Data for proc: [[52387,1],2]  Pid: 0  Local rank: 2  Node rank: 2  App rank: 2  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 32
  Data for proc: [[52387,1],3]  Pid: 0  Local rank: 3  Node rank: 3  App rank: 3  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 1
  Data for proc: [[52387,1],4]  Pid: 0  Local rank: 4  Node rank: 4  App rank: 4  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 17
  Data for proc: [[52387,1],5]  Pid: 0  Local rank: 5  Node rank: 5  App rank: 5  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 33
  Data for proc: [[52387,1],6]  Pid: 0  Local rank: 6  Node rank: 6  App rank: 6  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 2
  Data for proc: [[52387,1],7]  Pid: 0  Local rank: 7  Node rank: 7  App rank: 7  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 18
  Data for proc: [[52387,1],8]  Pid: 0  Local rank: 8  Node rank: 8  App rank: 8  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 34
  Data for proc: [[52387,1],9]  Pid: 0  Local rank: 9  Node rank: 9  App rank: 9  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 3
  Data for proc: [[52387,1],10]  Pid: 0  Local rank: 10  Node rank: 10  App rank: 10  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 19
  Data for proc: [[52387,1],11]  Pid: 0  Local rank: 11  Node rank: 11  App rank: 11  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 35
  Data for proc: [[52387,1],12]  Pid: 0  Local rank: 12  Node rank: 12  App rank: 12  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 4
  Data for proc: [[52387,1],13]  Pid: 0  Local rank: 13  Node rank: 13  App rank: 13  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 20
  Data for proc: [[52387,1],14]  Pid: 0  Local rank: 14  Node rank: 14  App rank: 14  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 36
  Data for proc: [[52387,1],15]  Pid: 0  Local rank: 15  Node rank: 15  App rank: 15  State: INITIALIZED  Restarts: 0  App_context: 0  Locale: UNKNOWN  Bind location: (null)  Binding: 5

And a numa-map of the node shows:

PID    COMMAND        CPUMASK  TOTAL   [ N0     N1     N2     N3     N4     N5     N6     N7     ]
31044  my_executable  0        443.3M  [ 443.3M 0      0      0      0      0      0      0      ]
31045  my_executable  16       459.7M  [ 459.7M 0      0      0      0      0      0      0      ]
31046  my_executable  32       435.0M  [ 0      435.0M 0      0      0      0      0      0      ]
31047  my_executable  1        468.8M  [ 0      0      468.8M 0      0      0      0      0      ]
31048  my_executable  17       493.2M  [ 0      0      493.2M 0      0      0      0      0      ]
31049  my_executable  33       498.0M  [ 0      0      0      498.0M 0      0      0      0      ]
31050  my_executable  2        501.2M  [ 0      0      0      0      501.2M 0      0      0      ]
31051  my_executable  18       502.4M  [ 0      0      0      0      502.4M 0      0      0      ]
31052  my_executable  34       500.5M  [ 0      0      0      0      0      500.5M 0      0      ]
31053  my_executable  3        515.6M  [ 0      0      0      0      0      0      515.6M 0      ]
31054  my_executable  19       508.1M  [ 0      0      0      0      0      0      508.1M 0      ]
31055  my_executable  35       503.9M  [ 0      0      0      0      0      0      0      503.9M ]
31056  my_executable  4        502.1M  [ 502.1M 0      0      0      0      0      0      0      ]
31057  my_executable  20       515.2M  [ 515.2M 0      0      0      0      0      0      0      ]
31058  my_executable  36       508.1M  [ 0      508.1M 0      0      0      0      0      0      ]
31059  my_executable  5        446.7M  [ 0      0      446.7M 0      0      0      0      0      ]

--

Why didn't mpirun honor the rankfile and put the processes on the correct cores in the proper order? It looks to me like mpirun doesn't like the rankfile...??

Thanks for any help.
Tom