Brock --

Can you run with "ompi_info --all"?

With "--param all all", ompi_info in v1.8.x defaults to showing only level 
1 MCA params.  It shows you all possible components and variables, but only 
at level 1.

Or you could use "--level 9" to show all 9 levels.  Here's the relevant 
section from the README:

-----
The following options may be helpful:

--all       Show a *lot* of information about your Open MPI
            installation. 
--parsable  Display all the information in an easily
            grep/cut/awk/sed-able format.
--param <framework> <component>
            A <framework> of "all" and a <component> of "all" will
            show all parameters to all components.  Otherwise, the
            parameters of all the components in a specific framework,
            or just the parameters of a specific component can be
            displayed by using an appropriate <framework> and/or
            <component> name.
--level <level>
            By default, ompi_info only shows "Level 1" MCA parameters
            -- parameters that can affect whether MPI processes can
            run successfully or not (e.g., determining which network
            interfaces to use).  The --level option will display all
            MCA parameters from level 1 to <level> (the max <level>
            value is 9).  Use "ompi_info --param <framework>
            <component> --level 9" to see *all* MCA parameters for a
            given component.  See "The Modular Component Architecture
            (MCA)" section, below, for a fuller explanation.
-----
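
If you end up scripting this, here is a minimal sketch of building that command line from Python.  The helper names (ompi_info_cmd, run_ompi_info) are my own invention for illustration, not anything shipped with Open MPI; only the --param and --level options come from the README text above.

```python
import shlex
import subprocess


def ompi_info_cmd(framework="all", component="all", level=9):
    """Build an ompi_info command line showing MCA params from level 1 to `level`.

    Hypothetical helper: only --param and --level come from the README.
    """
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    return ["ompi_info", "--param", framework, component, "--level", str(level)]


def run_ompi_info(framework="all", component="all", level=9):
    """Run ompi_info and return its stdout (needs ompi_info on PATH)."""
    cmd = ompi_info_cmd(framework, component, level)
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Open MPI.
    print(shlex.join(ompi_info_cmd("btl", "tcp", level=9)))
```

Calling run_ompi_info("all", "all", 9) should give roughly what "ompi_info --param all all" showed before v1.8.x, assuming ompi_info is installed on the machine where the script runs.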




On Jun 24, 2014, at 5:19 AM, Ralph Castain <r...@open-mpi.org> wrote:

> That's odd - it shouldn't truncate the output. I'll take a look later today - 
> we're all gathered for a developer's conference this week, so I'll be able to 
> poke at this with Nathan.
> 
> 
> 
> On Mon, Jun 23, 2014 at 3:15 PM, Brock Palen <bro...@umich.edu> wrote:
> Perfection, flexible, extensible, so nice.
> 
> BTW, this doesn't happen in older versions,
> 
> [brockp@flux-login2 34241]$ ompi_info --param all all
> Error getting SCIF driver version
>                  MCA btl: parameter "btl_tcp_if_include" (current value: "",
>                           data source: default, level: 1 user/basic, type:
>                           string)
>                           Comma-delimited list of devices and/or CIDR
>                           notation of networks to use for MPI communication
>                           (e.g., "eth0,192.168.0.0/16").  Mutually exclusive
>                           with btl_tcp_if_exclude.
>                  MCA btl: parameter "btl_tcp_if_exclude" (current value:
>                           "127.0.0.1/8,sppp", data source: default, level: 1
>                           user/basic, type: string)
>                           Comma-delimited list of devices and/or CIDR
>                           notation of networks to NOT use for MPI
>                           communication -- all devices not matching these
>                           specifications will be used (e.g.,
>                           "eth0,192.168.0.0/16").  If set to a non-default
>                           value, it is mutually exclusive with
>                           btl_tcp_if_include.
> 
> 
> This is normally much longer.  And yes, we don't have the PHI stuff installed 
> on all nodes.  Strange that 'all all' is now very short; ompi_info -a still 
> works, though.
> 
> 
> 
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> XSEDE Campus Champion
> bro...@umich.edu
> (734)936-1985
> 
> 
> 
> On Jun 20, 2014, at 1:48 PM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> > Put "orte_hetero_nodes=1" in your default MCA param file - users can 
> > override by setting that param to 0
> >
> >
> > On Jun 20, 2014, at 10:30 AM, Brock Palen <bro...@umich.edu> wrote:
> >
> >> Perfection!  That appears to do it for our standard case.
> >>
> >> Now I know how to set MCA options by env var or config file.  How can I 
> >> make this the default, which a user can then override?
> >>
> >> Brock Palen
> >> www.umich.edu/~brockp
> >> CAEN Advanced Computing
> >> XSEDE Campus Champion
> >> bro...@umich.edu
> >> (734)936-1985
> >>
> >>
> >>
> >> On Jun 20, 2014, at 1:21 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>
> >>> I think I begin to grok at least part of the problem. If you are 
> >>> assigning different cpus on each node, then you'll need to tell us that 
> >>> by setting --hetero-nodes; otherwise, we won't have any way to report that 
> >>> back to mpirun for its binding calculation.
> >>>
> >>> Otherwise, we expect that the cpuset of the first node we launch a daemon 
> >>> onto (or where mpirun is executing, if we are only launching local to 
> >>> mpirun) accurately represents the cpuset on every node in the allocation.
> >>>
> >>> We still might well have a bug in our binding computation - but the above 
> >>> will definitely impact what you said the user did.
> >>>
> >>> On Jun 20, 2014, at 10:06 AM, Brock Palen <bro...@umich.edu> wrote:
> >>>
> >>>> Extra data point if I do:
> >>>>
> >>>> [brockp@nyx5508 34241]$ mpirun --report-bindings --bind-to core hostname
> >>>> --------------------------------------------------------------------------
> >>>> A request was made to bind to that would result in binding more
> >>>> processes than cpus on a resource:
> >>>>
> >>>> Bind to:         CORE
> >>>> Node:            nyx5513
> >>>> #processes:  2
> >>>> #cpus:          1
> >>>>
> >>>> You can override this protection by adding the "overload-allowed"
> >>>> option to your binding directive.
> >>>> --------------------------------------------------------------------------
> >>>>
> >>>> [brockp@nyx5508 34241]$ mpirun -H nyx5513 uptime
> >>>> 13:01:37 up 31 days, 23:06,  0 users,  load average: 10.13, 10.90, 12.38
> >>>> 13:01:37 up 31 days, 23:06,  0 users,  load average: 10.13, 10.90, 12.38
> >>>> [brockp@nyx5508 34241]$ mpirun -H nyx5513 --bind-to core hwloc-bind --get
> >>>> 0x00000010
> >>>> 0x00001000
> >>>> [brockp@nyx5508 34241]$ cat $PBS_NODEFILE | grep nyx5513
> >>>> nyx5513
> >>>> nyx5513
> >>>>
> >>>> Interesting: if I force bind-to core, MPI barfs, saying there is only 1 
> >>>> cpu available, while PBS says it gave it two.  If I run hwloc-bind --get 
> >>>> on just that node (this is all inside an interactive job), I get what 
> >>>> I expect.
> >>>>
> >>>> Is there a way to get a map of what MPI thinks it has on each host?
> >>>>
> >>>> Brock Palen
> >>>> www.umich.edu/~brockp
> >>>> CAEN Advanced Computing
> >>>> XSEDE Campus Champion
> >>>> bro...@umich.edu
> >>>> (734)936-1985
> >>>>
> >>>>
> >>>>
> >>>> On Jun 20, 2014, at 12:38 PM, Brock Palen <bro...@umich.edu> wrote:
> >>>>
> >>>>> I was able to produce it in my test.
> >>>>>
> >>>>> orted affinity set by cpuset:
> >>>>> [root@nyx5874 ~]# hwloc-bind --get --pid 103645
> >>>>> 0x0000c002
> >>>>>
> >>>>> This mask (CPUs 1, 14, 15), which spans sockets, matches the cpuset set 
> >>>>> up by the batch system.
> >>>>> [root@nyx5874 ~]# cat /dev/cpuset/torque/12719806.nyx.engin.umich.edu/cpus
> >>>>> 1,14-15
> >>>>>
> >>>>> The ranks though were then all set to the same core:
> >>>>>
> >>>>> [root@nyx5874 ~]# hwloc-bind --get --pid 103871
> >>>>> 0x00008000
> >>>>> [root@nyx5874 ~]# hwloc-bind --get --pid 103872
> >>>>> 0x00008000
> >>>>> [root@nyx5874 ~]# hwloc-bind --get --pid 103873
> >>>>> 0x00008000
> >>>>>
> >>>>> Which is core 15.
> >>>>>
> >>>>> report-bindings gave me the output below.  You can see how the ranks on 
> >>>>> a few nodes were all bound to the same core, the last one in each case.  
> >>>>> I only gave you the hwloc-bind results for the host nyx5874.
> >>>>>
> >>>>> [nyx5526.engin.umich.edu:23726] MCW rank 0 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5878.engin.umich.edu:103925] MCW rank 8 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5533.engin.umich.edu:123988] MCW rank 1 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5879.engin.umich.edu:102808] MCW rank 9 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5874.engin.umich.edu:103645] MCW rank 41 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5874.engin.umich.edu:103645] MCW rank 42 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5874.engin.umich.edu:103645] MCW rank 43 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5888.engin.umich.edu:117400] MCW rank 11 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5786.engin.umich.edu:30004] MCW rank 19 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5786.engin.umich.edu:30004] MCW rank 18 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5594.engin.umich.edu:33884] MCW rank 24 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5594.engin.umich.edu:33884] MCW rank 25 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5594.engin.umich.edu:33884] MCW rank 26 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5798.engin.umich.edu:53026] MCW rank 59 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5798.engin.umich.edu:53026] MCW rank 60 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5798.engin.umich.edu:53026] MCW rank 56 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5798.engin.umich.edu:53026] MCW rank 57 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5798.engin.umich.edu:53026] MCW rank 58 bound to socket 1[core 
> >>>>> 15[hwt 0]]: [./././././././.][./././././././B]
> >>>>> [nyx5545.engin.umich.edu:88170] MCW rank 2 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5613.engin.umich.edu:25229] MCW rank 31 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5880.engin.umich.edu:01406] MCW rank 10 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5770.engin.umich.edu:86538] MCW rank 6 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5613.engin.umich.edu:25228] MCW rank 30 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5577.engin.umich.edu:65949] MCW rank 4 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5607.engin.umich.edu:30379] MCW rank 14 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5544.engin.umich.edu:72960] MCW rank 47 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5544.engin.umich.edu:72959] MCW rank 46 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5848.engin.umich.edu:04332] MCW rank 33 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5848.engin.umich.edu:04333] MCW rank 34 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5544.engin.umich.edu:72958] MCW rank 45 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5858.engin.umich.edu:12165] MCW rank 35 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5607.engin.umich.edu:30380] MCW rank 15 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5544.engin.umich.edu:72957] MCW rank 44 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5858.engin.umich.edu:12167] MCW rank 37 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5870.engin.umich.edu:33811] MCW rank 7 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5582.engin.umich.edu:81994] MCW rank 5 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5848.engin.umich.edu:04331] MCW rank 32 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5557.engin.umich.edu:46654] MCW rank 50 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5858.engin.umich.edu:12166] MCW rank 36 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5799.engin.umich.edu:67802] MCW rank 22 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5799.engin.umich.edu:67803] MCW rank 23 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5556.engin.umich.edu:50889] MCW rank 3 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5625.engin.umich.edu:95931] MCW rank 53 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5625.engin.umich.edu:95930] MCW rank 52 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5557.engin.umich.edu:46655] MCW rank 51 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5625.engin.umich.edu:95932] MCW rank 54 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5625.engin.umich.edu:95933] MCW rank 55 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5866.engin.umich.edu:16306] MCW rank 40 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5861.engin.umich.edu:22761] MCW rank 61 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5861.engin.umich.edu:22762] MCW rank 62 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5861.engin.umich.edu:22763] MCW rank 63 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5557.engin.umich.edu:46652] MCW rank 48 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5557.engin.umich.edu:46653] MCW rank 49 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5866.engin.umich.edu:16304] MCW rank 38 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5788.engin.umich.edu:02465] MCW rank 20 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5597.engin.umich.edu:68071] MCW rank 27 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5775.engin.umich.edu:27952] MCW rank 17 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5866.engin.umich.edu:16305] MCW rank 39 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5788.engin.umich.edu:02466] MCW rank 21 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5775.engin.umich.edu:27951] MCW rank 16 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5597.engin.umich.edu:68073] MCW rank 29 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5597.engin.umich.edu:68072] MCW rank 28 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5552.engin.umich.edu:30481] MCW rank 12 is not bound (or bound to 
> >>>>> all available processors)
> >>>>> [nyx5552.engin.umich.edu:30482] MCW rank 13 is not bound (or bound to 
> >>>>> all available processors)
> >>>>>
> >>>>>
> >>>>> Brock Palen
> >>>>> www.umich.edu/~brockp
> >>>>> CAEN Advanced Computing
> >>>>> XSEDE Campus Champion
> >>>>> bro...@umich.edu
> >>>>> (734)936-1985
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Jun 20, 2014, at 12:20 PM, Brock Palen <bro...@umich.edu> wrote:
> >>>>>
> >>>>>> Got it,
> >>>>>>
> >>>>>> I have the input from the user and am testing it out.
> >>>>>>
> >>>>>> It probably has less to do with Torque and more with cpusets.
> >>>>>>
> >>>>>> I'm working on reproducing it myself also.
> >>>>>>
> >>>>>> Brock Palen
> >>>>>> www.umich.edu/~brockp
> >>>>>> CAEN Advanced Computing
> >>>>>> XSEDE Campus Champion
> >>>>>> bro...@umich.edu
> >>>>>> (734)936-1985
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Jun 20, 2014, at 12:18 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>>>>>
> >>>>>>> Thanks - I'm just trying to reproduce one problem case so I can look 
> >>>>>>> at it. Given that I don't have access to a Torque machine, I need to 
> >>>>>>> "fake" it.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Jun 20, 2014, at 9:15 AM, Brock Palen <bro...@umich.edu> wrote:
> >>>>>>>
> >>>>>>>> In this case they are on a single socket, but as you can see they 
> >>>>>>>> could be either/or depending on the job.
> >>>>>>>>
> >>>>>>>> Brock Palen
> >>>>>>>> www.umich.edu/~brockp
> >>>>>>>> CAEN Advanced Computing
> >>>>>>>> XSEDE Campus Champion
> >>>>>>>> bro...@umich.edu
> >>>>>>>> (734)936-1985
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Jun 19, 2014, at 2:44 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>>>>>>>
> >>>>>>>>> Sorry, I should have been clearer - I was asking if cores 8-11 are 
> >>>>>>>>> all on one socket, or span multiple sockets
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Jun 19, 2014, at 11:36 AM, Brock Palen <bro...@umich.edu> wrote:
> >>>>>>>>>
> >>>>>>>>>> Ralph,
> >>>>>>>>>>
> >>>>>>>>>> It was a large job spread across nodes.  Our system allows users to 
> >>>>>>>>>> ask for 'procs', which can be laid out in any format.
> >>>>>>>>>>
> >>>>>>>>>> The list:
> >>>>>>>>>>
> >>>>>>>>>>> [nyx5406:2][nyx5427:2][nyx5506:2][nyx5311:3]
> >>>>>>>>>>> [nyx5329:4][nyx5398:4][nyx5396:11][nyx5397:11]
> >>>>>>>>>>> [nyx5409:11][nyx5411:11][nyx5412:3]
> >>>>>>>>>>
> >>>>>>>>>> Shows that nyx5406 had 2 cores,  nyx5427 also 2,  nyx5411 had 11.
> >>>>>>>>>>
> >>>>>>>>>> They could be spread across any socket configuration.  
> >>>>>>>>>> We start very lax "user requests X procs" and then the user can 
> >>>>>>>>>> request more strict requirements from there.  We support mostly 
> >>>>>>>>>> serial users, and users can colocate on nodes.
> >>>>>>>>>>
> >>>>>>>>>> That is good to know, I think we would want to turn our default to 
> >>>>>>>>>> 'bind to core' except for our few users who use hybrid mode.
> >>>>>>>>>>
> >>>>>>>>>> Our CPU set tells you what cores the job is assigned.  So in the 
> >>>>>>>>>> problem case provided, the cpuset/cgroup shows only cores 8-11 are 
> >>>>>>>>>> available to this job on this node.
> >>>>>>>>>>
> >>>>>>>>>> Brock Palen
> >>>>>>>>>> www.umich.edu/~brockp
> >>>>>>>>>> CAEN Advanced Computing
> >>>>>>>>>> XSEDE Campus Champion
> >>>>>>>>>> bro...@umich.edu
> >>>>>>>>>> (734)936-1985
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Jun 18, 2014, at 11:10 PM, Ralph Castain <r...@open-mpi.org> 
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> The default binding option depends on the number of procs - it is 
> >>>>>>>>>>> bind-to core for np=2, and bind-to socket for np > 2. You never 
> >>>>>>>>>>> said, but should I assume you ran 4 ranks? If so, then we should 
> >>>>>>>>>>> be trying to bind-to socket.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure what your cpuset is telling us - are you binding us 
> >>>>>>>>>>> to a socket? Are some cpus in one socket, and some in another?
> >>>>>>>>>>>
> >>>>>>>>>>> It could be that the cpuset + bind-to socket is resulting in some 
> >>>>>>>>>>> odd behavior, but I'd need a little more info to narrow it down.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Jun 18, 2014, at 7:48 PM, Brock Palen <bro...@umich.edu> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I have started using 1.8.1 for some codes (meep in this case) 
> >>>>>>>>>>>> and it sometimes works fine, but in a few cases I am seeing 
> >>>>>>>>>>>> ranks being given overlapping CPU assignments, not always though.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Example job, default binding options (so by-core right?):
> >>>>>>>>>>>>
> >>>>>>>>>>>> Assigned nodes, the one in question is nyx5398, we use torque 
> >>>>>>>>>>>> CPU sets, and use TM to spawn.
> >>>>>>>>>>>>
> >>>>>>>>>>>> [nyx5406:2][nyx5427:2][nyx5506:2][nyx5311:3]
> >>>>>>>>>>>> [nyx5329:4][nyx5398:4][nyx5396:11][nyx5397:11]
> >>>>>>>>>>>> [nyx5409:11][nyx5411:11][nyx5412:3]
> >>>>>>>>>>>>
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-bind --get --pid 16065
> >>>>>>>>>>>> 0x00000200
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-bind --get --pid 16066
> >>>>>>>>>>>> 0x00000800
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-bind --get --pid 16067
> >>>>>>>>>>>> 0x00000200
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-bind --get --pid 16068
> >>>>>>>>>>>> 0x00000800
> >>>>>>>>>>>>
> >>>>>>>>>>>> [root@nyx5398 ~]# cat /dev/cpuset/torque/12703230.nyx.engin.umich.edu/cpus
> >>>>>>>>>>>> 8-11
> >>>>>>>>>>>>
> >>>>>>>>>>>> So Torque claims the CPU set set up for the job has 4 cores, but 
> >>>>>>>>>>>> as you can see, ranks were given identical bindings.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I checked that the pids were part of the correct CPU set.  I also 
> >>>>>>>>>>>> checked orted:
> >>>>>>>>>>>>
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-bind --get --pid 16064
> >>>>>>>>>>>> 0x00000f00
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-calc --intersect PU 16064
> >>>>>>>>>>>> ignored unrecognized argument 16064
> >>>>>>>>>>>>
> >>>>>>>>>>>> [root@nyx5398 ~]# hwloc-calc --intersect PU 0x00000f00
> >>>>>>>>>>>> 8,9,10,11
> >>>>>>>>>>>>
> >>>>>>>>>>>> Which is exactly what I would expect.
> >>>>>>>>>>>>
> >>>>>>>>>>>> So, ummm, I'm lost as to why this might happen.  What else should 
> >>>>>>>>>>>> I check?  Like I said, not all jobs show this behavior.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Brock Palen
> >>>>>>>>>>>> www.umich.edu/~brockp
> >>>>>>>>>>>> CAEN Advanced Computing
> >>>>>>>>>>>> XSEDE Campus Champion
> >>>>>>>>>>>> bro...@umich.edu
> >>>>>>>>>>>> (734)936-1985
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> users mailing list
> >>>>>>>>>>>> us...@open-mpi.org
> >>>>>>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>>>>>>>>> Link to this post: 
> >>>>>>>>>>>> http://www.open-mpi.org/community/lists/users/2014/06/24672.php
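
As an aside on the hwloc-bind masks quoted in this thread: each set bit in the hex mask corresponds to one CPU index, which is how 0x0000c002 lines up with the cpuset "1,14-15".  Here is a small sketch of that decoding.  The function names are hypothetical, and it assumes a single hex word like the masks shown above (wider masks may be printed differently).

```python
def mask_to_cpus(mask):
    """Return the CPU indices whose bits are set in a single hex mask string.

    Assumption: the mask is one hex word such as "0x0000c002".
    """
    value = int(mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpus_to_cpuset(cpus):
    """Format CPU indices as a cpuset-style list, collapsing consecutive runs."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    # Masks taken from the hwloc-bind output quoted in this thread.
    for mask in ("0x0000c002", "0x00008000", "0x00000f00"):
        cpus = mask_to_cpus(mask)
        print(mask, cpus, cpus_to_cpuset(cpus))
```

With the masks from the thread, the orted mask 0x0000c002 decodes to CPUs 1, 14 and 15, while the rank mask 0x00008000 decodes to CPU 15 alone, which is the overlap Brock reported.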


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
