Should we just do this, then:
Index: mca/hwloc/base/hwloc_base_util.c
===================================================================
--- mca/hwloc/base/hwloc_base_util.c (revision 25885)
+++ mca/hwloc/base/hwloc_base_util.c (working copy)
@@ -173,6 +173,9 @@
"hwloc:base:get_topology"));
if (0 != hwloc_topology_init(&opal_hwloc_topology) ||
+ 0 != hwloc_topology_set_flags(opal_hwloc_topology,
+ (HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM |
+ HWLOC_TOPOLOGY_FLAG_WHOLE_IO)) ||
0 != hwloc_topology_load(opal_hwloc_topology)) {
return OPAL_ERR_NOT_SUPPORTED;
}
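For reference, here is a minimal standalone sketch of what those two flags buy us, outside of the OMPI code paths. It assumes hwloc >= 1.3 (where HWLOC_TOPOLOGY_FLAG_WHOLE_IO exists); the binding check at the end is just an illustration of how a "whole system" topology lets us detect that we are bound, which is the point Ralph raised:

```c
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_bitmap_t bound = hwloc_bitmap_alloc();

    if (0 != hwloc_topology_init(&topo))
        return 1;

    /* WHOLE_SYSTEM: expose CPUs outside our cpuset;
     * WHOLE_IO: expose all I/O devices, not just those attached
     * to objects inside our cpuset. */
    hwloc_topology_set_flags(topo,
                             HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM |
                             HWLOC_TOPOLOGY_FLAG_WHOLE_IO);

    if (0 != hwloc_topology_load(topo)) {
        hwloc_topology_destroy(topo);
        return 1;
    }

    /* With WHOLE_SYSTEM set, the root cpuset covers the whole machine,
     * so a binding that is a strict subset of it means "we are bound".
     * Without the flag, the topology is clipped to our cpuset and this
     * comparison always says "not bound" -- the srun/cpuset bug. */
    if (0 == hwloc_get_cpubind(topo, bound, HWLOC_CPUBIND_PROCESS)) {
        hwloc_obj_t root = hwloc_get_root_obj(topo);
        printf("process %s bound\n",
               hwloc_bitmap_isequal(bound, root->cpuset) ? "is not" : "is");
    }

    hwloc_bitmap_free(bound);
    hwloc_topology_destroy(topo);
    return 0;
}
```

(Untested; requires linking against hwloc with -lhwloc.)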
On Feb 9, 2012, at 8:04 AM, Ralph Castain wrote:
> Yes, I missed that point before - too early in the morning :-/
>
> As I said in my last note, it would be nice to either have a flag indicating
> we are bound, or see all the cpu info so we can compute that we are bound.
> Either way, we still need to have a complete picture of all I/O devices so
> you can compute the distance.
>
>
> On Feb 9, 2012, at 6:01 AM, [email protected] wrote:
>
>>
>>
>> [email protected] wrote on 02/09/2012 01:32:31 PM:
>>
>> > From: Ralph Castain <[email protected]>
>> > To: Open MPI Developers <[email protected]>
>> > Date: 02/09/2012 01:32 PM
>> > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see
>> > processes as bound if the job has been launched by srun
>> > Sent by: [email protected]
>> >
>> > Hi Nadia
>> >
>> > I'm wondering what value there is in showing the full topology, or
>> > using it in any of our components, if the process is restricted to a
>> > specific set of cpus? Does it really help to know that there are
>> > other cpus out there that are unreachable?
>>
>> Ralph,
>>
>> The intention here is not to show cpus that are unreachable, but to fix an
>> issue we have at least in get_ib_dev_distance() in the openib btl.
>>
>> The problem is that if a process is restricted to a single CPU, the
>> algorithm used in get_ib_dev_distance doesn't work at all:
>> I have 2 ib interfaces on my victim (say mlx4_0 and mlx4_1), and I want the
>> openib btl to select the one that is the closest to my rank.
>>
>> As I said in my first e-mail, here is what is done today:
>> . opal_paffinity_base_get_processor_info() is called to get the number of
>> logical processors (we get 1 due to the singleton cpuset)
>> . we loop over that # of processors to check whether our process is bound
>> to one of them. In our case the loop will be executed only once and we will
>> never get the correct binding information.
>> . if the process is bound, actually get the distance to the device.
>> In our case, the distance won't be computed, so mlx4_0 will be seen
>> as "equivalent" to mlx4_1 in terms of distance. This is what I definitely
>> want to avoid.
>>
>> Regards,
>> Nadia
>>
>> >
>> > On Feb 9, 2012, at 5:15 AM, [email protected] wrote:
>> >
>> >
>> >
>> > [email protected] wrote on 02/09/2012 12:20:41 PM:
>> >
>> > > From: Brice Goglin <[email protected]>
>> > > To: Open MPI Developers <[email protected]>
>> > > Date: 02/09/2012 12:20 PM
>> > > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see
>> > > processes as bound if the job has been launched by srun
>> > > Sent by: [email protected]
>> > >
>> > > By default, hwloc only shows what's inside the current cpuset. There's
>> > > an option to show everything instead (topology flag).
>> >
>> > So maybe using that flag inside
>> > opal_paffinity_base_get_processor_info() would be a better fix than
>> > the one I'm proposing in my patch.
>> >
>> > I found a bunch of other places where things are managed as in
>> > get_ib_dev_distance().
>> >
>> > Just doing a grep in the sources, I could find:
>> > . init_maffinity() in btl/sm/btl_sm.c
>> > . vader_init_maffinity() in btl/vader/btl_vader.c
>> > . get_ib_dev_distance() in btl/wv/btl_wv_component.c
>> >
>> > So I think the flag Brice is talking about should definitely be the fix.
>> >
>> > Regards,
>> > Nadia
>> >
>> > >
>> > > Brice
>> > >
>> > >
>> > >
>> > > On 02/09/2012 12:18, Jeff Squyres wrote:
>> > > > Just so that I understand this better -- if a process is bound in
>> > > a cpuset, will tools like hwloc's lstopo only show the Linux
>> > > processors *in that cpuset*? I.e., does it not have any visibility
>> > > of the processors outside of its cpuset?
>> > > >
>> > > >
>> > > > On Jan 27, 2012, at 11:38 AM, nadia.derbey wrote:
>> > > >
>> > > >> Hi,
>> > > >>
>> > > >> If a job is launched using "srun --resv-ports --cpu_bind:..." and
>> > > >> slurm
>> > > >> is configured with:
>> > > >> TaskPlugin=task/affinity
>> > > >> TaskPluginParam=Cpusets
>> > > >>
>> > > >> each rank of that job is in a cpuset that contains a single CPU.
>> > > >>
>> > > >> Now, if we use carto on top of this, the following happens in
>> > > >> get_ib_dev_distance() (in btl/openib/btl_openib_component.c):
>> > > >> . opal_paffinity_base_get_processor_info() is called to get the
>> > > >> number of logical processors (we get 1 due to the singleton
>> > > >> cpuset)
>> > > >> . we loop over that # of processors to check whether our process is
>> > > >> bound to one of them. In our case the loop will be executed only
>> > > >> once and we will never get the correct binding information.
>> > > >> . if the process is bound, actually get the distance to the device.
>> > > >> in our case we won't execute that part of the code.
>> > > >>
>> > > >> The attached patch is a proposal to fix the issue.
>> > > >>
>> > > >> Regards,
>> > > >> Nadia
>> > > >>
>> > > >> <get_ib_dev_distance.patch>
>> > > >> _______________________________________________
>> > > >> devel mailing list
>> > > >> [email protected]
>> > > >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > > >
>> > >
>
--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/