Just so that I understand this better -- if a process is bound to a cpuset,
will tools like hwloc's lstopo only show the Linux processors *in that cpuset*?
I.e., does the process have no visibility into the processors outside of its cpuset?
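
To make the question concrete, here is a minimal Python sketch (Linux-oriented, and not hwloc's actual logic) that compares the number of CPUs the OS reports with the set the calling process is allowed to run on. The helper name cpu_views is made up for illustration.

```python
import os

def cpu_views():
    """Return (system_count, allowed_set) for the calling process."""
    # Number of CPUs in the system; not limited by this process's affinity.
    system_count = os.cpu_count()
    if hasattr(os, "sched_getaffinity"):
        # CPUs this process may run on (a cpuset restricts this set).
        allowed = set(os.sched_getaffinity(0))
    else:
        # Fallback where sched_getaffinity is unavailable: assume no restriction.
        allowed = set(range(system_count or 1))
    return system_count, allowed

system_count, allowed = cpu_views()
print(f"system: {system_count}, allowed: {sorted(allowed)}")
```

Under the SLURM configuration quoted below, I would expect the allowed set to hold a single CPU index while the system count still covers the whole node; whether lstopo filters its output the same way is exactly what I'm asking.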


On Jan 27, 2012, at 11:38 AM, nadia.derbey wrote:

> Hi,
> 
> If a job is launched using "srun --resv-ports --cpu_bind:..." and slurm
> is configured with:
>   TaskPlugin=task/affinity
>   TaskPluginParam=Cpusets
> 
> each rank of that job is in a cpuset that contains a single CPU.
> 
> Now, if we use carto on top of this, the following happens in
> get_ib_dev_distance() (in btl/openib/btl_openib_component.c):
>   . opal_paffinity_base_get_processor_info() is called to get the
>     number of logical processors (we get 1 due to the singleton cpuset)
>   . we loop over that # of processors to check whether our process is
>     bound to one of them. In our case the loop will be executed only
>     once and we will never get the correct binding information.
>   . if the process is bound, we actually get the distance to the device.
>     In our case we never execute that part of the code.
> 
> The attached patch is a proposal to fix the issue.
> 
> Regards,
> Nadia
> <get_ib_dev_distance.patch>
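
If I'm reading the three steps in the quoted message correctly, the failure can be sketched in Python as follows. This is only an illustration of the described loop, not the code in btl_openib_component.c and not the attached patch; the function names and the CPU index 5 are made up.

```python
def find_bound_cpu_by_count(num_visible, bound_mask):
    """Scan indices 0..num_visible-1, as in the loop described above."""
    for cpu in range(num_visible):
        if cpu in bound_mask:
            return cpu
    return None  # no binding found, so the distance is never computed

def find_bound_cpu_by_mask(bound_mask):
    """Alternative: walk the affinity mask itself instead of a count."""
    return min(bound_mask) if bound_mask else None

# Hypothetical rank bound to CPU 5 inside a singleton cpuset.
bound_mask = {5}
num_visible = len(bound_mask)  # only 1 processor is reported

print(find_bound_cpu_by_count(num_visible, bound_mask))
print(find_bound_cpu_by_mask(bound_mask))
```

Under that reading, the count-based scan only checks index 0, so it finds the binding only when the single CPU in the cpuset happens to be CPU 0, while walking the mask directly finds it regardless of the index.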


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

