Hi all,

When slurm is configured with the following parameters

    TaskPlugin=task/affinity
    TaskPluginParam=Cpusets

srun binds the processes by placing them into different cpusets, each containing a single core.
e.g. "srun -N 2 -n 4" will create 2 cpusets on each of the two allocated nodes and place the four ranks there, each rank with a singleton as its cpu constraint.

The issue in that case is in the macro OPAL_PAFFINITY_PROCESS_IS_BOUND (in opal/mca/paffinity/paffinity.h):
 . opal_paffinity_base_get_processor_info() fills in num_processors with 1 (this is the size of each cpu_set)
 . num_bound is set to 1 too, so the test "num_bound < num_processors" fails and this implies *bound=false

So, the binding is correctly done by slurm but not detected by MPI.

To support the cpuset binding done by slurm, I propose the following patch:

hg diff opal/mca/paffinity/paffinity.h
diff -r 4d8c8a39b06f opal/mca/paffinity/paffinity.h
--- a/opal/mca/paffinity/paffinity.h	Thu Apr 21 17:38:00 2011 +0200
+++ b/opal/mca/paffinity/paffinity.h	Tue Jul 12 15:44:59 2011 +0200
@@ -218,7 +218,8 @@
                     num_bound++;                                    \
                 }                                                   \
             }                                                       \
-            if (0 < num_bound && num_bound < num_processors) {     \
+            if (0 < num_bound && ((num_processors == 1) ||         \
+                                  (num_bound < num_processors))) { \
                 *(bound) = true;                                    \
             }                                                       \
         }                                                           \
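
To make the effect of the change easier to see, here is a minimal standalone sketch of the final test only, before and after the patch. The helper names is_bound_original and is_bound_patched are hypothetical and do not exist in Open MPI; the real macro also computes num_bound and num_processors itself, which is left out here.

```c
#include <stdbool.h>

/* Hypothetical helper: the final test of OPAL_PAFFINITY_PROCESS_IS_BOUND
 * as it reads before the patch. */
static bool is_bound_original(int num_bound, int num_processors)
{
    return 0 < num_bound && num_bound < num_processors;
}

/* Hypothetical helper: the same test with the proposed change. A process
 * that sees exactly one processor is treated as bound. */
static bool is_bound_patched(int num_bound, int num_processors)
{
    return 0 < num_bound &&
           ((num_processors == 1) || (num_bound < num_processors));
}
```

With the values seen inside a single-core cpuset (num_bound = 1, num_processors = 1), the original test evaluates to false and the patched one to true. For num_processors greater than 1 the two tests agree, so only the singleton cpuset case changes.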