Re: [OMPI users] Open MPI and SLURM_CPUS_PER_TASK

2011-11-30 Thread Ralph Castain
Hi Igor

As I recall, this eventually traced back to a change in slurm at some point. I 
believe the latest interpretation is in line with your suggestion; we didn't 
change it only because nobody seemed to care very much, but I have no 
objection to including it in the next release.

Thanks
ralph

On Nov 28, 2011, at 3:50 AM, Igor Geier wrote:

> Dear all,
> 
> there have been some discussions about this already, but the issue is still 
> there (in 1.4.4). When running SLURM jobs with the --cpus-per-task parameter 
> set (e.g. when running hybrid Open MPI/OpenMP jobs, so that --cpus-per-task 
> corresponds to the number of OpenMP threads per rank), you get the
> 
> "All nodes which are allocated for this job are already filled."
> 
> error if SLURM_CPUS_PER_TASK > SLURM_TASKS_PER_NODE. In ras_slurm_module.c, 
> the number of slots is divided by the SLURM_CPUS_PER_TASK value (integer 
> division, so it becomes 0). The following patch seems to work for our cluster:
> 
> --- a/orte/mca/ras/slurm/ras_slurm_module.c 2009-12-08 21:36:38.0 
> +0100
> +++ b/orte/mca/ras/slurm/ras_slurm_module.c 2011-11-25 12:28:55.0 
> +0100
> @@ -353,7 +353,8 @@
> node->state = ORTE_NODE_STATE_UP;
> node->slots_inuse = 0;
> node->slots_max = 0;
> -node->slots = slots[i] / cpus_per_task;
> +/* Don't divide by cpus_per_task */
> +node->slots = slots[i]; 
> opal_list_append(nodelist, &node->super);
> }
> free(slots);
> 
> Are there situations where this might not work?
> 
> Best regards
> 
> Igor
> 
> -- 
> 
> Igor Geier
> 
> --
> Center for Scientific Computing (CSC)
> University of Frankfurt
> Max-von-Laue-Straße 1
> 60438 Frankfurt am Main
> +49(0)69/798-47353
> ge...@csc.uni-frankfurt.de
> http://csc.uni-frankfurt.de/
> --
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] Open MPI and SLURM_CPUS_PER_TASK

2011-11-28 Thread Igor Geier
Dear all,

there have been some discussions about this already, but the issue is still there 
(in 1.4.4). When running SLURM jobs with the --cpus-per-task parameter set 
(e.g. when running hybrid Open MPI/OpenMP jobs, so that --cpus-per-task corresponds 
to the number of OpenMP threads per rank), you get the

"All nodes which are allocated for this job are already filled."

error if SLURM_CPUS_PER_TASK > SLURM_TASKS_PER_NODE. In ras_slurm_module.c, 
the number of slots is divided by the SLURM_CPUS_PER_TASK value (integer 
division, so it becomes 0). The following patch seems to work for our cluster:

--- a/orte/mca/ras/slurm/ras_slurm_module.c 2009-12-08 21:36:38.0 
+0100
+++ b/orte/mca/ras/slurm/ras_slurm_module.c 2011-11-25 12:28:55.0 
+0100
@@ -353,7 +353,8 @@
 node->state = ORTE_NODE_STATE_UP;
 node->slots_inuse = 0;
 node->slots_max = 0;
-node->slots = slots[i] / cpus_per_task;
+/* Don't divide by cpus_per_task */
+node->slots = slots[i]; 
 opal_list_append(nodelist, &node->super);
 }
 free(slots);

Are there situations where this might not work?

Best regards

Igor

-- 

Igor Geier

--
Center for Scientific Computing (CSC)
University of Frankfurt
Max-von-Laue-Straße 1
60438 Frankfurt am Main
+49(0)69/798-47353
ge...@csc.uni-frankfurt.de
http://csc.uni-frankfurt.de/
--