Hi Ralph,

thanks a lot; including it in the next release would be great.

Best regards

Igor

On Wed, 30 Nov 2011 14:30:25 -0700
Ralph Castain <r...@open-mpi.org> wrote:

> Hi Igor
> 
> As I recall, this was eventually traced back to a change in SLURM at some 
> point. I believe the latest interpretation is in line with your suggestion. 
> We didn't change it, I think, because nobody seemed to care very much, but I 
> have no objection to including it in the next release.
> 
> Thanks
> ralph
> 
> On Nov 28, 2011, at 3:50 AM, Igor Geier wrote:
> 
> > Dear all,
> > 
> > there have been some discussions about this already, but the issue is still 
> > there (in 1.4.4). When running SLURM jobs with the --cpus-per-task 
> > parameter set (e.g. when running hybrid Open MPI/OpenMP jobs, so that 
> > --cpus-per-task corresponds to the number of OpenMP threads per rank), you 
> > get the
> > 
> > "All nodes which are allocated for this job are already filled."
> > 
> > error if SLURM_CPUS_PER_TASK > SLURM_TASKS_PER_NODE. In 
> > ras_slurm_module.c, the number of slots is divided by the 
> > SLURM_CPUS_PER_TASK value, and the integer division truncates the result 
> > to 0. The following patch seems to work for our cluster:
> > 
> > --- a/orte/mca/ras/slurm/ras_slurm_module.c     2009-12-08 21:36:38.000000000 +0100
> > +++ b/orte/mca/ras/slurm/ras_slurm_module.c     2011-11-25 12:28:55.000000000 +0100
> > @@ -353,7 +353,8 @@
> >         node->state = ORTE_NODE_STATE_UP;
> >         node->slots_inuse = 0;
> >         node->slots_max = 0;
> > -        node->slots = slots[i] / cpus_per_task;
> > +        /* Don't divide by cpus_per_task */
> > +        node->slots = slots[i]; 
> >         opal_list_append(nodelist, &node->super);
> >     }
> >     free(slots);
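
To make the truncation concrete, here is a minimal standalone sketch of the two slot computations. It is not Open MPI code: the helper names `slots_divided` and `slots_undivided` are hypothetical and only mirror the expressions on the removed and added lines of the patch above.

```c
#include <assert.h>

/* Hypothetical helper, not part of Open MPI: mirrors the original
 * expression `slots[i] / cpus_per_task`. C integer division truncates
 * toward zero, so the result is 0 whenever cpus_per_task > slots. */
static int slots_divided(int slots, int cpus_per_task)
{
    assert(cpus_per_task > 0);   /* guard against division by zero */
    return slots / cpus_per_task;
}

/* Hypothetical helper: mirrors the patched expression, which passes the
 * slot count through unchanged. */
static int slots_undivided(int slots, int cpus_per_task)
{
    (void)cpus_per_task;         /* deliberately ignored after the patch */
    return slots;
}
```

With 2 tasks per node and 4 CPUs per task, `slots_divided(2, 4)` returns 0, which would leave the node with no usable slots, while `slots_undivided(2, 4)` returns 2.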
> > 
> > Are there situations where this might not work?
> > 
> > Best regards
> > 
> > Igor
> > 
> > -- 
> > 
> > Igor Geier
> > 
> > --------------------------------------
> > Center for Scientific Computing (CSC)
> > University of Frankfurt
> > Max-von-Laue-Straße 1
> > 60438 Frankfurt am Main
> > +49(0)69/798-47353
> > ge...@csc.uni-frankfurt.de
> > http://csc.uni-frankfurt.de/
> > --------------------------------------
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users