I just checked and it appears bycore does correctly translate to byslot. So 
your patch does indeed appear to be correct. If you don't mind, I'm going to 
apply it for you as I'm working on a correction for how we handle oversubscribe 
flags, and I want to ensure the patch gets included so we compute oversubscribe 
correctly.
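
For anyone following the thread, here is a rough sketch of the byslot vs. bynode difference using the numbers from the report below (two 32-core nodes, 8 cpus per process, 8 ranks). It is a toy model only, not the code in rmaps_base_common_mappers.c: the function names are made up for illustration, every node is assumed to have the same number of slots, and oversubscription is not handled.

```python
# Toy model of the two mapping policies discussed in this thread.
# Hypothetical helper names; this is NOT Open MPI's mapper code.

def slots_per_node(cores_per_node, cpus_per_proc):
    """A slot consists of cpus_per_proc cpus and holds exactly one process."""
    return cores_per_node // cpus_per_proc

def map_byslot(nodes, nslots, nprocs):
    """Fill every slot on a node before moving on to the next node."""
    mapping = []
    for node in nodes:
        for _ in range(nslots):
            if len(mapping) == nprocs:
                return mapping
            mapping.append(node)
    return mapping  # stops at capacity: no oversubscription in this sketch

def map_bynode(nodes, nslots, nprocs):
    """Place one process per node in round-robin order."""
    capacity = nslots * len(nodes)  # assumes identical slot counts per node
    return [nodes[rank % len(nodes)] for rank in range(min(nprocs, capacity))]

if __name__ == "__main__":
    nodes = ["compute18", "compute19"]
    nslots = slots_per_node(32, 8)
    for name, mapper in (("byslot", map_byslot), ("bynode", map_bynode)):
        print(name)
        for rank, host in enumerate(mapper(nodes, nslots, 8)):
            print(f"[rank:{rank}] host:{host}")
```

Under those assumptions, the byslot placement matches the "with patch" output below and the bynode placement matches the unpatched output.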

Thanks for catching this!

On Nov 30, 2010, at 10:33 PM, Ralph Castain wrote:

> Afraid I don't speak much slurm any more (thank goodness!).
> 
> From your output, it looks like the system is mapping bynode instead of 
> byslot. IIRC, isn't bycore just supposed to be a synonym for byslot? So 
> perhaps the problem is that "bycore" causes us to set the "bynode" flag by 
> mistake. Did you check that?
> 
> BTW: when running cpus-per-proc, a slot doesn't have X processes. I suspect 
> this is just a language thing, but it will create confusion. A slot consists 
> of X cpus - we still assign only one process to each slot.
> 
> On Nov 30, 2010, at 10:47 AM, Damien Guinier wrote:
> 
>> hi all,
>> 
>> Most of the time, there is no difference between "proc" and "slot". But when 
>> you use "mpirun -cpus-per-proc X", a slot has X procs.
>> In orte/mca/rmaps/base/rmaps_base_common_mappers.c, there is a confusion 
>> between proc and slot. This small error affects the mapping:
>> 
>> On the latest OMPI version with 32-core compute nodes:
>> salloc -n 8 -c 8 mpirun -bind-to-core -bycore ./a.out
>> [rank:0]<stdout>: host:compute18
>> [rank:1]<stdout>: host:compute19
>> [rank:2]<stdout>: host:compute18
>> [rank:3]<stdout>: host:compute19
>> [rank:4]<stdout>: host:compute18
>> [rank:5]<stdout>: host:compute19
>> [rank:6]<stdout>: host:compute18
>> [rank:7]<stdout>: host:compute19
>> 
>> with patch:
>> [rank:0]<stdout>: host:compute18
>> [rank:1]<stdout>: host:compute18
>> [rank:2]<stdout>: host:compute18
>> [rank:3]<stdout>: host:compute18
>> [rank:4]<stdout>: host:compute19
>> [rank:5]<stdout>: host:compute19
>> [rank:6]<stdout>: host:compute19
>> [rank:7]<stdout>: host:compute19
>> 
>> Can you tell me whether my patch is correct?
>> 
>> Thank you
>> 
>> Damien
>> 
>> <patch_cpu_per_rank.txt>
> 

