> On Nov 12, 2014, at 7:15 AM, Dave Love <d.l...@liverpool.ac.uk> wrote:
>
> Ralph Castain <r...@open-mpi.org> writes:
>
>> You might also add the --display-allocation flag to mpirun so we can
>> see what it thinks the allocation looks like. If there are only 16
>> slots on the node, it seems odd that OMPI would assign 32 procs to it
>> unless it thinks there is only 1 node in the job, and oversubscription
>> is allowed (which it won't be by default if it read the GE allocation)
>
> I think there's a problem with the documentation at least not being
> explicit, and it would really help to have it clarified unless I'm
> missing something.
Not quite sure I understand this comment - the problem is that we aren't
correctly reading the allocation, as evidenced by the output when the user
ran with --display-allocation. From what we can see, it looks like the
PE_HOSTFILE may contain some unexpected characters that make us think we
hit EOF at the end of the first line, and so we ignore the second node.

> Although there's probably more to it in this case, the behaviour seemed
> consistent with what I deduced (without reading the code) from the doc,
> ompi_info, and experiment, and at least wasn't inconsistent: the node
> has 32 processing units, and the default allocation is by socket,
> apparently round-robin within nodes. I can't check the actual behaviour
> in that case just now.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/11/25769.php
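A quick way to check the stray-character hypothesis is to dump the
PE_HOSTFILE with every byte made visible, so an embedded carriage return
or other junk at the end of the first line can't hide. This is just a
sketch; the sample hostfile contents below are made up for illustration
and are not taken from the original report (on the cluster itself,
`cat -A $PE_HOSTFILE` from inside a job script does the same job):

```python
def dump_hostfile(text):
    """Return each line of a PE_HOSTFILE-style text with hidden
    characters made visible via repr(), so stray carriage returns
    or other non-printing bytes that could confuse a line parser
    stand out."""
    out = []
    for lineno, line in enumerate(text.splitlines(keepends=True), 1):
        out.append("%2d: %r" % (lineno, line))
    return "\n".join(out)

# Hypothetical example: a hostfile whose first line ends in CRLF.
# The '\r' is invisible in a plain `cat`, but obvious here.
sample = ("node01 16 all.q@node01 UNDEFINED\r\n"
          "node02 16 all.q@node02 UNDEFINED\n")
print(dump_hostfile(sample))
```

If the first line really does end in something other than a bare `\n`,
that would be consistent with a reader stopping early and dropping the
second node.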