> On Nov 12, 2014, at 7:15 AM, Dave Love <d.l...@liverpool.ac.uk> wrote:
>
> Ralph Castain <r...@open-mpi.org> writes:
>
>> You might also add the --display-allocation flag to mpirun so we can
>> see what it thinks the allocation looks like. If there are only 16
>> slots on the node, it seems odd that OMPI would assign 32 procs to it
>> unless it thinks there is only 1 node in the job, and oversubscription
>> is allowed (which it won't be by default if it read the GE allocation)
>
> I think there's a problem with the documentation at least not being
> explicit, and it would really help to have it clarified unless I'm
> missing something.
Not quite sure I understand this comment - the problem is that we aren't
correctly reading the allocation, as evidenced by the output when the user
ran with --display-allocation. From what we can see, it looks like the
PE_HOSTFILE may contain some unexpected characters that make us think we
hit EOF at the end of the first line, and so we ignore the second node.

> Although there's probably more to it in this case, the behaviour seemed
> consistent with what I deduced (without reading the code) from the doc,
> ompi_info, and experiment, and at least wasn't inconsistent: the node
> has 32 processing units, and the default allocation is by socket,
> apparently round-robin within nodes. I can't check the actual behaviour
> in that case just now.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/11/25769.php
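A quick way to check the stray-character hypothesis is to dump the
PE_HOSTFILE with every byte made visible, so an embedded carriage return
or other junk at the end of the first line can't hide. This is just a
sketch; the sample hostfile contents below are made up for illustration
and are not taken from the original report (on the cluster itself,
`cat -A $PE_HOSTFILE` from inside a job script does the same job):

```python
def dump_hostfile(text):
    """Return each line of a PE_HOSTFILE-style text with hidden
    characters made visible via repr(), so stray carriage returns
    or other non-printing bytes that could confuse a line parser
    stand out."""
    out = []
    for lineno, line in enumerate(text.splitlines(keepends=True), 1):
        out.append("%2d: %r" % (lineno, line))
    return "\n".join(out)

# Hypothetical example: a hostfile whose first line ends in CRLF.
# The '\r' is invisible in a plain `cat`, but obvious here.
sample = ("node01 16 all.q@node01 UNDEFINED\r\n"
          "node02 16 all.q@node02 UNDEFINED\n")
print(dump_hostfile(sample))
```

If the first line really does end in something other than a bare `\n`,
that would be consistent with a reader stopping early and dropping the
second node.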