On 12.08.2014 at 16:57, Antonio Rago wrote:

> Brilliant, this works!
> However, I have to say that the code seems to become slightly less performant.
> Is there a way to tell mpirun which cores to use, and maybe have Grid Engine 
> create this map automatically?

In the open source version of SGE the requested core binding is only a soft 
request. The Univa version can handle this as a hard request though, as the 
scheduler will do the assignment and knows which cores are used. I have no 
information whether this will be forwarded to Open MPI automatically. I assume 
not, and it must be read out of the machine file (there ought to be an extra 
column for it in their version) and fed to Open MPI by some means.

-- Reuti
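Reuti's suggestion (reading the core assignment out of the machine file and handing it to Open MPI) could be sketched roughly as follows. The machine-file layout below is an assumption, with a hypothetical fourth column listing the assigned cores; the real Univa column layout should be checked against its documentation. The output uses Open MPI's rankfile format (`rank N=host slot=C`), which mpirun consumes via `-rf`:

```shell
# Hypothetical SGE-style machine file: host, slot count, queue, assigned cores.
# The fourth column is an assumption about how Univa reports the binding.
cat > machines.txt <<'EOF'
node01 2 all.q@node01 0,1
node02 2 all.q@node02 4,5
EOF

# Expand the per-host core list into one Open MPI rankfile line per rank.
awk '{
    n = split($4, cores, ",")
    for (i = 1; i <= n; i++)
        printf "rank %d=%s slot=%s\n", rank++, $1, cores[i]
}' machines.txt > rankfile.txt

cat rankfile.txt
```

The resulting file could then be passed to Open MPI with `mpirun -rf rankfile.txt ...` to pin each rank to the core the scheduler assigned.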


> Thanks in advance
> Antonio
> 
> 
> 
> 
> On 12 Aug 2014, at 14:10, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> 
>> The quick and dirty answer is that in the v1.8 series, Open MPI started 
>> binding MPI processes to cores by default.
>> 
>> When you run 2 independent jobs on the same machine in the way in which you 
>> described, the two jobs won't have knowledge of each other, and therefore 
>> they will both start binding MPI processes, starting with logical core 0.
>> 
>> The easy workaround is to disable bind-to-core behavior.  For example, 
>> "mpirun --bind-to none ...".  In this way, the OS will (more or less) load 
>> balance your MPI jobs across the available cores (assuming you don't run 
>> more MPI processes than cores).
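A minimal sketch of this workaround, assuming a hypothetical application name (`./my_mpi_app`) and job size; the script only prints the command line instead of executing it, so it stands alone:

```shell
# Open MPI 1.8 binds ranks to cores by default; --bind-to none turns that
# off so the OS scheduler can spread two independent jobs across the node.
# ./my_mpi_app is a placeholder; we echo the command rather than run it.
APP=./my_mpi_app
NP=4
CMD="mpirun --bind-to none -np $NP $APP"
echo "$CMD"
```

Running the job with `--report-bindings` added to the mpirun line would instead print exactly which cores each rank was bound to, which is an easy way to confirm the overlap described in this thread.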
>> 
>> 
>> On Aug 12, 2014, at 7:05 AM, Antonio Rago <antonio.r...@plymouth.ac.uk> 
>> wrote:
>> 
>>> Dear mailing list
>>> I’m running into trouble with the configuration of the small cluster I’m 
>>> managing.
>>> I’ve installed openmpi-1.8.1 with gcc 4.7 on CentOS 6.5 with InfiniBand 
>>> support.
>>> Compilation and installation went fine, and I can compile and run parallel 
>>> jobs, both directly and by submitting them through the queue manager 
>>> (Grid Engine).
>>> My problem is that when parts of two different jobs end up on the same 
>>> node, they do not spread out over all the cores of the node; instead they 
>>> run on a common subset of cores, leaving the others completely idle.
>>> For example, two 4-core jobs on an 8-core node will end up running on only 
>>> 4 of the node's cores (all of them oversubscribed) while the other 4 cores 
>>> stay empty.
>>> Clearly there must be an error in the way I’ve configured things, but I 
>>> cannot find any hint on how to solve the problem.
>>> I’ve tried different mappings (map by core and map by slot) but have never 
>>> succeeded.
>>> Could you give me a suggestion on this issue?
>>> Regards
>>> Antonio
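One way to confirm the overlap described above is to inspect each rank's CPU affinity. A minimal Linux-specific sketch follows; in a real job this would run from inside each MPI rank (or one would simply pass `--report-bindings` to mpirun in the 1.8 series):

```shell
# Print the list of cores this process is allowed to run on.  If two
# "4-core" jobs overlap as described, every rank in both jobs reports the
# same list (e.g. 0-3) instead of two disjoint sets.
grep Cpus_allowed_list /proc/self/status
```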
>>> 
>>> ________________________________
>>> 
>>> This email and any files with it are confidential and intended solely for 
>>> the use of the recipient to whom it is addressed. If you are not the 
>>> intended recipient then copying, distribution or other use of the 
>>> information contained is strictly prohibited and you should not rely on it. 
>>> If you have received this email in error please let the sender know 
>>> immediately and delete it from your system(s). Internet emails are not 
>>> necessarily secure. While we take every care, Plymouth University accepts 
>>> no responsibility for viruses and it is your responsibility to scan emails 
>>> and their attachments. Plymouth University does not accept responsibility 
>>> for any changes made after it was sent. Nothing in this email or its 
>>> attachments constitutes an order for goods or services unless accompanied 
>>> by an official order form.
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2014/08/24986.php
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/08/24991.php
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/08/24992.php
> 
