Hi,

> On 26.07.2017 at 15:03, Kulshrestha, Vipul 
> <vipul_kulshres...@mentor.com> wrote:
> 
> Thanks for a quick response.
>  
> I will try building OMPI as suggested.
>  
> On the integration with unsupported distribution systems, we cannot use a 
> script-based approach, because these machines often don’t have ssh permission 
> in the customer environment. I will explore the path of writing an orte 
> component. At this stage, I don’t understand the effort involved.
>  
> I guess my question 2 was not understood correctly. I used the below command 
> as an example for SGE and want to understand the expected behavior for such a 
> command. With the below command, I expect to have 8 copies of a.out launched

Yep.


> with each copy having access to 40GB of memory. Is that correct?

SGE will grant the memory, not Open MPI. This works through SGE's tight 
integration: the slave tasks are started by `qrsh -inherit …` and not by a 
plain `ssh`. This way SGE can keep track of the started processes.


> I am doubtful, because I don’t understand how mpirun gets access to 
> information about RAM requirement.
>  
> qsub -pe orte 8 -b y -V -l m_mem_free=40G -cwd mpirun -np 8 a.out

In case your application relies on the actual value of "m_mem_free", your 
application has to request this information. This might differ between the 
various queuing systems, though. In SGE one could either use `qstat` and `grep` 
the information from its output, or (instead of a direct `mpirun`) use a job 
script which additionally puts this value into an environment variable, which 
you can then access directly in your application. On a command line it would be:

$ qsub -pe orte 8 -b y -v m_mem_free=40G -l m_mem_free=40G -cwd mpirun -np 8 a.out

1. -V might forward too many variables. I usually suggest forwarding only the 
environment variables which are necessary for the job. Otherwise the user could 
set some environment variable by accident and wonder why the job, which only 
started a couple of days later, crashes, while submitting exactly the same job 
again succeeds.

2. The 40G is just a string in the environment variable. You may want to put 
the plain value in bytes there instead.
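
To illustrate point 2, here is a minimal sketch of a shell helper that turns 
such a string into a plain byte count. The function name `to_bytes` is made up 
for this example. It assumes the suffix convention described in sge_types(1) 
(lowercase k/m/g are powers of 1000, uppercase K/M/G are powers of 1024) and 
handles whole numbers only, so please check it against your installation:

```shell
#!/bin/sh
# Hypothetical helper: convert an SGE-style memory string (e.g. "40G") to bytes.
# Assumption: lowercase suffixes are powers of 1000, uppercase suffixes are
# powers of 1024, as documented in sge_types(1). Whole numbers only.
to_bytes() {
    value=$1
    number=${value%[kKmMgG]}     # drop one trailing suffix character, if present
    suffix=${value#"$number"}    # the character that was dropped (may be empty)
    case $suffix in
        k) factor=1000 ;;
        K) factor=1024 ;;
        m) factor=$((1000 * 1000)) ;;
        M) factor=$((1024 * 1024)) ;;
        g) factor=$((1000 * 1000 * 1000)) ;;
        G) factor=$((1024 * 1024 * 1024)) ;;
        *) factor=1 ;;           # no suffix: the value is already in bytes
    esac
    echo $((number * factor))
}
```

With this, `to_bytes 40G` prints 42949672960, and the result could be passed 
along at submission time, e.g. `-v MEM_BYTES=$(to_bytes 40G)`, where MEM_BYTES 
is again only an illustrative variable name.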

-- Reuti
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
