Oops - fatal typo - it's late:

On 18.11.2015 at 23:09, Reuti wrote:
> Hi,
>
> On 18.11.2015 at 22:00, Lee, Wayne wrote:
>
>> To list,
>>
>> I’ve been reading some of the information from various web links regarding
>> the differences between “loose” and “tight” integration associated with
>> Parallel Environments (PEs) within Grid Engine (GE). One of the weblinks I
>> found which provides a really good explanation of this is Dan Templeton’s
>> “PE Tight Integration”
>> (https://blogs.oracle.com/templedf/entry/pe_tight_integration). I would
>> like to just confirm my understanding of “loose”/“tight” integration as
>> well as what the role of the “rsh” wrapper is in the process.
>>
>> 1. Essentially, as best as I can tell, an application, regardless of
>> whether it is set up to use “loose” or “tight” integration, has the GE
>> “sge_execd” execution daemon start up the “Master” task that is part of a
>> parallel job application. An example of this would be an MPI (e.g. LAM,
>> Intel, Platform, Open, etc.) application. So I’m assuming the “sge_execd”
>> daemon would fork off a “sge_shepherd” process which in turn starts up
>> something like “mpirun” or some script. Is this correct?
>
> Yes.
>
> But to be complete: in addition we first have to distinguish whether the MPI
> slave tasks can be started by an `ssh`/`rsh` (resp. `qrsh -inherit ...` for a
> tight integration) on their own, or whether they need some running daemons
> beforehand. Creating a tight integration for a daemon-based setup is far
> more convoluted, and my Howtos for PVM, LAM/MPI and early versions of
> MPICH2 are still available, but I wouldn't recommend using them - unless you
> have some legacy applications which depend on this and you can't recompile
> them.
>
> Recent versions of Intel MPI, Open MPI, MPICH2 and Platform MPI can achieve a
> tight integration with minimal effort. Let me know if you need more
> information about a specific one.
>
>> 2.
>> The difference between “loose” and “tight” integration is how
>> the parallel job application’s “Slave” tasks are handled. With “loose”
>> integration the slave tasks/processes are not managed and started by GE.
>> The application starts up the slave tasks via something like “rsh” or
>> “ssh”. An example of this is mpirun starting the various slave processes
>> on the various nodes listed in the “$pe_hostfile” provided by GE. With
>> “tight” integration, the slave tasks/processes are managed and started by
>> GE, but through the use of “qrsh”. Is this correct?
>
> Yes.
>
>> 3. One of the things I was reading from the document discussing
>> “loose” and “tight” integration using LAM MPI was the differences in the
>> way they handle “accounting” and how the processes associated with a
>> parallel job are handled if deleted using qdel. By “accounting”, does
>> this mean that GE is able to better keep track of where each of the slave
>> tasks is and how many resources are being used by the slave tasks? So
>> does this mean that “tight” integration is preferable over “loose”
>> integration, since it allows GE to better keep track of the resources
>> used by the slave tasks, and one is able to delete a “tight” integration
>> job in a “cleaner” manner?
>
> Yes - absolutely.
>
>> 4. Continuing with “tight” integration: does this also mean that if
>> a parallel MPI application uses either “rsh” or “ssh” to facilitate the
>> communications between the Master and Slave tasks/processes, then,
>> essentially, “qrsh” intercepts or replaces the communications performed
>> by “rsh” or “ssh”? Hence this is why the “rsh” wrapper script is used to
>> facilitate the “tight” integration. Is that correct?
>
> The wrapper solution is only necessary in case the actual MPI library has now
> builtin support for SGE. In case of Open MPI (./configure --with-sge ...) and

Should read: [...] actual MPI library has no builtin support [...]
-- Reuti

> MPICH2 the support is built in and you can find hints to set it up on their
> websites - no wrapper necessary and the start_/stop_proc_args can be set to
> NONE (i.e.: they call `qrsh` directly, in case they discover that they are
> executed under SGE [by certain set environment variables]). The
> start_proc_args in the PE was/is used to set up the links to the wrapper(s)
> and reformat the $pe_hostfile, in case the parallel library understands only
> a different format*. This is necessary e.g. for MPICH(1).
>
> *) In case you heard of the application Gaussian: I also create the
> "%lindaworkers=..." list of nodes for the input file line in the
> start_proc_args.
>
>> 5. I was reading some of the postings in the GE archive from
>> someone named “Reuti” regarding the “rsh” wrapper script. If I understood
>> what he wrote correctly, it doesn’t matter whether the parallel MPI
>> application is using “rsh” or “ssh”; the “rsh” wrapper script provided by
>> GE is just to force the application to use GE’s qrsh? Am I stating this
>> correctly? Another way to state this is that “rsh” is just a name. The
>> name could be anything; as long as your MPI application is configured to
>> use whatever name the application’s communication protocol uses,
>> essentially the basic contents of the wrapper script won’t change, aside
>> from the name “rsh” and the locations of scripts referenced by the
>> wrapper script. Again, am I stating this correctly?
>
> Yes to all.
>
>> 6. With regards to the various types and vendors’ MPI implementations:
>> what does it exactly mean that certain MPI implementations are GE aware?
>> I tend to think that this means that parallel applications built with GE
>> aware MPI implementations know where to find the “$pe_hostfile” that GE
>> generates based on what resources the parallel application needs. Is
>> that all there is to it for the MPI implementation to be GE aware?
>> I know that with
>> Intel or Open MPI, the PE environments that I’ve created don’t really
>> require any special scripts for the “start_proc_args” and
>> “stop_proc_args” parameters in the PE. However, based on what little I
>> have seen, LAM and Platform MPI implementations appear to require one to
>> use scripts based on ones like “startmpi.sh” and “stopmpi.sh” in order to
>> set up the properly formatted $pe_hostfile to be used by these MPI
>> implementations. Is my understanding of this correct?
>
> Yes. While LAM/MPI is daemon based, Platform MPI uses a plain call to the
> slave nodes and can be tightly integrated by the wrapper and setting `export
> MPI_REMSH=rsh`.
>
> For a builtin tight integration the MPI library needs to a) discover under
> what queuing system it is running (set environment variables; can be SGE,
> SLURM, LSF, PBS, ...), b) find and honor the $pe_hostfile automatically
> (resp. other files for other queuing systems), c) start `qrsh -inherit ...`
> to start something on the granted nodes (some implementations need -V here
> too [you can check the source of Open MPI for example], to forward some
> variables to the slaves) - *not* `qsub -V ...`, which I try to avoid, as a
> random adjustment to the user's shell might lead to a crash of the job when
> it finally starts, and this can be really hard to investigate, as a new
> submission with a fresh shell might work again.
>
>> 7. I was looking at the following options for the “qconf -sconf”
>> (global configuration) from GE:
>>
>> qlogin_command builtin
>> qlogin_daemon builtin
>> rlogin_command builtin
>> rlogin_daemon builtin
>> rsh_command builtin
>> rsh_daemon builtin
>>
>> I was attempting to fully understand how the above parameters are related
>> to the execution of parallel application jobs in GE.
>> What I’m wondering here is:
>> if the parallel application job I would want GE to manage requires and
>> uses “ssh” by default for communications between Master and Slave tasks,
>> does this mean that the above parameters would need to be configured to
>> use “slogin”, “ssh”, “sshd”, etc.?
>
> No. These are two different things. With all of the above settings (before
> this question) you first configure SGE to intercept the `rsh` resp. `ssh`
> call (hence the application should never use an absolute path to start
> them). This will lead to the effect that `qrsh -inherit ...` will finally
> call the communication method which is configured by "rsh_command" and
> "rsh_daemon". If possible they should stay as "builtin". Then SGE will use
> its own internal communication to start the slave tasks, hence the cluster
> needs no `ssh` or `rsh` at all. In my clusters this is even disabled for
> normal users, and only admins can `ssh` to the nodes (if a user needs X11
> forwarding to a node, this would be special of course). To let users check
> a node they have to run an interactive job in a special queue, which grants
> only 10 seconds of CPU time (while the wallclock time can be almost
> infinity).
>
> Other settings for these parameters are covered in this document - also
> different kinds of communication can be set up for different nodes and
> directions of the calls:
>
> https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
>
> Let me know in case you need further details.
>
> -- Reuti
>
>> Apologies for all the questions. I just want to ensure I understand the
>> PEs a bit more.
>> Kind Regards,
>>
>> -------
>> Wayne Lee
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
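For an MPI library with builtin SGE support (the "start_/stop_proc_args can be set to NONE" case from point 4), a tight-integration PE definition might look roughly like this - a sketch only; the PE name, slot count and allocation_rule are site-specific choices, and the field set is that of sge_pe(5):

```
pe_name            mpi_tight
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
```

control_slaves TRUE is what permits `qrsh -inherit ...` to start slave tasks under sge_shepherd on the granted nodes; job_is_first_task FALSE because for such MPIs the job script/mpirun itself does not occupy one of the parallel tasks.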
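The wrapper mechanism discussed in points 4 and 5 can be sketched as follows - a simplified, illustrative variant in the spirit of the `rsh` template shipped in $SGE_ROOT/mpi, not the exact script. The stub `qrsh`, the "stub" directory and the node name are fabricated here only so the redirection can be demonstrated outside a real cluster:

```shell
# Stand-in `qrsh` that just echoes its arguments, so the redirection can be
# shown without a cluster (in production this is SGE's own qrsh binary).
mkdir -p stub
printf '#!/bin/sh\necho qrsh "$@"\n' > stub/qrsh
chmod +x stub/qrsh

# The wrapper itself. Under a tight PE, start_proc_args links such a script
# as "rsh" into a directory that SGE puts first in PATH for the job, so the
# MPI library's plain `rsh host cmd` call lands here instead.
cat > stub/rsh <<'EOF'
#!/bin/sh
# Outside an SGE job (no JOB_ID set) fall through to the real rsh
# (path illustrative).
[ -z "$JOB_ID" ] && exec /usr/bin/rsh "$@"
# Skip leading rsh options such as -n; the first plain argument is the host.
while [ "${1#-}" != "$1" ]; do shift; done
host=$1
shift
# Hand the remote start back to SGE, so the slave task runs under a
# sge_shepherd on the granted node and is accounted and controllable.
exec qrsh -inherit "$host" "$@"
EOF
chmod +x stub/rsh

# Simulate an MPI library calling `rsh -n node03 uptime` inside a job:
JOB_ID=4711 PATH="$PWD/stub:$PATH" "$PWD/stub/rsh" -n node03 uptime
```

In a real PE the classic startmpi.sh -catch_rsh does this linking into $TMPDIR; the wrapper's name (rsh, ssh, remsh, ...) only has to match whatever the MPI library calls, which is exactly the point confirmed in question 5.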

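The $pe_hostfile reformatting that start_proc_args historically did for MPICH(1) can be sketched like this - a hedged example on a fabricated sample file. The four-column line format (host, slots, queue, processor range) is what SGE writes to $PE_HOSTFILE; the output is the one-host-per-slot machinefile that MPICH(1)'s mpirun expects:

```shell
# Fabricated sample in the format SGE writes to $PE_HOSTFILE:
cat > pe_hostfile.sample <<'EOF'
node01 2 all.q@node01 UNDEFINED
node02 4 all.q@node02 UNDEFINED
EOF

# Expand "host slots ..." into one host line per granted slot - the kind of
# reformatting a start_proc_args script performs before mpirun runs.
awk '{ for (i = 0; i < $2; i++) print $1 }' pe_hostfile.sample > machines
cat machines
```

SGE-aware libraries (point 6) do the equivalent of this internally after reading $PE_HOSTFILE themselves, which is why no such script is needed for them.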