Oops - fatal typo - it's late:

On 18.11.2015 at 23:09, Reuti wrote:
> Hi,
>
> On 18.11.2015 at 22:00, Lee, Wayne wrote:
>
>> To list,
>>
>> I’ve been reading some of the information from various web links regarding
>> the differences between “loose” and “tight” integration associated with
>> Parallel Environments (PEs) within Grid Engine (GE). One of the weblinks I
>> found which provides a really good explanation of this is Dan Templeton’s
>> “PE Tight Integration”
>> (https://blogs.oracle.com/templedf/entry/pe_tight_integration). I would
>> like to just confirm my understanding of “loose”/“tight” integration as
>> well as what the role of the “rsh” wrapper is in the process.
>>
>> 1. Essentially, as best as I can tell, an application, regardless of
>> whether it is set up to use “loose” or “tight” integration, has the GE
>> “sge_execd” execution daemon start up the “Master” task that is part of a
>> parallel job application. An example of this would be an MPI (e.g. LAM,
>> Intel, Platform, Open, etc.) application. So I’m assuming the “sge_execd”
>> daemon would fork off a “sge_shepherd” process which in turn starts up
>> something like “mpirun” or some script. Is this correct?
>
> Yes.
>
> But to be complete: in addition we first have to distinguish whether the MPI
> slave tasks can be started by an `ssh`/`rsh` (resp. `qrsh -inherit ...` for a
> tight integration) on their own, or whether they need some running daemons
> beforehand. Creating a tight integration for a daemon-based setup is far
> more convoluted, and my Howtos for PVM, LAM/MPI and early versions of
> MPICH2 are still available, but I wouldn't recommend using them - unless you
> have some legacy applications which depend on this and you can't recompile
> them.
>
> Recent versions of Intel MPI, Open MPI, MPICH2 and Platform MPI can achieve a
> tight integration with minimal effort. Let me know if you need more
> information about a specific one.
>
>> 2.
>> The difference between “loose” and “tight” integration is how
>> the parallel job application’s “Slave” tasks are handled. With “loose”
>> integration the slave tasks/processes are not managed and started by GE.
>> The application starts up the slave tasks via something like “rsh” or
>> “ssh”. An example of this is mpirun starting the various slave processes
>> on the various nodes listed in the “$pe_hostfile” provided by GE. With
>> “tight” integration, the slave tasks/processes are managed and started by
>> GE, but through the use of “qrsh”. Is this correct?
>
> Yes.
>
>> 3. One of the things I was reading from the document discussing
>> “loose” and “tight” integration using LAM MPI was the differences in the
>> way they handle “accounting” and how the processes associated with a
>> parallel job are handled if deleted using qdel. By “accounting”, does
>> this mean that GE is able to better keep track of where each of the slave
>> tasks is and how many resources are being used by the slave tasks? So
>> does this mean that “tight” integration is preferable over “loose”
>> integration, since it allows GE to better keep track of the resources
>> used by the slave tasks, and one is able to delete a “tight” integration
>> job in a “cleaner” manner?
>
> Yes - absolutely.
>
>> 4. Continuing with “tight” integration: does this also mean that if
>> a parallel MPI application uses either “rsh” or “ssh” to facilitate the
>> communications between the Master and Slave tasks/processes, then,
>> essentially, “qrsh” intercepts or replaces the communications performed
>> by “rsh” or “ssh”? Hence this is why the “rsh” wrapper script is used to
>> facilitate the “tight” integration. Is that correct?
>
> The wrapper solution is only necessary in case the actual MPI library has now
> builtin support for SGE. In case of Open MPI (./configure --with-sge ...) and

Should read: [...] actual MPI library has no builtin support [...]
-- Reuti

> MPICH2 the support is built in and you can find hints to set it up on their
> websites - no wrapper necessary and the start_/stop_proc_args can be set to
> NONE (i.e.: they call `qrsh` directly, in case they discover that they are
> executed under SGE [by certain set environment variables]). The
> start_proc_args in the PE was/is used to set up the links to the wrapper(s)
> and reformat the $pe_hostfile, in case the parallel library understands only
> a different format*. This is necessary e.g. for MPICH(1).
>
> *) In case you heard of the application Gaussian: I also create the
> "%lindaworkers=..." list of nodes for the input file line in the
> start_proc_args.
>
>> 5. I was reading some of the postings in the GE archive from
>> someone named “Reuti” regarding the “rsh” wrapper script. If I understood
>> what he wrote correctly, it doesn’t matter whether the parallel MPI
>> application is using “rsh” or “ssh”; the “rsh” wrapper script provided by
>> GE is just to force the application to use GE’s qrsh? Am I stating this
>> correctly? Another way to state this is that “rsh” is just a name. The
>> name could be anything; as long as your MPI application is configured to
>> use whatever name the application’s communication protocol uses,
>> essentially the basic contents of the wrapper script won’t change, aside
>> from the name “rsh” and the locations of scripts referenced by the
>> wrapper script. Again, am I stating this correctly?
>
> Yes to all.
>
>> 6. With regards to the various types and vendors’ MPI implementations:
>> what does it exactly mean that certain MPI implementations are GE aware?
>> I tend to think that this means that parallel applications built with GE
>> aware MPI implementations know where to find the “$pe_hostfile” that GE
>> generates based on what resources the parallel application needs. Is
>> that all there is to it for the MPI implementation to be GE aware?
>> I know that with
>> Intel or Open MPI, the PE environments that I’ve created don’t really
>> require any special scripts for the “start_proc_args” and
>> “stop_proc_args” parameters in the PE. However, based on what little I
>> have seen, LAM and Platform MPI implementations appear to require one to
>> use scripts based on ones like “startmpi.sh” and “stopmpi.sh” in order to
>> set up the properly formatted $pe_hostfile to be used by these MPI
>> implementations. Is my understanding of this correct?
>
> Yes. While LAM/MPI is daemon based, Platform MPI uses a plain call to the
> slave nodes and can be tightly integrated by the wrapper and setting `export
> MPI_REMSH=rsh`.
>
> For a builtin tight integration the MPI library needs to a) discover under
> what queuing system it is running (set environment variables; can be SGE,
> SLURM, LSF, PBS, ...), b) find and honor the $pe_hostfile automatically
> (resp. other files for other queuing systems), c) start `qrsh -inherit ...`
> to start something on the granted nodes (some implementations need -V here
> too [you can check the source of Open MPI for example], to forward some
> variables to the slaves) - *not* `qsub -V ...`, which I try to avoid, as a
> random adjustment to the user's shell might lead to a crash of the job when
> it finally starts, and this can be really hard to investigate, as a new
> submission with a fresh shell might work again.
>
>> 7. I was looking at the following options for the “qconf -sconf”
>> (global configuration) from GE:
>>
>> qlogin_command builtin
>> qlogin_daemon builtin
>> rlogin_command builtin
>> rlogin_daemon builtin
>> rsh_command builtin
>> rsh_daemon builtin
>>
>> I was attempting to fully understand how the above parameters are related
>> to the execution of parallel application jobs in GE.
>> What I’m wondering here is:
>> if the parallel application job I would want GE to manage requires and
>> uses “ssh” by default for communications between Master and Slave tasks,
>> does this mean that the above parameters would need to be configured to
>> use “slogin”, “ssh”, “sshd”, etc.?
>
> No. These are two different things. With all of the above settings (before
> this question) you first configure SGE to intercept the `rsh` resp. `ssh`
> call (hence the application should never use an absolute path to start
> them). This will lead to the effect that `qrsh -inherit ...` will finally
> call the communication method which is configured by "rsh_command" and
> "rsh_daemon". If possible they should stay as "builtin". Then SGE will use
> its own internal communication to start the slave tasks, hence the cluster
> needs no `ssh` or `rsh` at all. In my clusters this is even disabled for
> normal users, and only admins can `ssh` to the nodes (if a user needs X11
> forwarding to a node, this would be special of course). To let users check
> a node they have to run an interactive job in a special queue, which grants
> only 10 seconds of CPU time (while the wallclock time can be almost
> infinity).
>
> Other settings for these parameters are covered in this document - also
> different kinds of communication can be set up for different nodes and
> directions of the calls:
>
> https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
>
> Let me know in case you need further details.
>
> -- Reuti
>
>> Apologies for all the questions. I just want to ensure I understand the
>> PEs a bit more.
>> Kind Regards,
>>
>> -------
>> Wayne Lee
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
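For an MPI library with builtin SGE support (the "start_/stop_proc_args can be set to NONE" case from point 4), a tight-integration PE definition might look roughly like this - a sketch only; the PE name, slot count and allocation_rule are site-specific choices, and the field set is that of sge_pe(5):

```
pe_name            mpi_tight
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
```

control_slaves TRUE is what permits `qrsh -inherit ...` to start slave tasks under sge_shepherd on the granted nodes; job_is_first_task FALSE because for such MPIs the job script/mpirun itself does not occupy one of the parallel tasks.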
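The wrapper mechanism discussed in points 4 and 5 can be sketched as follows - a simplified, illustrative variant in the spirit of the `rsh` template shipped in $SGE_ROOT/mpi, not the exact script. The stub `qrsh`, the "stub" directory and the node name are fabricated here only so the redirection can be demonstrated outside a real cluster:

```shell
# Stand-in `qrsh` that just echoes its arguments, so the redirection can be
# shown without a cluster (in production this is SGE's own qrsh binary).
mkdir -p stub
printf '#!/bin/sh\necho qrsh "$@"\n' > stub/qrsh
chmod +x stub/qrsh

# The wrapper itself. Under a tight PE, start_proc_args links such a script
# as "rsh" into a directory that SGE puts first in PATH for the job, so the
# MPI library's plain `rsh host cmd` call lands here instead.
cat > stub/rsh <<'EOF'
#!/bin/sh
# Outside an SGE job (no JOB_ID set) fall through to the real rsh
# (path illustrative).
[ -z "$JOB_ID" ] && exec /usr/bin/rsh "$@"
# Skip leading rsh options such as -n; the first plain argument is the host.
while [ "${1#-}" != "$1" ]; do shift; done
host=$1
shift
# Hand the remote start back to SGE, so the slave task runs under a
# sge_shepherd on the granted node and is accounted and controllable.
exec qrsh -inherit "$host" "$@"
EOF
chmod +x stub/rsh

# Simulate an MPI library calling `rsh -n node03 uptime` inside a job:
JOB_ID=4711 PATH="$PWD/stub:$PATH" "$PWD/stub/rsh" -n node03 uptime
```

In a real PE the classic startmpi.sh -catch_rsh does this linking into $TMPDIR; the wrapper's name (rsh, ssh, remsh, ...) only has to match whatever the MPI library calls, which is exactly the point confirmed in question 5.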

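The $pe_hostfile reformatting that start_proc_args historically did for MPICH(1) can be sketched like this - a hedged example on a fabricated sample file. The four-column line format (host, slots, queue, processor range) is what SGE writes to $PE_HOSTFILE; the output is the one-host-per-slot machinefile that MPICH(1)'s mpirun expects:

```shell
# Fabricated sample in the format SGE writes to $PE_HOSTFILE:
cat > pe_hostfile.sample <<'EOF'
node01 2 all.q@node01 UNDEFINED
node02 4 all.q@node02 UNDEFINED
EOF

# Expand "host slots ..." into one host line per granted slot - the kind of
# reformatting a start_proc_args script performs before mpirun runs.
awk '{ for (i = 0; i < $2; i++) print $1 }' pe_hostfile.sample > machines
cat machines
```

SGE-aware libraries (point 6) do the equivalent of this internally after reading $PE_HOSTFILE themselves, which is why no such script is needed for them.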