Thanks for your response, Reuti.  Actually, I had seen you mention the SGE 
mailing list in response to a similar question, but I can't for the life of me 
find that list :(

As for using the background queue, just to clarify - is the idea to submit my 
parallel job to a regular queue with 100 processors at nice 0, but allow other 
'background queue' jobs on the same processors at nice 19?  Presumably, I'd 
still need MPI-2's dynamic process management to free up processors when they 
are not needed (at the moment they sit at 100% CPU idling in MPI_Recv, for 
example).  Did I understand you correctly?
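
For what it's worth, below is a rough sketch of the grow/shrink logic I have in 
mind - just a minimal example with placeholder names ("./worker", the hard-coded 
worker count), not my actual code: the master spawns a batch of workers with 
MPI_Comm_spawn when the task count rises and disconnects them again when the 
burst is done, so the freed cores are left to the nice-19 background jobs.

/* Minimal sketch with placeholder names - not my real code.  SGE knows
 * nothing about the spawned processes, which is exactly the gap I was
 * asking about. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nworkers = 16;                 /* chosen from the current work load */
    MPI_Comm workers = MPI_COMM_NULL;

    /* Grow: launch extra worker processes on demand. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, nworkers, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &workers, MPI_ERRCODES_IGNORE);

    /* ... distribute tasks over the 'workers' intercommunicator ... */

    /* Shrink: the workers exit after their last task and both sides call
     * MPI_Comm_disconnect, so these cores drop back to running only the
     * nice-19 background jobs. */
    MPI_Comm_disconnect(&workers);

    MPI_Finalize();
    return 0;
}

In the meantime I could at least make the idle ranks poll with MPI_Iprobe and a 
short sleep instead of blocking in MPI_Recv, so the background jobs actually get 
the idle cycles.
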
-- 
Will

--- On Tue, 1/25/11, Reuti <re...@staff.uni-marburg.de> wrote:

> From: Reuti <re...@staff.uni-marburg.de>
> Subject: Re: [OMPI users] openmpi's mpi_comm_spawn integrated with sge?
> To: "Open MPI Users" <us...@open-mpi.org>
> Date: Tuesday, January 25, 2011, 9:27 AM
> On 25.01.2011, at 12:32, Terry Dontje wrote:
> 
> > On 01/25/2011 02:17 AM, Will Glover wrote:
> >> Hi all,
> >> I tried a Google/mailing list search for this but came up with
> >> nothing, so here goes:
> >> 
> >> Is there any level of automation between Open MPI's dynamic
> >> process management and the SGE queue manager?
> >> In particular, can I make a call to mpi_comm_spawn and have SGE
> >> dynamically increase the number of slots?
> >> 
> >> This seems a little far-fetched, but it would be really useful if
> >> this is possible.  My application is 'restricted' to coarse-grain
> >> task parallelism and involves a workload that varies significantly
> >> during runtime (between 1 and ~100 parallel tasks).  Dynamic
> >> process management would maintain an optimal number of processors
> >> and reduce idling.
> >> 
> >> Many thanks,
> >> 
> > This is an interesting idea, but no integration has been done that
> > would allow an MPI job to request more slots.
> 
> 
> Similar ideas came up on the former SGE mailing list a couple of
> times - having varying resource requests over the lifetime of a job
> (cores, memory, licenses, ...). In the end this would mean having
> some kind of real-time queuing system, since the necessary resources
> have to be guaranteed to be free at the right time.
> 
> Besides this, some syntax would also be needed, either for requesting
> a "resource profile over time" when such a job is submitted, or for
> allowing a running job to issue commands that request/release
> resources on demand.
> 
> If you have such a "resource profile over time" for a bunch of jobs,
> it could then be extended to solve a cutting-stock problem where the
> unit to be cut is time, e.g. arrange these 10 jobs so that they all
> finish in the least amount of time - and you could predict exactly
> when each job will end. This is getting really complex.
> 
> ==
> 
> What can be done in your situation: have some kind of "background
> queue" with a nice value of 19, while you submit the parallel job to
> a queue with the default nice value of 0. Although you request 100
> cores and reserve them (i.e. the background queue shouldn't be
> suspended in such a case, of course), the background queue will
> still run at full speed when nothing else is running on the nodes.
> When some of the parallel tasks are started on the nodes, they will
> get most of the computing time (this means intentional
> oversubscription). The background queue can be used for less
> important jobs. Such a setup is useful when your parallel
> application isn't running in parallel all the time, as in your case.
> 
> -- Reuti
> 
> 
> > -- 
> > Terry D. Dontje | Principal Software Engineer
> > Developer Tools Engineering | +1.781.442.2631
> > Oracle - Performance Technologies
> > 95 Network Drive, Burlington, MA 01803
> > Email terry.don...@oracle.com
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 



