Just a few more questions about the RAS functions:

- We found this old thread <http://www.open-mpi.org/community/lists/users/2007/04/2974.php> in which you said that ORTE doesn't allow spawning processes outside the RAS allocation (e.g. with MPI_Comm_spawn). Is that still the case? If so, is allocate() called only once?
- Where can we find the number of requested processes for the task? We searched the *orte_job_t* passed to allocate() but could not find it. We are not sure whether taking it from argv ("mpirun -np 4 ...") is a good strategy.

We really appreciate your help, guys :)

Cheers,
Federico Reghenzani

2015-04-13 21:08 GMT+02:00 Ralph Castain <r...@open-mpi.org>:

> Yes - but the processes must stay in the same location
>
> On Apr 13, 2015, at 12:02 PM, Federico Reghenzani <
> federico1.reghenz...@mail.polimi.it> wrote:
>
> Thank you.
>
> And, as a workaround, is it possible to temporarily suspend processes on a
> node and later resume them (requested by RM)? I saw in the code that orted
> can receive SIGTSTP and SIGCONT to suspend/resume processes.
>
>
> Cheers,
> Federico Reghenzani
>
> 2015-04-10 16:58 GMT+02:00 Ralph Castain <r...@open-mpi.org>:
>
>> I’m afraid not. The MPI job would not be very happy to suddenly lose some
>> nodes during execution, and relocating MPI processes during execution is
>> something we don’t currently support.
>>
>> There is work underway to integrate the RM more fully into that procedure
>> so it could tell the MPI job to checkpoint, wait until that completed,
>> terminate the job, and then fast-restart it on the new nodes - but that
>> isn’t here yet.
>>
>>
>> On Apr 10, 2015, at 7:54 AM, Federico Reghenzani <
>> federico1.reghenz...@mail.polimi.it> wrote:
>>
>> Can the RM ask for deallocation of some nodes?
>>
>> For example, mpirun asks the RM which resources are available (say
>> node1, node2, node3) and spawns orted on the nodes. After some time during
>> the execution, can the RM ask to deassign node3 or reassign jobs on
>> node3 to node4?
>>
>> Cheers,
>> Federico Reghenzani
>>
>> 2015-03-26 18:09:22 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>>
>> P.S. also check ESS (orte/mca/ess) for environment setup.
>> 2015-03-26 18:06 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>> >
>> > 2015-03-26 17:58 GMT+06:00 Gianmario Pozzi <pozzigmario_at_[hidden]>:
>> >
>> >> Hi everyone,
>> >> I'm an Italian M.Sc. student in Computer Engineering at Politecnico di
>> >> Milano.
>> >>
>> >> My team and I are trying to integrate OpenMPI with a real-time resource
>> >> manager named BBQ, written by a group of students (
>> >> http://bosp.dei.polimi.it/ ). We are encountering some troubles, though.
>> >>
>> >> Our main issue is to understand how ORTE interacts with the resource
>> >> manager, which parts of the code (if any) are executed on the "slave" nodes
>> >> and which ones on the "master".
>> >> We spent some time looking at the source code but we still have many
>> >> doubts.
>> >>
>> >
>> > Hello,
>> > check orte/mca/plm and orte/mca/ras
>> > PLM - process lifecycle manager
>> > RAS - resource allocation subsystem.
>> >
>> > In RAS, mpirun detects which RM it is running under and gets the allocation.
>> > In PLM, the spawn of remote processes is done.
>> > mpirun spawns orted daemons on the slave nodes and all the rest is done
>> > without RM intervention (IMHO).
>> >
>> >
>> >>
>> >> Thank you.
>> >>
>> >> _______________________________________________
>> >> devel mailing list
>> >> devel_at_[hidden]
>> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> >> Link to this post:
>> >> http://www.open-mpi.org/community/lists/devel/2015/03/17157.php
>> >>
>> >
>> >
>> >
>> > --
>> > С Уважением, Поляков Артем Юрьевич
>> > Best regards, Artem Y. Polyakov
>> >
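
On the suspend/resume idea raised in the thread (stopping processes on a node and continuing them later, in place), here is a minimal sketch of the underlying POSIX mechanism. It is not ORTE code and does not show how orted handles these signals. The helper names (spawn_worker, suspend, resume, terminate) are hypothetical, and it sends SIGSTOP rather than the SIGTSTP mentioned above, because SIGSTOP cannot be caught or ignored, which keeps the example deterministic. It assumes a Unix-like system where os.WCONTINUED is available.

```python
import os
import signal
import time


def spawn_worker():
    """Fork a child that idles in a loop, standing in for a local process."""
    pid = os.fork()
    if pid == 0:
        # Child: do nothing until it is killed by the parent.
        while True:
            time.sleep(0.1)
    return pid


def suspend(pid):
    """Stop the child and block until the kernel reports it as stopped."""
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)


def resume(pid):
    """Continue a stopped child and block until it is reported as continued."""
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, os.WCONTINUED)
    return os.WIFCONTINUED(status)


def terminate(pid):
    """Kill the child and reap it so no zombie is left behind."""
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status)


if __name__ == "__main__":
    worker = spawn_worker()
    print("suspended:", suspend(worker))
    print("resumed:", resume(worker))
    print("terminated:", terminate(worker))
```

The process keeps its PID and stays on the same host the whole time, which matches the constraint stated in the reply above: suspending is fine as long as the processes stay in the same location.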