Just a few more questions on the RAS functions:
- We found this old thread
<http://www.open-mpi.org/community/lists/users/2007/04/2974.php> in
which you said that ORTE doesn't allow spawning processes outside the RAS
allocation (e.g. with MPI_Comm_spawn). Is that still the case? If so, is the
allocate() function called only once?
- Where can we find the number of requested processes for the task? We
searched the *orte_job_t* passed to allocate() but could not find it. We
don't know whether taking it from argv ("mpirun -np 4 ...") is a good strategy.
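
To make the argv idea concrete, this is roughly the fallback we have in mind.
It is only a sketch: requested_procs_from_argv is a hypothetical helper of
ours, not an ORTE function, and it only recognizes a single "-np N" or "-n N"
pair, so it would miss process counts that come from hostfiles, app files, or
multiple app contexts.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of ORTE): scan an mpirun-style argv for
 * "-np N" or "-n N" and return N, or -1 if the option is absent or the
 * value is not a positive integer. */
static int requested_procs_from_argv(int argc, char **argv)
{
    /* Start at 1 to skip the program name; stop while a value can follow. */
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-np") == 0 || strcmp(argv[i], "-n") == 0) {
            char *end = NULL;
            errno = 0;
            long n = strtol(argv[i + 1], &end, 10);
            /* Reject empty input, trailing garbage, overflow, and n <= 0. */
            if (errno != 0 || end == argv[i + 1] || *end != '\0' ||
                n <= 0 || n > INT_MAX) {
                return -1;
            }
            return (int)n;
        }
    }
    return -1;
}
```

For "mpirun -np 4 ./app" this returns 4. It returns -1 when no count is given
on the command line, which is the case where we would still need the value
from ORTE itself.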


We really appreciate your help, guys :)

Cheers,
Federico Reghenzani

2015-04-13 21:08 GMT+02:00 Ralph Castain <r...@open-mpi.org>:

> Yes - but the processes must stay in the same location
>
> On Apr 13, 2015, at 12:02 PM, Federico Reghenzani <
> federico1.reghenz...@mail.polimi.it> wrote:
>
> Thank you.
>
> And, as a workaround, is it possible to temporarily suspend the processes
> on a node and resume them later (at the RM's request)?  I saw in the code
> that orted can receive SIGTSTP and SIGCONT to suspend/resume processes.
>
>
> Cheers,
> Federico Reghenzani
>
> 2015-04-10 16:58 GMT+02:00 Ralph Castain <r...@open-mpi.org>:
>
>> I’m afraid not. The MPI job would not be very happy to suddenly lose some
>> nodes during execution, and relocating MPI processes during execution is
>> something we don’t currently support.
>>
>> There is work underway to integrate the RM more fully into that procedure
>> so it could tell the MPI job to checkpoint, wait until that completed,
>> terminate the job, and then fast-restart it on the new nodes - but that
>> isn’t here yet.
>>
>>
>> On Apr 10, 2015, at 7:54 AM, Federico Reghenzani <
>> federico1.reghenz...@mail.polimi.it> wrote:
>>
>> Can the RM ask for the deallocation of some nodes?
>>
>> For example, mpirun asks the RM which resources are available (say
>> node1, node2, node3) and spawns orted on those nodes. Some time later,
>> while the job is running, can the RM ask to deassign node3, or to reassign
>> the jobs on node3 to node4?
>>
>> Cheers,
>> Federico Reghenzani
>>
>> 2015-03-26 18:09:22 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>>
>> P.S. also check ESS (orte/mca/ess) for environment setup.
>> 2015-03-26 18:06 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>> >
>> > 2015-03-26 17:58 GMT+06:00 Gianmario Pozzi <pozzigmario_at_[hidden]>:
>> >
>> >> Hi everyone,
>> >> I'm an Italian M.Sc. student in Computer Engineering at Politecnico di
>> >> Milano.
>> >>
>> >> My team and I are trying to integrate OpenMPI with BBQ (
>> >> http://bosp.dei.polimi.it/ ), a real-time resource manager written by a
>> >> group of students. We are running into some trouble, though.
>> >>
>> >> Our main issue is understanding how ORTE interacts with the resource
>> >> manager, and which parts of the code (if any) are executed on the "slave"
>> >> nodes and which on the "master".
>> >> We spent some time looking at the source code but we still have many
>> >> doubts.
>> >>
>> >
>> > Hello,
>> > check orte/mca/plm and orte/mca/ras
>> > PLM - process lifecycle manager
>> > RAS - resource allocation subsystem.
>> >
>> > In RAS, mpirun detects which RM it is running under and gets the allocation.
>> > In PLM, the remote processes are spawned.
>> > mpirun spawns the orted daemons on the slave nodes, and all the rest is done
>> > without RM intervention (IMHO).
>> >
>> >
>> >>
>> >> Thank you.
>> >>
>> >> _______________________________________________
>> >> devel mailing list
>> >> devel_at_[hidden]
>> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> >> Link to this post:
>> >> http://www.open-mpi.org/community/lists/devel/2015/03/17157.php
>> >>
>> >
>> >
>> >
>> > --
>> > С Уважением, Поляков Артем Юрьевич
>> > Best regards, Artem Y. Polyakov
>> >
>>