> On Jun 11, 2015, at 1:13 AM, Federico Reghenzani
> <[email protected]> wrote:
>
> Just some other questions on the RAS functions:
> - we found this old thread
> <http://www.open-mpi.org/community/lists/users/2007/04/2974.php> in which you
> said that ORTE doesn't allow spawning processes outside the RAS allocation
> (e.g. with MPI_Comm_spawn). Is that still the case? So, is the allocate()
> function called only once?
Not really - you can pass an MPI_Info key, "add-host" or "add-hostfile", to MPI_Comm_spawn.
> - where can we find the number of requested processes for the task? We
> searched the orte_job_t passed to allocate() but could not find it. We don't
> know whether taking it from argv ("mpirun -np 4 ...") is a good strategy.
It is in the orte_job_t:
/* number of procs in this job */
orte_vpid_t num_procs;
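
A minimal sketch of reading that field from inside a RAS component, assuming allocate() receives a pointer to the job object. The types below are simplified stand-ins declared only so the example compiles on its own (the real orte_job_t in the Open MPI tree has many more fields), and requested_procs() is a hypothetical helper, not part of ORTE. Whether num_procs is already populated at the time allocate() runs is worth checking against the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the ORTE types (assumption: the real
 * definitions live in the Open MPI source tree and are larger). */
typedef uint32_t orte_vpid_t;

typedef struct {
    /* number of procs in this job */
    orte_vpid_t num_procs;
} orte_job_t;

/* Hypothetical helper: the process count a RAS component could
 * forward to the resource manager when asking for resources. */
orte_vpid_t requested_procs(const orte_job_t *jdata)
{
    assert(jdata != NULL);
    return jdata->num_procs;
}
```

Reading the count from the job object avoids re-parsing argv, which would miss jobs whose size is set some other way than "-np".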
>
>
> We really appreciate your help guys :)
>
> Cheers,
> Federico Reghenzani
>
> 2015-04-13 21:08 GMT+02:00 Ralph Castain <[email protected]
> <mailto:[email protected]>>:
> Yes - but the processes must stay in the same location
>
>> On Apr 13, 2015, at 12:02 PM, Federico Reghenzani
>> <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Thank you.
>>
>> And, as a workaround, is it possible to temporarily suspend processes on a
>> node and later resume them (when requested by the RM)? I saw in the code that
>> orted can receive SIGTSTP and SIGCONT to suspend/resume processes.
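
For illustration only, the generic POSIX mechanism behind suspend/resume can be sketched as below. This is not ORTE code: the function names are hypothetical, and the sketch sends SIGSTOP (which cannot be caught or ignored) to a child it forked itself, then SIGCONT to resume it in place, using waitpid() to confirm each state change.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that idles until it is signalled.
 * Returns the child's pid in the parent, or -1 on error. */
pid_t spawn_idle_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) {
            pause();            /* sleep until a signal arrives */
        }
    }
    return pid;
}

/* Stop the child and wait until it is reported as stopped. */
int suspend_child(pid_t pid)
{
    int status;
    assert(pid > 0);            /* pid <= 0 would address process groups */
    if (kill(pid, SIGSTOP) != 0) return -1;
    if (waitpid(pid, &status, WUNTRACED) != pid) return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}

/* Resume the child on the same node and wait for confirmation. */
int resume_child(pid_t pid)
{
    int status;
    assert(pid > 0);
    if (kill(pid, SIGCONT) != 0) return -1;
    if (waitpid(pid, &status, WCONTINUED) != pid) return -1;
    return WIFCONTINUED(status) ? 0 : -1;
}

/* Kill and reap the child so no process is left behind. */
int terminate_child(pid_t pid)
{
    int status;
    assert(pid > 0);
    if (kill(pid, SIGKILL) != 0) return -1;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? 0 : -1;
}
```

A caller would invoke spawn_idle_child(), then suspend_child() and resume_child() on the returned pid, and finally terminate_child(). The process never moves while stopped.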
>>
>>
>> Cheers,
>> Federico Reghenzani
>>
>> 2015-04-10 16:58 GMT+02:00 Ralph Castain <[email protected]
>> <mailto:[email protected]>>:
>> I’m afraid not. The MPI job would not be very happy to suddenly lose some
>> nodes during execution, and relocating MPI processes during execution is
>> something we don’t currently support.
>>
>> There is work underway to integrate the RM more fully into that procedure so
>> it could tell the MPI job to checkpoint, wait until that completed,
>> terminate the job, and then fast-restart it on the new nodes - but that
>> isn’t here yet.
>>
>>
>>> On Apr 10, 2015, at 7:54 AM, Federico Reghenzani
>>> <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> Can the RM ask for deallocation of some nodes?
>>>
>>> For example, mpirun asks the RM which resources are available (say
>>> node1, node2, node3) and spawns orted on those nodes. After some time, during
>>> the computation, can the RM ask to deassign node3 or reassign the jobs on
>>> node3 to node4?
>>>
>>> Cheers,
>>> Federico Reghenzani
>>>
>>> 2015-03-26 18:09:22 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>>>
>>> P.S. also check ESS (orte/mca/ess) for environment setup.
>>> 2015-03-26 18:06 GMT+06:00 Artem Polyakov <artpol84_at_[hidden]>:
>>> >
>>> > 2015-03-26 17:58 GMT+06:00 Gianmario Pozzi <pozzigmario_at_[hidden]>:
>>> >
>>> >> Hi everyone,
>>> >> I'm an Italian M.Sc. student in Computer Engineering at Politecnico di
>>> >> Milano.
>>> >>
>>> >> My team and I are trying to integrate Open MPI with BBQ, a real-time
>>> >> resource manager written by a group of students (
>>> >> http://bosp.dei.polimi.it/ ). We are running into some trouble, though.
>>> >>
>>> >> Our main issue is understanding how ORTE interacts with the resource
>>> >> manager, and which parts of the code (if any) are executed on the "slave"
>>> >> nodes and which on the "master".
>>> >> We spent some time looking at the source code but we still have many
>>> >> doubts.
>>> >>
>>> >
>>> > Hello,
>>> > check orte/mca/plm and orte/mca/ras
>>> > PLM - process lifecycle manager
>>> > RAS - resource allocation subsystem.
>>> >
>>> > In RAS, mpirun detects which RM it is running under and gets the allocation.
>>> > In PLM, the remote processes are spawned.
>>> > mpirun spawns orted daemons on the slave nodes, and all the rest is done
>>> > without RM intervention (IMHO).
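
As a rough illustration of what detecting the RM can look like, a launcher can probe for environment variables that batch systems export inside a job. The sketch below is not taken from ORTE: the enum, the probe table, and detect_rm() are hypothetical. SLURM_JOBID and PBS_JOBID are variables that Slurm and PBS/Torque set for running jobs, and the real RAS components do considerably more than this.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical identifiers for the resource managers we probe for. */
typedef enum { RM_NONE = 0, RM_SLURM, RM_PBS } rm_kind_t;

typedef struct {
    const char *env_var;   /* variable the RM exports inside a job */
    rm_kind_t   kind;
} rm_probe_t;

/* Probed in order; the first match wins. */
static const rm_probe_t rm_probes[] = {
    { "SLURM_JOBID", RM_SLURM },
    { "PBS_JOBID",   RM_PBS   },
};

/* Return the first RM whose marker variable is set and non-empty,
 * or RM_NONE when the process is not running inside a known job. */
rm_kind_t detect_rm(void)
{
    size_t n = sizeof(rm_probes) / sizeof(rm_probes[0]);
    for (size_t i = 0; i < n; i++) {
        assert(rm_probes[i].env_var != NULL);  /* table entries must name a variable */
        const char *val = getenv(rm_probes[i].env_var);
        if (val != NULL && val[0] != '\0') {
            return rm_probes[i].kind;
        }
    }
    return RM_NONE;
}
```

A new resource manager would need its own way of announcing itself; an environment variable set for its jobs is one simple option.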
>>> >
>>> >
>>> >>
>>> >> Thank you.
>>> >>
>>> >> _______________________________________________
>>> >> devel mailing list
>>> >> devel_at_[hidden]
>>> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> >> <http://www.open-mpi.org/mailman/listinfo.cgi/devel>
>>> >> Link to this post:
>>> >> http://www.open-mpi.org/community/lists/devel/2015/03/17157.php
>>> >> <http://www.open-mpi.org/community/lists/devel/2015/03/17157.php>
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > С Уважением, Поляков Артем Юрьевич
>>> > Best regards, Artem Y. Polyakov
>>> >
>>>
>>> --
>>> С Уважением, Поляков Артем Юрьевич
>>> Best regards, Artem Y. Polyakov
> _______________________________________________
> devel mailing list
> [email protected]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/06/17491.php