On Tue, Feb 18, 2014 at 06:39:12AM -0800, Ralph Castain wrote:
> On Feb 18, 2014, at 6:24 AM, Adrian Reber <adr...@lisas.de> wrote:
> 
> > On Fri, Feb 14, 2014 at 02:51:51PM -0800, Ralph Castain wrote:
> >> On Feb 13, 2014, at 11:26 AM, Adrian Reber <adr...@lisas.de> wrote:
> >>> I tried to implement something like you described. It is not yet event
> >>> driven, but before continuing I wanted to get some feedback if it is at
> >>> least the right start:
> >>> 
> >>> https://lisas.de/git/?p=open-mpi.git;a=commitdiff;h=5048a9cec2cd0bc4867eadfd7e48412b73267706
> >>> 
> >>> I looked at the other ORTE_OOB_* macros and tried to model my
> >>> functionality a bit after what I have seen there. Right now it is still
> >>> a simple function which just tries to call ft_event() on all oob
> >>> components. Does this look right so far?
> >> 
> >> Sorry for delay - yes, that looks like the right direction. I would 
> >> suggest doing it via the current state machine, though, by simply defining 
> >> another job or proc state in orte/mca/plm/plm_types.h, and then 
> >> registering a callback function using the 
> >> orte_state.add_job[proc]_state(state, function to be called, 
> >> ORTE_ERR_PRI). Then you can activate it by calling 
> >> ORTE_ACTIVATE_JOB[PROC]_STATE(NULL, state) and it will be handled in the 
> >> proper order.
> > 
> > What is a job/proc in the Open MPI context?
> 
> A "job" is the entire application, while a "proc" is just one process in that 
> application. In this case you could use either one as you are checkpointing 
> the entire job, but all this activity is occurring inside each proc. So I'd 
> suggest defining it as a proc state since it only really involves local 
> actions.
> 
> If you like, I can define the required code in the trunk and let you fill in 
> the event functionality.

That would be great.
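
For reference, a minimal standalone sketch of the register-then-activate
pattern described above. It is not ORTE code: every name in it
(proc_state_t, STATE_FT_CHECKPOINT, add_proc_state, activate_proc_state,
on_ft_checkpoint, run_demo) is a hypothetical stand-in, and the callback
runs synchronously here, whereas the real state machine presumably
dispatches it through its event loop with a priority.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names are taken from ORTE. */
typedef enum { STATE_RUNNING = 0, STATE_FT_CHECKPOINT, STATE_MAX } proc_state_t;
typedef void (*state_cbfunc_t)(int proc_rank, proc_state_t state);

/* One callback slot per state; NULL means nothing is registered. */
static state_cbfunc_t callbacks[STATE_MAX];
static int ft_event_calls = 0;

/* Register a callback for a state. Returns 0 on success, -1 on bad input. */
static int add_proc_state(proc_state_t state, state_cbfunc_t cb)
{
    if ((int)state < 0 || (int)state >= STATE_MAX || cb == NULL) {
        return -1;
    }
    callbacks[state] = cb;
    return 0;
}

/* Activate a state: look up its callback and run it immediately.
 * Returns -1 if no callback is registered for that state. */
static int activate_proc_state(int proc_rank, proc_state_t state)
{
    if ((int)state < 0 || (int)state >= STATE_MAX || callbacks[state] == NULL) {
        return -1;
    }
    callbacks[state](proc_rank, state);
    return 0;
}

/* Assumed place where ft_event() would be called on the oob components. */
static void on_ft_checkpoint(int proc_rank, proc_state_t state)
{
    (void)state;
    ft_event_calls++;
    printf("proc %d: checkpoint state activated\n", proc_rank);
}

/* Register the checkpoint callback, activate it once for proc 0,
 * and return how many times the callback has run so far. */
int run_demo(void)
{
    int rc = add_proc_state(STATE_FT_CHECKPOINT, on_ft_checkpoint);
    assert(rc == 0);
    rc = activate_proc_state(0, STATE_FT_CHECKPOINT);
    assert(rc == 0);
    return ft_event_calls;
}
```

The point of the indirection is that the code requesting a checkpoint
only names a state; what runs in response is decided at registration
time.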

                Adrian
