On Jul 26, 2006, at 5:06 PM, Phil Carns wrote:
I don't see why the two have to be dependent for this to work.
Do you mean that, with the parent posting a job, the state machine
stepping code would handle the actual posting? I was assuming
that the parent state action could just call
job_concurrent_sm_post (or whatever it's called).
Could it be similar to the request scheduler job posting code?
The parent state action could call job_concurrent_sm_post with an
array of the child sms, which just calls sm_post and adds the
parent sm and its array to an operation queue. Then a
job_concurrent_sm_test function could test for completion of a
parent sm by looking at all the sms in the array to see if they
completed. The job_testcontext code would have to be modified of
course (maybe rework the do_one_test_cycle_req_sched function to
also test parent sm jobs), but all of that still seems to be
independent of the state machine code (i.e. someone could use the
state machine code separately and drive state machines using
something other than the job framework). I don't know if all
that makes sense in the context of the changes you've made, but
that's what I had in mind when I suggested posting a job for the
parent.
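
The polling scheme described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: the struct names
(polled_child, polled_parent_job) and the helpers concurrent_sm_post and
concurrent_sm_test are hypothetical stand-ins for the proposed
job_concurrent_sm_post and job_concurrent_sm_test, not existing PVFS2
code. The real post call would also invoke sm_post on each child, and
the test would be driven from job_testcontext.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a child state machine. */
struct polled_child {
    int completed;                    /* set to 1 when the child finishes */
};

/* Hypothetical operation-queue entry: a parent plus its child array. */
struct polled_parent_job {
    struct polled_child **children;   /* array passed to the post call */
    size_t count;                     /* number of children in the array */
    struct polled_parent_job *next;   /* link in the operation queue */
};

/* Head of the (assumed) operation queue of posted parent jobs. */
static struct polled_parent_job *op_queue = NULL;

/* Record the parent and its children on the operation queue.
 * A real version would also call sm_post for each child here. */
void concurrent_sm_post(struct polled_parent_job *job,
                        struct polled_child **children, size_t count)
{
    assert(job != NULL && (children != NULL || count == 0));
    job->children = children;
    job->count = count;
    job->next = op_queue;
    op_queue = job;
}

/* Return 1 if every child in the array has completed, 0 otherwise.
 * This is the linear scan a test cycle would run per parent job. */
int concurrent_sm_test(const struct polled_parent_job *job)
{
    for (size_t i = 0; i < job->count; i++) {
        if (!job->children[i]->completed)
            return 0;
    }
    return 1;
}
```

Each test cycle costs one pass over the child array per parent job,
which is the iteration cost discussed later in this thread.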
I think I follow what you are describing, but I am not entirely
sure. If so, I think there is one advantage to the approach that
Walt has been hashing out thus far. I think that what Walt is
describing is event-driven, in a sense. No one has to actively
look to see if all of the children have finished. Instead, the
children each send notification (by calling a release function or
manually decrementing a counter) in their completion function, with
the parent eventually getting a single notification (representing
all of the children) through the existing job completion queue
mechanism.
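
A rough sketch of the event-driven alternative, again under assumptions:
the names notified_parent, notifying_child, parent_start_children,
child_complete, and queue_parent_completion are hypothetical, and the
"completion queue" is reduced to a counter so the example stays
self-contained. It is single-threaded; a real version would need a lock
or an atomic decrement if children can finish concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parent state machine bookkeeping. */
struct notified_parent {
    int pending;    /* children that have not finished yet */
    int notified;   /* times the parent was placed on the completion queue */
};

/* Hypothetical child state machine with a back-pointer to its parent. */
struct notifying_child {
    struct notified_parent *parent;
};

/* Stand-in for adding the parent to the job completion queue. */
static void queue_parent_completion(struct notified_parent *p)
{
    p->notified++;
}

/* Parent state action: set the counter and link each child back. */
void parent_start_children(struct notified_parent *p,
                           struct notifying_child *children, size_t count)
{
    p->pending = (int)count;
    p->notified = 0;
    for (size_t i = 0; i < count; i++)
        children[i].parent = p;
}

/* Child completion function: decrement the counter; the last child
 * to finish triggers the single notification for the parent. */
void child_complete(struct notifying_child *c)
{
    struct notified_parent *p = c->parent;
    assert(p != NULL && p->pending > 0);
    if (--p->pending == 0)
        queue_parent_completion(p);
}
```

Nothing scans the children here; the cost is the back-pointer in each
child and the check for a parent in the completion path.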
I think I'm getting voted down here, so I should probably just
shut up, but I don't think in practice we're going to have so many
child state machines that iterating through the list is at all
costly. I'm arguing for simpler mechanisms that fit in with the job
subsystem over something fancier that might perform slightly
better.
I think that the way that you describe would work fine too, but it
would require a little more active work to check the status of the
array of child SMs and would require more code to keep track of them.
Probably a bit more code, yes, but it seems cleaner than keeping
around backpointers and checking for parents. Instead of driving all
state machines from one place, this event notification scheme
essentially replaces the last child state machine with the parent,
which seems like a bit of a hack and harder to debug.
-sam
I think you are right, though, that you could pull off your version
without the children actually having to make a job_* call.
-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers