Re: [Pvfs2-developers] terminating state machines

Sam Lang Wed, 26 Jul 2006 19:21:15 -0700


On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:

I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.
Well, as far as the number of SMs goes, I would rather not risk it. I still hope this is lightweight enough that we could eventually use it in more places that would generate a lot of children (like a re-architected sys-io implementation), though I don't know if that will pan out in practice. I got bitten by a similar assumption in the flow protocol: it used to track all of its posted operations for testing rather than relying on someone to notify it of completion. Admittedly the flow protocol is a more obvious case and I should have known better, but at the time it seemed reasonable :)

Hmm... I had been thinking about a flow implementation that used the new concurrent state machine code... it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model throughout pvfs2 though, so maybe I don't understand the issue.

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.
Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.
I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

Heh.  Don't give him ideas! ;-)

I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either way, it's just cleaner in my head to only have state machines posting jobs for themselves.
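
For readers following along, the two approaches compared in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: the types and functions (child_sm, parent_sm, parent_poll, child_complete) are hypothetical and are not PVFS2's actual structures or job API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not PVFS2's real structures. */
enum sm_status { SM_RUNNING, SM_DONE };

struct parent_sm;

struct child_sm {
    enum sm_status status;
    struct parent_sm *parent;   /* backpointer, needed only by approach B */
};

struct parent_sm {
    struct child_sm *children;  /* array of children, scanned by approach A */
    size_t nchildren;
    size_t pending;             /* children still running, used by approach B */
    int resumed;                /* nonzero once the parent may continue */
};

/* Approach A: the parent checks every child each time it is tested.
 * Cost is O(nchildren) per test. */
int parent_poll(struct parent_sm *p)
{
    for (size_t i = 0; i < p->nchildren; i++) {
        if (p->children[i].status != SM_DONE)
            return 0;           /* at least one child is still running */
    }
    p->resumed = 1;
    return 1;
}

/* Approach B: each child decrements a shared counter when it finishes.
 * Cost is O(1) per completion; the last child to finish wakes the parent.
 * In the plan discussed in this thread, this is roughly the point where
 * the child would post a job (via job_null) on the parent's behalf. */
void child_complete(struct child_sm *c)
{
    assert(c->status == SM_RUNNING);
    assert(c->parent->pending > 0);
    c->status = SM_DONE;
    if (--c->parent->pending == 0)
        c->parent->resumed = 1;
}
```

In approach A only the parent ever acts on its own state; in approach B the last child reaches into the parent, which is the constraint violation described above.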

I think having a pointer to the parent actually improves debuggability (though I'm not sure this approach actually requires it, all you really need is either a job descriptor or a pointer to a counter). If I have a state machine that does something bad or gets stuck it would be nice to be able to work backwards to find out who invoked it, without having to search for it in a separate data structure.
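
The debugging argument for a parent pointer can be illustrated with a small sketch. The sm_cb structure and the helper names here are assumptions made up for the example; they are not the real smcb layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control block; field names are illustrative, not PVFS2's smcb. */
struct sm_cb {
    const char *name;
    struct sm_cb *parent;   /* NULL for a top-level state machine */
};

/* Follow parent pointers from a (possibly stuck) state machine
 * up to the top-level state machine that ultimately invoked it. */
const struct sm_cb *sm_root(const struct sm_cb *sm)
{
    assert(sm != NULL);
    while (sm->parent != NULL)
        sm = sm->parent;
    return sm;
}

/* Count how many ancestors sit above this state machine. */
size_t sm_depth(const struct sm_cb *sm)
{
    size_t depth = 0;
    assert(sm != NULL);
    for (; sm->parent != NULL; sm = sm->parent)
        depth++;
    return depth;
}
```

With the backpointer in place, finding the invoker of a misbehaving child is a pointer walk from the child itself, rather than a search through a separate table of running state machines.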
I don't mean to keep struggling with this issue; I honestly think that both approaches are pretty good, and if Walt implements it the way I think he is going to, then 95% of developers won't notice the difference anyway. At this point I am mostly hammering away to make sure I am not missing a larger issue...

Walt probably got more discussion than he bargained for, but at the least, lively discussion keeps me awake in the afternoon ;-).


-sam


-Phil


_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
