Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I don't understand the issue.

I don't think that is a bad idea. There were really two separate but related problems in one of the older flow protocol implementations; I can try to describe them a little more here if I can remember:

- explicitly tracking and testing each trove and bmi operation: It basically kept arrays that listed pending trove and bmi ops, and would call testsome() to service them. This was a problem because of the time it took to keep running up and down those arrays (when building them at the flow level, or when testing them at the trove/bmi level). The solution is to just use testcontext() and let trove/bmi tell you when something finishes, without managing extra state.

- thread switch time: the architecture here was at one time set up to have one thread pushing the test functions for bmi, another thread pushing the test functions for trove, and a third thread processing the flow and posting new operations. The problem here is that it (at the time) took too long to jump between the "pushing" threads and the "processing" thread when an operation finished that should trigger progress on the flow. This led to the thread-mgr.c code and associated callbacks. The callbacks actually drive the flow progress and post new operations. That means that the same thread that pushes testcontext() gets to trigger the next post, without waiting on the latency of waking up a different thread to do something (using a condition variable, etc.). I managed to reuse the thread-mgr for the job code as well, so that one testcontext() call triggers callbacks to both the job and flow interfaces.

I don't think either of the above issues precludes different flow protocol implementations, and they are really kind of orthogonal to whether state machines are used or not. The first issue is solved just by using testcontext() rather than manually tracking operations.

The second issue could be solved in a variety of ways, some of which may be better than what we have now. The callback approach is efficient enough, but is hard to debug. Of course it is also possible that the thread switch (i.e. condition signal) latency is low enough nowadays that you don't even need to worry about it anymore. I last looked at this problem before NPTL arrived on the scene.

At any rate I think a state machine based flow protocol could dodge issue #2 by either:
- lucking out with a faster modern thread implementation
- being smarter about how thread work is divided up
- using callbacks as we do now, and making the state machine mechanism thread safe so that it can be driven directly from those callbacks rather than from a testcontext() work loop

On a related note, it is important to remember that trove has its own internal thread also- so on the trove push side (depending on your design) you could have to worry about a chain of 2 threads that have to be woken up to get something done at completion time. The trove part of that chain can't be avoided without changing the API.

Sorry about the tangent here, but I figured I may as well share some warnings about things to look out for here. I think it would be good to have a cleaner flow protocol implementation.

I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

Heh.  Don't give him ideas! ;-)

I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either way; it's just cleaner in my head to only have state machines posting jobs for themselves.

I see what you are saying. I guess it depends on how you look at it. I had kind of started thinking of the jobs as a signalling mechanism, since they are the construct that "signals" a state machine to make its next transition. The job_null() approach just makes it so that a child state machine is what triggers this particular signal, rather than a bmi/trove/dev/req_sched/flow component. I know this is a change in the model and adds a dependency that wasn't previously there, but at least job_null() is just a few dozen lines of code. If someone reuses the SM code elsewhere, I would guess that is one of the more minor worries, considering that they would need a whole new mechanism (other than the job api) to motivate all of the transitions anyway.

Walt probably got more discussion than he bargained for, but at the least, lively discussion keeps me awake in the afternoon ;-).

Heh- same here :)

-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
