Re: [Pvfs2-developers] terminating state machines
Thanks for the detailed explanation Phil. I hadn't thought about the context switches that might slow down flow. I was primarily thinking of something that would be cleaner, and easier to modify and test for different scenarios. If at some point I get around to playing with a flow impl that uses the concurrent state machine framework, I'll open up the discussion again to avoid any of the pitfalls you described.

Cleaner and easier to modify would be great! I just remembered that there are a couple of test programs in the tree to look at the thread context switch overhead, in case they are helpful to figure out if it is still a concern:

pvfs2/test/io/job/thread-bench2.c
pvfs2/test/io/job/thread-bench3.c

One of those just goes through a bunch of iterations relaying a condition across threads to see how long it takes. The second one does the same thing, except with 2 relays instead of one (to mimic the trove side of things). I haven't run these on a decent machine in years. I will also add a disclaimer that the test programs are old and quite possibly wrong :)

We also have the benefit of your small-io optimization now too, so it isn't as critical as it used to be for the flow to keep the latency down on small transfers any more...

-Phil

___ Pvfs2-developers mailing list Pvfs2-developers@beowulf-underground.org http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
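For readers without the tree handy, the relay idea behind those benchmarks can be sketched as below: two threads ping-pong a turn flag under one mutex/condition pair, so the elapsed time divided by the iteration count approximates one cross-thread wakeup. This is a minimal illustration of the technique, not the actual thread-bench2.c code; all names are invented.

```c
/* Sketch of a condition-relay benchmark: each iteration forces two
 * cross-thread wakeups (main -> helper -> main). Timing code is omitted;
 * wrap run_relay() with clock_gettime() to measure average latency. */
#include <pthread.h>

#define ITERATIONS 10000

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int turn = 0; /* 0: main thread's turn, 1: helper's turn */

static void *relay_thread(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mutex);
        while (turn != 1)
            pthread_cond_wait(&cond, &mutex);
        turn = 0;                      /* hand the turn back */
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

/* Runs the relay to completion; returns the number of round trips. */
int run_relay(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, relay_thread, NULL);
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mutex);
        turn = 1;                      /* wake the helper */
        pthread_cond_signal(&cond);
        while (turn != 0)              /* wait for it to hand back */
            pthread_cond_wait(&cond, &mutex);
        pthread_mutex_unlock(&mutex);
    }
    pthread_join(tid, NULL);
    return ITERATIONS;
}
```

The "2 relays" variant mentioned above would simply add a third thread to the chain, doubling the wakeups per iteration.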
Re: [Pvfs2-developers] terminating state machines
On Jul 27, 2006, at 10:16 AM, Phil Carns wrote:

Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I don't understand the issue.

I don't think that is a bad idea. There were really two separate but related problems in one of the older flow protocol implementations. I can try to describe them a little more here if I can remember:

- explicitly tracking and testing each trove and bmi operation: It basically kept arrays that listed pending trove and bmi ops, and would call testsome() to service them. This was a problem because of the time it took to keep running up and down those arrays (when building them at the flow level, or when testing them at the trove/bmi level). The solution is to just use testcontext() and let trove/bmi tell you when something finishes without managing extra state.

- thread switch time: the architecture here was set up at one time to have one thread pushing the test functions for bmi, another thread pushing the test functions for trove, while another thread was processing the flow and posting new operations. The problem here is that it (at the time) took too long to jump between the "pushing" threads and the "processing" thread when an operation finished that should trigger progress on the flow. This led to the thread-mgr.c code and associated callbacks. The callbacks actually drive the flow progress and post new operations. That means that the same thread that pushes testcontext() gets to trigger the next post, without waiting on the latency of waking up a different thread to do something (using a condition variable etc.). I managed to reuse the thread-mgr for the job code as well, so that one testcontext() call triggers callbacks to both the job and flow interfaces.
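The difference between the two test models can be shown with a toy sketch. These are not the real BMI/Trove interfaces; the function names and data structures below are purely illustrative.

```c
/* Toy contrast of testsome() vs. testcontext(). In the testsome() model
 * the caller tracks every pending op id and scans the whole set on each
 * call; in the testcontext() model the resource records completions as
 * they happen and just hands them back. */
#define MAX_OPS 8

static int op_done[MAX_OPS];   /* completion flags kept by the "resource" */
static int cq[MAX_OPS];        /* completion queue, filled at completion time */
static int cq_count;

/* The resource marks an op complete (inside BMI/Trove in the real system). */
void toy_complete(int id)
{
    op_done[id] = 1;
    cq[cq_count++] = id;
}

/* testsome() model: the caller supplies the ids it is tracking and we
 * walk all of them, even when little or nothing has finished. */
int toy_testsome(const int *ids, int count, int *done_ids)
{
    int ndone = 0;
    for (int i = 0; i < count; i++)    /* O(pending) on every call */
        if (op_done[ids[i]])
            done_ids[ndone++] = ids[i];
    return ndone;
}

/* testcontext() model: drain the queue the resource already built;
 * the caller keeps no pending-op array at all. */
int toy_testcontext(int *done_ids, int max)
{
    int ndone = 0;
    while (cq_count > 0 && ndone < max)
        done_ids[ndone++] = cq[--cq_count];
    return ndone;
}
```

The scan cost in `toy_testsome` is what "running up and down those arrays" refers to; `toy_testcontext` pays only for operations that actually finished.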
I don't think either of the above issues precludes different flow protocol implementations, and they are really kind of orthogonal to whether state machines are used or not. The first issue is solved just by using testcontext() rather than manually tracking operations. The second issue could be solved in a variety of ways, some of which may be better than what we have now. The callback approach is efficient enough, but is hard to debug. Of course it is also possible that the thread switch (i.e. condition signal) latency is low enough nowadays that you don't even need to worry about it anymore. I last looked at this problem before NPTL arrived on the scene. At any rate I think a state machine based flow protocol could dodge issue #2 by either:

- lucking out with a faster modern thread implementation
- being smarter about how thread work is divided up
- using callbacks as we do now, and making the state machine mechanism thread safe so that it can be driven directly from those callbacks rather than from a testcontext() work loop

On a related note, it is important to remember that trove has its own internal thread also- so on the trove push side (depending on your design) you could have to worry about a chain of 2 threads that have to be woken up to get something done at completion time. The trove part of that chain can't be avoided without changing the API.

Sorry about the tangent here, but I figured I may as well share some warnings about things to look out for. I think it would be good to have a cleaner flow protocol implementation.

Thanks for the detailed explanation Phil. I hadn't thought about the context switches that might slow down flow. I was primarily thinking of something that would be cleaner, and easier to modify and test for different scenarios. If at some point I get around to playing with a flow impl that uses the concurrent state machine framework, I'll open up the discussion again to avoid any of the pitfalls you described.
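The callback pattern described above (the thread that pushes testcontext() drives the next post directly, instead of waking another thread) can be sketched like this. This is a hedged illustration of the idea behind thread-mgr.c, not its actual code; all names are invented.

```c
/* Completion callbacks fired inline by the progress thread: the callback
 * runs in the same thread that detected completion, so it can post the
 * next operation with no cross-thread wakeup latency. */
#include <stddef.h>

typedef void (*completion_fn)(void *user_ptr, int error_code);

struct toy_op {
    int finished;
    completion_fn callback;
    void *user_ptr;
};

#define TOY_MAX 16
static struct toy_op ops[TOY_MAX];

/* Post an operation with a completion callback; returns an op id. */
int toy_post(completion_fn cb, void *user_ptr)
{
    for (int i = 0; i < TOY_MAX; i++) {
        if (ops[i].callback == NULL) {
            ops[i].callback = cb;
            ops[i].user_ptr = user_ptr;
            ops[i].finished = 0;
            return i;
        }
    }
    return -1;
}

void toy_mark_finished(int id) { ops[id].finished = 1; }

/* The "push" loop: completed ops trigger their callbacks inline, and a
 * callback is free to post follow-up operations immediately. */
int toy_push_progress(void)
{
    int fired = 0;
    for (int i = 0; i < TOY_MAX; i++) {
        if (ops[i].callback && ops[i].finished) {
            completion_fn cb = ops[i].callback;
            void *up = ops[i].user_ptr;
            ops[i].callback = NULL;  /* retire before calling; cb may repost */
            cb(up, 0);
            fired++;
        }
    }
    return fired;
}

/* Tiny demo: one op whose callback records that progress was driven. */
static int progress_count;
static void demo_cb(void *user_ptr, int error_code)
{
    (void)user_ptr; (void)error_code;
    progress_count++;
}

int toy_demo(void)
{
    int id = toy_post(demo_cb, NULL);
    toy_mark_finished(id);
    toy_push_progress();
    return progress_count;
}
```

The hard-to-debug part Phil mentions comes from the same property that makes this fast: arbitrary flow logic runs deep inside the progress loop's call stack.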
-sam

I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

Heh. Don't give him ideas! ;-) I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either w
Re: [Pvfs2-developers] terminating state machines
Sam Lang wrote:

On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:

I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.

Well, as far as the number of SMs goes, I would rather not risk it. I still hope this is lightweight enough that we could eventually use it in more places that would generate a lot of children (like a re-architected sys-io implementation), though I don't know if that will pan out in practice. I got bitten by a similar assumption in the flow protocol- it used to track all of its posted operations for testing rather than relying on someone to notify it of completion. Admittedly the flow protocol is a more obvious case and I should have known better, but at the time it seemed reasonable :)

Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I don't understand the issue.

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.

Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.

I think I'm lost now. What do you mean by replace?
The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

Heh. Don't give him ideas! ;-) I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either way, it's just cleaner in my head to only have state machines posting jobs for themselves.

I think having a pointer to the parent actually improves debuggability (though I'm not sure this approach actually requires it, all you really need is either a job descriptor or a pointer to a counter). If I have a state machine that does something bad or gets stuck it would be nice to be able to work backwards to find out who invoked it, without having to search for it in a separate data structure. I don't mean to keep struggling with this issue- I honestly think that both approaches are pretty good, and if Walt implements it the way I think he is going to, then 95% of developers won't notice the difference anyway. At this point I am mostly hammering away to make sure I am not missing a larger issue...

Walt probably got more discussion than he bargained for, but at the least, lively discussion keeps me awake in the afternoon ;-).

-sam

-Phil

Good discussion. Phil has convinced me the level of dependency is low, and unless I completely misunderstand Sam, the complexity of the parent pointer/job_null approach is a lot less than the alternative, and I like low complexity.
I also think debugging will be simpler. So that's where I'm going. I'll have to think of other topics to get you guys going from time to time! ;-) Now off to figure out a way to use setjmp/longjmp in my implementation!

Walt

-- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University
Re: [Pvfs2-developers] terminating state machines
Phil Carns wrote:

I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.

Well, as far as the number of SMs goes, I would rather not risk it. I still hope this is lightweight enough that we could eventually use it in more places that would generate a lot of children (like a re-architected sys-io implementation), though I don't know if that will pan out in practice. I got bitten by a similar assumption in the flow protocol- it used to track all of its posted operations for testing rather than relying on someone to notify it of completion. Admittedly the flow protocol is a more obvious case and I should have known better, but at the time it seemed reasonable :)

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.

Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.

I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

What? What? How else would I do it?
;-)

I think having a pointer to the parent actually improves debuggability (though I'm not sure this approach actually requires it, all you really need is either a job descriptor or a pointer to a counter). If I have a state machine that does something bad or gets stuck it would be nice to be able to work backwards to find out who invoked it, without having to search for it in a separate data structure. I don't mean to keep struggling with this issue- I honestly think that both approaches are pretty good, and if Walt implements it the way I think he is going to, then 95% of developers won't notice the difference anyway. At this point I am mostly hammering away to make sure I am not missing a larger issue...

-Phil

-- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University
Re: [Pvfs2-developers] terminating state machines
Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I don't understand the issue.

I don't think that is a bad idea. There were really two separate but related problems in one of the older flow protocol implementations. I can try to describe them a little more here if I can remember:

- explicitly tracking and testing each trove and bmi operation: It basically kept arrays that listed pending trove and bmi ops, and would call testsome() to service them. This was a problem because of the time it took to keep running up and down those arrays (when building them at the flow level, or when testing them at the trove/bmi level). The solution is to just use testcontext() and let trove/bmi tell you when something finishes without managing extra state.

- thread switch time: the architecture here was set up at one time to have one thread pushing the test functions for bmi, another thread pushing the test functions for trove, while another thread was processing the flow and posting new operations. The problem here is that it (at the time) took too long to jump between the "pushing" threads and the "processing" thread when an operation finished that should trigger progress on the flow. This led to the thread-mgr.c code and associated callbacks. The callbacks actually drive the flow progress and post new operations. That means that the same thread that pushes testcontext() gets to trigger the next post, without waiting on the latency of waking up a different thread to do something (using a condition variable etc.). I managed to reuse the thread-mgr for the job code as well, so that one testcontext() call triggers callbacks to both the job and flow interfaces.
I don't think either of the above issues precludes different flow protocol implementations, and they are really kind of orthogonal to whether state machines are used or not. The first issue is solved just by using testcontext() rather than manually tracking operations. The second issue could be solved in a variety of ways, some of which may be better than what we have now. The callback approach is efficient enough, but is hard to debug. Of course it is also possible that the thread switch (i.e. condition signal) latency is low enough nowadays that you don't even need to worry about it anymore. I last looked at this problem before NPTL arrived on the scene. At any rate I think a state machine based flow protocol could dodge issue #2 by either:

- lucking out with a faster modern thread implementation
- being smarter about how thread work is divided up
- using callbacks as we do now, and making the state machine mechanism thread safe so that it can be driven directly from those callbacks rather than from a testcontext() work loop

On a related note, it is important to remember that trove has its own internal thread also- so on the trove push side (depending on your design) you could have to worry about a chain of 2 threads that have to be woken up to get something done at completion time. The trove part of that chain can't be avoided without changing the API.

Sorry about the tangent here, but I figured I may as well share some warnings about things to look out for. I think it would be good to have a cleaner flow protocol implementation.

I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something.
If Walt puts a longjmp() in there we can both hit him over the head.

Heh. Don't give him ideas! ;-) I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either way, it's just cleaner in my head to only have state machines posting jobs for themselves.

I see what you are saying. I guess it depends on how you look at it. I had kind of started thinking of the jobs as a signalling mechanism since they are the construct that "signals" a state machine to make its next transition. The job_null() approach just makes it so that a child state machine is what triggers this particular signal, rather than a bmi/trove/dev/req_sched/flow component. I know this is a change in the model and adds a dependency
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:

I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.

Well, as far as the number of SMs goes, I would rather not risk it. I still hope this is lightweight enough that we could eventually use it in more places that would generate a lot of children (like a re-architected sys-io implementation), though I don't know if that will pan out in practice. I got bitten by a similar assumption in the flow protocol- it used to track all of its posted operations for testing rather than relying on someone to notify it of completion. Admittedly the flow protocol is a more obvious case and I should have known better, but at the time it seemed reasonable :)

Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I don't understand the issue.

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.

Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.

I think I'm lost now. What do you mean by replace?
The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.

Heh. Don't give him ideas! ;-) I was operating under the constraint that a state machine can only post a job for itself. If I understand the current plan correctly, using job_null in the child state machine to post a job for the parent breaks that constraint, and so in some sense is a replace (the job_null actually takes the parent smcb pointer). I think you're probably right that it's not a big difference either way, it's just cleaner in my head to only have state machines posting jobs for themselves.

I think having a pointer to the parent actually improves debuggability (though I'm not sure this approach actually requires it, all you really need is either a job descriptor or a pointer to a counter). If I have a state machine that does something bad or gets stuck it would be nice to be able to work backwards to find out who invoked it, without having to search for it in a separate data structure. I don't mean to keep struggling with this issue- I honestly think that both approaches are pretty good, and if Walt implements it the way I think he is going to, then 95% of developers won't notice the difference anyway. At this point I am mostly hammering away to make sure I am not missing a larger issue...

Walt probably got more discussion than he bargained for, but at the least, lively discussion keeps me awake in the afternoon ;-).

-sam

-Phil
Re: [Pvfs2-developers] terminating state machines
I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.

Well, as far as the number of SMs goes, I would rather not risk it. I still hope this is lightweight enough that we could eventually use it in more places that would generate a lot of children (like a re-architected sys-io implementation), though I don't know if that will pan out in practice. I got bitten by a similar assumption in the flow protocol- it used to track all of its posted operations for testing rather than relying on someone to notify it of completion. Admittedly the flow protocol is a more obvious case and I should have known better, but at the time it seemed reasonable :)

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.

Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.

I think I'm lost now. What do you mean by replace? The states are still isolated, jobs trigger the transitions, only one state action gets executed at a time, there still may be a time gap between completion of any given child and when the parent picks up processing again, and there are still frames. I think both approaches will look the same when running unless I missed something. If Walt puts a longjmp() in there we can both hit him over the head.
I think having a pointer to the parent actually improves debuggability (though I'm not sure this approach actually requires it, all you really need is either a job descriptor or a pointer to a counter). If I have a state machine that does something bad or gets stuck it would be nice to be able to work backwards to find out who invoked it, without having to search for it in a separate data structure. I don't mean to keep struggling with this issue- I honestly think that both approaches are pretty good, and if Walt implements it the way I think he is going to, then 95% of developers won't notice the difference anyway. At this point I am mostly hammering away to make sure I am not missing a larger issue...

-Phil
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 5:17 PM, Phil Carns wrote:

Sam Lang wrote:

On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote:

Yeah, the idea is that the SM code would call the job function. Depending on the state actions to do it seems like asking for trouble, all the details that have to be kept up with. Actually, there are already job structs used by the SM code, now I've had to add a context id to the smcb and there will be job calls. I think you are right though, the amount of dependency is pretty small. As for the job funcs I think I'd need one new one to post the parent job, establishing a counter. The child job would look up the counter, decrement, and if zero, call job_null to relaunch the parent, or just replicate what job_null does, whatever seems easiest.

I would rather see the parent get relaunched by the normal job test code by putting itself in the job completion queue once it's finished. This could happen in a job_sm_test call like I suggested in my previous email. Also, instead of a counter that a test function would check, and the child state machines would have to decrement, I'd prefer the parent job keep an array of child state machines (it does this anyway, no?) and check each element in the array for completion of the state machine. That way the children aren't competing to lock the same state to notify of completion, the parent just checks each one.

There doesn't need to be any locking- the main server thread only executes one state function or one transition at a time. The counter also doesn't need to be visible- it could be hidden inside the job call, which could lock or not lock as it sees fit. The parent also couldn't be the one checking the elements in an array like that - it would have to be done from within the job code somewhere (which I think you described in your previous email). That means that somewhere in the job code (or request scheduler, etc.)
something will have to do the following on every job_testcontext() call:

for each active sm

Only jobs that got posted as parent states would need to be checked.

for each child within that sm check state

Which could get expensive depending on how extensively we use the child/parallel sm model.

It seems unlikely to me that this would cost much overall. If we're going to use this child/parallel system to send out 1000s of messages at once, well, then perhaps we'd be better suited using something like mpi? :-)

-sam

The implicit call is the child's call when it terminates. The parent's call could be implicit too, or done by the state action.

Doesn't this require child state machines to only function in the child state machine context? I'd prefer to just have generic state machines that can be used as a child state machine or as a top-level state machine.

I would prefer that too :) Is this going to work Walt? It would be nice if the state machine processing code transparently handled triggering different termination functions depending on whether it was a top level sm or not, without the state functions themselves knowing any better.

As of this moment we really haven't taken any pains to keep the SM independent from the job system, in fact you have to have the job system to drive things, so in some sense it's not really an issue.

I vote for making the interfaces as separate as possible. If someone else wants to use the state machine code somewhere else, it would be nice to allow them to take it as-is (the mpich2 guys were talking about using it, but I think they ended up doing something else). Also, independent layers make testing and debugging easier in my view. In the current code, the sm_p is passed through to the job descriptor as a void*, and we just cast back to a sm_p in the while loop that does the job_testcontext and then drives the state machines again.
The use of job_status does bring the job code into the state machine code, but it seems like mostly only the error_code field is used within the state actions, and the rest of that structure could be independent of the state machine code.

-sam

Any more comments? (Sam, I hope this addresses some of yours)

Walt

Phil Carns wrote:

Walter B. Ligon III wrote:

OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent. Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent i
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 5:06 PM, Phil Carns wrote:

I don't see why the two have to be dependent for this to work. Do you mean by the parent posting a job, the state machine stepping code would handle the actual posting? I was assuming that the parent state action could just call job_concurrent_sm_post (or whatever it's called). Could it be similar to the request scheduler job posting code? The parent state action could call job_concurrent_sm_post with an array of the child sms, which just calls sm_post and adds the parent sm and its array to an operation queue. Then a job_concurrent_sm_test function could test for completion of a parent sm by looking at all the sms in the array to see if they completed. The job_testcontext code would have to be modified of course (maybe rework the do_one_test_cycle_req_sched function to also test parent sm jobs), but all of that still seems to be independent of the state machine code (i.e. someone could use the state machine code separately and drive state machines using something other than the job framework). I don't know if all that makes sense in the context of the changes you've made, but that's what I had in mind when I suggested posting a job for the parent.

I think I follow what you are describing, but I am not entirely sure. If so, I think there is one advantage to the approach that Walt has been hashing out thus far. I think that what Walt is describing is event-driven, in a sense. No one has to actively look to see if all of the children have finished. Instead, the children each send notification (by calling a release function or manually decrementing a counter) in their completion function, with the parent eventually getting a single notification (representing all of the children) through the existing job completion queue mechanism.
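Sam's scan-based alternative might look like the sketch below. The names `job_concurrent_sm_post` and `job_concurrent_sm_test` come from the email; the bodies are purely illustrative (a single pending slot, no queue), just to show where the per-test-cycle scan cost comes from.

```c
/* Toy version of the poll-the-children model: the job layer remembers
 * the parent plus an array of its child SMs, and a test call scans the
 * array. Children signal no one; the test loop does all the work. */
#include <stddef.h>

struct toy_sm { int terminated; };

struct toy_parent_op {
    struct toy_sm *parent;
    struct toy_sm **children;
    int nchildren;
};

static struct toy_parent_op pending; /* one slot, for illustration only */

/* Parent state action registers itself and its children with the job code. */
void job_concurrent_sm_post(struct toy_sm *parent,
                            struct toy_sm **children, int n)
{
    pending.parent = parent;
    pending.children = children;
    pending.nchildren = n;
}

/* Called from every job test cycle: O(children) per call even when
 * nothing has changed -- the cost Phil is worried about. Returns the
 * parent once all children have terminated, else NULL. */
struct toy_sm *job_concurrent_sm_test(void)
{
    if (!pending.parent)
        return NULL;
    for (int i = 0; i < pending.nchildren; i++)
        if (!pending.children[i]->terminated)
            return NULL;
    struct toy_sm *p = pending.parent;
    pending.parent = NULL;  /* retire the operation */
    return p;
}
```

The trade-off in the thread is visible here: this keeps the state machine code free of job knowledge, at the price of rescanning the child array on every test cycle instead of reacting to completion events.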
I think I'm getting voted down here, so I should probably just shut up, but I don't think in practice we're going to have that many child state machines that iterating through the list is at all costly. I'm arguing for simpler mechanisms that fit in with the job subsystem over something more fancy and possibly slightly better performing.

I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them.

Probably a bit more code yes, but it seems cleaner than keeping around backpointers and checking for parents. Instead of driving all state machines from one place, this event notification scheme essentially replaces the last child state machine with the parent, which seems like a bit of a hack and harder to debug.

-sam

I think you are right though, that you could pull off your version without the children actually having to make a job_* call.

-Phil
Re: [Pvfs2-developers] terminating state machines
I think it would be nice to help prevent programmer error. The same thing was done with the protocol request structures (see all the PINT_SERVREQ_*_FILL macros used in the client sms). If you have a macro, then neglecting to pass in one of the required input fields results in a compiler error. Otherwise the compiler can't help to tell you if you have set all of the frame fields that you were supposed to set. There is no technical advantage, it just makes setting the fields a little more foolproof. Same goes for the output of a frame after completion, although I'm not sure what the macro would look like there, or if it is possible. Probably a given frame will have several fields - some are input, some are output, some are scratch area for the state functions, etc. Someone coming along later trying to reuse the SM may not know (without some tricky code digging) which fields are the output fields that it can count on to be correctly filled in after completion. For example, maybe there is a field called "parent_handle" in there- is it filled in? If so, is it guaranteed to be filled in, or did I just happen to get it this time because of the particular path the sm took? I don't know what the best way is to make this explicit, maybe some kind of macro, maybe putting a special prefix on the names of the output fields, any other ideas? Maybe we just use comments :) OK, I see what you mean. I think that's kind of a syntax level thing - IOW I don't think it affects the underlying mechanism. So yeah, we should have that and we'll work on that once the mechanism works. Yeah, you are right- it doesn't have any impact on the underlying mechanism. It's just some extra programming practice we can try to establish later for SM users. -Phil
Re: [Pvfs2-developers] terminating state machines
Sam Lang wrote: On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote: Yeah, the idea is that the SM code would call the job function. Depending on the state actions to do it seems like asking for trouble, all the details that have to be kept up with. Actually, there are already job structs used by the SM code, now I've had to add a context id to the smcb and there will be job calls. I think you are right though, the amount of dependency is pretty small. As for the job funcs I think I'd need one new one to post the parent job, establishing a counter. The child job would look up the counter, decrement, and if zero, call job_null to relaunch the parent, or just replicate what job_null does, whatever seems the easiest. I would rather see the parent get relaunched by the normal job test code by putting itself in the job completion queue once it's finished. This could happen in a job_sm_test call like I suggested in my previous email. Also, instead of a counter that a test function would check, and the child state machines would have to decrement, I'd prefer the parent job keep an array of child state machines (it does this anyway, no?) and check each element in the array for completion of the state machine. That way the children aren't competing to lock the same state to notify of completion, the parent just checks each one. There doesn't need to be any locking- the main server thread only executes one state function or one transition at a time. The counter also doesn't need to be visible- it could be hidden inside the job call, which could lock or not lock as it sees fit. The parent also couldn't be the one checking the elements in an array like that - it would have to be done from within the job code somewhere (which I think you described in your previous email). That means that somewhere in the job code (or request scheduler, etc.) 
something will have to do the following on every job_testcontext() call:

    for each active sm
        for each child within that sm
            check state

Which could get expensive depending on how extensively we use the child/parallel sm model. The implicit call is the child's call when it terminates. The parent's call could be implicit too, or done by the state action. Doesn't this require child state machines to only function in the child state machine context? I'd prefer to just have generic state machines that can be used as a child state machine or as a top-level state machine. I would prefer that too :) Is this going to work Walt? It would be nice if the state machine processing code handled transparently triggering different termination functions depending on whether it was a top level sm or not without the state functions themselves knowing any better. As of this moment we really haven't taken any pains to keep the SM independent from the job system, in fact you have to have the job system to drive things, so in some sense it's not really an issue. I vote for making the interfaces as separate as possible. If someone else wants to use the state machine code somewhere else, it would be nice to allow them to take it as-is (mpich2 guys were talking about using it, but I think they ended up doing something else). Also, independent layers make testing and debugging easier in my view. In the current code, the sm_p is passed through to the job descriptor as a void*, and we just cast back to a sm_p in the while loop that does the job_testcontext and then drives the state machines again. The use of job_status does bring in the job code into the state machine code, but it seems like mostly only the error_code field is used within the state actions, and the rest of that structure could be independent of the state machine code. -sam Any more comments? (Sam, I hope this addresses some of yours) Walt Phil Carns wrote: Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. 
When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent. Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on. Now the other way to implement this is with the job subsystem as I
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 4:37 PM, Walter B. Ligon III wrote: Sam Lang wrote: On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote: Yeah, the idea is that the SM code would call the job function. Depending on the state actions to do it seems like asking for trouble, all the details that have to be kept up with. Actually, there are already job structs used by the SM code, now I've had to add a context id to the smcb and there will be job calls. I think you are right though, the amount of dependency is pretty small. As for the job funcs I think I'd need one new one to post the parent job, establishing a counter. The child job would look up the counter, decrement, and if zero, call job_null to relaunch the parent, or just replicate what job_null does, whatever seems the easiest. I would rather see the parent get relaunched by the normal job test code by putting itself in the job completion queue once it's finished. That's what I'm talking about. This could happen in a job_sm_test call like I suggested in my previous email. Also, instead of a counter that a test function would check, and the child state machines would have to decrement, I'd prefer the parent job keep an array of child state machines (it does this anyway, no?) and check each element in the array for completion of the state machine. That way the children aren't competing to lock the same state to notify of completion, the parent just checks each one. That's going to be tricky, and probably would perform worse than a counter. The primary problem being that the parent isn't running, so it can't really check anything. It's not running, but it could work similarly to the request scheduler code. A job_sm_post would add the sm job to a pending queue, and the job_sm_test could be called in job_testcontext, just like PINT_request_scheduler_testworld is called. The job_sm_test call would check the pending sm job queue (look at each one and check all the children SMs for completion). 
Once an sm job is completed, it gets added to the job completion queue, and the while loop that drives the state machines will start it up again. The implicit call is the child's call when it terminates. The parent's call could be implicit too, or done by the state action. Doesn't this require child state machines to only function in the child state machine context? I'd prefer to just have generic state machines that can be used as a child state machine or as a top-level state machine. No, not at all. When all state machines terminate they check to see if they have a parent (SMs started directly as a result of a syscall or request have a NULL parent) and if so they then enter into the routine to see if they are the last child, and if so they release the parent. That seems like a needless check. Many state machines don't have parents after all. Why not just keep the direction from parent to child, instead of requiring children to keep a backpointer to the parent? As of this moment we really haven't taken any pains to keep the SM independent from the job system, in fact you have to have the job system to drive things, so in some sense it's not really an issue. I vote for making the interfaces as separate as possible. If someone else wants to use the state machine code somewhere else, it would be nice to allow them to take it as-is (mpich2 guys were talking about using it, but I think they ended up doing something else). Also, independent layers make testing and debugging easier in my view. I agree, that's why I asked the question. Again, I could do it without the job layer at all and quite easily, but if I want the parent to pop out of the job_test call, then I'm going to have to call some things in the job interface. I could leave it to the SM programmer to do that but then the SM really doesn't have a complete implementation, half of what it does depends on the SM programmer. 
As it is there's already stuff that has to be provided as infrastructure to use the SM, and that's going to include something that wakes the SMs when they are done with their current task - which is currently the job system, so this isn't adding much. Just to clarify, I'm only arguing that the state machine code be independent of the job code (not vice-versa). Adding job_sm_post and job_sm_test functions that look at state machine pointers should prevent the need for state machines to know about jobs. -sam In the current code, the sm_p is passed through to the job descriptor as a void*, and we just cast back to a sm_p in the while loop that does the job_testcontext and then drives the state machines again. The use of job_status does bring in the job code into the state machine code, but it seems like mostly only the error_code field is used within the state actions, and the rest of that structure
Re: [Pvfs2-developers] terminating state machines
Phil Carns wrote: Phil, first your questions: The parent will push a "frame" onto a stack for each child it is starting. A frame is everything that used to be in either a s_op or sm_p on the server or client, except for the stuff that actually runs the SM (now in an smcb). The parent can pass in anything it wants by filling in the fields appropriately. When each child runs that struct will appear to be its "current" frame. Each child can leave that frame in any condition it wants, with any values or buffers the child wants to leave for the parent. After the children are done the parent can pop each frame off the stack and do what it wants with it. Thus there is plenty of flexibility on how you want to handle passing things in and out, all under control of the server or client code. Sounds great. As for providing macros for setting up and tearing down frames, we can certainly do that. I'm not sure how much that really helps, but we can do it. I think it would be nice to help prevent programmer error. The same thing was done with the protocol request structures (see all the PINT_SERVREQ_*_FILL macros used in the client sms). If you have a macro, then neglecting to pass in one of the required input fields results in a compiler error. Otherwise the compiler can't help to tell you if you have set all of the frame fields that you were supposed to set. There is no technical advantage, it just makes setting the fields a little more foolproof. Same goes for the output of a frame after completion, although I'm not sure what the macro would look like there, or if it is possible. Probably a given frame will have several fields - some are input, some are output, some are scratch area for the state functions, etc. Someone coming along later trying to reuse the SM may not know (without some tricky code digging) which fields are the output fields that it can count on to be correctly filled in after completion. 
For example, maybe there is a field called "parent_handle" in there- is it filled in? If so, is it guaranteed to be filled in, or did I just happen to get it this time because of the particular path the sm took? I don't know what the best way is to make this explicit, maybe some kind of macro, maybe putting a special prefix on the names of the output fields, any other ideas? Maybe we just use comments :) OK, I see what you mean. I think that's kind of a syntax level thing - IOW I don't think it affects the underlying mechanism. So yeah, we should have that and we'll work on that once the mechanism works. Now, an implementation question - one approach to this job/counter thing is to have two job calls, one for the parent, and one for the children. Another approach is for the parent to simply set a counter and not call anything. The children come along, decrement the count, and if zero, call job_null() to awaken the parent. Requires no modification in the job layer, minimizes dependency. What do you think? Should the job layer have more of a role, or keep it minimum? Not a big deal to me either way. Especially if all of these calls are implicit in the state processing code - no one is really going to see them normally anyway. OK, I think everyone has weighed in on this, and I think I'll use the minimal method. The only real diff is Sam's preference not to use a counter. We can go around on that, but I'm leaning towards a counter, at least for the initial implementation. -Phil -- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University
Re: [Pvfs2-developers] terminating state machines
I don't see why the two have to be dependent for this to work. Do you mean that, by the parent posting a job, the state machine stepping code would handle the actual posting? I was assuming that the parent state action could just call job_concurrent_sm_post (or whatever it's called). Could it be similar to the request scheduler job posting code? The parent state action could call job_concurrent_sm_post with an array of the child sms, which just calls sm_post and adds the parent sm and its array to an operation queue. Then a job_concurrent_sm_test function could test for completion of a parent sm by looking at all the sms in the array to see if they completed. The job_testcontext code would have to be modified of course (maybe rework the do_one_test_cycle_req_sched function to also test parent sm jobs), but all of that still seems to be independent of the state machine code (i.e. someone could use the state machine code separately and drive state machines using something other than the job framework). I don't know if all that makes sense in the context of the changes you've made, but that's what I had in mind when I suggested posting a job for the parent. I think I follow what you are describing, but I am not entirely sure. If so, I think there is one advantage to the approach that Walt has been hashing out thus far. I think that what Walt is describing is event-driven, in a sense. No one has to actively look to see if all of the children have finished. Instead, the children each send notification (by calling a release function or manually decrementing a counter) in their completion function, with the parent eventually getting a single notification (representing all of the children) through the existing job completion queue mechanism. I think that the way that you describe would work fine too, but it would require a little more active work to check the status of the array of child SMs and would require more code to keep track of them. 
I think you are right though, that you could pull off your version without the children actually having to make a job_* call. -Phil
Re: [Pvfs2-developers] terminating state machines
Phil, first your questions: The parent will push a "frame" onto a stack for each child it is starting. A frame is everything that used to be in either a s_op or sm_p on the server or client, except for the stuff that actually runs the SM (now in an smcb). The parent can pass in anything it wants by filling in the fields appropriately. When each child runs that struct will appear to be its "current" frame. Each child can leave that frame in any condition it wants, with any values or buffers the child wants to leave for the parent. After the children are done the parent can pop each frame off the stack and do what it wants with it. Thus there is plenty of flexibility on how you want to handle passing things in and out, all under control of the server or client code. Sounds great. As for providing macros for setting up and tearing down frames, we can certainly do that. I'm not sure how much that really helps, but we can do it. I think it would be nice to help prevent programmer error. The same thing was done with the protocol request structures (see all the PINT_SERVREQ_*_FILL macros used in the client sms). If you have a macro, then neglecting to pass in one of the required input fields results in a compiler error. Otherwise the compiler can't help to tell you if you have set all of the frame fields that you were supposed to set. There is no technical advantage, it just makes setting the fields a little more foolproof. Same goes for the output of a frame after completion, although I'm not sure what the macro would look like there, or if it is possible. Probably a given frame will have several fields - some are input, some are output, some are scratch area for the state functions, etc. Someone coming along later trying to reuse the SM may not know (without some tricky code digging) which fields are the output fields that it can count on to be correctly filled in after completion. For example, maybe there is a field called "parent_handle" in there- is it filled in? 
If so, is it guaranteed to be filled in, or did I just happen to get it this time because of the particular path the sm took? I don't know what the best way is to make this explicit, maybe some kind of macro, maybe putting a special prefix on the names of the output fields, any other ideas? Maybe we just use comments :) Now, an implementation question - one approach to this job/counter thing is to have two job calls, one for the parent, and one for the children. Another approach is for the parent to simply set a counter and not call anything. The children come along, decrement the count, and if zero, call job_null() to awaken the parent. Requires no modification in the job layer, minimizes dependency. What do you think? Should the job layer have more of a role, or keep it minimum? Not a big deal to me either way. Especially if all of these calls are implicit in the state processing code - no one is really going to see them normally anyway. -Phil
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote: Yeah, the idea is that the SM code would call the job function. Depending on the state actions to do it seems like asking for trouble, all the details that have to be kept up with. Actually, there are already job structs used by the SM code, now I've had to add a context id to the smcb and there will be job calls. I think you are right though, the amount of dependency is pretty small. As for the job funcs I think I'd need one new one to post the parent job, establishing a counter. The child job would look up the counter, decrement, and if zero, call job_null to relaunch the parent, or just replicate what job_null does, whatever seems the easiest. I would rather see the parent get relaunched by the normal job test code by putting itself in the job completion queue once it's finished. This could happen in a job_sm_test call like I suggested in my previous email. Also, instead of a counter that a test function would check, and the child state machines would have to decrement, I'd prefer the parent job keep an array of child state machines (it does this anyway, no?) and check each element in the array for completion of the state machine. That way the children aren't competing to lock the same state to notify of completion, the parent just checks each one. The implicit call is the child's call when it terminates. The parent's call could be implicit too, or done by the state action. Doesn't this require child state machines to only function in the child state machine context? I'd prefer to just have generic state machines that can be used as a child state machine or as a top-level state machine. As of this moment we really haven't taken any pains to keep the SM independent from the job system, in fact you have to have the job system to drive things, so in some sense it's not really an issue. I vote for making the interfaces as separate as possible. 
If someone else wants to use the state machine code somewhere else, it would be nice to allow them to take it as-is (mpich2 guys were talking about using it, but I think they ended up doing something else). Also, independent layers make testing and debugging easier in my view. In the current code, the sm_p is passed through to the job descriptor as a void*, and we just cast back to a sm_p in the while loop that does the job_testcontext and then drives the state machines again. The use of job_status does bring in the job code into the state machine code, but it seems like mostly only the error_code field is used within the state actions, and the rest of that structure could be independent of the state machine code. -sam Any more comments? (Sam, I hope this addresses some of yours) Walt Phil Carns wrote: Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent. Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on. Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. 
If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means. I like the job approach. I guess this is an extra dependency because the sms would be calling these particular job functions implicitly, rather than relying on the state functions to handle those posts and releases? We definitely haven't done that before, but at least in this case the job function that the sm infrastructure would be depending on is the simplest one in the arsenal :) It shouldn't be hard for someone to reimplement that particular functionality if they wanted to use the state machine mechanism in another project. If you weren't planning on these job calls to be implicit, then I'm not sure where the extra dependency is- we already use jobs to trigger all of the other "normal" transitions. This reminded me of a question, though- is there go
Re: [Pvfs2-developers] terminating state machines
Phil Carns wrote: Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent. Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on. Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means. I like the job approach. I guess this is an extra dependency because the sms would be calling these particular job functions implicitly, rather than relying on the state functions to handle those posts and releases? We definitely haven't done that before, but at least in this case the job function that the sm infrastructure would be depending on is the simplest one in the arsenal :) It shouldn't be hard for someone to reimplement that particular functionality if they wanted to use the state machine mechanism in another project. 
If you weren't planning on these job calls to be implicit, then I'm not sure where the extra dependency is- we already use jobs to trigger all of the other "normal" transitions. This reminded me of a question, though- is there going to be a standard mechanism for the children to report each of their independent error codes to the parent sm? Or do the children need to just keep a reference to the parent sm structure and manually fill in an array or something? I guess I have a broader question of how data that the children generate (like a handle value or an attr structure) gets transferred to the parent. Does the parent copy this stuff from the child after the child finishes, or does the child copy it to the parent before it exits? I think we talked about this before at some point but I forgot what the plan is. It would be nice if we made the developer define macros or something to dictate what the input parameters need to be filled in when invoking a child and what output parameters can be retrieved when it finishes. Otherwise it starts getting tricky to remember what fields need to be set in the sm structure before kicking something off. Phil, first your questions: The parent will push a "frame" onto a stack for each child it is starting. A frame is everything that used to be in either a s_op or sm_p on the server or client, except for the stuff that actually runs the SM (now in an smcb). The parent can pass in anything it wants by filling in the fields appropriately. When each child runs that struct will appear to be its "current" frame. Each child can leave that frame in any condition it wants, with any values or buffers the child wants to leave for the parent. After the children are done the parent can pop each frame off the stack and do what it wants with it. Thus there is plenty of flexibility on how you want to handle passing things in and out, all under control of the server or client code. 
As for providing macros for setting up and tearing down frames, we can certainly do that. I'm not sure how much that really helps, but we can do it. Now, an implementation question - one approach to this job/counter thing is to have two job calls, one for the parent, and one for the children. Another approach is for the parent to simply set a counter and not call anything. The children come along, decrement the count, and if zero, call job_null() to awaken the parent. Requires no modification in the job layer, minimizes dependency. What do you think? Should the job layer have more of a role, or keep it minimum? Walt -- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University
Re: [Pvfs2-developers] terminating state machines
Yeah, the idea is that the SM code would call the job function. Depending on the state actions to do it seems like asking for trouble, all the details that have to be kept up with. Actually, there are already job structs used by the SM code, now I've had to add a context id to the smcb and there will be job calls. I think you are right though, the amount of dependency is pretty small. As for the job funcs I think I'd need one new one to post the parent job, establishing a counter. The child job would look up the counter, decrement, and if zero, call job_null to relaunch the parent, or just replicate what job_null does, whatever seems the easiest. The implicit call is the child's call when it terminates. The parent's call could be implicit too, or done by the state action. As of this moment we really haven't taken any pains to keep the SM independent from the job system, in fact you have to have the job system to drive things, so in some sense it's not really an issue. Any more comments? (Sam, I hope this addresses some of yours) Walt Phil Carns wrote: Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent. Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), also the last terminating child becomes the parent as it starts executing the parent code. 
Things can get weird when one SM starts children that start children, and so on. Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means.

I like the job approach. I guess this is an extra dependency because the sms would be calling these particular job functions implicitly, rather than relying on the state functions to handle those posts and releases? We definitely haven't done that before, but at least in this case the job function that the sm infrastructure would be depending on is the simplest one in the arsenal :) It shouldn't be hard for someone to reimplement that particular functionality if they wanted to use the state machine mechanism in another project. If you weren't planning on these job calls to be implicit, then I'm not sure where the extra dependency is - we already use jobs to trigger all of the other "normal" transitions.

This reminded me of a question, though: is there going to be a standard mechanism for the children to report each of their independent error codes to the parent sm? Or do the children need to just keep a reference to the parent sm structure and manually fill in an array or something? I guess I have a broader question of how data that the children generate (like a handle value or an attr structure) gets transferred to the parent. Does the parent copy this stuff from the child after the child finishes, or does the child copy it to the parent before it exits? I think we talked about this before at some point but I forgot what the plan is.
It would be nice if we made the developer define macros or something to dictate what input parameters need to be filled in when invoking a child and what output parameters can be retrieved when it finishes. Otherwise it starts getting tricky to remember what fields need to be set in the sm structure before kicking something off.

-Phil

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
Re: [Pvfs2-developers] terminating state machines
On Jul 26, 2006, at 12:37 PM, Walter B. Ligon III wrote:

OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job, yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), and also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on.

Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means.

I don't see why the two have to be dependent for this to work. Do you mean that by the parent posting a job, the state machine stepping code would handle the actual posting? I was assuming that the parent state action could just call job_concurrent_sm_post (or whatever it's called). Could it be similar to the request scheduler job posting code? The parent state action could call job_concurrent_sm_post with an array of the child sms, which just calls sm_post and adds the parent sm and its array to an operation queue.
Then a job_concurrent_sm_test function could test for completion of a parent sm by looking at all the sms in the array to see if they completed. The job_testcontext code would have to be modified of course (maybe rework the do_one_test_cycle_req_sched function to also test parent sm jobs), but all of that still seems to be independent of the state machine code (i.e. someone could use the state machine code separately and drive state machines using something other than the job framework).

I don't know if all that makes sense in the context of the changes you've made, but that's what I had in mind when I suggested posting a job for the parent.

-sam

Comments?

Walt

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
Re: [Pvfs2-developers] terminating state machines
Walter B. Ligon III wrote:

OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job, yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), and also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on.

Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means.

I like the job approach. I guess this is an extra dependency because the sms would be calling these particular job functions implicitly, rather than relying on the state functions to handle those posts and releases? We definitely haven't done that before, but at least in this case the job function that the sm infrastructure would be depending on is the simplest one in the arsenal :) It shouldn't be hard for someone to reimplement that particular functionality if they wanted to use the state machine mechanism in another project.
If you weren't planning on these job calls to be implicit, then I'm not sure where the extra dependency is - we already use jobs to trigger all of the other "normal" transitions.

This reminded me of a question, though: is there going to be a standard mechanism for the children to report each of their independent error codes to the parent sm? Or do the children need to just keep a reference to the parent sm structure and manually fill in an array or something? I guess I have a broader question of how data that the children generate (like a handle value or an attr structure) gets transferred to the parent. Does the parent copy this stuff from the child after the child finishes, or does the child copy it to the parent before it exits? I think we talked about this before at some point but I forgot what the plan is.

It would be nice if we made the developer define macros or something to dictate what input parameters need to be filled in when invoking a child and what output parameters can be retrieved when it finishes. Otherwise it starts getting tricky to remember what fields need to be set in the sm structure before kicking something off.

-Phil
[Pvfs2-developers] terminating state machines
OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job, yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), and also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on.

Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means.

Comments?

Walt

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University