Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Phil Carns


Hmm...I had been thinking about a flow implementation that used the  new 
concurrent state machine code...it sounds like that's a bad idea  
because the testing and restarting would take too long to switch  
between bmi and trove?  We use the post/test model through pvfs2  
though, so maybe I don't understand the issue.


I don't think that is a bad idea.  There were really two separate but 
related problems in one of the older flow protocol implementations; I 
can try to describe them a little more here if I can remember:


- explicitly tracking and testing each trove and bmi operation: It 
basically kept arrays that listed pending trove and bmi ops, and would 
call testsome() to service them.  This was a problem because of the time 
it took to keep running up and down those arrays (whether building them 
at the flow level or testing them at the trove/bmi level).  The solution 
is to just use testcontext() and let trove/bmi tell you when something 
finishes, without managing extra state.


- thread switch time: the architecture here was set up at one time to 
have one thread pushing the test functions for bmi, another thread 
pushing the test functions for trove, while another thread was 
processing the flow and posting new operations.  The problem here is 
that it (at the time) took too long to jump between the pushing 
threads and the processing thread when an operation finished that 
should trigger progress on the flow. This led to the thread-mgr.c code 
and associated callbacks.  The callbacks actually drive the flow 
progress and post new operations.  That means that the same thread that 
pushes testcontext() gets to trigger the next post, without waiting on 
the latency of waking up a different thread to do something (using 
condition variable etc.).  I managed to reuse the thread-mgr for the job 
code as well, so that one testcontext() call triggers callbacks to both 
the job and flow interfaces.


I don't think either of the above issues precludes different flow 
protocol implementations, and they are really kind of orthogonal to 
whether state machines are used or not.  The first issue is solved just 
by using testcontext() rather than manually tracking operations.


The second issue could be solved in a variety of ways, some of which may 
be better than what we have now.  The callback approach is efficient 
enough, but it is hard to debug.  Of course it is also possible that the 
thread switch (i.e. condition signal) latency is low enough nowadays that 
you don't even need to worry about it anymore.  I last looked at this 
problem before NPTL arrived on the scene.


At any rate I think a state machine based flow protocol could dodge 
issue #2 in any of a few ways:

- lucking out with a faster modern thread implementation
- being smarter about how thread work is divided up
- using callbacks as we do now, and making the state machine mechanism 
thread safe so that it can be driven directly from those callbacks 
rather than from a testcontext() work loop


On a related note, it is important to remember that trove has its own 
internal thread as well, so on the trove push side (depending on your 
design) you could have to worry about a chain of two threads that have to 
be woken up to get something done at completion time.  The trove part of 
that chain can't be avoided without changing the API.


Sorry about the tangent here, but I figured I may as well share some 
warnings about things to look out for here.  I think it would be good to 
have a cleaner flow protocol implementation.


I think I'm lost now.  What do you mean by replace?  The states are  
still isolated, jobs trigger the transitions, only one state action  
gets executed at a time, there still may be a time gap between  
completion of any given child and when the parent picks up  processing 
again, and there are still frames.  I think both  approaches will look 
the same when running unless I missed  something.  If Walt puts a 
longjmp() in there we can both hit him  over the head.



Heh.  Don't give him ideas! ;-)

I was operating under the constraint that a state machine can only post 
a job for itself.  If I understand the current plan correctly, using 
job_null in the child state machine to post a job for the parent breaks 
that constraint, and so in some sense is a replace (the job_null 
actually takes the parent smcb pointer).  I think you're probably right 
that it's not a big difference either way; it's just cleaner in my head 
to only have state machines posting jobs for themselves.


I see what you are saying.  I guess it depends on how you look at it.  I 
had kind of started thinking of the jobs as a signalling mechanism, since 
they are the construct that signals a state machine to make its next 
transition.  The job_null() approach just makes it so that a child state 
machine is what triggers this particular signal, rather than a 
bmi/trove/dev/req_sched/flow component.  I know this is a change in the 
model and adds a dependency that 

Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Walter B. Ligon III



Phil Carns wrote:
I think I'm getting voted down here, so I should probably just  
shutup, but I don't think in practice we're going to have that many  
child state machines that iterating through the list is at all  
costly.  I'm arguing for simpler mechanisms that fit in with the job  
subsystem over something more fancy and possibly slightly better  
performing.



Well, as far as the number of SMs goes, I would rather not risk it.  I 
still hope this is lightweight enough that we could eventually use it in 
more places that would generate a lot of children (like a re-architected 
sys-io implementation), though I don't know if that will pan out in 
practice.  I got bitten by a similar assumption in the flow protocol- it 
used to track all of its posted operations for testing rather than 
relying on someone to notify it of completion.  Admittedly the flow 
protocol is a more obvious case and I should have known better, but at 
the time it seemed reasonable :)


I think that the way that you describe would work fine too, but it  
would require a little more active work to check the status of the  
array of child SMs and would require more code to keep track of them.



Probably a bit more code, yes, but it seems cleaner than keeping  
around backpointers and checking for parents.  Instead of driving all  
state machines from one place, this event notification scheme  
essentially replaces the last child state machine with the parent,  
which seems like a bit of a hack and harder to debug.



I think I'm lost now.  What do you mean by replace?  The states are 
still isolated, jobs trigger the transitions, only one state action gets 
executed at a time, there still may be a time gap between completion of 
any given child and when the parent picks up processing again, and there 
are still frames.  I think both approaches will look the same when 
running unless I missed something.  If Walt puts a longjmp() in there we 
can both hit him over the head.


What? What?  How else would I do it?  ;-)



I think having a pointer to the parent actually improves debuggability 
(though I'm not sure this approach actually requires it; all you really 
need is either a job descriptor or a pointer to a counter).  If I have a 
state machine that does something bad or gets stuck it would be nice to 
be able to work backwards to find out who invoked it, without having to 
search for it in a separate data structure.


I don't mean to keep struggling with this issue- I honestly think that 
both approaches are pretty good, and if Walt implements it the way I 
think he is going to, then 95% of developers won't notice the difference 
anyway.  At this point I am mostly hammering away to make sure I am not 
missing a larger issue...


-Phil


--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
___
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers


Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Walter B. Ligon III



Sam Lang wrote:


On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:

I think I'm getting voted down here, so I should probably just   
shutup, but I don't think in practice we're going to have that  many  
child state machines that iterating through the list is at  all  
costly.  I'm arguing for simpler mechanisms that fit in with  the 
job  subsystem over something more fancy and possibly slightly  
better  performing.



Well, as far as the number of SMs goes, I would rather not risk  it.  
I still hope this is lightweight enough that we could  eventually use 
it in more places that would generate a lot of  children (like a 
re-architected sys-io implementation), though I  don't know if that 
will pan out in practice.  I got bitten by a  similar assumption in 
the flow protocol- it used to track all of  its posted operations for 
testing rather than relying on someone to  notify it of completion.  
Admittedly the flow protocol is a more  obvious case and I should have 
known better, but at the time it  seemed reasonable :)




Hmm...I had been thinking about a flow implementation that used the  new 
concurrent state machine code...it sounds like that's a bad idea  
because the testing and restarting would take too long to switch  
between bmi and trove?  We use the post/test model through pvfs2  
though, so maybe I don't understand the issue.


I think that the way that you describe would work fine too, but  it  
would require a little more active work to check the status  of the  
array of child SMs and would require more code to keep  track of them.



Probably a bit more code, yes, but it seems cleaner than keeping   
around backpointers and checking for parents.  Instead of driving  
all  state machines from one place, this event notification  scheme  
essentially replaces the last child state machine with the  parent,  
which seems like a bit of a hack and harder to debug.



I think I'm lost now.  What do you mean by replace?  The states are  
still isolated, jobs trigger the transitions, only one state action  
gets executed at a time, there still may be a time gap between  
completion of any given child and when the parent picks up  processing 
again, and there are still frames.  I think both  approaches will look 
the same when running unless I missed  something.  If Walt puts a 
longjmp() in there we can both hit him  over the head.



Heh.  Don't give him ideas! ;-)

I was operating under the constraint that a state machine can only  post 
a job for itself.  If I understand the current plan correctly,  using 
job_null in the child state machine to post a job for the  parent breaks 
that constraint, and so in some sense is a replace (the  job_null 
actually takes the parent smcb pointer).  I think you're  probably right 
that it's not a big difference either way; it's just  cleaner in my head 
to only have state machines posting jobs for  themselves.


I think having a pointer to the parent actually improves  debuggability 
(though I'm not sure this approach actually requires  it; all you 
really need is either a job descriptor or a pointer to  a counter).  
If I have a state machine that does something bad or  gets stuck it 
would be nice to be able to work backwards to find  out who invoked 
it, without having to search for it in a separate  data structure.


I don't mean to keep struggling with this issue- I honestly think  
that both approaches are pretty good, and if Walt implements it the  
way I think he is going to, then 95% of developers won't notice the  
difference anyway.  At this point I am mostly hammering away to  make 
sure I am not missing a larger issue...



Walt probably got more discussion than he bargained for, but at the  
least, lively discussion keeps me awake in the afternoon ;-).


-sam



-Phil



Good discussion.  Phil has convinced me the level of dependency is low, 
and unless I completely misunderstand Sam, the complexity of the parent 
pointer/job_null approach is a lot less than the alternative, and I like 
low complexity.  I also think debugging will be simpler.  So that's 
where I'm going.


I'll have to think of other topics to get you guys going from time to 
time!  ;-)


Now off to figure out a way to use setjmp/longjmp in my implementation!

Walt
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University


Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Phil Carns


Thanks for the detailed explanation Phil.  I hadn't thought about the  
context switches that might slow down flow.  I was primarily thinking  
of something that would be cleaner, and easier to modify and test for  
different scenarios.  If at some point I get around to playing with a  
flow impl that uses the concurrent state machine framework, I'll open  
up the discussion again to avoid any of the pitfalls you described.


Cleaner and easier to modify would be great!

I just remembered that there are a couple of test programs in the tree 
to look at the thread context switch overhead, in case they are helpful 
to figure out if it is still a concern:


pvfs2/test/io/job/thread-bench2.c
pvfs2/test/io/job/thread-bench3.c

One of those just goes through a bunch of iterations relaying a 
condition across threads to see how long it takes.  The second one does 
the same thing, except with 2 relays instead of one (to mimic the trove 
side of things).  I haven't run these on a decent machine in years.  I 
will also add a disclaimer that the test programs are old and quite 
possibly wrong :)


We also have the benefit of your small-io optimization now too, so it 
isn't as critical as it used to be for the flow to keep the latency down 
on small transfers...


-Phil


[Pvfs2-developers] terminating state machines

2006-07-26 Thread Walter B. Ligon III


OK, guys, I have another issue I want input on.  When child SMs 
terminate they have to notify their parent.  The parent has to wait for 
all the children to terminate.  So I've been thinking to use the job 
subsystem for this: the parent would post a job to wait for N children, 
and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this.  One is to implement it directly 
in the state machine code.  The parent simply stops running (because it 
does not schedule a job, yet returns DEFERRED).  Each child decrements a 
counter, and when it hits 0 the parent is restarted.  This is a little 
ugly because the waiting parent is not being held on any list or queue 
(up to now all waiting SMs are in the job subsystem), and the last 
terminating child becomes the parent as it starts executing the parent 
code.  Things can get weird when one SM starts children that start 
children, and so on.


Now the other way to implement this is with the job subsystem as I 
suggested above.  Much cleaner except for one thing:  up to now the 
state machine subsystem has had no dependency at all on the job 
subsystem.  If we do it this way, this function only works with the job 
system intact.  I'd prefer not to do this, but it does seem the 
cleanest, most logical means.


Comments?

Walt
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns

Walter B. Ligon III wrote:


OK, guys, I have another issue I want input on.  When child SMs 
terminate they have to notify their parent.  The parent has to wait for 
all the children to terminate.  So I've been thinking to use the job 
subsystem for this: the parent would post a job to wait for N children,

and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this directly 
in the state machine code.  The parent simply stops running (because it 
does not schedule a job yet returns DEFERRED).  Each child decrements a 
counter, and when it hits 0 the parent is restarted.  This is a little 
ugly because the waiting parent is not being held on any list or queue 
(up to now all waiting SMs are in the job subsystem), also the last 
terminating child becomes the parent as it starts executing the parent 
code.  Things can get weird when one SM starts children that start 
children, and so on.


Now the other way to implement this is with the job subsystem as I 
suggested above.  Much cleaner except for one thing:  up to now the 
state machine subsystem has had no dependency at all on the job 
subsystem.  If we do it this way, this function only works with the job 
system intact.  I'd prefer not to do this, but it does seem the 
cleanest, most logical means.


I like the job approach.  I guess this is an extra dependency because 
the sms would be calling these particular job functions implicitly, 
rather than relying on the state functions to handle those posts and 
releases?  We definitely haven't done that before, but at least in this 
case the job function that the sm infrastructure would be depending on 
is the simplest one in the arsenal :)  It shouldn't be hard for someone 
to reimplement that particular functionality if they wanted to use the 
state machine mechanism in another project.


If you weren't planning on these job calls to be implicit, then I'm not 
sure where the extra dependency is- we already use jobs to trigger all 
of the other normal transitions.


This reminded me of a question, though- is there going to be a standard 
mechanism for the children to report each of their independent error 
codes to the parent sm?  Or do the children need to just keep a 
reference to the parent sm structure and manually fill in an array or 
something?


I guess I have a broader question of how data that the children generate 
(like a handle value or an attr structure) gets transferred to the 
parent.  Does the parent copy this stuff from the child after the child 
finishes, or does the child copy it to the parent before it exits?  I 
think we talked about this before at some point, but I forgot what the 
plan is.  It would be nice if we made the developer define macros or 
something to dictate what input parameters need to be filled in when 
invoking a child and what output parameters can be retrieved when it 
finishes.  Otherwise it starts getting tricky to remember what fields 
need to be set in the sm structure before kicking something off.


-Phil


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Sam Lang


On Jul 26, 2006, at 12:37 PM, Walter B. Ligon III wrote:



OK, guys, I have another issue I want input on.  When child SMs  
terminate they have to notify their parent.  The parent has to wait  
for all the children to terminate.  So I've been thinking to use  
the job subsystem for this: the parent would post a job to wait for  
N children,

and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this  
directly in the state machine code.  The parent simply stops  
running (because it does not schedule a job yet returns DEFERRED).   
Each child decrements a counter, and when it hits 0 the parent is  
restarted.  This is a little ugly because the waiting parent is not  
being held on any list or queue (up to now all waiting SMs are in  
the job subsystem), also the last terminating child becomes the  
parent as it starts executing the parent code.  Things can get  
weird when one SM starts children that start children, and so on.


Now the other way to implement this is with the job subsystem as I  
suggested above.  Much cleaner except for one thing:  up to now the  
state machine subsystem has had no dependency at all on the job  
subsystem.  If we do it this way, this function only works with the  
job system intact.  I'd prefer not to do this, but it does seem the  
cleanest, most logical means.




I don't see why the two have to be dependent for this to work.  Do  
you mean that by the parent posting a job, the state machine stepping  
code would handle the actual posting?  I was assuming that the parent  
state action could just call job_concurrent_sm_post (or whatever it's  
called).


Could it be similar to the request scheduler job posting code?  The  
parent state action could call job_concurrent_sm_post with an array  
of the child sms, which just calls sm_post and adds the parent sm and  
its array to an operation queue.  Then a job_concurrent_sm_test  
function could test for completion of a parent sm by looking at all  
the sms in the array to see if they completed.  The job_testcontext  
code would have to be modified of course (maybe rework the  
do_one_test_cycle_req_sched function to also test parent sm jobs),  
but all of that still seems to be independent of the state machine  
code (i.e. someone could use the state machine code separately and  
drive state machines using something other than the job framework).   
I don't know if all that makes sense in the context of the changes  
you've made, but that's what I had in mind when I suggested posting a  
job for the parent.


-sam


Comments?

Walt
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University





Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Walter B. Ligon III



Phil Carns wrote:

Walter B. Ligon III wrote:



OK, guys, I have another issue I want input on.  When child SMs 
terminate they have to notify their parent.  The parent has to wait 
for all the children to terminate.  So I've been thinking to use the 
job subsystem for this: the parent would post a job to wait for N 
children,

and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this 
directly in the state machine code.  The parent simply stops running 
(because it does not schedule a job yet returns DEFERRED).  Each child 
decrements a counter, and when it hits 0 the parent is restarted.  
This is a little ugly because the waiting parent is not being held on 
any list or queue (up to now all waiting SMs are in the job 
subsystem), also the last terminating child becomes the parent as it 
starts executing the parent code.  Things can get weird when one SM 
starts children that start children, and so on.


Now the other way to implement this is with the job subsystem as I 
suggested above.  Much cleaner except for one thing:  up to now the 
state machine subsystem has had no dependency at all on the job 
subsystem.  If we do it this way, this function only works with the 
job system intact.  I'd prefer not to do this, but it does seem the 
cleanest, most logical means.



I like the job approach.  I guess this is an extra dependency because 
the sms would be calling these particular job functions implicitly, 
rather than relying on the state functions to handle those posts and 
releases?  We definitely haven't done that before, but at least in this 
case the job function that the sm infrastructure would be depending on 
is the simplest one in the arsenal :)  It shouldn't be hard for someone 
to reimplement that particular functionality if they wanted to use the 
state machine mechanism in another project.


If you weren't planning on these job calls to be implicit, then I'm not 
sure where the extra dependency is- we already use jobs to trigger all 
of the other normal transitions.


This reminded me of a question, though- is there going to be a standard 
mechanism for the children to report each of their independent error 
codes to the parent sm?  Or do the children need to just keep a 
reference to the parent sm structure and manually fill in an array or 
something?


I guess I have a broader question of how data that the children generate 
(like a handle value or an attr structure) gets transferred to the 
parent.  Does the parent copy this stuff from the child after the child 
finishes, or does the child copy it to the parent before it exits?  I 
think we talked about this before at some point but I forgot what the 
plan is.  It would be nice if we made the developer define macros or 
something to dictate what the input parameters need to be filled in when 
invoking a child and what output parameters can be retrieved when it 
finishes.  Otherwise it starts getting tricky to remember what fields 
need to be set in the sm structure before kicking something off.




Phil, first your questions: the parent will push a frame onto a stack 
for each child it is starting.  A frame is everything that used to be in 
either an s_op or sm_p on the server or client, except for the stuff that 
actually runs the SM (now in an smcb).  The parent can pass in anything 
it wants by filling in the fields appropriately.  When each child runs, 
that struct will appear to be its current frame.  Each child can leave 
that frame in any condition it wants, with any values or buffers the 
child wants to leave for the parent.  After the children are done, the 
parent can pop each frame off the stack and do what it wants with it. 
Thus there is plenty of flexibility in how you want to handle passing 
things in and out, all under control of the server or client code.


As for providing macros for setting up and tearing down frames, we can 
certainly do that.  I'm not sure how much that really helps, but we can 
do it.


Now, an implementation question: one approach to this job/counter thing 
is to have two job calls, one for the parent and one for the children. 
Another approach is for the parent to simply set a counter and not call 
anything.  The children come along, decrement the count, and if it hits 
zero, call job_null() to awaken the parent.  That requires no 
modification in the job layer and minimizes the dependency.  What do you 
think?  Should the job layer have more of a role, or keep it to a 
minimum?


Walt

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Sam Lang


On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote:

Yeah, the idea is that the SM code would call the job function.  
Depending on the state actions to do it seems like asking for  
trouble, given all the details that have to be kept up with.


Actually, there are already job structs used by the SM code; now  
I've had to add a context id to the smcb, and there will be job  
calls.  I think you are right though, the amount of dependency is  
pretty small.


As for the job funcs, I think I'd need one new one to post the  
parent job, establishing a counter.  The child job would look up  
the counter, decrement it, and if zero, call job_null to relaunch the  
parent, or just replicate what job_null does, whatever seems easiest.



I would rather see the parent get relaunched by the normal job test  
code by putting itself in the job completion queue once it's  
finished.  This could happen in a job_sm_test call like I suggested  
in my previous email.  Also, instead of a counter that a test  
function would check and the child state machines would have to  
decrement, I'd prefer the parent job keep an array of child state  
machines (it does this anyway, no?) and check each element in the  
array for completion of the state machine.  That way the children  
aren't competing to lock the same state to notify of completion; the  
parent just checks each one.


The implicit call is the child's call when it terminates.  The  
parent's call could be implicit too, or done by the state action.


Doesn't this require child state machines to only function in the  
child state machine context?  I'd prefer to just have generic state  
machines that can be used as a child state machine or as a top-level  
state machine.




As of this moment we really haven't taken any pains to keep the SM  
independent from the job system; in fact you have to have the job  
system to drive things, so in some sense it's not really an issue.


I vote for making the interfaces as separate as possible.  If someone  
else wants to use the state machine code somewhere else, it would be  
nice to allow them to take it as-is (mpich2 guys were talking about  
using it, but I think they ended up doing something else).  Also,  
independent layers make testing and debugging easier in my view.


In the current code, the sm_p is passed through to the job descriptor  
as a void*, and we just cast back to an sm_p in the while loop that  
does the job_testcontext and then drives the state machines again.   
The use of job_status does bring the job code into the state  
machine code, but it seems like mostly only the error_code field is  
used within the state actions, and the rest of that structure could  
be independent of the state machine code.


-sam



Any more comments?  (Sam, I hope this addresses some of yours)

Walt

Phil Carns wrote:

Walter B. Ligon III wrote:


OK, guys, I have another issue I want input on.  When child SMs  
terminate they have to notify their parent.  The parent has to  
wait for all the children to terminate.  So I've been thinking to  
use the job subsystem for this: the parent would post a job to  
wait for N children,

and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this  
directly in the state machine code.  The parent simply stops  
running (because it does not schedule a job yet returns  
DEFERRED).  Each child decrements a counter, and when it hits 0  
the parent is restarted.  This is a little ugly because the  
waiting parent is not being held on any list or queue (up to now  
all waiting SMs are in the job subsystem), also the last  
terminating child becomes the parent as it starts executing the  
parent code.  Things can get weird when one SM starts children  
that start children, and so on.


Now the other way to implement this is with the job subsystem as  
I suggested above.  Much cleaner except for one thing:  up to now  
the state machine subsystem has had no dependency at all on the  
job subsystem.  If we do it this way, this function only works  
with the job system intact.  I'd prefer not to do this, but it  
does seem the cleanest, most logical means.
I like the job approach.  I guess this is an extra dependency  
because the sms would be calling these particular job functions  
implicitly, rather than relying on the state functions to handle  
those posts and releases?  We definitely haven't done that before,  
but at least in this case the job function that the sm  
infrastructure would be depending on is the simplest one in the  
arsenal :)  It shouldn't be hard for someone to reimplement that  
particular functionality if they wanted to use the state machine  
mechanism in another project.
If you weren't planning on these job calls to be implicit, then  
I'm not sure where the extra dependency is- we already use jobs to  
trigger all of the other normal transitions.
This reminded me of a question, though- is there 

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns


Phil, first your questions:  The parent will push a frame onto a stack 
for each child it is starting.  A frame is everything that used to be in 
either a s_op or sm_p on the server or client, except for the stuff that 
actually runs the SM (now in an smcb).  The parent can pass in anything 
it wants by filling in the fields appropriately.  When each child runs 
that struct will appear to be its current frame.  Each child can leave 
that frame in any condition it wants, with any values of buffers the 
child wants to leave for the parent.  After the children are done the 
parent can pop each frame off the stack and do what it wants with it. 
Thus there is plenty of flexibility on how you want to handle passing 
things in and out, all under control of the server or client code.


Sounds great.

As for providing macros for setting up and tearing down frames, we can 
certainly do that.  I'm not sure how much that really helps, but we can 
do it.


I think it would be nice to help prevent programmer error.  The same 
thing was done with the protocol request structures (see all the 
PINT_SERVREQ_*_FILL macros used in the client sms).  If you have a 
macro, then neglecting to pass in one of the required input fields 
results in a compiler error.  Otherwise the compiler can't help to tell 
you if you have set all of the frame fields that you were supposed to 
set.  There is no technical advantage, it just makes setting the fields 
a little more foolproof.


Same goes for the output of a frame after completion, although I'm not 
sure what the macro would look like there, or if it is possible. 
Probably a given frame will have several fields - some are input, some 
are output, some are scratch area for the state functions, etc.  Someone 
coming along later trying to reuse the SM may not know (without some 
tricky code digging) which fields are the output fields that it can 
count on to be correctly filled in after completion.   For example, 
maybe there is a field called parent_handle in there- is it filled in? 
 If so, is it guaranteed to be filled in, or did I just happen to get 
it this time because of the particular path the sm took?   I don't know what 
the best way is to make this explicit, maybe some kind of macro, maybe 
putting a special prefix on the names of the output fields, any other 
ideas?  Maybe we just use comments :)


Now, an implementation question - one approach to this job/counter thing 
is to have two job calls, one for the parent, and one for the children. 
Another approach is for the parent to simply set a counter and not call 
anything.  The children come along, decrement the count, and if zero, 
call job_null() to awaken the parent.  Requires no modification in the 
job layer, minimizes dependency.  What do you think?  Should the job 
layer have more of a role, or keep it minimal?


Not a big deal to me either way. Especially if all of these calls are 
implicit in the state processing code - no one is really going to see 
them normally anyway.


-Phil
___
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns




I don't see why the two have to be dependent for this to work.  Do you 
mean that by the parent posting a job, the state machine stepping code would 
handle the actual posting?  I was assuming that the parent state 
action could just call job_concurrent_sm_post (or whatever it's called).


Could it be similar to the request scheduler job posting code?  The  
parent state action could call job_concurrent_sm_post with an array  of 
the child sms, which just calls sm_post and adds the parent sm and  its 
array to an operation queue.  Then a job_concurrent_sm_test  function 
could test for completion of a parent sm by looking at all  the sms in 
the array to see if they completed.  The job_testcontext  code would 
have to be modified of course (maybe rework the  
do_one_test_cycle_req_sched function to also test parent sm jobs),  but 
all of that still seems to be independent of the state machine  code 
(i.e. someone could use the state machine code separately and  drive 
state machines using something other than the job framework).   I don't 
know if all that makes sense in the context of the changes  you've made, 
but that's what I had in mind when I suggested posting a  job for the 
parent.


I think I follow what you are describing, but I am not entirely sure. 
If so, I think there is one advantage to the approach that Walt has been 
hashing out thus far.  I think that what Walt is describing is 
event-driven, in a sense.  No one has to actively look to see if all of 
the children have finished.  Instead, the children each send 
notification (by calling a release function or manually decrementing a 
counter) in their completion function, with the parent eventually 
getting a single notification (representing all of the children) through 
the existing job completion queue mechanism.


I think that the way that you describe would work fine too, but it would 
require a little more active work to check the status of the array of 
child SMs and would require more code to keep track of them.


I think you are right though, that you could pull off your version 
without the children actually having to make a job_* call.


-Phil


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Walter B. Ligon III



Phil Carns wrote:


Phil, first your questions:  The parent will push a frame onto a 
stack for each child it is starting.  A frame is everything that used 
to be in either a s_op or sm_p on the server or client, except for the 
stuff that actually runs the SM (now in an smcb).  The parent can pass 
in anything it wants by filling in the fields appropriately.  When 
each child runs that struct will appear to be its current frame.  
Each child can leave that frame in any condition it wants, with any 
values of buffers the child wants to leave for the parent.  After the 
children are done the parent can pop each frame off the stack and do 
what it wants with it. Thus there is plenty of flexibility on how you 
want to handle passing things in and out, all under control of the 
server or client code.



Sounds great.

As for providing macros for setting up and tearing down frames, we can 
certainly do that.  I'm not sure how much that really helps, but we 
can do it.



I think it would be nice to help prevent programmer error.  The same 
thing was done with the protocol request structures (see all the 
PINT_SERVREQ_*_FILL macros used in the client sms).  If you have a 
macro, then neglecting to pass in one of the required input fields 
results in a compiler error.  Otherwise the compiler can't help to tell 
you if you have set all of the frame fields that you were supposed to 
set.  There is no technical advantage, it just makes setting the fields 
a little more foolproof.


Same goes for the output of a frame after completion, although I'm not 
sure what the macro would look like there, or if it is possible. 
Probably a given frame will have several fields - some are input, some 
are output, some are scratch area for the state functions, etc.  Someone 
coming along later trying to reuse the SM may not know (without some 
tricky code digging) which fields are the output fields that it can 
count on to be correctly filled in after completion.   For example, 
maybe there is a field called parent_handle in there- is it filled in? 
 If so, is it guaranteed to be filled in, or did I just happen to get it 
this time because of the particular path the sm took?   I don't know what the 
best way is to make this explicit, maybe some kind of macro, maybe 
putting a special prefix on the names of the output fields, any other 
ideas?  Maybe we just use comments :)


OK, I see what you mean.  I think that's kind of a syntax level thing - 
IOW I don't think it affects the underlying mechanism.  So yeah, we should 
have that, and we'll work on it once the mechanism works.




Now, an implementation question - one approach to this job/counter 
thing is to have two job calls, one for the parent, and one for the 
children. Another approach is for the parent to simply set a counter 
and not call anything.  The children come along, decrement the count, 
and if zero, call job_null() to awaken the parent.  Requires no 
modification in the job layer, minimizes dependency.  What do you 
think?  Should the job layer have more of a role, or keep it minimal?



Not a big deal to me either way. Especially if all of these calls are 
implicit in the state processing code - no one is really going to see 
them normally anyway.


OK, I think everyone has weighed in on this, and I think I'll use the 
minimal method.  The only real diff is Sam's preference not to use a 
counter.  We can go around on that, but I'm leaning towards a counter, 
at least for the initial implementation.


-Phil


--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University


Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns

Sam Lang wrote:


On Jul 26, 2006, at 3:41 PM, Walter B. Ligon III wrote:

Yeah, the idea is that the SM code would call the job function.  
Depending on the state actions to do it seems like asking for  
trouble, all the details that have to be kept up with.


Actually, there are already job structs used by the SM code, now  I've 
had to add a context id to the smcb and there will be job  calls.  I 
think you are right though, the amount of dependency is  pretty small.


As for the job funcs I think I'd need one new one to post the  parent 
job, establishing a counter.  The child job would look up  the 
counter, decrement, and if zero, call job_null to relaunch the  
parent, or just

replicate what job_null does, whatever seems the easiest.


I would rather see the parent get relaunched by the normal job test  
code by putting itself in the job completion queue once it's  finished.  
This could happen in a job_sm_test call like I suggested  in my previous 
email.  Also, instead of a counter that a test  function would check, 
and the child state machines would have to  decrement, I'd prefer the 
parent job keep an array of child state  machines (it does this anyway, 
no?) and check each element in the  array for completion of the state 
machine.  That way the children  aren't competing to lock the same state 
to notify of completion, the  parent just checks each one.


There doesn't need to be any locking- the main server thread only 
executes one state function or one transition at a time.  The counter 
also doesn't need to be visible- it could be hidden inside the job call, 
which could lock or not lock as it sees fit.


The parent also couldn't be the one checking the elements in an array 
like that - it would have to be done from within the job code somewhere 
(which I think you described in your previous email).  That means that 
somewhere in the job code (or request scheduler, etc.) something will 
have to do the following on every job_testcontext() call:


for each active sm
for each child within that sm
check state

Which could get expensive depending on how extensively we use the 
child/parallel sm model.




The implicit call is the child's call when it terminates.  The  
parent's call could be implicit too, or done by the state action.


Doesn't this require child state machines to only function in the  child 
state machine context?  I'd prefer to just have generic state  machines 
that can be used as a child state machine or as a top-level  state machine.


I would prefer that too :)  Is this going to work Walt?  It would be 
nice if the state machine processing code handled transparently 
triggering different termination functions depending on whether it was a 
top level sm or not without the state functions themselves knowing any 
better.


As of this moment we really haven't taken any pains to keep the SM  
independent from the job system, in fact you have to have the job  
system to drive things, so in some sense it's not really an issue.



I vote for making the interfaces as separate as possible.  If someone  
else wants to use the state machine code somewhere else, it would be  
nice to allow them to take it as-is (mpich2 guys were talking about  
using it, but I think they ended up doing something else).  Also,  
independent layers make testing and debugging easier in my view.


In the current code, the sm_p is passed through to the job descriptor  
as a void*, and we just cast back to a sm_p in the while loop that  does 
the job_testcontext and then drives the state machines again.   The use 
of job_status does bring the job code into the state  machine code,  
but it seems like mostly only the error_code field is  used within the 
state actions, and the rest of that structure could  be independent of 
the state machine code.


-sam



Any more comments?  (Sam, I hope this addresses some of yours)

Walt

Phil Carns wrote:


Walter B. Ligon III wrote:



OK, guys, I have another issue I want input on.  When child SMs  
terminate they have to notify their parent.  The parent has to  wait 
for all the children to terminate.  So I've been thinking to  use 
the job subsystem for this: the parent would post a job to  wait for 
N children,

and each child would post a job, the last one releasing the parent.

Now I see two ways to implement this - one is to implement this  
directly in the state machine code.  The parent simply stops  
running (because it returns DEFERRED without scheduling a job).  
Each child decrements a counter, and when it hits 0  the parent is 
restarted.  This is a little ugly because the  waiting parent is not 
being held on any list or queue (up to now  all waiting SMs are in 
the job subsystem), also the last  terminating child becomes the 
parent as it starts executing the  parent code.  Things can get 
weird when one SM starts children  that start children, and so on.


Now the other way to implement this is with the job subsystem as  I 

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Sam Lang


On Jul 26, 2006, at 5:06 PM, Phil Carns wrote:



I don't see why the two have to be dependent for this to work.   
Do you mean that by the parent posting a job, the state machine  
stepping code would handle the actual posting?  I was assuming  
that the parent state action could just call  
job_concurrent_sm_post (or whatever it's called).
Could it be similar to the request scheduler job posting code?   
The  parent state action could call job_concurrent_sm_post with an  
array  of the child sms, which just calls sm_post and adds the  
parent sm and  its array to an operation queue.  Then a  
job_concurrent_sm_test  function could test for completion of a  
parent sm by looking at all  the sms in the array to see if they  
completed.  The job_testcontext  code would have to be modified of  
course (maybe rework the  do_one_test_cycle_req_sched function to  
also test parent sm jobs),  but all of that still seems to be  
independent of the state machine  code (i.e. someone could use the  
state machine code separately and  drive state machines using  
something other than the job framework).   I don't know if all  
that makes sense in the context of the changes  you've made, but  
that's what I had in mind when I suggested posting a  job for the  
parent.


I think I follow what you are describing, but I am not entirely  
sure. If so, I think there is one advantage to the approach that  
Walt has been hashing out thus far.  I think that what Walt is  
describing is event-driven, in a sense.  No one has to actively  
look to see if all of the children have finished.  Instead, the  
children each send notification (by calling a release function or  
manually decrementing a counter) in their completion function, with  
the parent eventually getting a single notification (representing  
all of the children) through the existing job completion queue  
mechanism.


I think I'm getting voted down here, so I should probably just  
shutup, but I don't think in practice we're going to have that many  
child state machines that iterating through the list is at all  
costly.  I'm arguing for simpler mechanisms that fit in with the job  
subsystem over something more fancy and possibly slightly better  
performing.




I think that the way that you describe would work fine too, but it  
would require a little more active work to check the status of the  
array of child SMs and would require more code to keep track of them.


Probably a bit more code yes, but it seems cleaner than keeping  
around backpointers and checking for parents.  Instead of driving all  
state machines from one place, this event notification scheme  
essentially replaces the last child state machine with the parent,  
which seems like a bit of hack and harder to debug.


-sam



I think you are right though, that you could pull off your version  
without the children actually having to make a job_* call.


-Phil





Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Sam Lang


On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:

I think I'm getting voted down here, so I should probably just   
shutup, but I don't think in practice we're going to have that  
many  child state machines that iterating through the list is at  
all  costly.  I'm arguing for simpler mechanisms that fit in with  
the job  subsystem over something more fancy and possibly slightly  
better  performing.


Well, as far as the number of SMs goes, I would rather not risk  
it.  I still hope this is lightweight enough that we could  
eventually use it in more places that would generate a lot of  
children (like a re-architected sys-io implementation), though I  
don't know if that will pan out in practice.  I got bitten by a  
similar assumption in the flow protocol- it used to track all of  
its posted operations for testing rather than relying on someone to  
notify it of completion.  Admittedly the flow protocol is a more  
obvious case and I should have known better, but at the time it  
seemed reasonable :)




Hmm...I had been thinking about a flow implementation that used the  
new concurrent state machine code...it sounds like that's a bad idea  
because the testing and restarting would take too long to switch  
between bmi and trove?  We use the post/test model through pvfs2  
though, so maybe I don't understand the issue.


I think that the way that you describe would work fine too, but  
it  would require a little more active work to check the status  
of the  array of child SMs and would require more code to keep  
track of them.


Probably a bit more code yes, but it seems cleaner than keeping   
around backpointers and checking for parents.  Instead of driving  
all  state machines from one place, this event notification  
scheme  essentially replaces the last child state machine with the  
parent,  which seems like a bit of hack and harder to debug.


I think I'm lost now.  What do you mean by replace?  The states are  
still isolated, jobs trigger the transitions, only one state action  
gets executed at a time, there still may be a time gap between  
completion of any given child and when the parent picks up  
processing again, and there are still frames.  I think both  
approaches will look the same when running unless I missed  
something.  If Walt puts a longjmp() in there we can both hit him  
over the head.



Heh.  Don't give him ideas! ;-)

I was operating under the constraint that a state machine can only  
post a job for itself.  If I understand the current plan correctly,  
using job_null in the child state machine to post a job for the  
parent breaks that constraint, and so in some sense is a replace (the  
job_null actually takes the parent smcb pointer).  I think you're  
probably right that it's not a big difference either way, it's just  
cleaner in my head to only have state machines posting jobs for  
themselves.


I think having a pointer to the parent actually improves  
debugability (though I'm not sure this approach actually requires  
it, all you really need is either a job descriptor or a pointer to  
a counter).  If I have a state machine that does something bad or  
gets stuck it would be nice to be able to work backwards to find  
out who invoked it, without having to search for it in a separate  
data structure.


I don't mean to keep struggling with this issue- I honestly think  
that both approaches are pretty good, and if Walt implements it the  
way I think he is going to, then 95% of developers won't notice the  
difference anyway.  At this point I am mostly hammering away to  
make sure I am not missing a larger issue...


Walt probably got more discussion than he bargained for, but at the  
least, lively discussion keeps me awake in the afternoon ;-).


-sam



-Phil


