Thanks Walt. I'm forwarding your response to dev so that everyone
can benefit. :-)
-sam
Begin forwarded message:
From: "Walter B. Ligon III" <[EMAIL PROTECTED]>
Date: October 24, 2006 2:52:45 PM CDT
To: Sam Lang <[EMAIL PROTECTED]>
Subject: Re: [Pvfs2-developers] threaded client-core and the device
thread
Well, I had been planning to write that up ... as soon as I get the
code done, but the pvfs-client stuff is being a real bitch. I
don't really understand most of it any more than you understand
this other stuff.
But the quick answer is SM_ACTION_DEFERRED is just 0 and
SM_ACTION_COMPLETE is just 1. You use them the same way 0 and 1
were used, only now (hopefully) it is a little clearer what is going
on (either the state action completed and you can continue to the
next state, or it was deferred, and you have to wait for it to
finish).
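A minimal sketch of that return-code convention, for illustration only. The two values come from the paragraph above; everything prefixed with demo_ (the struct, the actions, and the driver loop) is hypothetical and is not the actual PVFS2 state machine code.

```c
#include <assert.h>
#include <stddef.h>

/* Values as stated in the message above. */
#define SM_ACTION_DEFERRED 0
#define SM_ACTION_COMPLETE 1

/* Hypothetical state machine: a NULL-terminated chain of state actions. */
struct demo_sm;
typedef int (*demo_action_fn)(struct demo_sm *sm);

struct demo_sm {
    const demo_action_fn *actions; /* NULL-terminated list of state actions */
    size_t current;                /* index of the next action to run */
    int steps_run;                 /* how many actions have run so far */
};

/* A state action that finishes immediately. */
static int demo_action_complete(struct demo_sm *sm)
{
    sm->steps_run++;
    return SM_ACTION_COMPLETE;
}

/* A state action that (notionally) posts a job and must wait for it. */
static int demo_action_deferred(struct demo_sm *sm)
{
    sm->steps_run++;
    return SM_ACTION_DEFERRED;
}

/* Run actions until one defers or the chain ends.
 * Returns the code of the last action that ran. */
static int demo_sm_run(struct demo_sm *sm)
{
    int ret = SM_ACTION_COMPLETE;

    assert(sm != NULL && sm->actions != NULL);
    while (sm->actions[sm->current] != NULL) {
        ret = sm->actions[sm->current](sm);
        sm->current++;
        if (ret == SM_ACTION_DEFERRED)
            break; /* wait; the caller resumes us when the job finishes */
    }
    return ret;
}
```

In this sketch the caller invokes demo_sm_run() again once the deferred work finishes, and the machine picks up at the next state.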
There is a new return code SM_ACTION_TERMINATE which indicates that
the state machine should terminate. The state machines themselves
treat the various "frames" (PINT_client_sm or PINT_server_op) as
opaque types. They are set up by the respective code (client or
server) and dutifully returned when requested in the state
actions. Really, none of that is changed from the original, except
how you get to them. They are no longer directly accessible but
should be accessed with the PINT_sm_frame function.
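To illustrate the opaque-frame idea, here is a rough sketch. It does not use the real PINT_sm_frame signature, which is not given in this thread; demo_sm_frame, demo_smcb and demo_client_frame are made-up stand-ins, and the numeric value of SM_ACTION_TERMINATE is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define SM_ACTION_DEFERRED  0
#define SM_ACTION_COMPLETE  1
#define SM_ACTION_TERMINATE 2 /* assumed value; the message does not give one */

/* The generic machine only ever sees the frame as an opaque pointer. */
struct demo_smcb {
    void *frame;
};

/* Hypothetical stand-in for the PINT_sm_frame accessor. */
static void *demo_sm_frame(struct demo_smcb *smcb)
{
    assert(smcb != NULL);
    return smcb->frame;
}

/* Client-specific frame, set up by client code and known only to it. */
struct demo_client_frame {
    int error_code;
    int op_completed;
};

/* Final state action: fetch the frame through the accessor, record
 * completion, and ask the machine to terminate. */
static int demo_final_action(struct demo_smcb *smcb)
{
    struct demo_client_frame *f = demo_sm_frame(smcb);

    f->op_completed = 1;
    return SM_ACTION_TERMINATE;
}
```

The point of the indirection is that the generic state machine code never needs the definition of the client or server frame type.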
I don't have a general method to kill a state machine. I guess I
can put that on the list for the next revision, but at this point
I'm still focussing on getting what is there to run. The
unexpected message state machine in the server checks for a
server-specific flag and kills a state machine if it is set. It's
really up to the jobs to deal with killing a deferred SM - and I
don't know if or how to do that. If there is a way to cancel a job,
and if we keep a reference to a job for a deferred SM, then we could
kill it that way. But if it's just the timer SMs we could always
have them check a flag each time the timer goes off and die if the
flag is set - similar to what the unexpected messages do.
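The flag-check idea for timer SMs could look roughly like the sketch below. This is only an illustration under assumptions: demo_timer_sm and its kill_requested field are hypothetical, the value of SM_ACTION_TERMINATE is assumed, and the re-posting of the timer job is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SM_ACTION_DEFERRED  0
#define SM_ACTION_TERMINATE 2 /* assumed value; not stated in the thread */

/* Hypothetical frame for a timer state machine. */
struct demo_timer_sm {
    bool kill_requested; /* set by whoever wants this SM to go away */
    int ticks;           /* number of timer expirations handled so far */
};

/* State action run each time the timer goes off. */
static int demo_timer_action(struct demo_timer_sm *sm)
{
    assert(sm != NULL);

    if (sm->kill_requested)
        return SM_ACTION_TERMINATE; /* die instead of re-arming */

    sm->ticks++;
    /* real code would do the periodic work and re-post the timer job here */
    return SM_ACTION_DEFERRED;
}
```

With this approach the SM only notices the flag at the next timer expiration, so shutdown latency is bounded by the timer interval.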
Oh, and the other big change is that unexpected messages are
no longer special cases - they are regular old SMs like everything
else. Much cleaner.
Walt
Sam Lang wrote:
On Oct 24, 2006, at 1:53 PM, Walter B. Ligon III wrote:
Good. I'm making progress tracking down the problems in the code
- somehow a bunch of edits got lost. I'm fixing them.
Involves changes in all of the client state machines.
BTW, there is one I'm confused about:
src/client/sysint/sys-getattr.sm. The last state action,
"getattr_set_sys_response", returns from
several places. It is not clear if ALL of them intend to
terminate since they don't all set the op_completed flag, but
the only option in the SM is to terminate. So I'm assuming they
want to terminate. If you know anything about that one I'd
appreciate it if you'd look.
I agree that you don't want SM_ACTION_DEFERRED for any of those.
It looks like you just went through and replaced all the return
0; lines in state actions with SM_ACTION_DEFERRED, even if the
error_code is set to a negative value (we used to ignore the
return value if the error value was negative?). If it was just a
search and replace, there are probably a bunch of other places
like this as well.
BTW, when is SM_ACTION_COMPLETE supposed to be used (returned by
a state action)? For nested machines? We could really use some
documentation for what is supposed to be returned by state
actions and when. It didn't exist before, and it took me a while
to figure out how return 0; and return 1; behaved, and now all
that is changing again. It's certainly for the better, but it
will help me to have the rules documented explicitly.
Also, the semantics of state machines and jobs, what are they?
What are the jobs currently associated with a state machine
pointer (PINT_client_sm or PINT_server_op)? How do I stop/cancel
a state machine? This is especially pertinent for our state
machines that essentially loop forever, such as the job-timer
sm. We don't currently clean up those state machines ourselves;
it would be nice of us if we did. That means figuring out what
(if any) jobs are currently posted by the machine, and cancelling
or waiting for completion on those jobs.
-sam
Walt
Sam Lang wrote:
I'm working with your branch Walt. Most of the code that does
allocation of the client state machines is the same.
-sam
On Oct 24, 2006, at 9:10 AM, Walter B. Ligon III wrote:
Should be careful here, since all of the code dealing with
PINT_client_sm's has been rewritten for the new SM code and
Murali's suggestions (for example) may not work so well.
Walt
Murali Vilayannur wrote:
Hey Sam,
I ran pvfs2-client-core in valgrind, and then ran Bonnie++ a
few times (10) on the mounted pvfs volume, and noticed the
following when I stopped the client process:
==20132== malloc/free: 1,298,824 allocs, 1,297,888 frees,
3,462,517,583 bytes allocated.
Allocating and freeing 3.5GB seemed extreme, so I went
exploring. It turns out that every time we allocate a
PINT_client_sm, we're allocating about 35KB:
(gdb) p sizeof(struct PINT_client_sm)
$4 = 37764
Oh boy.. that is definitely large..
static array of 8 PINT_client_lookup_sm_ctx, which itself
has a static array of 40 PINT_client_lookup_sm_segment, which
are each about 112 bytes. Anyway, it ends up accumulating.
So I'm convinced at this point that this is beyond the
noise range, plus it's just cruft that we don't need. I'd
like to swap out those static arrays for dynamic allocation
when we get to the start of the lookup state machine. Any
thoughts or suggestions?
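For a sense of scale: 8 x 40 x 112 = 35,840 bytes, which is most of the 37,764 bytes gdb reports. The sketch below reproduces that arithmetic and shows one possible shape for the dynamic version. All demo_ names are stand-ins, the 112-byte segment is taken from the "about 112 bytes" figure above, and the real structures certainly look different.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_CTX      8
#define DEMO_MAX_SEGMENTS 40

/* Stand-in for PINT_client_lookup_sm_segment ("about 112 bytes"). */
struct demo_segment {
    unsigned char bytes[112];
};

/* Static layout as described: every machine carries all of it. */
struct demo_ctx_static {
    struct demo_segment segments[DEMO_MAX_SEGMENTS];
};
struct demo_lookup_static {
    struct demo_ctx_static contexts[DEMO_MAX_CTX];
};

/* Dynamic layout: only a pointer and a count live in the machine. */
struct demo_lookup_dynamic {
    struct demo_segment *segments;
    size_t count;
};

/* Allocate segments when the lookup machine starts.
 * Returns 0 on success, -1 if allocation fails. */
static int demo_lookup_init(struct demo_lookup_dynamic *l, size_t count)
{
    assert(l != NULL && count > 0);
    l->segments = calloc(count, sizeof(*l->segments));
    if (l->segments == NULL) {
        l->count = 0;
        return -1;
    }
    l->count = count;
    return 0;
}

/* Release segments when the lookup machine finishes. */
static void demo_lookup_release(struct demo_lookup_dynamic *l)
{
    free(l->segments);
    l->segments = NULL;
    l->count = 0;
}
```

Operations that never run the lookup machine would then pay only for the pointer and the count.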
I agree. It definitely does not look like noise region anymore.
How about we keep a pool of PINT_client_sm's around in
client-core and allocate from that instead of dynamically
allocating one every time?
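A very rough sketch of what such a pool might look like, as a simple free list. The names (demo_client_sm, demo_sm_pool and the demo_pool_* functions) are invented for illustration, the payload size is arbitrary, and locking is left out even though a threaded client-core would need it.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct PINT_client_sm. */
struct demo_client_sm {
    struct demo_client_sm *next_free; /* link used only while pooled */
    char payload[256];                /* arbitrary placeholder size */
};

/* Free-list pool; NOT thread-safe, a real version would need a lock. */
struct demo_sm_pool {
    struct demo_client_sm *free_list;
    size_t cached; /* number of objects currently on the free list */
};

/* Take an object from the pool, falling back to malloc when empty. */
static struct demo_client_sm *demo_pool_get(struct demo_sm_pool *pool)
{
    struct demo_client_sm *sm;

    assert(pool != NULL);
    sm = pool->free_list;
    if (sm != NULL) {
        pool->free_list = sm->next_free;
        pool->cached--;
    } else {
        sm = malloc(sizeof(*sm));
        if (sm == NULL)
            return NULL;
    }
    sm->next_free = NULL;
    return sm;
}

/* Return an object to the pool instead of freeing it. */
static void demo_pool_put(struct demo_sm_pool *pool, struct demo_client_sm *sm)
{
    assert(pool != NULL && sm != NULL);
    sm->next_free = pool->free_list;
    pool->free_list = sm;
    pool->cached++;
}

/* Free everything still cached, e.g. at client-core shutdown. */
static void demo_pool_destroy(struct demo_sm_pool *pool)
{
    while (pool->free_list != NULL) {
        struct demo_client_sm *sm = pool->free_list;
        pool->free_list = sm->next_free;
        free(sm);
    }
    pool->cached = 0;
}
```

Note that a pool avoids the repeated malloc/free traffic but does not shrink the size of each object, so it complements rather than replaces trimming the static arrays.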
My 2 cents :)
thanks,
Murali
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University