From: Jan Harkes <[EMAIL PROTECTED]>
> Yeah, the venus-at-100%-of-cpu thing is pretty common right after
> I get back on the net; it usually lasts for about 10-15 minutes.
> During this time, by the way, codacon is pretty calm -- it's not
> blasting out "validate" messages or anything.
Interesting, in that case the 100% cpu usage probably doesn't have
anything to do with reintegration. I guess it is the demotion of all
cached objects as a result of the server/volume state change(s).
I guess that code path may be missing a yield in the outer loop. This
wouldn't fix the CPU usage, but would make the system a little more responsive
again. A better fix may be to use some sort of an epoch/event counter
when the volume state changes and use that to detect which objects need
to be revalidated. Not sure if such a solution would merge well into the
existing revalidation mechanism.
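
To make the epoch/event counter idea a bit more concrete, here is a rough
Python sketch. It is not venus code: Volume, CachedObject and access() are
hypothetical names made up for illustration, and the real revalidation
would talk to the server. The point is only that a state change becomes a
single counter bump, and each object is checked lazily the next time it is
used, instead of demoting every cached object in one long loop.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for a volume; epoch counts state changes."""
    name: str
    epoch: int = 0

    def state_changed(self) -> None:
        # O(1): bump the counter instead of walking all cached objects.
        self.epoch += 1


@dataclass
class CachedObject:
    """Hypothetical stand-in for a cached file or directory."""
    volume: Volume
    path: str
    validated_epoch: int = -1  # -1 means "never validated"

    def is_valid(self) -> bool:
        # Valid only if validated since the volume's last state change.
        return self.validated_epoch == self.volume.epoch

    def revalidate(self) -> None:
        # A real client would check the object with the server here.
        self.validated_epoch = self.volume.epoch


def access(obj: CachedObject) -> bool:
    """Use an object; return True if it had to be revalidated first."""
    if obj.is_valid():
        return False
    obj.revalidate()
    return True
```

With this layout the cost of a volume state change no longer depends on
the number of cached objects; the work is spread over later accesses.
Whether that fits the existing revalidation mechanism is still the open
question.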
It is spinning right now -- it is morning and I opened up my laptop after
a night of suspension. Here's codacon's current output:
Probe ( 23:39:29 )
BackProbe lambda.csail.mit.edu ( 23:39:29 )
Probe ( 23:42:02 )
BackProbe lambda.csail.mit.edu ( 23:42:02 )
Probe ( 08:20:30 )
BeginStatusWalk [27693] ( 08:20:30 )
[28366, 0, 0, 0] [28365] ( 08:20:30 )
EndStatusWalk [27693] ( 08:20:30 )
[28366, 0, 0, 0] [28365, 0, 0] [1, 0, 0.1] ( 08:20:30 )
BeginDataWalk [2585437] ( 08:20:30 )
EndDataWalk [2585437] ( 08:20:30 )
[1, 0, 0.1] [0, 0, 0, 0] ( 08:20:30 )
unreachable lambda.csail.mit.edu ( 08:21:56 )
NewConnectFS lambda.csail.mit.edu ( 08:23:02 )
NewConnection lambda.csail.mit.edu ( 08:23:02 )
up lambda.csail.mit.edu ( 08:23:02 )
BackProbe lambda.csail.mit.edu ( 08:23:02 )
Probe ( 08:23:03 )
BackProbe lambda.csail.mit.edu ( 08:23:03 )
bandwidth lambda.csail.mit.edu 31747 54558 77370 ( 08:23:03 )
NewConnectFS lambda.csail.mit.edu ( 08:23:08 )
BackProbe lambda.csail.mit.edu ( 08:23:08 )
ValidateVols / [1] ( 08:23:08 )
Probe ( 08:25:41 )
BackProbe lambda.csail.mit.edu ( 08:25:41 )
Probe ( 08:28:15 )
BackProbe lambda.csail.mit.edu ( 08:28:15 )
Probe ( 08:30:48 )
BackProbe lambda.csail.mit.edu ( 08:30:48 )
Probe ( 08:33:21 )
BackProbe lambda.csail.mit.edu ( 08:33:21 )
BeginStatusWalk [27693] ( 08:35:28 )
[0, 28366, 0, 0] [28365] ( 08:35:28 )
Probe ( 08:39:52 )
BackProbe lambda.csail.mit.edu ( 08:39:52 )
Probe ( 08:42:22 )
BackProbe lambda.csail.mit.edu ( 08:42:22 )
Ahh... it just stopped spinning, and codacon simultaneously output:
EndStatusWalk [27693] ( 08:43:23 )
[28366, 0, 0, 0] [28365, 28369, 28369] [1, 28370, 475.4] ( 08:43:23 )
BeginDataWalk [2585437] ( 08:43:23 )
EndDataWalk [2585437] ( 08:43:23 )
[1, 0, 0.0] [0, 0, 0, 0] ( 08:43:23 )
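
For what it's worth, the timestamps put that status walk at just under
eight minutes. A small Python sketch of the arithmetic follows; the helper
names are made up for illustration, and the two log lines are copied from
the output above.

```python
import re
from datetime import datetime

# The two relevant lines, copied from the codacon output above.
LOG = """\
BeginStatusWalk [27693] ( 08:35:28 )
EndStatusWalk [27693] ( 08:43:23 )
"""

# Event name at the start of the line, "( HH:MM:SS )" at the end.
STAMP = re.compile(r"^(\w+).*\(\s*(\d{2}:\d{2}:\d{2})\s*\)\s*$")


def event_times(log: str) -> dict:
    """Map each event name to its timestamp (same day assumed)."""
    times = {}
    for line in log.splitlines():
        m = STAMP.match(line)
        if m:
            times[m.group(1)] = datetime.strptime(m.group(2), "%H:%M:%S")
    return times


def status_walk_seconds(log: str) -> float:
    """Elapsed seconds between BeginStatusWalk and EndStatusWalk."""
    t = event_times(log)
    return (t["EndStatusWalk"] - t["BeginStatusWalk"]).total_seconds()


print(status_walk_seconds(LOG))
```

That comes out at 475 seconds, which is close to the 475.4 on the summary
line after EndStatusWalk. I am only guessing that this field is the
elapsed time of the walk in seconds.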
Does that help?
-Olin