Re: [Pvfs2-developers] proposed changes to kernel timeout mechanism

Rob Ross Fri, 24 Mar 2006 07:21:03 -0800

This sounds like a good idea to me. I agree that competing timeouts are not a good thing, and that pvfs2-client has more information to work with than the kernel does.

Rob


Phil Carns wrote:

This is somewhat related to the timeout discussion from the previous email, but this time the issue is the "op timeout" that the kernel module uses. This is an absolute timeout associated with every upcall that the kernel submits, and is fully independent of the job timeouts that the pvfs2-client daemon uses.
These two competing timeouts for operations posted through the VFS can cause some headache in some circumstances. I mentioned part of the problem a while back but we recently dug into it a little bit more: http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-December/001702.html
It seems like the pvfs2-client should be the real authority on when to timeout and retry from network or server problems. In particular, it can do a couple of things that the kernel can't:
- it can differentiate BMI and non-BMI errors
- it uses a sliding timeout (based on progress over time) for the flows rather than an absolute timeout
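
To illustrate the difference between the two timeout styles, here is a minimal user-space sketch in Python. It is not PVFS2 code: the names `absolute_timed_out`, `SlidingTimeout`, `report`, and the 30-second limit are all assumptions made for the example.

```python
def absolute_timed_out(start, now, limit):
    """Absolute timeout: expires `limit` seconds after the operation started,
    no matter how much progress has been made."""
    return now - start >= limit


class SlidingTimeout:
    """Sliding timeout: expires only after `limit` seconds with no progress.

    Hypothetical model; the real flow code may track progress differently.
    """

    def __init__(self, start, limit):
        self.limit = limit
        self.last_progress = start  # time of the most recent forward progress
        self.bytes_done = 0

    def report(self, now, bytes_done):
        # Any forward progress pushes the deadline out again.
        if bytes_done > self.bytes_done:
            self.bytes_done = bytes_done
            self.last_progress = now

    def timed_out(self, now):
        return now - self.last_progress >= self.limit


# A slow but steady transfer: some progress every 20 seconds, limit of 30.
sliding = SlidingTimeout(start=0, limit=30)
for now, done in [(20, 1), (40, 2), (60, 3)]:
    sliding.report(now, done)
```

In this scenario the absolute check already fires at 30 seconds even though the transfer is still moving, while the sliding check only fires once 30 seconds pass with no new progress.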
I believe that the timeout/retry mechanism in the kernel module was mainly added to handle cases in which the pvfs2-client daemon is restarted, but it ends up triggering in a variety of unrelated scenarios because it is shorter than the job timeouts and doesn't understand flows. It is very uncommon for the pvfs2-client-core to restart anymore (this is no longer a normal error cleanup mechanism).
However, it seems like the kernel should still recover gracefully from pvfs2-client-core restarts, but it would be nice if the timeouts/retries used to handle this didn't interfere with the pvfs2-client-core timeout/retry mechanism.
Here is a proposed solution:
- completely get rid of the per-operation timeout and retry mechanism
  (and the op-timeout tunable parameter)
- instead add logic to the device release function in the kernel, which
  is an indicator that the pvfs2-client-core has exited:
   - when this happens, requeue all pending operations to be resubmitted
   - start a single global timer
     - if the timer expires before someone reopens the device file, then
       cancel all pending operations with some error code to indicate
       that the pvfs2-client died
     - if the device is reopened in time, the new pvfs2-client-core
       instance will service the old operations (transparent to the
       application), and the timer is cancelled
The end result is that the kernel module never times out or retries any operation unless it is specifically to handle the case that the pvfs2-client-core has been restarted. We also have the opportunity to use an error code other than -ETIMEDOUT that might be a little more helpful.
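
The proposed release/reopen logic can be sketched as a small user-space model in Python. This is only an illustration of the steps listed above, not kernel code: the class `KernelModuleModel`, its method names, the `restart_grace` parameter, and the `CLIENT_DIED` placeholder are all hypothetical, and explicit `now` values stand in for a real kernel timer.

```python
CLIENT_DIED = "client-died"  # hypothetical placeholder for the new error code


class KernelModuleModel:
    """Toy model of the proposal: no per-operation timeouts, one global timer."""

    def __init__(self, restart_grace):
        self.restart_grace = restart_grace  # seconds allowed for a reopen
        self.pending = []    # ops waiting to be picked up by the client
        self.in_flight = []  # ops handed to the client, not yet completed
        self.failed = []     # (op, error) pairs for cancelled ops
        self.deadline = None  # the single global timer; None when not armed

    def submit(self, op):
        self.pending.append(op)

    def client_read(self):
        # The client daemon picks up the oldest pending upcall.
        op = self.pending.pop(0)
        self.in_flight.append(op)
        return op

    def client_complete(self, op):
        self.in_flight.remove(op)

    def device_release(self, now):
        # The client exited: requeue everything it was servicing, ahead of
        # ops that were never picked up, and arm the global timer.
        self.pending = self.in_flight + self.pending
        self.in_flight = []
        self.deadline = now + self.restart_grace

    def device_open(self, now):
        # A new client instance arrived in time: cancel the timer. The
        # requeued ops stay pending and are serviced transparently.
        self.deadline = None

    def tick(self, now):
        # Timer expiry: nobody reopened the device, so cancel all ops.
        if self.deadline is not None and now >= self.deadline:
            self.failed += [(op, CLIENT_DIED) for op in self.pending]
            self.pending = []
            self.deadline = None
```

With a 10-second grace period, a release at t=100 followed by a reopen at t=106 leaves every operation pending for the new client instance; a release with no reopen by the deadline moves them all to `failed` with the placeholder error.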
Any thoughts?

-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

