This sounds like a good idea to me. I agree that competing timeouts are
not a good thing, and that pvfs2-client has more information to work
with than the kernel does.
Rob
Phil Carns wrote:
This is somewhat related to the timeout discussion from the previous
email, but this time the issue is the "op timeout" that the kernel
module uses. This is an absolute timeout associated with every upcall
that the kernel submits, and is fully independent of the job timeouts
that the pvfs2-client daemon uses.
These two competing timeouts for operations posted through the VFS can
cause headaches in some circumstances. I mentioned part of the
problem a while back but we recently dug into it a little bit more:
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-December/001702.html
It seems like the pvfs2-client should be the real authority on when to
time out and retry after network or server problems. In particular, it
can do a couple of things that the kernel can't:
- it can differentiate BMI and non-BMI errors
- it uses a sliding timeout (based on progress over time) for the flows
rather than an absolute timeout
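To illustrate the difference between the two timeout styles, here is a minimal sketch in Python. It is not taken from the pvfs2 flow code. The class names, the `simulate` helper, and the 30-second limit are all made up for illustration. It only shows why a slow but steadily progressing transfer trips an absolute timeout and not a sliding one.

```python
class AbsoluteTimeout:
    """Expires a fixed interval after the operation starts."""

    def __init__(self, start, limit):
        self.deadline = start + limit

    def note_progress(self, now):
        # Progress does not extend an absolute deadline.
        pass

    def expired(self, now):
        return now >= self.deadline


class SlidingTimeout:
    """Expires only if no progress is seen for `limit` seconds."""

    def __init__(self, start, limit):
        self.limit = limit
        self.last_progress = start

    def note_progress(self, now):
        # Each sign of progress restarts the countdown.
        self.last_progress = now

    def expired(self, now):
        return now - self.last_progress >= self.limit


def simulate(timeout, progress_times, end):
    """Step time from 0 to `end`; return the first expiry time, or None."""
    for now in range(end + 1):
        if now in progress_times:
            timeout.note_progress(now)
        if timeout.expired(now):
            return now
    return None


# A slow transfer that moves some data every 5 seconds for a full minute.
steady = set(range(5, 61, 5))
absolute_result = simulate(AbsoluteTimeout(0, 30), steady, 60)
sliding_result = simulate(SlidingTimeout(0, 30), steady, 60)

# A transfer that stalls after its last progress at t=10.
stalled_result = simulate(SlidingTimeout(0, 30), {5, 10}, 60)
```

With these made-up numbers the absolute timeout fires partway through a transfer that is still making progress, while the sliding timeout fires only in the stalled case, 30 seconds after the last progress.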
I believe that the timeout/retry mechanism in the kernel module was
mainly added to handle cases in which the pvfs2-client daemon is
restarted, but it ends up triggering in a variety of unrelated scenarios
because it is shorter than the job timeouts and doesn't understand
flows. It is very uncommon for the pvfs2-client-core to restart anymore
(this is no longer a normal error cleanup mechanism).
However, it seems like the kernel should still recover gracefully from
pvfs2-client-core restarts, but it would be nice if the timeouts/retries
used to handle this didn't interfere with the pvfs2-client-core
timeout/retry mechanism.
Here is a proposed solution:
- completely get rid of the per-operation timeout and retry mechanism
(and the op-timeout tunable parameter)
- instead add logic to the device release function in the kernel, which
is an indicator that the pvfs2-client-core has exited:
- when this happens, requeue all pending operations to be resubmitted
- start a single global timer
- if the timer expires before someone reopens the device file, then
cancel all pending operations with some error code to indicate
that the pvfs2-client died
- if the device is reopened in time, the new pvfs2-client-core
instance will service the old operations (transparent to the
application), and the timer is cancelled
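The steps above can be sketched as a small userspace model in Python. This is not kernel code and not the actual pvfs2 module. `DeviceModel`, its method names, the `restart_grace` parameter, and the `CLIENT_DIED` marker are hypothetical, and time is passed in explicitly rather than coming from a real timer. It is only meant to make the requeue and single-timer logic concrete.

```python
CLIENT_DIED = "client-died"  # hypothetical stand-in for a real error code


class DeviceModel:
    """Toy model of the device file shared by the kernel and client-core."""

    def __init__(self, restart_grace):
        self.restart_grace = restart_grace  # assumed grace period, in seconds
        self.is_open = False
        self.queued = []            # ops waiting to be read by client-core
        self.in_flight = []         # ops handed to client-core, no reply yet
        self.timer_deadline = None  # the single global timer; None if idle
        self.results = {}           # op -> result or error marker

    def submit(self, op):
        # Note: no per-operation timeout is attached here.
        self.queued.append(op)

    def open(self):
        # A (new) client-core instance has opened the device.
        self.is_open = True
        self.timer_deadline = None  # cancel the global timer

    def read_upcall(self):
        # client-core picks up the next pending operation.
        op = self.queued.pop(0)
        self.in_flight.append(op)
        return op

    def write_downcall(self, op, result):
        # client-core reports completion of an operation.
        self.in_flight.remove(op)
        self.results[op] = result

    def release(self, now):
        # client-core exited: requeue in-flight ops ahead of unread ones,
        # then start the single global timer.
        self.is_open = False
        self.queued = self.in_flight + self.queued
        self.in_flight = []
        self.timer_deadline = now + self.restart_grace

    def tick(self, now):
        # Timer check: if nobody reopened the device in time, fail
        # everything still pending with the client-died marker.
        if self.timer_deadline is not None and now >= self.timer_deadline:
            for op in self.queued:
                self.results[op] = CLIENT_DIED
            self.queued = []
            self.timer_deadline = None


# Case 1: client-core restarts within the grace period.
dev = DeviceModel(restart_grace=10)
dev.open()
dev.submit("op-a")
dev.submit("op-b")
dev.read_upcall()            # op-a is now in flight
dev.release(now=5)           # client-core dies; op-a is requeued
dev.tick(now=10)             # timer has not expired yet
dev.open()                   # new instance arrives; timer cancelled
op = dev.read_upcall()       # op-a is serviced again, transparently
dev.write_downcall(op, "ok")

# Case 2: nobody reopens the device in time.
dead = DeviceModel(restart_grace=10)
dead.open()
dead.submit("op-c")
dead.release(now=5)
dead.tick(now=15)            # timer expired; op-c fails
```

In the first case the application never sees an error. In the second, the pending operation fails with the client-died marker. In a real module the timer would presumably be a kernel timer and the queues would need locking, which this sketch leaves out.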
The end result is that the kernel module never times out or retries any
operation unless it is specifically to handle the case that the
pvfs2-client-core has been restarted. We also have the opportunity to
use an error code other than -ETIMEDOUT that might be a little more helpful.
Any thoughts?
-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers