This sounds like a good idea to me. I agree that competing timeouts are not a good thing, and that pvfs2-client has more information to work with than the kernel does.

Rob

Phil Carns wrote:
This is somewhat related to the timeout discussion from the previous email, but this time the issue is the "op timeout" that the kernel module uses. This is an absolute timeout associated with every upcall that the kernel submits, and is fully independent of the job timeouts that the pvfs2-client daemon uses.

These two competing timeouts for operations posted through the VFS can cause headaches in some circumstances. I mentioned part of the problem a while back, but we recently dug into it a little further: http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-December/001702.html

It seems like the pvfs2-client should be the real authority on when to time out and retry after network or server problems. In particular, it can do a couple of things that the kernel can't:
- it can differentiate BMI and non-BMI errors
- it uses a sliding timeout (based on progress over time) for the flows rather than an absolute timeout
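
To make the second point concrete, here is a minimal sketch of the difference between the two timeout styles. It is not PVFS2 code: struct xfer, note_progress() and both predicates are hypothetical names made up for illustration, and time is just an integer number of seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one in-flight transfer (not a PVFS2 type). */
struct xfer {
    int64_t start_sec;         /* when the operation was posted */
    int64_t last_progress_sec; /* last time any data moved */
};

/* Record that data moved at time `now`; this is what slides the window. */
static void note_progress(struct xfer *x, int64_t now)
{
    assert(now >= x->last_progress_sec);
    x->last_progress_sec = now;
}

/* Absolute timeout: fires `limit` seconds after posting, even if the
 * transfer is still making progress. */
static bool absolute_expired(const struct xfer *x, int64_t now, int64_t limit)
{
    return now - x->start_sec >= limit;
}

/* Sliding timeout: fires only after `limit` seconds with no progress. */
static bool sliding_expired(const struct xfer *x, int64_t now, int64_t limit)
{
    return now - x->last_progress_sec >= limit;
}
```

With a 30 second limit, a large transfer posted at t=0 that last moved data at t=25 is already expired at t=40 under the absolute rule, but not under the sliding rule.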

I believe that the timeout/retry mechanism in the kernel module was mainly added to handle cases in which the pvfs2-client daemon is restarted, but it ends up triggering in a variety of unrelated scenarios because it is shorter than the job timeouts and doesn't understand flows. It is very uncommon for the pvfs2-client-core to restart anymore (this is no longer a normal error cleanup mechanism).

However, it seems like the kernel should still recover gracefully from pvfs2-client-core restarts. It would be nice if the timeouts/retries used to handle this didn't interfere with the pvfs2-client-core timeout/retry mechanism.

Here is a proposed solution:
- completely get rid of the per-operation timeout and retry mechanism
  (and the op-timeout tunable parameter)
- instead add logic to the device release function in the kernel, which
  is an indicator that the pvfs2-client-core has exited:
   - when this happens, requeue all pending operations to be resubmitted
   - start a single global timer
     - if the timer expires before someone reopens the device file, then
       cancel all pending operations with some error code to indicate
       that the pvfs2-client died
     - if the device is reopened in time, the new pvfs2-client-core
       instance will service the old operations (transparent to the
       application), and the timer is cancelled

The end result is that the kernel module never times out or retries any operation unless it is specifically to handle the case that the pvfs2-client-core has been restarted. We also have the opportunity to use an error code other than -ETIMEDOUT that might be a little more helpful.

Any thoughts?

-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
