On Wed, May 04, 2011 at 01:37:14PM +0200, DEGREMONT Aurelien wrote:
> > I assume that the 25315s is from a bug

BTW, do you see this problem with both extent & inodebits locks?

> (fixed in 1.8.5 I think, not sure if it was ported to 2.x) that calculated 
> the wrong time when printing this error message for LDLM lock timeouts.
> >
> I did not find the bug for that.

I think Andreas was referring to bug 17887. However, you should already have
the patch applied, since it landed in 2.0.0.

> > If there are routers they can cause dropped RPCs from the server to the 
> > client, and the client will be evicted for unresponsiveness even though it 
> > is not at fault.  At one time Johann was working on a patch for (or at least 
> > investigating) the ability to have servers resend RPCs before evicting 
> > clients.  The tricky part is that you don't want to send 2 RPCs each with 
> > 1/2 the timeout interval, since that may reduce stability instead of 
> > increasing it.
> >
> How can I track those dropped RPCs on routers?

I don't think routers can drop RPCs without a good reason. It is just that a
router failure can lead to packet loss, and given that servers don't resend
local callbacks, this can result in client evictions.
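
To illustrate the resend trade-off quoted above, here is a toy model in Python.
It is only a sketch and not Lustre code: the Policy class, the
client_survives() helper and the numbers are hypothetical, and it assumes a
reply only counts if it arrives within the window of the attempt it answers.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical server-side callback policy (not a Lustre structure)."""
    attempts: int               # how many times the server sends the callback
    per_attempt_timeout: float  # seconds the server waits after each send


def client_survives(policy, reply_latency, dropped):
    """Return True if at least one attempt is delivered and answered in time.

    reply_latency: seconds the client needs to answer a delivered callback.
    dropped: set of 0-based attempt indices lost in transit, e.g. because
             a router failed while forwarding them.
    Assumption: each attempt is judged against its own timeout window.
    """
    for attempt in range(policy.attempts):
        if attempt in dropped:
            continue  # this callback never reached the client
        if reply_latency <= policy.per_attempt_timeout:
            return True
    return False  # no attempt was answered in time: the client is evicted


single = Policy(attempts=1, per_attempt_timeout=100.0)  # no resend
split = Policy(attempts=2, per_attempt_timeout=50.0)    # 2 RPCs, half timeout

# A healthy client whose only callback is dropped gets evicted without resend.
print(client_survives(single, reply_latency=10.0, dropped={0}))
# With a resend, the same client survives the loss of the first callback.
print(client_survives(split, reply_latency=10.0, dropped={0}))
# A slow but live client is fine with one full timeout and nothing dropped...
print(client_survives(single, reply_latency=80.0, dropped=set()))
# ...but is evicted once the timeout is split in half.
print(client_survives(split, reply_latency=80.0, dropped=set()))
```

Under these assumptions, splitting the timeout protects against a single lost
callback but evicts clients that need more than half the original interval to
answer, which is the stability concern mentioned in the quoted text.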

> Is this an expected behaviour?

Well, let's call this a known problem we would like to address at some point.

> How could I protect my filesystem from that? If I increase the timeout
> this won't change anything

Right, tweaking timeouts cannot help here.

> if client/server do not re-send their RPC.

To be clear, clients go through a disconnect/reconnect cycle and eventually 
resend RPCs.

> > I think the bugzilla bug was called "limited server-side resend" or 
> > similar, filed by me several years ago.
> >
> Did not find either :)

That's bug 3622. Fanyong also worked on a patch; see
http://review.whamcloud.com/#change,125.

HTH

Cheers,
Johann
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss