On Tue, Apr 3, 2012 at 10:25 AM, Andrew Deason <adea...@sinenomine.net> wrote:

> On Mon, 2 Apr 2012 19:04:19 -0700
> Ken Elkabany <k...@elkabany.com> wrote:
>
> > Over time these errors become more and more frequent. The problem is
> > that the client who hits this issue will experience a 5-10s delay in
> > accessing a file, which hurts performance significantly. The clients
> > are 1.6pre1, and the server is 1.4.14.
>
> 1.6.0pre1? Or 1.6.1pre1?
>

1.6.0pre1, which was packaged with Ubuntu 11.10. Should we make it a
priority to upgrade?

>
> > Using afsmonitor, I do see that one of the clients hitting this issue
> > (I haven't checked whether all clients have the problem, but many seem
> > to) has 17M callbacks alloced. Could that be suspect?
>
> Yes; that should not be possible unless the client is within a certain
> narrow range of versions. The client could be tied up trying to clear up
> that queue of GUCB messages, which is why everything would appear to
> freeze for a short time, and you get that ProbeUuid failure.
>

What are GUCB messages? Why would they pile up, and under what circumstances?

>  --
> Andrew Deason
> adea...@sinenomine.net

I traced the ProbeUuid failure to the OpenAFS fileservers using the
incorrect IP address for certain clients. The clients each have one
interface, but are reachable via two IP addresses (one
external/internet/WAN, one internal/local). The fileservers would contact
the clients on the clients' external IP address, which the firewall would
block. After opening up the ports on the external IP address, the ProbeUuid
errors disappeared. Has anyone seen this problem before? The servers are
sitting in Amazon EC2, so there's additional complexity in how the
fileserver resolves the client IP address.
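
For anyone checking their own clients for the same situation, here is a
rough Python sketch that splits a client's known addresses into internal
and external ones, so it is clear which addresses need the AFS callback
port (7001/udp) open in the firewall. The helper name split_by_scope and
the example addresses are made up for illustration, and the sketch assumes
that "internal" simply means a private-range address, which may not match
every network layout.

```python
import ipaddress

# UDP port the AFS cache manager listens on for callbacks from fileservers.
AFS_CALLBACK_PORT = 7001


def split_by_scope(addresses):
    """Group address strings into private ('internal') and public ('external').

    Assumption: a private-range address is the internal one. This is a
    hypothetical helper, not part of OpenAFS.
    """
    groups = {"internal": [], "external": []}
    for text in addresses:
        addr = ipaddress.ip_address(text)
        key = "internal" if addr.is_private else "external"
        groups[key].append(text)
    return groups


if __name__ == "__main__":
    # Hypothetical addresses for one client, not taken from this thread.
    client_addresses = ["10.0.0.5", "54.0.0.10"]
    groups = split_by_scope(client_addresses)
    for text in groups["external"]:
        print(f"check firewall: {text} udp/{AFS_CALLBACK_PORT}")
```

This only classifies addresses; it does not probe the port, since a UDP
check from outside gives no reliable answer when a firewall silently drops
the packets.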
