On Mon, Nov 13, 2006 at 02:07:23PM -0500, Derrick J Brashear wrote:
> >>Except that code path can't cause a corrupted file. It may be related but
> >>that error message (in the fileserver) is not a cause of that client
> >>problem.
> >
> >In my tests the compilation sometimes aborts, because of a timeout
> >communicating with the fileserver, usually during a vos
> >backupsys of all volumes.
>
> Can you get tcpdump from the client's point of view? Basically, at some
> point the client is marking the server down, I assume. The question is on
> the basis of what.
Yes, I can. How much detail do you need? I may need to run tcpdump
for hours to catch the error, and the capture file has to fit in the
free space on the disk.
>
> >Looking for errors in the fileserver, I saw "FindClient: stillborn
> >client" in some of the cases. Is it possible for this error to
> >happen when a client is hitting a fileserver very hard with reads
> >and writes?
>
> Yes. But, that's not necessarily related to the problem you're
> having.
Ok
>
> >What can I do to pinpoint the cause of the problem?
>
> As above.
I am running it now.
>
> >I think this problem could hit my production clients and servers if I
> >upgrade to 1.4.2; they currently use 1.3.81, 1.4.0, and 1.4.1.
>
> There's no new code in 1.4.2 which would cause this. That server log
> message was also in 1.4.1, for instance.
>
> Derrick
>
José Calhariz
--
Beauty is a short-term letter of recommendation.
-- Ninon de Lenclos
