Re: sshfs/nfs cause server lockup - resolved

2006-12-21 Thread Chris

On 19/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Tue, Dec 19, 2006 at 08:20:21PM +, Chris wrote:
> On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> >> On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >> >On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> >> >
> >> >> It does make sense if thats the problem since the entire server even
> >> >> locally stops working properly, and it always follows a unexpected
> >> >> nfs/sshfs disconnection ie. network timeout.
> >> >>
> >> >> I am now running 6.2-RC that has the new file and currently at 1 day
> >> >> 11hrs uptime.
> >> >
> >> >OK, thanks for following part of the advice I gave a month ago ;) Let
> >> >us know if the problems persist.
> >> >
> >> >Kris
> >> >
> >> >
> >> >
> >>
> >> Early today the nfs hub was rebooted so had a unexpected disconnection
> >> also noted by the sshfs timeout prompt waiting for me in the terminal
> >> , was able to remount fine and no server lockup or other probolems.
> >>
> >> Current uptime is 5 days, 10:48
> >
> >OK, good to know.
> >
> >Thanks,
> >Kris
> >
> >
> >
> >
>
> Some bad news, I was offline for a day here, then I logged in today
> reattached to screen, and was greeted with a timeout message to the
> sshfs server, at this point server still functioning fine.  When I ran
> the sshfs command again it locked, with only pings responding and had
> to hard reboot it.
>
> I will setup my local machne now so I can do proper debugging for you.

OK, it's (still) probably an sshfs bug though.

Kris





OK, here is how to repeat the bug every time.  It works with both sshfs and nfs.

First, some background.

The server died again (the hub is having its own problems, so it is causing
lots of timeouts).  This time, instead of remounting, I tried to ls the 2
mounts, simply listing empty dirs.  The first dir worked and the 2nd dir
caused the lockup, so it's some kind of problem with the filesystem nodes
or something.

With this in mind, on my local box I yanked out the network cable,
causing an unexpected timeout, and the box hung.  I tried to do the ddb
procedure but it didn't work; I may have been doing it wrong.

I booted the local box again, mounted nfs over the internet and tried the
same thing: yanked out the network cable, with the same result.  Accessing
the dir where nfs is mounted hung the machine, and a hard reboot was needed.

The local box is using 6.2-RC as well, with a GENERIC kernel and the
default make.conf.

Chris
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
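
The reproduction described above (mount, cut the network, then touch the
mount point) can be probed from a child process, so that the interactive
shell is not the process that blocks.  The following is only a sketch in
Python, which is not used anywhere in this thread; the function name
mount_responds and the 5 second timeout are hypothetical choices.  It will
not help if the whole kernel deadlocks as reported here, but it may help
show which mount point stops answering first.

```python
import subprocess
import sys
import time


def mount_responds(path, timeout=5.0, poll_interval=0.1):
    """Return True if listing `path` finishes successfully within `timeout` seconds."""
    # Run the listing in a child process so that a hung mount blocks the
    # child rather than this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.listdir(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if child.poll() is not None:
            # The child exited: success only if the listing worked.
            return child.returncode == 0
        time.sleep(poll_interval)
    # Deliberately no wait() here: a child stuck in an uninterruptible
    # kernel wait may never exit, and waiting would hang this script too.
    child.kill()
    return False
```

Calling mount_responds() on each mount point in turn, and logging the result
before moving on to the next, would record which directory was the last one
to answer before a hang.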


Re: sshfs/nfs cause server lockup - resolved

2006-12-19 Thread Kris Kennaway
On Tue, Dec 19, 2006 at 08:20:21PM +, Chris wrote:
> On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> >> On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >> >On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> >> >
> >> >> It does make sense if thats the problem since the entire server even
> >> >> locally stops working properly, and it always follows a unexpected
> >> >> nfs/sshfs disconnection ie. network timeout.
> >> >>
> >> >> I am now running 6.2-RC that has the new file and currently at 1 day
> >> >> 11hrs uptime.
> >> >
> >> >OK, thanks for following part of the advice I gave a month ago ;) Let
> >> >us know if the problems persist.
> >> >
> >> >Kris
> >> >
> >> >
> >> >
> >>
> >> Early today the nfs hub was rebooted so had a unexpected disconnection
> >> also noted by the sshfs timeout prompt waiting for me in the terminal
> >> , was able to remount fine and no server lockup or other probolems.
> >>
> >> Current uptime is 5 days, 10:48
> >
> >OK, good to know.
> >
> >Thanks,
> >Kris
> >
> >
> >
> >
> 
> Some bad news, I was offline for a day here, then I logged in today
> reattached to screen, and was greeted with a timeout message to the
> sshfs server, at this point server still functioning fine.  When I ran
> the sshfs command again it locked, with only pings responding and had
> to hard reboot it.
> 
> I will setup my local machne now so I can do proper debugging for you.

OK, it's (still) probably an sshfs bug though.

Kris




Re: sshfs/nfs cause server lockup - resolved

2006-12-19 Thread Chris

On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> >
> >> It does make sense if thats the problem since the entire server even
> >> locally stops working properly, and it always follows a unexpected
> >> nfs/sshfs disconnection ie. network timeout.
> >>
> >> I am now running 6.2-RC that has the new file and currently at 1 day
> >> 11hrs uptime.
> >
> >OK, thanks for following part of the advice I gave a month ago ;) Let
> >us know if the problems persist.
> >
> >Kris
> >
> >
> >
>
> Early today the nfs hub was rebooted so had a unexpected disconnection
> also noted by the sshfs timeout prompt waiting for me in the terminal
> , was able to remount fine and no server lockup or other probolems.
>
> Current uptime is 5 days, 10:48

OK, good to know.

Thanks,
Kris






Some bad news: I was offline for a day here, then I logged in today,
reattached to screen, and was greeted with a timeout message to the
sshfs server.  At this point the server was still functioning fine.  When
I ran the sshfs command again it locked up, with only pings responding,
and I had to hard reboot it.

I will set up my local machine now so I can do proper debugging for you.

Chris


Re: sshfs/nfs cause server lockup

2006-12-17 Thread Chris

On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:

> It does make sense if thats the problem since the entire server even
> locally stops working properly, and it always follows a unexpected
> nfs/sshfs disconnection ie. network timeout.
>
> I am now running 6.2-RC that has the new file and currently at 1 day
> 11hrs uptime.

OK, thanks for following part of the advice I gave a month ago ;) Let
us know if the problems persist.

Kris





Early today the nfs hub was rebooted, so I had an unexpected disconnection,
also noted by the sshfs timeout prompt waiting for me in the terminal.
I was able to remount fine, with no server lockup or other problems.

Current uptime is 5 days, 10:48

Chris


Re: sshfs/nfs cause server lockup - resolved

2006-12-17 Thread Kris Kennaway
On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> >
> >> It does make sense if thats the problem since the entire server even
> >> locally stops working properly, and it always follows a unexpected
> >> nfs/sshfs disconnection ie. network timeout.
> >>
> >> I am now running 6.2-RC that has the new file and currently at 1 day
> >> 11hrs uptime.
> >
> >OK, thanks for following part of the advice I gave a month ago ;) Let
> >us know if the problems persist.
> >
> >Kris
> >
> >
> >
> 
> Early today the nfs hub was rebooted so had a unexpected disconnection
> also noted by the sshfs timeout prompt waiting for me in the terminal
> , was able to remount fine and no server lockup or other probolems.
> 
> Current uptime is 5 days, 10:48

OK, good to know.

Thanks,
Kris





Re: sshfs/nfs cause server lockup

2006-12-14 Thread Chris

On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:

> It does make sense if thats the problem since the entire server even
> locally stops working properly, and it always follows a unexpected
> nfs/sshfs disconnection ie. network timeout.
>
> I am now running 6.2-RC that has the new file and currently at 1 day
> 11hrs uptime.

OK, thanks for following part of the advice I gave a month ago ;) Let
us know if the problems persist.

Kris





Will do.  Also, I have parts for the local machine now, so if it persists
I will get that online to diagnose locally.

Chris


Re: sshfs/nfs cause server lockup

2006-12-13 Thread Kris Kennaway
On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:

> It does make sense if thats the problem since the entire server even
> locally stops working properly, and it always follows a unexpected
> nfs/sshfs disconnection ie. network timeout.
> 
> I am now running 6.2-RC that has the new file and currently at 1 day
> 11hrs uptime.

OK, thanks for following part of the advice I gave a month ago ;) Let
us know if the problems persist.

Kris




Re: sshfs/nfs cause server lockup

2006-12-13 Thread Chris

On 07/12/06, Anish Mistry <[EMAIL PROTECTED]> wrote:

On Thursday 07 December 2006 13:36, Chris wrote:
> On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > > Chris <[EMAIL PROTECTED]> wrote:
> > > >kris a development on this, someone else posted about a nfs
> > > > problem and reading his post some starkling point he made
> > > > about network cards, he stated he only gets the bug on sis rl
> > > > and fxp.
> > >
> > > Sorry for the misunderstanding, but I think that the 'NFS via
> > > TCP' thread covers other bugs, ie the inability to mount NFS v3
> > > over TCP.
> > >
> > > I've tested the cards above, and the person I replied to
> > > encountered the same bug with a bge card. My solution was to
> > > remove custom nfs settings in sysctl.conf. I don't know which
> > > one was the culprit because I don't have the time to look into
> > > it further.
> > >
> > > My poking uncovered a set of crashing bugs and potentially a
> > > livelock. I would agree that NFS is very fragile in RELENG_6.
> > > So far  I've not run into an NFS server
> > > deadlock you described.
> >
> > Are you sure these are NFS problems and not ethernet driver
> > problems?
> >
> > Kris
>
> Had another lockup today, reattached to screen process to see both
> sshfs mounts timed out when running sshfs to remount the terminal
> stopped updating, it didnt disconnect me and an ircd running on the
> server remained functional and the server responded to pings,
> however the terminal was dead and I couldnt login on ssh for a new
> session, a ctrl-alt-del also failed to reboot it and it needed a
> power cycle again.
>
> I have googled researched and read dozens and dozens of nfs bug
> reports where most of them seem to be a problem caused by nfs not
> disconnecting properly leaving itself in a bad state, the fix is
> usually to reboot.  I am having to reboot weekly more often then my
> windows desktop pc, everyone else having no problems with the nfs
> on their linux servers having no problems and its hardly inspiring
> that the stable freebsd is not stable.  All these problems on
> google didnt get fixed no dev attention etc.
>
> So far I have not got my local bsd box up and running yet due to no
> keyboard still need parts for it.
I've updated the sshfs and libs port, you may want to update and see
if that helps.

--
Anish Mistry





Hi, yes, my sshfs and the libs got updated but that didn't fix the crash.
However, I did find this link.

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1362611+0+archive/2006/freebsd-stable/20060702.freebsd-stable

It does make sense if that's the problem, since the entire server, even
locally, stops working properly, and it always follows an unexpected
nfs/sshfs disconnection, i.e. a network timeout.

I am now running 6.2-RC, which has the new file, and am currently at 1 day
11hrs uptime.

Chris


Re: sshfs/nfs cause server lockup

2006-12-07 Thread Anish Mistry
On Thursday 07 December 2006 13:36, Chris wrote:
> On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > > Chris <[EMAIL PROTECTED]> wrote:
> > > >kris a development on this, someone else posted about a nfs
> > > > problem and reading his post some starkling point he made
> > > > about network cards, he stated he only gets the bug on sis rl
> > > > and fxp.
> > >
> > > Sorry for the misunderstanding, but I think that the 'NFS via
> > > TCP' thread covers other bugs, ie the inability to mount NFS v3
> > > over TCP.
> > >
> > > I've tested the cards above, and the person I replied to
> > > encountered the same bug with a bge card. My solution was to
> > > remove custom nfs settings in sysctl.conf. I don't know which
> > > one was the culprit because I don't have the time to look into
> > > it further.
> > >
> > > My poking uncovered a set of crashing bugs and potentially a
> > > livelock. I would agree that NFS is very fragile in RELENG_6.
> > > So far  I've not run into an NFS server
> > > deadlock you described.
> >
> > Are you sure these are NFS problems and not ethernet driver
> > problems?
> >
> > Kris
>
> Had another lockup today, reattached to screen process to see both
> sshfs mounts timed out when running sshfs to remount the terminal
> stopped updating, it didnt disconnect me and an ircd running on the
> server remained functional and the server responded to pings,
> however the terminal was dead and I couldnt login on ssh for a new
> session, a ctrl-alt-del also failed to reboot it and it needed a
> power cycle again.
>
> I have googled researched and read dozens and dozens of nfs bug
> reports where most of them seem to be a problem caused by nfs not
> disconnecting properly leaving itself in a bad state, the fix is
> usually to reboot.  I am having to reboot weekly more often then my
> windows desktop pc, everyone else having no problems with the nfs
> on their linux servers having no problems and its hardly inspiring
> that the stable freebsd is not stable.  All these problems on
> google didnt get fixed no dev attention etc.
>
> So far I have not got my local bsd box up and running yet due to no
> keyboard still need parts for it.
I've updated the sshfs and libs port, you may want to update and see 
if that helps.

-- 
Anish Mistry




Re: sshfs/nfs cause server lockup

2006-12-07 Thread Kris Kennaway
On Thu, Dec 07, 2006 at 06:36:10PM +, Chris wrote:
> On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] 
> >wrote:
> >> Chris <[EMAIL PROTECTED]> wrote:
> >> >kris a development on this, someone else posted about a nfs problem
> >> >and reading his post some starkling point he made about network cards,
> >> >he stated he only gets the bug on sis rl and fxp.
> >>
> >> Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> >> covers other bugs, ie the inability to mount NFS v3 over TCP.
> >>
> >> I've tested the cards above, and the person I replied to encountered the
> >> same bug with a bge card. My solution was to remove custom nfs settings
> >> in sysctl.conf. I don't know which one was the culprit because I don't
> >> have the time to look into it further.
> >>
> >> My poking uncovered a set of crashing bugs and potentially a livelock.
> >> I would agree that NFS is very fragile in RELENG_6.
> >> So far  I've not run into an NFS server deadlock you
> >> described.
> >
> >Are you sure these are NFS problems and not ethernet driver problems?
> >
> >Kris
> >
> >
> >
> 
> Had another lockup today, reattached to screen process to see both
> sshfs mounts timed out when running sshfs to remount the terminal
> stopped updating, it didnt disconnect me and an ircd running on the
> server remained functional and the server responded to pings, however
> the terminal was dead and I couldnt login on ssh for a new session, a
> ctrl-alt-del also failed to reboot it and it needed a power cycle
> again.
> 
> I have googled researched and read dozens and dozens of nfs bug
> reports where most of them seem to be a problem caused by nfs not
> disconnecting properly leaving itself in a bad state, the fix is
> usually to reboot.  I am having to reboot weekly more often then my
> windows desktop pc, everyone else having no problems with the nfs on
> their linux servers having no problems and its hardly inspiring that
> the stable freebsd is not stable.  All these problems on google didnt
> get fixed no dev attention etc.

They probably got "no dev attention" because the user didn't submit
any usable information ;-)

Yes, that's a hint.  If this problem really is important to you then
you need to find the time to set up a suitable debugging environment
so the developers can begin to help you.  Until then we can't do
anything no matter how many emails you send.

> So far I have not got my local bsd box up and running yet due to no
> keyboard still need parts for it.

Kris




Re: sshfs/nfs cause server lockup

2006-12-07 Thread Chris

On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> Chris <[EMAIL PROTECTED]> wrote:
> >kris a development on this, someone else posted about a nfs problem
> >and reading his post some starkling point he made about network cards,
> >he stated he only gets the bug on sis rl and fxp.
>
> Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> covers other bugs, ie the inability to mount NFS v3 over TCP.
>
> I've tested the cards above, and the person I replied to encountered the
> same bug with a bge card. My solution was to remove custom nfs settings
> in sysctl.conf. I don't know which one was the culprit because I don't
> have the time to look into it further.
>
> My poking uncovered a set of crashing bugs and potentially a livelock.
> I would agree that NFS is very fragile in RELENG_6.
> So far  I've not run into an NFS server deadlock you
> described.

Are you sure these are NFS problems and not ethernet driver problems?

Kris





Had another lockup today.  I reattached to the screen process to see that
both sshfs mounts had timed out.  When running sshfs to remount, the
terminal stopped updating.  It didn't disconnect me, an ircd running on
the server remained functional and the server responded to pings; however,
the terminal was dead and I couldn't log in over ssh for a new session.  A
ctrl-alt-del also failed to reboot it and it needed a power cycle
again.

I have googled, researched and read dozens and dozens of nfs bug
reports, where most of them seem to be a problem caused by nfs not
disconnecting properly and leaving itself in a bad state; the fix is
usually to reboot.  I am having to reboot weekly, more often than my
windows desktop pc.  Everyone else is having no problems with nfs on
their linux servers, and it's hardly inspiring that the stable freebsd
is not stable.  All these problems on google didn't get fixed, no dev
attention etc.

So far I have not got my local bsd box up and running yet, due to no
keyboard; I still need parts for it.

Chris


Re: sshfs/nfs cause server lockup

2006-11-26 Thread Chris

On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> Chris <[EMAIL PROTECTED]> wrote:
> >kris a development on this, someone else posted about a nfs problem
> >and reading his post some starkling point he made about network cards,
> >he stated he only gets the bug on sis rl and fxp.
>
> Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> covers other bugs, ie the inability to mount NFS v3 over TCP.
>
> I've tested the cards above, and the person I replied to encountered the
> same bug with a bge card. My solution was to remove custom nfs settings
> in sysctl.conf. I don't know which one was the culprit because I don't
> have the time to look into it further.
>
> My poking uncovered a set of crashing bugs and potentially a livelock.
> I would agree that NFS is very fragile in RELENG_6.
> So far  I've not run into an NFS server deadlock you
> described.

Are you sure these are NFS problems and not ethernet driver problems?

Kris




The consistency is that the servers that lock up are fine when not using
any nfs or sshfs mounts.  But the 2 servers that are fine use dc and
re network adaptors.

I am not able to set up my local bsd box until a new hd arrives next
Monday, and it will be a day or so before I get it set up to mirror the
production conditions of the servers.  I will then mount some nfs on
it to repeat the problem.

Chris


Re: sshfs/nfs cause server lockup

2006-11-23 Thread Kris Kennaway
On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> Chris <[EMAIL PROTECTED]> wrote:
> >kris a development on this, someone else posted about a nfs problem
> >and reading his post some starkling point he made about network cards,
> >he stated he only gets the bug on sis rl and fxp.
> 
> Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> covers other bugs, ie the inability to mount NFS v3 over TCP.
> 
> I've tested the cards above, and the person I replied to encountered the
> same bug with a bge card. My solution was to remove custom nfs settings
> in sysctl.conf. I don't know which one was the culprit because I don't
> have the time to look into it further.
> 
> My poking uncovered a set of crashing bugs and potentially a livelock.
> I would agree that NFS is very fragile in RELENG_6.
> So far  I've not run into an NFS server deadlock you
> described.

Are you sure these are NFS problems and not ethernet driver problems?

Kris




Re: sshfs/nfs cause server lockup

2006-11-23 Thread s . c . sprong
Chris <[EMAIL PROTECTED]> wrote:
>kris a development on this, someone else posted about a nfs problem
>and reading his post some starkling point he made about network cards,
>he stated he only gets the bug on sis rl and fxp.

Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
covers other bugs, ie the inability to mount NFS v3 over TCP.

I've tested the cards above, and the person I replied to encountered the
same bug with a bge card. My solution was to remove custom nfs settings
in sysctl.conf. I don't know which one was the culprit because I don't
have the time to look into it further.

My poking uncovered a set of crashing bugs and potentially a livelock.
I would agree that NFS is very fragile in RELENG_6.
So far I've not run into the NFS server deadlock you described.

scs
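
The message above does not say which sysctl.conf settings were removed.  As
a hedged illustration only, a first step could be to list the NFS-related
entries so they can be commented out one at a time.  The sketch below is in
Python, which is not used in this thread; the function name
nfs_sysctl_lines is hypothetical, and the vfs.nfs.example_tunable entry in
the sample text is made up for the example rather than taken from anyone's
configuration.

```python
def nfs_sysctl_lines(text):
    """Return the non-comment sysctl.conf lines whose name mentions 'nfs'."""
    matches = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if "nfs" in name.lower():
            matches.append(line)
    return matches


# Hypothetical sample contents; on a real system this text would be read
# from /etc/sysctl.conf instead.
sample = """\
# example only; the nfs entry below is invented for illustration
vfs.nfs.example_tunable=1
kern.ipc.somaxconn=1024
"""
print(nfs_sysctl_lines(sample))
```

Removing the listed entries one by one and retesting after each change
would narrow down which setting, if any, was the culprit.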


Re: sshfs/nfs cause server lockup

2006-11-22 Thread Chris

On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Thu, Nov 23, 2006 at 05:25:21AM +, Chris wrote:
> On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> >> On a few occasions all different remote servers I have had nfs cause
> >> servers to stop responding so I stopped  using it all the servers were
> >> either 6.0 release 6.1 release or 6-stable.
> >>
> >> We recently discovered sshfs which supports cross platform mounting
> >> server is linux and I mounted on a freebsd 6.1 release using security
> >> branch up to date.
> >>
> >> it was working fine for around 5 to 6 days with some problems with
> >> sshfs not updating files that are updated but wasnt compromising the
> >> stability of the freebsd server I just remounted to keep up to date.
> >> Then today the linux server had network problems so the sshfs timed
> >> out and there is 2 dirs I mount, the first mounted fine a bit slow but
> >> connected but when I ran the command to mount the 2nd dir the server
> >> stopped responding.
> >>
> >> My 2nd ssh terminal was alive I tried to run top to see if sshfs was
> >> hanging or something but when I hit enter top didnt run and the 2nd
> >> terminal was froze, note both terminals didnt timeout and a ircd
> >> running on the server also did not timeout but the box wasnt listening
> >> to any new requests, it was responding to pings fine.
> >>
> >> I have a remote reboot facility on the box but no local access and no
> >> kvm/serial console facility available this is the case for all of my
> >> servers.  I initially tried a soft reboot which uses ctrl-alt-delete
> >> but the pings kept replying so I could see the reboot wasn initiated
> >> indicating some kind of console lockup as well, I then did a hard
> >> reboot which brought the server back.
> >>
> >> All logs stopped when the first lockup occured so no errors etc.
> >> recorded bear in mind I have no local access to this machine.  It does
> >> appear that 6.x has some kind of serious remote mounting bug because I
> >> never had these nfs problems in freebsd 5.x.
> >>
> >> I would be interested in any thoughts as to what could help me I have
> >> rebooted the server now with network mpsafe disabled to see if this
> >> will help it is using a generic kernel with the following changes.
> >
> >Sounds like your "sshfs" is causing the kernel to deadlock in that
> >error situation.  You can confirm by enabling DEBUG_LOCKS and
> >DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
> >when the deadlock occurs.
> >
> >If you're still having problems with NFS on 6.2, we'd much rather you
> >reported those so that we can investigate and try to fix them.
> >
> >Kris
> >
> >
> >
>
> Ok thanks, I will make sure this box is updated to 6.2 when it hits
> release, if I enable the options in the kernel I will need local
> access to use ddb?

Yeah, you'll need a form of console access (local or serial).

In principle you could extract the information from a coredump
(i.e. trigger a coredump when the system deadlocks), but I don't think
there's a kgdb macro equivalent of 'show lockedvnods'.

Kris






Kris, a development on this: someone else posted about an nfs problem,
and reading his post, he made a startling point about network cards.
He stated he only gets the bug on sis, rl and fxp.

I have 2 servers that have no problems; they show dc0 and re0 in ifconfig.

Of the servers that lock up, I have 2 using fxp0 and 1 using rl0, and
another that used sis0 which I no longer have.  This would back up what
he was saying.

I won't be able to use ddb on my remote server, since the datacentre
won't provide kvm even if I offer cash for the service.  My local server
is rl0, so I will try to repeat the problem on that.

Chris


Re: sshfs/nfs cause server lockup

2006-11-22 Thread Kris Kennaway
On Thu, Nov 23, 2006 at 05:25:21AM +, Chris wrote:
> On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> >On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> >> On a few occasions all different remote servers I have had nfs cause
> >> servers to stop responding so I stopped  using it all the servers were
> >> either 6.0 release 6.1 release or 6-stable.
> >>
> >> We recently discovered sshfs which supports cross platform mounting
> >> server is linux and I mounted on a freebsd 6.1 release using security
> >> branch up to date.
> >>
> >> it was working fine for around 5 to 6 days with some problems with
> >> sshfs not updating files that are updated but wasnt compromising the
> >> stability of the freebsd server I just remounted to keep up to date.
> >> Then today the linux server had network problems so the sshfs timed
> >> out and there is 2 dirs I mount, the first mounted fine a bit slow but
> >> connected but when I ran the command to mount the 2nd dir the server
> >> stopped responding.
> >>
> >> My 2nd ssh terminal was alive I tried to run top to see if sshfs was
> >> hanging or something but when I hit enter top didnt run and the 2nd
> >> terminal was froze, note both terminals didnt timeout and a ircd
> >> running on the server also did not timeout but the box wasnt listening
> >> to any new requests, it was responding to pings fine.
> >>
> >> I have a remote reboot facility on the box but no local access and no
> >> kvm/serial console facility available this is the case for all of my
> >> servers.  I initially tried a soft reboot which uses ctrl-alt-delete
> >> but the pings kept replying so I could see the reboot wasn initiated
> >> indicating some kind of console lockup as well, I then did a hard
> >> reboot which brought the server back.
> >>
> >> All logs stopped when the first lockup occured so no errors etc.
> >> recorded bear in mind I have no local access to this machine.  It does
> >> appear that 6.x has some kind of serious remote mounting bug because I
> >> never had these nfs problems in freebsd 5.x.
> >>
> >> I would be interested in any thoughts as to what could help me I have
> >> rebooted the server now with network mpsafe disabled to see if this
> >> will help it is using a generic kernel with the following changes.
> >
> >Sounds like your "sshfs" is causing the kernel to deadlock in that
> >error situation.  You can confirm by enabling DEBUG_LOCKS and
> >DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
> >when the deadlock occurs.
> >
> >If you're still having problems with NFS on 6.2, we'd much rather you
> >reported those so that we can investigate and try to fix them.
> >
> >Kris
> >
> >
> >
> 
> Ok thanks, I will make sure this box is updated to 6.2 when it hits
> release, if I enable the options in the kernel I will need local
> access to use ddb?

Yeah, you'll need a form of console access (local or serial).

In principle you could extract the information from a coredump
(i.e. trigger a coredump when the system deadlocks), but I don't think
there's a kgdb macro equivalent of 'show lockedvnods'.

Kris
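
As a rough illustration of the setup described above, the lines below show
what might be added to a custom kernel configuration file before rebuilding
the kernel.  DEBUG_LOCKS, DEBUG_VFS_LOCKS and the 'show lockedvnods' command
come from this thread; the KDB and DDB lines are an assumption about what is
needed to get a DDB prompt at all on a kernel that does not already include
them.  Treat this as a sketch, not a tested configuration.

```
# Added to a copy of GENERIC (assumption: the debugger is not already enabled)
options 	KDB			# kernel debugger framework (assumed prerequisite)
options 	DDB			# interactive in-kernel debugger (assumed prerequisite)
options 	DEBUG_LOCKS		# named in the thread
options 	DEBUG_VFS_LOCKS		# named in the thread

# At the DDB prompt on the local or serial console, once the hang occurs:
#   db> show lockedvnods
```

As noted in the message, this is only useful on a machine with local or
serial console access.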





Re: sshfs/nfs cause server lockup

2006-11-22 Thread Chris

On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> On a few occasions, on several different remote servers, I have had nfs
> cause servers to stop responding, so I stopped using it. All the servers
> were either 6.0-RELEASE, 6.1-RELEASE or 6-STABLE.
>
> We recently discovered sshfs, which supports cross-platform mounting.
> The server is Linux and I mounted it on a FreeBSD 6.1-RELEASE box that
> is up to date on the security branch.
>
> It was working fine for around 5 to 6 days, with some problems with
> sshfs not picking up files that had been updated, but that wasn't
> compromising the stability of the FreeBSD server; I just remounted to
> keep up to date. Then today the Linux server had network problems, so
> the sshfs mounts timed out. There are 2 dirs I mount: the first mounted
> fine, a bit slow but connected, but when I ran the command to mount the
> 2nd dir the server stopped responding.
>
> My 2nd ssh terminal was alive. I tried to run top to see if sshfs was
> hanging or something, but when I hit enter top didn't run and the 2nd
> terminal froze. Note both terminals didn't time out, and an ircd
> running on the server also did not time out, but the box wasn't
> listening to any new requests. It was responding to pings fine.
>
> I have a remote reboot facility on the box but no local access and no
> kvm/serial console facility available; this is the case for all of my
> servers. I initially tried a soft reboot, which uses ctrl-alt-delete,
> but the pings kept replying so I could see the reboot wasn't initiated,
> indicating some kind of console lockup as well. I then did a hard
> reboot, which brought the server back.
>
> All logs stopped when the first lockup occurred, so no errors etc. were
> recorded. Bear in mind I have no local access to this machine. It does
> appear that 6.x has some kind of serious remote mounting bug, because I
> never had these nfs problems in FreeBSD 5.x.
>
> I would be interested in any thoughts as to what could help me. I have
> rebooted the server now with network mpsafe disabled to see if this
> will help. It is using a generic kernel with the following changes.

Sounds like your "sshfs" is causing the kernel to deadlock in that
error situation.  You can confirm by enabling DEBUG_LOCKS and
DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
when the deadlock occurs.

If you're still having problems with NFS on 6.2, we'd much rather you
reported those so that we can investigate and try to fix them.

Kris





Ok thanks, I will make sure this box is updated to 6.2 when it hits
release. If I enable the options in the kernel, will I need local
access to use ddb?

Chris
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: sshfs/nfs cause server lockup

2006-11-22 Thread Kris Kennaway
On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> On a few occasions, on several different remote servers, I have had nfs
> cause servers to stop responding, so I stopped using it. All the servers
> were either 6.0-RELEASE, 6.1-RELEASE or 6-STABLE.
>
> We recently discovered sshfs, which supports cross-platform mounting.
> The server is Linux and I mounted it on a FreeBSD 6.1-RELEASE box that
> is up to date on the security branch.
>
> It was working fine for around 5 to 6 days, with some problems with
> sshfs not picking up files that had been updated, but that wasn't
> compromising the stability of the FreeBSD server; I just remounted to
> keep up to date. Then today the Linux server had network problems, so
> the sshfs mounts timed out. There are 2 dirs I mount: the first mounted
> fine, a bit slow but connected, but when I ran the command to mount the
> 2nd dir the server stopped responding.
>
> My 2nd ssh terminal was alive. I tried to run top to see if sshfs was
> hanging or something, but when I hit enter top didn't run and the 2nd
> terminal froze. Note both terminals didn't time out, and an ircd
> running on the server also did not time out, but the box wasn't
> listening to any new requests. It was responding to pings fine.
>
> I have a remote reboot facility on the box but no local access and no
> kvm/serial console facility available; this is the case for all of my
> servers. I initially tried a soft reboot, which uses ctrl-alt-delete,
> but the pings kept replying so I could see the reboot wasn't initiated,
> indicating some kind of console lockup as well. I then did a hard
> reboot, which brought the server back.
>
> All logs stopped when the first lockup occurred, so no errors etc. were
> recorded. Bear in mind I have no local access to this machine. It does
> appear that 6.x has some kind of serious remote mounting bug, because I
> never had these nfs problems in FreeBSD 5.x.
>
> I would be interested in any thoughts as to what could help me. I have
> rebooted the server now with network mpsafe disabled to see if this
> will help. It is using a generic kernel with the following changes.

Sounds like your "sshfs" is causing the kernel to deadlock in that
error situation.  You can confirm by enabling DEBUG_LOCKS and
DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
when the deadlock occurs.
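
A minimal sketch of what that might look like in a custom kernel config
file, assuming a 6.x source tree. The option names are the ones listed in
sys/conf/NOTES; whether this exact set is sufficient on a given machine is
an assumption, not a tested recipe:

```
# Kernel debugger framework plus the interactive DDB backend
options 	KDB
options 	DDB

# Extra lock debugging, so the debugger has more to report
options 	DEBUG_LOCKS
options 	DEBUG_VFS_LOCKS
```

After rebuilding and booting that kernel, break into the debugger from the
console when the hang occurs (Ctrl+Alt+Esc is the usual default on a
syscons console; a serial console typically needs a break-to-debugger
option as well) and type 'show lockedvnods' at the db> prompt to list the
vnodes that are still locked.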

If you're still having problems with NFS on 6.2, we'd much rather you
reported those so that we can investigate and try to fix them.

Kris




sshfs/nfs cause server lockup

2006-11-21 Thread Chris

On a few occasions, on several different remote servers, I have had nfs
cause servers to stop responding, so I stopped using it. All the servers
were either 6.0-RELEASE, 6.1-RELEASE or 6-STABLE.

We recently discovered sshfs, which supports cross-platform mounting.
The server is Linux and I mounted it on a FreeBSD 6.1-RELEASE box that
is up to date on the security branch.

It was working fine for around 5 to 6 days, with some problems with
sshfs not picking up files that had been updated, but that wasn't
compromising the stability of the FreeBSD server; I just remounted to
keep up to date. Then today the Linux server had network problems, so
the sshfs mounts timed out. There are 2 dirs I mount: the first mounted
fine, a bit slow but connected, but when I ran the command to mount the
2nd dir the server stopped responding.

My 2nd ssh terminal was alive. I tried to run top to see if sshfs was
hanging or something, but when I hit enter top didn't run and the 2nd
terminal froze. Note both terminals didn't time out, and an ircd
running on the server also did not time out, but the box wasn't
listening to any new requests. It was responding to pings fine.

I have a remote reboot facility on the box but no local access and no
kvm/serial console facility available; this is the case for all of my
servers. I initially tried a soft reboot, which uses ctrl-alt-delete,
but the pings kept replying so I could see the reboot wasn't initiated,
indicating some kind of console lockup as well. I then did a hard
reboot, which brought the server back.

All logs stopped when the first lockup occurred, so no errors etc. were
recorded. Bear in mind I have no local access to this machine. It does
appear that 6.x has some kind of serious remote mounting bug, because I
never had these nfs problems in FreeBSD 5.x.

I would be interested in any thoughts as to what could help me. I have
rebooted the server now with network mpsafe disabled to see if this
will help. It is using a generic kernel with the following changes:

options directio, polling, noadaptive mutexes, adaptive giant, ipv6 and
nfs disabled.

dmesg output below. I left in the reboot output showing vnodes because
it also looks suspicious that it took so long to sync the disks. This
was following a working reboot; the remote reboot of course was an
improper shutdown.

The hd is SATA2 but dmesg shows it as ata33.

Syncing disks, vnodes remaining...3 3 1 0 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 0 0 0 done
All buffers synced.
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE-p10 #1: Sat Nov 11 23:02:09 GMT 2006
   [EMAIL PROTECTED]:/usr/obj/usr/src/sys/HEAVEN
WARNING: MPSAFE network stack disabled, expect reduced performance.
ACPI APIC Table: 
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3800+ (2410.95-MHz 686-class CPU)
 Origin = "AuthenticAMD"  Id = 0x40ff2  Stepping = 2
 
Features=0x78bfbff
 Features2=0x2001
 AMD Features=0xea500800
 AMD Features2=0x1d,,CR8>
real memory  = 939261952 (895 MB)
avail memory = 909828096 (867 MB)
ioapic0  irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0:  on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0
cpu0:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pci0:  at device 0.0 (no driver attached)
isab0:  at device 1.0 on pci0
isa0:  on isab0
pci0:  at device 1.1 (no driver attached)
pci0:  at device 1.2 (no driver attached)
pci0:  at device 1.3 (no driver attached)
ohci0:  mem 0xdfe7f000-0xdfe7 irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0:  on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0:  mem 0xdfe7ec00-0xdfe7ecff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 10 ports each: usb0
usb1:  on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
pcib1:  at device 4.0 on pci0
pci1:  on pcib1
fxp0:  port 0xec00-0xec3f mem 0xd000-0xdff
f,0xdffc-0xdffd irq 16 at device 6.0 on pci1
miibus0:  on fxp0
inphy0:  on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:02:b3:bf:b5:c9
fxp0: [GIANT-LOCKED]
pci0:  at device 5.0 (no driver attached)
atapci0:  port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 6.0 on pci0
ata0:  on atapci0
ata1:  on atapci0
atapci1:  port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xdfe7d000-0xdfe7dfff irq 20 at device 8.0 on pci0
ata2:  on atapci1
ata3:  on atapci1
atapci2:  port
0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f
mem 0xdfe7c000-0xdfe