Re: sshfs/nfs cause server lockup - resolved
On 19/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> On Tue, Dec 19, 2006 at 08:20:21PM +, Chris wrote:
> > On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > > On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> > > > On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > > > > On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > > > > > It does make sense if thats the problem since the entire server even
> > > > > > locally stops working properly, and it always follows a unexpected
> > > > > > nfs/sshfs disconnection ie. network timeout.
> > > > > >
> > > > > > I am now running 6.2-RC that has the new file and currently at 1 day
> > > > > > 11hrs uptime.
> > > > >
> > > > > OK, thanks for following part of the advice I gave a month ago ;) Let
> > > > > us know if the problems persist.
> > > > >
> > > > > Kris
> > > >
> > > > Early today the nfs hub was rebooted so had a unexpected disconnection
> > > > also noted by the sshfs timeout prompt waiting for me in the terminal
> > > > , was able to remount fine and no server lockup or other probolems.
> > > >
> > > > Current uptime is 5 days, 10:48
> > >
> > > OK, good to know.
> > >
> > > Thanks,
> > > Kris
> >
> > Some bad news, I was offline for a day here, then I logged in today
> > reattached to screen, and was greeted with a timeout message to the
> > sshfs server, at this point server still functioning fine. When I ran
> > the sshfs command again it locked, with only pings responding and had
> > to hard reboot it.
> >
> > I will setup my local machne now so I can do proper debugging for you.
>
> OK, it's (still) probably an sshfs bug though.
>
> Kris

OK, here is how to repeat the bug every time. It works with both sshfs and nfs.

First, the server died again (the hub is having its own problems, which is causing lots of timeouts). This time, instead of remounting, I tried to ls the 2 mounts, simply listing empty dirs. The first dir worked and the 2nd dir caused the lockup, so it's some kind of problem with the filesystem nodes or something.

With this in mind, on my local box I yanked out the network cable to cause an unexpected timeout, and the box hung. I tried the ddb procedure but it didn't work; I may have been doing it wrong. I booted the local box again, mounted nfs over the internet and tried the same thing: yanked out the network cable, then accessed the dir where the nfs mount is attached. Same result: the box hung and needed a hard reboot.

The local box is running 6.2-RC as well, with a GENERIC kernel and the default make.conf.

Chris
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
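Editorial sketch, not part of the original thread: the report above says that simply listing a directory under a timed-out mount was enough to hang the caller. A common defensive pattern is to touch a suspect mount point from a throwaway child process with a timeout, so the calling shell or script does not block. The function name `probe_path` and the 5-second default are assumptions made for illustration. This would not help in the failure described in the thread, where the whole machine stopped responding; it only guards the caller against an ordinary stale mount.

```python
import subprocess
import sys

# Code run in the child: stat the path given as argv[1].
# Exit status 0 means the stat succeeded, 1 means it raised OSError.
_PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.stat(sys.argv[1])\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def probe_path(path, timeout=5.0):
    """Return 'ok', 'error' or 'timeout' for a stat() of path done in a child."""
    child = subprocess.Popen(
        [sys.executable, "-c", _PROBE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        status = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Request a kill but do not wait for the child afterwards: a process
        # blocked inside a dead mount may never exit.
        child.kill()
        return "timeout"
    return "ok" if status == 0 else "error"


if __name__ == "__main__":
    # Mount points are taken from the command line; none are assumed here.
    for mount_point in sys.argv[1:]:
        print(mount_point, probe_path(mount_point))
```

Run it with the mount points as arguments before remounting or listing them; a `timeout` result suggests the mount is stale and should not be touched from an interactive session.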
Re: sshfs/nfs cause server lockup - resolved
On Tue, Dec 19, 2006 at 08:20:21PM +, Chris wrote:
> On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> > > On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > > > On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > > > > It does make sense if thats the problem since the entire server even
> > > > > locally stops working properly, and it always follows a unexpected
> > > > > nfs/sshfs disconnection ie. network timeout.
> > > > >
> > > > > I am now running 6.2-RC that has the new file and currently at 1 day
> > > > > 11hrs uptime.
> > > >
> > > > OK, thanks for following part of the advice I gave a month ago ;) Let
> > > > us know if the problems persist.
> > > >
> > > > Kris
> > >
> > > Early today the nfs hub was rebooted so had a unexpected disconnection
> > > also noted by the sshfs timeout prompt waiting for me in the terminal
> > > , was able to remount fine and no server lockup or other probolems.
> > >
> > > Current uptime is 5 days, 10:48
> >
> > OK, good to know.
> >
> > Thanks,
> > Kris
>
> Some bad news, I was offline for a day here, then I logged in today
> reattached to screen, and was greeted with a timeout message to the
> sshfs server, at this point server still functioning fine. When I ran
> the sshfs command again it locked, with only pings responding and had
> to hard reboot it.
>
> I will setup my local machne now so I can do proper debugging for you.

OK, it's (still) probably an sshfs bug though.

Kris
Re: sshfs/nfs cause server lockup - resolved
On 18/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> > On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > > On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > > > It does make sense if thats the problem since the entire server even
> > > > locally stops working properly, and it always follows a unexpected
> > > > nfs/sshfs disconnection ie. network timeout.
> > > >
> > > > I am now running 6.2-RC that has the new file and currently at 1 day
> > > > 11hrs uptime.
> > >
> > > OK, thanks for following part of the advice I gave a month ago ;) Let
> > > us know if the problems persist.
> > >
> > > Kris
> >
> > Early today the nfs hub was rebooted so had a unexpected disconnection
> > also noted by the sshfs timeout prompt waiting for me in the terminal
> > , was able to remount fine and no server lockup or other probolems.
> >
> > Current uptime is 5 days, 10:48
>
> OK, good to know.
>
> Thanks,
> Kris

Some bad news. I was offline for a day, then logged in today, reattached to screen, and was greeted with a timeout message for the sshfs server. At that point the server was still functioning fine. When I ran the sshfs command again it locked up, with only pings responding, and I had to hard reboot it.

I will set up my local machine now so I can do proper debugging for you.

Chris
Re: sshfs/nfs cause server lockup
On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > It does make sense if thats the problem since the entire server even
> > locally stops working properly, and it always follows a unexpected
> > nfs/sshfs disconnection ie. network timeout.
> >
> > I am now running 6.2-RC that has the new file and currently at 1 day
> > 11hrs uptime.
>
> OK, thanks for following part of the advice I gave a month ago ;) Let
> us know if the problems persist.
>
> Kris

Early today the nfs hub was rebooted, so there was an unexpected disconnection, also shown by the sshfs timeout prompt waiting for me in the terminal. I was able to remount fine, with no server lockup or other problems.

Current uptime is 5 days, 10:48.

Chris
Re: sshfs/nfs cause server lockup - resolved
On Mon, Dec 18, 2006 at 12:39:13AM +, Chris wrote:
> On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > > It does make sense if thats the problem since the entire server even
> > > locally stops working properly, and it always follows a unexpected
> > > nfs/sshfs disconnection ie. network timeout.
> > >
> > > I am now running 6.2-RC that has the new file and currently at 1 day
> > > 11hrs uptime.
> >
> > OK, thanks for following part of the advice I gave a month ago ;) Let
> > us know if the problems persist.
> >
> > Kris
>
> Early today the nfs hub was rebooted so had a unexpected disconnection
> also noted by the sshfs timeout prompt waiting for me in the terminal
> , was able to remount fine and no server lockup or other probolems.
>
> Current uptime is 5 days, 10:48

OK, good to know.

Thanks,
Kris
Re: sshfs/nfs cause server lockup
On 14/12/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> > It does make sense if thats the problem since the entire server even
> > locally stops working properly, and it always follows a unexpected
> > nfs/sshfs disconnection ie. network timeout.
> >
> > I am now running 6.2-RC that has the new file and currently at 1 day
> > 11hrs uptime.
>
> OK, thanks for following part of the advice I gave a month ago ;) Let
> us know if the problems persist.
>
> Kris

Will do. I also have the parts for my local machine now, so if it persists I will get that online to diagnose locally.

Chris
Re: sshfs/nfs cause server lockup
On Thu, Dec 14, 2006 at 01:28:48AM +, Chris wrote:
> It does make sense if thats the problem since the entire server even
> locally stops working properly, and it always follows a unexpected
> nfs/sshfs disconnection ie. network timeout.
>
> I am now running 6.2-RC that has the new file and currently at 1 day
> 11hrs uptime.

OK, thanks for following part of the advice I gave a month ago ;) Let
us know if the problems persist.

Kris
Re: sshfs/nfs cause server lockup
On 07/12/06, Anish Mistry <[EMAIL PROTECTED]> wrote:
> On Thursday 07 December 2006 13:36, Chris wrote:
> > On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > > On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > > > Chris <[EMAIL PROTECTED]> wrote:
> > > > > kris a development on this, someone else posted about a nfs
> > > > > problem and reading his post some starkling point he made
> > > > > about network cards, he stated he only gets the bug on sis rl
> > > > > and fxp.
> > > >
> > > > Sorry for the misunderstanding, but I think that the 'NFS via
> > > > TCP' thread covers other bugs, ie the inability to mount NFS v3
> > > > over TCP.
> > > >
> > > > I've tested the cards above, and the person I replied to
> > > > encountered the same bug with a bge card. My solution was to
> > > > remove custom nfs settings in sysctl.conf. I don't know which
> > > > one was the culprit because I don't have the time to look into
> > > > it further.
> > > >
> > > > My poking uncovered a set of crashing bugs and potentially a
> > > > livelock. I would agree that NFS is very fragile in RELENG_6.
> > > > So far I've not run into an NFS server
> > > > deadlock you described.
> > >
> > > Are you sure these are NFS problems and not ethernet driver
> > > problems?
> > >
> > > Kris
> >
> > Had another lockup today, reattached to screen process to see both
> > sshfs mounts timed out when running sshfs to remount the terminal
> > stopped updating, it didnt disconnect me and an ircd running on the
> > server remained functional and the server responded to pings,
> > however the terminal was dead and I couldnt login on ssh for a new
> > session, a ctrl-alt-del also failed to reboot it and it needed a
> > power cycle again.
> >
> > I have googled researched and read dozens and dozens of nfs bug
> > reports where most of them seem to be a problem caused by nfs not
> > disconnecting properly leaving itself in a bad state, the fix is
> > usually to reboot. I am having to reboot weekly more often then my
> > windows desktop pc, everyone else having no problems with the nfs
> > on their linux servers having no problems and its hardly inspiring
> > that the stable freebsd is not stable. All these problems on
> > google didnt get fixed no dev attention etc.
> >
> > So far I have not got my local bsd box up and running yet due to no
> > keyboard still need parts for it.
>
> I've updated the sshfs and libs port, you may want to update and see
> if that helps.
>
> --
> Anish Mistry

Hi, yes, my sshfs and the libs got updated, but that didn't fix the crash. However, I did find this link:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1362611+0+archive/2006/freebsd-stable/20060702.freebsd-stable

It does make sense if that's the problem, since the entire server, even locally, stops working properly, and it always follows an unexpected nfs/sshfs disconnection, i.e. a network timeout.

I am now running 6.2-RC, which has the new file, and am currently at 1 day 11hrs uptime.

Chris
Re: sshfs/nfs cause server lockup
On Thursday 07 December 2006 13:36, Chris wrote:
> On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > > Chris <[EMAIL PROTECTED]> wrote:
> > > > kris a development on this, someone else posted about a nfs
> > > > problem and reading his post some starkling point he made
> > > > about network cards, he stated he only gets the bug on sis rl
> > > > and fxp.
> > >
> > > Sorry for the misunderstanding, but I think that the 'NFS via
> > > TCP' thread covers other bugs, ie the inability to mount NFS v3
> > > over TCP.
> > >
> > > I've tested the cards above, and the person I replied to
> > > encountered the same bug with a bge card. My solution was to
> > > remove custom nfs settings in sysctl.conf. I don't know which
> > > one was the culprit because I don't have the time to look into
> > > it further.
> > >
> > > My poking uncovered a set of crashing bugs and potentially a
> > > livelock. I would agree that NFS is very fragile in RELENG_6.
> > > So far I've not run into an NFS server
> > > deadlock you described.
> >
> > Are you sure these are NFS problems and not ethernet driver
> > problems?
> >
> > Kris
>
> Had another lockup today, reattached to screen process to see both
> sshfs mounts timed out when running sshfs to remount the terminal
> stopped updating, it didnt disconnect me and an ircd running on the
> server remained functional and the server responded to pings,
> however the terminal was dead and I couldnt login on ssh for a new
> session, a ctrl-alt-del also failed to reboot it and it needed a
> power cycle again.
>
> I have googled researched and read dozens and dozens of nfs bug
> reports where most of them seem to be a problem caused by nfs not
> disconnecting properly leaving itself in a bad state, the fix is
> usually to reboot. I am having to reboot weekly more often then my
> windows desktop pc, everyone else having no problems with the nfs
> on their linux servers having no problems and its hardly inspiring
> that the stable freebsd is not stable. All these problems on
> google didnt get fixed no dev attention etc.
>
> So far I have not got my local bsd box up and running yet due to no
> keyboard still need parts for it.

I've updated the sshfs and libs port, you may want to update and see
if that helps.

--
Anish Mistry
Re: sshfs/nfs cause server lockup
On Thu, Dec 07, 2006 at 06:36:10PM +, Chris wrote:
> On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > > Chris <[EMAIL PROTECTED]> wrote:
> > > > kris a development on this, someone else posted about a nfs problem
> > > > and reading his post some starkling point he made about network cards,
> > > > he stated he only gets the bug on sis rl and fxp.
> > >
> > > Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> > > covers other bugs, ie the inability to mount NFS v3 over TCP.
> > >
> > > I've tested the cards above, and the person I replied to encountered the
> > > same bug with a bge card. My solution was to remove custom nfs settings
> > > in sysctl.conf. I don't know which one was the culprit because I don't
> > > have the time to look into it further.
> > >
> > > My poking uncovered a set of crashing bugs and potentially a livelock.
> > > I would agree that NFS is very fragile in RELENG_6.
> > > So far I've not run into an NFS server deadlock you
> > > described.
> >
> > Are you sure these are NFS problems and not ethernet driver problems?
> >
> > Kris
>
> Had another lockup today, reattached to screen process to see both
> sshfs mounts timed out when running sshfs to remount the terminal
> stopped updating, it didnt disconnect me and an ircd running on the
> server remained functional and the server responded to pings, however
> the terminal was dead and I couldnt login on ssh for a new session, a
> ctrl-alt-del also failed to reboot it and it needed a power cycle
> again.
>
> I have googled researched and read dozens and dozens of nfs bug
> reports where most of them seem to be a problem caused by nfs not
> disconnecting properly leaving itself in a bad state, the fix is
> usually to reboot. I am having to reboot weekly more often then my
> windows desktop pc, everyone else having no problems with the nfs on
> their linux servers having no problems and its hardly inspiring that
> the stable freebsd is not stable. All these problems on google didnt
> get fixed no dev attention etc.

They probably got "no dev attention" because the user didn't submit
any usable information ;-)

Yes, that's a hint. If this problem really is important to you then
you need to find the time to set up a suitable debugging environment
so the developers can begin to help you. Until then we can't do
anything no matter how many emails you send.

> So far I have not got my local bsd box up and running yet due to no
> keyboard still need parts for it.

Kris
Re: sshfs/nfs cause server lockup
On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote: > Chris <[EMAIL PROTECTED]> wrote: > >kris a development on this, someone else posted about a nfs problem > >and reading his post some starkling point he made about network cards, > >he stated he only gets the bug on sis rl and fxp. > > Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread > covers other bugs, ie the inability to mount NFS v3 over TCP. > > I've tested the cards above, and the person I replied to encountered the > same bug with a bge card. My solution was to remove custom nfs settings > in sysctl.conf. I don't know which one was the culprit because I don't > have the time to look into it further. > > My poking uncovered a set of crashing bugs and potentially a livelock. > I would agree that NFS is very fragile in RELENG_6. > So far I've not run into an NFS server deadlock you > described. Are you sure these are NFS problems and not ethernet driver problems? Kris Had another lockup today, reattached to screen process to see both sshfs mounts timed out when running sshfs to remount the terminal stopped updating, it didnt disconnect me and an ircd running on the server remained functional and the server responded to pings, however the terminal was dead and I couldnt login on ssh for a new session, a ctrl-alt-del also failed to reboot it and it needed a power cycle again. I have googled researched and read dozens and dozens of nfs bug reports where most of them seem to be a problem caused by nfs not disconnecting properly leaving itself in a bad state, the fix is usually to reboot. I am having to reboot weekly more often then my windows desktop pc, everyone else having no problems with the nfs on their linux servers having no problems and its hardly inspiring that the stable freebsd is not stable. All these problems on google didnt get fixed no dev attention etc. 
So far I have not got my local bsd box up and running, because it has no keyboard; I still need parts for it.

Chris
Re: sshfs/nfs cause server lockup
On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> > Chris <[EMAIL PROTECTED]> wrote:
> > > kris a development on this, someone else posted about a nfs problem
> > > and reading his post some starkling point he made about network cards,
> > > he stated he only gets the bug on sis rl and fxp.
> >
> > Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> > covers other bugs, ie the inability to mount NFS v3 over TCP.
> >
> > I've tested the cards above, and the person I replied to encountered the
> > same bug with a bge card. My solution was to remove custom nfs settings
> > in sysctl.conf. I don't know which one was the culprit because I don't
> > have the time to look into it further.
> >
> > My poking uncovered a set of crashing bugs and potentially a livelock.
> > I would agree that NFS is very fragile in RELENG_6.
> > So far I've not run into an NFS server deadlock you
> > described.
>
> Are you sure these are NFS problems and not ethernet driver problems?
>
> Kris

The consistent factor is that the servers that lock up are fine when not using any nfs or sshfs mounts. But the 2 servers that are fine use dc and re network adaptors.

I am not able to set up my local bsd box until a new hd arrives next Monday, and it will be a day or so after that before I get it set up to mirror the production conditions of the servers. I will then mount some nfs on it to repeat the problem.

Chris
Re: sshfs/nfs cause server lockup
On Thu, Nov 23, 2006 at 04:38:48PM +0100, [EMAIL PROTECTED] wrote:
> Chris <[EMAIL PROTECTED]> wrote:
> > kris a development on this, someone else posted about a nfs problem
> > and reading his post some starkling point he made about network cards,
> > he stated he only gets the bug on sis rl and fxp.
>
> Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread
> covers other bugs, ie the inability to mount NFS v3 over TCP.
>
> I've tested the cards above, and the person I replied to encountered the
> same bug with a bge card. My solution was to remove custom nfs settings
> in sysctl.conf. I don't know which one was the culprit because I don't
> have the time to look into it further.
>
> My poking uncovered a set of crashing bugs and potentially a livelock.
> I would agree that NFS is very fragile in RELENG_6.
> So far I've not run into an NFS server deadlock you
> described.

Are you sure these are NFS problems and not ethernet driver problems?

Kris
Re: sshfs/nfs cause server lockup
Chris <[EMAIL PROTECTED]> wrote:
> kris a development on this, someone else posted about a nfs problem
> and reading his post some starkling point he made about network cards,
> he stated he only gets the bug on sis rl and fxp.

Sorry for the misunderstanding, but I think that the 'NFS via TCP' thread covers other bugs, ie the inability to mount NFS v3 over TCP.

I've tested the cards above, and the person I replied to encountered the same bug with a bge card. My solution was to remove custom nfs settings in sysctl.conf. I don't know which one was the culprit because I don't have the time to look into it further.

My poking uncovered a set of crashing bugs and potentially a livelock. I would agree that NFS is very fragile in RELENG_6. So far I've not run into an NFS server deadlock you described.

scs
Re: sshfs/nfs cause server lockup
On 23/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: On Thu, Nov 23, 2006 at 05:25:21AM +, Chris wrote: > On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: > >On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote: > >> On a few occasions all different remote servers I have had nfs cause > >> servers to stop responding so I stopped using it all the servers were > >> either 6.0 release 6.1 release or 6-stable. > >> > >> We recently discovered sshfs which supports cross platform mounting > >> server is linux and I mounted on a freebsd 6.1 release using security > >> branch up to date. > >> > >> it was working fine for around 5 to 6 days with some problems with > >> sshfs not updating files that are updated but wasnt compromising the > >> stability of the freebsd server I just remounted to keep up to date. > >> Then today the linux server had network problems so the sshfs timed > >> out and there is 2 dirs I mount, the first mounted fine a bit slow but > >> connected but when I ran the command to mount the 2nd dir the server > >> stopped responding. > >> > >> My 2nd ssh terminal was alive I tried to run top to see if sshfs was > >> hanging or something but when I hit enter top didnt run and the 2nd > >> terminal was froze, note both terminals didnt timeout and a ircd > >> running on the server also did not timeout but the box wasnt listening > >> to any new requests, it was responding to pings fine. > >> > >> I have a remote reboot facility on the box but no local access and no > >> kvm/serial console facility available this is the case for all of my > >> servers. I initially tried a soft reboot which uses ctrl-alt-delete > >> but the pings kept replying so I could see the reboot wasn initiated > >> indicating some kind of console lockup as well, I then did a hard > >> reboot which brought the server back. > >> > >> All logs stopped when the first lockup occured so no errors etc. > >> recorded bear in mind I have no local access to this machine. 
It does > >> appear that 6.x has some kind of serious remote mounting bug because I > >> never had these nfs problems in freebsd 5.x. > >> > >> I would be interested in any thoughts as to what could help me I have > >> rebooted the server now with network mpsafe disabled to see if this > >> will help it is using a generic kernel with the following changes. > > > >Sounds like your "sshfs" is causing the kernel to deadlock in that > >error situation. You can confirm by enabling DEBUG_LOCKS and > >DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods' > >when the deadlock occurs. > > > >If you're still having problems with NFS on 6.2, we'd much rather you > >reported those so that we can investigate and try to fix them. > > > >Kris > > > > > > > > Ok thanks, I will make sure this box is updated to 6.2 when it hits > release, if I enable the options in the kernel I will need local > access to use ddb? Yeah, you'll need a form of console access (local or serial). In principle you could extract the information from a coredump (i.e. trigger a coredump when the system deadlocks), but I don't think there's a kgdb macro equivalent of 'show lockedvnods'. Kris kris a development on this, someone else posted about a nfs problem and reading his post some starkling point he made about network cards, he stated he only gets the bug on sis rl and fxp. I have 2 servers that have no problems they show dc0 and re0 in ifconfig. The servers that lockup I have 2 using fxp0 and 1 using rl0 and another that used sis0 which I no longer have this would back up what he was saying. I wont be able to use ddb on my remote server since the datacentre wont provide kvm even if I offer cash for the service, my local server is rl0 so I will try to repeat the problem on that. Chris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: sshfs/nfs cause server lockup
On Thu, Nov 23, 2006 at 05:25:21AM +, Chris wrote: > On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: > >On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote: > >> On a few occasions all different remote servers I have had nfs cause > >> servers to stop responding so I stopped using it all the servers were > >> either 6.0 release 6.1 release or 6-stable. > >> > >> We recently discovered sshfs which supports cross platform mounting > >> server is linux and I mounted on a freebsd 6.1 release using security > >> branch up to date. > >> > >> it was working fine for around 5 to 6 days with some problems with > >> sshfs not updating files that are updated but wasnt compromising the > >> stability of the freebsd server I just remounted to keep up to date. > >> Then today the linux server had network problems so the sshfs timed > >> out and there is 2 dirs I mount, the first mounted fine a bit slow but > >> connected but when I ran the command to mount the 2nd dir the server > >> stopped responding. > >> > >> My 2nd ssh terminal was alive I tried to run top to see if sshfs was > >> hanging or something but when I hit enter top didnt run and the 2nd > >> terminal was froze, note both terminals didnt timeout and a ircd > >> running on the server also did not timeout but the box wasnt listening > >> to any new requests, it was responding to pings fine. > >> > >> I have a remote reboot facility on the box but no local access and no > >> kvm/serial console facility available this is the case for all of my > >> servers. I initially tried a soft reboot which uses ctrl-alt-delete > >> but the pings kept replying so I could see the reboot wasn initiated > >> indicating some kind of console lockup as well, I then did a hard > >> reboot which brought the server back. > >> > >> All logs stopped when the first lockup occured so no errors etc. > >> recorded bear in mind I have no local access to this machine. 
It does > >> appear that 6.x has some kind of serious remote mounting bug because I > >> never had these nfs problems in freebsd 5.x. > >> > >> I would be interested in any thoughts as to what could help me I have > >> rebooted the server now with network mpsafe disabled to see if this > >> will help it is using a generic kernel with the following changes. > > > >Sounds like your "sshfs" is causing the kernel to deadlock in that > >error situation. You can confirm by enabling DEBUG_LOCKS and > >DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods' > >when the deadlock occurs. > > > >If you're still having problems with NFS on 6.2, we'd much rather you > >reported those so that we can investigate and try to fix them. > > > >Kris > > > > > > > > Ok thanks, I will make sure this box is updated to 6.2 when it hits > release, if I enable the options in the kernel I will need local > access to use ddb? Yeah, you'll need a form of console access (local or serial). In principle you could extract the information from a coredump (i.e. trigger a coredump when the system deadlocks), but I don't think there's a kgdb macro equivalent of 'show lockedvnods'. Kris pgpMrUEIlwUUc.pgp Description: PGP signature
Re: sshfs/nfs cause server lockup
On 22/11/06, Kris Kennaway <[EMAIL PROTECTED]> wrote:

On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> On a few occasions all different remote servers I have had nfs cause
> servers to stop responding so I stopped using it all the servers were
> either 6.0 release 6.1 release or 6-stable.
>
> We recently discovered sshfs which supports cross platform mounting
> server is linux and I mounted on a freebsd 6.1 release using security
> branch up to date.
>
> it was working fine for around 5 to 6 days with some problems with
> sshfs not updating files that are updated but wasnt compromising the
> stability of the freebsd server I just remounted to keep up to date.
> Then today the linux server had network problems so the sshfs timed
> out and there is 2 dirs I mount, the first mounted fine a bit slow but
> connected but when I ran the command to mount the 2nd dir the server
> stopped responding.
>
> My 2nd ssh terminal was alive I tried to run top to see if sshfs was
> hanging or something but when I hit enter top didnt run and the 2nd
> terminal was froze, note both terminals didnt timeout and a ircd
> running on the server also did not timeout but the box wasnt listening
> to any new requests, it was responding to pings fine.
>
> I have a remote reboot facility on the box but no local access and no
> kvm/serial console facility available this is the case for all of my
> servers. I initially tried a soft reboot which uses ctrl-alt-delete
> but the pings kept replying so I could see the reboot wasn initiated
> indicating some kind of console lockup as well, I then did a hard
> reboot which brought the server back.
>
> All logs stopped when the first lockup occured so no errors etc.
> recorded bear in mind I have no local access to this machine. It does
> appear that 6.x has some kind of serious remote mounting bug because I
> never had these nfs problems in freebsd 5.x.
>
> I would be interested in any thoughts as to what could help me I have
> rebooted the server now with network mpsafe disabled to see if this
> will help it is using a generic kernel with the following changes.

Sounds like your "sshfs" is causing the kernel to deadlock in that
error situation. You can confirm by enabling DEBUG_LOCKS and
DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
when the deadlock occurs.

If you're still having problems with NFS on 6.2, we'd much rather you
reported those so that we can investigate and try to fix them.

Kris

Ok thanks, I will make sure this box is updated to 6.2 when it hits
release, if I enable the options in the kernel I will need local
access to use ddb?

Chris
Re: sshfs/nfs cause server lockup
On Wed, Nov 22, 2006 at 05:49:12AM +, Chris wrote:
> On a few occasions all different remote servers I have had nfs cause
> servers to stop responding so I stopped using it all the servers were
> either 6.0 release 6.1 release or 6-stable.
>
> We recently discovered sshfs which supports cross platform mounting
> server is linux and I mounted on a freebsd 6.1 release using security
> branch up to date.
>
> it was working fine for around 5 to 6 days with some problems with
> sshfs not updating files that are updated but wasnt compromising the
> stability of the freebsd server I just remounted to keep up to date.
> Then today the linux server had network problems so the sshfs timed
> out and there is 2 dirs I mount, the first mounted fine a bit slow but
> connected but when I ran the command to mount the 2nd dir the server
> stopped responding.
>
> My 2nd ssh terminal was alive I tried to run top to see if sshfs was
> hanging or something but when I hit enter top didnt run and the 2nd
> terminal was froze, note both terminals didnt timeout and a ircd
> running on the server also did not timeout but the box wasnt listening
> to any new requests, it was responding to pings fine.
>
> I have a remote reboot facility on the box but no local access and no
> kvm/serial console facility available this is the case for all of my
> servers. I initially tried a soft reboot which uses ctrl-alt-delete
> but the pings kept replying so I could see the reboot wasn initiated
> indicating some kind of console lockup as well, I then did a hard
> reboot which brought the server back.
>
> All logs stopped when the first lockup occured so no errors etc.
> recorded bear in mind I have no local access to this machine. It does
> appear that 6.x has some kind of serious remote mounting bug because I
> never had these nfs problems in freebsd 5.x.
>
> I would be interested in any thoughts as to what could help me I have
> rebooted the server now with network mpsafe disabled to see if this
> will help it is using a generic kernel with the following changes.

Sounds like your "sshfs" is causing the kernel to deadlock in that
error situation. You can confirm by enabling DEBUG_LOCKS and
DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
when the deadlock occurs.

If you're still having problems with NFS on 6.2, we'd much rather you
reported those so that we can investigate and try to fix them.

Kris
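For reference, the debugging setup suggested above could look roughly like the kernel config fragment below. This is a minimal sketch, assuming a custom kernel config copied from GENERIC on 6.x. Only DEBUG_LOCKS and DEBUG_VFS_LOCKS are named in the thread; the KDB and DDB lines and the config name DEBUGLOCKS are assumptions added here so that there is a debugger to break into.

```
# Hypothetical fragment of a kernel config file (e.g. a copy of GENERIC
# saved as DEBUGLOCKS). The file name is illustrative only.
options         KDB                 # assumption: kernel debugger framework
options         DDB                 # assumption: in-kernel debugger (db> prompt)
options         DEBUG_LOCKS         # named in the thread
options         DEBUG_VFS_LOCKS     # named in the thread

# After rebuilding and booting this kernel, once the box hangs, break to
# the debugger from the local or serial console and run at the db> prompt:
#
#   show lockedvnods
```

How you break to the debugger depends on the console; on the standard syscons console the usual key sequence is Ctrl+Alt+Esc. As noted elsewhere in the thread, this needs local or serial console access, so it will not help on a box that only has a remote reboot facility.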
sshfs/nfs cause server lockup
On a few occasions, on several different remote servers, I have had nfs
cause servers to stop responding, so I stopped using it. All the servers
were running either 6.0 release, 6.1 release or 6-stable.

We recently discovered sshfs, which supports cross-platform mounting.
The server is linux, and I mounted it on a freebsd 6.1 release box that
is up to date with the security branch.

It was working fine for around 5 to 6 days, with some problems with
sshfs not picking up files that had been updated, but that wasn't
compromising the stability of the freebsd server; I just remounted to
keep up to date. Then today the linux server had network problems, so
the sshfs timed out. There are 2 dirs I mount: the first mounted fine,
a bit slow but connected, but when I ran the command to mount the 2nd
dir the server stopped responding.

My 2nd ssh terminal was still alive, so I tried to run top to see if
sshfs was hanging or something, but when I hit enter top didn't run and
the 2nd terminal froze. Note that both terminals didn't time out, and
an ircd running on the server also did not time out, but the box wasn't
accepting any new requests. It was responding to pings fine.

I have a remote reboot facility on the box but no local access and no
kvm/serial console facility available; this is the case for all of my
servers. I initially tried a soft reboot, which uses ctrl-alt-delete,
but the pings kept replying so I could see the reboot wasn't initiated,
indicating some kind of console lockup as well. I then did a hard
reboot, which brought the server back.

All logs stopped when the first lockup occurred, so no errors etc. were
recorded; bear in mind I have no local access to this machine. It does
appear that 6.x has some kind of serious remote mounting bug, because I
never had these nfs problems in freebsd 5.x.

I would be interested in any thoughts as to what could help me. I have
rebooted the server now with network mpsafe disabled to see if this
will help. It is using a generic kernel with the following changes:
options directio, polling, noadaptive mutexes, adaptive giant, ipv6 and
nfs disabled.

dmesg output below. I left in the part of the reboot showing the vnodes
because it also looks suspicious that it took so long to sync the disks;
this followed a working reboot, whereas the remote reboot was of course
an improper shutdown. The hd is a sata2 but dmesg shows it as ata33.

Syncing disks, vnodes remaining...3 3 1 0 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 0 0 0 done
All buffers synced.
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE-p10 #1: Sat Nov 11 23:02:09 GMT 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/HEAVEN
WARNING: MPSAFE network stack disabled, expect reduced performance.
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3800+ (2410.95-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x40ff2 Stepping = 2
Features=0x78bfbff
Features2=0x2001
AMD Features=0xea500800
AMD Features2=0x1d,,CR8>
real memory = 939261952 (895 MB)
avail memory = 909828096 (867 MB)
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.0 (no driver attached)
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
pci0: at device 1.2 (no driver attached)
pci0: at device 1.3 (no driver attached)
ohci0: mem 0xdfe7f000-0xdfe7 irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: mem 0xdfe7ec00-0xdfe7ecff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 10 ports each: usb0
usb1: on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
pcib1: at device 4.0 on pci0
pci1: on pcib1
fxp0: port 0xec00-0xec3f mem 0xd000-0xdff f,0xdffc-0xdffd irq 16 at device 6.0 on pci1
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:02:b3:bf:b5:c9
fxp0: [GIANT-LOCKED]
pci0: at device 5.0 (no driver attached)
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 6.0 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xdfe7d000-0xdfe7dfff irq 20 at device 8.0 on pci0
ata2: on atapci1
ata3: on atapci1
atapci2: port 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f mem 0xdfe7c000-0xdfe