Re: F12 NFS Failures
On 11/24/2009 04:21 AM, John Austin wrote:
> Just tested my machine with UDP and TCP
> This was using md5sum for about 10GB over the NFS mount
>
> 1. The default for F12/Centos5.4 appears to be TCP - which freezes
> 2. Forcing UDP gives NO errors for 10GB transfer
> 3. Forcing TCP gives a freeze

I know this is an old thread, but I thought I'd toss in that you will see symptoms very much like this if only one of your machines (probably the NFS server) is configured to use jumbo frames. You should check the MTU on the server and client.

--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines
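The MTU check suggested in this message can be sketched as a few shell helpers. This is only an illustration, assuming a Linux host that exposes interface settings under /sys/class/net; the function names and the SYSFS_NET override are made up for the example and are not part of any tool.

```sh
#!/bin/sh
# Sketch only: helpers for spotting a jumbo-frame (MTU) mismatch.
# SYSFS_NET is a hypothetical override so the lookup can be pointed at a
# test directory; on a real Linux system it defaults to /sys/class/net.

# Print the MTU configured on one interface.
mtu_of() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/mtu"
}

# Compare two MTU values (e.g. client and server); non-zero on mismatch.
mtu_match() {
    if [ "$1" -eq "$2" ]; then
        echo "MTU matches: $1"
    else
        echo "MTU mismatch: $1 vs $2"
        return 1
    fi
}

# Largest ICMP payload that fits in one frame: the MTU minus 20 bytes of
# IPv4 header and 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}
```

Run `mtu_of eth0` on both machines (eth0 is an assumed interface name) and feed the two numbers to `mtu_match`. To probe the path including the switch, something like `ping -M do -s "$(ping_payload 1500)" 148.197.29.5` sends a full-size packet with fragmentation prohibited; if that fails while smaller sizes work, some hop is using a smaller MTU.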
Re: F12 NFS Failures
On Mon, 2009-12-21 at 00:39 -0800, Gordon Messmer wrote:
> On 11/24/2009 04:21 AM, John Austin wrote:
>> Just tested my machine with UDP and TCP
>> This was using md5sum for about 10GB over the NFS mount
>>
>> 1. The default for F12/Centos5.4 appears to be TCP - which freezes
>> 2. Forcing UDP gives NO errors for 10GB transfer
>> 3. Forcing TCP gives a freeze
>
> I know this is an old thread, but I thought I'd toss in that you will see symptoms very much like this if only one of your machines (probably the NFS server) is configured to use jumbo frames. You should check the MTU on the server and client.

Thanks for the idea

I have checked the host and the server, both are set to MTU of 1500
I then checked the switch (Netgear GS108T) this had jumbo frames enabled
Disabled jumbo frames - no change
Updated switch firmware - still no change

Problem still present with all F12 kernel versions (sky2 drivers) to date

I have taken sky2 driver from latest stable kernel and tried to compile under F12 but failed!
As I have a work around with 2nd NIC I have been lazy !!

Next move probably a custom kernel

John
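With two NICs in the machine, it helps to confirm which kernel driver each interface is actually bound to before comparing results. A minimal sketch, assuming Linux sysfs; the function name and the SYSFS_NET override are hypothetical and exist only for this example.

```sh
#!/bin/sh
# Sketch only: report which kernel driver is bound to a network interface.
# SYSFS_NET is a hypothetical override for testing; the default is the
# real Linux sysfs location.
nic_driver() {
    # device/driver is a symlink whose final path component is the
    # driver name (for example "sky2").
    basename "$(readlink -f "${SYSFS_NET:-/sys/class/net}/$1/device/driver")"
}
```

For example, `nic_driver eth0` and `nic_driver eth1` (both interface names are assumptions) should show which port is the sky2 one. `ethtool -i eth0` reports the same driver name along with its version, if ethtool is installed.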
Re: F12 NFS Failures
John Austin wrote, On Tue, 24 Nov 2009 12:21:58 +:
> On Mon, 2009-11-23 at 15:00 -0800, Rick Stevens wrote:
>> On 11/21/2009 10:41 AM, John Austin wrote:
>>> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
>>>> On Sat, 2009-11-21 at 10:09 +, John Austin wrote:
>>>>> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs.
>>>>
>>>> I haven't seen freezes, but I have seen corruption when trying to copy large files (e.g. like a DVD iso image) via NFS. In fact, this happened to me when I was trying to install an F12 virtual machine on my F11 box (so I could try it out before deciding whether or not to bite the bullet and upgrade the host OS). I copied over the DVD iso image, then tried to install a VM from it, and it failed the media test. Sure enough, it also failed the sha256sum test. Copying the same DVD iso file via scp instead worked fine. I do not trust NFS for large files.
>>>>
>>>> --Greg
>>>
>>> Hi Greg
>>>
>>> That's interesting and very worrying - surely it can't/shouldn't happen!
>>> I have been using NFS for years for all types/sizes of files and never had a problem until the last couple of months.
>>>
>>> 1. The Centos/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the latest kernel update
>>> 2. Now this F12 problem
>>>
>>> Surely a very large worldwide community uses NFS ?
>>>
>>> OK the F12 case could be my finger trouble or even a hardware problem
>>> I will install F12 on a second machine and test again (against the same server)
>>
>> Can you verify that you run into the same issue if you run NFS over TCP as opposed to NFS over UDP (it's an option in the mount command on the client, use either proto=tcp or proto=udp). By default, the system queries the server and selects a protocol based on what's being asked of it. See the TRANSPORT METHODS section of man nfs.
>>
>> --
>> Rick Stevens, Systems Engineer  ri...@nerd.com
>> AIM/Skype: therps2  ICQ: 22643734  Yahoo: origrps2
>> The Theory of Rapitivity: E=MC Hammer
>>   -- Glenn Marcus (via TopFive.com)
>
> Hi Rick
>
> Many thanks for the reply - you have found a work-around !!
>
> Just tested my machine with UDP and TCP
> This was using md5sum for about 10GB over the NFS mount
>
> 1. The default for F12/Centos5.4 appears to be TCP - which freezes
> 2. Forcing UDP gives NO errors for 10GB transfer
> 3. Forcing TCP gives a freeze
>
> Having briefly read the man pages this is the opposite of what I would expect and of what you suggest !!
> There must be a timing problem somewhere - Please see the other thread "Sky2 NIC Problem? - Was F12 NFS Failures" for other tests I have carried out
>
> Regards
> John

What are your other mount options?

Having seen the Sky2 NIC Problem message, your card/driver may be having issues, but some nfs options may help/hurt. I am assuming that you only have 'hard' and not 'hard,intr' as options to the mount. And for transferring large files over NFS, I have had experiences that say stay away from 'soft' NFS.

It is interesting that TCP nfs locks the machine and fails to copy the very large file, while UDP succeeds in copying the same file with the same device/driver.

BTW when you say that UDP gave no errors, do you mean that from the user program perspective (cp, and then sha256sum) there were no errors, or that from both the user and syslog perspective there were no errors?

I am wondering if you have found a place where the UDP code deals with a bad packet correctly and the TCP version has not seen enough (bad environment) testing.

You wouldn't happen to have a serial cable around, so you can capture where the kernel goes bonkers, would you? (Note: I have never done the serial console myself.)

--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
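To answer the mount-options question without guessing, the options actually in effect can be read from the kernel's mount table. A minimal sketch, assuming Linux and /proc/mounts; the function name is made up for the example.

```sh
#!/bin/sh
# Sketch only: list NFS mounts together with the options actually in effect.
# Takes a mounts table as its argument; defaults to /proc/mounts on Linux.
nfs_mounts() {
    # Fields in /proc/mounts: device, mount point, fs type, options, ...
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 ": " $4 }' "${1:-/proc/mounts}"
}
```

In the output, look for hard or soft, intr, and the proto= value the client really negotiated. For the syslog half of the question, `nfsstat -rc` (from nfs-utils, if installed) prints client RPC counters including retransmissions, which may show trouble that cp and md5sum never report.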
Re: F12 NFS Failures
On Tue, 2009-12-01 at 11:00 -0500, Todd Denniston wrote:
> John Austin wrote, On Tue, 24 Nov 2009 12:21:58 +:
>> On Mon, 2009-11-23 at 15:00 -0800, Rick Stevens wrote:
>>> On 11/21/2009 10:41 AM, John Austin wrote:
>>>> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
>>>>> On Sat, 2009-11-21 at 10:09 +, John Austin wrote:
>>>>>> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs.
>>>>>
>>>>> I haven't seen freezes, but I have seen corruption when trying to copy large files (e.g. like a DVD iso image) via NFS. In fact, this happened to me when I was trying to install an F12 virtual machine on my F11 box (so I could try it out before deciding whether or not to bite the bullet and upgrade the host OS). I copied over the DVD iso image, then tried to install a VM from it, and it failed the media test. Sure enough, it also failed the sha256sum test. Copying the same DVD iso file via scp instead worked fine. I do not trust NFS for large files.
>>>>>
>>>>> --Greg
>>>>
>>>> Hi Greg
>>>>
>>>> That's interesting and very worrying - surely it can't/shouldn't happen!
>>>> I have been using NFS for years for all types/sizes of files and never had a problem until the last couple of months.
>>>>
>>>> 1. The Centos/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the latest kernel update
>>>> 2. Now this F12 problem
>>>>
>>>> Surely a very large worldwide community uses NFS ?
>>>>
>>>> OK the F12 case could be my finger trouble or even a hardware problem
>>>> I will install F12 on a second machine and test again (against the same server)
>>>
>>> Can you verify that you run into the same issue if you run NFS over TCP as opposed to NFS over UDP (it's an option in the mount command on the client, use either proto=tcp or proto=udp). By default, the system queries the server and selects a protocol based on what's being asked of it. See the TRANSPORT METHODS section of man nfs.
>>>
>>> --
>>> Rick Stevens, Systems Engineer  ri...@nerd.com
>>> AIM/Skype: therps2  ICQ: 22643734  Yahoo: origrps2
>>> The Theory of Rapitivity: E=MC Hammer
>>>   -- Glenn Marcus (via TopFive.com)
>>
>> Hi Rick
>>
>> Many thanks for the reply - you have found a work-around !!
>>
>> Just tested my machine with UDP and TCP
>> This was using md5sum for about 10GB over the NFS mount
>>
>> 1. The default for F12/Centos5.4 appears to be TCP - which freezes
>> 2. Forcing UDP gives NO errors for 10GB transfer
>> 3. Forcing TCP gives a freeze
>>
>> Having briefly read the man pages this is the opposite of what I would expect and of what you suggest !!
>> There must be a timing problem somewhere - Please see the other thread "Sky2 NIC Problem? - Was F12 NFS Failures" for other tests I have carried out
>>
>> Regards
>> John
>
> What are your other mount options?
>
> Having seen the Sky2 NIC Problem message, your card/driver may be having issues, but some nfs options may help/hurt. I am assuming that you only have 'hard' and not 'hard,intr' as options to the mount. And for transferring large files over NFS, I have had experiences that say stay away from 'soft' NFS.
>
> It is interesting that TCP nfs locks the machine and fails to copy the very large file, while UDP succeeds in copying the same file with the same device/driver.
>
> BTW when you say that UDP gave no errors, do you mean that from the user program perspective (cp, and then sha256sum) there were no errors, or that from both the user and syslog perspective there were no errors?

Purely from the user point of view. I did not check the number of re-transmissions, log files etc.

> I am wondering if you have found a place where the UDP code deals with a bad packet correctly and the TCP version has not seen enough (bad environment) testing.
>
> You wouldn't happen to have a serial cable around, so you can capture where the kernel goes bonkers, would you? (Note: I have never done the serial console myself.)

I've probably got a serial cable in the roof somewhere but the machine has no serial ports! Shuttle SA76G2.

Hi Todd

I must admit that I have basically given up with the sky2 driver for the moment. I gave up after reading about problems with the sky2 driver way back to something like 2.6.18. I had a spare D-Link gigabit NIC and have been using that. My whole network depends on NFS working perfectly so a dodgy driver is no use to me.

It must be a very subtle bug as I cannot cause the freeze with

1. scp 10GB across the network
2. md5sum across a CIFS samba mount
3. md5sum across NFS4 UDP

Maybe you are right and it would fail if I tried harder/longer

Regards
John
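Trying "harder/longer" could be automated with a loop that checksums the same file repeatedly and stops at the first pass that disagrees with the first one. This is a sketch under stated assumptions, not a proven reproducer; the function name is invented for the example.

```sh
#!/bin/sh
# Sketch only: checksum one file COUNT times and stop at the first pass
# whose result differs from pass 1.
# Usage: repeat_md5 FILE COUNT
repeat_md5() {
    file=$1
    count=$2
    first=$(md5sum < "$file") || return 2
    i=1
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        sum=$(md5sum < "$file") || return 2
        if [ "$sum" != "$first" ]; then
            echo "pass $i differs from pass 1"
            return 1
        fi
    done
    echo "$count passes identical"
}
```

Note that repeated reads of the same file may be served from the client's page cache rather than the network. Picking a file larger than RAM, or running `echo 3 > /proc/sys/vm/drop_caches` as root between passes, should force the data back over the wire.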
Re: F12 NFS Failures
On Mon, 2009-11-23 at 15:00 -0800, Rick Stevens wrote:
> On 11/21/2009 10:41 AM, John Austin wrote:
>> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
>>> On Sat, 2009-11-21 at 10:09 +, John Austin wrote:
>>>> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs.
>>>
>>> I haven't seen freezes, but I have seen corruption when trying to copy large files (e.g. like a DVD iso image) via NFS. In fact, this happened to me when I was trying to install an F12 virtual machine on my F11 box (so I could try it out before deciding whether or not to bite the bullet and upgrade the host OS). I copied over the DVD iso image, then tried to install a VM from it, and it failed the media test. Sure enough, it also failed the sha256sum test. Copying the same DVD iso file via scp instead worked fine. I do not trust NFS for large files.
>>>
>>> --Greg
>>
>> Hi Greg
>>
>> That's interesting and very worrying - surely it can't/shouldn't happen!
>> I have been using NFS for years for all types/sizes of files and never had a problem until the last couple of months.
>>
>> 1. The Centos/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the latest kernel update
>> 2. Now this F12 problem
>>
>> Surely a very large worldwide community uses NFS ?
>>
>> OK the F12 case could be my finger trouble or even a hardware problem
>> I will install F12 on a second machine and test again (against the same server)
>
> Can you verify that you run into the same issue if you run NFS over TCP as opposed to NFS over UDP (it's an option in the mount command on the client, use either proto=tcp or proto=udp). By default, the system queries the server and selects a protocol based on what's being asked of it. See the TRANSPORT METHODS section of man nfs.
>
> --
> Rick Stevens, Systems Engineer  ri...@nerd.com
> AIM/Skype: therps2  ICQ: 22643734  Yahoo: origrps2
> The Theory of Rapitivity: E=MC Hammer
>   -- Glenn Marcus (via TopFive.com)

Hi Rick

Many thanks for the reply - you have found a work-around !!

Just tested my machine with UDP and TCP
This was using md5sum for about 10GB over the NFS mount

1. The default for F12/Centos5.4 appears to be TCP - which freezes
2. Forcing UDP gives NO errors for 10GB transfer
3. Forcing TCP gives a freeze

Having briefly read the man pages this is the opposite of what I would expect and of what you suggest !!
There must be a timing problem somewhere - Please see the other thread "Sky2 NIC Problem? - Was F12 NFS Failures" for other tests I have carried out

Regards
John
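The md5sum test described here can be made repeatable by checksumming a whole directory tree in a stable order, so that a run over a UDP mount and a run over a TCP mount can be compared with diff. A minimal sketch; the function name is an assumption for the example.

```sh
#!/bin/sh
# Sketch only: print an md5sum for every regular file under a directory,
# sorted by path so two runs can be compared line by line.
# Usage: sum_tree DIRECTORY
sum_tree() {
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}
```

For example, `sum_tree /global/testdata > udp.md5` with the UDP mount, then the same into tcp.md5 after remounting with TCP, followed by `diff udp.md5 tcp.md5`. The directory /global/testdata is hypothetical; any tree totalling several GB on the NFS mount would do.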
Re: F12 NFS Failures
On 11/21/2009 10:41 AM, John Austin wrote:
> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
>> On Sat, 2009-11-21 at 10:09 +, John Austin wrote:
>>> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs.
>>
>> I haven't seen freezes, but I have seen corruption when trying to copy large files (e.g. like a DVD iso image) via NFS. In fact, this happened to me when I was trying to install an F12 virtual machine on my F11 box (so I could try it out before deciding whether or not to bite the bullet and upgrade the host OS). I copied over the DVD iso image, then tried to install a VM from it, and it failed the media test. Sure enough, it also failed the sha256sum test. Copying the same DVD iso file via scp instead worked fine. I do not trust NFS for large files.
>>
>> --Greg
>
> Hi Greg
>
> That's interesting and very worrying - surely it can't/shouldn't happen!
> I have been using NFS for years for all types/sizes of files and never had a problem until the last couple of months.
>
> 1. The Centos/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the latest kernel update
> 2. Now this F12 problem
>
> Surely a very large worldwide community uses NFS ?
>
> OK the F12 case could be my finger trouble or even a hardware problem
> I will install F12 on a second machine and test again (against the same server)

Can you verify that you run into the same issue if you run NFS over TCP as opposed to NFS over UDP (it's an option in the mount command on the client, use either proto=tcp or proto=udp). By default, the system queries the server and selects a protocol based on what's being asked of it. See the TRANSPORT METHODS section of man nfs.

--
Rick Stevens, Systems Engineer  ri...@nerd.com
AIM/Skype: therps2  ICQ: 22643734  Yahoo: origrps2
The Theory of Rapitivity: E=MC Hammer
  -- Glenn Marcus (via TopFive.com)
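The proto= suggestion can be tried with a one-off manual mount rather than by editing the automounter maps. The sketch below only prints the command so it can be reviewed before being run as root; the function name and the /mnt/nfstest mount point are hypothetical, while the export path is the one that appears elsewhere in this thread.

```sh
#!/bin/sh
# Sketch only: print (not run) a mount command that pins the NFS transport.
# Usage: nfs_mount_cmd PROTO SERVER:/EXPORT MOUNTPOINT
nfs_mount_cmd() {
    case $1 in
        tcp|udp) ;;
        *) echo "proto must be tcp or udp" >&2; return 1 ;;
    esac
    echo "mount -t nfs -o proto=$1 $2 $3"
}
```

For instance, `nfs_mount_cmd udp 148.197.29.5:/exports/global /mnt/nfstest` prints a UDP mount command, and swapping in tcp gives the other case. Unmount between tests so the second mount does not silently reuse the first transport.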
Re: F12 NFS Failures
--- On Sat, 11/21/09, John Austin j...@jaa.org.uk wrote:

> From: John Austin j...@jaa.org.uk
> Subject: F12 NFS Failures
> To: fedora-list@redhat.com
> Date: Saturday, November 21, 2009, 2:09 AM
>
> Hi
>
> I have just completed a clean install of F12 and subsequent yum update on a client machine. NFS was used for the install - no problems !!
>
> I am using a fully updated Centos 5.4 nfs server
>
> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs. No mouse, keyboard, ssh login. Only hitting the Reset button gets it back.
>
> F12 is installed on the only disk on the machine which has several ext3 partitions. A fully updated F11 is on one of the partitions
>
> I have tried
>
> 1. Changing from NFS4 to NFS3 - Still locks up
> 2. scp the same file from the server to F12 no problem
> 3. md5sum on the file across the nfs mount - a read only? - F12 freezes
> 4. Booting the F11 partition and copying the same file - no problems
> 5. Tried playing with Defaultvers=4 in /etc/nfsmount.conf - still locks
>
> I have googled but not found anything useful so far
>
> My understanding is that NFS code is in the kernel - is that correct?
>
> Has anyone seen this or has any ideas about the next move

1) before doing anything, check the status of NFS, i.e,
# service NFS status
2), NFS is failing because something is not letting it run correctly. I saw it in testing Fedora 12 rawhide days, on messages(bootup), so it could be that the service is not running? and something is stopping it from working properly?

> Regards
> John
> --

Regards,
Antonio
Re: F12 NFS Failures
On Sat, 2009-11-21 at 06:33 -0800, Antonio Olivares wrote:
> --- On Sat, 11/21/09, John Austin j...@jaa.org.uk wrote:
>
>> From: John Austin j...@jaa.org.uk
>> Subject: F12 NFS Failures
>> To: fedora-list@redhat.com
>> Date: Saturday, November 21, 2009, 2:09 AM
>>
>> Hi
>>
>> I have just completed a clean install of F12 and subsequent yum update on a client machine. NFS was used for the install - no problems !!
>>
>> I am using a fully updated Centos 5.4 nfs server
>>
>> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs. No mouse, keyboard, ssh login. Only hitting the Reset button gets it back.
>>
>> F12 is installed on the only disk on the machine which has several ext3 partitions. A fully updated F11 is on one of the partitions
>>
>> I have tried
>>
>> 1. Changing from NFS4 to NFS3 - Still locks up
>> 2. scp the same file from the server to F12 no problem
>> 3. md5sum on the file across the nfs mount - a read only? - F12 freezes
>> 4. Booting the F11 partition and copying the same file - no problems
>> 5. Tried playing with Defaultvers=4 in /etc/nfsmount.conf - still locks
>>
>> I have googled but not found anything useful so far
>>
>> My understanding is that NFS code is in the kernel - is that correct?
>>
>> Has anyone seen this or has any ideas about the next move
>
> 1) before doing anything, check the status of NFS, i.e,
> # service NFS status
> 2), NFS is failing because something is not letting it run correctly. I saw it in testing Fedora 12 rawhide days, on messages(bootup), so it could be that the service is not running? and something is stopping it from working properly?
>
> Regards,
> Antonio

Hi Antonio

Thanks for the reply

NFS is definitely running to some extent as home directories are mounted OK and my global directory is also mounted OK.

The client only seems to fail during a large/long transfer

The autofs (NIS exported) files of interest are

maui.jaa.org.uk ~ 1# cat /etc/auto.home
#* -fstype=nfs 148.197.29.5:/exports/home/
* -fstype=nfs4,rsize=32768,wsize=32768 148.197.29.5:/home/

maui.jaa.org.uk ~ 2# cat /etc/auto.direct
#/global -fstype=nfs 148.197.29.5:/exports/global
/global -fstype=nfs4,rsize=32768,wsize=32768 148.197.29.5:/global

The client locks up with no indication of a problem in /var/log/messages after a restart

The server shows

[r...@maui ~]# cat /var/log/messages |grep nfs
...
Nov 20 16:25:48 maui kernel: nfs4_cb: server 148.197.29.252 not responding, timed out
...

The client falls over at random times during a transfer and leaves a partially copied file when using cp

I did wonder whether it was something to do with FS-Cache but as far as I can see nfs is not using it.

dmesg includes

FS-Cache: Loaded
FS-Cache: Netfs 'nfs' registered for caching

but this shows no activity

cat /proc/fs/fscache/stats

John
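The log search shown in this message can be narrowed to just the timeout lines, which makes it easier to line them up against the time of each freeze. A small sketch; the function name is invented and the default log path is the one used above.

```sh
#!/bin/sh
# Sketch only: pull NFS "not responding" timeouts out of a syslog file.
# Usage: nfs_timeouts [LOGFILE]   (default /var/log/messages)
nfs_timeouts() {
    # grep reads the file directly, so no separate cat is needed.
    grep -i 'nfs.*not responding' "${1:-/var/log/messages}"
}
```

Run on the server, this should list entries such as the nfs4_cb line quoted above. Comparing their timestamps with the moment the client locked up may show whether the server noticed the client disappear or the other way round.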
Re: F12 NFS Failures
On Sat, 2009-11-21 at 10:09 +, John Austin wrote:
> When copying a large file (2.7GB) from the server to the F12 m/c a complete freeze of the F12 machine occurs.

I haven't seen freezes, but I have seen corruption when trying to copy large files (e.g. like a DVD iso image) via NFS. In fact, this happened to me when I was trying to install an F12 virtual machine on my F11 box (so I could try it out before deciding whether or not to bite the bullet and upgrade the host OS). I copied over the DVD iso image, then tried to install a VM from it, and it failed the media test. Sure enough, it also failed the sha256sum test. Copying the same DVD iso file via scp instead worked fine. I do not trust NFS for large files.

--Greg
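The sha256sum test mentioned in this message can be run against a checksum recorded at the source, so that a copy made over NFS and a copy made with scp are judged against the same reference. A minimal sketch; the function name and file names are assumptions for the example.

```sh
#!/bin/sh
# Sketch only: verify a copied file against a checksum recorded at the source.
# Usage: verify_copy CHECKSUM_FILE
# CHECKSUM_FILE is in the format that `sha256sum FILE > CHECKSUM_FILE` writes
# and must sit in the same directory as the file it describes.
verify_copy() {
    # sha256sum -c resolves file names relative to the current directory,
    # so change into the checksum file's directory first (in a subshell).
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1")" )
}
```

On the server, `sha256sum image.iso > image.iso.sha256` records the reference; after copying both files to the client, `verify_copy /path/to/image.iso.sha256` prints OK or FAILED and returns non-zero on a mismatch. The name image.iso is a placeholder.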