Re: NFS mount lockups since about a month ago

2021-10-30 Thread Terry Barnaby
Since some Fedora33 update in the last couple of weeks the problem has gone away. I haven't changed anything as far as I am aware. One change is that the kernel moved from 5.13.x to 5.14.x ... Terry On 21/10/2021 23:36, Reon Beon via users wrote: https://release-monitoring.org/project/2081/ We

Re: NFS mount lockups since about a month ago

2021-10-21 Thread Reon Beon via users
https://release-monitoring.org/project/2081/ Well it is a pre-release version. 2.5.5.rc3

Re: NFS mount lockups since about a month ago

2021-10-06 Thread Terry Barnaby
Hi Roger, Thanks for looking. I will try NFS v3 with my latency tests running. I did try NFS v3 before and I "think" there were still desktop lockups but for a much shorter time. But this is just a feeling. Current kernel on both systems is: 5.13.19-100.fc33.x86_64. If I find the time, I will

Re: NFS mount lockups since about a month ago

2021-10-05 Thread Roger Heflin
That network looks fine to me. I would try v3. I have had bad luck many times with v4 on a variety of different kernels. If the code is recovering from something related to a bug, 45 seconds might be about right for it to decide that something that was working is no longer working. I am not sure any amount of debu

Re: NFS mount lockups since about a month ago

2021-10-05 Thread Terry Barnaby
sar -n EDEV reports all 0's all around then. There are some rxdrop/s of 0.02 occasionally on eno1 through the day (about 20 of these with minute based sampling). Today ifconfig lists 39 dropped RX packets out of 2357593. Not sure why there are some dropped packets. "ethtool -S eno1" doesn't seem

Re: NFS mount lockups since about a month ago

2021-10-04 Thread Roger Heflin
Since it is recovering from it, maybe it is losing packets inside the network. What do "sar -n DEV" and "sar -n EDEV" look like during that time on both the client seeing the pause and the server? EDEV is typically all zeros unless something is lost. If something is being lost and it matches the ti

Re: NFS mount lockups since about a month ago

2021-10-04 Thread Terry Barnaby
and iostats: 04/10/21 10:51:14 avg-cpu:  %user   %nice %system %iowait  %steal   %idle   2.09    0.00    1.56    0.02    0.00   96.33 Device    r/s rkB/s   rrqm/s  %rrqm r_await rareq-sz w/s wkB/s   wrqm/s  %wrqm w_await wareq-sz d/s dkB/s   drqm/s  %drqm d

Re: NFS mount lockups since about a month ago

2021-10-04 Thread Terry Barnaby
My disklatencytest showed a longish (14 secs) NFS file system directory/stat lookup again today on a desktop: 2021-10-04T05:26:19 0.069486 0.069486 0.000570 /home/... 2021-10-04T05:28:19 0.269743 0.538000 0.001019 /home/... 2021-10-04T09:48:00 1.492158 0.003314

Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby
On 04/10/2021 00:51, Roger Heflin wrote: With 10 minute samples anything that happened gets averaged enough that even the worst event is almost impossible to see. Sar will report the same as date, i.e. local time. And a 12:51 event would be in the 13:00 sample (started at about 12:50 and ended a

Re: NFS mount lockups since about a month ago

2021-10-03 Thread Roger Heflin
With 10 minute samples anything that happened gets averaged enough that even the worst event is almost impossible to see. Sar will report the same as date, i.e. local time. And a 12:51 event would be in the 13:00 sample (started at about 12:50 and ended at 13:00). What I do see is that during that w
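The bucketing described above can be sketched in a few lines of Python. This is only an illustration: the helper name `sar_sample_for` is made up here, and it assumes samples are aligned to round clock boundaries and labelled by their end time (real sar samples typically land a second or so after the boundary).

```python
from datetime import datetime, timedelta


def sar_sample_for(event, interval_minutes=10):
    """Return (start, end) of the fixed-length sample window containing `event`.

    Assumption: windows are aligned to midnight in local time, and a sample
    is reported under the timestamp at the END of its window.
    """
    midnight = event.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(minutes=interval_minutes)
    index = (event - midnight) // step  # number of whole windows before the event
    start = midnight + index * step
    return start, start + step


# The 12:51:02 local-time event from the thread, with 10 minute and 1 minute sampling.
event = datetime(2021, 10, 2, 12, 51, 2)
print(sar_sample_for(event, 10))
print(sar_sample_for(event, 1))
```

With 10 minute sampling the event falls in the window reported at 13:00; with 1 minute sampling it falls in the window reported at 12:52, which is why the shorter interval makes a short stall much easier to locate.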

Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby
45 second event happened at: 2021-10-02T11:51:02 UTC. Not sure what sar time is based on (maybe local time, BST, rather than UTC, so it would be 2021-10-02T12:51:02 BST). Continuing info ... sar -n NFSD on the server 11:00:01    24.16  0.00 24.16  0.00 24.16  0.00  0.00

Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby
45 second event happened at: 2021-10-02T11:51:02 UTC. Not sure what sar time is based on (maybe local time, BST, rather than UTC, so it would be 2021-10-02T12:51:02 BST). "sar -d" on the server: 11:50:02   dev8-0  4.67  0.01 46.62 0.00  9.99  0.12 14.03  5.75 11:50:0

Re: NFS mount lockups since about a month ago

2021-10-02 Thread Roger Heflin
You might retest with nfsv3, the code handling v3 should be significantly different since v3 is stateless and does not maintain long-term connections. And if the long-term connection had some sort of issue then 45 seconds may be how long it takes to figure that out and re-initiate the connection.

Re: NFS mount lockups since about a month ago

2021-10-02 Thread Roger Heflin
What did the sar -d look like for the 2 minutes before and 2 minutes afterward? If it is slow or not may depend on if the directory/file fell out of cache and had to be reread from the disk. I have also seen really large dirs take a really long time to find, but typically that takes thousands of

Re: NFS mount lockups since about a month ago

2021-10-02 Thread Terry Barnaby
I am getting more sure this is an NFS/networking issue rather than an issue with disks in the server. I created a small test program that given a directory finds a random file in a random directory three levels below, opens it and reads up to a block (512 Bytes) of data from it and times how l
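A latency probe of the kind described could look roughly like the Python sketch below. It is not the author's program: the function names, the split into separate lookup and read timings, and the handling of dead ends are all assumptions made for illustration.

```python
import os
import random
import time


def pick_random_file(root, depth=3):
    """Descend `depth` random directory levels below `root`, then pick a random file.

    Returns None if some level has no subdirectories or the final one has no files.
    """
    current = root
    for _ in range(depth):
        with os.scandir(current) as entries:
            subdirs = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
        if not subdirs:
            return None
        current = random.choice(subdirs)
    with os.scandir(current) as entries:
        files = [e.path for e in entries if e.is_file(follow_symlinks=False)]
    return random.choice(files) if files else None


def probe(root, block_size=512):
    """Time the directory lookup and the first-block read separately.

    Returns (path, lookup_seconds, read_seconds); path and read_seconds are
    None when no file was found.
    """
    start = time.monotonic()
    path = pick_random_file(root)
    lookup_seconds = time.monotonic() - start
    if path is None:
        return None, lookup_seconds, None
    start = time.monotonic()
    with open(path, "rb") as handle:
        handle.read(block_size)  # read up to one 512 byte block
    read_seconds = time.monotonic() - start
    return path, lookup_seconds, read_seconds
```

Calling `probe("/home")` in a loop with a sleep between iterations, and logging a timestamp with each result, would give a record that can be lined up against sar output. Note that client-side caching can hide server latency for files that were read recently.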

Re: NFS mount lockups since about a month ago

2021-10-01 Thread George N. White III
On Fri, 1 Oct 2021 at 16:20, Terry Barnaby wrote: > > Thanks for the info, I am using MDraid. There are no "mddevice" messages > in /var/log/messages and smartctl -a lists no errors on any of the > disks. The disks are about 3 years old, I change them in servers between > 3 and 4 years old. > Wh

Re: NFS mount lockups since about a month ago

2021-10-01 Thread Roger Heflin
You need to replace mddevice with the name of your md device, probably md0. 3-5 years is about when they start to go. I have 2-3TB wd-reds sitting on the floor because their correctable/offline uncorr kept happening and blipping my storage (a few second pause). I even removed the disks from the

Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby
On 01/10/2021 19:05, Roger Heflin wrote: it will show latency. await is average iotime in ms, and %util is calculated from await and iops/sec. So long as you turn sar down to 1 minute samples it should tell you which of the 2 disks had higher await/util%. With a 10 minute sample the 40sec p

Re: NFS mount lockups since about a month ago

2021-10-01 Thread Roger Heflin
it will show latency. await is average iotime in ms, and %util is calculated from await and iops/sec. So long as you turn sar down to 1 minute samples it should tell you which of the 2 disks had higher await/util%. With a 10 minute sample the 40sec pause may get spread out across enough iops
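To see why the sampling interval matters, here is a rough back-of-the-envelope calculation in Python. The workload numbers (5 I/Os per second, 10 ms normal service time) are invented for illustration, and the model is deliberately simplistic: exactly one request stalls for 40 seconds while every other request completes at the normal speed.

```python
def average_await_ms(window_s, iops, normal_ms, stalled_ms):
    """Average per-I/O time over a window in which exactly one I/O stalled.

    Simplified model: all other I/Os in the window take `normal_ms`.
    """
    total_ios = int(window_s * iops)
    return ((total_ios - 1) * normal_ms + stalled_ms) / total_ios


# Hypothetical disk: 5 I/Os per second at 10 ms each, with one 40 second stall.
one_minute = average_await_ms(60, 5, 10.0, 40_000.0)
ten_minutes = average_await_ms(600, 5, 10.0, 40_000.0)
print(f"1 minute sample:  {one_minute:.1f} ms")   # 143.3 ms
print(f"10 minute sample: {ten_minutes:.1f} ms")  # 23.3 ms
```

Under these assumptions the stall raises the 1 minute average to about 14 times the baseline, but the 10 minute average to only a little over twice the baseline, which is easy to miss on a busier disk.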

Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby
On 01/10/2021 13:31, D. Hugh Redelmeier wrote: Trivial thoughts from reading this thread. Please don't take the triviality as an insult. Perhaps the best way to determine if the problem is from a software update is to downgrade likely packages. In the case of the kernel, you can just boot an o

Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby
On 30/09/2021 19:27, Roger Heflin wrote: Raid0, so there is no redundancy on the data? And what kind of underlying hard disks? The desktop drives will try for a long time (i.e. a minute or more) to read any bad blocks. Those disks will not report an error unless it gets to the default OS timeou

Re: NFS mount lockups since about a month ago

2021-10-01 Thread D. Hugh Redelmeier
Trivial thoughts from reading this thread. Please don't take the triviality as an insult. Perhaps the best way to determine if the problem is from a software update is to downgrade likely packages. In the case of the kernel, you can just boot an older one (assuming that an old enough one is s

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Roger Heflin
Raid0, so there is no redundancy on the data? And what kind of underlying hard disks? The desktop drives will try for a long time (i.e. a minute or more) to read any bad blocks. Those disks will not report an error unless it gets to the default OS timeout, or it hits the disk firmware timeout. T

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Tom Horsley
On Thu, 30 Sep 2021 17:50:01 +0100 Terry Barnaby wrote: > Yes, problems often occur due to you having done something, but I am > pretty sure nothing has changed apart from Fedora updates. But hardware is sneaky. It waits for you to install software updates, then breaks itself to make you think th

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby
On 30/09/2021 11:42, Roger Heflin wrote: On mine when I first access the NFS volume it takes 5-10 seconds for the disks to spin up.  Mine will spin down later in the day if little or nothing is going on and I will get another delay. I have also seen delays if a disk gets bad blocks and correct

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby
On 30/09/2021 11:32, Ed Greshko wrote: On 30/09/2021 16:35, Terry Barnaby wrote: This is a very lightly loaded system with just 3 users ATM and very little going on across the network (just editing code files etc). The problem occurred again yesterday. For about 10 minutes my KDE desktop locke

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Roger Heflin
On mine when I first access the NFS volume it takes 5-10 seconds for the disks to spin up. Mine will spin down later in the day if little or nothing is going on and I will get another delay. I have also seen delays if a disk gets bad blocks and corrects them. About 1/2 of the time that does have a m

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Ed Greshko
On 30/09/2021 16:35, Terry Barnaby wrote: This is a very lightly loaded system with just 3 users ATM and very little going on across the network (just editing code files etc). The problem occurred again yesterday. For about 10 minutes my KDE desktop locked up in 20 second bursts and then the pr

Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby
Thanks for the feedback everyone. This is a very lightly loaded system with just 3 users ATM and very little going on across the network (just editing code files etc). The problem occurred again yesterday. For about 10 minutes my KDE desktop locked up in 20 second bursts and then the problem w

Re: NFS mount lockups since about a month ago

2021-09-26 Thread Roger Heflin
Make sure you have sar/sysstat enabled and changed to do 1 minute samples. sar -d will show disk perf. If one of the disks "blips" at the firmware level (working on a hard to read block maybe), the util% on that device will be significantly higher than all other disks so will stand out. Then you

Re: NFS mount lockups since about a month ago

2021-09-26 Thread Jamie Fargen
Are there network switches under your control? It sounds similar to what happens when the MTUs on the systems do not match, or one system's MTU is set above the value on the switch ports. Next time the issue occurs, use ping with the do-not-fragment flag, e.g. $ ping -M do -s 8972 ip.address This exampl
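The 8972 in that ping command is the largest ICMP echo payload that fits in a single unfragmented IPv4 packet on a 9000 byte (jumbo frame) MTU. A small Python sketch of the arithmetic follows; the function name is made up for illustration, and it assumes IPv4 with no IP options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu):
    """Largest `ping -s` payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_ping_payload(9000))  # 8972, for jumbo frames
print(max_ping_payload(1500))  # 1472, for standard Ethernet
```

If every device on the path really is configured for the MTU being tested, a ping with that payload and fragmentation prohibited should succeed, while a payload one byte larger should fail. On a standard 1500 byte network the equivalent test size is 1472.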

Re: NFS mount lockups since about a month ago

2021-09-26 Thread Tom Horsley
On Sun, 26 Sep 2021 10:26:19 -0300 George N. White III wrote: > If you have cron jobs that use a lot of network bandwidth it may work > fine until some network issue causing lots of retransmits bogs it down. Which is why you should check the dumb stuff first! Has a critter chewed on the ethernet

Re: NFS mount lockups since about a month ago

2021-09-26 Thread George N. White III
On Sun, 26 Sept 2021 at 01:44, Tim via users wrote: > On Sat, 2021-09-25 at 06:04 +0100, Terry Barnaby wrote: > > in the last month or so all of the client computers are getting KDE > > GUI lockups every few hours that last for around 40 secs. > > Might one of them have a cron job that's scouring

Re: NFS mount lockups since about a month ago

2021-09-25 Thread Tim via users
On Sat, 2021-09-25 at 06:04 +0100, Terry Barnaby wrote: > in the last month or so all of the client computers are getting KDE > GUI lockups every few hours that last for around 40 secs. Might one of them have a cron job that's scouring the network? e.g. locate databasing

Re: NFS mount lockups since about a month ago

2021-09-25 Thread George N. White III
On Sat, 25 Sept 2021 at 02:04, Terry Barnaby wrote: > Hi, > > I use NFS mount (defaults so V4) /home directories with a simple server > over Gigabit Ethernet all running Fedora33. This has been working fine > for 25+ years through various Fedora versions. However in the last month > or so all of

Re: NFS mount lockups since about a month ago

2021-09-25 Thread Terry Barnaby
On 25/09/2021 09:00, Ed Greshko wrote: On 25/09/2021 14:07, Terry Barnaby wrote: A few questions. 1.  Are you saying your NFS server HW has been the same for the past 25 years?  Couldn't have been all Fedora, right? No ( :) ) was using previous Linux and Unix systems before then. Certainly OS v

Re: NFS mount lockups since about a month ago

2021-09-25 Thread Ed Greshko
On 25/09/2021 14:07, Terry Barnaby wrote: A few questions. 1.  Are you saying your NFS server HW has been the same for the past 25 years?   Couldn't have been all Fedora, right? No ( :) ) was using previous Linux and Unix systems before then. Certainly OS versions and hardware have changed over th

Re: NFS mount lockups since about a month ago

2021-09-24 Thread Terry Barnaby
On 25/09/2021 06:42, Ed Greshko wrote: On 25/09/2021 13:04, Terry Barnaby wrote: Hi, I use NFS mount (defaults so V4) /home directories with a simple server over Gigabit Ethernet all running Fedora33. This has been working fine for 25+ years through various Fedora versions. However in the la

Re: NFS mount lockups since about a month ago

2021-09-24 Thread Ed Greshko
On 25/09/2021 13:04, Terry Barnaby wrote: Hi, I use NFS mount (defaults so V4) /home directories with a simple server over Gigabit Ethernet all running Fedora33. This has been working fine for 25+ years through various Fedora versions. However in the last month or so all of the client computer

NFS mount lockups since about a month ago

2021-09-24 Thread Terry Barnaby
Hi, I use NFS-mounted (defaults, so v4) /home directories with a simple server over Gigabit Ethernet, all running Fedora33. This has been working fine for 25+ years through various Fedora versions. However in the last month or so all of the client computers are getting KDE GUI lockups every few h