[gentoo-user] Re: CIFS mounts started misbehaving

2017-07-06 Thread Grant Edwards
On 2017-03-06, Grant Edwards  wrote:
> On 2017-03-03, Grant Edwards  wrote:
>
>> For the past 10-15 [years], I've been mounting a handful of
>> directories that reside on a Windows server, and it's always worked
>> fine.
>>
>> About a week ago, they started acting oddly.  They all mount fine,
>> and work as usual as long as you keep using them.  AFAICT, if they
>> sit idle for "a while" (tens of minutes, maybe an hour), they
>> freeze up.
[...]
> It's a kernel 4.9 problem.
[...]
> Rebooting with the 4.4.39 kernel fixes the problem.

FWIW, I've been running 4.9.34 since yesterday, and all my CIFS mounts
seem stable.

-- 
Grant Edwards               grant.b.edwards        Yow! TONY RANDALL!  Is YOUR
                                  at               life a PATIO of FUN??
                              gmail.com




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-07 Thread Grant Edwards
On 2017-03-07, Marc Joliet  wrote:
> On Tuesday, 7 March 2017 15:19:33 CET Grant Edwards wrote:
>> No, as a rule I run stable gentoo-sources, and that's at 4.9.6-r1.
>
> Ah, of course.  I'm using ~arch kernels ATM.  (As a btrfs user I was
> tracking the most recent upstream stable series, but want to switch to
> LTS kernels now, which happen to be the ones Gentoo stabilizes.  That
> is, unless a newer kernel has something that I really, *really* want.)

If I get bored, I may try latest 4.9, 4.10, 4.11 on my "test" machine
and see how it acts there.

I did some googling and didn't find anything that looked like this
problem being reported anywhere -- which makes me wonder what
particular combination of ancient Windows server (IIRC, it's 2008) and
odd configuration is causing it.  I have a hard time believing that the
CIFS client in 4.9 kernels can be doing this for very many people...

-- 
Grant Edwards               grant.b.edwards        Yow! Uh-oh!!  I forgot
                                  at               to submit to COMPULSORY
                              gmail.com            URINALYSIS!




Re: [gentoo-user] Re: CIFS mounts started misbehaving

2017-03-07 Thread Marc Joliet
On Tuesday, 7 March 2017 15:19:33 CET Grant Edwards wrote:
> No, as a rule I run stable gentoo-sources, and that's at 4.9.6-r1.

Ah, of course.  I'm using ~arch kernels ATM.  (As a btrfs user I was tracking 
the most recent upstream stable series, but want to switch to LTS kernels now, 
which happen to be the ones Gentoo stabilizes.  That is, unless a newer kernel 
has something that I really, *really* want.)

> However, I'm a bit confused about the table shown at
> 
>   https://packages.gentoo.org/packages/sys-kernel/gentoo-sources
> 
> There are two rows for some versions (e.g. 4.9.6-r1), with different
> indicators.  What does that mean?

AFAIK that's a (known?) bug.

-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-07 Thread Grant Edwards
On 2017-03-07, Marc Joliet  wrote:

>> It's a kernel 4.9 problem.
>> 
>> I had built and installed a gentoo-sources 4.9.6-r1 kernel about a
>> month ago, but didn't update the grub configuration and reboot until
>> two weeks ago.
>> 
>> Rebooting with the 4.4.39 kernel fixes the problem.
[...]
>> I guess I'll have to stick with the 4.4 series until this gets fixed.
>
> I'm glad you found the source of the problem and a workaround.
> However, the 4.9 series is now at 4.9.13.  Have you tried that, too?

No, as a rule I run stable gentoo-sources, and that's at 4.9.6-r1.

However, I'm a bit confused about the table shown at

  https://packages.gentoo.org/packages/sys-kernel/gentoo-sources

There are two rows for some versions (e.g. 4.9.6-r1), with different
indicators.  What does that mean?

-- 
Grant Edwards               grant.b.edwards        Yow! for ARTIFICIAL
                                  at               FLAVORING!!
                              gmail.com




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread Kai Krakow
On Mon, 06 Mar 2017 19:01:57 +
"J. Roeleveld" wrote:

> On March 6, 2017 5:14:39 PM GMT+01:00, Grant Edwards
>  wrote:
> >On 2017-03-06, Kai Krakow  wrote:
> > [...]
> >>
> >> Did something on the Windows side change?  
> >
> > >Probably, but I've learned not to ask questions like that.  They never
> >get answered, and it just causes problems when it is revealed that
> >the client having problems is a Linux machine.
> >  
> >> Maybe force Windows down to a lower SMB version or reduce/disable
> >> SMB client side caching?  
> 
> Windows sharing is designed as a 'link when used' option. Not as a
> permanent mount like Linux treats it.
> 
> Even 'mounting' in Windows doesn't mean the share is actually
> accessed.
> 
> A Windows CIFS server will not be reliable enough for long term
> mounting. With Samba, it does work more reliably. (In my experience)
> 
> For this reason, I use KDE/Dolphin to access CIFS shares. It is
> closer to how Windows expects the shares to be treated.

Then it may help to use automount with a somewhat low timeout, and maybe
also set up cachefilesd and mount with the fsc option. This is how I use
my office shares on a 2012 R2 server via VPN.
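
A minimal sketch of what such a setup could look like, assuming autofs is the automounter (the message only says "automount"). The Python helpers below just render the two configuration lines; the mount root, map file name, server, share and credentials path are all hypothetical placeholders, and the `fsc` option is assumed to need a kernel built with CIFS FS-Cache support plus a running cachefilesd.

```python
def autofs_master_line(mount_root: str, map_file: str, timeout_s: int = 60) -> str:
    """Render an auto.master entry; --timeout is the idle time before unmounting."""
    return f"{mount_root} {map_file} --timeout={timeout_s}"


def autofs_map_line(key: str, server: str, share: str, credentials: str) -> str:
    """Render one autofs map entry for a CIFS share with local caching (fsc)."""
    options = ["fstype=cifs", "fsc", f"credentials={credentials}"]
    return f"{key} -{','.join(options)} ://{server}/{share}"


# Hypothetical names, for illustration only.
print(autofs_master_line("/mnt/win", "/etc/autofs/auto.win", timeout_s=60))
print(autofs_map_line("docs", "winserver", "docs", "/etc/samba/cred.win"))
```

With a short idle timeout the share is unmounted long before it has sat idle for tens of minutes, so each access starts from a fresh session.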

-- 
Regards,
Kai

Replies to list-only preferred.




Re: [gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread Marc Joliet
On Tuesday, 7 March 2017 00:12:06 CET Grant Edwards wrote:
> On 2017-03-03, Grant Edwards  wrote:
> > For the past 10-15 [years], I've been mounting a handful of
> > directories that reside on a Windows server, and it's always worked
> > fine.
> > 
> > About a week ago, they started acting oddly.  They all mount fine, and
> > work as usual as long as you keep using them.  AFAICT, if they sit
> > idle for "a while" (tens of minutes, maybe an hour), they freeze up.
> 
> It finally dawned on me that I had changed something.
> 
> It's a kernel 4.9 problem.
> 
> I had built and installed a gentoo-sources 4.9.6-r1 kernel about a
> month ago, but didn't update the grub configuration and reboot until
> two weeks ago.
> 
> Rebooting with the 4.4.39 kernel fixes the problem.
> 
> [I also tried just rebooting the 4.9.4 kernel, but that didn't help.]
> 
> The configuration of the 4.9.4 kernel is as close to that of the
> 4.4.39 as I can get.
> 
> I guess I'll have to stick with the 4.4 series until this gets fixed.

I'm glad you found the source of the problem and a workaround.  However, the 
4.9 series is now at 4.9.13.  Have you tried that, too?

HTH
-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread Grant Edwards
On 2017-03-03, Grant Edwards  wrote:

> For the past 10-15 [years], I've been mounting a handful of
> directories that reside on a Windows server, and it's always worked
> fine.
>
> About a week ago, they started acting oddly.  They all mount fine, and
> work as usual as long as you keep using them.  AFAICT, if they sit
> idle for "a while" (tens of minutes, maybe an hour), they freeze up.

It finally dawned on me that I had changed something.

It's a kernel 4.9 problem.

I had built and installed a gentoo-sources 4.9.6-r1 kernel about a
month ago, but didn't update the grub configuration and reboot until
two weeks ago.

Rebooting with the 4.4.39 kernel fixes the problem.

[I also tried just rebooting the 4.9.4 kernel, but that didn't help.]

The configuration of the 4.9.4 kernel is as close to that of the
4.4.39 as I can get.

I guess I'll have to stick with the 4.4 series until this gets fixed.

-- 
Grant Edwards               grant.b.edwards        Yow! Now we can become
                                  at               alcoholics!
                              gmail.com




Re: [gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread J. Roeleveld
On March 6, 2017 8:17:37 PM GMT+01:00, Grant Edwards 
 wrote:
>On 2017-03-06, J. Roeleveld  wrote:
>> On March 6, 2017 5:14:39 PM GMT+01:00, Grant Edwards wrote:
>>>On 2017-03-06, Kai Krakow  wrote:
>>>
>>>>> I'm going to try to set up a Wireshark capture in ring-buffer mode
>>>>> and somehow detect the failure and stop the capture...
>>>>
>>>> Did something on the Windows side change?
>>>
>>>Probably, but I've learned not to ask questions like that.  They never
>>>get answered, and it just causes problems when it is revealed that the
>>>client having problems is a Linux machine.
>>>
>>>> Maybe force Windows down to a lower SMB version or reduce/disable
>>>> SMB client side caching?
>>
>> Windows sharing is designed as a 'link when used' option. Not as a
>> permanent mount like Linux treats it.
>>
>> Even 'mounting' in Windows doesn't mean the share is actually
>> accessed.
>>
>> A Windows CIFS server will not be reliable enough for long term
>> mounting. With Samba, it does work more reliably. (In my experience)
>
>It's worked perfectly fine for 10+ years, and apparently continues to
>do so for other Linux users in the office.

And trying to troubleshoot it is not simple, especially as the MS Windows Event
Viewer never shows anything remotely useful. (I tried to troubleshoot various
issues and never got anything useful from the Windows admins or Event Viewer.)
How do the other Linux users access the shares?

>> For this reason, I use KDE/Dolphin to access CIFS shares. It is
>> closer to how Windows expects the shares to be treated.
>
>I don't see how things like shell scripts or other applications that
>need to access files on the CIFS mounts would use something like that.

Did you test if a small script that touches a file on the share every minute 
resolves the issue?
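
A minimal sketch of such a keepalive script in Python. The marker file name and the function names are made up for illustration, and whether regular writes actually prevent the freeze is exactly what the test would have to show.

```python
import time
from pathlib import Path
from typing import Optional


def touch_once(mount_point: str, name: str = ".cifs-keepalive") -> float:
    """Create or update a marker file on the share and return its mtime.

    Note: if the mount is already frozen, this call may block or raise OSError.
    """
    marker = Path(mount_point) / name
    marker.touch(exist_ok=True)
    return marker.stat().st_mtime


def keepalive(mount_point: str, interval: float = 60.0,
              rounds: Optional[int] = None) -> int:
    """Touch the marker every `interval` seconds.

    rounds=None loops until interrupted; a number stops after that many
    touches. Returns the number of touches performed.
    """
    done = 0
    while rounds is None or done < rounds:
        touch_once(mount_point)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
    return done
```

Calling `keepalive("/mnt/share")` (a hypothetical mount point) from a terminal or a service would generate one write per minute, which is roughly the traffic a user actively working on the share produces.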

--
Joost


-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread Grant Edwards
On 2017-03-06, J. Roeleveld  wrote:
> On March 6, 2017 5:14:39 PM GMT+01:00, Grant Edwards 
>  wrote:
>>On 2017-03-06, Kai Krakow  wrote:
>>
>>>> I'm going to try to set up a Wireshark capture in ring-buffer mode
>>>> and somehow detect the failure and stop the capture...
>>>
>>> Did something on the Windows side change?
>>
>>Probably, but I've learned not to ask questions like that.  They never
>>get answered, and it just causes problems when it is revealed that the
>>client having problems is a Linux machine.
>>
>>> Maybe force Windows down to a lower SMB version or reduce/disable
>>> SMB client side caching?
>
> Windows sharing is designed as a 'link when used' option. Not as a
> permanent mount like Linux treats it.
>
> Even 'mounting' in Windows doesn't mean the share is actually
> accessed.
>
> A Windows CIFS server will not be reliable enough for long term
> mounting. With Samba, it does work more reliably. (In my experience)

It's worked perfectly fine for 10+ years, and apparently continues to
do so for other Linux users in the office.

> For this reason, I use KDE/Dolphin to access CIFS shares. It is
> closer to how Windows expects the shares to be treated.

I don't see how things like shell scripts or other applications that
need to access files on the CIFS mounts would use something like that.

-- 
Grant Edwards               grant.b.edwards        Yow! I think my career
                                  at               is ruined!
                              gmail.com




Re: [gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread J. Roeleveld
On March 6, 2017 5:14:39 PM GMT+01:00, Grant Edwards 
 wrote:
>On 2017-03-06, Kai Krakow  wrote:
>
>>> I'm going to try to set up a Wireshark capture in ring-buffer mode and
>>> somehow detect the failure and stop the capture...
>>
>> Did something on the Windows side change?
>
>Probably, but I've learned not to ask questions like that.  They never
>get answered, and it just causes problems when it is revealed that the
>client having problems is a Linux machine.
>
>> Maybe force Windows down to a lower SMB version or reduce/disable
>> SMB client side caching?

Windows sharing is designed as a 'link when used' option, not as a permanent 
mount like Linux treats it.

Even 'mounting' in Windows doesn't mean the share is actually accessed.

A Windows CIFS server will not be reliable enough for long-term mounting. With 
Samba, it does work more reliably. (In my experience)

For this reason, I use KDE/Dolphin to access CIFS shares. It is closer to how 
Windows expects the shares to be treated.

--
Joost
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-06 Thread Grant Edwards
On 2017-03-06, Kai Krakow  wrote:

>> I'm going to try to set up a Wireshark capture in ring-buffer mode and
>> somehow detect the failure and stop the capture...
>
> Did something on the Windows side change?

Probably, but I've learned not to ask questions like that.  They never
get answered, and it just causes problems when it is revealed that the
client having problems is a Linux machine.

> Maybe force Windows down to a lower SMB version or reduce/disable
> SMB client side caching?

-- 
Grant Edwards               grant.b.edwards        Yow! Like I always say
                                  at               -- nothing can beat
                              gmail.com            the BRATWURST here in
                                                   DUSSELDORF!!




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-05 Thread Kai Krakow
On Sat, 4 Mar 2017 16:42:07 + (UTC)
Grant Edwards wrote:

> On 2017-03-04, Kai Krakow  wrote:
> > On Sat, 04 Mar 2017 08:02:11 +, "J. Roeleveld" wrote:
> >> [...]
> >>
> >> Are other hosts linux or windows?  
> 
> Other Linux and Windows clients don't seem to be having this problem.
> 
> >> Maybe a dodgy switch forgetting the correct path?  
> 
> I don't think so.  I can ping the host while the CIFS subsystem says
> "host is down".  If the switch is forgetting the path, who's sending
> back the SYN/ACK and the RST?
> 
> > Or an MTU problem... Is there a router in the path?  
> 
> Nope.

The MTU idea was dumb anyways as you wrote that the problem occurs
after some idle time... Which could still be a router problem - but as
you wrote: no router. :-)

> I'm going to try to set up a Wireshark capture in ring-buffer mode and
> somehow detect the failure and stop the capture...

Did something on the Windows side change? Maybe force Windows down to a
lower SMB version or reduce/disable SMB client side caching?
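
On the Linux side, the closest equivalents are the `vers=` and `cache=` mount options of mount.cifs. A minimal sketch of building such a mount command in Python follows; the server, share, mount point and credentials path are hypothetical, and whether either option changes the behaviour seen in this thread is untested.

```python
from typing import List, Optional

# Values documented for mount.cifs; newer kernels accept more dialects.
KNOWN_VERS = {"1.0", "2.0", "2.1", "3.0"}
KNOWN_CACHE = {"none", "strict", "loose"}


def cifs_mount_argv(server: str, share: str, mount_point: str, vers: str,
                    cache: str = "none",
                    credentials: Optional[str] = None) -> List[str]:
    """Build a `mount -t cifs` command with an explicit dialect and cache mode."""
    if vers not in KNOWN_VERS:
        raise ValueError(f"unexpected SMB dialect: {vers}")
    if cache not in KNOWN_CACHE:
        raise ValueError(f"unexpected cache mode: {cache}")
    opts = [f"vers={vers}", f"cache={cache}"]
    if credentials:
        opts.append(f"credentials={credentials}")
    return ["mount", "-t", "cifs", f"//{server}/{share}", mount_point,
            "-o", ",".join(opts)]


# Hypothetical names, for illustration only.
print(" ".join(cifs_mount_argv("winserver", "docs", "/mnt/docs", vers="2.0")))
```

Trying one dialect at a time and noting which ones still freeze would at least narrow down whether the problem is tied to a particular protocol version.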


-- 
Regards,
Kai

Replies to list-only preferred.




[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-04 Thread Grant Edwards
On 2017-03-04, Kai Krakow  wrote:
> On Sat, 04 Mar 2017 08:02:11 +, "J. Roeleveld" wrote:
>
>>
>> >Normally, when things are working but idle, the TCP connection to 445
>> >shows an SMB echo request/response transaction once per minute.  When
>> >it fails, the TCP connection evidently got dropped, and the Windows
>> >machine repeatedly shuts down new ones:
>> >
>> >The failure mode looks like this in Wireshark:
>> >
>> >  Gentoo                          Windows
>> >
>> >  ->   SYN                  ->    445
>> >  <-   SYN/ACK              <-    445
>> >  ->   ACK                  ->    445
>> >  ->   SMB[echo req]        ->    445
>> >  <-   RST                  <-    445
>> >
>> >[that repeats 800 times per second for long periods of time]
>> >
>> >Then at some point, it starts to work:
>> >
>> >  ->   SYN                  ->    445
>> >  <-   SYN/ACK              <-    445
>> >  ->   ACK                  ->    445
>> >  ->   SMB[proto neg req]   ->    445
>> >  <-   SMB[proto neg rsp]   <-    445
>> >  ->   SMB[ses setup req]   ->    445
>> >  <-   SMB[ses setup rsp]   <-    445
>> >  ...
>>
>> >Sometimes the umount times out and "fails" because the "host is
>> >down", and when that happens, it seems like it immediately starts to
>> >work again. :/  
>> 
>> Are other hosts linux or windows?

Other Linux and Windows clients don't seem to be having this problem.

>> Maybe a dodgy switch forgetting the correct path?

I don't think so.  I can ping the host while the CIFS subsystem says
"host is down".  If the switch is forgetting the path, who's sending
back the SYN/ACK and the RST?
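
Ping only exercises ICMP. A TCP connect to port 445 checks the same handshake that shows up in the capture. A minimal Python sketch of such a probe is below; the function name is made up, and a successful connect only shows that the handshake completes, not that the SMB session on top of it is usable.

```python
import socket


def tcp_port_open(host: str, port: int = 445, timeout: float = 3.0) -> bool:
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If this returns True while the CIFS client reports "host is down", the path through the switch is fine and the resets are coming from something listening on the server itself.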

> Or an MTU problem... Is there a router in the path?

Nope.

I'm going to try to set up a Wireshark capture in ring-buffer mode and
somehow detect the failure and stop the capture...
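
One possible way to do that is sketched below in Python, assuming dumpcap (shipped with Wireshark) does the capturing. The `-b filesize:` and `-b files:` options give the ring buffer; the interface name, server address, output path and the `stat`-based health check are hypothetical, and it is untested whether a plain stat of the mount point reliably trips on the frozen state.

```python
import subprocess
import time
from typing import Callable, List


def ring_capture_cmd(interface: str, server_ip: str, out_file: str,
                     files: int = 10, filesize_kb: int = 10240) -> List[str]:
    """Build a dumpcap command line that writes a ring of capture files."""
    return [
        "dumpcap",
        "-i", interface,
        "-f", f"host {server_ip} and tcp port 445",  # capture filter: SMB only
        "-b", f"filesize:{filesize_kb}",  # start a new file after this many kB
        "-b", f"files:{files}",           # keep this many files, reuse the oldest
        "-w", out_file,
    ]


def share_is_healthy(mount_point: str, timeout_s: float = 10.0) -> bool:
    """Hypothetical health check: stat the mount point in a child process."""
    try:
        result = subprocess.run(["stat", mount_point], timeout=timeout_s,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def capture_until_failure(cmd: List[str], healthy: Callable[[], bool],
                          poll_s: float = 5.0) -> int:
    """Run `cmd` until `healthy()` returns False, then stop it.

    Returns the exit status of the capture process.
    """
    proc = subprocess.Popen(cmd)
    try:
        while healthy():
            time.sleep(poll_s)
    finally:
        proc.terminate()
        proc.wait()
    return proc.returncode
```

A call such as `capture_until_failure(ring_capture_cmd("eth0", "192.0.2.10", "/tmp/cifs.pcapng"), lambda: share_is_healthy("/mnt/share"))` would leave the last few capture files on disk, covering the minutes just before the mount stopped responding.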

-- 
Grant






[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-04 Thread Kai Krakow
On Sat, 04 Mar 2017 08:02:11 +
"J. Roeleveld" wrote:

> On March 4, 2017 12:41:05 AM GMT+01:00, Grant Edwards
>  wrote:
> >On 2017-03-03, J. Roeleveld  wrote:
> >  
> >> On March 3, 2017 7:49:27 PM GMT+01:00, Grant Edwards  
> > wrote:
> > [...]
> >
> >[...]
> >  
> >> My guess would be some timeout setting on the server killing the
> >> login.  
> >
> >That doesn't seem to be the problem.  I've asked around, and others
> >aren't seeing this problem.
> >
> >I've also noticed that sometimes the mounts will start working again
> >without a umount/mount, but I can't figure out what causes it...
> >
> > >Normally, when things are working but idle, the TCP connection to 445
> > >shows an SMB echo request/response transaction once per minute.  When
> > >it fails, the TCP connection evidently got dropped, and the Windows
> > >machine repeatedly shuts down new ones:
> > >
> > >The failure mode looks like this in Wireshark:
> > >
> > >  Gentoo                          Windows
> > >
> > >  ->   SYN                  ->    445
> > >  <-   SYN/ACK              <-    445
> > >  ->   ACK                  ->    445
> > >  ->   SMB[echo req]        ->    445
> > >  <-   RST                  <-    445
> > >
> > >[that repeats 800 times per second for long periods of time]
> > >
> > >Then at some point, it starts to work:
> > >
> > >  ->   SYN                  ->    445
> > >  <-   SYN/ACK              <-    445
> > >  ->   ACK                  ->    445
> > >  ->   SMB[proto neg req]   ->    445
> > >  <-   SMB[proto neg rsp]   <-    445
> > >  ->   SMB[ses setup req]   ->    445
> > >  <-   SMB[ses setup rsp]   <-    445
> > >  ...
> > 
> >Sometimes the umount times out and "fails" because the "host is
> >down", and when that happens, it seems like it immediately starts to
> >work again. :/  
> 
> Are other hosts linux or windows?
> 
> Maybe a dodgy switch forgetting the correct path?

Or an MTU problem... Is there a router in the path?

-- 
Regards,
Kai

Replies to list-only preferred.




Re: [gentoo-user] Re: CIFS mounts started misbehaving

2017-03-04 Thread J. Roeleveld
On March 4, 2017 12:41:05 AM GMT+01:00, Grant Edwards 
 wrote:
>On 2017-03-03, J. Roeleveld  wrote:
>
>> On March 3, 2017 7:49:27 PM GMT+01:00, Grant Edwards
> wrote:
>
>>>About a week ago, they started acting oddly.  They all mount fine, and
>>>work as usual as long as you keep using them.  AFAICT, if they sit
>>>idle for "a while" (tens of minutes, maybe an hour), they freeze up.
>
>[...]
>
>> My guess would be some timeout setting on the server killing the
>> login.
>
>That doesn't seem to be the problem.  I've asked around, and others
>aren't seeing this problem.
>
>I've also noticed that sometimes the mounts will start working again
>without a umount/mount, but I can't figure out what causes it...
>
>Normally, when things are working but idle, the TCP connection to 445
>shows an SMB echo request/response transaction once per minute.  When
>it fails, the TCP connection evidently got dropped, and the Windows
>machine repeatedly shuts down new ones:
>
>The failure mode looks like this in Wireshark:
>
>  Gentoo                          Windows
>
>  ->   SYN                  ->    445
>  <-   SYN/ACK              <-    445
>  ->   ACK                  ->    445
>  ->   SMB[echo req]        ->    445
>  <-   RST                  <-    445
>
>[that repeats 800 times per second for long periods of time]
>
>Then at some point, it starts to work:
>
>  ->   SYN                  ->    445
>  <-   SYN/ACK              <-    445
>  ->   ACK                  ->    445
>  ->   SMB[proto neg req]   ->    445
>  <-   SMB[proto neg rsp]   <-    445
>  ->   SMB[ses setup req]   ->    445
>  <-   SMB[ses setup rsp]   <-    445
>  ...
> 
>Sometimes the umount times out and "fails" because the "host is down",
>and when that happens, it seems like it immediately starts to work
>again. :/

Are other hosts linux or windows?

Maybe a dodgy switch forgetting the correct path?

--
Joost
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



[gentoo-user] Re: CIFS mounts started misbehaving

2017-03-03 Thread Grant Edwards
On 2017-03-03, J. Roeleveld  wrote:

> On March 3, 2017 7:49:27 PM GMT+01:00, Grant Edwards 
>  wrote:

>>About a week ago, they started acting oddly.  They all mount fine, and
>>work as usual as long as you keep using them.  AFAICT, if they sit
>>idle for "a while" (tens of minutes, maybe an hour), they freeze up.

[...]

> My guess would be some timeout setting on the server killing the
> login.

That doesn't seem to be the problem.  I've asked around, and others
aren't seeing this problem.

I've also noticed that sometimes the mounts will start working again
without a umount/mount, but I can't figure out what causes it...

Normally, when things are working but idle, the TCP connection to 445
shows an SMB echo request/response transaction once per minute.  When
it fails, the TCP connection evidently got dropped, and the Windows
machine repeatedly shuts down new ones:

The failure mode looks like this in Wireshark:

  Gentoo                          Windows

  ->   SYN                  ->    445
  <-   SYN/ACK              <-    445
  ->   ACK                  ->    445
  ->   SMB[echo req]        ->    445
  <-   RST                  <-    445

[that repeats 800 times per second for long periods of time]

Then at some point, it starts to work:

  ->   SYN                  ->    445
  <-   SYN/ACK              <-    445
  ->   ACK                  ->    445
  ->   SMB[proto neg req]   ->    445
  <-   SMB[proto neg rsp]   <-    445
  ->   SMB[ses setup req]   ->    445
  <-   SMB[ses setup rsp]   <-    445
  ...
 
Sometimes the umount times out and "fails" because the "host is down",
and when that happens, it seems like it immediately starts to work
again. :/
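
Reading the two traces side by side, the failing cycle sends an SMB echo request on a brand-new TCP connection without negotiating the protocol first, while the working sequence starts with protocol negotiation. A minimal Python sketch for counting the failing cycle in a list of packet labels follows; the labels are simply the ones used in the traces above, and turning a real capture into such a list is left out.

```python
from typing import Sequence

# One failing cycle as seen in the capture: handshake, echo request, reset.
FAILURE_CYCLE = ("SYN", "SYN/ACK", "ACK", "SMB echo req", "RST")


def count_failure_cycles(events: Sequence[str]) -> int:
    """Count non-overlapping occurrences of FAILURE_CYCLE in `events`."""
    n = len(FAILURE_CYCLE)
    count = 0
    i = 0
    while i + n <= len(events):
        if tuple(events[i:i + n]) == FAILURE_CYCLE:
            count += 1
            i += n  # skip past the matched cycle
        else:
            i += 1
    return count


# Three failing cycles followed by the start of a working session.
trace = list(FAILURE_CYCLE) * 3 + [
    "SYN", "SYN/ACK", "ACK", "SMB proto neg req", "SMB proto neg rsp",
]
print(count_failure_cycles(trace))
```

A count that climbs by hundreds per second would match the reset storm described above and could serve as the trigger for stopping a capture.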


-- 
Grant Edwards               grant.b.edwards        Yow! Wow!  Look!!  A stray
                                  at               meatball!!  Let's interview
                              gmail.com            it!