Re: Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity

2019-06-12 Thread Miklos Szeredi
On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov  wrote:

> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for
> header room change be accepted?

Yes, next cycle. For 5.2 I'll just push the revert.

Thanks,
Miklos


Re: Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity

2019-06-11 Thread Kirill Smelkov
On Tue, Jun 11, 2019 at 01:52:14PM +0200, Miklos Szeredi wrote:
> On Tue, Jun 11, 2019 at 1:03 PM Sander Eikelenboom wrote:
> >
> > L.S.,
> >
> > While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs 
> > volumes.
> >
> > It repeatedly fails with:
> >[2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 
> > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
> >[2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 
> > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
> >[2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 
> > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
> >[2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 
> > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
> >etc.
> >etc.
> >
> > Bisecting turned up as culprit:
> > commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require 
> > /dev/fuse reads to have enough buffer capacity
> >
> > The glusterfs version I'm using is from Debian stable:
> > ii  glusterfs-client  3.8.8-1  amd64  clustered file-system (client package)
> > ii  glusterfs-common  3.8.8-1  amd64  GlusterFS common libraries and translator modules
> >
> >
> > A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit 
> > reverted.
> 
> Thanks for the report, reverted the bad commit.

First of all I'm sorry for breaking things here. The diff of the guilty
commit is

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1317,6 +1317,16 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	unsigned reqsize;
 	unsigned int hash;
 
+	/*
+	 * Require sane minimum read buffer - that has capacity for fixed part
+	 * of any request header + negotated max_write room for data. If the
+	 * requirement is not satisfied return EINVAL to the filesystem server
+	 * to indicate that it is not following FUSE server/client contract.
+	 * Don't dequeue / abort any request.
+	 */
+	if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, 4096 + fc->max_write))
+		return -EINVAL;
+
  restart:
 	spin_lock(&fiq->waitq.lock);
 	err = -EAGAIN;

and it essentially required the filesystem server to provide a
4K+max_write buffer for reads from /dev/fuse. That 4K was meant as
room for the FUSE request header; citing the commit message:

Before getting into operation phase, FUSE filesystem server and kernel
client negotiate what should be the maximum write size the client will
ever issue. After negotiation the contract in between server/client is
that the filesystem server then should queue /dev/fuse sys_read calls with
enough buffer capacity to receive any client request - WRITE in
particular, while FUSE client should not, in particular, send WRITE
requests with > negotiated max_write payload. FUSE client in kernel and
libfuse historically reserve 4K for request header. This way the
contract is that filesystem server should queue sys_reads with
4K+max_write buffer.
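
As a rough illustration of the contract quoted above, a server following the
historical 4K convention would size its /dev/fuse read buffer as sketched
below. This is only a sketch: the names `SKETCH_HEADER_ROOM_4K` and
`sketch_read_buffer_size` are hypothetical and come from neither the kernel
nor libfuse, and the `max_write` value is whatever was negotiated at INIT time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header room historically reserved by the kernel client and libfuse,
 * per the commit message (4K). The macro name is hypothetical. */
#define SKETCH_HEADER_ROOM_4K ((size_t)4096)

/* Hypothetical helper: capacity of the buffer a server should pass to
 * read() on /dev/fuse after max_write has been negotiated. */
static size_t sketch_read_buffer_size(size_t max_write)
{
    /* Guard against size_t overflow for absurd max_write values. */
    assert(max_write <= SIZE_MAX - SKETCH_HEADER_ROOM_4K);
    return SKETCH_HEADER_ROOM_4K + max_write;
}
```

With an assumed max_write of 128 KiB (131072 bytes) this gives a read buffer
of 135168 bytes.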

I could reproduce the problem. As it turns out, what broke here is that
glusterfs reserves not 4K but less room for the header: 80 bytes in
gluster-3.8, namely `sizeof(fuse_in_header) + sizeof(fuse_write_in)`:

https://github.com/gluster/glusterfs/blob/v3.8.15-0-gd174f021a/xlators/mount/fuse/src/fuse-bridge.c#L4894
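
The 80-byte figure can be checked with a small sketch. The two structs below
are local mirrors of `struct fuse_in_header` and `struct fuse_write_in` as
they are assumed to be laid out in the FUSE protocol headers of that era. The
`sketch_` names are hypothetical, and real code would include the FUSE kernel
header instead of redefining the structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of struct fuse_in_header (fixed part of every request). */
struct sketch_fuse_in_header {
    uint32_t len;
    uint32_t opcode;
    uint64_t unique;
    uint64_t nodeid;
    uint32_t uid;
    uint32_t gid;
    uint32_t pid;
    uint32_t padding;
};

/* Assumed mirror of struct fuse_write_in (WRITE-specific header). */
struct sketch_fuse_write_in {
    uint64_t fh;
    uint64_t offset;
    uint32_t size;
    uint32_t write_flags;
    uint64_t lock_owner;
    uint32_t flags;
    uint32_t padding;
};

/* Both headers are expected to be 40 bytes under this layout. */
static_assert(sizeof(struct sketch_fuse_in_header) == 40, "in_header size");
static_assert(sizeof(struct sketch_fuse_write_in) == 40, "write_in size");

/* Header room as computed by glusterfs according to the text above. */
static size_t sketch_gluster_header_room(void)
{
    return sizeof(struct sketch_fuse_in_header) +
           sizeof(struct sketch_fuse_write_in);
}
```

Under these assumptions `sketch_gluster_header_room()` evaluates to 80, which
is well below the 4096 bytes the reverted check demanded.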


Since

`sizeof(fuse_in_header) + sizeof(fuse_write_in)` ==
`sizeof(fuse_in_header) + sizeof(fuse_read_in)`

is the absolute minimum any sane filesystem should be using for header room, can
we please restore the patch with that value instead of 4K?
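
To make the difference concrete, here is a userspace sketch of the two
variants of the check: the one from the reverted commit (4K header room) and
the relaxed one proposed here (80 bytes of header room). It is not the kernel
code. The function and macro names are hypothetical, 8192 is assumed to be the
value of FUSE_MIN_READ_BUFFER, and the 128 KiB max_write in the usage note is
only an assumed example value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed value of FUSE_MIN_READ_BUFFER from the FUSE uapi header. */
#define SKETCH_FUSE_MIN_READ_BUFFER ((size_t)8192)
/* sizeof(fuse_in_header) + sizeof(fuse_write_in), 80 bytes per the text. */
#define SKETCH_MIN_HEADER_ROOM ((size_t)80)

static_assert(SKETCH_MIN_HEADER_ROOM < 4096, "relaxed room must be below 4K");

static size_t sketch_max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Variant from the reverted commit: requires 4K of header room. */
static int sketch_check_read_buffer_4k(size_t nbytes, size_t max_write)
{
    if (nbytes < sketch_max_size(SKETCH_FUSE_MIN_READ_BUFFER,
                                 4096 + max_write))
        return -EINVAL;
    return 0;
}

/* Proposed variant: requires only the minimal WRITE header room. */
static int sketch_check_read_buffer_relaxed(size_t nbytes, size_t max_write)
{
    if (nbytes < sketch_max_size(SKETCH_FUSE_MIN_READ_BUFFER,
                                 SKETCH_MIN_HEADER_ROOM + max_write))
        return -EINVAL;
    return 0;
}
```

For a server that reads with an 80 + max_write buffer and an assumed
max_write of 131072, the 4K variant returns -EINVAL while the relaxed variant
returns 0. A buffer even one byte shorter is still rejected by the relaxed
variant.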

That patch was there in the first place to make stuck fuse servers
much easier to diagnose; citing the commit message:

If the filesystem server does not follow this contract, what can happen
is that fuse_dev_do_read will see that request size is > buffer size,
and then it will return EIO to client who issued the request but won't
indicate in any way that there is a problem to filesystem server.
This can be hard to diagnose because for some requests, e.g. for
NOTIFY_REPLY which mimics WRITE, there is no client thread that is
waiting for request completion and that EIO goes nowhere, while on
filesystem server side things look like the kernel is not replying back
after successful NOTIFY_RETRIEVE request made by the server.

We can make the problem easy to diagnose if we indicate via error return to
filesystem server when it is violating the contract.  This should not
practically cause problems because if a filesystem server is using shorter
buffer, writes to it 

Re: Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity

2019-06-11 Thread Miklos Szeredi
On Tue, Jun 11, 2019 at 1:03 PM Sander Eikelenboom  wrote:
>
> L.S.,
>
> While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs 
> volumes.
>
> It repeatedly fails with:
>[2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>etc.
>etc.
>
> Bisecting turned up as culprit:
> commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require /dev/fuse 
> reads to have enough buffer capacity
>
> The glusterfs version I'm using is from Debian stable:
> ii  glusterfs-client  3.8.8-1  amd64  clustered file-system (client package)
> ii  glusterfs-common  3.8.8-1  amd64  GlusterFS common libraries and translator modules
>
>
> A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit reverted.

Thanks for the report, reverted the bad commit.

Thanks,
Miklos


Re: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity

2019-06-11 Thread Amar Tumballi Suryanarayan
Thanks for the heads up! We will see how to revert / fix the issue properly
for the 5.2 kernel.

-Amar

On Tue, Jun 11, 2019 at 4:34 PM Sander Eikelenboom wrote:

> L.S.,
>
> While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs
> volumes.
>
> It repeatedly fails with:
>[2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc]
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc]
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc]
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>[2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc]
> 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
>etc.
>etc.
>
> Bisecting turned up as culprit:
> commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require
> /dev/fuse reads to have enough buffer capacity
>
> The glusterfs version I'm using is from Debian stable:
> ii  glusterfs-client  3.8.8-1  amd64  clustered file-system (client package)
> ii  glusterfs-common  3.8.8-1  amd64  GlusterFS common libraries and translator modules
>
>
> A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit
> reverted.
>
> --
> Sander
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>

-- 
Amar Tumballi (amarts)



Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity

2019-06-11 Thread Sander Eikelenboom
L.S.,

While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs 
volumes.

It repeatedly fails with:
   [2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 
0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
   [2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 
0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
   [2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 
0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
   [2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 
0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument)
   etc. 
   etc.

Bisecting turned up as culprit:
commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require /dev/fuse 
reads to have enough buffer capacity

The glusterfs version I'm using is from Debian stable:
ii  glusterfs-client  3.8.8-1  amd64  clustered file-system (client package)
ii  glusterfs-common  3.8.8-1  amd64  GlusterFS common libraries and translator modules


A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit reverted.

--
Sander