Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
On 4/23/2024 4:06 AM, Michael S. Tsirkin wrote: > On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote: >> Hi, >> >> On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote: >>> On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: From: Hou Tao Hi, The patch set aims to fix the warning related to an abnormal size parameter of kmalloc() in virtiofs. The warning occurred when attempting to insert a 10MB sized kernel module kept in a virtiofs with cache disabled. As analyzed in patch #1, the root cause is that the length of the read buffer is no limited, and the read buffer is passed directly to virtiofs through out_args[0].value. Therefore patch #1 limits the length of the read buffer passed to virtiofs by using max_pages. However it is not enough, because now the maximal value of max_pages is 256. Consequently, when reading a 10MB-sized kernel module, the length of the bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will try to allocate 2MB from memory subsystem. The request for 2MB of physically contiguous memory significantly stress the memory subsystem and may fail indefinitely on hosts with fragmented memory. To address this, patch #2~#5 use scattered pages in a bio_vec to replace the kmalloc-allocated bounce buffer when the length of the bounce buffer for KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the allocation of the bounce buffer and sg array in virtiofs is that GFP_ATOMIC is used even when the allocation occurs in a kworker context. Therefore the last patch uses GFP_NOFS for the allocation of both sg array and bounce buffer when initiated by the kworker. For more details, please check the individual patches. As usual, comments are always welcome. Change Log: >>> Bernd should I just merge the patchset as is? >>> It seems to fix a real problem and no one has the >>> time to work on a better fix WDYT? >> Sorry for the long delay. I am just start to prepare for v3. In v3, I >> plan to avoid the unnecessary memory copy between fuse args and bio_vec. >> Will post it before next week. > Didn't happen before this week apparently. Sorry for failing to make it this week. Being busy these two weeks. Hope to send v3 out before the end of April. > >>> v2: * limit the length of ITER_KVEC dio by max_pages instead of the newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC dio being consistent with other rw operations. * replace kmalloc-allocated bounce buffer by using a bounce buffer backed by scattered pages when the length of the bounce buffer for KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with fragmented memory, the KVEC_ITER dio can be handled normally by virtiofs. (Bernd Schubert) * merge the GFP_NOFS patch [1] into this patch-set and use memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS (Benjamin Coddington) v1: https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ [1]: https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ Hou Tao (6): fuse: limit the length of ITER_KVEC dio by max_pages virtiofs: move alloc/free of argbuf into separated helpers virtiofs: factor out more common methods for argbuf virtiofs: support bounce buffer backed by scattered pages virtiofs: use scattered bounce buffer for ITER_KVEC dio virtiofs: use GFP_NOFS when enqueuing request through kworker fs/fuse/file.c | 12 +- fs/fuse/virtio_fs.c | 336 +--- 2 files changed, 296 insertions(+), 52 deletions(-) -- 2.29.2
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
On 4/22/24 22:06, Michael S. Tsirkin wrote: > On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote: >> Hi, >> >> On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote: >>> On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: From: Hou Tao Hi, The patch set aims to fix the warning related to an abnormal size parameter of kmalloc() in virtiofs. The warning occurred when attempting to insert a 10MB sized kernel module kept in a virtiofs with cache disabled. As analyzed in patch #1, the root cause is that the length of the read buffer is no limited, and the read buffer is passed directly to virtiofs through out_args[0].value. Therefore patch #1 limits the length of the read buffer passed to virtiofs by using max_pages. However it is not enough, because now the maximal value of max_pages is 256. Consequently, when reading a 10MB-sized kernel module, the length of the bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will try to allocate 2MB from memory subsystem. The request for 2MB of physically contiguous memory significantly stress the memory subsystem and may fail indefinitely on hosts with fragmented memory. To address this, patch #2~#5 use scattered pages in a bio_vec to replace the kmalloc-allocated bounce buffer when the length of the bounce buffer for KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the allocation of the bounce buffer and sg array in virtiofs is that GFP_ATOMIC is used even when the allocation occurs in a kworker context. Therefore the last patch uses GFP_NOFS for the allocation of both sg array and bounce buffer when initiated by the kworker. For more details, please check the individual patches. As usual, comments are always welcome. Change Log: >>> Bernd should I just merge the patchset as is? >>> It seems to fix a real problem and no one has the >>> time to work on a better fix WDYT? >> >> Sorry for the long delay. I am just start to prepare for v3. In v3, I >> plan to avoid the unnecessary memory copy between fuse args and bio_vec. >> Will post it before next week. > > Didn't happen before this week apparently. Hi Michael, sorry for my later reply, I had been totally busy for the last weeks as well. Also I can't decide to merge it - I'm not the official fuse maintainer... >From my point of view, patch 1 is just missing to set the actual limit and then would be clear and easy back-portable bug fix. Not promised, I will try it out if I find a bit time tomorrow. Bernd
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote: > Hi, > > On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote: > > On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: > >> From: Hou Tao > >> > >> Hi, > >> > >> The patch set aims to fix the warning related to an abnormal size > >> parameter of kmalloc() in virtiofs. The warning occurred when attempting > >> to insert a 10MB sized kernel module kept in a virtiofs with cache > >> disabled. As analyzed in patch #1, the root cause is that the length of > >> the read buffer is no limited, and the read buffer is passed directly to > >> virtiofs through out_args[0].value. Therefore patch #1 limits the > >> length of the read buffer passed to virtiofs by using max_pages. However > >> it is not enough, because now the maximal value of max_pages is 256. > >> Consequently, when reading a 10MB-sized kernel module, the length of the > >> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will > >> try to allocate 2MB from memory subsystem. The request for 2MB of > >> physically contiguous memory significantly stress the memory subsystem > >> and may fail indefinitely on hosts with fragmented memory. To address > >> this, patch #2~#5 use scattered pages in a bio_vec to replace the > >> kmalloc-allocated bounce buffer when the length of the bounce buffer for > >> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the > >> allocation of the bounce buffer and sg array in virtiofs is that > >> GFP_ATOMIC is used even when the allocation occurs in a kworker context. > >> Therefore the last patch uses GFP_NOFS for the allocation of both sg > >> array and bounce buffer when initiated by the kworker. For more details, > >> please check the individual patches. > >> > >> As usual, comments are always welcome. > >> > >> Change Log: > > Bernd should I just merge the patchset as is? > > It seems to fix a real problem and no one has the > > time to work on a better fix WDYT? > > Sorry for the long delay. I am just start to prepare for v3. In v3, I > plan to avoid the unnecessary memory copy between fuse args and bio_vec. > Will post it before next week. Didn't happen before this week apparently. > > > > > >> v2: > >> * limit the length of ITER_KVEC dio by max_pages instead of the > >> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC > >> dio being consistent with other rw operations. > >> * replace kmalloc-allocated bounce buffer by using a bounce buffer > >> backed by scattered pages when the length of the bounce buffer for > >> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with > >> fragmented memory, the KVEC_ITER dio can be handled normally by > >> virtiofs. (Bernd Schubert) > >> * merge the GFP_NOFS patch [1] into this patch-set and use > >> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS > >> (Benjamin Coddington) > >> > >> v1: > >> https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ > >> > >> [1]: > >> https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ > >> > >> Hou Tao (6): > >> fuse: limit the length of ITER_KVEC dio by max_pages > >> virtiofs: move alloc/free of argbuf into separated helpers > >> virtiofs: factor out more common methods for argbuf > >> virtiofs: support bounce buffer backed by scattered pages > >> virtiofs: use scattered bounce buffer for ITER_KVEC dio > >> virtiofs: use GFP_NOFS when enqueuing request through kworker > >> > >> fs/fuse/file.c | 12 +- > >> fs/fuse/virtio_fs.c | 336 +--- > >> 2 files changed, 296 insertions(+), 52 deletions(-) > >> > >> -- > >> 2.29.2
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
Hi, On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote: > On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: >> From: Hou Tao >> >> Hi, >> >> The patch set aims to fix the warning related to an abnormal size >> parameter of kmalloc() in virtiofs. The warning occurred when attempting >> to insert a 10MB sized kernel module kept in a virtiofs with cache >> disabled. As analyzed in patch #1, the root cause is that the length of >> the read buffer is no limited, and the read buffer is passed directly to >> virtiofs through out_args[0].value. Therefore patch #1 limits the >> length of the read buffer passed to virtiofs by using max_pages. However >> it is not enough, because now the maximal value of max_pages is 256. >> Consequently, when reading a 10MB-sized kernel module, the length of the >> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will >> try to allocate 2MB from memory subsystem. The request for 2MB of >> physically contiguous memory significantly stress the memory subsystem >> and may fail indefinitely on hosts with fragmented memory. To address >> this, patch #2~#5 use scattered pages in a bio_vec to replace the >> kmalloc-allocated bounce buffer when the length of the bounce buffer for >> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the >> allocation of the bounce buffer and sg array in virtiofs is that >> GFP_ATOMIC is used even when the allocation occurs in a kworker context. >> Therefore the last patch uses GFP_NOFS for the allocation of both sg >> array and bounce buffer when initiated by the kworker. For more details, >> please check the individual patches. >> >> As usual, comments are always welcome. >> >> Change Log: > Bernd should I just merge the patchset as is? > It seems to fix a real problem and no one has the > time to work on a better fix WDYT? Sorry for the long delay. I am just start to prepare for v3. In v3, I plan to avoid the unnecessary memory copy between fuse args and bio_vec. Will post it before next week. > > >> v2: >> * limit the length of ITER_KVEC dio by max_pages instead of the >> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC >> dio being consistent with other rw operations. >> * replace kmalloc-allocated bounce buffer by using a bounce buffer >> backed by scattered pages when the length of the bounce buffer for >> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with >> fragmented memory, the KVEC_ITER dio can be handled normally by >> virtiofs. (Bernd Schubert) >> * merge the GFP_NOFS patch [1] into this patch-set and use >> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS >> (Benjamin Coddington) >> >> v1: >> https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ >> >> [1]: >> https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ >> >> Hou Tao (6): >> fuse: limit the length of ITER_KVEC dio by max_pages >> virtiofs: move alloc/free of argbuf into separated helpers >> virtiofs: factor out more common methods for argbuf >> virtiofs: support bounce buffer backed by scattered pages >> virtiofs: use scattered bounce buffer for ITER_KVEC dio >> virtiofs: use GFP_NOFS when enqueuing request through kworker >> >> fs/fuse/file.c | 12 +- >> fs/fuse/virtio_fs.c | 336 +--- >> 2 files changed, 296 insertions(+), 52 deletions(-) >> >> -- >> 2.29.2
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: > From: Hou Tao > > Hi, > > The patch set aims to fix the warning related to an abnormal size > parameter of kmalloc() in virtiofs. The warning occurred when attempting > to insert a 10MB sized kernel module kept in a virtiofs with cache > disabled. As analyzed in patch #1, the root cause is that the length of > the read buffer is no limited, and the read buffer is passed directly to > virtiofs through out_args[0].value. Therefore patch #1 limits the > length of the read buffer passed to virtiofs by using max_pages. However > it is not enough, because now the maximal value of max_pages is 256. > Consequently, when reading a 10MB-sized kernel module, the length of the > bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will > try to allocate 2MB from memory subsystem. The request for 2MB of > physically contiguous memory significantly stress the memory subsystem > and may fail indefinitely on hosts with fragmented memory. To address > this, patch #2~#5 use scattered pages in a bio_vec to replace the > kmalloc-allocated bounce buffer when the length of the bounce buffer for > KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the > allocation of the bounce buffer and sg array in virtiofs is that > GFP_ATOMIC is used even when the allocation occurs in a kworker context. > Therefore the last patch uses GFP_NOFS for the allocation of both sg > array and bounce buffer when initiated by the kworker. For more details, > please check the individual patches. > > As usual, comments are always welcome. > > Change Log: Bernd should I just merge the patchset as is? It seems to fix a real problem and no one has the time to work on a better fix WDYT? > v2: > * limit the length of ITER_KVEC dio by max_pages instead of the > newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC > dio being consistent with other rw operations. > * replace kmalloc-allocated bounce buffer by using a bounce buffer > backed by scattered pages when the length of the bounce buffer for > KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with > fragmented memory, the KVEC_ITER dio can be handled normally by > virtiofs. (Bernd Schubert) > * merge the GFP_NOFS patch [1] into this patch-set and use > memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS > (Benjamin Coddington) > > v1: > https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ > > [1]: > https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ > > Hou Tao (6): > fuse: limit the length of ITER_KVEC dio by max_pages > virtiofs: move alloc/free of argbuf into separated helpers > virtiofs: factor out more common methods for argbuf > virtiofs: support bounce buffer backed by scattered pages > virtiofs: use scattered bounce buffer for ITER_KVEC dio > virtiofs: use GFP_NOFS when enqueuing request through kworker > > fs/fuse/file.c | 12 +- > fs/fuse/virtio_fs.c | 336 +--- > 2 files changed, 296 insertions(+), 52 deletions(-) > > -- > 2.29.2