[PATCH] fat: Workaround the race with userspace's read via blockdev while mounting

2019-09-10 Thread OGAWA Hirofumi
If userspace reads the buffer via blockdev while mounting, sb_getblk()+modify can race with buffer read via blockdev. For example, FS userspace bh = sb_getblk() modify bh->b_data read

[PATCH v1] fixup null q->dev checking in both block and scsi layer

2019-09-10 Thread Stanley Chu
Some devices may skip blk_pm_runtime_init() and have a null pointer in their request_queue->dev. For example, SCSI devices of UFS Well-Known LUNs. Currently the null pointer is checked by the user of blk_set_runtime_active(), i.e., scsi_dev_type_resume(). It is better to check it by blk_set_runtime_ac

[PATCH v1 1/2] block: bypass blk_set_runtime_active for uninitialized q->dev

2019-09-10 Thread Stanley Chu
Some devices may skip blk_pm_runtime_init() and have a null pointer in their request_queue->dev. For example, SCSI devices of UFS Well-Known LUNs. Currently the null pointer is checked by the user of blk_set_runtime_active(), i.e., scsi_dev_type_resume(). It is better to check it by blk_set_runtime_ac
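
The idea described above, moving the null check from the caller into the callee, can be sketched in plain C. This is only an illustration under stated assumptions: `mock_device`, `mock_request_queue`, and `mock_set_runtime_active()` are hypothetical stand-ins, not the kernel's real types or the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_device {
    int runtime_status;
};

struct mock_request_queue {
    struct mock_device *dev; /* stays NULL if runtime PM was never initialized */
};

enum { MOCK_RPM_SUSPENDED = 0, MOCK_RPM_ACTIVE = 1 };

/*
 * Callee-side guard: return early when the queue has no device attached,
 * so each caller does not need its own NULL check.
 * Returns true if the status was updated, false if the call was bypassed.
 */
static bool mock_set_runtime_active(struct mock_request_queue *q)
{
    if (!q->dev)
        return false;

    q->dev->runtime_status = MOCK_RPM_ACTIVE;
    return true;
}
```

With the guard inside the helper, a caller can invoke it unconditionally for every queue, which is the kind of simplification the follow-up 2/2 patch describes for scsi_dev_type_resume().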

[PATCH v1 2/2] scsi: core: remove dummy q->dev check

2019-09-10 Thread Stanley Chu
Currently blk_set_runtime_active() is checking if q->dev is null by itself, thus remove the same checking in its user: scsi_dev_type_resume(). Signed-off-by: Stanley Chu --- drivers/scsi/scsi_pm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_pm.c b/driv

Re: [PATCH v4 1/3] block: centralize PI remapping logic to the block layer

2019-09-10 Thread Martin K. Petersen
Max, > I guess Type 1 and Type 3 mirrors can work because Type 3 doesn't have > a ref tag, right ? It will work but you'll lose ref tag checking on the Type 3 side of the mirror. So not exactly desirable. And in our experience, the ref tag is hugely important. Also, there are probably some hea

Re: [PATCH] nbd: remove the duplicated code

2019-09-10 Thread Xiubo Li
On 2019/9/10 23:56, Mike Christie wrote: On 09/10/2019 01:36 AM, xiu...@redhat.com wrote: From: Xiubo Li The following code will do the same check, so this part is redundant. Signed-off-by: Xiubo Li --- drivers/block/nbd.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/bloc

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Kirill A. Shutemov
On Wed, Sep 11, 2019 at 07:12:06AM +0900, Tetsuo Handa wrote: > >> +static ssize_t memalloc_write(struct file *file, const char __user *buf, > >> +size_t count, loff_t *ppos) > >> +{ > >> + struct task_struct *task; > >> + char buffer[5]; > >> + int rc = count; > >> + > >

Re: [PATCH] md/raid0: avoid RAID0 data corruption due to layout confusion.

2019-09-10 Thread NeilBrown
On Tue, Sep 10 2019, Guoqing Jiang wrote: > On 9/10/19 5:45 PM, Song Liu wrote: >> >> >>> On Sep 10, 2019, at 12:33 AM, NeilBrown wrote: >>> >>> On Mon, Sep 09 2019, Song Liu wrote: >>> Hi Neil, > On Sep 9, 2019, at 7:57 AM, NeilBrown wrote: > > > If the drives in a R

Re: [PATCH v4 1/3] block: centralize PI remapping logic to the block layer

2019-09-10 Thread Max Gurtovoy
On 9/10/2019 5:29 AM, Martin K. Petersen wrote: Max, Hi Martin, thanks for the great explanation ! maybe we can add profiles to type0 and type2 in the future and have more readable code. It's a deliberate feature that we treat DIX Type 0, 1, and 2 the same. It's very common to mix and m

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Tetsuo Handa
On 2019/09/10 3:26, Mike Christie wrote: > Forgot to cc linux-mm. > > On 09/09/2019 11:28 AM, Mike Christie wrote: >> There are several storage drivers like dm-multipath, iscsi, and nbd that >> have userspace components that can run in the IO path. For example, >> iscsi and nbd's userspace daemons

Re: [PATCHSET block/for-next] blk-iocost: Implement absolute debt handling

2019-09-10 Thread Jens Axboe
On 9/4/19 1:45 PM, Tejun Heo wrote: > Currently, when a given cgroup doesn't have enough budget, a forced or > merged bio will advance the cgroup's vtime by the cost calculated > according to the hierarchical weight at the time of issue. Once vtime > is advanced, how the cgroup's weight changes do

Re: [block/for-next] iocost: Fix incorrect operation order during iocg free

2019-09-10 Thread Jens Axboe
On 9/10/19 10:15 AM, Tejun Heo wrote: > ioc_pd_free() first cancels the hrtimers and then deactivates the > iocg. However, the iocg timer can run in between and reschedule the > hrtimers which will end up running after the iocg is freed leading to > crashes like the following. > >general prote

[PATCHSET 0/2] io_uring: improve handling of buffered writes

2019-09-10 Thread Jens Axboe
XFS/ext4/others all need to lock the inode for buffered writes. Since io_uring handles any IO in an async manner, this means that for higher queue depth buffered write workloads, we have a lot of workers hammering on the same mutex. Running a QD=32 random write workload on my test box yields about

[PATCH 1/2] io_uring: add io_queue_async_work() helper

2019-09-10 Thread Jens Axboe
Add a helper for queueing a request for async execution, in preparation for optimizing it. No functional change in this patch. Signed-off-by: Jens Axboe --- fs/io_uring.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index b2

[PATCH 2/2] io_uring: limit parallelism of buffered writes

2019-09-10 Thread Jens Axboe
All the popular filesystems need to grab the inode lock for buffered writes. With io_uring punting buffered writes to async context, we observe a lot of contention with all workers hammering this mutex. For buffered writes, we generally don't need a lot of parallelism on the submission side, as the

Re: [PATCH] fat: fix corruption in fat_alloc_new_dir()

2019-09-10 Thread Jan Stancek
- Original Message - > Jan Stancek writes: > > >> Using the device while mounting same device doesn't work reliably like > >> this race. (getblk() is intentionally used to get the buffer to write > >> new data.) > > > > Are you saying this is expected even if 'usage' is just read? > >

[block/for-next] iocost: Fix incorrect operation order during iocg free

2019-09-10 Thread Tejun Heo
ioc_pd_free() first cancels the hrtimers and then deactivates the iocg. However, the iocg timer can run in between and reschedule the hrtimers which will end up running after the iocg is freed leading to crashes like the following. general protection fault: [#1] SMP ... RIP: 0010:iocg_k

Re: [PATCH 08/10] blkcg: implement blk-iocost

2019-09-10 Thread Tejun Heo
Hello, Michal. On Tue, Sep 10, 2019 at 02:55:14PM +0200, Michal Koutný wrote: > This adds the generic io.weight attribute. How will this compose with > the weight from IO schedulers? (AFAIK, only BFQ allows proportional > control as of now. +CC Paolo.) The two being enabled at the same time doesn

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Mike Christie
On 09/10/2019 05:00 AM, Kirill A. Shutemov wrote: > On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote: >> There are several storage drivers like dm-multipath, iscsi, and nbd that >> have userspace components that can run in the IO path. For example, >> iscsi and nbd's userspace daemons

Re: [PATCH] md/raid0: avoid RAID0 data corruption due to layout confusion.

2019-09-10 Thread Guoqing Jiang
On 9/10/19 5:45 PM, Song Liu wrote: On Sep 10, 2019, at 12:33 AM, NeilBrown wrote: On Mon, Sep 09 2019, Song Liu wrote: Hi Neil, On Sep 9, 2019, at 7:57 AM, NeilBrown wrote: If the drives in a RAID0 are not all the same size, the array is divided into zones. The first zone covers a
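
The zone split mentioned in the quoted text can be illustrated with a small, self-contained C sketch. It assumes the commonly described scheme in which each zone spans every device that still has unused space: the first zone runs up to the smallest device size, the next up to the second-smallest, and so on. The names `zone` and `compute_zones()` are hypothetical, sizes are in arbitrary units, and this is not the md driver's code; it also says nothing about how chunks are laid out inside a multi-device zone, which is the subject of the thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* One zone: how much of each member device it uses, and how many devices. */
struct zone {
    unsigned long long per_dev; /* space taken from each participating device */
    size_t ndevs;               /* number of devices striped in this zone */
};

/* qsort comparator for unsigned long long, ascending. */
static int cmp_size(const void *a, const void *b)
{
    unsigned long long x = *(const unsigned long long *)a;
    unsigned long long y = *(const unsigned long long *)b;
    return (x > y) - (x < y);
}

/*
 * Sorts sizes[] in place and fills zones[] (which must have room for n
 * entries). Returns the number of zones. Devices of equal size end in
 * the same zone, so equal-sized arrays yield exactly one zone.
 */
static size_t compute_zones(unsigned long long *sizes, size_t n,
                            struct zone *zones)
{
    size_t nz = 0;
    unsigned long long prev = 0;

    qsort(sizes, n, sizeof(*sizes), cmp_size);
    for (size_t i = 0; i < n; i++) {
        if (sizes[i] == prev)
            continue; /* same boundary as the previous device */
        zones[nz].per_dev = sizes[i] - prev;
        zones[nz].ndevs = n - i; /* devices at least this large */
        nz++;
        prev = sizes[i];
    }
    return nz;
}
```

For example, four devices of sizes 100, 300, 100, and 200 give three zones under this scheme: one across all four devices, one across the two larger devices, and one on the largest device alone.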

Re: [PATCH] nbd: remove the duplicated code

2019-09-10 Thread Mike Christie
On 09/10/2019 01:36 AM, xiu...@redhat.com wrote: > From: Xiubo Li > > The following code will do the same check, so this part is redundant. > > Signed-off-by: Xiubo Li > --- > drivers/block/nbd.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/block/nbd.c b/drivers/block/nb

Re: [PATCH] md/raid0: avoid RAID0 data corruption due to layout confusion.

2019-09-10 Thread Song Liu
> On Sep 10, 2019, at 12:33 AM, NeilBrown wrote: > > On Mon, Sep 09 2019, Song Liu wrote: > >> Hi Neil, >> >>> On Sep 9, 2019, at 7:57 AM, NeilBrown wrote: >>> >>> >>> If the drives in a RAID0 are not all the same size, the array is >>> divided into zones. >>> The first zone covers all dr

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Damien Le Moal
+ Miklos On 2019/09/10 13:41, Kirill A. Shutemov wrote: > On Tue, Sep 10, 2019 at 12:05:33PM +, Damien Le Moal wrote: >> On 2019/09/10 11:00, Kirill A. Shutemov wrote: >>> On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote: There are several storage drivers like dm-multipath, i

Re: [PATCH 08/10] blkcg: implement blk-iocost

2019-09-10 Thread Michal Koutný
Hello. On Wed, Aug 28, 2019 at 03:05:58PM -0700, Tejun Heo wrote: > diff --git a/block/blk-iocost.c b/block/blk-iocost.c > [...] > +static struct cftype ioc_files[] = { > + .name = "weight", > [...] This adds the generic io.weight attribute. How will this compose with the weight from

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Kirill A. Shutemov
On Tue, Sep 10, 2019 at 12:05:33PM +, Damien Le Moal wrote: > On 2019/09/10 11:00, Kirill A. Shutemov wrote: > > On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote: > >> There are several storage drivers like dm-multipath, iscsi, and nbd that > >> have userspace components that can r

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Damien Le Moal
On 2019/09/10 11:00, Kirill A. Shutemov wrote: > On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote: >> There are several storage drivers like dm-multipath, iscsi, and nbd that >> have userspace components that can run in the IO path. For example, >> iscsi and nbd's userspace daemons may

Re: [PATCH v2] nbd_genl_status: null check for nla_nest_start

2019-09-10 Thread Michal Kubecek
(Just stumbled upon this patch when link to it came with a CVE bug report.) On Mon, Jul 29, 2019 at 11:42:26AM -0500, Navid Emamdoost wrote: > nla_nest_start may fail and return NULL. The check is inserted, and > errno is selected based on other call sites within the same source code. > Update: re

Re: [PATCH v2] loop: change queue block size to match when using DIO.

2019-09-10 Thread Martijn Coenen
Hi Jens, Ming, Do you have any thoughts about this patch? Thanks, Martijn On Wed, Sep 4, 2019 at 9:49 PM Martijn Coenen wrote: > > The loop driver assumes that if the passed in fd is opened with > O_DIRECT, the caller wants to use direct I/O on the loop device. > However, if the underlying bloc

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Kirill A. Shutemov
On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote: > There are several storage drivers like dm-multipath, iscsi, and nbd that > have userspace components that can run in the IO path. For example, > iscsi and nbd's userspace daemons may need to recreate a socket and/or > send IO on it, a

Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

2019-09-10 Thread Damien Le Moal
Mike, On 2019/09/09 19:26, Mike Christie wrote: > Forgot to cc linux-mm. > > On 09/09/2019 11:28 AM, Mike Christie wrote: >> There are several storage drivers like dm-multipath, iscsi, and nbd that >> have userspace components that can run in the IO path. For example, >> iscsi and nbd's userspace

Re: [PATCH 1/3] block: Respect the device's maximum segment size

2019-09-10 Thread Thierry Reding
On Tue, Sep 10, 2019 at 08:13:48AM +0200, Christoph Hellwig wrote: > On Tue, Sep 10, 2019 at 02:03:17AM +, Yoshihiro Shimoda wrote: > > I'm sorry for causing this trouble on your environment. I have a proposal to > > resolve this issue. The mmc_host struct will have a new caps2 flag > > like MM

Re: [PATCH 1/3] block: Respect the device's maximum segment size

2019-09-10 Thread Thierry Reding
On Tue, Sep 10, 2019 at 02:03:17AM +, Yoshihiro Shimoda wrote: > Hi Thierry, > > > From: Thierry Reding, Sent: Tuesday, September 10, 2019 4:19 AM > > > > On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote: > > > On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote: >