Re: [PATCH] loop: drop caches if offset or block_size are changed

2019-01-09 Thread Jens Axboe
On 1/9/19 8:17 PM, Jaegeuk Kim wrote: > If we don't drop caches used in old offset or block_size, we can get old data > from new offset/block_size, which gives unexpected data to user. > > For example, Martijn found a loopback bug in the below scenario. > 1) LOOP_SET_FD loads first two pages on lo

[PATCH] loop: drop caches if offset or block_size are changed

2019-01-09 Thread Jaegeuk Kim
If we don't drop caches used in old offset or block_size, we can get old data from new offset/block_size, which gives unexpected data to user. For example, Martijn found a loopback bug in the below scenario. 1) LOOP_SET_FD loads first two pages on loop file 2) LOOP_SET_STATUS64 changes the offset

[PATCH 15/15] io_uring: add io_uring_event cache hit information

2019-01-09 Thread Jens Axboe
Add a hint on whether a read was served out of the page cache or hit media. This is useful for buffered async IO; O_DIRECT reads would never have this set (for obvious reasons). If the read hit the page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT set. Signed-off-by: Jens Axboe --- fs/io_ur

[PATCH 14/15] io_uring: add submission polling

2019-01-09 Thread Jens Axboe
This enables an application to do IO, without ever entering the kernel. By using the SQ ring to fill in new events and watching for completions on the CQ ring, we can submit and reap IOs without doing a single system call. The kernel side thread will poll for new submissions, and in case of HIPRI/p

[PATCH 13/15] io_uring: support kernel side submission

2019-01-09 Thread Jens Axboe
Add support for backing the io_uring fd with either a thread, or a workqueue and letting those handle the submission for us. This can be used to reduce overhead for submission, or to always make submission async. The latter is particularly useful for buffered aio, which is now fully async with this

[PATCH 10/15] io_uring: batch io_kiocb allocation

2019-01-09 Thread Jens Axboe
Similarly to how we use the state->ios_left to know how many references to get to a file, we can use it to allocate the io_kiocb's we need in bulk. Signed-off-by: Jens Axboe --- fs/io_uring.c | 71 +-- 1 file changed, 52 insertions(+), 19 deletions
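
The batching idea in the excerpt above can be sketched in user space. This is only an illustration under stated assumptions: the `demo_*` names and the `DEMO_BATCH` size are hypothetical, a plain `malloc()` loop stands in for whatever bulk allocator the patch actually uses, and only the `ios_left` counter name is taken from the excerpt.

```c
#include <stdlib.h>

#define DEMO_BATCH 8 /* hypothetical cap on requests cached per refill */

/* Hypothetical stand-in for a request object; not the kernel's io_kiocb. */
struct demo_req {
    int id;
};

/* Hypothetical per-submission state, loosely modelled on the excerpt. */
struct demo_state {
    struct demo_req *cached[DEMO_BATCH];
    unsigned int nr_cached; /* requests allocated but not yet handed out */
    unsigned int ios_left;  /* submissions still expected in this batch */
};

/*
 * Hand out one request. When the cache is empty, refill it with as many
 * requests as are still expected (capped at DEMO_BATCH), so that later
 * calls in the same batch do not have to visit the allocator again.
 */
static struct demo_req *demo_get_req(struct demo_state *st)
{
    if (!st->nr_cached) {
        unsigned int want = st->ios_left < DEMO_BATCH ? st->ios_left : DEMO_BATCH;
        unsigned int i;

        if (!want)
            want = 1; /* always try to satisfy the current call */
        for (i = 0; i < want; i++) {
            struct demo_req *req = malloc(sizeof(*req));

            if (!req)
                break; /* keep whatever was allocated so far */
            st->cached[st->nr_cached++] = req;
        }
        if (!st->nr_cached)
            return NULL; /* nothing could be allocated */
    }
    if (st->ios_left)
        st->ios_left--;
    return st->cached[--st->nr_cached];
}
```

With `ios_left` set to 3, the first call allocates three requests and returns one; the next two calls are served from the cache without touching the allocator.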

[PATCH 06/15] io_uring: support for IO polling

2019-01-09 Thread Jens Axboe
Add support for polled read and write commands. These act like their non-polled counterparts, except we expect to poll for completion of them. To use polling, io_uring_setup() must be used with the IORING_SETUP_IOPOLL flag being set. It is illegal to mix and match polled and non-polled IO on an io

[PATCH 09/15] io_uring: use fget/fput_many() for file references

2019-01-09 Thread Jens Axboe
On the submission side, add file reference batching to the io_submit_state. We get as many references as the number of iocbs we are submitting, and drop unused ones if we end up switching files. The assumption here is that we're usually only dealing with one fd, and if there are multiple, hopefully

[PATCH 08/15] fs: add fget_many() and fput_many()

2019-01-09 Thread Jens Axboe
Some use cases repeatedly get and put references to the same file, but the only exposed interface is doing these one at a time. As each of these entails an atomic inc or dec on a shared structure, that cost can add up. Add fget_many(), which works just like fget(), except it takes an argument fo
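
The cost argument above can be illustrated with a small user-space sketch using C11 atomics. It is not the kernel implementation: `struct demo_file`, `demo_get_many()` and `demo_put_many()` are hypothetical names, and the sketch only shows why one atomic add of N is cheaper than N separate increments on a shared counter.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical refcounted object; not the kernel's struct file. */
struct demo_file {
    atomic_long refcount;
};

/* Take `refs` references with one atomic add instead of `refs` increments. */
static void demo_get_many(struct demo_file *f, long refs)
{
    assert(refs > 0);
    atomic_fetch_add(&f->refcount, refs);
}

/*
 * Drop `refs` references with one atomic subtract. Returns 1 when this
 * call released the last reference, 0 otherwise.
 */
static int demo_put_many(struct demo_file *f, long refs)
{
    assert(refs > 0);
    /* atomic_fetch_sub() returns the value held before the subtraction. */
    return atomic_fetch_sub(&f->refcount, refs) == refs;
}
```

A submitter that expects eight requests against one file could call `demo_get_many(f, 8)` once up front and return any unused references with a single `demo_put_many()` call.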

[PATCH 05/15] Add io_uring IO interface

2019-01-09 Thread Jens Axboe
The submission queue (SQ) and completion queue (CQ) rings are shared between the application and the kernel. This eliminates the need to copy data back and forth to submit and complete IO. IO submissions use the io_uring_sqe data structure, and completions are generated in the form of io_uring_sqe
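
As a rough illustration of how a ring shared between one producer and one consumer avoids copying and locking, here is a user-space sketch in C11. The layout, the `demo_ring_*` names and the entry type are assumptions made for the example; they are not the io_uring ABI, which is defined by the patch itself.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_ENTRIES 8u /* hypothetical size; must be a power of two */

/* Hypothetical single-producer/single-consumer ring. */
struct demo_ring {
    _Atomic uint32_t head; /* advanced only by the consumer */
    _Atomic uint32_t tail; /* advanced only by the producer */
    uint64_t entries[DEMO_RING_ENTRIES];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int demo_ring_push(struct demo_ring *r, uint64_t val)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == DEMO_RING_ENTRIES)
        return -1;
    r->entries[tail & (DEMO_RING_ENTRIES - 1)] = val;
    /* Release: the entry must be visible before the new tail is. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns 0 and fills *val, or -1 if the ring is empty. */
static int demo_ring_pop(struct demo_ring *r, uint64_t *val)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return -1;
    *val = r->entries[head & (DEMO_RING_ENTRIES - 1)];
    /* Release: the slot may be reused only after it has been read. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}
```

Each index has exactly one writer, so neither side needs a lock; the free-running 32-bit counters are masked only when indexing into the array, which keeps the full and empty conditions distinct.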

[PATCH 03/15] block: add bio_set_polled() helper

2019-01-09 Thread Jens Axboe
For the upcoming async polled IO, we can't sleep allocating requests. If we do, then we introduce a deadlock where the submitter already has async polled IO in-flight, but can't wait for them to complete since polled requests must be actively found and reaped. Utilize the helper in the blockdev DIRE

[PATCH 07/15] io_uring: add submission side request cache

2019-01-09 Thread Jens Axboe
We have to add each submitted polled request to the io_ring_ctx poll_submitted list, which means we have to grab the poll_lock. We already use the block plug to batch submissions if we're doing a batch of IO submissions, extend that to cover the poll requests internally as well. Signed-off-by: Jen

[PATCHSET v2] io_uring IO interface

2019-01-09 Thread Jens Axboe
Here's v2 of the io_uring interface. See the v1 posting for some more info: https://lore.kernel.org/linux-block/20190108165645.19311-1-ax...@kernel.dk/ The data structures changed, to improve the symmetry of the submission and completion side. The io_uring_iocb is now io_uring_sqe, but it otherwi

[PATCH 01/15] fs: add an iopoll method to struct file_operations

2019-01-09 Thread Jens Axboe
From: Christoph Hellwig This new method is used to explicitly poll for I/O completion for an iocb. It must be called for any iocb submitted asynchronously (that is with a non-null ki_complete) which has the IOCB_HIPRI flag set. The method is assisted by a new ki_cookie field in struct iocb to

[PATCH 11/15] block: implement bio helper to add iter bvec pages to bio

2019-01-09 Thread Jens Axboe
For an ITER_BVEC, we can just iterate the iov and add the pages to the bio directly. This requires that the caller doesn't release the pages on IO completion; we add a BIO_HOLD_PAGES flag for that. The current two callers of bio_iov_iter_get_pages() are updated to check if they need to release pa

[PATCH 12/15] io_uring: add support for pre-mapped user IO buffers

2019-01-09 Thread Jens Axboe
If we have fixed user buffers, we can map them into the kernel when we set up the io_context. That avoids the need to do get_user_pages() for each and every IO. To utilize this feature, the application must pass in an array of iovecs that contain the desired buffer addresses and lengths. These buff

[PATCH 04/15] iomap: wire up the iopoll method

2019-01-09 Thread Jens Axboe
From: Christoph Hellwig Store the request queue the last bio was submitted to in the iocb private data in addition to the cookie so that we find the right block device. Also refactor the common direct I/O bio submission code into a nice little helper. Signed-off-by: Christoph Hellwig Modified

[PATCH 02/15] block: wire up block device iopoll method

2019-01-09 Thread Jens Axboe
From: Christoph Hellwig Just call blk_poll on the iocb cookie, we can derive the block device from the inode trivially. Reviewed-by: Johannes Thumshirn Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/block_dev.c | 10 ++ 1 file changed, 10 insertions(+) diff --git

Re: [PATCH] block: fix kerneldoc comment for blk_attempt_plug_merge()

2019-01-09 Thread Jens Axboe
On 1/9/19 1:59 PM, Jonathan Corbet wrote: > Commit 5f0ed774ed29 ("block: sum requests in the plug structure") removed > the request_count parameter from block_attempt_plug_merge(), but did not > remove the associated kerneldoc comment, introducing this warning to the > docs build: > > ./block/bl

[PATCH] block: fix kerneldoc comment for blk_attempt_plug_merge()

2019-01-09 Thread Jonathan Corbet
Commit 5f0ed774ed29 ("block: sum requests in the plug structure") removed the request_count parameter from block_attempt_plug_merge(), but did not remove the associated kerneldoc comment, introducing this warning to the docs build: ./block/blk-core.c:685: warning: Excess function parameter 'requ

Re: [PATCH v3] loop: drop caches if offset or block_size are changed

2019-01-09 Thread Bart Van Assche
On Tue, 2018-12-18 at 14:41 -0800, Jaegeuk Kim wrote: > [ ... ] Please post new versions of a patch as a new e-mail thread instead of as a reply to a previous e-mail. > [ ... ] > > if (lo->lo_offset != info->lo_offset || > lo->lo_sizelimit != info->lo_sizelimit) { > +

Re: [PATCH 14/16] io_uring: support kernel side submission

2019-01-09 Thread Jens Axboe
On 1/9/19 12:06 PM, Christoph Hellwig wrote: >> +struct iocb_submit { >> +const struct io_uring_iocb *iocb; >> +unsigned int index; >> +}; >> + >> +struct io_work { >> +struct work_struct work; >> +struct io_ring_ctx *ctx; >> +struct io_uring_iocb iocb; >> +unsigned iocb_ind

Re: [PATCH 11/16] io_uring: batch io_kiocb allocation

2019-01-09 Thread Jens Axboe
On 1/9/19 12:03 PM, Christoph Hellwig wrote: > On Wed, Jan 09, 2019 at 09:57:59AM -0700, Jens Axboe wrote: >> On 1/9/19 5:13 AM, Christoph Hellwig wrote: + if (!state) + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); >>> >>> Just return an error here if kmem_cache_alloc fails

Re: [PATCH 05/16] Add io_uring IO interface

2019-01-09 Thread Jens Axboe
On 1/9/19 11:30 AM, Christoph Hellwig wrote: > On Wed, Jan 09, 2019 at 08:53:31AM -0700, Jens Axboe wrote: +static int io_setup_rw(int rw, const struct io_uring_iocb *iocb, + struct iovec **iovec, struct iov_iter *iter) +{ + void __user *buf = (void __user *)(ui

Re: [PATCH 14/16] io_uring: support kernel side submission

2019-01-09 Thread Christoph Hellwig
> +struct iocb_submit { > + const struct io_uring_iocb *iocb; > + unsigned int index; > +}; > + > +struct io_work { > + struct work_struct work; > + struct io_ring_ctx *ctx; > + struct io_uring_iocb iocb; > + unsigned iocb_index; > +}; I think we should use struct iocb_subm

Re: [PATCH 11/16] io_uring: batch io_kiocb allocation

2019-01-09 Thread Christoph Hellwig
On Wed, Jan 09, 2019 at 09:57:59AM -0700, Jens Axboe wrote: > On 1/9/19 5:13 AM, Christoph Hellwig wrote: > >> + if (!state) > >> + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); > > > > Just return an error here if kmem_cache_alloc fails. > > > >> + if (req) > >> + io_req_

Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-09 Thread Christoph Hellwig
This looks good. I wonder if there is any good way to prevent other drivers from picking up this bug by using a better interface, but that should not delay your fix.

Re: [PATCH 05/16] Add io_uring IO interface

2019-01-09 Thread Christoph Hellwig
On Wed, Jan 09, 2019 at 08:53:31AM -0700, Jens Axboe wrote: > >> +static int io_setup_rw(int rw, const struct io_uring_iocb *iocb, > >> + struct iovec **iovec, struct iov_iter *iter) > >> +{ > >> + void __user *buf = (void __user *)(uintptr_t)iocb->addr; > >> + size_t ret; > >> +

Re: [PATCH 13/16] io_uring: add support for pre-mapped user IO buffers

2019-01-09 Thread Jens Axboe
On 1/9/19 5:16 AM, Christoph Hellwig wrote: >> +static int io_setup_rw(int rw, struct io_kiocb *kiocb, >> + const struct io_uring_iocb *iocb, struct iovec **iovec, >> + struct iov_iter *iter, bool kaddr) >> { >> void __user *buf = (void __user *)(uintptr_t)

Re: [PATCH 11/16] io_uring: batch io_kiocb allocation

2019-01-09 Thread Jens Axboe
On 1/9/19 5:13 AM, Christoph Hellwig wrote: >> +if (!state) >> +req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); > > Just return an error here if kmem_cache_alloc fails. > >> +if (req) >> +io_req_init(ctx, req); > > Because all the other ones can't reached this w

Re: [PATCH 10/16] io_uring: split kiocb init from allocation

2019-01-09 Thread Jens Axboe
On 1/9/19 5:12 AM, Christoph Hellwig wrote: > On Tue, Jan 08, 2019 at 09:56:39AM -0700, Jens Axboe wrote: >> In preparation for having pre-allocated requests, that we then just >> need to initialize before use. >> >> Signed-off-by: Jens Axboe >> --- >> fs/io_uring.c | 13 + >> 1 file

Re: [PATCHSET v1] io_uring IO interface

2019-01-09 Thread Chris Mason
On 9 Jan 2019, at 11:00, Matthew Wilcox wrote: > On Tue, Jan 08, 2019 at 09:56:29AM -0700, Jens Axboe wrote: >> After some arm twisting from Christoph, I finally caved and divorced >> the >> aio-poll patches from aio/libaio itself. The io_uring interface >> itself >> is useful and efficient, and

Re: [PATCHSET v1] io_uring IO interface

2019-01-09 Thread Matthew Wilcox
On Tue, Jan 08, 2019 at 09:56:29AM -0700, Jens Axboe wrote: > After some arm twisting from Christoph, I finally caved and divorced the > aio-poll patches from aio/libaio itself. The io_uring interface itself > is useful and efficient, and after rebasing all the new goodies on top > of that, there w

Re: [PATCH 06/16] io_uring: support for IO polling

2019-01-09 Thread Jens Axboe
On 1/9/19 5:11 AM, Christoph Hellwig wrote: > On Tue, Jan 08, 2019 at 09:56:35AM -0700, Jens Axboe wrote: >> Add polled variants of the read and write commands. These act like their >> non-polled counterparts, except we expect to poll for completion of >> them. > > These aren't really new command

Re: [PATCH 05/16] Add io_uring IO interface

2019-01-09 Thread Jens Axboe
On 1/9/19 5:10 AM, Christoph Hellwig wrote: >> index 293733f61594..9ef9987b4192 100644 >> --- a/fs/Makefile >> +++ b/fs/Makefile >> @@ -29,7 +29,7 @@ obj-$(CONFIG_SIGNALFD) += signalfd.o >> obj-$(CONFIG_TIMERFD) += timerfd.o >> obj-$(CONFIG_EVENTFD) += even

[PATCH 1/4] block: Increase count of supported write-hints

2019-01-09 Thread Kanchan Joshi
This patch bumps up write-hint count to support four new, in-kernel hints. Signed-off-by: Kanchan Joshi --- include/linux/blkdev.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 338604d..df07759 100644 --- a/include/l

[PATCH 3/4] fs: introduce APIs to enable sending write-hint with buffer-head

2019-01-09 Thread Kanchan Joshi
submit_bh and write_dirty_buffer do not take write-hint as parameter. This patch introduces variants which do. Signed-off-by: Kanchan Joshi --- fs/buffer.c | 18 -- include/linux/buffer_head.h | 3 +++ 2 files changed, 19 insertions(+), 2 deletions(-) diff --git

[PATCH 2/4] fs: introduce four macros for in-kernel hints

2019-01-09 Thread Kanchan Joshi
Existing write-hints are exposed to user-mode. There is a possibility of conflict if the kernel happens to use those. This patch introduces four write-hints for exclusive kernel-mode use. Signed-off-by: Kanchan Joshi --- include/linux/fs.h | 5 + 1 file changed, 5 insertions(+) diff --git a/inclu

[PATCH v2 0/4] Write-hint for FS journal

2019-01-09 Thread Kanchan Joshi
Towards supporting write-hints/streams for filesystem journal. Here is the v1 patch for background - https://marc.info/?l=linux-fsdevel&m=15637519020&w=2

[PATCH 4/4] fs/ext4,jbd2: add support for passing write-hint with journal.

2019-01-09 Thread Kanchan Joshi
For NAND based SSDs, mixing of data with different life-time reduces efficiency of internal garbage-collection. During FS operations, series of journal updates will follow/precede series of data/meta updates, causing intermixing inside SSD. By passing a write-hint with journal, its write can be iso

Re: [PATCH] block: doc: add slice_idle_us to bfq documentation

2019-01-09 Thread John Pittman
Thanks; noted. On Wed, Jan 9, 2019 at 9:39 AM Jens Axboe wrote: > > On 1/8/19 2:56 PM, John Pittman wrote: > > Of the tunables available for the bfq I/O scheduler, > > the only one missing from the documentation in > > 'Documentation/block/bfq-iosched.txt' is slice_idle_us. > > Add this tunable

Re: [PATCH] block: doc: add slice_idle_us to bfq documentation

2019-01-09 Thread Jens Axboe
On 1/8/19 2:56 PM, John Pittman wrote: > Of the tunables available for the bfq I/O scheduler, > the only one missing from the documentation in > 'Documentation/block/bfq-iosched.txt' is slice_idle_us. > Add this tunable to the documentation and a short > explanation of its purpose. Applied, but I

Re: [PATCH 13/16] io_uring: add support for pre-mapped user IO buffers

2019-01-09 Thread Christoph Hellwig
> +static int io_setup_rw(int rw, struct io_kiocb *kiocb, > +const struct io_uring_iocb *iocb, struct iovec **iovec, > +struct iov_iter *iter, bool kaddr) > { > void __user *buf = (void __user *)(uintptr_t)iocb->addr; > size_t ret; > > - re

Re: [PATCH 11/16] io_uring: batch io_kiocb allocation

2019-01-09 Thread Christoph Hellwig
> + if (!state) > + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); Just return an error here if kmem_cache_alloc fails. > + if (req) > + io_req_init(ctx, req); Because all the other ones can't reached this with a NULL req.

Re: [PATCH 10/16] io_uring: split kiocb init from allocation

2019-01-09 Thread Christoph Hellwig
On Tue, Jan 08, 2019 at 09:56:39AM -0700, Jens Axboe wrote: > In preparation for having pre-allocated requests, that we then just > need to initialize before use. > > Signed-off-by: Jens Axboe > --- > fs/io_uring.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff

Re: [PATCH 06/16] io_uring: support for IO polling

2019-01-09 Thread Christoph Hellwig
On Tue, Jan 08, 2019 at 09:56:35AM -0700, Jens Axboe wrote: > Add polled variants of the read and write commands. These act like their > non-polled counterparts, except we expect to poll for completion of > them. These aren't really new command variants, but a different type of context. >

Re: [PATCH 05/16] Add io_uring IO interface

2019-01-09 Thread Christoph Hellwig
> index 293733f61594..9ef9987b4192 100644 > --- a/fs/Makefile > +++ b/fs/Makefile > @@ -29,7 +29,7 @@ obj-$(CONFIG_SIGNALFD) += signalfd.o > obj-$(CONFIG_TIMERFD)+= timerfd.o > obj-$(CONFIG_EVENTFD)+= eventfd.o > obj-$(CONFIG_USERFAULTFD)+= userfa

Re: [PATCH blktests 02/14] common: Introduce _test_dev_is_zoned() helper function

2019-01-09 Thread Johannes Thumshirn
On 09/01/2019 02:35, Damien Le Moal wrote: > From: Shin'ichiro Kawasaki > +_test_dev_is_zoned() { > + local zoned_file="${TEST_DEV_SYSFS}/queue/zoned" > + if grep -q -e "none" "${zoned_file}" ; then Nit: I think we can leave the zoned_file variable out if grep -qe "none" "${TEST_D