Re: [PATCHSET v11] io_uring IO interface

2019-02-01 Thread Bart Van Assche
On Fri, 2019-02-01 at 08:23 -0700, Jens Axboe wrote: > Here's v11 of the io_uring project. The main fix in this release is a > rework of how we grab the ctx->uring_lock, never using trylock for it in > a user visible way. Outside of that, fixes around locking for the polled > list when we hit -EAGAIN

Re: Question on handling managed IRQs when hotplugging CPUs

2019-02-01 Thread Thomas Gleixner
On Fri, 1 Feb 2019, Hannes Reinecke wrote: > Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some > means of quiescing the CPU before hotplug) then the whole thing is trivial; > disable SQ and wait for all outstanding commands to complete. > Then trivially all requests are c

[PATCH v4 11/16] block: sed-opal: ioctl for writing to shadow mbr

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Allow modification of the shadow mbr. If the shadow mbr is not marked as done, this data will be presented read only as the device content. Only after marking the shadow mbr as done and unlocking a locking range the actual content is accessible. Co-authored-by: David Kozub

[PATCH v4 13/16] block: sed-opal: check size of shadow mbr

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Check whether the shadow mbr fits in the provided space on the target. Although a proper firmware should handle this case and return an error, this check may prevent problems or even damage with crappy firmware. Signed-off-by: Jonas Rabenstein Reviewed-by: Scott Bauer --- block

[PATCH v4 10/16] block: sed-opal: add ioctl for done-mark of shadow mbr

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Enable users to mark the shadow mbr as done without completely deactivating the shadow mbr feature. This may be useful on reboots, when the power to the disk is not disconnected in between and the shadow mbr stores the required boot files. Of course, this saves also the (fe

[PATCH v4 03/16] block: sed-opal: unify space check in add_token_*

2019-02-01 Thread David Kozub
From: Jonas Rabenstein All add_token_* functions have a common set of conditions that have to be checked. Use a common function for those checks in order to avoid different behaviour as well as code duplication. Co-authored-by: David Kozub Signed-off-by: Jonas Rabenstein Signed-off-by: David K

[PATCH v4 04/16] block: sed-opal: close parameter list in cmd_finalize

2019-02-01 Thread David Kozub
Every step ends by calling cmd_finalize (via finalize_and_send) yet every step adds the token OPAL_ENDLIST on its own. Moving this into cmd_finalize decreases code duplication. Co-authored-by: Jonas Rabenstein Signed-off-by: David Kozub Signed-off-by: Jonas Rabenstein Reviewed-by: Scott Bauer

[PATCH v4 02/16] block: sed-opal: use correct macro for method length

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Although the values of OPAL_UID_LENGTH and OPAL_METHOD_LENGTH are the same, it is odd to use OPAL_UID_LENGTH for the definition of the methods. Signed-off-by: Jonas Rabenstein Reviewed-by: Scott Bauer --- block/sed-opal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[PATCH v4 09/16] block: sed-opal: split generation of bytestring header and content

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Split the header generation from the (normal) memcpy part if a bytestring is copied into the command buffer. This allows in-place generation of the bytestring content. For example, copy_from_user may be used without an intermediate buffer. Signed-off-by: Jonas Rabenstein

[PATCH v4 05/16] block: sed-opal: unify cmd start

2019-02-01 Thread David Kozub
Every step starts with resetting the cmd buffer as well as the comid and constructs the appropriate OPAL_CALL command. Consequently, those actions may be combined into one generic function. One should take care that the opening and closing tokens for the argument list are already emitted by cmd_star

[PATCH v4 08/16] block: sed-opal: print failed function address

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Add function address (and if available its symbol) to the message if a step function fails. Signed-off-by: Jonas Rabenstein Reviewed-by: Scott Bauer --- block/sed-opal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/sed-opal.c b/block/sed

[PATCH v4 15/16] block: sed-opal: don't repeat opal_discovery0 in each steps array

2019-02-01 Thread David Kozub
Originally each of the opal functions that call next include opal_discovery0 in the array of steps. This is superfluous and can be done always inside next. Signed-off-by: David Kozub Reviewed-by: Scott Bauer --- block/sed-opal.c | 89 +++- 1 file chan

[PATCH v4 12/16] block: sed-opal: unify retrieval of table columns

2019-02-01 Thread David Kozub
From: Jonas Rabenstein Instead of having multiple places defining the same argument list to get a specific column of a sed-opal table, provide a generic version and call it from those functions. Signed-off-by: Jonas Rabenstein Reviewed-by: Scott Bauer --- block/opal_proto.h | 2 + block/sed

[PATCH v4 14/16] block: sed-opal: pass steps via argument rather than via opal_dev

2019-02-01 Thread David Kozub
The steps argument is only read by the next function, so it can be passed directly as an argument rather than via opal_dev. Normally, steps is an array on the stack, so the pointer stops being valid when the function that set opal_dev.steps returns. If opal_dev.steps was not set to NULL before

[PATCH v4 07/16] block: sed-opal: reuse response_get_token to decrease code duplication

2019-02-01 Thread David Kozub
Although response_get_token was already in place, its functionality was duplicated within response_get_{u64,bytestring} with the same error handling. Unify the handling by reusing response_get_token within the other functions. Co-authored-by: Jonas Rabenstein Signed-off-by: David Kozub Signed-o

[PATCH v4 06/16] block: sed-opal: unify error handling of responses

2019-02-01 Thread David Kozub
response_get_{string,u64} include error handling for argument resp being NULL but response_get_token does not handle this. Make all three of response_get_{string,u64,token} handle NULL resp in the same way. Co-authored-by: Jonas Rabenstein Signed-off-by: David Kozub Signed-off-by: Jonas Rabenst

[PATCH v4 01/16] block: sed-opal: fix typos and formatting

2019-02-01 Thread David Kozub
This should make no change in functionality. The formatting changes were triggered by checkpatch.pl. Signed-off-by: David Kozub Reviewed-by: Scott Bauer --- block/sed-opal.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/block/sed-opal.c b/block/sed-opa

[PATCH v4 00/16] block: sed-opal: support shadow MBR done flag and write

2019-02-01 Thread David Kozub
This patch series extends SED OPAL support: it adds IOCTL for setting the shadow MBR done flag which can be useful for unlocking an OPAL disk on boot and it adds IOCTL for writing to the shadow MBR. Also included are some minor fixes and improvements. This series is based on the original work done

Re: [PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Florian Weimer
* Jens Axboe: > +/* > + * Filled with the offset for mmap(2) > + */ > +struct io_sqring_offsets { > + __u32 head; > + __u32 tail; > + __u32 ring_mask; > + __u32 ring_entries; > + __u32 flags; > + __u32 dropped; > + __u32 array; > + __u32 resv[3]; > +}; > + > +struct

Re: [RESEND PATCH 1/2] loop: Report EOPNOTSUPP properly

2019-02-01 Thread Evan Green
On Thu, Jan 31, 2019 at 3:31 PM Bart Van Assche wrote: > > On Thu, 2019-01-31 at 14:13 -0800, Evan Green wrote: > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index cf5538942834..a1ba555e3b92 100644 > > --- a/drivers/block/loop.c > > +++ b/drivers/block/loop.c > > @@ -458,8 +458,

Re: [PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Al Viro
On Fri, Feb 01, 2019 at 06:23:27PM +0100, Jann Horn wrote: > > Oh, yuck. Uuuh... can we make "struct files_struct" doubly-refcounted, > > like "struct mm_struct"? One reference type to keep the contents > > intact (the reference type you normally use, and the type used by > > uring when the thread

Re: Recent removal of bsg read/write support

2019-02-01 Thread Douglas Gilbert
Updated reply, see below. On 2018-09-03 4:34 a.m., Dror Levin wrote: On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds wrote: On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger wrote: CC'ing relevant people. Otherwise your mail might get lost. Indeed. Sorry for that. On Sun, Sep 2, 2018 a

Re: [PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Jann Horn
On Fri, Feb 1, 2019 at 6:04 PM Jann Horn wrote: > > On Fri, Feb 1, 2019 at 5:57 PM Matt Mullins wrote: > > On Tue, 2019-01-29 at 00:59 +0100, Jann Horn wrote: > > > On Tue, Jan 29, 2019 at 12:47 AM Jens Axboe wrote: > > > > On 1/28/19 3:32 PM, Jann Horn wrote: > > > > > On Mon, Jan 28, 2019 at 1

Re: [PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Jann Horn
On Fri, Feb 1, 2019 at 5:57 PM Matt Mullins wrote: > On Tue, 2019-01-29 at 00:59 +0100, Jann Horn wrote: > > On Tue, Jan 29, 2019 at 12:47 AM Jens Axboe wrote: > > > On 1/28/19 3:32 PM, Jann Horn wrote: > > > > On Mon, Jan 28, 2019 at 10:35 PM Jens Axboe wrote: > > > > > The submission queue (SQ

Re: [PATCH 04/18] iomap: wire up the iopoll method

2019-02-01 Thread Bart Van Assche
On Fri, 2019-02-01 at 08:24 -0700, Jens Axboe wrote: > +int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) > +{ > + struct request_queue *q = READ_ONCE(kiocb->private); > + > + if (!q) > + return 0; > + return blk_poll(q, READ_ONCE(kiocb->ki_cookie), spin); > +} > +EXPORT_

Re: [PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Matt Mullins
On Tue, 2019-01-29 at 00:59 +0100, Jann Horn wrote: > On Tue, Jan 29, 2019 at 12:47 AM Jens Axboe wrote: > > On 1/28/19 3:32 PM, Jann Horn wrote: > > > On Mon, Jan 28, 2019 at 10:35 PM Jens Axboe wrote: > > > > The submission queue (SQ) and completion queue (CQ) rings are shared > > > > between t

Re: [dm-devel] block: Fix a WRITE SAME BUG_ON

2019-02-01 Thread Christoph Hellwig
On Fri, Feb 01, 2019 at 05:03:40PM +0100, Heinz Mauelshagen wrote: > On 2/1/19 3:09 PM, John Dorminy wrote: > > I didn't know such a thing existed... does it work on any block > > device? Where do I read more about this? > > > Use sg_write_same(8) from package sg3_utils. > > For instance 'sg_wri

Re: [dm-devel] block: Fix a WRITE SAME BUG_ON

2019-02-01 Thread Heinz Mauelshagen
On 2/1/19 3:09 PM, John Dorminy wrote: I didn't know such a thing existed... does it work on any block device? Where do I read more about this? Use sg_write_same(8) from package sg3_utils. For instance 'sg_write_same --in=foobarfile --lba=0 --num=2 --xferlen=512 /dev/sdwhatever' will r

Re: Question on handling managed IRQs when hotplugging CPUs

2019-02-01 Thread Hannes Reinecke
On 1/31/19 6:48 PM, John Garry wrote: On 30/01/2019 12:43, Thomas Gleixner wrote: On Wed, 30 Jan 2019, John Garry wrote: On 29/01/2019 17:20, Keith Busch wrote: On Tue, Jan 29, 2019 at 05:12:40PM +, John Garry wrote: On 29/01/2019 15:44, Keith Busch wrote: Hm, we used to freeze the queu

Re: [PATCH 0/2] small optimization for accessing queue map

2019-02-01 Thread Jens Axboe
On 1/24/19 3:25 AM, Jianchao Wang wrote: > Hi Jens > > These two patches are small optimization for accessing the queue mapping > in hot path. It saves the queue mapping results into blk_mq_ctx directly, > then we needn't do the complicated bounce on queue_hw_ctx[] map[] and > mq_map[]. Doing som

[PATCH 08/18] fs: add fget_many() and fput_many()

2019-02-01 Thread Jens Axboe
Some use cases repeatedly get and put references to the same file, but the only exposed interface is doing these one at a time. As each of these entails an atomic inc or dec on a shared structure, that cost can add up. Add fget_many(), which works just like fget(), except it takes an argument fo
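
As a rough userspace illustration of the batching idea in the message above (not the kernel code from the patch), the sketch below takes several references with one atomic add and returns them with one atomic subtract. The names toy_file, toy_get_many, and toy_put_many are hypothetical stand-ins and do not mirror the signatures of fget_many() or fput_many().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a reference-counted object; not struct file. */
struct toy_file {
    atomic_long count;
};

/* Take `refs` references with a single atomic add instead of `refs` adds. */
static void toy_get_many(struct toy_file *f, long refs)
{
    assert(refs > 0);
    atomic_fetch_add(&f->count, refs);
}

/*
 * Drop `refs` references with a single atomic subtract.
 * Returns 1 if this call released the last reference, 0 otherwise.
 */
static int toy_put_many(struct toy_file *f, long refs)
{
    long old;

    assert(refs > 0);
    old = atomic_fetch_sub(&f->count, refs);
    assert(old >= refs);    /* dropping more than we hold is a bug */
    return old == refs;
}
```

A submitter that expects to issue eight requests against the same object could call toy_get_many(f, 8) once up front and hand back whatever it did not use with a single toy_put_many() call.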

[PATCH 16/18] io_uring: add support for IORING_OP_POLL

2019-02-01 Thread Jens Axboe
This is basically a direct port of bfe4037e722e, which implements a one-shot poll command through aio. Description below is based on that commit as well. However, instead of adding a POLL command and relying on io_cancel(2) to remove it, we mimic the epoll(2) interface of having a command to add a

[PATCH 02/18] block: wire up block device iopoll method

2019-02-01 Thread Jens Axboe
From: Christoph Hellwig Just call blk_poll on the iocb cookie, we can derive the block device from the inode trivially. Reviewed-by: Johannes Thumshirn Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/block_dev.c | 10 ++ 1 file changed, 10 insertions(+) diff --git

[PATCH 17/18] io_uring: allow workqueue item to handle multiple buffered requests

2019-02-01 Thread Jens Axboe
Right now we punt any buffered request that ends up triggering an -EAGAIN to an async workqueue. This works fine in terms of providing async execution of them, but it also can create quite a lot of work queue items. For sequentially buffered IO, it's advantageous to serialize the issue of them. For

[PATCH 07/18] io_uring: support for IO polling

2019-02-01 Thread Jens Axboe
Add support for a polled io_uring context. When a read or write is submitted to a polled context, the application must poll for completions on the CQ ring through io_uring_enter(2). Polled IO may not generate IRQ completions, hence they need to be actively found by the application itself. To use p

[PATCHSET v11] io_uring IO interface

2019-02-01 Thread Jens Axboe
Here's v11 of the io_uring project. The main fix in this release is a rework of how we grab the ctx->uring_lock, never using trylock for it in a user visible way. Outside of that, fixes around locking for the polled list when we hit -EAGAIN conditions on IO submit. This fixes list corruption issues w

[PATCH 03/18] block: add bio_set_polled() helper

2019-02-01 Thread Jens Axboe
For the upcoming async polled IO, we can't sleep allocating requests. If we do, then we introduce a deadlock where the submitter already has async polled IO in-flight, but can't wait for them to complete since polled requests must be actively found and reaped. Utilize the helper in the blockdev DIRE

[PATCH 05/18] Add io_uring IO interface

2019-02-01 Thread Jens Axboe
The submission queue (SQ) and completion queue (CQ) rings are shared between the application and the kernel. This eliminates the need to copy data back and forth to submit and complete IO. IO submissions use the io_uring_sqe data structure, and completions are generated in the form of io_uring_sqe
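
The ring idea in the message above can be sketched in plain C. This is a single-threaded toy, not the io_uring layout: the field names head, tail, ring_mask, and ring_entries are borrowed from the struct quoted in the replies on this list, while toy_ring, its fixed slot array, and the helper functions are assumptions made for illustration. A ring that is really shared between an application and the kernel would also need memory barriers around the index updates.

```c
#include <stdint.h>

#define TOY_RING_ENTRIES 8u    /* must be a power of two */

/* Hypothetical ring; indices are free-running and masked on access. */
struct toy_ring {
    uint32_t head;             /* next slot the consumer reads */
    uint32_t tail;             /* next slot the producer writes */
    uint32_t ring_mask;        /* ring_entries - 1 */
    uint32_t ring_entries;
    uint64_t slots[TOY_RING_ENTRIES];
};

static void toy_ring_init(struct toy_ring *r)
{
    r->head = 0;
    r->tail = 0;
    r->ring_entries = TOY_RING_ENTRIES;
    r->ring_mask = TOY_RING_ENTRIES - 1;
}

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int toy_ring_push(struct toy_ring *r, uint64_t value)
{
    if (r->tail - r->head == r->ring_entries)
        return -1;
    r->slots[r->tail & r->ring_mask] = value;
    r->tail++;                 /* publish the entry by advancing tail */
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
static int toy_ring_pop(struct toy_ring *r, uint64_t *out)
{
    if (r->head == r->tail)
        return -1;
    *out = r->slots[r->head & r->ring_mask];
    r->head++;                 /* release the slot by advancing head */
    return 0;
}
```

Because only the producer writes tail and only the consumer writes head, each side can make progress by reading the other's index, which is what lets entries be exchanged without copying through a system call argument.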

[PATCH 04/18] iomap: wire up the iopoll method

2019-02-01 Thread Jens Axboe
From: Christoph Hellwig Store the request queue the last bio was submitted to in the iocb private data in addition to the cookie so that we find the right block device. Also refactor the common direct I/O bio submission code into a nice little helper. Signed-off-by: Christoph Hellwig Modified

[PATCH 01/18] fs: add an iopoll method to struct file_operations

2019-02-01 Thread Jens Axboe
From: Christoph Hellwig This new method is used to explicitly poll for I/O completion for an iocb. It must be called for any iocb submitted asynchronously (that is with a non-null ki_complete) which has the IOCB_HIPRI flag set. The method is assisted by a new ki_cookie field in struct iocb to

[PATCH 06/18] io_uring: add fsync support

2019-02-01 Thread Jens Axboe
From: Christoph Hellwig Add a new fsync opcode, which either syncs a range if one is passed, or the whole file if the offset and length fields are both cleared to zero. A flag is provided to use fdatasync semantics, that is only force out metadata which is required to retrieve the file data, but
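
The range-versus-whole-file rule in the message above can be written down as a small decision function. This is only a sketch of the rule as stated in the snippet: toy_sync_plan, toy_plan_fsync, and the TOY_FSYNC_DATASYNC bit are hypothetical names with a placeholder value, and the behaviour for a nonzero offset combined with a zero length is an assumption here, since the snippet does not cover it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bit; the real flag name and value are not shown above. */
#define TOY_FSYNC_DATASYNC (1u << 0)

/* Hypothetical description of what a sync request should cover. */
struct toy_sync_plan {
    bool whole_file;   /* true when offset and length are both zero */
    bool datasync;     /* true when fdatasync-style semantics are requested */
    uint64_t start;    /* first byte of the range, if !whole_file */
    uint64_t end;      /* one past the last byte of the range, if !whole_file */
};

static struct toy_sync_plan toy_plan_fsync(uint64_t off, uint64_t len,
                                           uint32_t flags)
{
    struct toy_sync_plan plan = { 0 };

    plan.datasync = (flags & TOY_FSYNC_DATASYNC) != 0;
    if (off == 0 && len == 0) {
        plan.whole_file = true;
        return plan;
    }
    /* Assumption: any other combination is treated as an explicit range.
     * Overflow of off + len is not handled in this sketch. */
    plan.start = off;
    plan.end = off + len;
    return plan;
}
```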

[PATCH 13/18] io_uring: add file set registration

2019-02-01 Thread Jens Axboe
We normally have to fget/fput for each IO we do on a file. Even with the batching we do, the cost of the atomic inc/dec of the file usage count adds up. This adds IORING_REGISTER_FILES, and IORING_UNREGISTER_FILES opcodes for the io_uring_register(2) system call. The arguments passed in must be an

[PATCH 11/18] block: implement bio helper to add iter bvec pages to bio

2019-02-01 Thread Jens Axboe
For an ITER_BVEC, we can just iterate the iov and add the pages to the bio directly. This requires that the caller doesn't release the pages on IO completion, we add a BIO_NO_PAGE_REF flag for that. The current two callers of bio_iov_iter_get_pages() are updated to check if they need to release p

[PATCH 12/18] io_uring: add support for pre-mapped user IO buffers

2019-02-01 Thread Jens Axboe
If we have fixed user buffers, we can map them into the kernel when we set up the io_context. That avoids the need to do get_user_pages() for each and every IO. To utilize this feature, the application must call io_uring_register() after having set up an io_uring context, passing in IORING_REGISTER_

[PATCH 14/18] io_uring: add submission polling

2019-02-01 Thread Jens Axboe
This enables an application to do IO, without ever entering the kernel. By using the SQ ring to fill in new sqes and watching for completions on the CQ ring, we can submit and reap IOs without doing a single system call. The kernel side thread will poll for new submissions, and in case of HIPRI/pol

[PATCH 09/18] io_uring: use fget/fput_many() for file references

2019-02-01 Thread Jens Axboe
Add a separate io_submit_state structure, to cache some of the things we need for IO submission. One such example is file reference batching. We get as many references as the number of sqes we are submitting, and drop unused ones if we end up switching files. The assumption here i

[PATCH 15/18] io_uring: add io_kiocb ref count

2019-02-01 Thread Jens Axboe
We'll use this for the POLL implementation. Regular requests will NOT be using references, so initialize it to 0. Any real use of the io_kiocb ref will initialize it to at least 2. Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/io_uring.c | 8 ++-- 1 file changed, 6 inserti

[PATCH 10/18] io_uring: batch io_kiocb allocation

2019-02-01 Thread Jens Axboe
Similarly to how we use the state->ios_left to know how many references to get to a file, we can use it to allocate the io_kiocb's we need in bulk. Signed-off-by: Jens Axboe --- fs/io_uring.c | 45 ++--- 1 file changed, 38 insertions(+), 7 deletions(-) di

[PATCH 18/18] io_uring: add io_uring_event cache hit information

2019-02-01 Thread Jens Axboe
Add hint on whether a read was served out of the page cache, or if it hit media. This is useful for buffered async IO, O_DIRECT reads would never have this set (for obvious reasons). If the read hit page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT set. Signed-off-by: Jens Axboe --- fs/io_ur
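
An application could use such a hint roughly as sketched below, by testing a flag bit on each completion it reaps. The struct toy_cqe and the TOY_CQE_FLAG_CACHEHIT value are placeholders for illustration; the snippet above names IOCQE_FLAG_CACHEHIT and cqe->flags but shows neither the flag's value nor the completion structure's layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder value; the real IOCQE_FLAG_CACHEHIT value is not shown above. */
#define TOY_CQE_FLAG_CACHEHIT (1u << 0)

/* Hypothetical completion entry carrying a result and a flags word. */
struct toy_cqe {
    int32_t res;
    uint32_t flags;
};

/* Count how many successful completions were flagged as page cache hits. */
static size_t toy_count_cache_hits(const struct toy_cqe *cqes, size_t n)
{
    size_t hits = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (cqes[i].res >= 0 && (cqes[i].flags & TOY_CQE_FLAG_CACHEHIT))
            hits++;
    }
    return hits;
}
```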

Re: [PATCH 1/4] block: disk_events: introduce event flags

2019-02-01 Thread Martin Wilck
Hannes, all, On Mon, 2019-01-28 at 14:54 +0100, Martin Wilck wrote: > On Sat, 2019-01-26 at 11:09 +0100, Hannes Reinecke wrote: > > On 1/18/19 10:32 PM, Martin Wilck wrote: > > > Currently, an empty disk->events field tells the block layer not > > > to > > > forward > > > media change events to us

Re: remove exofs, the T10 OSD code and block/scsi bidi support V4

2019-02-01 Thread Jens Axboe
On 2/1/19 12:55 AM, Christoph Hellwig wrote: > The only real user of the T10 OSD protocol, the pNFS object layout > driver never went to the point of having shipping products, and we > removed it 1.5 years ago. Exofs is just a simple example without > real life users. > > The code has been mostly

Re: block: Fix a WRITE SAME BUG_ON

2019-02-01 Thread John Dorminy
I didn't know such a thing existed... does it work on any block device? Where do I read more about this? On Fri, Feb 1, 2019 at 2:35 AM Christoph Hellwig wrote: > > On Thu, Jan 31, 2019 at 02:41:52PM -0500, John Dorminy wrote: > > > On Wed, Jan 30, 2019 at 09:08:50AM -0500, John Dorminy wrote: >

Re: general protection fault in debugfs_create_files

2019-02-01 Thread Kees Cook
On Thu, Jan 31, 2019 at 7:53 AM syzbot wrote: > > Hello, > > syzbot found the following crash on: > > HEAD commit:02495e76ded5 Add linux-next specific files for 20190130 > git tree: linux-next > console output: https://syzkaller.appspot.com/x/log.txt?x=172ed528c0 > kernel config: ht

RE: [PATCH] lightnvm: pblk: fix bio leak on large sized io

2019-02-01 Thread Chansol Kim
On 01/31/19 22:14 PM, Matias Bjørling wrote: > On 1/30/19 2:53 AM, 김찬솔 wrote: >> >> Changes: >> 1. Function pblk_rw_io to get bio* as a reference >> 2. In pblk_rw_io bio_put call on read case removed >> >> A fix to address issue where >> 1. pblk_make_rq calls pblk_rw_io passes bio* pointe

Re: [PATCH 0/5 v6] Fix virtio-blk issue with SWIOTLB

2019-02-01 Thread Christoph Hellwig
For some reason patch 5 didn't make it to my inbox, but assuming nothing has changed this whole series looks good to me now.