Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-31 Thread Jens Axboe
On 01/31/2017 01:55 PM, Bart Van Assche wrote:
> On Tue, 2017-01-31 at 13:34 -0800, Bart Van Assche wrote:
>> On Mon, 2017-01-30 at 17:38 -0800, Jens Axboe wrote:
>>> That's a known bug in mainline. Pull it into 4.10-rc6,
>>> or use my for-next where everything is already merged. 
>>
>> Hello Jens,
>>
>> With your for-next branch (commit c2e60b3a2602) I haven't hit any block
>> layer crashes so far. The only issue I encountered that is new is a
>> memory leak triggered by the SG-IO code. These memory leak reports
>> started to appear after I started testing the mq-deadline scheduler.
>> kmemleak reported the following call stack multiple times after my tests
>> had finished:
>>
>> unreferenced object 0x88041119e528 (size 192):
>>   comm "multipathd", pid 2353, jiffies 4295128020 (age 1332.440s)
>>   hex dump (first 32 bytes):
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>> 00 00 00 00 00 00 00 00 12 01 00 00 00 00 00 00  
>>   backtrace:
>> [] kmemleak_alloc+0x45/0xa0
>> [] __kmalloc+0x15d/0x2f0
>> [] bio_alloc_bioset+0x185/0x1f0
>> [] bio_map_user_iov+0x124/0x400
>> [] blk_rq_map_user_iov+0x11a/0x210
>> [] blk_rq_map_user+0x4d/0x60
>> [] sg_io+0x3d4/0x410
>> [] scsi_cmd_ioctl+0x300/0x490
>> [] scsi_cmd_blk_ioctl+0x3d/0x50
>> [] sd_ioctl+0x80/0x100
>> [] blkdev_ioctl+0x51e/0x9f0
>> [] block_ioctl+0x38/0x40
>> [] do_vfs_ioctl+0x8f/0x700
>> [] SyS_ioctl+0x3c/0x70
>> [] entry_SYSCALL_64_fastpath+0x18/0xad
> 
> After I repeated my test the above findings were confirmed: no memory leaks
> were reported by kmemleak after a test with I/O scheduler "none" and the
> above call stack was reported 44 times by kmemleak after a test with I/O
> scheduler "mq-deadline".

Interesting, I'll check this. Doesn't make any sense why the scheduler
would be implicated in that, given how we run completions now. But if
it complains, then something must be up.

-- 
Jens Axboe

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-31 Thread Bart Van Assche
On Tue, 2017-01-31 at 13:34 -0800, Bart Van Assche wrote:
> On Mon, 2017-01-30 at 17:38 -0800, Jens Axboe wrote:
> > That's a known bug in mainline. Pull it into 4.10-rc6,
> > or use my for-next where everything is already merged. 
> 
> Hello Jens,
> 
> With your for-next branch (commit c2e60b3a2602) I haven't hit any block
> layer crashes so far. The only issue I encountered that is new is a
> memory leak triggered by the SG-IO code. These memory leak reports
> started to appear after I started testing the mq-deadline scheduler.
> kmemleak reported the following call stack multiple times after my tests
> had finished:
> 
> unreferenced object 0x88041119e528 (size 192):
>   comm "multipathd", pid 2353, jiffies 4295128020 (age 1332.440s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 12 01 00 00 00 00 00 00  
>   backtrace:
> [] kmemleak_alloc+0x45/0xa0
> [] __kmalloc+0x15d/0x2f0
> [] bio_alloc_bioset+0x185/0x1f0
> [] bio_map_user_iov+0x124/0x400
> [] blk_rq_map_user_iov+0x11a/0x210
> [] blk_rq_map_user+0x4d/0x60
> [] sg_io+0x3d4/0x410
> [] scsi_cmd_ioctl+0x300/0x490
> [] scsi_cmd_blk_ioctl+0x3d/0x50
> [] sd_ioctl+0x80/0x100
> [] blkdev_ioctl+0x51e/0x9f0
> [] block_ioctl+0x38/0x40
> [] do_vfs_ioctl+0x8f/0x700
> [] SyS_ioctl+0x3c/0x70
> [] entry_SYSCALL_64_fastpath+0x18/0xad

After I repeated my test the above findings were confirmed: no memory leaks
were reported by kmemleak after a test with I/O scheduler "none" and the
above call stack was reported 44 times by kmemleak after a test with I/O
scheduler "mq-deadline".

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-31 Thread Bart Van Assche
On Mon, 2017-01-30 at 17:38 -0800, Jens Axboe wrote:
> That's a known bug in mainline. Pull it into 4.10-rc6,
> or use my for-next where everything is already merged. 

Hello Jens,

With your for-next branch (commit c2e60b3a2602) I haven't hit any block
layer crashes so far. The only issue I encountered that is new is a
memory leak triggered by the SG-IO code. These memory leak reports
started to appear after I started testing the mq-deadline scheduler.
kmemleak reported the following call stack multiple times after my tests
had finished:

unreferenced object 0x88041119e528 (size 192):
  comm "multipathd", pid 2353, jiffies 4295128020 (age 1332.440s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 12 01 00 00 00 00 00 00  
  backtrace:
[] kmemleak_alloc+0x45/0xa0
[] __kmalloc+0x15d/0x2f0
[] bio_alloc_bioset+0x185/0x1f0
[] bio_map_user_iov+0x124/0x400
[] blk_rq_map_user_iov+0x11a/0x210
[] blk_rq_map_user+0x4d/0x60
[] sg_io+0x3d4/0x410
[] scsi_cmd_ioctl+0x300/0x490
[] scsi_cmd_blk_ioctl+0x3d/0x50
[] sd_ioctl+0x80/0x100
[] blkdev_ioctl+0x51e/0x9f0
[] block_ioctl+0x38/0x40
[] do_vfs_ioctl+0x8f/0x700
[] SyS_ioctl+0x3c/0x70
[] entry_SYSCALL_64_fastpath+0x18/0xad
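
The leaked allocation in the backtrace above sits below sg_io() and
blk_rq_map_user(), which userspace reaches through the SG_IO ioctl. The
thread does not say which command multipathd issued, so the following is
only a minimal sketch of such a call, assuming a standard 6-byte INQUIRY
chosen as a harmless example. The helper names build_inquiry() and
send_inquiry() are hypothetical; sg_io_hdr_t and SG_IO come from
<scsi/sg.h>.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE  0x12   /* SCSI INQUIRY operation code */
#define INQUIRY_CDB_LEN 6      /* INQUIRY uses a 6-byte CDB */

/*
 * Fill in an SG_IO header for an INQUIRY that reads buf_len bytes from
 * the device into buf. Nothing is sent to the device here.
 */
static void build_inquiry(sg_io_hdr_t *hdr,
                          unsigned char cdb[INQUIRY_CDB_LEN],
                          unsigned char *buf, unsigned char buf_len,
                          unsigned char *sense, unsigned char sense_len)
{
    assert(hdr != NULL && cdb != NULL);

    memset(cdb, 0, INQUIRY_CDB_LEN);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = buf_len;                 /* allocation length (low byte) */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';          /* required by the SG_IO interface */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len = INQUIRY_CDB_LEN;
    hdr->cmdp = cdb;
    hdr->dxferp = buf;                /* user buffer the kernel maps for the request */
    hdr->dxfer_len = buf_len;
    hdr->sbp = sense;
    hdr->mx_sb_len = sense_len;
    hdr->timeout = 5000;              /* milliseconds */
}

/* Submit the prepared header; fd is an open SCSI block or sg device. */
static int send_inquiry(int fd, sg_io_hdr_t *hdr)
{
    return ioctl(fd, SG_IO, hdr);
}
```

A caller would open the device node (for example /dev/sda, as an
assumption), call build_inquiry() and then send_inquiry(), and check both
the ioctl return value and hdr.status afterwards.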

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-30 Thread Jens Axboe
On 01/30/2017 05:38 PM, Jens Axboe wrote:
> 
> 
>> On Jan 30, 2017, at 5:12 PM, Bart Van Assche  
>> wrote:
>>
>>> On Fri, 2017-01-27 at 09:56 -0700, Jens Axboe wrote:
>>>> On 01/27/2017 09:52 AM, Bart Van Assche wrote:
>>>> [  215.724452] general protection fault:  [#1] SMP
>>>> [  215.725060] Call Trace:
>>>> [  215.725086]  scsi_disk_put+0x2d/0x40
>>>> [  215.725110]  sd_release+0x3d/0xb0
>>>> [  215.725137]  __blkdev_put+0x29e/0x360
>>>> [  215.725163]  blkdev_put+0x49/0x170
>>>> [  215.725192]  dm_put_table_device+0x58/0xc0 [dm_mod]
>>>> [  215.725219]  dm_put_device+0x70/0xc0 [dm_mod]
>>>> [  215.725269]  free_priority_group+0x92/0xc0 [dm_multipath]
>>>> [  215.725295]  free_multipath+0x70/0xc0 [dm_multipath]
>>>> [  215.725320]  multipath_dtr+0x19/0x20 [dm_multipath]
>>>> [  215.725348]  dm_table_destroy+0x67/0x120 [dm_mod]
>>>> [  215.725379]  dev_suspend+0xde/0x240 [dm_mod]
>>>> [  215.725434]  ctl_ioctl+0x1f5/0x520 [dm_mod]
>>>> [  215.725489]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
>>>> [  215.725515]  do_vfs_ioctl+0x8f/0x700
>>>> [  215.725589]  SyS_ioctl+0x3c/0x70
>>>> [  215.725614]  entry_SYSCALL_64_fastpath+0x18/0xad
>>>>
>>>
>>> I have no idea what this is, I haven't messed with life time or devices
>>> or queues at all in that branch.
>>
>> Hello Jens,
>>
>> Running the srp-test software against kernel 4.9.6 and kernel 4.10-rc5
>> went fine. With your for-4.11/block branch (commit 400f73b23f457a) however
>> I just ran into the following:
>>
>> [  214.27] [ cut here ]
>> [  214.65] WARNING: CPU: 5 PID: 13201 at kernel/locking/lockdep.c:3514 
>> lock_release+0x346/0x480
>> [  214.88] DEBUG_LOCKS_WARN_ON(depth <= 0)
>> [  214.555824] CPU: 5 PID: 13201 Comm: fio Not tainted 4.10.0-rc3-dbg+ #1
>> [  214.555846] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
>> 11/17/2014
>> [  214.555867] Call Trace:
>> [  214.555889]  dump_stack+0x68/0x93
>> [  214.555911]  __warn+0xc6/0xe0
>> [  214.555953]  warn_slowpath_fmt+0x4a/0x50
>> [  214.555973]  lock_release+0x346/0x480
>> [  214.556021]  aio_write+0x106/0x140
>> [  214.556067]  do_io_submit+0x37d/0x900
>> [  214.556108]  SyS_io_submit+0xb/0x10
>> [  214.556131]  entry_SYSCALL_64_fastpath+0x18/0xad
>>
>> I will continue to try to figure out what is causing this behavior.
> 
> That's a known bug in mainline. Pull it into 4.10-rc6,
> or use my for-next where everything is already merged.

Since I'm not on the phone anymore, this is the commit that was
merged after my for-4.11/block was forked, which fixes this issue:

commit a12f1ae61c489076a9aeb90bddca7722bf330df3
Author: Shaohua Li 
Date:   Tue Dec 13 12:09:56 2016 -0800

aio: fix lock dep warning

So you can just pull that in, if you want, or do what I suggested above.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-30 Thread Jens Axboe


> On Jan 30, 2017, at 5:12 PM, Bart Van Assche  
> wrote:
> 
>> On Fri, 2017-01-27 at 09:56 -0700, Jens Axboe wrote:
>>> On 01/27/2017 09:52 AM, Bart Van Assche wrote:
>>> [  215.724452] general protection fault:  [#1] SMP
>>> [  215.725060] Call Trace:
>>> [  215.725086]  scsi_disk_put+0x2d/0x40
>>> [  215.725110]  sd_release+0x3d/0xb0
>>> [  215.725137]  __blkdev_put+0x29e/0x360
>>> [  215.725163]  blkdev_put+0x49/0x170
>>> [  215.725192]  dm_put_table_device+0x58/0xc0 [dm_mod]
>>> [  215.725219]  dm_put_device+0x70/0xc0 [dm_mod]
>>> [  215.725269]  free_priority_group+0x92/0xc0 [dm_multipath]
>>> [  215.725295]  free_multipath+0x70/0xc0 [dm_multipath]
>>> [  215.725320]  multipath_dtr+0x19/0x20 [dm_multipath]
>>> [  215.725348]  dm_table_destroy+0x67/0x120 [dm_mod]
>>> [  215.725379]  dev_suspend+0xde/0x240 [dm_mod]
>>> [  215.725434]  ctl_ioctl+0x1f5/0x520 [dm_mod]
>>> [  215.725489]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
>>> [  215.725515]  do_vfs_ioctl+0x8f/0x700
>>> [  215.725589]  SyS_ioctl+0x3c/0x70
>>> [  215.725614]  entry_SYSCALL_64_fastpath+0x18/0xad
>>> 
>> 
>> I have no idea what this is, I haven't messed with life time or devices
>> or queues at all in that branch.
> 
> Hello Jens,
> 
> Running the srp-test software against kernel 4.9.6 and kernel 4.10-rc5
> went fine. With your for-4.11/block branch (commit 400f73b23f457a) however
> I just ran into the following:
> 
> [  214.27] [ cut here ]
> [  214.65] WARNING: CPU: 5 PID: 13201 at kernel/locking/lockdep.c:3514 
> lock_release+0x346/0x480
> [  214.88] DEBUG_LOCKS_WARN_ON(depth <= 0)
> [  214.555824] CPU: 5 PID: 13201 Comm: fio Not tainted 4.10.0-rc3-dbg+ #1
> [  214.555846] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
> 11/17/2014
> [  214.555867] Call Trace:
> [  214.555889]  dump_stack+0x68/0x93
> [  214.555911]  __warn+0xc6/0xe0
> [  214.555953]  warn_slowpath_fmt+0x4a/0x50
> [  214.555973]  lock_release+0x346/0x480
> [  214.556021]  aio_write+0x106/0x140
> [  214.556067]  do_io_submit+0x37d/0x900
> [  214.556108]  SyS_io_submit+0xb/0x10
> [  214.556131]  entry_SYSCALL_64_fastpath+0x18/0xad
> 
> I will continue to try to figure out what is causing this behavior.

That's a known bug in mainline. Pull it into 4.10-rc6,
or use my for-next where everything is already merged. 





Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-30 Thread Bart Van Assche
On Fri, 2017-01-27 at 09:56 -0700, Jens Axboe wrote:
> On 01/27/2017 09:52 AM, Bart Van Assche wrote:
> > [  215.724452] general protection fault:  [#1] SMP
> > [  215.725060] Call Trace:
> > [  215.725086]  scsi_disk_put+0x2d/0x40
> > [  215.725110]  sd_release+0x3d/0xb0
> > [  215.725137]  __blkdev_put+0x29e/0x360
> > [  215.725163]  blkdev_put+0x49/0x170
> > [  215.725192]  dm_put_table_device+0x58/0xc0 [dm_mod]
> > [  215.725219]  dm_put_device+0x70/0xc0 [dm_mod]
> > [  215.725269]  free_priority_group+0x92/0xc0 [dm_multipath]
> > [  215.725295]  free_multipath+0x70/0xc0 [dm_multipath]
> > [  215.725320]  multipath_dtr+0x19/0x20 [dm_multipath]
> > [  215.725348]  dm_table_destroy+0x67/0x120 [dm_mod]
> > [  215.725379]  dev_suspend+0xde/0x240 [dm_mod]
> > [  215.725434]  ctl_ioctl+0x1f5/0x520 [dm_mod]
> > [  215.725489]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
> > [  215.725515]  do_vfs_ioctl+0x8f/0x700
> > [  215.725589]  SyS_ioctl+0x3c/0x70
> > [  215.725614]  entry_SYSCALL_64_fastpath+0x18/0xad
> > 
> 
> I have no idea what this is, I haven't messed with life time or devices
> or queues at all in that branch.

Hello Jens,

Running the srp-test software against kernel 4.9.6 and kernel 4.10-rc5
went fine. With your for-4.11/block branch (commit 400f73b23f457a) however
I just ran into the following:

[  214.27] [ cut here ]
[  214.65] WARNING: CPU: 5 PID: 13201 at kernel/locking/lockdep.c:3514 
lock_release+0x346/0x480
[  214.88] DEBUG_LOCKS_WARN_ON(depth <= 0)
[  214.555824] CPU: 5 PID: 13201 Comm: fio Not tainted 4.10.0-rc3-dbg+ #1
[  214.555846] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
11/17/2014
[  214.555867] Call Trace:
[  214.555889]  dump_stack+0x68/0x93
[  214.555911]  __warn+0xc6/0xe0
[  214.555953]  warn_slowpath_fmt+0x4a/0x50
[  214.555973]  lock_release+0x346/0x480
[  214.556021]  aio_write+0x106/0x140
[  214.556067]  do_io_submit+0x37d/0x900
[  214.556108]  SyS_io_submit+0xb/0x10
[  214.556131]  entry_SYSCALL_64_fastpath+0x18/0xad

I will continue to try to figure out what is causing this behavior.

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-29 Thread Hannes Reinecke
On 01/27/2017 10:27 PM, Bart Van Assche wrote:
> On Wed, 2017-01-25 at 18:25 +0100, Christoph Hellwig wrote:
>> this series splits the support for SCSI passthrough commands from the
>> main struct request used all over the block layer into a separate
>> scsi_request structure that drivers that want to support SCSI passthrough
>> need to embed as the first thing in their request-private data,
>> similar to how we handle NVMe passthrough commands.
>>
>> To support this I've added support for the private data after the
>> request structure to the legacy request path as well, so that it can
>> be treated the same way as the blk-mq path.  Compared to the current
>> scsi_cmnd allocator that actually is a major simplification.
>>
>> Changes since V1:
>>  - fix handling of a NULL sense pointer in __scsi_execute
>>  - clean up handling of the flush flags in the block layer and MD
>>  - additional small cleanup in dm-rq
> 
> Hello Christoph,
> 
> A general comment: patch "block: allow specifying size for extra
> command data" is a very welcome improvement but unfortunately also
> introduces an inconsistency among block drivers. This patch series
> namely creates two kinds of block drivers:
> - Block drivers that use the block layer core to allocate
>   request-private data. These block drivers set request.cmd_size
>   to a non-zero value and do not need request.special.
> - Block drivers that allocate request-private data themselves.
>   These block drivers set request.cmd_size to zero and use
>   request.special to translate a request pointer into the private
>   data pointer.
> 
> Have you considered converting all block drivers to the new
> approach and getting rid of request.special? If so, do you already
> have plans to start working on this? I'm wondering whether I
> should start working on this myself.
> 
I was actually looking into it, too.
Once scsi passthrough is removed from struct request there is no
reasonable need to rely on '->special' for anything, and we should just
ditch it.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-28 Thread h...@lst.de
On Fri, Jan 27, 2017 at 09:27:53PM +, Bart Van Assche wrote:
> Have you considered converting all block drivers to the new
> approach and getting rid of request.special? If so, do you already
> have plans to start working on this? I'm wondering whether I
> should start working on this myself.

Hi Bart,

I'd love to have all drivers move off using .special (and thus reduce
the request size further).  I think the general way to do that is to
convert them to blk-mq rather than using the legacy cmd_size field.



Re: [dm-devel] split scsi passthrough fields out of struct request V3

2017-01-28 Thread h...@lst.de
On Fri, Jan 27, 2017 at 06:58:53PM +, Bart Van Assche wrote:
> Version 3 of the patch with title "block: split scsi_request out of
> struct request" (commit 3c30af6ebe12) differs significantly from v2
> of that patch that has been posted on several mailing lists. E.g. v2
> moves __cmd[], cmd and cmd_len from struct request into struct
> scsi_request but v3 does not. Which version do you want us to review?

Hi Bart,

I tried to resend the whole updated v3 series, but the mail server
stopped accepting mails due to overload.  Otherwise it would have
included all the patches.  Jens instead took the updated version
straight from this git branch:


http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-pc-refactor



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Bart Van Assche
On Wed, 2017-01-25 at 18:25 +0100, Christoph Hellwig wrote:
> this series splits the support for SCSI passthrough commands from the
> main struct request used all over the block layer into a separate
> scsi_request structure that drivers that want to support SCSI passthrough
> need to embed as the first thing in their request-private data,
> similar to how we handle NVMe passthrough commands.
> 
> To support this I've added support for the private data after the
> request structure to the legacy request path as well, so that it can
> be treated the same way as the blk-mq path.  Compared to the current
> scsi_cmnd allocator that actually is a major simplification.
> 
> Changes since V1:
>  - fix handling of a NULL sense pointer in __scsi_execute
>  - clean up handling of the flush flags in the block layer and MD
>  - additional small cleanup in dm-rq

Hello Christoph,

A general comment: patch "block: allow specifying size for extra
command data" is a very welcome improvement but unfortunately also
introduces an inconsistency among block drivers. This patch series
namely creates two kinds of block drivers:
- Block drivers that use the block layer core to allocate
  request-private data. These block drivers set request.cmd_size
  to a non-zero value and do not need request.special.
- Block drivers that allocate request-private data themselves.
  These block drivers set request.cmd_size to zero and use
  request.special to translate a request pointer into the private
  data pointer.

Have you considered converting all block drivers to the new
approach and getting rid of request.special? If so, do you already
have plans to start working on this? I'm wondering whether I
should start working on this myself.
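
The two kinds of drivers described in this message differ only in how a
request pointer is turned into a private-data pointer. The following is a
rough userspace sketch of that difference, not kernel code: all toy_*
names are hypothetical, and keeping cmd_size on a toy queue object is an
assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct request; only the field needed here. */
struct toy_request {
    _Alignas(max_align_t) void *special;  /* used only by kind-2 drivers */
};

/* Toy stand-in for the per-queue setting of the private-data size. */
struct toy_queue {
    size_t cmd_size;                      /* 0 means "driver allocates" */
};

/* Core allocation: the request is followed by cmd_size private bytes. */
static struct toy_request *toy_core_alloc(const struct toy_queue *q)
{
    return calloc(1, sizeof(struct toy_request) + q->cmd_size);
}

/* Kind 2: the driver allocates its own data and records it in ->special. */
static int toy_driver_attach(struct toy_request *rq, size_t size)
{
    assert(rq->special == NULL);
    rq->special = calloc(1, size);
    return rq->special ? 0 : -1;
}

/* Translate a request pointer into its private-data pointer. */
static void *toy_request_private(const struct toy_queue *q,
                                 struct toy_request *rq)
{
    if (q->cmd_size)
        return rq + 1;      /* kind 1: data lives right behind the request */
    return rq->special;     /* kind 2: extra pointer, extra allocation */
}

/* Release a request and, for kind 2, the separately allocated data. */
static void toy_free(const struct toy_queue *q, struct toy_request *rq)
{
    if (!q->cmd_size)
        free(rq->special);
    free(rq);
}
```

In the first case the lookup is pure pointer arithmetic and needs no
field in the request, which is why request.special becomes unnecessary
once every driver uses it.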

Thanks,

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V3

2017-01-27 Thread Bart Van Assche
On Fri, 2017-01-27 at 17:34 +0100, Christoph Hellwig wrote:
> this series splits the support for SCSI passthrough commands from the
> main struct request used all over the block layer into a separate
> scsi_request structure that drivers that want to support SCSI passthrough
> need to embed as the first thing in their request-private data,
> similar to how we handle NVMe passthrough commands.
> 
> To support this I've added support for the private data after the
> request structure to the legacy request path as well, so that it can
> be treated the same way as the blk-mq path.  Compared to the current
> scsi_cmnd allocator that actually is a major simplification.
> 
> Changes since V2:
>  - remove req->cmd tracing
>  - minor spelling fixes
> 
> Changes since V1:
>  - fix handling of a NULL sense pointer in __scsi_execute
>  - clean up handling of the flush flags in the block layer and MD
>  - additional small cleanup in dm-rq

Hello Christoph,

Version 3 of the patch with title "block: split scsi_request out of
struct request" (commit 3c30af6ebe12) differs significantly from v2
of that patch that has been posted on several mailing lists. E.g. v2
moves __cmd[], cmd and cmd_len from struct request into struct
scsi_request but v3 does not. Which version do you want us to review?

Thanks,

Bart.



[dm-devel] split scsi passthrough fields out of struct request V3

2017-01-27 Thread Christoph Hellwig
Hi all,

this series splits the support for SCSI passthrough commands from the
main struct request used all over the block layer into a separate
scsi_request structure that drivers that want to support SCSI passthrough
need to embed as the first thing in their request-private data,
similar to how we handle NVMe passthrough commands.

To support this I've added support for the private data after the
request structure to the legacy request path as well, so that it can
be treated the same way as the blk-mq path.  Compared to the current
scsi_cmnd allocator that actually is a major simplification.
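
The "first thing in the request-private data" requirement can be
illustrated with a small userspace sketch. This is not the code from the
series: the struct layouts are toy stand-ins, and toy_driver_cmd,
toy_alloc_request(), toy_rq_to_pdu() and toy_scsi_req() are hypothetical
names. The point it shows is that generic code can find the scsi_request
of any request without knowing the driver's own payload type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct request, NOT the kernel definition. */
struct request {
    _Alignas(max_align_t) unsigned int cmd_flags;
};

/* Toy stand-in for the passthrough fields split out of struct request. */
struct scsi_request {
    unsigned char cmd[16];
    unsigned short cmd_len;
    void *sense;
};

/* Hypothetical driver payload: scsi_request has to be the first member. */
struct toy_driver_cmd {
    struct scsi_request sreq;
    int driver_tag;
};

_Static_assert(offsetof(struct toy_driver_cmd, sreq) == 0,
               "scsi_request must be the first member");

/* Allocate a request followed by cmd_size bytes of private data. */
static struct request *toy_alloc_request(size_t cmd_size)
{
    return calloc(1, sizeof(struct request) + cmd_size);
}

/* The private data starts directly behind the request structure. */
static void *toy_rq_to_pdu(struct request *rq)
{
    assert(rq != NULL);
    return rq + 1;
}

/* Generic code only needs to know that the payload begins with this. */
static struct scsi_request *toy_scsi_req(struct request *rq)
{
    return toy_rq_to_pdu(rq);
}
```

Because the offset of sreq is zero, the same address is valid both as a
struct scsi_request pointer for generic passthrough code and as a
struct toy_driver_cmd pointer for the driver.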

Changes since V2:
 - remove req->cmd tracing
 - minor spelling fixes

Changes since V1:
 - fix handling of a NULL sense pointer in __scsi_execute
 - clean up handling of the flush flags in the block layer and MD
 - additional small cleanup in dm-rq



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Bart Van Assche
On Fri, 2017-01-27 at 09:56 -0700, Jens Axboe wrote:
> I have no idea what this is, I haven't messed with life time or devices
> or queues at all in that branch.

The srp-test software passes with kernel v4.9. Something must have changed.
I'll see whether I can find some time to look further into this.

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Bart Van Assche
On Thu, 2017-01-26 at 18:22 -0700, Jens Axboe wrote:
> What's your boot device? I've been booting this on a variety of setups,
> no problems observed. It's booting my laptop, and on SCSI and SATA as
> well. What is your root drive? What is the queue depth of it?
> Controller?

The boot device in my test setup is a SATA hard disk:

# cat /proc/cmdline  
BOOT_IMAGE=/boot/vmlinuz-4.10.0-rc5-dbg+ 
root=UUID=60a4b064-b3ef-4d28-96d3-3c13ecbec43e resume=/dev/sda2 showopts
# ls -l /dev/disk/by-uuid/60a4b064-b3ef-4d28-96d3-3c13ecbec43e
lrwxrwxrwx 1 root root 10 Jan 27 08:43 
/dev/disk/by-uuid/60a4b064-b3ef-4d28-96d3-3c13ecbec43e -> ../../sda1
# cat /sys/block/sda/queue/nr_requests  
31
# lsscsi | grep sda
[0:0:0:0]    disk    ATA  ST1000NM0033-9ZM GA67  /dev/sda
# hdparm -i /dev/sda

/dev/sda:
 Model=ST1000NM0033-9ZM173, FwRev=GA67, SerialNo=Z1W2HM75
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=off
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=1953525168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4  
 DMA modes:  mdma0 mdma1 mdma2  
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6  
 AdvancedPM=no WriteCache=disabled
 Drive conforms to: Unspecified:  ATA/ATAPI-4,5,6,7

 * signifies the current active mode

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/27/2017 09:42 AM, Christoph Hellwig wrote:
> On Fri, Jan 27, 2017 at 09:38:40AM -0700, Jens Axboe wrote:
>>> Ok, I'll repost what I have right now, which is on top of a merge
>>> of your block/for-4.11/next and your for-next from this morning
>>> my time.
>>
>> Perfect.
> 
> At least I tried, looks like the mail server is overloaded and crapped
> out three mails into it.  For now there is a git tree here:
> 
> http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-pc-refactor

I grabbed it all from there. for-4.11/rq-refactor has been rebased to v3.
Basic testing looks fine here, at least on v2. I'll repeat the same and
then merge it into for-next as well.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/27/2017 09:52 AM, Bart Van Assche wrote:
> On Fri, 2017-01-27 at 01:04 -0700, Jens Axboe wrote:
>> The previous patch had a bug if you didn't use a scheduler, here's a
>> version that should work fine in both cases. I've also updated the
>> above mentioned branch, so feel free to pull that as well and merge to
>> master like before.
> 
> Booting time is back to normal with commit f3a8ab7d55bc merged with
> v4.10-rc5. That's a great improvement. However, running the srp-test
> software now triggers a new complaint:
> 
> [  215.600386] sd 11:0:0:0: [sdh] Attached SCSI disk
> [  215.609485] sd 11:0:0:0: alua: port group 00 state A non-preferred 
> supports TOlUSNA
> [  215.722900] scsi 13:0:0:0: alua: Detached
> [  215.724452] general protection fault:  [#1] SMP
> [  215.724484] Modules linked in: dm_service_time ib_srp scsi_transport_srp 
> target_core_user uio target_core_pscsi target_core_file ib_srpt 
> target_core_iblock target_core_mod brd netconsole xt_CHECKSUM iptable_mangle 
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
> libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
> ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter 
> ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
> af_packet ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm msr configfs 
> ib_cm iw_cm mlx4_ib ib_core sb_edac edac_core x86_pkg_temp_thermal 
> intel_powerclamp ipmi_ssif coretemp kvm_intel hid_generic kvm usbhid 
> irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel mlx4_core 
> ghash_clmulni_intel iTCO_wdt dcdbas pcbc tg3
> [  215.724629]  iTCO_vendor_support ptp aesni_intel pps_core aes_x86_64 
> pcspkr crypto_simd libphy ipmi_si glue_helper cryptd ipmi_devintf tpm_tis 
> devlink fjes ipmi_msghandler tpm_tis_core tpm mei_me lpc_ich mei mfd_core 
> button shpchp wmi mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect 
> sysimgblt fb_sys_fops ttm drm sr_mod cdrom ehci_pci ehci_hcd usbcore 
> usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua 
> autofs4
> [  215.724719] CPU: 9 PID: 8043 Comm: multipathd Not tainted 4.10.0-rc5-dbg+ 
> #1
> [  215.724748] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
> 11/17/2014
> [  215.724775] task: 8801717998c0 task.stack: c90002a9c000
> [  215.724804] RIP: 0010:scsi_device_put+0xb/0x30
> [  215.724829] RSP: 0018:c90002a9faa0 EFLAGS: 00010246
> [  215.724855] RAX: 6b6b6b6b6b6b6b6b RBX: 88038bf85698 RCX: 
> 0006
> [  215.724880] RDX: 0006 RSI: 88017179a108 RDI: 
> 88038bf85698
> [  215.724906] RBP: c90002a9faa8 R08: 880384786008 R09: 
> 000100170007
> [  215.724932] R10:  R11:  R12: 
> 88038bf85698
> [  215.724958] R13: 88038919f090 R14: dead0100 R15: 
> 88038a41dd28
> [  215.724983] FS:  7fbf8c6cf700() GS:88046f44() 
> knlGS:
> [  215.725010] CS:  0010 DS:  ES:  CR0: 80050033
> [  215.725035] CR2: 7f1262ef3ee0 CR3: 00044f6cc000 CR4: 
> 001406e0
> [  215.725060] Call Trace:
> [  215.725086]  scsi_disk_put+0x2d/0x40
> [  215.725110]  sd_release+0x3d/0xb0
> [  215.725137]  __blkdev_put+0x29e/0x360
> [  215.725163]  blkdev_put+0x49/0x170
> [  215.725192]  dm_put_table_device+0x58/0xc0 [dm_mod]
> [  215.725219]  dm_put_device+0x70/0xc0 [dm_mod]
> [  215.725269]  free_priority_group+0x92/0xc0 [dm_multipath]
> [  215.725295]  free_multipath+0x70/0xc0 [dm_multipath]
> [  215.725320]  multipath_dtr+0x19/0x20 [dm_multipath]
> [  215.725348]  dm_table_destroy+0x67/0x120 [dm_mod]
> [  215.725379]  dev_suspend+0xde/0x240 [dm_mod]
> [  215.725434]  ctl_ioctl+0x1f5/0x520 [dm_mod]
> [  215.725489]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
> [  215.725515]  do_vfs_ioctl+0x8f/0x700
> [  215.725589]  SyS_ioctl+0x3c/0x70
> [  215.725614]  entry_SYSCALL_64_fastpath+0x18/0xad
> [  215.725641] RIP: 0033:0x7fbf8aca0667
> [  215.725665] RSP: 002b:7fbf8c6cd668 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [  215.725692] RAX: ffda RBX: 0046 RCX: 
> 7fbf8aca0667
> [  215.725716] RDX: 7fbf8006b940 RSI: c138fd06 RDI: 
> 0007
> [  215.725743] RBP: 0009 R08: 7fbf8c6cb3c0 R09: 
> 7fbf8b68d8d8
> [  215.725768] R10: 0075 R11: 0246 R12: 
> 7fbf8c6cd770
> [  215.725793] R13: 0013 R14: 006168f0 R15: 
> 00f74780
> [  215.725820] Code: bc 24 b8 00 00 00 e8 55 c8 1c 00 48 83 c4 08 48 89 d8 5b 
> 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 00 55 48 89 e5 53 48 8b 07 48 89 fb <48> 
> 8b 80 a8 01 00 00 48 8b 38 e8 f6 68 c5 ff 48 8d bb 38 02 00 
> [  215.725903] RIP: scsi_device_put+0xb/0x30 RSP: c90002a9faa0
> 
> (gdb) list *(scsi_device_put+0xb)
> 0x8149fc2b is in scsi_device_put (drivers/scsi/scsi.c:957).
> 952  * count of the underlying LLDD module.  The device is freed once the 
> las

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Bart Van Assche
On Fri, 2017-01-27 at 01:04 -0700, Jens Axboe wrote:
> The previous patch had a bug if you didn't use a scheduler, here's a
> version that should work fine in both cases. I've also updated the
> above mentioned branch, so feel free to pull that as well and merge to
> master like before.

Booting time is back to normal with commit f3a8ab7d55bc merged with
v4.10-rc5. That's a great improvement. However, running the srp-test
software now triggers a new complaint:

[  215.600386] sd 11:0:0:0: [sdh] Attached SCSI disk
[  215.609485] sd 11:0:0:0: alua: port group 00 state A non-preferred supports 
TOlUSNA
[  215.722900] scsi 13:0:0:0: alua: Detached
[  215.724452] general protection fault:  [#1] SMP
[  215.724484] Modules linked in: dm_service_time ib_srp scsi_transport_srp 
target_core_user uio target_core_pscsi target_core_file ib_srpt 
target_core_iblock target_core_mod brd netconsole xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet ib_ipoib 
rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm msr configfs ib_cm iw_cm mlx4_ib 
ib_core sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp ipmi_ssif 
coretemp kvm_intel hid_generic kvm usbhid irqbypass crct10dif_pclmul 
crc32_pclmul crc32c_intel mlx4_core ghash_clmulni_intel iTCO_wdt dcdbas pcbc tg3
[  215.724629]  iTCO_vendor_support ptp aesni_intel pps_core aes_x86_64 pcspkr 
crypto_simd libphy ipmi_si glue_helper cryptd ipmi_devintf tpm_tis devlink fjes 
ipmi_msghandler tpm_tis_core tpm mei_me lpc_ich mei mfd_core button shpchp wmi 
mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops ttm drm sr_mod cdrom ehci_pci ehci_hcd usbcore usb_common sg 
dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
[  215.724719] CPU: 9 PID: 8043 Comm: multipathd Not tainted 4.10.0-rc5-dbg+ #1
[  215.724748] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
11/17/2014
[  215.724775] task: 8801717998c0 task.stack: c90002a9c000
[  215.724804] RIP: 0010:scsi_device_put+0xb/0x30
[  215.724829] RSP: 0018:c90002a9faa0 EFLAGS: 00010246
[  215.724855] RAX: 6b6b6b6b6b6b6b6b RBX: 88038bf85698 RCX: 0006
[  215.724880] RDX: 0006 RSI: 88017179a108 RDI: 88038bf85698
[  215.724906] RBP: c90002a9faa8 R08: 880384786008 R09: 000100170007
[  215.724932] R10:  R11:  R12: 88038bf85698
[  215.724958] R13: 88038919f090 R14: dead0100 R15: 88038a41dd28
[  215.724983] FS:  7fbf8c6cf700() GS:88046f44() knlGS:
[  215.725010] CS:  0010 DS:  ES:  CR0: 80050033
[  215.725035] CR2: 7f1262ef3ee0 CR3: 00044f6cc000 CR4: 001406e0
[  215.725060] Call Trace:
[  215.725086]  scsi_disk_put+0x2d/0x40
[  215.725110]  sd_release+0x3d/0xb0
[  215.725137]  __blkdev_put+0x29e/0x360
[  215.725163]  blkdev_put+0x49/0x170
[  215.725192]  dm_put_table_device+0x58/0xc0 [dm_mod]
[  215.725219]  dm_put_device+0x70/0xc0 [dm_mod]
[  215.725269]  free_priority_group+0x92/0xc0 [dm_multipath]
[  215.725295]  free_multipath+0x70/0xc0 [dm_multipath]
[  215.725320]  multipath_dtr+0x19/0x20 [dm_multipath]
[  215.725348]  dm_table_destroy+0x67/0x120 [dm_mod]
[  215.725379]  dev_suspend+0xde/0x240 [dm_mod]
[  215.725434]  ctl_ioctl+0x1f5/0x520 [dm_mod]
[  215.725489]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
[  215.725515]  do_vfs_ioctl+0x8f/0x700
[  215.725589]  SyS_ioctl+0x3c/0x70
[  215.725614]  entry_SYSCALL_64_fastpath+0x18/0xad
[  215.725641] RIP: 0033:0x7fbf8aca0667
[  215.725665] RSP: 002b:7fbf8c6cd668 EFLAGS: 0246 ORIG_RAX: 0010
[  215.725692] RAX: ffda RBX: 0046 RCX: 7fbf8aca0667
[  215.725716] RDX: 7fbf8006b940 RSI: c138fd06 RDI: 0007
[  215.725743] RBP: 0009 R08: 7fbf8c6cb3c0 R09: 7fbf8b68d8d8
[  215.725768] R10: 0075 R11: 0246 R12: 7fbf8c6cd770
[  215.725793] R13: 0013 R14: 006168f0 R15: 00f74780
[  215.725820] Code: bc 24 b8 00 00 00 e8 55 c8 1c 00 48 83 c4 08 48 89 d8 5b 
41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 00 55 48 89 e5 53 48 8b 07 48 89 fb <48> 8b 
80 a8 01 00 00 48 8b 38 e8 f6 68 c5 ff 48 8d bb 38 02 00 
[  215.725903] RIP: scsi_device_put+0xb/0x30 RSP: c90002a9faa0

(gdb) list *(scsi_device_put+0xb)
0x8149fc2b is in scsi_device_put (drivers/scsi/scsi.c:957).
952	 * count of the underlying LLDD module.  The device is freed once the last
953	 * user vanishes.
954	 */
955	void scsi_device_put(struct scsi_device *sdev)
956	{
957		module_put(sdev->host->hostt->module);
958		put_device(&sdev->sdev_gendev);
959	}
960 EXPORT_SYMBOL(scsi_de

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Christoph Hellwig
On Fri, Jan 27, 2017 at 09:38:40AM -0700, Jens Axboe wrote:
> > Ok, I'll repost what I have right now, which is on top of a merge
> > of your block/for-4.11/next and your for-next from this morning
> > my time.
> 
> Perfect.

At least I tried, looks like the mail server is overloaded and crapped
out three mails into it.  For now there is a git tree here:

http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-pc-refactor

> 
> > Btw, I disagreed with your patch to use op_is_flush in
> > generic_make_request_checks - given that we clear these flags just
> > below I think using the helper obfuscates what's really going on.
> 
> Why? It's the exact same check. The ugly part is the fact that
> we strip the flags later on, imho.

But before it was pretty obvious that it clears exactly the flags checked
two lines earlier.  Now it's not as obvious.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/27/2017 09:34 AM, Christoph Hellwig wrote:
> On Fri, Jan 27, 2017 at 09:27:02AM -0700, Jens Axboe wrote:
>> Feel free to repost it, I have no problem rebasing that branch as it's
>> standalone for now.
> 
> Ok, I'll repost what I have right now, which is on top of a merge
> of your block/for-4.11/next and your for-next from this morning
> my time.

Perfect.

> Btw, I disagreed with your patch to use op_is_flush in
> generic_make_request_checks - given that we clear these flags just
> below I think using the helper obfuscates what's really going on.

Why? It's the exact same check. The ugly part is the fact that
we strip the flags later on, imho.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Christoph Hellwig
On Fri, Jan 27, 2017 at 09:27:02AM -0700, Jens Axboe wrote:
> Feel free to repost it, I have no problem rebasing that branch as it's
> standalone for now.

Ok, I'll repost what I have right now, which is on top of a merge
of your block/for-4.11/next and your for-next from this morning
my time.

Btw, I disagreed with your patch to use op_is_flush in
generic_make_request_checks - given that we clear these flags just
below I think using the helper obfuscates what's really going on.
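
To make the readability argument concrete, here is a small user-space
sketch (Python, not kernel code). It assumes op_is_flush() tests the
same two flags that are cleared afterwards. The bit values and the
queue_has_wc parameter are made up for illustration; the real check
lives in generic_make_request_checks().

```python
# Hypothetical bit values; the kernel's actual REQ_* values differ.
REQ_PREFLUSH = 1 << 0
REQ_FUA = 1 << 1
REQ_SYNC = 1 << 2


def op_is_flush(opf):
    """Assumed behaviour of the helper: true if either flush flag is set."""
    return bool(opf & (REQ_FUA | REQ_PREFLUSH))


def checks_explicit(opf, queue_has_wc):
    """Explicit variant: the tested mask and the cleared mask are visibly equal."""
    if opf & (REQ_PREFLUSH | REQ_FUA) and not queue_has_wc:
        opf &= ~(REQ_PREFLUSH | REQ_FUA)
    return opf


def checks_helper(opf, queue_has_wc):
    """Helper variant: same result, but the tested mask is hidden in op_is_flush()."""
    if op_is_flush(opf) and not queue_has_wc:
        opf &= ~(REQ_PREFLUSH | REQ_FUA)
    return opf


if __name__ == "__main__":
    for opf in range(8):
        for wc in (False, True):
            assert checks_explicit(opf, wc) == checks_helper(opf, wc)
    print("both variants agree for every flag combination")
```

Both variants compute the same thing; the disagreement is only about
which one makes the relation between the test and the clearing obvious.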



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/27/2017 09:23 AM, Christoph Hellwig wrote:
> On Fri, Jan 27, 2017 at 09:21:46AM -0700, Jens Axboe wrote:
>> On 01/27/2017 09:17 AM, Christoph Hellwig wrote:
>>> On Fri, Jan 27, 2017 at 09:11:14AM -0700, Jens Axboe wrote:
>>>> I've queued this up for 4.11. Since some of the patches had dependencies
>>>> on changes in master since for-4.11/block was forked, they are sitting
>>>> in a separate branch that has both for-4.11/block and v4.10-rc5 pulled
>>>> in first. for-next has everything, as usual.
>>>
>>> Eww.  I just had a couple non-trivial updates that I now do again.
>>> In case you haven't pushed it out yet can you let me repost first?
>>
>> Why the eww?! You can't fix this with a repost.
> 
> Not because of the merge, mostly because I just spent some
> time adding all the ACKs, fixing typos and adding the removal of
> the ->cmd tracing to the series and was getting ready for a repost.

Feel free to repost it, I have no problem rebasing that branch as it's
standalone for now.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Christoph Hellwig
On Fri, Jan 27, 2017 at 09:21:46AM -0700, Jens Axboe wrote:
> On 01/27/2017 09:17 AM, Christoph Hellwig wrote:
> > On Fri, Jan 27, 2017 at 09:11:14AM -0700, Jens Axboe wrote:
> >> I've queued this up for 4.11. Since some of the patches had dependencies
> >> on changes in master since for-4.11/block was forked, they are sitting
> >> in a separate branch that has both for-4.11/block and v4.10-rc5 pulled
> >> in first. for-next has everything, as usual.
> > 
> > Eww.  I just had a couple non-trivial updates that I now do again.
> > In case you haven't pushed it out yet can you let me repost first?
> 
> Why the eww?! You can't fix this with a repost.

Not because of the merge, mostly because I just spent some
time adding all the ACKs, fixing typos and adding the removal of
the ->cmd tracing to the series and was getting ready for a repost.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/27/2017 09:17 AM, Christoph Hellwig wrote:
> On Fri, Jan 27, 2017 at 09:11:14AM -0700, Jens Axboe wrote:
>> I've queued this up for 4.11. Since some of the patches had dependencies
>> on changes in master since for-4.11/block was forked, they are sitting
>> in a separate branch that has both for-4.11/block and v4.10-rc5 pulled
>> in first. for-next has everything, as usual.
> 
> Eww.  I just had a couple non-trivial updates that I now do again.
> In case you haven't pushed it out yet can you let me repost first?

Why the eww?! You can't fix this with a repost.

It's fine, I'll just ship off for-4.11/block first (as usual), then
for-4.11/rq-refactor.

The two issues are in virtio_blk and raid1. For some reason, raid1
included a refactor of a function later in the cycle (hrmpf). So there's
really no good way to solve this, unless I pull in v4.10-rc5 into
for-4.11/block.  And I don't want to do that. Hence the topic branch for
this work.

I have pushed it out, but it's not merged into for-next yet, it's just
standalone. When I've done some sanity testing, I'll push it out.


-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Christoph Hellwig
On Fri, Jan 27, 2017 at 09:11:14AM -0700, Jens Axboe wrote:
> I've queued this up for 4.11. Since some of the patches had dependencies
> on changes in master since for-4.11/block was forked, they are sitting
> in a separate branch that has both for-4.11/block and v4.10-rc5 pulled
> in first. for-next has everything, as usual.

Eww.  I just had a couple non-trivial updates that I now do again.
In case you haven't pushed it out yet can you let me repost first?



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On Wed, Jan 25 2017, Christoph Hellwig wrote:
> Hi all,
> 
> this series splits the support for SCSI passthrough commands from the
> main struct request used all over the block layer into a separate
> scsi_request structure that drivers that want to support SCSI passthrough
> need to embed as the first thing into their request-private data,
> similar to how we handle NVMe passthrough commands.
> 
> To support this I've added support for private data after the
> request structure to the legacy request path, so that it can
> be treated the same way as the blk-mq path.  Compared to the current
> scsi_cmnd allocator that actually is a major simplification.
> 
> Changes since V1:
>  - fix handling of a NULL sense pointer in __scsi_execute
>  - clean up handling of the flush flags in the block layer and MD
>  - additional small cleanup in dm-rq

I've queued this up for 4.11. Since some of the patches had dependencies
on changes in master since for-4.11/block was forked, they are sitting
in a separate branch that has both for-4.11/block and v4.10-rc5 pulled
in first. for-next has everything, as usual.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-27 Thread Jens Axboe
On 01/26/2017 11:40 PM, Jens Axboe wrote:
> On 01/26/2017 06:22 PM, Jens Axboe wrote:
>> On 01/26/2017 06:15 PM, Bart Van Assche wrote:
>>> On Thu, 2017-01-26 at 17:41 -0700, Jens Axboe wrote:
>>>> On 01/26/2017 05:38 PM, Bart Van Assche wrote:
>>>>> I see similar behavior with the blk-mq-sched branch of
>>>>> git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
>>>>> booting happens much slower than usual and I/O hangs if I run the
>>>>> srp-test software.
>>>>
>>>> Please don't run that, run for-4.11/block and merge it to master.
>>>> Same behavior?
>>>
>>> I have not yet had the chance to run the srp-test software against that
>>> kernel. But I already see that booting takes more than ten times longer
>>> than usual. Note: as far as I know the dm-mpath driver is not involved
>>> in the boot process of my test system.
>>
>> What's your boot device? I've been booting this on a variety of setups,
>> no problems observed. It's booting my laptop, and on SCSI and SATA as
>> well. What is your root drive? What is the queue depth of it?
>> Controller?
> 
> Are you using dm for your root device?
> 
> I think I see what is going on. The scheduler framework put the
> insertion of flushes on the side, whereas it's integrated "nicely"
> on the legacy side.
> 
> Can you try with this applied? This is on top of the previous two that
> we already went through. Or, you can just pull:
> 
> git://git.kernel.dk/linux-block for-4.11/next
> 
> which is for-4.11/block with the next set of fixes on top that I haven't
> pulled in yet.

The previous patch had a bug if you didn't use a scheduler, here's a
version that should work fine in both cases. I've also updated the
above mentioned branch, so feel free to pull that as well and merge to
master like before.

commit 2f54ba92a274a7c1a5ceb34a56565f84f7b994b7
Author: Jens Axboe 
Date:   Fri Jan 27 01:00:47 2017 -0700

blk-mq-sched: add flush insertion into blk_mq_sched_insert_request()

Instead of letting the caller check this and handle the details
of inserting a flush request, put the logic in the scheduler
insertion function. This fixes direct flush insertion outside
of the usual make_request_fn calls, like from dm via
blk_insert_cloned_request().

Signed-off-by: Jens Axboe 

diff --git a/block/blk-core.c b/block/blk-core.c
index a61f1407f4f6..78daf5b6d7cb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2129,7 +2129,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 	if (q->mq_ops) {
 		if (blk_queue_io_stat(q))
 			blk_account_io_start(rq, true);
-		blk_mq_sched_insert_request(rq, false, true, false);
+		blk_mq_sched_insert_request(rq, false, true, false, false);
 		return 0;
 	}
 
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 86656fdfa637..ed1f10165268 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -66,7 +66,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	 * be reused after dying flag is set
 	 */
 	if (q->mq_ops) {
-		blk_mq_sched_insert_request(rq, at_head, true, false);
+		blk_mq_sched_insert_request(rq, at_head, true, false, false);
 		return;
 	}
 
diff --git a/block/blk-flush.c b/block/blk-flush.c
index d7de34ee39c2..4427896641ac 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -456,7 +456,7 @@ void blk_insert_flush(struct request *rq)
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
 		if (q->mq_ops)
-			blk_mq_sched_insert_request(rq, false, true, false);
+			blk_mq_sched_insert_request(rq, false, true, false, false);
 		else
 			list_add_tail(&rq->queuelist, &q->queue_head);
 		return;
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index c27613de80c5..5e91743e193a 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -336,6 +336,64 @@ void blk_mq_sched_restart_queues(struct blk_mq_hw_ctx *hctx)
 	}
 }
 
+/*
+ * Add flush/fua to the queue. If we fail getting a driver tag, then
+ * punt to the requeue list. Requeue will re-invoke us from a context
+ * that's safe to block from.
+ */
+static void blk_mq_sched_insert_flush(struct blk_mq_hw_ctx *hctx,
+				      struct request *rq, bool can_block)
+{
+	if (blk_mq_get_driver_tag(rq, &hctx, can_block)) {
+		blk_insert_flush(rq);
+		blk_mq_run_hw_queue(hctx, true);
+	} else
+		blk_mq_add_to_requeue_list(rq, true, true);
+}
+
+void blk_mq_sched_insert_request(struct request *rq, bool at_head,
+				 bool run_queue, bool async, bool can_block)
+{
+	struct request_queue *q = rq->q;
+	struct elevator_queue *e = q->elevator;
+	struct

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 06:22 PM, Jens Axboe wrote:
> On 01/26/2017 06:15 PM, Bart Van Assche wrote:
>> On Thu, 2017-01-26 at 17:41 -0700, Jens Axboe wrote:
>>> On 01/26/2017 05:38 PM, Bart Van Assche wrote:
>>>> I see similar behavior with the blk-mq-sched branch of
>>>> git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
>>>> booting happens much slower than usual and I/O hangs if I run the
>>>> srp-test software.
>>>
>>> Please don't run that, run for-4.11/block and merge it to master.
>>> Same behavior?
>>
>> I have not yet had the chance to run the srp-test software against that
>> kernel. But I already see that booting takes more than ten times longer
>> than usual. Note: as far as I know the dm-mpath driver is not involved
>> in the boot process of my test system.
> 
> What's your boot device? I've been booting this on a variety of setups,
> no problems observed. It's booting my laptop, and on SCSI and SATA as
> well. What is your root drive? What is the queue depth of it?
> Controller?

Are you using dm for your root device?

I think I see what is going on. The scheduler framework put the
insertion of flushes on the side, whereas it's integrated "nicely"
on the legacy side.

Can you try with this applied? This is on top of the previous two that
we already went through. Or, you can just pull:

git://git.kernel.dk/linux-block for-4.11/next

which is for-4.11/block with the next set of fixes on top that I haven't
pulled in yet.
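
As a rough illustration of the idea (a toy user-space model in Python,
not kernel code): the insertion function itself decides whether a
request is a flush/FUA that still lacks a driver tag, and if so either
hands it to the flush machinery or punts it to the requeue list when no
driver tag is available. ToyQueue, its lists and the bit values are
hypothetical names invented for this sketch; only the condition mirrors
the check in the patch that follows.

```python
from dataclasses import dataclass

# Hypothetical bit values, for illustration only.
REQ_PREFLUSH = 1 << 0
REQ_FUA = 1 << 1


@dataclass
class Request:
    cmd_flags: int
    tag: int = -1  # -1 means "no driver tag assigned yet"


class ToyQueue:
    """Toy stand-in for a hardware queue with a limited number of driver tags."""

    def __init__(self, free_driver_tags):
        self.free_driver_tags = free_driver_tags
        self.flush_list = []    # requests handed to the flush machinery
        self.requeue_list = []  # requests to retry later from a blocking context
        self.sched_list = []    # everything else goes through the scheduler

    def get_driver_tag(self, rq):
        """Assign a driver tag if one is free; return False otherwise."""
        if self.free_driver_tags == 0:
            return False
        self.free_driver_tags -= 1
        rq.tag = self.free_driver_tags  # any non-negative value will do here
        return True

    def insert_request(self, rq):
        """Single entry point: callers no longer special-case flushes."""
        if rq.tag == -1 and rq.cmd_flags & (REQ_PREFLUSH | REQ_FUA):
            if self.get_driver_tag(rq):
                self.flush_list.append(rq)
            else:
                self.requeue_list.append(rq)
            return
        self.sched_list.append(rq)


if __name__ == "__main__":
    q = ToyQueue(free_driver_tags=1)
    q.insert_request(Request(REQ_PREFLUSH))  # gets the only driver tag
    q.insert_request(Request(REQ_FUA))       # no tag left, so it is requeued
    q.insert_request(Request(0))             # ordinary request
    print(len(q.flush_list), len(q.requeue_list), len(q.sched_list))
```

With this shape, a caller such as blk_insert_cloned_request() does not
need to know about flushes at all, which is the case that was being
missed.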


commit 995447bfd14dd871e0c8771261ed7d1f2b5b4c86
Author: Jens Axboe 
Date:   Thu Jan 26 23:34:56 2017 -0700

blk-mq-sched: integrate flush insertion into blk_mq_sched_get_request()

Instead of letting the caller check this and handle the details
of inserting a flush request, put the logic in the scheduler
insertion function.

Outside of cleaning up the code, this handles the case where
outside callers insert a flush, like through
blk_insert_cloned_request().

Signed-off-by: Jens Axboe 

diff --git a/block/blk-core.c b/block/blk-core.c
index a61f1407f4f6..78daf5b6d7cb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2129,7 +2129,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 	if (q->mq_ops) {
 		if (blk_queue_io_stat(q))
 			blk_account_io_start(rq, true);
-		blk_mq_sched_insert_request(rq, false, true, false);
+		blk_mq_sched_insert_request(rq, false, true, false, false);
 		return 0;
 	}
 
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 86656fdfa637..ed1f10165268 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -66,7 +66,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	 * be reused after dying flag is set
 	 */
 	if (q->mq_ops) {
-		blk_mq_sched_insert_request(rq, at_head, true, false);
+		blk_mq_sched_insert_request(rq, at_head, true, false, false);
 		return;
 	}
 
diff --git a/block/blk-flush.c b/block/blk-flush.c
index d7de34ee39c2..4427896641ac 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -456,7 +456,7 @@ void blk_insert_flush(struct request *rq)
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
 		if (q->mq_ops)
-			blk_mq_sched_insert_request(rq, false, true, false);
+			blk_mq_sched_insert_request(rq, false, true, false, false);
 		else
 			list_add_tail(&rq->queuelist, &q->queue_head);
 		return;
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index c27613de80c5..fa2ff0f458fa 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -336,6 +336,64 @@ void blk_mq_sched_restart_queues(struct blk_mq_hw_ctx *hctx)
 	}
 }
 
+/*
+ * Add flush/fua to the queue. If we fail getting a driver tag, then
+ * punt to the requeue list. Requeue will re-invoke us from a context
+ * that's safe to block from.
+ */
+static void blk_mq_sched_insert_flush(struct blk_mq_hw_ctx *hctx,
+				      struct request *rq, bool can_block)
+{
+	if (blk_mq_get_driver_tag(rq, &hctx, can_block)) {
+		blk_insert_flush(rq);
+		blk_mq_run_hw_queue(hctx, !can_block);
+	} else
+		blk_mq_add_to_requeue_list(rq, true, true);
+}
+
+void blk_mq_sched_insert_request(struct request *rq, bool at_head,
+				 bool run_queue, bool async, bool can_block)
+{
+	struct request_queue *q = rq->q;
+	struct elevator_queue *e = q->elevator;
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+
+	if (rq->tag == -1 && (rq->cmd_flags & (REQ_PREFLUSH | REQ_FUA))) {
+		blk_mq_sched_insert_flush(hctx, rq, can_block);
+		return;
+	}
+
+	if (e

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 06:15 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 17:41 -0700, Jens Axboe wrote:
>> On 01/26/2017 05:38 PM, Bart Van Assche wrote:
>>> I see similar behavior with the blk-mq-sched branch of
>>> git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
>>> booting happens much slower than usual and I/O hangs if I run the
>>> srp-test software.
>>
>> Please don't run that, run for-4.11/block and merge it to master.
>> Same behavior?
> 
> I have not yet had the chance to run the srp-test software against that
> kernel. But I already see that booting takes more than ten times longer
> than usual. Note: as far as I know the dm-mpath driver is not involved
> in the boot process of my test system.

What's your boot device? I've been booting this on a variety of setups,
no problems observed. It's booting my laptop, and on SCSI and SATA as
well. What is your root drive? What is the queue depth of it?
Controller?

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 17:41 -0700, Jens Axboe wrote:
> On 01/26/2017 05:38 PM, Bart Van Assche wrote:
> > I see similar behavior with the blk-mq-sched branch of
> > git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
> > booting happens much slower than usual and I/O hangs if I run the
> > srp-test software.
> 
> Please don't run that, run for-4.11/block and merge it to master.
> Same behavior?

I have not yet had the chance to run the srp-test software against that
kernel. But I already see that booting takes more than ten times longer
than usual. Note: as far as I know the dm-mpath driver is not involved
in the boot process of my test system.

> > Regarding creating a similar dm setup: I hope that in the future it
> > will become possible to run the srp-test software without any special
> > hardware and with in-tree drivers. Today, running the srp-test software
> > with in-tree drivers requires IB hardware. This is how to run the
> > srp-test software today with in-tree drivers:
> > * Find a system with at least two InfiniBand ports.
> > * Make sure that the appropriate IB driver in the kernel is enabled and
> >   also that LIO (CONFIG_TARGET_CORE=m and CONFIG_TCM_FILEIO=m), ib_srp,
> >   ib_srpt and dm-mpath are built as kernel modules.
> > * If none of the IB ports are connected to an IB switch, connect the
> >   two ports to each other and configure and start the opensm software
> >   such that the port states change from "Initializing" to "Active".
> > * Check with "ibstat | grep 'State: Active'" that at least one port is
> >   in the active state.
> > * Configure multipathd as explained in
> >   https://github.com/bvanassche/srp-test/blob/master/README.md.
> > * Restart multipathd to make sure it picks up /etc/multipath.conf.
> > * Clone https://github.com/bvanassche/srp-test and start it as follows:
> >   srp-test/run_tests -t 02-mq
> 
> I can't run that. Any chance of a test case that doesn't require IB?

It is possible to run that test on top of the SoftRoCE driver. I will first
check myself whether the latest version of the SoftRoCE driver is stable
enough to run srp-test on top of it (see also
https://github.com/dledford/linux/commits/k.o/for-4.11).

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 05:38 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 16:50 -0700, Jens Axboe wrote:
>> Clearly we are missing some requests. How do I set up dm similarly to
>> you?
>>
>> Does it reproduce without Christoph's patchset?
> 
> Hello Jens,
> 
> I see similar behavior with the blk-mq-sched branch of
> git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
> booting happens much slower than usual and I/O hangs if I run the
> srp-test software.

Please don't run that, run for-4.11/block and merge it to master.
Same behavior?

> Regarding creating a similar dm setup: I hope that in the future it
> will become possible to run the srp-test software without any special
> hardware and with in-tree drivers. Today, running the srp-test software
> with in-tree drivers requires IB hardware. This is how to run the
> srp-test software today with in-tree drivers:
> * Find a system with at least two InfiniBand ports.
> * Make sure that the appropriate IB driver in the kernel is enabled and
>   also that LIO (CONFIG_TARGET_CORE=m and CONFIG_TCM_FILEIO=m), ib_srp,
>   ib_srpt and dm-mpath are built as kernel modules.
> * If none of the IB ports are connected to an IB switch, connect the
>   two ports to each other and configure and start the opensm software
>   such that the port states change from "Initializing" to "Active".
> * Check with "ibstat | grep 'State: Active'" that at least one port is
>   in the active state.
> * Configure multipathd as explained in
>   https://github.com/bvanassche/srp-test/blob/master/README.md.
> * Restart multipathd to make sure it picks up /etc/multipath.conf.
> * Clone https://github.com/bvanassche/srp-test and start it as follows:
>   srp-test/run_tests -t 02-mq

I can't run that. Any chance of a test case that doesn't require IB?

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 16:50 -0700, Jens Axboe wrote:
> Clearly we are missing some requests. How do I set up dm similarly to
> you?
> 
> Does it reproduce without Christoph's patchset?

Hello Jens,

I see similar behavior with the blk-mq-sched branch of
git://git.kernel.dk/linux-block.git (git commit ID 0efe27068ecf):
booting happens much slower than usual and I/O hangs if I run the
srp-test software.

Regarding creating a similar dm setup: I hope that in the future it
will become possible to run the srp-test software without any special
hardware and with in-tree drivers. Today, running the srp-test software
with in-tree drivers requires IB hardware. This is how to run the
srp-test software today with in-tree drivers:
* Find a system with at least two InfiniBand ports.
* Make sure that the appropriate IB driver in the kernel is enabled and
  also that LIO (CONFIG_TARGET_CORE=m and CONFIG_TCM_FILEIO=m), ib_srp,
  ib_srpt and dm-mpath are built as kernel modules.
* If none of the IB ports are connected to an IB switch, connect the
  two ports to each other and configure and start the opensm software
  such that the port states change from "Initializing" to "Active".
* Check with "ibstat | grep 'State: Active'" that at least one port is
  in the active state.
* Configure multipathd as explained in
  https://github.com/bvanassche/srp-test/blob/master/README.md.
* Restart multipathd to make sure it picks up /etc/multipath.conf.
* Clone https://github.com/bvanassche/srp-test and start it as follows:
  srp-test/run_tests -t 02-mq
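
The port-state check in the list above could also be scripted. The
following is only a sketch: it assumes ibstat prints one "State: ..."
line per port, and the SAMPLE text is an illustrative stand-in rather
than output captured from a real system.

```python
import subprocess


def active_ports(ibstat_output):
    """Count ports whose logical state line reads exactly 'State: Active'."""
    count = 0
    for line in ibstat_output.splitlines():
        # "Physical state: LinkUp" must not match, so compare the whole line.
        if line.strip() == "State: Active":
            count += 1
    return count


def read_ibstat():
    """Run ibstat and return its output (requires the tool to be installed)."""
    result = subprocess.run(["ibstat"], capture_output=True, text=True, check=True)
    return result.stdout


# Illustrative sample only; real output contains more fields per port.
SAMPLE = """\
CA 'mlx4_0'
    Port 1:
        State: Active
        Physical state: LinkUp
    Port 2:
        State: Down
        Physical state: Polling
"""

if __name__ == "__main__":
    print("active ports in sample:", active_ports(SAMPLE))
```

On a test machine, active_ports(read_ibstat()) should be at least 1
before starting run_tests.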

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 04:50 PM, Jens Axboe wrote:
> On 01/26/2017 04:47 PM, Bart Van Assche wrote:
>> On Thu, 2017-01-26 at 16:26 -0700, Jens Axboe wrote:
>>> What device is stuck? Is it running with an mq scheduler attached, or
>>> with "none"?
>>>
>>> Would also be great to see the output of /sys/block/*/mq/*/tags and
>>> sched_tags so we can see if they have anything pending.
>>>
>>> From a quick look at the below, it looks like a request leak. Bisection
>>> would most likely be very helpful.
>>
>> Hello Jens,
>>
>> This happens with and without scheduler attached. The most recent test I ran
>> was with the deadline scheduler configured as default scheduler for all blk-mq
>> devices (CONFIG_DEFAULT_SQ_IOSCHED="mq-deadline" and
>> CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"). The block devices that hang are
>> /dev/dm-0 and /dev/dm-1. The tags and sched_tags data is as follows:
>>
>> # (cd /sys/class/block && grep -aH '' dm*/mq/*/tags)
>> dm-0/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
>> dm-0/mq/0/tags:nr_free=1795, nr_reserved=0
>> dm-0/mq/0/tags:active_queues=0
>> dm-1/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
>> dm-1/mq/0/tags:nr_free=2047, nr_reserved=0
>> dm-1/mq/0/tags:active_queues=0
>> # (cd /sys/class/block && grep -aH '' dm*/mq/*/sched_tags)
>> dm-0/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
>> dm-0/mq/0/sched_tags:nr_free=0, nr_reserved=0
>> dm-0/mq/0/sched_tags:active_queues=0
>> dm-1/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
>> dm-1/mq/0/sched_tags:nr_free=254, nr_reserved=0
>> dm-1/mq/0/sched_tags:active_queues=0
> 
> Clearly we are missing some requests. How do I set up dm similarly to
> you?
> 
> Does it reproduce without Christoph's patchset?

I have dm-mpath running using blk_mq and with mq-deadline on both dm and
the lower level device, and it seems to be running just fine here.
Note, this is without Christoph's patchset, I'll try that next once
xfstest finishes.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 04:47 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 16:26 -0700, Jens Axboe wrote:
>> What device is stuck? Is it running with an mq scheduler attached, or
>> with "none"?
>>
>> Would also be great to see the output of /sys/block/*/mq/*/tags and
>> sched_tags so we can see if they have anything pending.
>>
>> From a quick look at the below, it looks like a request leak. Bisection
>> would most likely be very helpful.
> 
> Hello Jens,
> 
> This happens with and without scheduler attached. The most recent test I ran
> was with the deadline scheduler configured as default scheduler for all blk-mq
> devices (CONFIG_DEFAULT_SQ_IOSCHED="mq-deadline" and
> CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"). The block devices that hang are
> /dev/dm-0 and /dev/dm-1. The tags and sched_tags data is as follows:
> 
> # (cd /sys/class/block && grep -aH '' dm*/mq/*/tags)
> dm-0/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
> dm-0/mq/0/tags:nr_free=1795, nr_reserved=0
> dm-0/mq/0/tags:active_queues=0
> dm-1/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
> dm-1/mq/0/tags:nr_free=2047, nr_reserved=0
> dm-1/mq/0/tags:active_queues=0
> # (cd /sys/class/block && grep -aH '' dm*/mq/*/sched_tags)
> dm-0/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
> dm-0/mq/0/sched_tags:nr_free=0, nr_reserved=0
> dm-0/mq/0/sched_tags:active_queues=0
> dm-1/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
> dm-1/mq/0/sched_tags:nr_free=254, nr_reserved=0
> dm-1/mq/0/sched_tags:active_queues=0

Clearly we are missing some requests. How do I set up dm similarly to
you?

Does it reproduce without Christoph's patchset?

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 16:26 -0700, Jens Axboe wrote:
> What device is stuck? Is it running with an mq scheduler attached, or
> with "none"?
> 
> Would also be great to see the output of /sys/block/*/mq/*/tags and
> sched_tags so we can see if they have anything pending.
> 
> From a quick look at the below, it looks like a request leak. Bisection
> would most likely be very helpful.

Hello Jens,

This happens with and without scheduler attached. The most recent test I ran
was with the deadline scheduler configured as default scheduler for all blk-mq
devices (CONFIG_DEFAULT_SQ_IOSCHED="mq-deadline" and
CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"). The block devices that hang are
/dev/dm-0 and /dev/dm-1. The tags and sched_tags data is as follows:

# (cd /sys/class/block && grep -aH '' dm*/mq/*/tags)
dm-0/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
dm-0/mq/0/tags:nr_free=1795, nr_reserved=0
dm-0/mq/0/tags:active_queues=0
dm-1/mq/0/tags:nr_tags=2048, reserved_tags=0, bits_per_word=64
dm-1/mq/0/tags:nr_free=2047, nr_reserved=0
dm-1/mq/0/tags:active_queues=0
# (cd /sys/class/block && grep -aH '' dm*/mq/*/sched_tags)
dm-0/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
dm-0/mq/0/sched_tags:nr_free=0, nr_reserved=0
dm-0/mq/0/sched_tags:active_queues=0
dm-1/mq/0/sched_tags:nr_tags=256, reserved_tags=0, bits_per_word=64
dm-1/mq/0/sched_tags:nr_free=254, nr_reserved=0
dm-1/mq/0/sched_tags:active_queues=0
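
For reference, the number of tags still allocated on a queue is nr_tags
minus nr_free. A minimal userspace sketch of that arithmetic in C follows.
tag_field() and tags_in_use() are hypothetical helper names, and the parser
assumes only the key=value layout visible in the output above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the integer that follows "key=" in one line of tags output,
 * or -1 if the key is not present. */
int tag_field(const char *line, const char *key)
{
	assert(line != NULL && key != NULL);
	size_t klen = strlen(key);
	const char *p = line;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the key only at the start of a field and followed by
		 * '=', so "nr_tags" does not match inside "sched_tags". */
		int at_start = (p == line) || p[-1] == ':' || p[-1] == ' ';

		if (at_start && p[klen] == '=') {
			int value;

			if (sscanf(p + klen + 1, "%d", &value) == 1)
				return value;
			return -1;
		}
		p += klen;
	}
	return -1;
}

/* Tags currently allocated: nr_tags - nr_free, or -1 on a parse error. */
int tags_in_use(const char *nr_tags_line, const char *nr_free_line)
{
	int total = tag_field(nr_tags_line, "nr_tags");
	int free_tags = tag_field(nr_free_line, "nr_free");

	if (total < 0 || free_tags < 0)
		return -1;
	return total - free_tags;
}
```

Applied to the numbers above, this gives 253 driver tags and all 256
scheduler tags in use on dm-0, and 1 driver tag and 2 scheduler tags in use
on dm-1.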

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 04:14 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 14:51 -0700, Jens Axboe wrote:
>> That is exactly what it means, looks like that one path doesn't handle
>> that.  You'd have to exhaust the pool with atomic allocs for this to
>> trigger, we don't do that at all in the normal IO path. So good catch,
>> must be the dm part that enables this since it does NOWAIT allocations.
>>
>>
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index 3136696f4991..c27613de80c5 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -134,7 +134,8 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
>>  			rq = __blk_mq_alloc_request(data, op);
>>  	} else {
>>  		rq = __blk_mq_alloc_request(data, op);
>> -		data->hctx->tags->rqs[rq->tag] = rq;
>> +		if (rq)
>> +			data->hctx->tags->rqs[rq->tag] = rq;
>>  	}
>>  
>>  	if (rq) {
> 
> Hello Jens,
> 
> With these two patches applied the scheduling-while-atomic complaint and
> the oops are gone. However, some tasks get stuck. Is the console output
> below enough to figure out what is going on or do you want me to bisect
> this? I don't think that any requests got stuck since no pending requests
> are shown in /sys/block/*/mq/*/{pending,*/rq_list}.

What device is stuck? Is it running with an mq scheduler attached, or
with "none"?

Would also be great to see the output of /sys/block/*/mq/*/tags and
sched_tags so we can see if they have anything pending.

From a quick look at the below, it looks like a request leak. Bisection
would most likely be very helpful.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 02:47 PM, Bart Van Assche wrote:
> (gdb) list *(blk_mq_sched_get_request+0x310)
> 0x8132dcf0 is in blk_mq_sched_get_request (block/blk-mq-sched.c:136).
> 131 rq->rq_flags |= RQF_QUEUED;
> 132 } else
> 133 rq = __blk_mq_alloc_request(data, op);
> 134 } else {
> 135 rq = __blk_mq_alloc_request(data, op);
> 136 data->hctx->tags->rqs[rq->tag] = rq;
> 137 }
> 138
> 139 if (rq) {
> 140 if (!op_is_flush(op)) {
> 
> (gdb) disas blk_mq_sched_get_request
> [ ... ]
>    0x8132dce3 <+771>:   callq  0x81324ab0 <__blk_mq_alloc_request>
>    0x8132dce8 <+776>:   mov    %rax,%rcx
>    0x8132dceb <+779>:   mov    0x18(%r12),%rax
>    0x8132dcf0 <+784>:   movslq 0x5c(%rcx),%rdx
> [ ... ]
> (gdb) print &((struct request *)0)->tag
> $1 = (int *) 0x5c 
> 
> I think this means that rq == NULL and that a test for rq is missing after the
> __blk_mq_alloc_request() call?

That is exactly what it means, looks like that one path doesn't handle
that.  You'd have to exhaust the pool with atomic allocs for this to
trigger, we don't do that at all in the normal IO path. So good catch,
must be the dm part that enables this since it does NOWAIT allocations.
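
To spell out why the fault address matched the gdb output: reading a member
through a NULL pointer touches the address equal to that member's offset,
so a NULL rq and a tag field at offset 0x5c give CR2 = 0x5c. A minimal
userspace sketch, in which struct fake_request is a hypothetical stand-in
whose padding is chosen only so that tag lands at 0x5c (it is not the real
struct request layout):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct request; only the offset of 'tag'
 * matters for this illustration. */
struct fake_request {
	unsigned char pad[0x5c];	/* made-up fields preceding 'tag' */
	int tag;
};

/* Check at compile time that the stand-in reproduces the reported offset. */
static_assert(offsetof(struct fake_request, tag) == 0x5c,
	      "tag is expected at offset 0x5c in this sketch");

/* Address that a load of base->tag would touch, computed without
 * dereferencing the pointer. */
uintptr_t tag_fault_address(uintptr_t base)
{
	return base + offsetof(struct fake_request, tag);
}
```

With base == 0 the function returns 0x5c, the same value the oops reports
in CR2.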


diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 3136696f4991..c27613de80c5 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -134,7 +134,8 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
 			rq = __blk_mq_alloc_request(data, op);
 	} else {
 		rq = __blk_mq_alloc_request(data, op);
-		data->hctx->tags->rqs[rq->tag] = rq;
+		if (rq)
+			data->hctx->tags->rqs[rq->tag] = rq;
 	}
 
 	if (rq) {
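
The same guard can be shown outside the kernel. The sketch below is a toy
userspace model, not the blk-mq code: sk_request, sk_tags, sk_alloc_nowait()
and sk_get_request() are all hypothetical names. It only illustrates that a
non-blocking allocator may return NULL once its pool is exhausted, so the
result has to be checked before rq->tag is used as an index:

```c
#include <assert.h>
#include <stddef.h>

#define SK_POOL_SIZE 2			/* tiny pool so exhaustion is easy to hit */

struct sk_request {
	int tag;
};

struct sk_tags {
	struct sk_request pool[SK_POOL_SIZE];
	struct sk_request *rqs[SK_POOL_SIZE];	/* tag -> request map */
	int next;				/* next unused pool slot */
};

/* Non-blocking allocation: returns NULL instead of waiting when empty. */
struct sk_request *sk_alloc_nowait(struct sk_tags *tags)
{
	assert(tags != NULL);
	if (tags->next >= SK_POOL_SIZE)
		return NULL;

	struct sk_request *rq = &tags->pool[tags->next];

	rq->tag = tags->next++;
	return rq;
}

/* Record the request in the tag map only if the allocation succeeded. */
struct sk_request *sk_get_request(struct sk_tags *tags)
{
	struct sk_request *rq = sk_alloc_nowait(tags);

	if (rq)
		tags->rqs[rq->tag] = rq;
	return rq;
}
```

Without the "if (rq)" test, the third call on a two-entry pool would
dereference NULL, which is the userspace analogue of the oops above.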

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 14:12 -0700, Jens Axboe wrote:
> On 01/26/2017 02:01 PM, Bart Van Assche wrote:
> > On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
> > > Your call path has blk_get_request() in it, I don't have
> > > that in my tree. Is it passing in the right mask?
> > 
> > Hello Jens,
> > 
> > There is only one blk_get_request() call in drivers/md/dm-mpath.c
> > and it looks as follows:
> > 
> > clone = blk_get_request(bdev_get_queue(bdev),
> > rq->cmd_flags | REQ_NOMERGE,
> > GFP_ATOMIC);
> 
> Yeah, I found it in the dm patch. Looks fine to me, since
> blk_mq_alloc_request() checks for __GFP_DIRECT_RECLAIM. Weird, it all
> looks fine to me. Are you sure you tested with the patch? Either that,
> or I'm smoking crack.

Hello Jens,

After I received your e-mail I noticed that there was a local
modification on the test system that was responsible for the schedule-
while-atomic complaint. Sorry for that. Anyway, I undid the merge with
the v4.10-rc5 code and repeated my test. This time the following call
stack appeared:

BUG: unable to handle kernel NULL pointer dereference at 005c
IP: blk_mq_sched_get_request+0x310/0x350
PGD 34bd9c067 
PUD 346b37067 
PMD 0 

Oops:  [#1] SMP
Modules linked in: dm_service_time ib_srp scsi_transport_srp target_core_user 
uio target_core_pscsi target_core_file ib_srpt target_core_iblock 
target_core_mod brd netconsole xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet ib_ipoib 
rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm configfs ib_cm iw_cm msr mlx4_ib 
ib_core sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp 
ipmi_ssif kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul mlx4_core 
crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 tg3 iTCO_wdt 
crypto_simd dcdbas iTCO_vendor_support ptp glue_helper ipmi_si cryptd 
ipmi_devintf pps_core fjes devlink ipmi_msghandler pcspkr libphy tpm_tis 
tpm_tis_core tpm button mei_me lpc_ich wmi mei mfd_core shpchp hid_generic 
usbhid mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops ttm sr_mod drm cdrom ehci_pci ehci_hcd usbcore usb_common sg 
dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
CPU: 0 PID: 9231 Comm: fio Not tainted 4.10.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
task: 88034c8c3140 task.stack: c90005698000
RIP: 0010:blk_mq_sched_get_request+0x310/0x350
RSP: 0018:c9000569bac8 EFLAGS: 00010246
RAX: 88034f430958 RBX: 88045ed2cef0 RCX: 
RDX: 001f RSI: 8803507bdcf8 RDI: 001f
RBP: c9000569bb00 R08: 0001 R09: 
R10: 0001 R11:  R12: c9000569bb18
R13: c801 R14:  R15: 
FS:  7f65ca054700() GS:88046f20() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 005c CR3: 00034b0ed000 CR4: 001406f0
Call Trace:
 blk_mq_alloc_request+0x5e/0xb0
 blk_get_request+0x2f/0x110
 multipath_clone_and_map+0xcd/0x140 [dm_multipath]
 map_request+0x3c/0x290 [dm_mod]
 dm_mq_queue_rq+0x77/0x100 [dm_mod]
 blk_mq_dispatch_rq_list+0x1ff/0x320
 blk_mq_sched_dispatch_requests+0xa9/0xe0
 __blk_mq_run_hw_queue+0x122/0x1c0
 blk_mq_run_hw_queue+0x84/0x90
 blk_mq_flush_plug_list+0x39f/0x480
 blk_flush_plug_list+0xee/0x270
 blk_finish_plug+0x27/0x40
 do_io_submit+0x475/0x900
 SyS_io_submit+0xb/0x10
 entry_SYSCALL_64_fastpath+0x18/0xad
RIP: 0033:0x7f65e4d05787
RSP: 002b:7f65ca051948 EFLAGS: 0202 ORIG_RAX: 00d1
RAX: ffda RBX: 0046 RCX: 7f65e4d05787
RDX: 7f65a404f158 RSI: 0001 RDI: 7f65f6bfd000
RBP: 0815 R08: 0001 R09: 7f65a404e3e0
R10: 7f65a404 R11: 0202 R12: 06d0
R13: 7f65a404e930 R14: 1000 R15: 0830
Code: 67 ff ff ff e9 80 fe ff ff 48 89 df e8 ba c4 fe ff 31 c9 e9 60 ff ff ff 
44 89 ee 4c 89 e7 e8 c8 6d ff ff 48 89 c1 49 8b 44 24 18 <48> 63 51 5c 48 8b 80 
20 01 00 00 48 8b 80 80 00 00 00 48 89 0c 
RIP: blk_mq_sched_get_request+0x310/0x350 RSP: c9000569bac8
CR2: 005c

(gdb) list *(blk_mq_sched_get_request+0x310)
0x8132dcf0 is in blk_mq_sched_get_request (block/blk-mq-sched.c:136).
131 rq->rq_flags |= RQF_QUEUED;
132 } else
133 rq = __blk_mq_alloc_request(data, op);
134 } else {
135 rq = __blk_mq_alloc_request(data, op);
136 data->hctx->tags->rqs[rq->tag] = rq;
137 }
138

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 02:01 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
>> Your call path has blk_get_request() in it, I don't have
>> that in my tree. Is it passing in the right mask?
> 
> Hello Jens,
> 
> There is only one blk_get_request() call in drivers/md/dm-mpath.c
> and it looks as follows:
> 
>   clone = blk_get_request(bdev_get_queue(bdev),
>   rq->cmd_flags | REQ_NOMERGE,
>   GFP_ATOMIC);

Yeah, I found it in the dm patch. Looks fine to me, since
blk_mq_alloc_request() checks for __GFP_DIRECT_RECLAIM. Weird, it all
looks fine to me. Are you sure you tested with the patch? Either that,
or I'm smoking crack.
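
As a rough illustration of the check referred to above: a GFP mask that
lacks the direct-reclaim bit, as GFP_ATOMIC does, is expected to become a
non-blocking request allocation. The sketch below is userspace C under
stated assumptions. The SK_* constants are made-up bit values, not the
kernel's definitions, and alloc_flags_from_gfp() is a hypothetical name for
the mapping described in this thread:

```c
#include <assert.h>

/* Made-up bit values, for illustration only. */
#define SK_GFP_DIRECT_RECLAIM	(1u << 0)	/* caller may sleep in reclaim */
#define SK_GFP_KSWAPD_RECLAIM	(1u << 1)	/* may wake background reclaim */
#define SK_GFP_HIGH		(1u << 2)	/* may dip into reserves */

#define SK_GFP_ATOMIC	(SK_GFP_HIGH | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_KERNEL	(SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)

#define SK_REQ_NOWAIT	(1u << 0)	/* stand-in for a NOWAIT-style flag */

/* The atomic mask must not permit direct reclaim in this model. */
static_assert((SK_GFP_ATOMIC & SK_GFP_DIRECT_RECLAIM) == 0,
	      "atomic allocations cannot sleep");

/* Map a GFP mask to request-allocation flags: no direct reclaim means
 * the allocation must not wait for a free tag. */
unsigned int alloc_flags_from_gfp(unsigned int gfp_mask)
{
	return (gfp_mask & SK_GFP_DIRECT_RECLAIM) ? 0 : SK_REQ_NOWAIT;
}
```

Under this model an atomic caller never sleeps waiting for a tag; it gets
either a request or a failure, which the caller then has to handle.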

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
> Your call path has blk_get_request() in it, I don't have
> that in my tree. Is it passing in the right mask?

Hello Jens,

There is only one blk_get_request() call in drivers/md/dm-mpath.c
and it looks as follows:

clone = blk_get_request(bdev_get_queue(bdev),
rq->cmd_flags | REQ_NOMERGE,
GFP_ATOMIC);

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 01:47 PM, Bart Van Assche wrote:
> On 01/26/2017 11:01 AM, Jens Axboe wrote:
>> On 01/26/2017 11:59 AM, h...@lst.de wrote:
>>> On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote:
>>>> It's against my for-4.11/block, which you were running under Christoph's
>>>> patches. Maybe he's using an older version? In any case, should be
>>>> pretty trivial for you to hand apply. Just ensure that .flags is set to
>>>> 0 for the common cases, and inherit 'flags' when it is passed in.
>>>
>>> No, the flush op cleanups you asked for last round create a conflict
>>> with your patch.  They should be trivial to fix, though.
>>
>> Ah, makes sense. And yes, as I said, should be trivial to hand apply the
>> hunk that does fail.
> 
> Hello Jens and Christoph,
> 
> With the below patch applied the test got a little further but did not
> pass unfortunately. I tried to analyze the new call stack but it's not yet
> clear to me what is going on.
>  
> The patch I had applied on Christoph's tree:
> 
> ---
>  block/blk-mq-sched.c | 2 +-
>  block/blk-mq.c   | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 3bd66e50ec84..7c9318755fab 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -116,7 +116,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
>   ctx = blk_mq_get_ctx(q);
>   hctx = blk_mq_map_queue(q, ctx->cpu);
>  
> - blk_mq_set_alloc_data(data, q, 0, ctx, hctx);
> + blk_mq_set_alloc_data(data, q, data->flags, ctx, hctx);
>  
>   if (e) {
>   data->flags |= BLK_MQ_REQ_INTERNAL;
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 83640869d9e4..6697626e5d32 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -248,7 +248,7 @@ EXPORT_SYMBOL_GPL(__blk_mq_alloc_request);
>  struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
>   unsigned int flags)
>  {
> - struct blk_mq_alloc_data alloc_data;
> + struct blk_mq_alloc_data alloc_data = { .flags = flags };
>   struct request *rq;
>   int ret;
>  
> @@ -1369,7 +1369,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  {
>   const int is_sync = op_is_sync(bio->bi_opf);
>   const int is_flush_fua = op_is_flush(bio->bi_opf);
> - struct blk_mq_alloc_data data;
> + struct blk_mq_alloc_data data = { };
>   struct request *rq;
>   unsigned int request_count = 0, srcu_idx;
>   struct blk_plug *plug;
> @@ -1491,7 +1491,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
>   const int is_flush_fua = op_is_flush(bio->bi_opf);
>   struct blk_plug *plug;
>   unsigned int request_count = 0;
> - struct blk_mq_alloc_data data;
> + struct blk_mq_alloc_data data = { };
>   struct request *rq;
>   blk_qc_t cookie;
>   unsigned int wb_acct;

Looks correct to me. Your call path has blk_get_request() in it, I don't have
that in my tree. Is it passing in the right mask?

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On 01/26/2017 11:01 AM, Jens Axboe wrote:
> On 01/26/2017 11:59 AM, h...@lst.de wrote:
>> On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote:
>>> It's against my for-4.11/block, which you were running under Christoph's
>>> patches. Maybe he's using an older version? In any case, should be
>>> pretty trivial for you to hand apply. Just ensure that .flags is set to
>>> 0 for the common cases, and inherit 'flags' when it is passed in.
>>
>> No, the flush op cleanups you asked for last round create a conflict
>> with your patch.  They should be trivial to fix, though.
> 
> Ah, makes sense. And yes, as I said, should be trivial to hand apply the
> hunk that does fail.

Hello Jens and Christoph,

With the below patch applied the test got a little further but did not
pass unfortunately. I tried to analyze the new call stack but it's not yet
clear to me what is going on.
 
The patch I had applied on Christoph's tree:

---
 block/blk-mq-sched.c | 2 +-
 block/blk-mq.c   | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 3bd66e50ec84..7c9318755fab 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -116,7 +116,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
 	ctx = blk_mq_get_ctx(q);
 	hctx = blk_mq_map_queue(q, ctx->cpu);
 
-	blk_mq_set_alloc_data(data, q, 0, ctx, hctx);
+	blk_mq_set_alloc_data(data, q, data->flags, ctx, hctx);
 
 	if (e) {
 		data->flags |= BLK_MQ_REQ_INTERNAL;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 83640869d9e4..6697626e5d32 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -248,7 +248,7 @@ EXPORT_SYMBOL_GPL(__blk_mq_alloc_request);
 struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
 		unsigned int flags)
 {
-	struct blk_mq_alloc_data alloc_data;
+	struct blk_mq_alloc_data alloc_data = { .flags = flags };
 	struct request *rq;
 	int ret;
 
@@ -1369,7 +1369,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { };
 	struct request *rq;
 	unsigned int request_count = 0, srcu_idx;
 	struct blk_plug *plug;
@@ -1491,7 +1491,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
 	struct blk_plug *plug;
 	unsigned int request_count = 0;
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { };
 	struct request *rq;
 	blk_qc_t cookie;
 	unsigned int wb_acct;
-- 
2.11.0


The new call trace:

[ 4277.729785] BUG: scheduling while atomic: mount/9209/0x0004
[ 4277.729824] 2 locks held by mount/9209:
[ 4277.729846]  #0:  (&type->s_umount_key#25/1){+.+.+.}, at: 
[] sget_userns+0x2bd/0x500
[ 4277.729881]  #1:  (rcu_read_lock){..}, at: [] 
__blk_mq_run_hw_queue+0xde/0x1c0
[ 4277.729911] Modules linked in: dm_service_time ib_srp scsi_transport_srp 
target_core_user uio target_core_pscsi target_core_file ib_srpt 
target_core_iblock target_core_mod brd netconsole xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack 
nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc 
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables 
x_tables af_packet ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad 
rdma_cm configfs ib_cm iw_cm msr mlx4_ib ib_core sb_edac edac_core 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm 
irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc 
aesni_intel aes_x86_64 mlx4_core crypto_simd iTCO_wdt
[ 4277.730048]  tg3 iTCO_vendor_support dcdbas glue_helper ptp ipmi_si pcspkr 
pps_core devlink ipmi_devintf cryptd libphy fjes ipmi_msghandler tpm_tis mei_me 
tpm_tis_core lpc_ich mfd_core shpchp mei tpm wmi button hid_generic usbhid 
mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm sr_mod cdrom drm ehci_pci ehci_hcd 
usbcore usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua 
autofs4
[ 4277.730135] CPU: 11 PID: 9209 Comm: mount Not tainted 4.10.0-rc5-dbg+ #2
[ 4277.730159] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
11/17/2014
[ 4277.730187] Call Trace:
[ 4277.730212]  dump_stack+0x68/0x93
[ 4277.730236]  __schedule_bug+0x5b/0x80
[ 4277.730259]  __schedule+0x762/0xb00
[ 4277.730281]  schedule+0x38/0x90
[ 4277.730302]  schedule_timeout+0x2fe/0x640
[ 4277.730324]  ? mark_held_locks+0x6f/0xa0
[ 4277.730349]  ? ktime_get+0x74/0x130
[ 4277.730370]  ? trace_hardirqs_on_caller+0xf9/0x1b0
[ 4277.730391]  ? trace_hardirqs_on+0xd/0x10
[ 4277.73041

Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 11:59 AM, h...@lst.de wrote:
> On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote:
>> It's against my for-4.11/block, which you were running under Christoph's
>> patches. Maybe he's using an older version? In any case, should be
>> pretty trivial for you to hand apply. Just ensure that .flags is set to
>> 0 for the common cases, and inherit 'flags' when it is passed in.
> 
> No, the flush op cleanups you asked for last round create a conflict
> with your patch.  They should be trivial to fix, though.

Ah, makes sense. And yes, as I said, should be trivial to hand apply the
hunk that does fail.

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 11:52 AM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 11:44 -0700, Jens Axboe wrote:
>> I think this may be my bug - does the below help?
> 
> Hello Jens,
> 
> What tree has that patch been generated against? It does not apply
> cleanly on top of Christoph's tree:
> 
> $ git checkout hch-block-pc-refactor
> $ patch -p1 --dry-run -f -s < ~/Re\:_split_scsi_passthrough_fields_out_of_struct_request_V2.mbox
> 1 out of 3 hunks FAILED

It's against my for-4.11/block, which you were running under Christoph's
patches. Maybe he's using an older version? In any case, should be
pretty trivial for you to hand apply. Just ensure that .flags is set to
0 for the common cases, and inherit 'flags' when it is passed in.
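
In plain C terms, the rule amounts to the following. This is a userspace
sketch: struct alloc_data is a hypothetical, cut-down stand-in for
struct blk_mq_alloc_data, and SK_REQ_NOWAIT is a made-up bit. A designated
initializer zeroes every member that is not named, so the common case
starts with flags == 0, and the setter passes the caller's flags through
instead of overwriting them with a literal 0:

```c
#include <assert.h>
#include <stddef.h>

#define SK_REQ_NOWAIT (1u << 0)		/* made-up flag bit for illustration */

/* Hypothetical, cut-down stand-in for struct blk_mq_alloc_data. */
struct alloc_data {
	void *q;
	unsigned int flags;
	void *ctx;
	void *hctx;
};

/* Common case: everything, including flags, starts out as zero. */
struct alloc_data alloc_data_default(void)
{
	struct alloc_data data = { 0 };

	return data;
}

/* Caller-supplied flags: only .flags is named, the rest is zeroed. */
struct alloc_data alloc_data_with_flags(unsigned int flags)
{
	struct alloc_data data = { .flags = flags };

	return data;
}

/* Fill in the remaining members while keeping the flags that were passed
 * in, rather than resetting them to 0. */
void set_alloc_data(struct alloc_data *data, void *q, unsigned int flags,
		    void *ctx, void *hctx)
{
	assert(data != NULL);
	data->q = q;
	data->flags = flags;
	data->ctx = ctx;
	data->hctx = hctx;
}
```

Calling set_alloc_data(&data, q, data.flags, ctx, hctx) then preserves a
NOWAIT request made by the caller, whereas passing 0 would silently drop it.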

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread h...@lst.de
On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote:
> It's against my for-4.11/block, which you were running under Christoph's
> patches. Maybe he's using an older version? In any case, should be
> pretty trivial for you to hand apply. Just ensure that .flags is set to
> 0 for the common cases, and inherit 'flags' when it is passed in.

No, the flush op cleanups you asked for last round create a conflict
with your patch.  They should be trivial to fix, though.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Thu, 2017-01-26 at 11:44 -0700, Jens Axboe wrote:
> I think this may be my bug - does the below help?

Hello Jens,

What tree has that patch been generated against? It does not apply
cleanly on top of Christoph's tree:

$ git checkout hch-block-pc-refactor
$ patch -p1 --dry-run -f -s < ~/Re\:_split_scsi_passthrough_fields_out_of_struct_request_V2.mbox
1 out of 3 hunks FAILED

Thanks,

Bart.



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Jens Axboe
On 01/26/2017 11:29 AM, Bart Van Assche wrote:
> On Wed, 2017-01-25 at 18:25 +0100, Christoph Hellwig wrote:
>> Hi all,
>>
>> this series splits the support for SCSI passthrough commands from the
>> main struct request used all over the block layer into a separate
>> scsi_request structure that drivers that want to support SCSI passthrough
>> need to embed as the first thing into their request-private data,
>> similar to how we handle NVMe passthrough commands.
>>
>> To support this I've added support for the private data after the
>> request structure to the legacy request path, so that it can
>> be treated the same way as the blk-mq path.  Compared to the current
>> scsi_cmnd allocator that actually is a major simplification.
>>
>> Changes since V1:
>>  - fix handling of a NULL sense pointer in __scsi_execute
>>  - clean up handling of the flush flags in the block layer and MD
>>  - additional small cleanup in dm-rq
> 
> Hello Christoph,
> 
> Thanks for having fixed the NULL pointer issue I had reported for v1.
> However, if I try to run my srp-test testsuite on top of your
> hch-block/block-pc-refactor branch (commit ID a07dc3521034) merged
> with v4.10-rc5 the following appears on the console:

I think this may be my bug - does the below help?


diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index d05061f27bb1..56b92db944ae 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -117,7 +117,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
 	ctx = blk_mq_get_ctx(q);
 	hctx = blk_mq_map_queue(q, ctx->cpu);
 
-	blk_mq_set_alloc_data(data, q, 0, ctx, hctx);
+	blk_mq_set_alloc_data(data, q, data->flags, ctx, hctx);
 
 	if (e) {
 		data->flags |= BLK_MQ_REQ_INTERNAL;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index dcb567642db7..9e4ed04f398c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -253,7 +253,7 @@ EXPORT_SYMBOL_GPL(__blk_mq_alloc_request);
 struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
 		unsigned int flags)
 {
-	struct blk_mq_alloc_data alloc_data;
+	struct blk_mq_alloc_data alloc_data = { .flags = flags };
 	struct request *rq;
 	int ret;
 
@@ -1382,7 +1382,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = bio->bi_opf & (REQ_PREFLUSH | REQ_FUA);
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { 0, };
 	struct request *rq;
 	unsigned int request_count = 0, srcu_idx;
 	struct blk_plug *plug;
@@ -1504,7 +1504,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	const int is_flush_fua = bio->bi_opf & (REQ_PREFLUSH | REQ_FUA);
 	struct blk_plug *plug;
 	unsigned int request_count = 0;
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { 0, };
 	struct request *rq;
 	blk_qc_t cookie;
 	unsigned int wb_acct;

-- 
Jens Axboe



Re: [dm-devel] split scsi passthrough fields out of struct request V2

2017-01-26 Thread Bart Van Assche
On Wed, 2017-01-25 at 18:25 +0100, Christoph Hellwig wrote:
> Hi all,
> 
> this series splits the support for SCSI passthrough commands from the
> main struct request used all over the block layer into a separate
> scsi_request structure that drivers that want to support SCSI passthrough
> need to embed as the first thing into their request-private data,
> similar to how we handle NVMe passthrough commands.
> 
> To support this I've added support for the private data after the
> request structure to the legacy request path, so that it can
> be treated the same way as the blk-mq path.  Compared to the current
> scsi_cmnd allocator that actually is a major simplification.
> 
> Changes since V1:
>  - fix handling of a NULL sense pointer in __scsi_execute
>  - clean up handling of the flush flags in the block layer and MD
>  - additional small cleanup in dm-rq

Hello Christoph,

Thanks for having fixed the NULL pointer issue I had reported for v1.
However, if I try to run my srp-test testsuite on top of your
hch-block/block-pc-refactor branch (commit ID a07dc3521034) merged
with v4.10-rc5 the following appears on the console:

[  707.317403] BUG: scheduling while atomic: fio/9073/0x0003
[  707.317404] 1 lock held by fio/9073:
[  707.317404]  #0:  (rcu_read_lock){..}, at: [] 
__blk_mq_run_hw_queue+0xde/0x1c0
[  707.317409] Modules linked in: dm_service_time ib_srp scsi_transport_srp 
target_core_user uio target_core_pscsi target_core_file ib_srpt 
target_core_iblock target_core_mod brd netconsole xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet ib_ipoib 
rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm configfs ib_cm iw_cm msr mlx4_ib 
ib_core sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp 
ipmi_ssif kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul mlx4_core 
crc32c_intel ghash_clmulni_intel hid_generic pcbc usbhid iTCO_wdt tg3 
aesni_intel
[  707.317445]  ptp iTCO_vendor_support aes_x86_64 crypto_simd pps_core 
glue_helper dcdbas ipmi_si ipmi_devintf libphy devlink lpc_ich cryptd pcspkr 
ipmi_msghandler mfd_core fjes mei_me tpm_tis button tpm_tis_core shpchp mei tpm 
wmi mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops ttm sr_mod cdrom drm ehci_pci ehci_hcd usbcore usb_common sg 
dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
[  707.317469] CPU: 6 PID: 9073 Comm: fio Tainted: GW   
4.10.0-rc5-dbg+ #1
[  707.317470] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 
11/17/2014
[  707.317470] Call Trace:
[  707.317473]  dump_stack+0x68/0x93
[  707.317475]  __schedule_bug+0x5b/0x80
[  707.317477]  __schedule+0x762/0xb00
[  707.317479]  schedule+0x38/0x90
[  707.317481]  schedule_timeout+0x2fe/0x640
[  707.317491]  io_schedule_timeout+0x9f/0x110
[  707.317493]  blk_mq_get_tag+0x158/0x260
[  707.317496]  __blk_mq_alloc_request+0x16/0xe0
[  707.317498]  blk_mq_sched_get_request+0x30d/0x360
[  707.317502]  blk_mq_alloc_request+0x3b/0x90
[  707.317505]  blk_get_request+0x2f/0x110
[  707.317507]  multipath_clone_and_map+0xcd/0x140 [dm_multipath]
[  707.317512]  map_request+0x3c/0x290 [dm_mod]
[  707.317517]  dm_mq_queue_rq+0x77/0x100 [dm_mod]
[  707.317519]  blk_mq_dispatch_rq_list+0x1ff/0x320
[  707.317521]  blk_mq_sched_dispatch_requests+0xa9/0xe0
[  707.317523]  __blk_mq_run_hw_queue+0x122/0x1c0
[  707.317528]  blk_mq_run_hw_queue+0x84/0x90
[  707.317530]  blk_mq_flush_plug_list+0x39f/0x480
[  707.317531]  blk_flush_plug_list+0xee/0x270
[  707.317533]  blk_finish_plug+0x27/0x40
[  707.317534]  do_io_submit+0x475/0x900
[  707.317537]  SyS_io_submit+0xb/0x10
[  707.317539]  entry_SYSCALL_64_fastpath+0x18/0xad

Bart.



[dm-devel] split scsi passthrough fields out of struct request V2

2017-01-25 Thread Christoph Hellwig
Hi all,

this series splits the support for SCSI passthrough commands from the
main struct request used all over the block layer into a separate
scsi_request structure that drivers that want to support SCSI passthrough
need to embed as the first thing into their request-private data,
similar to how we handle NVMe passthrough commands.

To support this I've added support for the private data after the
request structure to the legacy request path, so that it can
be treated the same way as the blk-mq path.  Compared to the current
scsi_cmnd allocator that actually is a major simplification.

Changes since V1:
 - fix handling of a NULL sense pointer in __scsi_execute
 - clean up handling of the flush flags in the block layer and MD
 - additional small cleanup in dm-rq



[dm-devel] split scsi passthrough fields out of struct request

2017-01-23 Thread Christoph Hellwig
Hi all,

this series splits the support for SCSI passthrough commands from the
main struct request used all over the block layer into a separate
scsi_request structure that drivers that want to support SCSI passthrough
need to embed as the first thing into their request-private data,
similar to how we handle NVMe passthrough commands.

To support this I've added support for the private data after the
request structure to the legacy request path, so that it can
be treated the same way as the blk-mq path.  Compared to the current
scsi_cmnd allocator that actually is a major simplification.

Compared to the previous RFC version the major change is that dm-mpath
works with this version.  To make it work I've switched the legacy
request dm-rq to use the same clone and map method as the blk-mq version.



Re: [dm-devel] split scsi passthrough fields out of struct request

2017-01-23 Thread Christoph Hellwig
On Mon, Jan 23, 2017 at 08:39:44AM -0700, Jens Axboe wrote:
> I'd like to get this in sooner rather than later, so I'll spend some
> time reviewing and testing it starting this week. I'm assuming you are
> targeting 4.11 with this change, right?

Yes.



Re: [dm-devel] split scsi passthrough fields out of struct request

2017-01-23 Thread Jens Axboe
On 01/23/2017 08:29 AM, Christoph Hellwig wrote:
> Hi all,
> 
> this series splits the support for SCSI passthrough commands from the
> main struct request used all over the block layer into a separate
> scsi_request structure that drivers that want to support SCSI passthrough
> need to embed as the first thing into their request-private data,
> similar to how we handle NVMe passthrough commands.
> 
> To support this I've added support for the private data after the
> request structure to the legacy request path, so that it can
> be treated the same way as the blk-mq path.  Compared to the current
> scsi_cmnd allocator that actually is a major simplification.
> 
> Compared to the previous RFC version the major change is that dm-mpath
> works with this version.  To make it work I've switched the legacy
> request dm-rq to use the same clone and map method as the blk-mq version.

I'd like to get this in sooner rather than later, so I'll spend some
time reviewing and testing it starting this week. I'm assuming you are
targeting 4.11 with this change, right?

-- 
Jens Axboe
