Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-11 Thread Bart Van Assche
On Fri, 2017-08-11 at 11:05 +1000, Michael Ellerman wrote:
> kworker/u193:0  D12736 6  2 0x0800
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
> [c003f7597410] [c0150d00] console_unlock+0x330/0x770 (unreliable)
> [c003f75975e0] [c001b3b8] __switch_to+0x2a8/0x460
> [c003f7597640] [c09abc60] __schedule+0x320/0xaa0
> [c003f7597710] [c09ac420] schedule+0x40/0xb0
> [c003f7597740] [c09b09d4] schedule_timeout+0x254/0x440
> [c003f7597820] [c09aca80] io_schedule_timeout+0x30/0x60
> [c003f7597850] [c09ad75c] 
> wait_for_common_io.constprop.2+0xbc/0x1e0
> [c003f75978d0] [c0509e6c] blk_execute_rq+0x4c/0x70
> [c003f7597920] [c0654abc] scsi_execute+0xfc/0x260
> [c003f7597990] [c0654f98] scsi_mode_sense+0x218/0x410
> [c003f7597aa0] [c068ee68] sd_revalidate_disk+0x908/0x1cd0
> [c003f7597be0] [c0690434] sd_probe_async+0xb4/0x220
> [c003f7597c60] [c0110a20] async_run_entry_fn+0x70/0x170
> [c003f7597ca0] [c0103904] process_one_work+0x2b4/0x560
> [c003f7597d30] [c0103c38] worker_thread+0x88/0x5a0
> [c003f7597dc0] [c010bfcc] kthread+0x15c/0x1a0
> [c003f7597e30] [c000ba1c] ret_from_kernel_thread+0x5c/0xc0

Hello Michael,

It is suspicious that entries like the above appear in the SysRq-w output.
Every time I saw this in the past it was caused by a block driver not having
called blk_end_request() or a SCSI LLD not having called .scsi_done().
Additionally, it is unlikely that the patch at the start of this thread
introduced this issue. Can you have another look at the ipr driver? If a
shell is available at the time this lockup occurs, it will probably be
helpful to have a look at the debugfs entries under /sys/kernel/debug/block/.

Thanks,

Bart.

Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-10 Thread Michael Ellerman
Bart Van Assche  writes:
> On Thu, 2017-08-10 at 20:32 +1000, Michael Ellerman wrote:
>> "Martin K. Petersen"  writes:
>> > > One of the two scsi-mq functions that requeue a request unprepares a
>> > > request before requeueing (scsi_io_completion()) but the other
>> > > function not (__scsi_queue_insert()). Make sure that a request is
>> > > unprepared before requeuing it.
>> > 
>> > Applied to 4.13/scsi-fixes. Thanks much!
>> 
>> This seems to be preventing my Power8 box, which uses IPR, from booting.
...
>
> Hello Michael,
>
> Thanks for having reported this early.

No worries.

> Is there any chance that you can
> reproduce this state, press SysRq-w on the console and collect the task
> overview that is reported on the console (see also Documentation/admin-guide/
> sysrq.rst)?

Here it is below. Doesn't seem to change over time.

I can do the scsi_logging_level thing as well as soon as I've got some
coffee :)

cheers


sysrq: SysRq : Show Blocked State
  taskPC stack   pid father
swapper/0   D10080 1  0 0x0800
Call Trace:
[c003f7583890] [c003f75838e0] 0xc003f75838e0 (unreliable)
[c003f7583a60] [c001b3b8] __switch_to+0x2a8/0x460
[c003f7583ac0] [c09abc60] __schedule+0x320/0xaa0
[c003f7583b90] [c09ac420] schedule+0x40/0xb0
[c003f7583bc0] [c0110fc4] async_synchronize_cookie_domain+0xd4/0x150
[c003f7583c30] [c0619f94] wait_for_device_probe+0x44/0xe0
[c003f7583c90] [c0c64ce4] prepare_namespace+0x58/0x248
[c003f7583d00] [c0c64478] kernel_init_freeable+0x310/0x348
[c003f7583dc0] [c000d6e4] kernel_init+0x24/0x150
[c003f7583e30] [c000ba1c] ret_from_kernel_thread+0x5c/0xc0
kworker/u193:0  D12736 6  2 0x0800
Workqueue: events_unbound async_run_entry_fn
Call Trace:
[c003f7597410] [c0150d00] console_unlock+0x330/0x770 (unreliable)
[c003f75975e0] [c001b3b8] __switch_to+0x2a8/0x460
[c003f7597640] [c09abc60] __schedule+0x320/0xaa0
[c003f7597710] [c09ac420] schedule+0x40/0xb0
[c003f7597740] [c09b09d4] schedule_timeout+0x254/0x440
[c003f7597820] [c09aca80] io_schedule_timeout+0x30/0x60
[c003f7597850] [c09ad75c] wait_for_common_io.constprop.2+0xbc/0x1e0
[c003f75978d0] [c0509e6c] blk_execute_rq+0x4c/0x70
[c003f7597920] [c0654abc] scsi_execute+0xfc/0x260
[c003f7597990] [c0654f98] scsi_mode_sense+0x218/0x410
[c003f7597aa0] [c068ee68] sd_revalidate_disk+0x908/0x1cd0
[c003f7597be0] [c0690434] sd_probe_async+0xb4/0x220
[c003f7597c60] [c0110a20] async_run_entry_fn+0x70/0x170
[c003f7597ca0] [c0103904] process_one_work+0x2b4/0x560
[c003f7597d30] [c0103c38] worker_thread+0x88/0x5a0
[c003f7597dc0] [c010bfcc] kthread+0x15c/0x1a0
[c003f7597e30] [c000ba1c] ret_from_kernel_thread+0x5c/0xc0
kworker/u193:1  D12480   412  2 0x0800
Workqueue: events_unbound async_run_entry_fn
Call Trace:
[c003f5907410] [c0150d00] console_unlock+0x330/0x770 (unreliable)
[c003f59075e0] [c001b3b8] __switch_to+0x2a8/0x460
[c003f5907640] [c09abc60] __schedule+0x320/0xaa0
[c003f5907710] [c09ac420] schedule+0x40/0xb0
[c003f5907740] [c09b09d4] schedule_timeout+0x254/0x440
[c003f5907820] [c09aca80] io_schedule_timeout+0x30/0x60
[c003f5907850] [c09ad75c] wait_for_common_io.constprop.2+0xbc/0x1e0
[c003f59078d0] [c0509e6c] blk_execute_rq+0x4c/0x70
[c003f5907920] [c0654abc] scsi_execute+0xfc/0x260
[c003f5907990] [c0654f98] scsi_mode_sense+0x218/0x410
[c003f5907aa0] [c068ee68] sd_revalidate_disk+0x908/0x1cd0
[c003f5907be0] [c0690434] sd_probe_async+0xb4/0x220
[c003f5907c60] [c0110a20] async_run_entry_fn+0x70/0x170
[c003f5907ca0] [c0103904] process_one_work+0x2b4/0x560
[c003f5907d30] [c0103c38] worker_thread+0x88/0x5a0
[c003f5907dc0] [c010bfcc] kthread+0x15c/0x1a0
[c003f5907e30] [c000ba1c] ret_from_kernel_thread+0x5c/0xc0
kworker/u193:2  D12832   421  2 0x0800
Workqueue: events_unbound async_run_entry_fn
Call Trace:
[c003f4103410] [c003f41035f0] 0xc003f41035f0 (unreliable)
[c003f41035e0] [c001b3b8] __switch_to+0x2a8/0x460
[c003f4103640] [c09abc60] __schedule+0x320/0xaa0
[c003f4103710] [c09ac420] schedule+0x40/0xb0
[c003f4103740] [c09b09d4] schedule_timeout+0x254/0x440
[c003f4103820] [c09aca80] io_schedule_timeout+0x30/0x60
[c003f4103850] [c09ad75c] wait_for_common_io.constprop.2+0xbc/0x1e0
[c003f41038d0] [c0509e6c] blk_execute_rq+0x4c/0x70
[c003f4103920] [c0654abc] scsi_execute+0xfc/0x260
[c003f4103990] [c0654f98] scsi_mode_sense+0x218/0x410
[c003f4103aa0] [c000

Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-10 Thread Brian King
On 08/10/2017 10:26 AM, Bart Van Assche wrote:
> On Thu, 2017-08-10 at 20:32 +1000, Michael Ellerman wrote:
>> "Martin K. Petersen"  writes:
 One of the two scsi-mq functions that requeue a request unprepares a
 request before requeueing (scsi_io_completion()) but the other
 function not (__scsi_queue_insert()). Make sure that a request is
 unprepared before requeuing it.
>>>
>>> Applied to 4.13/scsi-fixes. Thanks much!
>>
>> This seems to be preventing my Power8 box, which uses IPR, from booting.
>>
>> Bisect said so:
>>
>> # first bad commit: [270065e92c317845d69095ec8e3d18616b5b39d5] scsi: 
>> scsi-mq: Always unprepare before requeuing a request
>>
>> And if I revert that on top of next-20170809 my system boots again.
>>
>> The symptom is that it just gets "stuck" during boot and never gets to
>> mounting root, full log below, the end is:
>>
>>   .
>>   ready
>>   ready
>>   sd 0:2:4:0: [sde] 554287104 512-byte logical blocks: (284 GB/264 GiB)
>>   sd 0:2:4:0: [sde] 4096-byte physical blocks
>>   sd 0:2:5:0: [sdf] 272646144 512-byte logical blocks: (140 GB/130 GiB)
>>   sd 0:2:5:0: [sdf] 4096-byte physical blocks
>>   sd 0:2:4:0: [sde] Write Protect is off
>>   sd 0:2:4:0: [sde] Mode Sense: 0b 00 00 08
>>   sd 0:2:5:0: [sdf] Write Protect is off
>>   sd 0:2:5:0: [sdf] Mode Sense: 0b 00 00 08
>>
>>
>> And it just sits there for at least hours.
>>
>> I compared a good and bad boot log and there appears to be essentially
>> no difference. Certainly nothing that looks suspicous.
> 
> Hello Michael,
> 
> Thanks for having reported this early. Is there any chance that you can
> reproduce this state, press SysRq-w on the console and collect the task
> overview that is reported on the console (see also Documentation/admin-guide/
> sysrq.rst)? If this is not possible or if that task overview does not report
> any blocked tasks, can you add scsi_mod.scsi_logging_level=-1 to the kernel
> command line (either through /etc/default/grub or in /boot/grub2/grub.cfg
> when using GRUB), reboot the system, capture the console output and report
> that output as a reply to this e-mail?

I'm building a kernel to try to reproduce this on a machine with ipr.


-Brian


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-10 Thread Bart Van Assche
On Thu, 2017-08-10 at 20:32 +1000, Michael Ellerman wrote:
> "Martin K. Petersen"  writes:
> > > One of the two scsi-mq functions that requeue a request unprepares a
> > > request before requeueing (scsi_io_completion()) but the other
> > > function not (__scsi_queue_insert()). Make sure that a request is
> > > unprepared before requeuing it.
> > 
> > Applied to 4.13/scsi-fixes. Thanks much!
> 
> This seems to be preventing my Power8 box, which uses IPR, from booting.
> 
> Bisect said so:
> 
> # first bad commit: [270065e92c317845d69095ec8e3d18616b5b39d5] scsi: scsi-mq: 
> Always unprepare before requeuing a request
> 
> And if I revert that on top of next-20170809 my system boots again.
> 
> The symptom is that it just gets "stuck" during boot and never gets to
> mounting root, full log below, the end is:
> 
>   .
>   ready
>   ready
>   sd 0:2:4:0: [sde] 554287104 512-byte logical blocks: (284 GB/264 GiB)
>   sd 0:2:4:0: [sde] 4096-byte physical blocks
>   sd 0:2:5:0: [sdf] 272646144 512-byte logical blocks: (140 GB/130 GiB)
>   sd 0:2:5:0: [sdf] 4096-byte physical blocks
>   sd 0:2:4:0: [sde] Write Protect is off
>   sd 0:2:4:0: [sde] Mode Sense: 0b 00 00 08
>   sd 0:2:5:0: [sdf] Write Protect is off
>   sd 0:2:5:0: [sdf] Mode Sense: 0b 00 00 08
> 
> 
> And it just sits there for at least hours.
> 
> I compared a good and bad boot log and there appears to be essentially
> no difference. Certainly nothing that looks suspicous.

Hello Michael,

Thanks for having reported this early. Is there any chance that you can
reproduce this state, press SysRq-w on the console and collect the task
overview that is reported on the console (see also Documentation/admin-guide/
sysrq.rst)? If this is not possible or if that task overview does not report
any blocked tasks, can you add scsi_mod.scsi_logging_level=-1 to the kernel
command line (either through /etc/default/grub or in /boot/grub2/grub.cfg
when using GRUB), reboot the system, capture the console output and report
that output as a reply to this e-mail?

Thanks,

Bart.

Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-07 Thread Martin K. Petersen

Bart,

> One of the two scsi-mq functions that requeue a request unprepares a
> request before requeueing (scsi_io_completion()) but the other
> function not (__scsi_queue_insert()). Make sure that a request is
> unprepared before requeuing it.

Applied to 4.13/scsi-fixes. Thanks much!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-07 Thread Johannes Thumshirn

Thanks Bart,
Reviewed-by: Johannes Thumshirn 
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-05 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-04 Thread Damien Le Moal


On 8/4/17 06:40, Bart Van Assche wrote:
> One of the two scsi-mq functions that requeue a request unprepares
> a request before requeueing (scsi_io_completion()) but the other
> function not (__scsi_queue_insert()). Make sure that a request is
> unprepared before requeuing it.
> 
> Fixes: commit d285203cf647 ("scsi: add support for a blk-mq based I/O path.")
> Signed-off-by: Bart Van Assche 
> Cc: Christoph Hellwig 
> Cc: Hannes Reinecke 
> Cc: Damien Le Moal 
> Cc: Johannes Thumshirn 
> Cc: 
> ---
>  drivers/scsi/scsi_lib.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 4a2f705cdb14..c7514f3b444a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -44,6 +44,8 @@ static struct kmem_cache *scsi_sense_cache;
>  static struct kmem_cache *scsi_sense_isadma_cache;
>  static DEFINE_MUTEX(scsi_sense_cache_mutex);
>  
> +static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd);
> +
>  static inline struct kmem_cache *
>  scsi_select_sense_cache(bool unchecked_isa_dma)
>  {
> @@ -140,6 +142,12 @@ static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd)
>  {
>   struct scsi_device *sdev = cmd->device;
>  
> + if (cmd->request->rq_flags & RQF_DONTPREP) {
> + cmd->request->rq_flags &= ~RQF_DONTPREP;
> + scsi_mq_uninit_cmd(cmd);
> + } else {
> + WARN_ON_ONCE(true);
> + }
>   blk_mq_requeue_request(cmd->request, true);
>   put_device(&sdev->sdev_gendev);
>  }
> @@ -995,8 +1003,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
> int good_bytes)
>* A new command will be prepared and issued.
>*/
>   if (q->mq_ops) {
> - cmd->request->rq_flags &= ~RQF_DONTPREP;
> - scsi_mq_uninit_cmd(cmd);
>   scsi_mq_requeue_cmd(cmd);
>   } else {
>   scsi_release_buffers(cmd);
> 

Tested-by: Damien Le Moal 

This patch is needed for the V2 of the series "Zoned block device
support fixes" that I sent.

Best regards.

-- 
Damien Le Moal,
Western Digital


[PATCH] scsi-mq: Always unprepare before requeuing a request

2017-08-03 Thread Bart Van Assche
One of the two scsi-mq functions that requeue a request unprepares
a request before requeueing (scsi_io_completion()) but the other
function not (__scsi_queue_insert()). Make sure that a request is
unprepared before requeuing it.

Fixes: commit d285203cf647 ("scsi: add support for a blk-mq based I/O path.")
Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
Cc: Johannes Thumshirn 
Cc: 
---
 drivers/scsi/scsi_lib.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4a2f705cdb14..c7514f3b444a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -44,6 +44,8 @@ static struct kmem_cache *scsi_sense_cache;
 static struct kmem_cache *scsi_sense_isadma_cache;
 static DEFINE_MUTEX(scsi_sense_cache_mutex);
 
+static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd);
+
 static inline struct kmem_cache *
 scsi_select_sense_cache(bool unchecked_isa_dma)
 {
@@ -140,6 +142,12 @@ static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd)
 {
struct scsi_device *sdev = cmd->device;
 
+   if (cmd->request->rq_flags & RQF_DONTPREP) {
+   cmd->request->rq_flags &= ~RQF_DONTPREP;
+   scsi_mq_uninit_cmd(cmd);
+   } else {
+   WARN_ON_ONCE(true);
+   }
blk_mq_requeue_request(cmd->request, true);
put_device(&sdev->sdev_gendev);
 }
@@ -995,8 +1003,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * A new command will be prepared and issued.
 */
if (q->mq_ops) {
-   cmd->request->rq_flags &= ~RQF_DONTPREP;
-   scsi_mq_uninit_cmd(cmd);
scsi_mq_requeue_cmd(cmd);
} else {
scsi_release_buffers(cmd);
-- 
2.13.3