RE: [PATCH RESEND] scsi: qla2xxx: I/Os timing out on surprise removal of
This is still a bug in 4.20/scsi-queue. I am sending a new patch that applies cleanly to 4.20/scsi-queue as it stands now. I have tested it successfully. Regards -Bill -Original Message- From: Martin K. Petersen [mailto:martin.peter...@oracle.com] Sent: Friday, October 19, 2018 6:24 PM To: Kuzeja, William Cc: linux-scsi@vger.kernel.org; qla2xxx-upstr...@qlogic.com Subject: Re: [PATCH RESEND] scsi: qla2xxx: I/Os timing out on surprise removal of Bill, > When doing a surprise removal of an adapter, some in flight I/Os can > get stuck and take a while to complete (they actually timeout and are > retried). We are not handling an early error exit from > qla2xxx_eh_abort properly. This doesn't apply to 4.20/scsi-queue and the surrounding code has changed significantly. -- Martin K. Petersen Oracle Linux Engineering
RE: [PATCH 3/3] qla2x00: fix init error handling
On Thu, Mar 08, 2018 at 08:45:25AM, Meelis Roos wrote: > When firmware init fails, qla2x00_probe_one() does double free of req and rsp > queues and possibly other structures allocated by qla2x00_mem_alloc(). > Fix it by pulling out qla2x00_mem_free() and qla2x00_free_queues() invocations > from qla2x00_free_device() and call them manually where needed, and also zero > the req and rsp pointers after freeing them once in the error handler of > qla2x00_probe_one(). > This fixes memory corruption and further crashes in unrelated code when > qla2200 > init fails for some reason. > Signed-off-by: Meelis RoosHi Meelis, This issue should already be addressed by a very recent commit: 6a2cf8d3663e13e1 scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure Furthermore, the additions in qla2x00_remove_one of: + qla2x00_mem_free(ha); + + qla2x00_free_queues(ha); + are unnecessary. These routines are already called by qla2x00_free_device just above in qla2x00_remove_one. Regards, -bk
RE: [PATCH] scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure
Thanks for the quick reply and analysis Hannes! My apologies in advance, I'm stuck on Outlook here at work - I'll try to format this to be readable (hopefully it doesn't get mangled). On 02/27/2018 06:01 AM, Hannes Reinecke wrote: > Hmm. Isn't is rather the case that the labels and gotos are mixed up? > We have this order of labels: > probe_init_failed > probe_failed > probe_hw_failed > iospace_config_failed > disable_device > but this potential calling order: > disable_device > iospace_config_failed > probe_hw_failed > probe_hw_failed > probe_init_failed > probe_failed > probe_failed > probe_failed > So it looks to me as if the 'probe_init_failed' and 'probe_failed' > labels should be exchanged... > But yeah, we need to fix this up. > I've run into this issue myself recently :-( So, that is partly the problem. But.if we switch the order, we still have problems. Consider the code with order switched: probe_failed: qla2x00_free_device(base_vha); scsi_host_put(base_vha->host); probe_init_failed: qla2x00_free_req_que(ha, req); ha->req_q_map[0] = NULL; alloc_queues succeeds clear_bit(0, ha->req_qid_map); qla2x00_free_rsp_que(ha, rsp); ha->rsp_q_map[0] = NULL; clear_bit(0, ha->rsp_qid_map); ha->max_req_queues = ha->max_rsp_queues = 0; probe_hw_failed: qla2x00_mem_free(ha); qla2x00_free_req_que(ha, req); qla2x00_free_rsp_que(ha, rsp); qla2x00_clear_drv_active(ha); 1) We call probe_init_failed only if qla2x00_alloc_queues fails. But if this routine fails, ha->req_q_map = NULL (as well as rsp_q_map), so we crash doing ha->req_q_map[0] = NULL. So, I can remove these assignments. But the qla2x00_free_req_que calls are redundant (they are done below), so we're left with clearing out max_req_queues and max_rsp_queues for this label. Since we're freeing ha anyways, I'm not sure if this is worth having another label for. 2) We still have the problem where in probe_failed, qla2x00_free_device does frees which are repeated later on in probe_hw_failed. qla2x00_free_device -> qla2x00_mem_free qla2x00_free_device -> qla2x00_free_queues -> qla2x00_free_req_que 3) Also, I still need another label for when qla2x00_mem_alloc fails. We can't just jump to probe_hw_failed. I suppose we could do something like this: probe_failed: probe_failed = 1; qla2x00_free_device(base_vha); scsi_host_put(base_vha->host); probe_init_failed: ha->max_req_queues = ha->max_rsp_queues = 0; probe_hw_failed: if (!probe_failed) { qla2x00_mem_free(ha); qla2x00_free_req_que(ha, req); qla2x00_free_rsp_que(ha, rsp); } probe_hw_failed_no_mem: qla2x00_clear_drv_active(ha); This is essentially what I have, except my version does not have the probe_init_failed label. Thoughts?? Thanks again and regards. ---
RE: [PATCH] sd: Limit WRITE SAME / WRITE SAME(16) w/UNMAP length for certain devices
Ewan, I like it, more generic than my patch. I never saw the other cases, so I limited my patch to WS16. Acked-by: Bill KuzejaOn Tue-09-19 at 12:14 Ewan D. Milne wrote: > Some devices do not support a WRITE SAME / WRITE SAME(16) with the > UNMAP > bit set up to the length specified in the MAXIMUM WRITE SAME LENGTH > field > in the block limits VPD page (or, the field is zero, indicating there is > no limit). Limit the length by the MAXIMUM UNMAP LBA COUNT value. > Otherwise the command might be rejected. > > Signed-off-by: Ewan D. Milne > --- > drivers/scsi/sd.c | 14 +++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 6549e5c..fa07ac2 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -693,6 +693,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, > unsigned int mode) > struct request_queue *q = sdkp->disk->queue; > unsigned int logical_block_size = sdkp->device->sector_size; > unsigned int max_blocks = 0; > + u32 max_ws_blocks = sdkp->max_ws_blocks; > > q->limits.discard_alignment = > sdkp->unmap_alignment * logical_block_size; > @@ -701,6 +702,13 @@ static void sd_config_discard(struct scsi_disk *sdkp, > unsigned int mode) > sdkp->unmap_granularity * logical_block_size); > sdkp->provisioning_mode = mode; > > + /* > + * Some devices limit WRITE SAME / WRITE SAME(16) w/UNMAP > + * by MAXIMUM UNMAP LBA COUNT instead of MAXIMUM WRITE > SAME LENGTH. > + * Use the smaller of the two values to avoid ILLEGAL REQUEST > errors. > + */ > + max_ws_blocks = min_not_zero(max_ws_blocks, sdkp- > >max_unmap_blocks); > + > switch (mode) { > > case SD_LBP_FULL: > @@ -715,17 +723,17 @@ static void sd_config_discard(struct scsi_disk *sdkp, > unsigned int mode) > break; > > case SD_LBP_WS16: > - max_blocks = min_not_zero(sdkp->max_ws_blocks, > + max_blocks = min_not_zero(max_ws_blocks, > (u32)SD_MAX_WS16_BLOCKS); > break; > > case SD_LBP_WS10: > - max_blocks = min_not_zero(sdkp->max_ws_blocks, > + max_blocks = min_not_zero(max_ws_blocks, > (u32)SD_MAX_WS10_BLOCKS); > break; > > case SD_LBP_ZERO: > - max_blocks = min_not_zero(sdkp->max_ws_blocks, > + max_blocks = min_not_zero(max_ws_blocks, > (u32)SD_MAX_WS10_BLOCKS); > break; > } > -- > 2.1.0