Re: [PATCH 3/3] scsi: myrs: prevent negatives in disable_enclosure_messages_store()
Dan,

>> +if (value < 0 || value > 2)
>> return -EINVAL;
>
> It's not actually clear to me why we allow 2. Shouldn't we just use
> kstrtobool()?

Hannes?

--
Martin K. Petersen Oracle Linux Engineering
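For readers following the thread, the check under discussion can be modelled outside the kernel. The sketch below is a userspace stand-in, not the myrs code: `parse_level()` is a hypothetical name, and it uses `strtol()` in place of the kernel's `kstrto*()` helpers. It only illustrates what "reject negatives and anything above 2" means for a sysfs-style string.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace model of the store-side check quoted above.
 * Accepts a decimal string (optionally ending in one newline, as sysfs
 * writes often do) and allows only the values 0, 1 and 2.
 */
static int parse_level(const char *buf, int *out)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;		/* overflow or no digits at all */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (value < 0 || value > 2)
		return -EINVAL;		/* the range check from the patch */

	*out = (int)value;
	return 0;
}
```

If the attribute really is an on/off switch, as the question about kstrtobool() suggests, the upper bound would become 1 and the third value would go away.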
Re: [PATCH -next] mvsas: Remove set but not used variable 'id'
YueHaibing,

> Fixes gcc '-Wunused-but-set-variable' warning:
>
> drivers/scsi/mvsas/mv_sas.c: In function 'mvs_work_queue':
> drivers/scsi/mvsas/mv_sas.c:1909:31: warning:
>  variable 'id' set but not used [-Wunused-but-set-variable]
>
> It never used since introduction in commit
> 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes")

Applied to 4.20/scsi-queue. Thanks!

--
Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 0/7] qla2xxx patches for kernel v4.20
Bart,

> This is a series with mostly trivial patches for the qla2xxx
> driver. These patches address warnings reported by gcc and by the
> smatch and sparse static analyzers. Please consider these patches for
> kernel v4.20.

Applied to 4.20/scsi-queue. Thanks!

--
Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v4 00/11] Zoned block device support improvements
Jens,

> 2) The ordering of the signed-off-by. Someone told me that this is
>    patchwork, but I absolutely hate it. SOB should go last, not before
>    the reviewed-by. I fixed that up too.

You keep mentioning this, but I don't recall ever seeing anything to
that effect. The rest of the kernel appears to be either arbitrary
ordering or favoring author SoB as the first tag.

--
Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: 3w-{sas,9xxx}: Use unsigned char for cdb
Nathan,

> Clang warns a few times:
>
> drivers/scsi/3w-sas.c:386:11: warning: implicit conversion from 'int' to
> 'char' changes value from 128 to -128 [-Wconstant-conversion]
>         cdb[4] = TW_ALLOCATION_LENGTH; /* allocation length */
>                ~ ^~~~
>
> Update cdb's type to unsigned char, which matches the type of the cdb
> member in struct TW_Command_Apache.

Applied to 4.20/scsi-queue. Thank you.

--
Martin K. Petersen Oracle Linux Engineering
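The warning quoted above is easy to reproduce in isolation. The following is a small userspace sketch, not the 3w-sas code: `DEMO_ALLOC_LEN` is a hypothetical stand-in whose value 128 is taken from the warning text, and `signed char` is used explicitly because the signedness of plain `char` depends on the target.

```c
/* Hypothetical stand-in; the value 128 comes from the clang warning text. */
#define DEMO_ALLOC_LEN 128

/* Storing 128 in a signed 8-bit element cannot preserve the value. */
static int demo_store_signed(void)
{
	signed char cdb[16] = { 0 };

	/* Explicit cast: the result is implementation-defined, and is
	 * -128 on the usual two's-complement targets. */
	cdb[4] = (signed char)DEMO_ALLOC_LEN;
	return cdb[4];
}

/* With unsigned char the byte holds 128 as intended. */
static int demo_store_unsigned(void)
{
	unsigned char cdb[16] = { 0 };

	cdb[4] = DEMO_ALLOC_LEN;
	return cdb[4];
}
```

A CDB is a sequence of raw bytes, so an unsigned element type matches how the data is interpreted on the wire.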
Re: [PATCH] scsi/pmcraid.c: Use dma_pool_zalloc
On Mon, Oct 8, 2018 at 9:58 PM Souptick Joarder wrote:
>
> On Tue, Oct 2, 2018 at 10:53 AM Souptick Joarder wrote:
> >
> > Replaced dma_pool_alloc + memset with dma_pool_zalloc.
> >
> > Signed-off-by: Sabyasachi Gupta
> > Signed-off-by: Souptick Joarder
>
> Any comment on this patch ?

Any comment on this patch ?

> > ---
> >  drivers/scsi/pmcraid.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
> > index 4e86994..84a2734 100644
> > --- a/drivers/scsi/pmcraid.c
> > +++ b/drivers/scsi/pmcraid.c
> > @@ -4681,7 +4681,7 @@ static int pmcraid_allocate_control_blocks(struct
> > pmcraid_instance *pinstance)
> >
> >  	for (i = 0; i < PMCRAID_MAX_CMD; i++) {
> >  		pinstance->cmd_list[i]->ioa_cb =
> > -			dma_pool_alloc(
> > +			dma_pool_zalloc(
> >  				pinstance->control_pool,
> >  				GFP_KERNEL,
> >  				&(pinstance->cmd_list[i]->ioa_cb_bus_addr));
> > @@ -4690,8 +4690,6 @@ static int pmcraid_allocate_control_blocks(struct
> > pmcraid_instance *pinstance)
> >  			pmcraid_release_control_blocks(pinstance, i);
> >  			return -ENOMEM;
> >  		}
> > -		memset(pinstance->cmd_list[i]->ioa_cb, 0,
> > -			sizeof(struct pmcraid_control_block));
> >  	}
> >  	return 0;
> >  }
> > --
> > 1.9.1
> >
Re: [PATCH] scsi/mvsas/mv_sas.c: Use dma_pool_zalloc
Sabyasachi,

> Replaced dma_pool_alloc + memset with dma_pool_zalloc

Applied to 4.20/scsi-queue, thank you.

--
Martin K. Petersen Oracle Linux Engineering
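The substitution in these dma_pool patches (allocate, then memset, replaced by one zeroing allocation) has a familiar userspace analogue. The sketch below uses `malloc()`/`calloc()` purely as an illustration; `struct demo_block` and the function names are hypothetical and nothing here touches the DMA pool API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size block, standing in for a pool element. */
struct demo_block {
	unsigned char payload[64];
};

/* Old pattern: allocate, check, then clear in a separate step. */
static struct demo_block *demo_alloc_then_clear(void)
{
	struct demo_block *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	memset(b, 0, sizeof(*b));
	return b;
}

/* New pattern: one call that returns zeroed memory. */
static struct demo_block *demo_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct demo_block));
}

/* Helper for checking the result: returns 1 if every byte is zero. */
static int demo_is_zeroed(const struct demo_block *b)
{
	size_t i;

	for (i = 0; i < sizeof(b->payload); i++)
		if (b->payload[i] != 0)
			return 0;
	return 1;
}
```

Both variants hand back cleared memory; the single-call form simply removes the chance of a size mismatch between the allocation and the memset.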
Re: [RFC][PATCH v2] scsi: ufs: Fix hynix ufs bug with quirk on hi36xx SoC
John,

> Ok. Yea, I saw something similar in the qcom code, but I wasn't sure
> if folks would want host specific quirks isolated to host code.

Yeah, that's why I thought it would be good for the UFS folks to chime
in.

--
Martin K. Petersen Oracle Linux Engineering
Re: [RFC][PATCH v2] scsi: ufs: Fix hynix ufs bug with quirk on hi36xx SoC
On Tue, Oct 23, 2018 at 7:47 PM, Martin K. Petersen wrote:
>
> John,
>
> Thanks for tweaking this.
>
>> Not sure if this is the preferred way of scoping the quirk to
>> the controller or not. Feedback would be greatly appreciated!
>
> I think my preference would be to add:
>
>    UFS_FIX(UFS_VENDOR_SKHYNIX, "hB8aL1",
>            UFS_DEVICE_QUIRK_HOST_VS_DEBUG),
>
> to ufs_fixups[] and then key off of that in the driver. That's how we do
> it in SCSI but the UFS folks may have a different opinion.

Ok. Yea, I saw something similar in the qcom code, but I wasn't sure
if folks would want host specific quirks isolated to host code.

I appreciate the clarification, I'll rework and respin it here shortly.

Thanks again!
-john
Re: [PATCH 2/8] sg: introduce sg_log macro
Hi Doug,

> I'll follow what the scsi mid-level and the other ULDs do. IOW, no
> change. The debug messages they produce are quite helpful (to me, I
> use them a lot, and Tony B. has asked for more precision) and
> well-tuned to the SCSI subsystem (e.g. telling us what sdp represents
> in useful terms).
>
> And they can be compiled out (but not my pr_info above, probably
> should be a pr_warn).

I agree with Johannes. SCSI logging is in sustaining mode. We're trying
to remove it, not to add to it.

The kernel has much more capable and flexible methods of getting
information out to the user these days. No need to resort to arcane
logging masks and the like.

--
Martin K. Petersen Oracle Linux Engineering
Re: [RFC][PATCH v2] scsi: ufs: Fix hynix ufs bug with quirk on hi36xx SoC
John,

Thanks for tweaking this.

> Not sure if this is the preferred way of scoping the quirk to
> the controller or not. Feedback would be greatly appreciated!

I think my preference would be to add:

   UFS_FIX(UFS_VENDOR_SKHYNIX, "hB8aL1",
           UFS_DEVICE_QUIRK_HOST_VS_DEBUG),

to ufs_fixups[] and then key off of that in the driver. That's how we do
it in SCSI but the UFS folks may have a different opinion.

--
Martin K. Petersen Oracle Linux Engineering
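The "add an entry to a fixup table, then key off the resulting flag" approach suggested above can be sketched in plain C. Everything prefixed `demo_`/`DEMO_` below is hypothetical, including the vendor ID value and the flag bit; only the model string "hB8aL1" comes from the message. It is a model of the table-lookup idea, not the UFS core's actual data structures.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder values for this sketch only. */
#define DEMO_VENDOR_SKHYNIX		0x1u
#define DEMO_QUIRK_HOST_VS_DEBUG	(1u << 0)

struct demo_fixup {
	unsigned int vendor;	/* device manufacturer ID */
	const char *model;	/* model string to match */
	unsigned int quirk;	/* quirk bits to apply on a match */
};

/* Table of known-problematic devices, ended by a NULL model. */
static const struct demo_fixup demo_fixups[] = {
	{ DEMO_VENDOR_SKHYNIX, "hB8aL1", DEMO_QUIRK_HOST_VS_DEBUG },
	{ 0, NULL, 0 },
};

/* Collect the quirk bits of every table entry matching the device. */
static unsigned int demo_lookup_quirks(unsigned int vendor, const char *model)
{
	const struct demo_fixup *f;
	unsigned int quirks = 0;

	for (f = demo_fixups; f->model != NULL; f++)
		if (f->vendor == vendor && strcmp(f->model, model) == 0)
			quirks |= f->quirk;
	return quirks;
}
```

Host-specific code would then test the returned bits rather than matching device strings itself, which keeps the device list in one place.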
[PATCH 09/12] lpfc: Correct loss of fc4 type on remote port address change
An address change for a remote port causes PRLI for the wrong protocol
to be sent. The node copy done in the discovery code skipped copying
the fc4 protocols supported as well.

Fix the copy logic for the address change. Beefed up log messages in
this area as well.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_els.c       | 27 +++
 drivers/scsi/lpfc/lpfc_nportdisc.c |  5 +++--
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index a200cdaf34a6..ebd6c7251ad8 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -1556,8 +1556,10 @@ lpfc_plogi_confirm_nport(struct lpfc_hba *phba, uint32_t *prsp,
 	 */
 	new_ndlp = lpfc_findnode_wwpn(vport, &sp->portName);
 
+	/* return immediately if the WWPN matches ndlp */
 	if (new_ndlp == ndlp && NLP_CHK_NODE_ACT(new_ndlp))
 		return ndlp;
+
 	if (phba->sli_rev == LPFC_SLI_REV4) {
 		active_rrqs_xri_bitmap = mempool_alloc(phba->active_rrq_pool,
 						       GFP_KERNEL);
@@ -1566,9 +1568,13 @@ lpfc_plogi_confirm_nport(struct lpfc_hba *phba, uint32_t *prsp,
 			       phba->cfg_rrq_xri_bitmap_sz);
 	}
 
-	lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS,
-		 "3178 PLOGI confirm: ndlp %p x%x: new_ndlp %p\n",
-		 ndlp, ndlp->nlp_DID, new_ndlp);
+	lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS | LOG_NODE,
+			 "3178 PLOGI confirm: ndlp x%x x%x x%x: "
+			 "new_ndlp x%x x%x x%x\n",
+			 ndlp->nlp_DID, ndlp->nlp_flag, ndlp->nlp_fc4_type,
+			 (new_ndlp ? new_ndlp->nlp_DID : 0),
+			 (new_ndlp ? new_ndlp->nlp_flag : 0),
+			 (new_ndlp ? new_ndlp->nlp_fc4_type : 0));
 
 	if (!new_ndlp) {
 		rc = memcmp(&ndlp->nlp_portname, name,
@@ -1617,6 +1623,14 @@ lpfc_plogi_confirm_nport(struct lpfc_hba *phba, uint32_t *prsp,
 			       phba->cfg_rrq_xri_bitmap_sz);
 	}
 
+	/* At this point in this routine, we know new_ndlp will be
+	 * returned. however, any previous GID_FTs that were done
+	 * would have updated nlp_fc4_type in ndlp, so we must ensure
+	 * new_ndlp has the right value.
+	 */
+	if (vport->fc_flag & FC_FABRIC)
+		new_ndlp->nlp_fc4_type = ndlp->nlp_fc4_type;
+
 	lpfc_unreg_rpi(vport, new_ndlp);
 	new_ndlp->nlp_DID = ndlp->nlp_DID;
 	new_ndlp->nlp_prev_state = ndlp->nlp_prev_state;
@@ -1666,7 +1680,6 @@ lpfc_plogi_confirm_nport(struct lpfc_hba *phba, uint32_t *prsp,
 		if (ndlp->nrport) {
 			ndlp->nrport = NULL;
 			lpfc_nlp_put(ndlp);
-			new_ndlp->nlp_fc4_type = ndlp->nlp_fc4_type;
 		}
 
 		/* We shall actually free the ndlp with both nlp_DID and
@@ -1740,6 +1753,12 @@ lpfc_plogi_confirm_nport(struct lpfc_hba *phba, uint32_t *prsp,
 	    active_rrqs_xri_bitmap)
 		mempool_free(active_rrqs_xri_bitmap,
 			     phba->active_rrq_pool);
+
+	lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS | LOG_NODE,
+			 "3173 PLOGI confirm exit: new_ndlp x%x x%x x%x\n",
+			 new_ndlp->nlp_DID, new_ndlp->nlp_flag,
+			 new_ndlp->nlp_fc4_type);
+
 	return new_ndlp;
 }
 
diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c
index 394ffbe9cb6d..6827ffef3261 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -2868,8 +2868,9 @@ lpfc_disc_state_machine(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
 	/* DSM in event <evt> on NPort <nlp_DID> in state <cur_state> */
 	lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
 			 "0211 DSM in event x%x on NPort x%x in "
-			 "state %d Data: x%x\n",
-			 evt, ndlp->nlp_DID, cur_state, ndlp->nlp_flag);
+			 "state %d Data: x%x x%x\n",
+			 evt, ndlp->nlp_DID, cur_state,
+			 ndlp->nlp_flag, ndlp->nlp_fc4_type);
 
 	lpfc_debugfs_disc_trc(vport, LPFC_DISC_TRC_DSM,
 		 "DSM in:          evt:%d ste:%d did:x%x",
--
2.13.1
[PATCH 04/12] lpfc: Reset link or adapter instead of doing infinite nameserver PLOGI retry
Currently, PLOGI failures are infinitely delayed/retried. There have been some fabric situations where the PLOGI's were to the nameserver and it stopped responding. The retries would never clear up. A better resolution in this situation is to retry a couple of times, then drop the link and reinit. This brings back connectivity to the nameserver. Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/scsi/lpfc/lpfc_crtn.h | 1 + drivers/scsi/lpfc/lpfc_els.c | 83 ++- 2 files changed, 83 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index e01136507780..e9b297a39e54 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -380,6 +380,7 @@ void lpfc_nvmet_buf_free(struct lpfc_hba *phba, void *virtp, dma_addr_t dma); void lpfc_in_buf_free(struct lpfc_hba *, struct lpfc_dmabuf *); void lpfc_rq_buf_free(struct lpfc_hba *phba, struct lpfc_dmabuf *mp); +int lpfc_link_reset(struct lpfc_vport *vport); /* Function prototypes. */ const char* lpfc_info(struct Scsi_Host *); diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 8160a5ebad08..e3e851931394 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -3242,6 +3242,62 @@ lpfc_els_retry_delay_handler(struct lpfc_nodelist *ndlp) } /** + * lpfc_link_reset - Issue link reset + * @vport: pointer to a virtual N_Port data structure. + * + * This routine performs link reset by sending INIT_LINK mailbox command. + * For SLI-3 adapter, link attention interrupt is enabled before issuing + * INIT_LINK mailbox command. 
+ * + * Return code + * 0 - Link reset initiated successfully + * 1 - Failed to initiate link reset + **/ +int +lpfc_link_reset(struct lpfc_vport *vport) +{ + struct lpfc_hba *phba = vport->phba; + LPFC_MBOXQ_t *mbox; + uint32_t control; + int rc; + + lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, +"2851 Attempt link reset\n"); + mbox = mempool_alloc(phba->mbox_mem_pool, GFP_KERNEL); + if (!mbox) { + lpfc_printf_log(phba, KERN_ERR, LOG_MBOX, + "2852 Failed to allocate mbox memory"); + return 1; + } + + /* Enable Link attention interrupts */ + if (phba->sli_rev <= LPFC_SLI_REV3) { + spin_lock_irq(&phba->hbalock); + phba->sli.sli_flag |= LPFC_PROCESS_LA; + control = readl(phba->HCregaddr); + control |= HC_LAINT_ENA; + writel(control, phba->HCregaddr); + readl(phba->HCregaddr); /* flush */ + spin_unlock_irq(&phba->hbalock); + } + + lpfc_init_link(phba, mbox, phba->cfg_topology, + phba->cfg_link_speed); + mbox->mbox_cmpl = lpfc_sli_def_mbox_cmpl; + mbox->vport = vport; + rc = lpfc_sli_issue_mbox(phba, mbox, MBX_NOWAIT); + if ((rc != MBX_BUSY) && (rc != MBX_SUCCESS)) { + lpfc_printf_log(phba, KERN_ERR, LOG_MBOX, + "2853 Failed to issue INIT_LINK " + "mbox command, rc:x%x\n", rc); + mempool_free(mbox, phba->mbox_mem_pool); + return 1; + } + + return 0; +} + +/** * lpfc_els_retry - Make retry decision on an els command iocb * @phba: pointer to lpfc hba data structure. * @cmdiocb: pointer to lpfc command iocb data structure. 
@@ -3277,6 +,7 @@ lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, int logerr = 0; uint32_t cmd = 0; uint32_t did; + int link_reset = 0, rc; /* Note: context2 may be 0 for internal driver abort @@ -3358,7 +3415,6 @@ lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, retry = 1; break; - case IOERR_SEQUENCE_TIMEOUT: case IOERR_INVALID_RPI: if (cmd == ELS_CMD_PLOGI && did == NameServer_DID) { @@ -3369,6 +3425,18 @@ lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, } retry = 1; break; + + case IOERR_SEQUENCE_TIMEOUT: + if (cmd == ELS_CMD_PLOGI && + did == NameServer_DID && + (cmdiocb->retry + 1) == maxretry) { + /* Reset the Link */ + link_reset = 1; + break; + } + retry = 1; + delay = 100; + break; } break; @@ -3525,6 +3593,19 @@ lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, break; } + if (link_reset)
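The retry policy this patch describes (keep retrying a timed-out nameserver PLOGI, but reset the link on the attempt that would exhaust the retry budget) can be modelled as a small pure function. This is a hedged userspace sketch: the `demo_` names are hypothetical, and the delay value 100 is simply copied from the hunk above, whose unit is not stated there.

```c
/* Possible outcomes of the decision, hypothetical names. */
enum demo_action {
	DEMO_RETRY,		/* reissue the command after a delay */
	DEMO_LINK_RESET,	/* give up retrying and reset the link */
};

/*
 * Decide what to do after a sequence timeout.
 *
 * is_ns_plogi:  nonzero if the command was a PLOGI to the nameserver
 * retries_done: how many retries have already been made
 * maxretry:     retry budget for this command
 * delay:        out parameter, delay to apply before a retry
 */
static enum demo_action demo_on_seq_timeout(int is_ns_plogi, int retries_done,
					    int maxretry, int *delay)
{
	/* Mirrors the (cmdiocb->retry + 1) == maxretry test in the patch. */
	if (is_ns_plogi && retries_done + 1 == maxretry) {
		*delay = 0;
		return DEMO_LINK_RESET;
	}

	*delay = 100;	/* value taken from the patch */
	return DEMO_RETRY;
}
```

Commands other than a nameserver PLOGI never take the reset branch in this model, which matches the narrow scope described in the commit message.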
[PATCH 00/12] lpfc updates for 12.0.0.8
This patch set contains lpfc bug fixes and 2 enhancements.

The patches were cut against Martin's 4.20/scsi-queue tree.

James Smart (12):
  lpfc: Correct speeds on SFP swap
  lpfc: Fix lpfc_sli4_read_config return value check
  lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
  lpfc: Reset link or adapter instead of doing infinite nameserver PLOGI
    retry
  lpfc: Correct errors accessing fw log
  lpfc: fcoe: Fix link down issue after 1000+ link bounces
  lpfc: Correct LCB RJT handling
  lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point
  lpfc: Correct loss of fc4 type on remote port address change
  lpfc: Implement GID_PT on Nameserver query to support faster failover
  lpfc: add Trunking support
  lpfc: update driver version to 12.0.0.8

 drivers/scsi/lpfc/lpfc.h           |  15 +++
 drivers/scsi/lpfc/lpfc_attr.c      | 115 ++
 drivers/scsi/lpfc/lpfc_bsg.c       | 138 +++--
 drivers/scsi/lpfc/lpfc_bsg.h       |  38 ++
 drivers/scsi/lpfc/lpfc_crtn.h      |   2 +
 drivers/scsi/lpfc/lpfc_ct.c        | 211
 drivers/scsi/lpfc/lpfc_els.c       | 242 ++---
 drivers/scsi/lpfc/lpfc_hbadisc.c   |  59 +
 drivers/scsi/lpfc/lpfc_hw.h        |   1 +
 drivers/scsi/lpfc/lpfc_hw4.h       |  68 +++
 drivers/scsi/lpfc/lpfc_init.c      | 213 +---
 drivers/scsi/lpfc/lpfc_nportdisc.c |  23 +++-
 drivers/scsi/lpfc/lpfc_scsi.h      |   4 +
 drivers/scsi/lpfc/lpfc_sli.c       |  22 ++--
 drivers/scsi/lpfc/lpfc_sli4.h      |  14 +++
 drivers/scsi/lpfc/lpfc_version.h   |   2 +-
 16 files changed, 1049 insertions(+), 118 deletions(-)

--
2.13.1
[PATCH 07/12] lpfc: Correct LCB RJT handling
When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. It should have been set to
"command in progress".

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 25625c03a5b3..832e5e00c1c9 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5776,6 +5776,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
 	stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
 	stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+	if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+		stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
 	elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
 	phba->fc_stat.elsXmitLSRJT++;
 	rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
--
2.13.1
[PATCH 12/12] lpfc: update driver version to 12.0.0.8
Update the driver version to 12.0.0.8.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_version.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h
index 5a0d512ff497..d0b2dd9b737f 100644
--- a/drivers/scsi/lpfc/lpfc_version.h
+++ b/drivers/scsi/lpfc/lpfc_version.h
@@ -20,7 +20,7 @@
  * included with this package. *
  ***/
 
-#define LPFC_DRIVER_VERSION "12.0.0.7"
+#define LPFC_DRIVER_VERSION "12.0.0.8"
 #define LPFC_DRIVER_NAME "lpfc"
 
 /* Used for SLI 2/3 */
--
2.13.1
[PATCH 08/12] lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point
Testing a point-to-point topology and a case of re-FLOGI without intervening link bouncing, showed an odd interaction with firmware and a resulting scenario where the driver no longer probed after accepting the new FLOGI. Work around the firmware issue by issuing a link bounce if a FLOGI is received after the link is already up and FLOGI's accepted. While debugging the issue, realized that some debug traces should be clarified to help in the future. Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/scsi/lpfc/lpfc.h | 1 + drivers/scsi/lpfc/lpfc_els.c | 66 drivers/scsi/lpfc/lpfc_hbadisc.c | 9 ++ 3 files changed, 64 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 4fe04c00a390..1dfe71f0fcfd 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -490,6 +490,7 @@ struct lpfc_vport { struct nvme_fc_local_port *localport; uint8_t nvmei_support; /* driver supports NVME Initiator */ uint32_t last_fcp_wqidx; + uint32_t rcv_flogi_cnt; /* How many unsol FLOGIs ACK'd. 
*/ }; struct hbq_s { diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 832e5e00c1c9..a200cdaf34a6 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -1057,9 +1057,9 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, goto flogifail; lpfc_printf_vlog(vport, KERN_WARNING, LOG_ELS, -"0150 FLOGI failure Status:x%x/x%x TMO:x%x\n", +"0150 FLOGI failure Status:x%x/x%x xri x%x TMO:x%x\n", irsp->ulpStatus, irsp->un.ulpWord[4], -irsp->ulpTimeout); +cmdiocb->sli4_xritag, irsp->ulpTimeout); /* FLOGI failed, so there is no fabric */ spin_lock_irq(shost->host_lock); @@ -1113,7 +1113,8 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, /* FLOGI completes successfully */ lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS, "0101 FLOGI completes successfully, I/O tag:x%x, " -"Data: x%x x%x x%x x%x x%x x%x\n", cmdiocb->iotag, +"xri x%x Data: x%x x%x x%x x%x x%x %x\n", +cmdiocb->iotag, cmdiocb->sli4_xritag, irsp->un.ulpWord[4], sp->cmn.e_d_tov, sp->cmn.w2.r_a_tov, sp->cmn.edtovResolution, vport->port_state, vport->fc_flag); @@ -4347,14 +4348,6 @@ lpfc_els_rsp_acc(struct lpfc_vport *vport, uint32_t flag, default: return 1; } - /* Xmit ELS ACC response tag */ - lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS, -"0128 Xmit ELS ACC response tag x%x, XRI: x%x, " -"DID: x%x, nlp_flag: x%x nlp_state: x%x RPI: x%x " -"fc_flag x%x\n", -elsiocb->iotag, elsiocb->iocb.ulpContext, -ndlp->nlp_DID, ndlp->nlp_flag, ndlp->nlp_state, -ndlp->nlp_rpi, vport->fc_flag); if (ndlp->nlp_flag & NLP_LOGO_ACC) { spin_lock_irq(shost->host_lock); if (!(ndlp->nlp_flag & NLP_RPI_REGISTERED || @@ -4523,6 +4516,15 @@ lpfc_els_rsp_adisc_acc(struct lpfc_vport *vport, struct lpfc_iocbq *oldiocb, lpfc_els_free_iocb(phba, elsiocb); return 1; } + + /* Xmit ELS ACC response tag */ + lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS, +"0128 Xmit ELS ACC response Status: x%x, IoTag: x%x, " +"XRI: x%x, DID: x%x, nlp_flag: x%x nlp_state: x%x " 
+"RPI: x%x, fc_flag x%x\n", +rc, elsiocb->iotag, elsiocb->sli4_xritag, +ndlp->nlp_DID, ndlp->nlp_flag, ndlp->nlp_state, +ndlp->nlp_rpi, vport->fc_flag); return 0; } @@ -6533,6 +6535,11 @@ lpfc_els_rcv_flogi(struct lpfc_vport *vport, struct lpfc_iocbq *cmdiocb, port_state = vport->port_state; vport->fc_flag |= FC_PT2PT; vport->fc_flag &= ~(FC_FABRIC | FC_PUBLIC_LOOP); + + /* Acking an unsol FLOGI. Count 1 for link bounce +* work-around. +*/ + vport->rcv_flogi_cnt++; spin_unlock_irq(shost->host_lock); lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS, "3311 Rcv Flogi PS x%x new PS x%x " @@ -7930,8 +7937,9 @@ lpfc_els_unsol_buffer(struct lpfc_hba *phba, struct lpfc_sli_ring *pring, struct ls_rjt stat; uint32_t *payload; uint32_t cmd, did, newnode; - uint8_t rjt_exp, rjt_err = 0; + uint8_t rjt_exp, rjt_err = 0, init_link = 0; IOCB_t *ic
[PATCH 10/12] lpfc: Implement GID_PT on Nameserver query to support faster failover
The switches seem to respond faster to GID_PT vs GID_FT NameServer queries. Add support for GID_PT to be used over GID_FT to enable faster storage failover detection. Includes addition of new module parameter to select between GID_PT and GID_FT (GID_FT is default). Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/scsi/lpfc/lpfc.h | 1 + drivers/scsi/lpfc/lpfc_attr.c | 14 +++ drivers/scsi/lpfc/lpfc_crtn.h | 1 + drivers/scsi/lpfc/lpfc_ct.c| 206 + drivers/scsi/lpfc/lpfc_els.c | 10 +- drivers/scsi/lpfc/lpfc_hbadisc.c | 29 ++ drivers/scsi/lpfc/lpfc_hw.h| 1 + drivers/scsi/lpfc/lpfc_hw4.h | 4 + drivers/scsi/lpfc/lpfc_nportdisc.c | 13 ++- 9 files changed, 275 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 1dfe71f0fcfd..979366fc34d4 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -784,6 +784,7 @@ struct lpfc_hba { #define LPFC_FCF_PRIORITY 2/* Priority fcf failover */ uint32_t cfg_fcf_failover_policy; uint32_t cfg_fcp_io_sched; + uint32_t cfg_ns_query; uint32_t cfg_fcp2_no_tgt_reset; uint32_t cfg_cr_delay; uint32_t cfg_cr_count; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 73e2296796e6..159ede7032dc 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -5065,6 +5065,18 @@ LPFC_ATTR_RW(fcp_io_sched, LPFC_FCP_SCHED_ROUND_ROBIN, "issuing commands [0] - Round Robin, [1] - Current CPU"); /* + * lpfc_ns_query: Determine algrithmn for NameServer queries after RSCN + * range is [0,1]. Default value is 0. + * For [0], GID_FT is used for NameServer queries after RSCN (default) + * For [1], GID_PT is used for NameServer queries after RSCN + * + */ +LPFC_ATTR_RW(ns_query, LPFC_NS_QUERY_GID_FT, +LPFC_NS_QUERY_GID_FT, LPFC_NS_QUERY_GID_PT, +"Determine algorithm NameServer queries after RSCN " +"[0] - GID_FT, [1] - GID_PT"); + +/* # lpfc_fcp2_no_tgt_reset: Determine bus reset behavior # range is [0,1]. Default value is 0. 
# For [0], bus reset issues target reset to ALL devices @@ -5509,6 +5521,7 @@ struct device_attribute *lpfc_hba_attrs[] = { &dev_attr_lpfc_scan_down, &dev_attr_lpfc_link_speed, &dev_attr_lpfc_fcp_io_sched, + &dev_attr_lpfc_ns_query, &dev_attr_lpfc_fcp2_no_tgt_reset, &dev_attr_lpfc_cr_delay, &dev_attr_lpfc_cr_count, @@ -6559,6 +6572,7 @@ void lpfc_get_cfgparam(struct lpfc_hba *phba) { lpfc_fcp_io_sched_init(phba, lpfc_fcp_io_sched); + lpfc_ns_query_init(phba, lpfc_ns_query); lpfc_fcp2_no_tgt_reset_init(phba, lpfc_fcp2_no_tgt_reset); lpfc_cr_delay_init(phba, lpfc_cr_delay); lpfc_cr_count_init(phba, lpfc_cr_count); diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index e9b297a39e54..a4b1bc2782eb 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -175,6 +175,7 @@ void lpfc_hb_timeout_handler(struct lpfc_hba *); void lpfc_ct_unsol_event(struct lpfc_hba *, struct lpfc_sli_ring *, struct lpfc_iocbq *); int lpfc_ct_handle_unsol_abort(struct lpfc_hba *, struct hbq_dmabuf *); +int lpfc_issue_gidpt(struct lpfc_vport *vport); int lpfc_issue_gidft(struct lpfc_vport *vport); int lpfc_get_gidft_type(struct lpfc_vport *vport, struct lpfc_iocbq *iocbq); int lpfc_ns_cmd(struct lpfc_vport *, int, uint8_t, uint32_t); diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c index 789ad1502534..62e8ae3b4685 100644 --- a/drivers/scsi/lpfc/lpfc_ct.c +++ b/drivers/scsi/lpfc/lpfc_ct.c @@ -832,6 +832,198 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, } static void +lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, + struct lpfc_iocbq *rspiocb) +{ + struct lpfc_vport *vport = cmdiocb->vport; + struct Scsi_Host *shost = lpfc_shost_from_vport(vport); + IOCB_t *irsp; + struct lpfc_dmabuf *outp; + struct lpfc_dmabuf *inp; + struct lpfc_sli_ct_request *CTrsp; + struct lpfc_sli_ct_request *CTreq; + struct lpfc_nodelist *ndlp; + int rc; + + /* First save ndlp, before we 
overwrite it */ + ndlp = cmdiocb->context_un.ndlp; + + /* we pass cmdiocb to state machine which needs rspiocb as well */ + cmdiocb->context_un.rsp_iocb = rspiocb; + inp = (struct lpfc_dmabuf *)cmdiocb->context1; + outp = (struct lpfc_dmabuf *)cmdiocb->context2; + irsp = &rspiocb->iocb; + + lpfc_debugfs_disc_trc(vport, LPFC_DISC_TRC_CT, + "GID_PT cmpl: status:x%x/x%x rtry:%d", + irsp->ulpStatus, irsp->un.ulpWord[4], + vport->fc_ns_retry); + +
[PATCH 01/12] lpfc: Correct speeds on SFP swap
Supported speeds are not updated when an SFP is removed or replaced.

The supported speed is obtained from the lmt field in the READ_CONFIG
mailbox response. The driver updates supported speeds only once, from
the PCI probe path, and never again after that. So the supported speeds
remain the same until reboot or driver reload.

When an SFP is removed or inserted, the driver gets an SLI-Port Event
ACQE. If the SFP is removed, lmt will have value 0. If a different SFP
is inserted, lmt will have a value according to its supported speeds.

So, after the SLI-Port Event ACQE handling path, send a READ_CONFIG
mailbox command and update supported speeds. If READ_CONFIG fails, set
supported speeds to unknown and log.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_init.c | 63 +++
 1 file changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 323a32e87258..c78ae81b5701 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4102,6 +4102,30 @@ int lpfc_scan_finished(struct Scsi_Host *shost, unsigned long time)
 	return stat;
 }
 
+void lpfc_host_supported_speeds_set(struct Scsi_Host *shost)
+{
+	struct lpfc_vport *vport = (struct lpfc_vport *)shost->hostdata;
+	struct lpfc_hba *phba = vport->phba;
+
+	fc_host_supported_speeds(shost) = 0;
+	if (phba->lmt & LMT_64Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_64GBIT;
+	if (phba->lmt & LMT_32Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_32GBIT;
+	if (phba->lmt & LMT_16Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_16GBIT;
+	if (phba->lmt & LMT_10Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_10GBIT;
+	if (phba->lmt & LMT_8Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_8GBIT;
+	if (phba->lmt & LMT_4Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_4GBIT;
+	if (phba->lmt & LMT_2Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_2GBIT;
+	if (phba->lmt & LMT_1Gb)
+		fc_host_supported_speeds(shost) |= FC_PORTSPEED_1GBIT;
+}
+
/** * lpfc_host_attrib_init - Initialize SCSI host attributes on a FC port * @shost: pointer to SCSI host data structure. @@ -4129,23 +4153,7 @@ void lpfc_host_attrib_init(struct Scsi_Host *shost) lpfc_vport_symbolic_node_name(vport, fc_host_symbolic_name(shost), sizeof fc_host_symbolic_name(shost)); - fc_host_supported_speeds(shost) = 0; - if (phba->lmt & LMT_64Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_64GBIT; - if (phba->lmt & LMT_32Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_32GBIT; - if (phba->lmt & LMT_16Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_16GBIT; - if (phba->lmt & LMT_10Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_10GBIT; - if (phba->lmt & LMT_8Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_8GBIT; - if (phba->lmt & LMT_4Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_4GBIT; - if (phba->lmt & LMT_2Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_2GBIT; - if (phba->lmt & LMT_1Gb) - fc_host_supported_speeds(shost) |= FC_PORTSPEED_1GBIT; + lpfc_host_supported_speeds_set(shost); fc_host_maxframe_size(shost) = (((uint32_t) vport->fc_sparam.cmn.bbRcvSizeMsb & 0x0F) << 8) | @@ -4758,6 +4766,8 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli) struct temp_event temp_event_data; struct lpfc_acqe_misconfigured_event *misconfigured; struct Scsi_Host *shost; + struct lpfc_vport **vports; + int rc, i; evt_type = bf_get(lpfc_trailer_type, acqe_sli); @@ -4883,6 +4893,25 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli) sprintf(message, "Unknown event status x%02x", status); break; } + + /* Issue READ_CONFIG mbox command to refresh supported speeds */ + rc = lpfc_sli4_read_config(phba); + if (rc == -EIO) { + phba->lmt = 0; + lpfc_printf_log(phba, KERN_ERR, LOG_SLI, + "3194 Unable to retrieve supported " + "speeds\n"); + } + vports = lpfc_create_vport_work_array(phba); + if (vports != NULL) { + for (i = 0; i <= phba->max_vports && vports[i] != NULL; 
+ i++) { + shost = lpfc_shost_from_vport(vports[i]); + lpfc_host_supported_s
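The refresh described in this patch boils down to recomputing one bitmask from another whenever lmt changes. A table-driven userspace model of that translation is sketched below. All `DEMO_` constants are placeholder bit values chosen for this example; the driver's real LMT_* and FC_PORTSPEED_* definitions live elsewhere and are not reproduced here.

```c
#include <stddef.h>

/* Placeholder link-speed capability bits (input side). */
#define DEMO_LMT_8Gb		(1u << 0)
#define DEMO_LMT_16Gb		(1u << 1)
#define DEMO_LMT_32Gb		(1u << 2)

/* Placeholder advertised-speed bits (output side). */
#define DEMO_SPEED_8GBIT	(1u << 4)
#define DEMO_SPEED_16GBIT	(1u << 5)
#define DEMO_SPEED_32GBIT	(1u << 6)

struct demo_speed_map {
	unsigned int lmt_bit;
	unsigned int speed_bit;
};

static const struct demo_speed_map demo_map[] = {
	{ DEMO_LMT_32Gb, DEMO_SPEED_32GBIT },
	{ DEMO_LMT_16Gb, DEMO_SPEED_16GBIT },
	{ DEMO_LMT_8Gb,  DEMO_SPEED_8GBIT },
};

/* Rebuild the advertised mask from scratch; lmt == 0 yields 0. */
static unsigned int demo_supported_speeds(unsigned int lmt)
{
	unsigned int speeds = 0;
	size_t i;

	for (i = 0; i < sizeof(demo_map) / sizeof(demo_map[0]); i++)
		if (lmt & demo_map[i].lmt_bit)
			speeds |= demo_map[i].speed_bit;
	return speeds;
}
```

Starting from zero on every call is what makes the SFP-removed case work: an lmt of 0 naturally produces an empty mask instead of leaving stale bits behind.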
[PATCH 03/12] lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login. An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets. Revised the
nlp_type check for NVME as well as SCSI.

While reviewing the LOGO handling a few other issues were seen and were
addressed:
- Better lock synchronization around ndlp data types
- When the ABTS times out, unregister the RPI before sending the LOGO
  so that all local exchange contexts are cleared and nothing received
  while awaiting LOGO/PLOGI handling will be accepted.
- LOGO handling optimized to:
    Wait only R_A_TOV for a response.
    It doesn't need to be retried on timeout. If there wasn't a
      response, a PLOGI will be sent, thus an implicit logout applies
      as well when the other port sees it. If there is a response, any
      kind of response is considered "good" and the XRI is quarantined
      for an exchange qualifier window.
- PLOGI is issued as soon as a LOGO state is resolved.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_els.c       | 49 --
 drivers/scsi/lpfc/lpfc_nportdisc.c |  5
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index f1c1faa74b46..8160a5ebad08 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp,
 		icmd->ulpCommand = CMD_ELS_REQUEST64_CR;
 		if (elscmd == ELS_CMD_FLOGI)
 			icmd->ulpTimeout = FF_DEF_RATOV * 2;
+		else if (elscmd == ELS_CMD_LOGO)
+			icmd->ulpTimeout = phba->fc_ratov;
 		else
 			icmd->ulpTimeout = phba->fc_ratov * 2;
 	} else {
@@ -2682,16 +2684,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		goto out;
 	}
 
+	/* The LOGO will not be retried on failure. A LOGO was
+	 * issued to the remote rport and a ACC or RJT or no Answer are
+	 * all acceptable. Note the failure and move forward with
+	 * discovery. The PLOGI will retry.
+*/ if (irsp->ulpStatus) { - /* Check for retry */ - if (lpfc_els_retry(phba, cmdiocb, rspiocb)) { - /* ELS command is being retried */ - skip_recovery = 1; - goto out; - } /* LOGO failed */ lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, -"2756 LOGO failure DID:%06X Status:x%x/x%x\n", +"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n", ndlp->nlp_DID, irsp->ulpStatus, irsp->un.ulpWord[4]); /* Do not call DSM for lpfc_els_abort'ed ELS cmds */ @@ -2737,7 +2738,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * For any other port type, the rpi is unregistered as an implicit * LOGO. */ - if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) { + if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) && + skip_recovery == 0) { lpfc_cancel_retry_delay_tmo(vport, ndlp); spin_lock_irqsave(shost->host_lock, flags); ndlp->nlp_flag |= NLP_NPR_2B_DISC; @@ -2770,6 +2772,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * will be stored into the context1 field of the IOCB for the completion * callback function to the LOGO ELS command. * + * Callers of this routine are expected to unregister the RPI first + * * Return code * 0 - successfully issued logo * 1 - failed to issue logo @@ -2811,22 +2815,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, "Issue LOGO: did:x%x", ndlp->nlp_DID, 0, 0); - /* -* If we are issuing a LOGO, we may try to recover the remote NPort -* by issuing a PLOGI later. Even though we issue ELS cmds by the -* VPI, if we have a valid RPI, and that RPI gets unreg'ed while -* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI -* for that ELS cmd. To avoid this situation, lets get rid of the -* RPI right now, before any ELS cmds are sent. 
-*/ - spin_lock_irq(shost->host_lock); - ndlp->nlp_flag |= NLP_ISSUE_LOGO; - spin_unlock_irq(shost->host_lock); - if (lpfc_unreg_rpi(vport, ndlp)) { - lpfc_els_free_iocb(phba, elsiocb); - return 0; - } - phba->fc_stat.elsXmitLOGO++; elsiocb->iocb_cmpl = lpfc_cmpl_els_logo; spin_lock_irq(shost->host_lock); @@ -2834,7 +2822,6 @@ lpfc_issue_els_logo(struct lpfc
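For readers skimming the archive, here is a minimal userspace sketch of the two decisions the LOGO patch changes: the shorter timeout for LOGO and the widened target-type test in the completion handler. Everything prefixed sketch_ or SKETCH_ is hypothetical and the bit values are illustrative only; the real definitions live in the lpfc headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ELS command codes used by the driver. */
enum sketch_els_cmd { SKETCH_ELS_FLOGI, SKETCH_ELS_LOGO, SKETCH_ELS_OTHER };

/* Illustrative bit values only, not the driver's NLP_* constants. */
#define SKETCH_FCP_TARGET	0x1u
#define SKETCH_NVME_TARGET	0x2u
#define SKETCH_INITIATOR	0x4u

/* LOGO waits a single R_A_TOV; FLOGI and all other ELS commands wait twice. */
static uint32_t sketch_els_timeout(enum sketch_els_cmd cmd, uint32_t ratov,
				   uint32_t def_ratov)
{
	if (cmd == SKETCH_ELS_FLOGI)
		return def_ratov * 2;
	if (cmd == SKETCH_ELS_LOGO)
		return ratov;
	return ratov * 2;
}

/* Discovery restarts for SCSI (FCP) and NVME targets unless recovery is skipped. */
static bool sketch_restart_discovery(uint32_t nlp_type, bool skip_recovery)
{
	return (nlp_type & (SKETCH_FCP_TARGET | SKETCH_NVME_TARGET)) != 0 &&
	       !skip_recovery;
}
```

With the old single-bit test, a node that only had the NVME target bit set would have fallen through to the implicit-logout path, which is the case the patch description calls out.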
[PATCH 06/12] lpfc: fcoe: Fix link down issue after 1000+ link bounces
On FCoE adapters, when running a link bounce test in a loop, the initiator
failed to log in with the switch and required a driver reload to recover.
The switch reached a point where all subsequent FLOGIs would be LS_RJT'd.
Further testing showed the condition to be related to not performing FCF
discovery between FLOGIs.

Fix by monitoring FLOGI failures and, once a repeated error is seen,
repeating FCF discovery.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_els.c     |  2 ++
 drivers/scsi/lpfc/lpfc_hbadisc.c | 20
 drivers/scsi/lpfc/lpfc_init.c    |  2 +-
 drivers/scsi/lpfc/lpfc_sli.c     | 11 ++-
 drivers/scsi/lpfc/lpfc_sli4.h    |  1 +
 5 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index e3e851931394..25625c03a5b3 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -1157,6 +1157,7 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 			phba->fcf.fcf_flag &= ~FCF_DISCOVERY;
 			phba->hba_flag &= ~(FCF_RR_INPROG | HBA_DEVLOSS_TMO);
 			spin_unlock_irq(&phba->hbalock);
+			phba->fcf.fcf_redisc_attempted = 0; /* reset */
 			goto out;
 		}
 		if (!rc) {
@@ -1171,6 +1172,7 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 			phba->fcf.fcf_flag &= ~FCF_DISCOVERY;
 			phba->hba_flag &= ~(FCF_RR_INPROG | HBA_DEVLOSS_TMO);
 			spin_unlock_irq(&phba->hbalock);
+			phba->fcf.fcf_redisc_attempted = 0; /* reset */
 			goto out;
 		}
 	}
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index f4deb862efc6..a26db7e1d821 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -1992,6 +1992,26 @@ int lpfc_sli4_fcf_rr_next_proc(struct lpfc_vport *vport, uint16_t fcf_index)
 				"failover and change port state:x%x/x%x\n",
 				phba->pport->port_state, LPFC_VPORT_UNKNOWN);
 		phba->pport->port_state = LPFC_VPORT_UNKNOWN;
+
+		if (!phba->fcf.fcf_redisc_attempted) {
+			lpfc_unregister_fcf(phba);
+
+			rc = lpfc_sli4_redisc_fcf_table(phba);
+			if (!rc) {
+				lpfc_printf_log(phba, KERN_INFO, LOG_FIP,
+						"3195 Rediscover FCF table\n");
+				phba->fcf.fcf_redisc_attempted = 1;
+				lpfc_sli4_clear_fcf_rr_bmask(phba);
+			} else {
+				lpfc_printf_log(phba, KERN_WARNING, LOG_FIP,
+						"3196 Rediscover FCF table "
+						"failed. Status:x%x\n", rc);
+			}
+		} else {
+			lpfc_printf_log(phba, KERN_WARNING, LOG_FIP,
+					"3197 Already rediscover FCF table "
+					"attempted. No more retry\n");
+		}
 		goto stop_flogi_current_fcf;
 	} else {
 		lpfc_printf_log(phba, KERN_INFO, LOG_FIP | LOG_ELS,
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 098b5ef3e9b8..d0a1fba1fc74 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -5069,7 +5069,7 @@ lpfc_sli4_async_fip_evt(struct lpfc_hba *phba,
 			break;
 		}
 		/* If fast FCF failover rescan event is pending, do nothing */
-		if (phba->fcf.fcf_flag & FCF_REDISC_EVT) {
+		if (phba->fcf.fcf_flag & (FCF_REDISC_EVT | FCF_REDISC_PEND)) {
 			spin_unlock_irq(&phba->hbalock);
 			break;
 		}
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 783a1540cfbe..ee56ab63c657 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -18712,15 +18712,8 @@ lpfc_sli4_fcf_rr_next_index_get(struct lpfc_hba *phba)
 			goto initial_priority;
 		lpfc_printf_log(phba, KERN_WARNING, LOG_FIP,
 				"2844 No roundrobin failover FCF available\n");
-		if (next_fcf_index >= LPFC_SLI4_FCF_TBL_INDX_MAX)
-			return LPFC_FCOE_FCF_NEXT_NONE;
-		else {
-			lpfc_printf_log(phba, KERN_WARNING, LOG_FIP,
-				"3063 Only FCF available idx %d, flag %x\n",
-				next_fcf_index,
-			phba->fcf.fcf_pri[next_fcf_index].fcf_rec.flag);
-
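The retry policy in the lpfc_hbadisc.c hunk can be read as a one-shot latch: rediscover the FCF table at most once per run of FLOGI failures, and re-arm only after a FLOGI succeeds. A minimal sketch of that policy follows, assuming nothing about the driver beyond what the hunk shows; all sketch_ names and the stub callbacks are hypothetical.

```c
#include <stdbool.h>

enum sketch_redisc { SKETCH_REDISC_STARTED, SKETCH_REDISC_FAILED,
		     SKETCH_REDISC_SKIPPED };

struct sketch_fcf {
	bool redisc_attempted;	/* mirrors the role of fcf_redisc_attempted */
};

/* Stub rediscovery callbacks standing in for the firmware request. */
static int sketch_rediscover_ok(void)   { return 0; }
static int sketch_rediscover_fail(void) { return -1; }

/* Called when round-robin failover has run out of FCFs to try. */
static enum sketch_redisc sketch_no_fcf_left(struct sketch_fcf *fcf,
					     int (*rediscover)(void))
{
	if (fcf->redisc_attempted)
		return SKETCH_REDISC_SKIPPED;	/* already tried once, give up */
	if (rediscover() != 0)
		return SKETCH_REDISC_FAILED;	/* latch stays clear, as in the patch */
	fcf->redisc_attempted = true;
	return SKETCH_REDISC_STARTED;
}

/* A successful FLOGI re-arms the latch. */
static void sketch_flogi_succeeded(struct sketch_fcf *fcf)
{
	fcf->redisc_attempted = false;
}
```

Note that, as in the hunk, the latch is only set when the rediscovery request itself succeeds, so a failed request does not consume the single attempt.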
[PATCH 11/12] lpfc: add Trunking support
Add trunking support to the driver. Trunking is found on more recent
ASICs. In general, trunking appears as a single "port" to the driver
and overall behavior doesn't differ. Link speed is reported as an
aggregate value, while link speed control is done on a per-physical
link basis with all links in the trunk symmetrical. Some commands
returning port information are updated to additionally provide
trunking information. New ACQEs are also generated to report physical
link events relative to the trunk.

This patch contains the following modifications:
- Added link speed settings of 128GB and 256GB.
- Added handling of trunk-related ACQEs, mainly logging and trapping
  of physical link statuses.
- Added an additional bsg interface for applications to query trunk
  state.
- Augmented the link_state sysfs attribute to display trunk link status.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc.h         |  13
 drivers/scsi/lpfc/lpfc_attr.c    | 101 ++
 drivers/scsi/lpfc/lpfc_bsg.c     |  74
 drivers/scsi/lpfc/lpfc_bsg.h     |  38 ++
 drivers/scsi/lpfc/lpfc_ct.c      |   5 ++
 drivers/scsi/lpfc/lpfc_els.c     |   2 +
 drivers/scsi/lpfc/lpfc_hbadisc.c |   1 +
 drivers/scsi/lpfc/lpfc_hw4.h     |  64 +
 drivers/scsi/lpfc/lpfc_init.c    | 148 +++
 drivers/scsi/lpfc/lpfc_scsi.h    |   4 ++
 drivers/scsi/lpfc/lpfc_sli.c     |  11 +++
 drivers/scsi/lpfc/lpfc_sli4.h    |  13
 12 files changed, 474 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 979366fc34d4..de85f816ce07 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -335,6 +335,18 @@ enum hba_state {
 	LPFC_HBA_ERROR = -1
 };
 
+struct lpfc_trunk_link_state {
+	enum hba_state state;
+	uint8_t fault;
+};
+
+struct lpfc_trunk_link {
+	struct lpfc_trunk_link_state link0,
+				     link1,
+				     link2,
+				     link3;
+};
+
 struct lpfc_vport {
 	struct lpfc_hba *phba;
 	struct list_head listentry;
@@ -684,6 +696,7 @@ struct lpfc_hba {
 	uint32_t iocb_cmd_size;
 	uint32_t iocb_rsp_size;
 
+	struct lpfc_trunk_link trunk_link;
 	enum hba_state link_state;
 	uint32_t link_flag;	/* link state flags */
 #define LS_LOOPBACK_MODE 0x1	/* NPort is in Loopback mode */
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 159ede7032dc..9528c4932f55 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -883,6 +883,42 @@ lpfc_link_state_show(struct device *dev, struct device_attribute *attr,
 		}
 	}
 
+	if ((phba->sli_rev == LPFC_SLI_REV4) &&
+	    ((bf_get(lpfc_sli_intf_if_type,
+	     &phba->sli4_hba.sli_intf) ==
+	     LPFC_SLI_INTF_IF_TYPE_6))) {
+		struct lpfc_trunk_link link = phba->trunk_link;
+
+		if (bf_get(lpfc_conf_trunk_port0, &phba->sli4_hba))
+			len += snprintf(buf + len, PAGE_SIZE - len,
+				"Trunk port 0: Link %s %s\n",
+				(link.link0.state == LPFC_LINK_UP) ?
+				 "Up" : "Down. ",
+				trunk_errmsg[link.link0.fault]);
+
+		if (bf_get(lpfc_conf_trunk_port1, &phba->sli4_hba))
+			len += snprintf(buf + len, PAGE_SIZE - len,
+				"Trunk port 1: Link %s %s\n",
+				(link.link1.state == LPFC_LINK_UP) ?
+				 "Up" : "Down. ",
+				trunk_errmsg[link.link1.fault]);
+
+		if (bf_get(lpfc_conf_trunk_port2, &phba->sli4_hba))
+			len += snprintf(buf + len, PAGE_SIZE - len,
+				"Trunk port 2: Link %s %s\n",
+				(link.link2.state == LPFC_LINK_UP) ?
+				 "Up" : "Down. ",
+				trunk_errmsg[link.link2.fault]);
+
+		if (bf_get(lpfc_conf_trunk_port3, &phba->sli4_hba))
+			len += snprintf(buf + len, PAGE_SIZE - len,
+				"Trunk port 3: Link %s %s\n",
+				(link.link3.state == LPFC_LINK_UP) ?
+				 "Up" : "Down. ",
+				trunk_errmsg[link.link3.fault]);
+
+	}
+
 	return len;
 }
 
@@ -1430,6 +1466,66 @@ lpfc_nport_evt_cnt_show(struct device *dev, struct device_attribute *attr,
 	return snprintf(buf, PAGE_SIZE, "%d\n", phba->nport_event_cnt);
 }
 
+int
+lpfc_set_trunking(struct lpfc_hba *phba, char *buff_out)
+{
+	LPFC_MBOXQ_t *mbox = NULL;
+	unsigned long val = 0;
+	char *pval = 0;
+	int rc = 0;
+
+	if (!strncmp("e
[PATCH 02/12] lpfc: Fix lpfc_sli4_read_config return value check
An error is an error - but not to the existing return value check.
Revise check to handle any failure, not just EIO.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index c78ae81b5701..098b5ef3e9b8 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4896,11 +4896,11 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli)
 	/* Issue READ_CONFIG mbox command to refresh supported speeds */
 	rc = lpfc_sli4_read_config(phba);
-	if (rc == -EIO) {
+	if (rc) {
 		phba->lmt = 0;
 		lpfc_printf_log(phba, KERN_ERR, LOG_SLI,
 				"3194 Unable to retrieve supported "
-				"speeds\n");
+				"speeds, rc = 0x%x\n", rc);
 	}
 	vports = lpfc_create_vport_work_array(phba);
 	if (vports != NULL) {
-- 
2.13.1
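To make the difference concrete, here is a small userspace sketch comparing the old and new checks. It assumes only the usual kernel convention that a function returns 0 on success and a negative errno on failure; the sketch_ helpers are hypothetical and not part of lpfc.

```c
#include <errno.h>
#include <stdbool.h>

/* Old check: only -EIO was treated as a failure. */
static bool sketch_failed_old(int rc)
{
	return rc == -EIO;
}

/* New check: any non-zero return is treated as a failure. */
static bool sketch_failed_new(int rc)
{
	return rc != 0;
}
```

A return such as -ENOMEM slips past the old check, so the supported-speed mask would not have been cleared and nothing would have been logged.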
[PATCH 05/12] lpfc: Correct errors accessing fw log
This patch corrects two issues:
- An oops would occur if reading based on a non-zero offset. Offset
  calculation was incorrect.
- Updates to ras config (logging level) were ignored if change was made
  while fw logging was enabled. Revise to dynamically update.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
 drivers/scsi/lpfc/lpfc_bsg.c | 64 +---
 1 file changed, 25 insertions(+), 39 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_bsg.c b/drivers/scsi/lpfc/lpfc_bsg.c
index 7bd7ae86bed5..eb2e8c941b78 100644
--- a/drivers/scsi/lpfc/lpfc_bsg.c
+++ b/drivers/scsi/lpfc/lpfc_bsg.c
@@ -5416,7 +5416,7 @@ lpfc_bsg_set_ras_config(struct bsg_job *job)
 	struct lpfc_ras_fwlog *ras_fwlog = &phba->ras_fwlog;
 	struct fc_bsg_reply *bsg_reply = job->reply;
 	uint8_t action = 0, log_level = 0;
-	int rc = 0;
+	int rc = 0, action_status = 0;
 
 	if (job->request_len <
 	    sizeof(struct fc_bsg_request) +
@@ -5449,16 +5449,25 @@ lpfc_bsg_set_ras_config(struct bsg_job *job)
 		lpfc_ras_stop_fwlog(phba);
 	} else {
 		/*action = LPFC_RASACTION_START_LOGGING*/
-		if (ras_fwlog->ras_active == true) {
-			rc = -EINPROGRESS;
-			goto ras_job_error;
-		}
+
+		/* Even though FW-logging is active re-initialize
+		 * FW-logging with new log-level. Return status
+		 * "Logging already Running" to caller.
+		 **/
+		if (ras_fwlog->ras_active)
+			action_status = -EINPROGRESS;
 
 		/* Enable logging */
 		rc = lpfc_sli4_ras_fwlog_init(phba, log_level,
 					      LPFC_RAS_ENABLE_LOGGING);
-		if (rc)
+		if (rc) {
 			rc = -EINVAL;
+			goto ras_job_error;
+		}
+
+		/* Check if FW-logging is re-initialized */
+		if (action_status == -EINPROGRESS)
+			rc = action_status;
 	}
 ras_job_error:
 	/* make error code available to userspace */
@@ -5487,8 +5496,7 @@ lpfc_bsg_get_ras_lwpd(struct bsg_job *job)
 	struct lpfc_hba *phba = vport->phba;
 	struct lpfc_ras_fwlog *ras_fwlog = &phba->ras_fwlog;
 	struct fc_bsg_reply *bsg_reply = job->reply;
-	uint32_t lwpd_offset = 0;
-	uint64_t wrap_value = 0;
+	u32 *lwpd_ptr = NULL;
 	int rc = 0;
 
 	rc = lpfc_check_fwlog_support(phba);
@@ -5508,11 +5516,12 @@ lpfc_bsg_get_ras_lwpd(struct bsg_job *job)
 	ras_reply = (struct lpfc_bsg_get_ras_lwpd *)
 		bsg_reply->reply_data.vendor_reply.vendor_rsp;
 
-	lwpd_offset = *((uint32_t *)ras_fwlog->lwpd.virt) & 0x;
-	ras_reply->offset = be32_to_cpu(lwpd_offset);
+	/* Get lwpd offset */
+	lwpd_ptr = (uint32_t *)(ras_fwlog->lwpd.virt);
+	ras_reply->offset = be32_to_cpu(*lwpd_ptr & 0x);
 
-	wrap_value = *((uint64_t *)ras_fwlog->lwpd.virt);
-	ras_reply->wrap_count = be32_to_cpu((wrap_value >> 32) & 0x);
+	/* Get wrap count */
+	ras_reply->wrap_count = be32_to_cpu(*(++lwpd_ptr) & 0x);
 
 ras_job_error:
 	/* make error code available to userspace */
@@ -5539,9 +5548,8 @@ lpfc_bsg_get_ras_fwlog(struct bsg_job *job)
 	struct fc_bsg_request *bsg_request = job->request;
 	struct fc_bsg_reply *bsg_reply = job->reply;
 	struct lpfc_bsg_get_fwlog_req *ras_req;
-	uint32_t rd_offset, rd_index, offset, pending_wlen;
-	uint32_t boundary = 0, align_len = 0, write_len = 0;
-	void *dest, *src, *fwlog_buff;
+	u32 rd_offset, rd_index, offset;
+	void *src, *fwlog_buff;
 	struct lpfc_ras_fwlog *ras_fwlog = NULL;
 	struct lpfc_dmabuf *dmabuf, *next;
 	int rc = 0;
@@ -5581,8 +5589,6 @@ lpfc_bsg_get_ras_fwlog(struct bsg_job *job)
 	rd_index = (rd_offset / LPFC_RAS_MAX_ENTRY_SIZE);
 	offset = (rd_offset % LPFC_RAS_MAX_ENTRY_SIZE);
-	pending_wlen = ras_req->read_size;
-	dest = fwlog_buff;
 
 	list_for_each_entry_safe(dmabuf, next,
 				 &ras_fwlog->fwlog_buff_list, list) {
@@ -5590,29 +5596,9 @@ lpfc_bsg_get_ras_fwlog(struct bsg_job *job)
 		if (dmabuf->buffer_tag < rd_index)
 			continue;
 
-		/* Align read to buffer size */
-		if (offset) {
-			boundary = ((dmabuf->buffer_tag + 1) *
-				    LPFC_RAS_MAX_ENTRY_SIZE);
-
-			align_len = (boundary - offset);
-			write_len = min_t(u32, align_len,
-					  LPFC_RAS_MAX_ENTRY_SIZE);
-		} else {
-			write_len = min_t(u32, pending_wlen,
-					  LPFC_RAS_MAX_ENTR
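The lwpd hunk reads two consecutive 32-bit big-endian words: the first is the log write offset and the second is the wrap count. A minimal portable sketch of that read, assuming only the layout implied by the hunk (word 0 = offset, word 1 = wrap count); the sketch_ names are hypothetical and the byte-wise load replaces the kernel's be32_to_cpu():

```c
#include <stdint.h>

struct sketch_lwpd {
	uint32_t offset;	/* first big-endian word */
	uint32_t wrap_count;	/* second big-endian word */
};

/* Assemble a big-endian 32-bit value byte by byte, independent of host order. */
static uint32_t sketch_load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read both fields from an 8-byte lwpd buffer. */
static struct sketch_lwpd sketch_read_lwpd(const unsigned char *lwpd)
{
	struct sketch_lwpd v;

	v.offset = sketch_load_be32(lwpd);
	v.wrap_count = sketch_load_be32(lwpd + 4);
	return v;
}
```

Reading the two words separately avoids the 64-bit load and shift in the removed code, whose result depended on which half of the 64-bit value held the second word on a given host.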
Re: [PATCH] bsg: convert to use blk-mq
On 10/23/18 11:40 AM, Benjamin Block wrote: > On Mon, Oct 22, 2018 at 06:38:36AM -0600, Jens Axboe wrote: >> On 10/22/18 4:03 AM, Benjamin Block wrote: >>> On Fri, Oct 19, 2018 at 09:50:53AM -0600, Jens Axboe wrote: >>> >>> Ok so, that gets past the stage where we initialize the queues. Simple >>> SCSI-I/O also seems to work, that is for example an INQUIRY(10), but >>> transport commands that get passed to the driver break. Tried to send >>> a FibreChannel GPN_FT (remote port discovery). >>> >>> As the BSG interface goes. This is a bidirectional command, that has >>> both a buffer for the request and for the reply. AFAIR BSG will create a >>> struct request for each of them. Protocol is BSG_PROTOCOL_SCSI, >>> Subprotocol BSG_SUB_PROTOCOL_SCSI_TRANSPORT. The rest should be >>> transparent till we get into the driver. >>> >>> First got this: >>> >>> [ 566.531100] BUG: sleeping function called from invalid context at >>> mm/slab.h:421 >>> [ 566.531452] in_atomic(): 1, irqs_disabled(): 0, pid: 3104, name: >>> bsg_api_test >>> [ 566.531460] 1 lock held by bsg_api_test/3104: >>> [ 566.531466] #0: cb4b58e8 (rcu_read_lock){}, at: >>> hctx_lock+0x30/0x118 >>> [ 566.531498] Preemption disabled at: >>> [ 566.531503] [<008175d0>] __blk_mq_delay_run_hw_queue+0x50/0x218 >>> [ 566.531519] CPU: 3 PID: 3104 Comm: bsg_api_test Tainted: GW >>>4.19.0-rc6-bb-next+ #1 >>> [ 566.531527] Hardware name: IBM 3906 M03 704 (LPAR) >>> [ 566.531533] Call Trace: >>> [ 566.531544] ([<001167fa>] show_stack+0x8a/0xd8) >>> [ 566.531555] [<00bcc6d2>] dump_stack+0x9a/0xd8 >>> [ 566.531565] [<00196410>] ___might_sleep+0x280/0x298 >>> [ 566.531576] [<003e528c>] __kmalloc+0xbc/0x560 >>> [ 566.531584] [<0083186a>] bsg_map_buffer+0x5a/0xb0 >>> [ 566.531591] [<00831948>] bsg_queue_rq+0x88/0x118 >>> [ 566.531599] [<0081ab56>] blk_mq_dispatch_rq_list+0x37e/0x670 >>> [ 566.531607] [<0082050e>] blk_mq_do_dispatch_sched+0x11e/0x130 >>> [ 566.531615] [<00820dfe>] >>> blk_mq_sched_dispatch_requests+0x156/0x1a0 >>> 
[ 566.531622] [<00817564>] __blk_mq_run_hw_queue+0x144/0x160 >>> [ 566.531630] [<00817614>] __blk_mq_delay_run_hw_queue+0x94/0x218 >>> [ 566.531638] [<008178b2>] blk_mq_run_hw_queue+0xda/0xf0 >>> [ 566.531645] [<008211d8>] blk_mq_sched_insert_request+0x1a8/0x1e8 >>> [ 566.531653] [<00811ee2>] blk_execute_rq_nowait+0x72/0x80 >>> [ 566.531660] [<00811f66>] blk_execute_rq+0x76/0xb8 >>> [ 566.531778] [<00830d0e>] bsg_ioctl+0x426/0x500 >>> [ 566.531787] [<00440cb4>] do_vfs_ioctl+0x68c/0x710 >>> [ 566.531794] [<00440dac>] ksys_ioctl+0x74/0xa0 >>> [ 566.531801] [<00440e0a>] sys_ioctl+0x32/0x40 >>> [ 566.531808] [<00bf1dd0>] system_call+0xd8/0x2d0 >>> [ 566.531815] 1 lock held by bsg_api_test/3104: >>> [ 566.531821] #0: cb4b58e8 (rcu_read_lock){}, at: >>> hctx_lock+0x30/0x118 >>> >> >> The first one is an easy fix, not sure how I missed that. The other >> one I have no idea, any chance you could try with this one: >> >> http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=142dc9f36e3113b6a76d472978c33c8c2a2b702c >> >> which fixes the first one, and also corrects a wrong end_io call, >> but I don't think that's the cause of the above. >> >> If it crashes, can you figure out where in the source that is? >> Basically just do >> >> gdb vmlinux >> l *zfcp_fc_exec_bsg_job+0x116 >> >> assuming that works fine on s390 :-) >> > > So I tried 4.19.0 with only the two patches from you: > http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=2b2ffa16193e9a69a076595ed64429b8cc9b42aa > http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=142dc9f36e3113b6a76d472978c33c8c2a2b702c > > This fixed the first warning from before, as you suggested, but it still > crash like this: > > [ ] Unable to handle kernel pointer dereference in virtual kernel address > space > [ ] Failing address: TEID: 0483 > [ ] Fault in home space mode while using kernel ASCE. 
> [ ] AS:025f0007 R3:dffb8007 S:dffbf000 > P:013d > [ ] Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC > [ ] Modules linked in: > [ ] CPU: 2 PID: 609 Comm: kworker/2:1H Kdump: loaded Tainted: GW >4.19.0-bb-next+ #1 > [ ] Hardware name: IBM 3906 M03 704 (LPAR) > [ ] Workqueue: kblockd blk_mq_run_work_fn > [ ] Krnl PSW : 0704e0018000 03ff806a6b40 > (zfcp_fc_exec_bsg_job+0x1c0/0x440 [zfcp]) > [ ]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 > [ ] Krnl GPRS: 83e0f3c0 > 0300 > [ ]0300 03ff806a6b3a a86b5948 > a86b5988 > [ ]83e0f3f0 a86b5938 > 984aee80 >
[PATCH RESEND] Timeouts occur on QLogic adapter surprise removal
When doing a surprise removal of an adapter, some in-flight I/Os can get
stuck and take a while to complete (they actually time out and are
retried). We are not handling an early error exit from qla2xxx_eh_abort
properly.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
---

Note #1: Reworked the ACKed patch to apply cleanly to 4.20/scsi-queue. The
main explanation has not been reworked.

Note #2: I also see the following outstanding patch, which removes a
variable used in this patch:

  [PATCH 5/7] qla2xxx: Remove a set-but-not-used variable

Obviously, that patch is no longer needed as I am now using that
variable...

Patch explanation:

After a hot remove of a QLogic adapter, the driver's remove function gets
called and we end up aborting all in-progress I/Os. Here is the code
flow:

qla2x00_remove_one
  qla2x00_abort_isp_cleanup
    qla2x00_abort_all_cmds
      __qla2x00_abort_all_cmds
        qla2xxx_eh_abort

At the start of qla2xxx_eh_abort, some sanity checks are done before
actually sending the abort. One of these checks is a call to
fc_block_scsi_eh. In the case of a hot remove, it turns out that this
routine can exit with FAST_IO_FAIL.

When this occurs, we return to __qla2x00_abort_all_cmds with an extra
reference on sp (because the abort never gets sent). Originally, this
was addressed with another fix:

commit 4cd3b6ebff85 scsi: qla2xxx: Fix extraneous ref on sp's after adapter break

But this later change complicated matters:

commit 45235022da99 scsi: qla2xxx: Fix driver unload by shutting down chip

Because the abort is now being done earlier in the teardown (through
qla2x00_abort_isp_cleanup), in qla2xxx_eh_abort we make it past
the first check because qla2x00_isp_reg_stat(ha) returns zero. When we
fail a few lines later in fc_block_scsi_eh, this error is not handled
properly in __qla2x00_abort_all_cmds and the I/O ends up hanging and
timing out because of the extra reference.

For this fix, a check for FAST_IO_FAIL is added to
__qla2x00_abort_all_cmds where we check to see if qla2xxx_eh_abort
succeeded or not. This removes the extra reference in this additional
early exit case.

In my testing (hw surprise removals and also adapter remove via sysfs),
this eliminates the timeouts and delays and the remove proceeds
smoothly.

 drivers/scsi/qla2xxx/qla_os.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index dba672f..af57f05 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1804,6 +1804,12 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
 					spin_lock_irqsave
 						(qp->qp_lock_ptr, flags);
 				}
+				/*
+				 * Get rid of extra reference caused by early
+				 * exit from qla2xxx_eh_abort
+				 */
+				if (status == FAST_IO_FAIL)
+					atomic_dec(&sp->ref_count);
 			}
 			sp->done(sp, res);
 			break;
-- 
1.8.3.1
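The reference imbalance described in the explanation can be modelled in a few lines. This is a deliberately simplified sketch, not the driver's code: it assumes the caller takes one extra reference before invoking the abort handler and that the handler drops it only when it gets past its early checks. All sketch_ names are hypothetical, and a plain int stands in for the driver's atomic reference count.

```c
#include <stdbool.h>

enum sketch_status { SKETCH_SUCCESS, SKETCH_FAST_IO_FAIL };

struct sketch_srb {
	int ref_count;	/* the driver uses an atomic counter here */
};

/* Abort handler model: an early exit leaves the caller's extra reference in place. */
static enum sketch_status sketch_eh_abort(struct sketch_srb *sp, bool port_blocked)
{
	if (port_blocked)
		return SKETCH_FAST_IO_FAIL;	/* abort never sent */
	sp->ref_count--;			/* abort path consumes the extra reference */
	return SKETCH_SUCCESS;
}

/* Caller model: take the extra reference, then undo it on the early-exit path. */
static void sketch_abort_one(struct sketch_srb *sp, bool port_blocked, bool with_fix)
{
	enum sketch_status status;

	sp->ref_count++;
	status = sketch_eh_abort(sp, port_blocked);
	if (with_fix && status == SKETCH_FAST_IO_FAIL)
		sp->ref_count--;
}
```

Without the extra decrement, the count never returns to its starting value on the FAST_IO_FAIL path, which in this model corresponds to the command that never completes and eventually times out.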
RE: [PATCH RESEND] scsi: qla2xxx: I/Os timing out on surprise removal of
This is still a bug in 4.20/scsi-queue.

I am sending a new patch that applies cleanly to 4.20/scsi-queue as it
stands now. I have tested it successfully.

Regards
-Bill

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Friday, October 19, 2018 6:24 PM
To: Kuzeja, William
Cc: linux-scsi@vger.kernel.org; qla2xxx-upstr...@qlogic.com
Subject: Re: [PATCH RESEND] scsi: qla2xxx: I/Os timing out on surprise removal of

Bill,

> When doing a surprise removal of an adapter, some in flight I/Os can
> get stuck and take a while to complete (they actually timeout and are
> retried). We are not handling an early error exit from
> qla2xxx_eh_abort properly.

This doesn't apply to 4.20/scsi-queue and the surrounding code has
changed significantly.

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] bsg: convert to use blk-mq
On Mon, Oct 22, 2018 at 06:38:36AM -0600, Jens Axboe wrote: > On 10/22/18 4:03 AM, Benjamin Block wrote: > > On Fri, Oct 19, 2018 at 09:50:53AM -0600, Jens Axboe wrote: > > > > Ok so, that gets past the stage where we initialize the queues. Simple > > SCSI-I/O also seems to work, that is for example an INQUIRY(10), but > > transport commands that get passed to the driver break. Tried to send > > a FibreChannel GPN_FT (remote port discovery). > > > > As the BSG interface goes. This is a bidirectional command, that has > > both a buffer for the request and for the reply. AFAIR BSG will create a > > struct request for each of them. Protocol is BSG_PROTOCOL_SCSI, > > Subprotocol BSG_SUB_PROTOCOL_SCSI_TRANSPORT. The rest should be > > transparent till we get into the driver. > > > > First got this: > > > > [ 566.531100] BUG: sleeping function called from invalid context at > > mm/slab.h:421 > > [ 566.531452] in_atomic(): 1, irqs_disabled(): 0, pid: 3104, name: > > bsg_api_test > > [ 566.531460] 1 lock held by bsg_api_test/3104: > > [ 566.531466] #0: cb4b58e8 (rcu_read_lock){}, at: > > hctx_lock+0x30/0x118 > > [ 566.531498] Preemption disabled at: > > [ 566.531503] [<008175d0>] __blk_mq_delay_run_hw_queue+0x50/0x218 > > [ 566.531519] CPU: 3 PID: 3104 Comm: bsg_api_test Tainted: GW > >4.19.0-rc6-bb-next+ #1 > > [ 566.531527] Hardware name: IBM 3906 M03 704 (LPAR) > > [ 566.531533] Call Trace: > > [ 566.531544] ([<001167fa>] show_stack+0x8a/0xd8) > > [ 566.531555] [<00bcc6d2>] dump_stack+0x9a/0xd8 > > [ 566.531565] [<00196410>] ___might_sleep+0x280/0x298 > > [ 566.531576] [<003e528c>] __kmalloc+0xbc/0x560 > > [ 566.531584] [<0083186a>] bsg_map_buffer+0x5a/0xb0 > > [ 566.531591] [<00831948>] bsg_queue_rq+0x88/0x118 > > [ 566.531599] [<0081ab56>] blk_mq_dispatch_rq_list+0x37e/0x670 > > [ 566.531607] [<0082050e>] blk_mq_do_dispatch_sched+0x11e/0x130 > > [ 566.531615] [<00820dfe>] > > blk_mq_sched_dispatch_requests+0x156/0x1a0 > > [ 566.531622] [<00817564>] 
__blk_mq_run_hw_queue+0x144/0x160 > > [ 566.531630] [<00817614>] __blk_mq_delay_run_hw_queue+0x94/0x218 > > [ 566.531638] [<008178b2>] blk_mq_run_hw_queue+0xda/0xf0 > > [ 566.531645] [<008211d8>] blk_mq_sched_insert_request+0x1a8/0x1e8 > > [ 566.531653] [<00811ee2>] blk_execute_rq_nowait+0x72/0x80 > > [ 566.531660] [<00811f66>] blk_execute_rq+0x76/0xb8 > > [ 566.531778] [<00830d0e>] bsg_ioctl+0x426/0x500 > > [ 566.531787] [<00440cb4>] do_vfs_ioctl+0x68c/0x710 > > [ 566.531794] [<00440dac>] ksys_ioctl+0x74/0xa0 > > [ 566.531801] [<00440e0a>] sys_ioctl+0x32/0x40 > > [ 566.531808] [<00bf1dd0>] system_call+0xd8/0x2d0 > > [ 566.531815] 1 lock held by bsg_api_test/3104: > > [ 566.531821] #0: cb4b58e8 (rcu_read_lock){}, at: > > hctx_lock+0x30/0x118 > > > > The first one is an easy fix, not sure how I missed that. The other > one I have no idea, any chance you could try with this one: > > http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=142dc9f36e3113b6a76d472978c33c8c2a2b702c > > which fixes the first one, and also corrects a wrong end_io call, > but I don't think that's the cause of the above. > > If it crashes, can you figure out where in the source that is? > Basically just do > > gdb vmlinux > l *zfcp_fc_exec_bsg_job+0x116 > > assuming that works fine on s390 :-) > So I tried 4.19.0 with only the two patches from you: http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=2b2ffa16193e9a69a076595ed64429b8cc9b42aa http://git.kernel.dk/cgit/linux-block/commit/?h=mq-conversions&id=142dc9f36e3113b6a76d472978c33c8c2a2b702c This fixed the first warning from before, as you suggested, but it still crash like this: [ ] Unable to handle kernel pointer dereference in virtual kernel address space [ ] Failing address: TEID: 0483 [ ] Fault in home space mode while using kernel ASCE. 
[ ] AS:025f0007 R3:dffb8007 S:dffbf000 P:013d [ ] Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ ] Modules linked in: [ ] CPU: 2 PID: 609 Comm: kworker/2:1H Kdump: loaded Tainted: GW 4.19.0-bb-next+ #1 [ ] Hardware name: IBM 3906 M03 704 (LPAR) [ ] Workqueue: kblockd blk_mq_run_work_fn [ ] Krnl PSW : 0704e0018000 03ff806a6b40 (zfcp_fc_exec_bsg_job+0x1c0/0x440 [zfcp]) [ ]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 [ ] Krnl GPRS: 83e0f3c0 0300 [ ]0300 03ff806a6b3a a86b5948 a86b5988 [ ]83e0f3f0 a86b5938 984aee80 [ ]a86b5800 03ff806ba950 03ff806a6b3a 98a5ed88 [ ] Krnl Code: 03ff806a6b
Re: [PATCH v4 00/11] Zoned block device support improvements
On 10/12/18 4:08 AM, Damien Le Moal wrote:
> This series improves zoned block device support (reduce overhead) and
> introduces many simplifications to the code (overall, there are more deletions
> than insertions).
>
> In more details:
> * Patches 1 to 3 are SCSI side (sd driver) cleanups and improvements reducing
>   the overhead of report zones command execution during disk scan and
>   revalidation.
> * Patches 4 to 9 improve the usability and user API of zoned block devices.
> * Patch 10 is the main part of this series. This patch replaces the
>   REQ_OP_ZONE_REPORT BIO/request operation for executing report zones commands
>   with a block device file operation, removing the need for the command reply
>   payload in-place rewriting in the BIO buffer. This leads to major
>   simplification of the code in many places.
> * Patch 11 further simplifies the code of low level drivers by providing a
>   generic implementation of zoned block device request queue zone bitmaps
>   initialization and revalidation.

I've applied this, but I have two complaints:

1) Two had to be hand applied, it wasn't against the block tree.

2) The ordering of the signed-off-by. Someone told me that this is
   patchwork, but I absolutely hate it. SOB should go last, not before
   the reviewed-by. I fixed that up too.

-- 
Jens Axboe
[v6 4/4] mpt3sas: Bump driver version to 27.100.00.00.
Modify driver version to 27.100.00.00
(which is equivalent to PH8 OOB driver)

Signed-off-by: Suganath Prabu
---
 drivers/scsi/mpt3sas/mpt3sas_base.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index c860ed2..7fdaf29 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -74,8 +74,8 @@
 #define MPT3SAS_DRIVER_NAME		"mpt3sas"
 #define MPT3SAS_AUTHOR "Avago Technologies "
 #define MPT3SAS_DESCRIPTION	"LSI MPT Fusion SAS 3.0 Device Driver"
-#define MPT3SAS_DRIVER_VERSION		"26.100.00.00"
-#define MPT3SAS_MAJOR_VERSION		26
+#define MPT3SAS_DRIVER_VERSION		"27.100.00.00"
+#define MPT3SAS_MAJOR_VERSION		27
 #define MPT3SAS_MINOR_VERSION		100
 #define MPT3SAS_BUILD_VERSION		0
 #define MPT3SAS_RELEASE_VERSION	00
-- 
1.8.3.1
[v6 0/4] mpt3sas: Hot-Plug Surprise removal support on IOC.
V6 Change set:
Incorporated changes as suggested by Andy. In patch 1, converted the
while loop to a do-while loop in mpt3sas_wait_for_ioc_to_operational(),
and in patch 3 removed parentheses.

V5 Change set:
The V5 post has only defect fixes. We are reworking and incorporating
the suggestions from Bjorn. After covering tests, we'll post the
Hot-Plug Surprise removal patches.

V4 Change set:
Reframe split strings in print statement, to avoid

V3 Change set:
Simplified function "mpt3sas_base_pci_device_is_available" and made it
inline.

V2 Change set:
Replaced mpt3sas_base_pci_device_is_unplugged with
pci_device_is_present.

V1 Change set:
In patch 0001, unlock mutex if active reset is in progress.

Suganath Prabu (4):
  mpt3sas: Separate out mpt3sas_wait_for_ioc_to_operational
  mpt3sas: Fix Sync cache command failure during driver unload
  mpt3sas: Fix driver modifying persistent data.
  mpt3sas: Bump driver version to 27.100.00.00.

 drivers/scsi/mpt3sas/mpt3sas_base.c      | 75 ++--
 drivers/scsi/mpt3sas/mpt3sas_base.h      |  8 +++-
 drivers/scsi/mpt3sas/mpt3sas_config.c    | 28 +++-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c       | 21 ++---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c     | 38 +++-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 70 ++---
 6 files changed, 106 insertions(+), 134 deletions(-)

-- 
1.8.3.1
[v6 1/4] mpt3sas: Separate out mpt3sas_wait_for_ioc_to_operational
No functional changes. The "wait for IOC to be operational" code is
used in many places across the driver, so move it into a function,
mpt3sas_wait_for_ioc_to_operational().

Signed-off-by: Suganath Prabu
---
 drivers/scsi/mpt3sas/mpt3sas_base.c      | 73 ++--
 drivers/scsi/mpt3sas/mpt3sas_base.h      |  4 ++
 drivers/scsi/mpt3sas/mpt3sas_config.c    | 24 +++
 drivers/scsi/mpt3sas/mpt3sas_ctl.c       | 21 ++---
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 63 ---
 5 files changed, 62 insertions(+), 123 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 166b607..243bf32 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -5079,6 +5079,41 @@ _base_send_ioc_reset(struct MPT3SAS_ADAPTER *ioc, u8 reset_type, int timeout)
 }
 
 /**
+ * mpt3sas_wait_for_ioc_to_operational - IOC's operational
+ *		state and HBA hot unplug status are checked here.
+ * @ioc: per adapter object
+ * @wait_count: timeout in seconds
+ *
+ * Return: Waits up to timeout seconds for the IOC to
+ * become operational. Returns 0 if IOC is present
+ * and operational; otherwise returns -EFAULT.
+ */
+
+int
+mpt3sas_wait_for_ioc_to_operational(struct MPT3SAS_ADAPTER *ioc,
+	int timeout)
+{
+	int wait_state_count = 0;
+	u32 ioc_state;
+
+	do {
+		ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
+		if (ioc_state == MPI2_IOC_STATE_OPERATIONAL)
+			break;
+		ssleep(1);
+		ioc_info(ioc, "%s: waiting for operational state(count=%d)\n",
+				__func__, ++wait_state_count);
+	} while (--timeout);
+	if (!timeout) {
+		ioc_err(ioc, "%s: failed due to ioc not operational\n", __func__);
+		return -EFAULT;
+	}
+	if (wait_state_count)
+		ioc_info(ioc, "ioc is operational\n");
+	return 0;
+}
+
+/**
  * _base_handshake_req_reply_wait - send request thru doorbell interface
  * @ioc: per adapter object
  * @request_bytes: request length
@@ -5212,11 +5247,9 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc,
 	Mpi2SasIoUnitControlRequest_t *mpi_request)
 {
 	u16 smid;
-	u32 ioc_state;
 	u8 issue_reset = 0;
 	int rc;
 	void *request;
-	u16 wait_state_count;
 
 	dinitprintk(ioc, ioc_info(ioc, "%s\n", __func__));
 
@@ -5228,20 +5261,9 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc,
 		goto out;
 	}
 
-	wait_state_count = 0;
-	ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
-	while (ioc_state != MPI2_IOC_STATE_OPERATIONAL) {
-		if (wait_state_count++ == 10) {
-			ioc_err(ioc, "%s: failed due to ioc not operational\n",
-				__func__);
-			rc = -EFAULT;
-			goto out;
-		}
-		ssleep(1);
-		ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
-		ioc_info(ioc, "%s: waiting for operational state(count=%d)\n",
-			 __func__, wait_state_count);
-	}
+	rc = mpt3sas_wait_for_ioc_to_operational(ioc, IOC_OPERATIONAL_WAIT_COUNT);
+	if (rc)
+		goto out;
 
 	smid = mpt3sas_base_get_smid(ioc, ioc->base_cb_idx);
 	if (!smid) {
@@ -5307,11 +5329,9 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc,
 	Mpi2SepReply_t *mpi_reply, Mpi2SepRequest_t *mpi_request)
 {
 	u16 smid;
-	u32 ioc_state;
 	u8 issue_reset = 0;
 	int rc;
 	void *request;
-	u16 wait_state_count;
 
 	dinitprintk(ioc, ioc_info(ioc, "%s\n", __func__));
 
@@ -5323,20 +5343,9 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc,
 		goto out;
 	}
 
-	wait_state_count = 0;
-	ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
-	while (ioc_state != MPI2_IOC_STATE_OPERATIONAL) {
-		if (wait_state_count++ == 10) {
-			ioc_err(ioc, "%s: failed due to ioc not operational\n",
-				__func__);
-			rc = -EFAULT;
-			goto out;
-		}
-		ssleep(1);
-		ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
-		ioc_info(ioc, "%s: waiting for operational state(count=%d)\n",
-			 __func__, wait_state_count);
-	}
+	rc = mpt3sas_wait_for_ioc_to_operational(ioc, IOC_OPERATIONAL_WAIT_COUNT);
+	if (rc)
+		goto out;
 
 	smid = mpt3sas_base_get_smid(ioc, ioc->base_cb_idx);
 	if (!smid) {
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 8f1d6b0..c860ed2 100644
--- a/drivers/scsi/mpt3sa
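The helper in this patch is a bounded poll: check the state, sleep one second on failure, and give up once the retry budget is spent. Below is a minimal user-space sketch of that pattern, not the driver code. The names `wait_until_ready`, `probe_fn`, `sleep_fn`, `ready_after_n`, and `no_sleep` are hypothetical and exist only for illustration; the sleep is passed in as a callback so the loop can be exercised without real delays.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback types; they are not part of mpt3sas. */
typedef bool (*probe_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int seconds);

/*
 * Poll probe() up to `timeout` times, sleeping one second after each
 * failed check. Returns 0 as soon as probe() reports ready and -EFAULT
 * once the budget is exhausted. Assumes timeout >= 1, which is also
 * what the do/while loop in the patch relies on.
 */
static int wait_until_ready(probe_fn probe, void *ctx,
			    sleep_fn do_sleep, int timeout)
{
	do {
		if (probe(ctx))
			return 0;
		do_sleep(1);
	} while (--timeout);

	return -EFAULT;
}

/* Test double: reports ready once its counter has dropped to zero. */
static bool ready_after_n(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}

/* Test double: skips the delay entirely. */
static void no_sleep(unsigned int seconds)
{
	(void)seconds;
}
```

A caller that wants the patch's behaviour would pass a probe that compares the adapter state against the operational value and a sleep callback that really sleeps. Returning directly from the loop avoids the separate `if (!timeout)` check after it.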
[v6 2/4] mpt3sas: Fix Sync cache command failure during driver unload
Fix Sync cache and Start stop command failures with DID_NO_CONNECT
during driver unload.

1) Release drives from the SML first, then remove them internally in
   the driver.
2) Allow Sync cache and Start stop commands to reach the firmware even
   when the remove_host flag is set.

Signed-off-by: Suganath Prabu
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c     | 38 ++--
 drivers/scsi/mpt3sas/mpt3sas_transport.c |  7 --
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 4d73b5e..df56cbe 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3748,6 +3748,40 @@ _scsih_tm_tr_complete(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index,
 	return _scsih_check_for_pending_tm(ioc, smid);
 }
 
+/** _scsih_allow_scmd_to_device - check whether scmd needs to
+ *	 issue to IOC or not.
+ * @ioc: per adapter object
+ * @scmd: pointer to scsi command object
+ *
+ * Returns true if scmd can be issued to IOC otherwise returns false.
+ */
+inline bool _scsih_allow_scmd_to_device(struct MPT3SAS_ADAPTER *ioc,
+	struct scsi_cmnd *scmd)
+{
+
+	if (ioc->pci_error_recovery)
+		return false;
+
+	if (ioc->hba_mpi_version_belonged == MPI2_VERSION) {
+		if (ioc->remove_host)
+			return false;
+
+		return true;
+	}
+
+	if (ioc->remove_host) {
+
+		switch (scmd->cmnd[0]) {
+		case SYNCHRONIZE_CACHE:
+		case START_STOP:
+			return true;
+		default:
+			return false;
+		}
+	}
+
+	return true;
+}
 
 /**
  * _scsih_sas_control_complete - completion routine
@@ -4571,7 +4605,7 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		return 0;
 	}
 
-	if (ioc->pci_error_recovery || ioc->remove_host) {
+	if (!(_scsih_allow_scmd_to_device(ioc, scmd))) {
 		scmd->result = DID_NO_CONNECT << 16;
 		scmd->scsi_done(scmd);
 		return 0;
@@ -9641,6 +9675,7 @@ static void scsih_remove(struct pci_dev *pdev)
 
 	/* release all the volumes */
 	_scsih_ir_shutdown(ioc);
+	sas_remove_host(shost);
 	list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list,
 	    list) {
 		if (raid_device->starget) {
@@ -9682,7 +9717,6 @@ static void scsih_remove(struct pci_dev *pdev)
 		ioc->sas_hba.num_phys = 0;
 	}
 
-	sas_remove_host(shost);
 	mpt3sas_base_detach(ioc);
 	spin_lock(&gioc_lock);
 	list_del(&ioc->list);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c
index bc1e67b..7d722b9 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
@@ -807,10 +807,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address,
 			    mpt3sas_port->remote_identify.sas_address,
 			    mpt3sas_phy->phy_id);
 		mpt3sas_phy->phy_belongs_to_port = 0;
-		sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy);
+		if (!ioc->remove_host)
+			sas_port_delete_phy(mpt3sas_port->port,
+						mpt3sas_phy->phy);
 		list_del(&mpt3sas_phy->port_siblings);
 	}
-	sas_port_delete(mpt3sas_port->port);
+	if (!ioc->remove_host)
+		sas_port_delete(mpt3sas_port->port);
 	kfree(mpt3sas_port);
 }
 
-- 
1.8.3.1
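The gating rule this patch introduces can be read as a small truth table: nothing is issued during PCI error recovery, everything is issued during normal operation, and during unload only the two cache/power commands pass, and only on controllers newer than MPI2. The following is a stand-alone sketch of that rule, assuming a hypothetical `adapter_state` struct and `allow_cmd()` function in place of the driver's `MPT3SAS_ADAPTER` and `_scsih_allow_scmd_to_device()`. The opcode values are the standard SCSI ones for SYNCHRONIZE CACHE(10) and START STOP UNIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard SCSI opcodes for the two commands the patch lets through. */
#define SYNCHRONIZE_CACHE 0x35
#define START_STOP        0x1b

/* Hypothetical stand-in for the few adapter fields the check reads. */
struct adapter_state {
	bool pci_error_recovery;
	bool remove_host;
	bool is_mpi2;	/* models hba_mpi_version_belonged == MPI2_VERSION */
};

/* Returns true if a command with this opcode may be sent to the IOC. */
static bool allow_cmd(const struct adapter_state *s, uint8_t opcode)
{
	if (s->pci_error_recovery)
		return false;
	if (!s->remove_host)
		return true;
	if (s->is_mpi2)
		return false;
	/* Unload in progress on a newer controller: flush/stop only. */
	return opcode == SYNCHRONIZE_CACHE || opcode == START_STOP;
}
```

Folding the MPI2 branch into early returns gives the same outcomes as the nested form in the patch; which layout reads better is a matter of taste.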
[v6 3/4] mpt3sas: Fix driver modifying persistent data.
* If the EEDPTagMode field in Manufacturing page11 is set, unset it.
  This is needed to fix a hardware bug in SAS3/SAS2 cards, so skip the
  EEDPTagMode changes in Manufacturing page11 for SAS35 controllers.

* Fix the driver modifying the NVRAM/persistent data in Manufacturing
  page11 along with the current copy. The driver should change only the
  current copy of Manufacturing page11.

Signed-off-by: Suganath Prabu
---
 drivers/scsi/mpt3sas/mpt3sas_base.c   | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_config.c | 4 
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 243bf32..2237681 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4062,7 +4062,7 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
 	 * flag unset in NVDATA.
 	 */
 	mpt3sas_config_get_manufacturing_pg11(ioc, &mpi_reply, &ioc->manu_pg11);
-	if (ioc->manu_pg11.EEDPTagMode == 0) {
+	if (!ioc->is_gen35_ioc && ioc->manu_pg11.EEDPTagMode == 0) {
 		pr_err("%s: overriding NVDATA EEDPTagMode setting\n",
 		    ioc->name);
 		ioc->manu_pg11.EEDPTagMode &= ~0x3;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_config.c b/drivers/scsi/mpt3sas/mpt3sas_config.c
index 88062d1..fe3da1c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_config.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_config.c
@@ -659,10 +659,6 @@ mpt3sas_config_set_manufacturing_pg11(struct MPT3SAS_ADAPTER *ioc,
 	r = _config_request(ioc, &mpi_request, mpi_reply,
 	    MPT3_CONFIG_PAGE_DEFAULT_TIMEOUT, config_page,
 	    sizeof(*config_page));
-	mpi_request.Action = MPI2_CONFIG_ACTION_PAGE_WRITE_NVRAM;
-	r = _config_request(ioc, &mpi_request, mpi_reply,
-	    MPT3_CONFIG_PAGE_DEFAULT_TIMEOUT, config_page,
-	    sizeof(*config_page));
  out:
 	return r;
 }
-- 
1.8.3.1
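The second point of this patch rests on a config page having two copies: the one the controller is currently using and the one persisted in NVRAM. A toy model of that distinction follows. It is only a sketch under stated assumptions: `page_store`, `write_action`, `write_page`, and `apply_runtime_override` are hypothetical names, and the real driver goes through `_config_request()` with an MPI2 action code rather than writing struct fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a config page has a live copy and a persistent copy. */
struct page_store {
	uint8_t current_copy;
	uint8_t nvram_copy;
};

enum write_action { WRITE_CURRENT, WRITE_NVRAM };

/* Write `value` to exactly one of the two copies. */
static void write_page(struct page_store *p, enum write_action action,
		       uint8_t value)
{
	if (action == WRITE_CURRENT)
		p->current_copy = value;
	else
		p->nvram_copy = value;
}

/*
 * Apply a runtime workaround without persisting it. Controllers flagged
 * as gen3.5 are skipped, mirroring the is_gen35_ioc check in the patch.
 * Returns true if the current copy was changed.
 */
static bool apply_runtime_override(struct page_store *p, bool is_gen35,
				   uint8_t value)
{
	if (is_gen35)
		return false;
	write_page(p, WRITE_CURRENT, value);
	return true;
}
```

With only the `WRITE_CURRENT` step issued, the persistent copy keeps whatever the vendor programmed, so the override does not outlive a reset.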
Re: [GIT PULL] pcmcia odd fixes for v4.20-rc1
On Mon, Oct 22, 2018 at 3:39 PM Dominik Brodowski wrote:
>
> These are just a few odd fixes and improvements to the PCMCIA core
> and to a few PCMCIA device drivers.

Pulled,

Linus