RE: mpt3sas regression...
David,

I could see the same issue on sparc64 system, soon we will repost the patch addressing this.

Thanks,
Chaitra

-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Thursday, June 28, 2018 5:45 AM
To: chaitra.basa...@broadcom.com
Cc: linux-scsi@vger.kernel.org; sparcli...@vger.kernel.org
Subject: Re: mpt3sas regression...

From: Chaitra Basappa
Date: Wed, 27 Jun 2018 19:58:34 +0530

> Please let us know what is the issue faced and if its recreating then
> share the driver logs with logging_level=0x3f8.

The driver cannot even probe successfully to start scanning for disks because some busy bit never clears.

I think it was in one of the firmware status registers or the doorbell register.
RE: mpt3sas regression...
David,

Please let us know what is the issue faced and if its recreating then share the driver logs with logging_level=0x3f8.

Thanks,
Chaitra

-----Original Message-----
From: Chaitra Basappa [mailto:chaitra.basa...@broadcom.com]
Sent: Tuesday, June 26, 2018 5:36 PM
To: David Miller; linux-scsi@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Subject: RE: mpt3sas regression...

Hi David,

Sorry for the inconvenience caused. Yes, "scsi: mpt3sas: Bug fix for big endian systems." patch was posted to fix sparse warnings. I missed the testing. Currently we are testing on sparc64 system and soon I will be reposting the patch based on the findings.

Thanks,
Chaitra

-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Sunday, June 24, 2018 10:17 AM
To: linux-scsi@vger.kernel.org
Cc: chaitra.basa...@broadcom.com; sparcli...@vger.kernel.org
Subject: mpt3sas regression...

Commit:

commit cf6bf9710cabba1fe94a4349f4eb8db623c77ebc
Author: Chaitra P B
Date: Tue Apr 24 05:28:30 2018 -0400

    scsi: mpt3sas: Bug fix for big endian systems.

actually breaks big-endian.

This driver has been working perfectly fine for more than a decade or so on my sparc64 test systems up until this point.

If you are just responding to sparse warnings, please do not do that. What big-endian system did you test this change on?

Meanwhile, I'd like to ask that this change be reverted.

Thank you.
RE: mpt3sas regression...
Hi David,

Sorry for the inconvenience caused. Yes, "scsi: mpt3sas: Bug fix for big endian systems." patch was posted to fix sparse warnings. I missed the testing. Currently we are testing on sparc64 system and soon I will be reposting the patch based on the findings.

Thanks,
Chaitra

-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Sunday, June 24, 2018 10:17 AM
To: linux-scsi@vger.kernel.org
Cc: chaitra.basa...@broadcom.com; sparcli...@vger.kernel.org
Subject: mpt3sas regression...

Commit:

commit cf6bf9710cabba1fe94a4349f4eb8db623c77ebc
Author: Chaitra P B
Date: Tue Apr 24 05:28:30 2018 -0400

    scsi: mpt3sas: Bug fix for big endian systems.

actually breaks big-endian.

This driver has been working perfectly fine for more than a decade or so on my sparc64 test systems up until this point.

If you are just responding to sparse warnings, please do not do that. What big-endian system did you test this change on?

Meanwhile, I'd like to ask that this change be reverted.

Thank you.
RE: [PATCH] mpt3sas: Fix calltrace observed while running IO & host reset
Bart,

Please see my replies inline.

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Wednesday, June 13, 2018 9:22 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH] mpt3sas: Fix calltrace observed while running IO & host reset

On Wed, 2018-06-13 at 15:46 +0530, Chaitra Basappa wrote:
> When host reset is issued from application, through ioctl reset handler
> _ctl_do_reset() -> mpt3sas_base_hard_reset_handler() sets
> “ioc->shost_recovery” flag.
> If “ioc->shost_recovery” flag is set then driver will return all the
> incoming SCSI cmds with “SCSI_MLQUEUE_HOST_BUSY” in the scsih_qcmd(). And
> hence no new request gets processed by the driver until the reset
> completes,
> which guarantees that the smid won't change.

Hello Chaitra,

The patch at the start of this e-mail thread checks whether st->smid is zero. That check could only be useful if there would be code in the mpt3sas driver that clears that field upon command completion. However, I haven't found any such code in the mpt3sas driver.

[Chaitra]
Before starting the host reset operation, the driver sets the "ioc->shost_recovery" flag to one. If the driver receives any IO commands during host reset, the check below in scsih_qcmd() returns these SCSI commands with host busy status, so they are not issued to the HBA FW. These SCSI commands are therefore not outstanding at the driver level, their smid is zero, and there is no need to flush them out during host reset.

	/* host recovery or link resets sent via IOCTLs */
	if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress)
		return SCSI_MLQUEUE_HOST_BUSY;

As a part of the host reset operation, the driver flushes out all the SCSI commands which are outstanding at the driver level with the "DID_RESET" result.

To determine whether SCSI commands are outstanding at the driver level while looping from 'tag' value zero to the HBA queue depth, the driver checks the two fields below from the scsiio_tracker:

1. cb_idx == 0xFF: this means that the SCSI command has been completed by the driver, so this command is not outstanding at the driver level. This check by itself is enough to determine that the SCSI command is completed by the driver, and there is no need to reset smid to zero. But anyway it is better to reset the smid field to zero along with setting cb_idx to 0xff. Hence we will re-post this patch with the smid field in scsiio_tracker set to zero upon completion of the SCSI command by the driver.

2. smid == 0 (zero): this means that the SCSI command has not been issued to the HBA firmware, so this command is not outstanding at the driver level. (The current driver was not checking this case and hence we are observing this issue. In this patch we have added this check to fix this issue.)

If cb_idx != 0xff && smid != 0, this means that the SCSI command is outstanding at the driver level and the driver will flush this SCSI command with "DID_RESET" during diag reset.

Another concern is that setting ioc->shost_recovery prevents new calls of scsih_qcmd() to submit any commands. But I don't think that setting that flag prevents any scsih_qcmd() calls that had already been started to submit a new command.

[Chaitra]
If a SCSI command has already crossed the check for the "ioc->shost_recovery" flag (that is, the scmd was issued just before the start of the host reset operation), then the driver processes it and assigns a valid 'smid' whose value is between 1 and ioc->scsiio_depth (i.e. the SCSI command's tag value + 1). Such commands are outstanding at the driver level and hence are flushed out with "DID_RESET" during the reset operation.

In other words, I don't think that checking whether or not st->smid == 0 is sufficient to fix the reported race.

Bart.
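The two-field test described in the reply above can be sketched in plain C. This is only an illustration under stated assumptions: `struct tracker_sketch` and `cmd_is_outstanding()` are hypothetical names, not the driver's `scsiio_tracker` or any mpt3sas function, and the sketch says nothing about the race Bart raises, since it models no locking or ordering at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-command tracker. */
struct tracker_sketch {
	uint8_t cb_idx;  /* 0xFF once the driver has completed the command */
	uint16_t smid;   /* 0 if the command was never issued to firmware */
};

/*
 * Returns true only when both conditions from the reply hold:
 * the command has not been completed (cb_idx != 0xFF) and it was
 * actually issued to the firmware (smid != 0).
 */
static bool cmd_is_outstanding(const struct tracker_sketch *st)
{
	if (st == NULL)
		return false;
	return st->cb_idx != 0xFF && st->smid != 0;
}
```

Under this reading, only commands for which the function returns true would be flushed with DID_RESET during a host reset.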
RE: [PATCH] mpt3sas: Fix calltrace observed while running IO & host reset
Bart,

When host reset is issued from application, through ioctl reset handler _ctl_do_reset() -> mpt3sas_base_hard_reset_handler() sets “ioc->shost_recovery” flag.
If “ioc->shost_recovery” flag is set then driver will return all the incoming SCSI cmds with “SCSI_MLQUEUE_HOST_BUSY” in the scsih_qcmd(). And hence no new request gets processed by the driver until the reset completes, which guarantees that the smid won't change.

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Tuesday, June 12, 2018 8:54 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH] mpt3sas: Fix calltrace observed while running IO & host reset

On Tue, 2018-06-12 at 09:17 -0400, Chaitra P B wrote:
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 23902ad..96e523a 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -1489,7 +1489,7 @@ struct scsi_cmnd *
>  		scmd = scsi_host_find_tag(ioc->shost, unique_tag);
>  		if (scmd) {
>  			st = scsi_cmd_priv(scmd);
> -			if (st->cb_idx == 0xFF)
> +			if (st->cb_idx == 0xFF || st->smid == 0)
>  				scmd = NULL;
>  		}
>  	}

What guarantees that st->smid won't change after it has been checked and before scmd is used?

Thanks,

Bart.
RE: [PATCH] mpt3sas: Add an i/o barrier
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Tomas Henzl [mailto:the...@redhat.com]
Sent: Thursday, May 24, 2018 9:19 PM
To: James Bottomley; linux-scsi@vger.kernel.org
Cc: chaitra.basa...@broadcom.com; sreekanth.re...@broadcom.com; sathya.prak...@broadcom.com
Subject: Re: [PATCH] mpt3sas: Add an i/o barrier

On 05/24/2018 05:33 PM, James Bottomley wrote:
> On Thu, 2018-05-24 at 17:31 +0200, Tomas Henzl wrote:
>> On 05/24/2018 05:19 PM, James Bottomley wrote:
>>> On Thu, 2018-05-24 at 17:12 +0200, Tomas Henzl wrote:
>>>> A barrier should be added to ensure proper ordering of memory
>>>> mapped writes.
>>>>
>>>> Signed-off-by: Tomas Henzl
>>>> ---
>>>>  drivers/scsi/mpt3sas/mpt3sas_base.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
>>>> index bf04fa90f..569392d0d 100644
>>>> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
>>>> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
>>>> @@ -3348,6 +3348,7 @@ _base_mpi_ep_writeq(__u64 b, volatile void __iomem *addr,
>>>>  	spin_lock_irqsave(writeq_lock, flags);
>>>>  	writel((u32)(data_out), addr);
>>>>  	writel((u32)(data_out >> 32), (addr + 4));
>>>> +	mmiowb();
>>>>  	spin_unlock_irqrestore(writeq_lock, flags);
>>>>  }
>>> I thought, assuming mpt3sas has this right, that this construction
>>> is only used on 32 bit platforms that don't have a writeq
>>> instruction? I don't believe there's any overlap with the NUMA
>>> systems that need io and memory domain synchronization, so either
>>> this problem is purely theoretical or mpt3sas doesn't have the use
>>> of writeq correct and if the latter case it should be fixed
>>> correctly.
>> The _base_mpi_ep_writeq is used regardless to 32/64 bit arch for
>> example in _base_put_smid_mpi_ep_scsi_io,
>> mpt3sas_base_put_smid_hi_priority and so on.
> So it's the latter ... but my point is that that should be fixed
> rather than adding barriers to what should be a corner case work around.

I think that the hw is for some reason not able to handle a 64 write, the patch in e5747439366c1079257083f231f5dd9a84bf0fd7 "scsi: mpt3sas: Introduce function to clone mpi request" states that it is intentional.
So adding the write barrier still makes sense for me.

> James
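As a rough illustration of the construction under discussion, the sketch below splits one 64-bit value into two 32-bit stores, low half first, in the same order as the two writel() calls in the quoted hunk. It is a userspace stand-in under stated assumptions: `split_write64()` and the two-element `reg` array are hypothetical, and the spinlock and mmiowb() from the patch are deliberately left out because they have no userspace equivalent here.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit device register that is written
 * as two 32-bit accesses: reg[0] models addr, reg[1] models addr + 4.
 */
static void split_write64(volatile uint32_t reg[2], uint64_t data_out)
{
	reg[0] = (uint32_t)data_out;          /* low 32 bits, at addr */
	reg[1] = (uint32_t)(data_out >> 32);  /* high 32 bits, at addr + 4 */
	/*
	 * In the driver these two stores sit between spin_lock_irqsave()
	 * and spin_unlock_irqrestore(); the patch adds mmiowb() before the
	 * unlock. None of that is modeled in this sketch.
	 */
}
```

Because the value reaches the device as two separate stores, the lock keeps two CPUs from interleaving their halves, and the barrier question in the thread is about when those stores become visible to the device relative to the unlock.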
RE: [PATCH v2 01/14] mpt3sas: Bug fix for big endian systems.
Martin,

Please see my replies inline.

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Saturday, April 21, 2018 3:52 AM
To: Chaitra P B
Cc: linux-scsi@vger.kernel.org; sathya.prak...@broadcom.com; sreekanth.re...@broadcom.com; suganath-prabu.subram...@broadcom.com
Subject: Re: [PATCH v2 01/14] mpt3sas: Bug fix for big endian systems.

Chaitra,

A few comments:

> @@ -426,7 +427,7 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
>  		dst_addr_phys = _base_get_chain_phys(ioc,
>  			smid, sge_chain_count);
>  		WARN_ON(dst_addr_phys > U32_MAX);
> -		sgel->Address = (u32)dst_addr_phys;
> +		sgel->Address = cpu_to_le32((u32)dst_addr_phys);

I tend to prefer lower_32_bits() but that's your choice.

Accepted, lower_32_bits() can be used here.

> @@ -3040,8 +3047,9 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
>  	}
>
>  	for (i = 0; i < ioc->combined_reply_index_count; i++) {
> -		ioc->replyPostRegisterIndex[i] = (resource_size_t *)
> -			((u8 *)&ioc->chip->Doorbell +
> +		ioc->replyPostRegisterIndex[i] =
> +			(volatile void __iomem *)
> +			((u8 __force *)&ioc->chip->Doorbell +
>  			MPI25_SUP_REPLY_POST_HOST_INDEX_OFFSET +
>  			(i * MPT3_SUP_REPLY_POST_HOST_INDEX_REG_OFFSET));

Do you really need volatile here? The existing resource_size_t didn't imply volatile.

Why the double type casts? You've already changed replyPostRegisterIndex to be 'volatile void __iomem **' in the header file.

So why not:

	ioc->replyPostRegisterIndex[i] = &ioc->chip->Doorbell +
		MPI25_SUP_REPLY_POST_HOST_INDEX_OFFSET +
		i * MPT3_SUP_REPLY_POST_HOST_INDEX_REG_OFFSET;

Also looks like ioc->reply_post_host_index handling a few lines further down could lose the type casts.

Accepted, volatile is not really needed. I shall remove volatile.

> @@ -3386,7 +3394,7 @@ _base_put_smid_mpi_ep_scsi_io(struct MPT3SAS_ADAPTER *ioc, u16 smid, u16 handle)
>  	__le32 *mfp = (__le32 *)mpt3sas_base_get_msg_frame(ioc, smid);
>
>  	_clone_sg_entries(ioc, (void *) mfp, smid);
> -	mpi_req_iomem = (void *)ioc->chip +
> +	mpi_req_iomem = (void __force *)ioc->chip +
>  		MPI_FRAME_START_OFFSET + (smid * ioc->request_sz);
>  	_base_clone_mpi_to_sys_mem(mpi_req_iomem, (void *)mfp,
>  		ioc->request_sz);

Wouldn't it be better to add __iomem to the definition of mpi_req_iomem?

With this change I still see the below warnings:

warning: cast removes address space of expression
warning: incorrect type in assignment (different address spaces)
expected void [noderef] *mpi_req_iomem
got void *
warning: incorrect type in argument 1 (different address spaces)
expected void *dst_iomem
got void [noderef] *mpi_req_iome

> +	nvme_encap_request->ErrorResponseBaseAddress =
> +		cpu_to_le64(ioc->sense_dma & 0xUL);

upper_32_bits()?

since upper_32_bits() returns only upper 32 bits. But here after bitwise & below we are doing bitwise | with dma_address lower 32 bits , so in this case use of upper_32_bits() will yield wrong address for below assignment. Hence upper_32_bits() can't be used.

	nvme_encap_request->ErrorResponseBaseAddress |=
		cpu_to_le64(le32_to_cpu(
		mpt3sas_base_get_sense_buffer_dma(ioc, smid)));

--
Martin K. Petersen
Oracle Linux Engineering
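The last exchange above turns on the difference between masking a 64-bit address in place and extracting its upper half as a 32-bit number. A minimal userspace sketch follows, with loud assumptions: the mask constant in the archived message is incomplete, so `UPPER_HALF_MASK` below is an assumed value chosen to match the discussion, the function names are hypothetical, and the cpu_to_le64()/le32_to_cpu() conversions from the driver code are omitted.

```c
#include <stdint.h>

/* ASSUMPTION: the elided mask keeps bits 63..32 and clears bits 31..0. */
#define UPPER_HALF_MASK 0xFFFFFFFF00000000ULL

/* Userspace equivalent of the kernel's upper_32_bits() helper. */
static uint32_t upper_32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/*
 * Mask-then-OR, as in the reply: the upper half stays in its original
 * bit position, so OR-ing in a 32-bit lower part yields a full address.
 */
static uint64_t compose_by_mask(uint64_t sense_dma, uint32_t lower)
{
	return (sense_dma & UPPER_HALF_MASK) | lower;
}

/*
 * Naive use of upper_32(): the upper half is moved down to bits 31..0,
 * so OR-ing the lower part on top of it mixes the two halves together.
 */
static uint64_t compose_by_upper32(uint64_t sense_dma, uint32_t lower)
{
	return (uint64_t)upper_32(sense_dma) | lower;
}
```

With a base of 0x0000000123456000 and a lower part of 0x00ABC000, the first function gives 0x0000000100ABC000 while the second gives 0x0000000000ABC001, which appears to be the wrong address the reply warns about. Shifting the result of upper_32() back up by 32 bits before the OR would be equivalent to the mask form.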
RE: [PATCH v1 03/15] mpt3sas: Add sanity checks for scsi tracker before accessing it.
Bart,

We will work on this patch and submit. As of now reposting all the patches of this series except this patch.

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Friday, April 6, 2018 8:59 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; sreekanth.re...@broadcom.com; suganath-prabu.subram...@broadcom.com
Subject: Re: [PATCH v1 03/15] mpt3sas: Add sanity checks for scsi tracker before accessing it.

On Thu, 2018-04-05 at 11:46 -0400, Chaitra P B wrote:
> Check scsi tracker 'st' for NULL and st->smid for zero (as driver uses
> smid starting from one) before accessing it.
> These checks are added as there are possibilities for getting valid
> scsi_cmd when driver calls scsi_host_find_tag() API when it loops
> using smid(i.e tag) from one to hba queue depth but still scsi tracker
> st for this corresponding scsi_cmd is not yet initialized.
>
> For example below are such scenario:
> Sometimes it is possible that scsi_cmd might have created at SML but
> it might not be issued to the driver (or driver might have returned
> the command with Host busy status) as the host reset operation / TMs
> is in progress.In such case where the scsi_cmd is not yet processed by
> driver then the scsi tracker 'st' of that scsi_cmd & the fields of
> this 'st' will be uninitialized.
> And hence this patch add checks for 'st' in IOCTL path for TMs issued
> from applications and also in host reset path where driver flushes all
> the outstanding commands as part of host reset operation.

What is needed is an explanation about which mechanism serializes the execution of scsih_qcmd() and mpt3sas_base_hard_reset_handler(), at least if such a mechanism exists. The above text does not mention anything about such a synchronization mechanism.

>  	scmd = mpt3sas_scsih_scsi_lookup_get(ioc, smid);
> -	if (!scmd)
> +	if (scmd == NULL || scmd->device == NULL ||
> +	    scmd->device->hostdata == NULL)

As Christoph explained before, scmd->device is never NULL. Additionally, the scmd->device->hostdata check looks very suspicious. That check should either be left out or the race that causes a SCSI device to be removed concurrently with this function should be fixed.

If you are unable to motivate why this patch is correct, please repost this series without this patch.

Thanks,

Bart.
RE: [PATCH 00/15] mpt3sas: Enhancements and Defect fixes.
Bart,

Below set of patches were prepared against Martin's 4.17/scsi-queue. Also today I tried applying patches to Martin's 4.17/scsi-queue and patches got applied smoothly, didn't observe any issue. Please share the observed errors/hunks failed messages.

Below is the link of the repo used by me:
git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git -b 4.17/scsi-queue

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Friday, March 30, 2018 8:40 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH 00/15] mpt3sas: Enhancements and Defect fixes.

On Fri, 2018-03-30 at 15:07 +0530, Chaitra P B wrote:
> Chaitra P B (15):
>   mpt3sas: Fixed warnings.
>   mpt3sas: Pre-allocate RDPQ Array at driver boot time.
>   mpt3sas: Add sanity checks for scsi tracker before accessing it.
>   mpt3sas: Lockless access for chain buffers.
>   mpt3sas: Optimize I/O memory consumption in driver.
>   mpt3sas: Enhanced handling of Sense Buffer.
>   mpt3sas: Added support for SAS Device Discovery Error Event.
>   mpt3sas: Increase event log buffer to support 24 port HBA's.
>   mpt3sas: Allow processing of events during driver unload.
>   mpt3sas: Cache enclosure pages during enclosure add.
>   mpt3sas: Report Firmware Package Version from HBA Driver.
>   mpt3sas: Update MPI Headers
>   mpt3sas: For NVME device, issue a protocol level reset instead of
>     hot reset and use TM timeout value exposed in PCIe Device Page 2.
>   mpt3sas: fix possible memory leak.
>   mpt3sas: Update driver version "25.100.00.00"

Hello Chaitra,

Against which tree have these patches been prepared? These patches neither apply cleanly on Martin's 4.17/scsi-queue branch nor on Linus' master branch.

Thanks,

Bart.
RE: [PATCH 11/15] mpt3sas: Report Firmware Package Version from HBA Driver.
Bart,

Agreed and patches will be posted with below changes.

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Friday, March 30, 2018 10:05 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH 11/15] mpt3sas: Report Firmware Package Version from HBA Driver.

On Fri, 2018-03-30 at 15:07 +0530, Chaitra P B wrote:
> +	pr_info(MPT3SAS_FMT "FW Package Version"
> +		"(%02d.%02d.%02d.%02d)\n",
> +		ioc->name,
> +		((FWImgHdr->PackageVersion.Word)
> +		& 0xFF00) >> 24,
> +		((FWImgHdr->PackageVersion.Word)
> +		& 0x00FF) >> 16,
> +		((FWImgHdr->PackageVersion.Word)
> +		& 0xFF00) >> 8,
> +		(FWImgHdr->PackageVersion.Word)
> +		& 0x00FF);

Since FWImgHdr->PackageVersion.Word has type __le32 I don't think that the above code will work correctly on big endian systems. Please use the Dev, Unit, Minor and Major members of MPI2_VERSION_STRUCT instead of open-coding access to these members.

Thanks,

Bart.
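The endianness point in the review above can be sketched in userspace C. The sketch assumes, from the member names Bart lists, that the four version bytes are stored in the order Dev, Unit, Minor, Major, lowest address first; `struct version_sketch`, `decode_le_version()` and `le32_bytes_to_cpu()` are hypothetical names, not the MPI headers' definitions.

```c
#include <stdint.h>

/* ASSUMPTION: byte order in memory is Dev, Unit, Minor, Major. */
struct version_sketch {
	uint8_t dev;
	uint8_t unit;
	uint8_t minor;
	uint8_t major;
};

/* Per-byte access gives the same answer on any host byte order. */
static struct version_sketch decode_le_version(const uint8_t raw[4])
{
	struct version_sketch v = { raw[0], raw[1], raw[2], raw[3] };
	return v;
}

/*
 * Explicit little-endian to host conversion of the 32-bit word. Shifting
 * and masking is only safe on the converted value; applying the same
 * shifts to the raw stored word on a big-endian host would pick the
 * bytes in reverse order.
 */
static uint32_t le32_bytes_to_cpu(const uint8_t raw[4])
{
	return (uint32_t)raw[0] |
	       ((uint32_t)raw[1] << 8) |
	       ((uint32_t)raw[2] << 16) |
	       ((uint32_t)raw[3] << 24);
}
```

Both routes agree: the major number is the top byte of the converted word and also the fourth stored byte, which is why reading the struct members avoids the open-coded shifts altogether.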
RE: [PATCH 06/15] mpt3sas: Enhanced handling of Sense Buffer.
Bart,

Agreed with below changes, we will be posting the patches soon.

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Friday, March 30, 2018 9:46 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH 06/15] mpt3sas: Enhanced handling of Sense Buffer.

On Fri, 2018-03-30 at 15:07 +0530, Chaitra P B wrote:
> +	if (((reply_pool_start_address / bit_divisor_16) / (bit_divisor_16)) ==
> +	    ((reply_pool_end_address / bit_divisor_16) / bit_divisor_16))
> +		return 1;
> +	else
> +		return 0;

Please use upper_32_bits() instead of open-coding it. I think that the above check could be rewritten as follows:

	return upper_32_bits(reply_pool_start_address) ==
		upper_32_bits(reply_pool_end_address);

Thanks,

Bart.
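The equivalence Bart points out can be checked with a small userspace sketch. It assumes that bit_divisor_16 is 0x10000 (its value is not shown in the quoted hunk) and that the addresses are unsigned 64-bit values; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace equivalent of the kernel's upper_32_bits() helper. */
static uint32_t upper_32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Open-coded form from the quoted hunk; ASSUMES bit_divisor_16 == 0x10000. */
static bool same_region_divide(uint64_t start, uint64_t end)
{
	const uint64_t bit_divisor_16 = 0x10000;

	return ((start / bit_divisor_16) / bit_divisor_16) ==
	       ((end / bit_divisor_16) / bit_divisor_16);
}

/* Suggested form: compare the upper 32 bits directly. */
static bool same_region_shift(uint64_t start, uint64_t end)
{
	return upper_32(start) == upper_32(end);
}
```

Dividing an unsigned value by 0x10000 twice is the same as shifting it right by 32 bits, so under that assumption both functions report whether the start and end addresses share the same upper 32 bits.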
RE: [PATCH 07/15] mpt3sas: Added support for SAS Device Discovery Error Event.
Bart,

Agreed.

Thanks,
Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanass...@wdc.com]
Sent: Friday, March 30, 2018 9:50 PM
To: chaitra.basa...@broadcom.com; linux-scsi@vger.kernel.org
Cc: sathya.prak...@broadcom.com; suganath-prabu.subram...@broadcom.com; sreekanth.re...@broadcom.com
Subject: Re: [PATCH 07/15] mpt3sas: Added support for SAS Device Discovery Error Event.

On Fri, 2018-03-30 at 15:07 +0530, Chaitra P B wrote:
> +	switch (event_data->ReasonCode) {
> +
> +	case MPI25_EVENT_SAS_DISC_ERR_SMP_FAILED:
> +		pr_warn(MPT3SAS_FMT "SMP command sent to the expander"
> +		    "(handle:0x%04x, sas_address:0x%016llx,"
> +		    "physical_port:0x%02x) has failed",
> +		    ioc->name, le16_to_cpu(event_data->DevHandle),
> +		    (unsigned long long)le64_to_cpu(event_data->SASAddress),
> +		    event_data->PhysicalPort);
> +		break;
> +
> +	case MPI25_EVENT_SAS_DISC_ERR_SMP_TIMEOUT:

Please follow the style that is used elsewhere in the Linux kernel and do not leave blank lines after "switch (...) {" nor after "break".

Thanks,

Bart.
smp-induced oops/NULL pointer dereference in mpt3sas, from kernel >= 4.11
Hi All,

In testing kernel 4.11.1 and 4.11.6 we've hit an oops/blown pointer issue in mpt3sas. It is easily reproducible on a system that contains expanders/enclosures connected behind a SAS3 HBA. Soon after connecting an expander/enclosure we observe the call trace below.

Jul 12 15:28:27 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 00dc
Jul 12 15:28:27 localhost kernel: IP: _transport_smp_handler+0x8bb/0x10c0 [mpt3sas]
Jul 12 15:28:27 localhost kernel: PGD 811abb067
Jul 12 15:28:27 localhost kernel: PUD 81c96a067
Jul 12 15:28:27 localhost kernel: PMD 0
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: Oops: 0002 [#1] SMP
Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: Discovery: (stop)
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: discovery event: (stop)
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: Hardware name: Dell Inc. PowerEdge T620/0658N7, BIOS 2.5.4 01/22/2016
Jul 12 15:28:27 localhost kernel: task: 88081c1b8100 task.stack: c90006168000
Jul 12 15:28:27 localhost kernel: RIP: 0010:_transport_smp_handler+0x8bb/0x10c0 [mpt3sas]
Jul 12 15:28:27 localhost kernel: RSP: 0018:c9000616bb38 EFLAGS: 00010286
Jul 12 15:28:27 localhost kernel: RAX: 00dc RBX: 88041c2ba7b0 RCX: 003c1a07ff00
Jul 12 15:28:27 localhost kernel: RDX: 88081a45c948 RSI: dead0200 RDI: dead0100
Jul 12 15:28:27 localhost kernel: RBP: c9000616bbf8 R08: c9000616bac0 R09: dead0200
Jul 12 15:28:27 localhost kernel: R10: R11: 0010 R12: 0105
Jul 12 15:28:27 localhost kernel: R13: 88041d631680 R14: 88041a6c6c38 R15: 0001
Jul 12 15:28:27 localhost kernel: FS: 7f1818ad1700() GS:88042f80() knlGS:
Jul 12 15:28:27 localhost kernel: CS: 0010 DS: ES: CR0: 80050033
Jul 12 15:28:27 localhost kernel: CR2: 00dc CR3: 00081dad1000 CR4: 000406f0
Jul 12 15:28:27 localhost kernel: Call Trace:
Jul 12 15:28:27 localhost kernel: ? blk_rq_bio_prep+0x3c/0x80
Jul 12 15:28:27 localhost kernel: ? blk_start_request+0x38/0x60
Jul 12 15:28:27 localhost kernel: sas_smp_request+0x5f/0xa0 [scsi_transport_sas]
Jul 12 15:28:27 localhost kernel: sas_non_host_smp_request+0x4a/0x60 [scsi_transport_sas]
Jul 12 15:28:27 localhost kernel: __blk_run_queue+0x37/0x50
Jul 12 15:28:27 localhost kernel: blk_execute_rq_nowait+0xeb/0x140
Jul 12 15:28:27 localhost kernel: blk_execute_rq+0x48/0x90
Jul 12 15:28:27 localhost kernel: bsg_ioctl+0x18a/0x1e0
Jul 12 15:28:27 localhost kernel: vfs_ioctl+0x18/0x30
Jul 12 15:28:27 localhost kernel: do_vfs_ioctl+0x14b/0x3f0
Jul 12 15:28:27 localhost kernel: ? security_file_ioctl+0x45/0x60
Jul 12 15:28:27 localhost kernel: SyS_ioctl+0x92/0xa0
Jul 12 15:28:27 localhost kernel: do_syscall_64+0x6c/0x160
Jul 12 15:28:27 localhost kernel: entry_SYSCALL64_slow_path+0x25/0x25
Jul 12 15:28:27 localhost kernel: RIP: 0033:0x35a88e0a77
Jul 12 15:28:27 localhost kernel: RSP: 002b:7ffded06f278 EFLAGS: 0246 ORIG_RAX: 0010
Jul 12 15:28:27 localhost kernel: RAX: ffda RBX: 7ffded06f370 RCX: 0035a88e0a77
Jul 12 15:28:27 localhost kernel: RDX: 7ffded06f280 RSI: 2285 RDI: 0003
Jul 12 15:28:27 localhost kernel: RBP: R08: 03ea R09: 8000
Jul 12 15:28:27 localhost kernel: R10: fff0 R11: 0246 R12: 0003
Jul 12 15:28:27 localhost kernel: R13: R14: 7ffded06f3a0 R15:
Jul 12 15:28:27 localhost kernel: Code: 84 3e 02 00 00 48 8b 5d a8 85 d2 4c 8b ab f8 02 00 00 0f 85 e3 05 00 00 48 8b 55 98 49 8b 4d 00 48 81 c2 48 01 00 00 48 8b 42 28 <48> 89 08 49 8b 4d 08 48 89 48 08 49 8b 4d 10 48 89 48 10 41 8b
Jul 12 15:28:27 localhost kernel: RIP: _transport_smp_handler+0x8bb/0x10c0 [mpt3sas] RSP: c9000616bb38
Jul 12 15:28:27 localhost kernel: CR2: 00dc
Jul 12 15:28:27 localhost kernel: ---[ end trace d0a22e0e5a84886a ]---
Jul 12 15:28:28 localhost kernel: ses 4:0:0:0: Attached Enclosure device

We analyzed this issue and could figure out that it is not caused by the driver; it is because the "sense" field of 'struct scsi_request' is not being populated properly by the upper layer. This "sense" member is referenced in our driver code for kernel versions >= 4.11 as shown in the snippet below, whereas for kernel versions < 4.11 this "sense" member was referenced via 'struct request'.

static int
_transport_smp_handler (.)
{
.
.
>>memcpy(scsi_req(req)->sense, mpi_reply, sizeof(*mpi_reply));
.
.
}

Hence the NULL pointer dereference call trace is seen for the above chunk of mpt3sas. This needs to be addressed in the upper layer, so please help us in getting this resolved.

Thanks in advance for the support,

Regards,
Chaitra
RE: [Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD
Hi,

Please share driver logs with logging_level set to "0x3f8".

If the driver cannot be unloaded and loaded, then the module parameter has to be passed on the kernel command line in the boot loader as “mpt3sas.logging_level=0x3F8”; otherwise, if the driver module can be unloaded and loaded, simply give logging_level=0x3F8 to insmod or modprobe.

Thanks,
Chaitra

-----Original Message-----
From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of bugzilla-dae...@bugzilla.kernel.org
Sent: Friday, October 21, 2016 6:43 AM
To: linux-scsi@vger.kernel.org
Subject: [Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #2 from Badalyan Vyacheslav ---
Adapter Selected is a Avago SAS: SAS3008(C0)

Num   Ctlr          FW Ver        NVDATA        x86-BIOS      PCI Addr
0     SAS3008(C0)   13.00.00.00   0b.02.00.03   08.31.00.00   00:0a:00:00
1     SAS3008(C0)   13.00.00.00   0b.02.00.03   08.31.00.00   00:08:00:00

--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Greg Edwards [mailto:gedwa...@fireweed.org]
Sent: Saturday, July 30, 2016 9:36 PM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; Greg Edwards
Subject: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards

mpt3sas crashes on resume after suspend with WarpDrive flash cards. The reply_post_host_index array is not set back up after the resume, and we dereference a stale pointer in _base_interrupt().

[ 47.309711] BUG: unable to handle kernel paging request at c90001f8006c
[ 47.318289] IP: [] _base_interrupt+0x49f/0xa30 [mpt3sas]
[ 47.326749] PGD 41ccaa067 PUD 41ccab067 PMD 3466c067 PTE 0
[ 47.333848] Oops: 0002 [#1] SMP
...
[ 47.452708] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0 #6
[ 47.460506] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18 09/24/2013
[ 47.469629] task: 81c0d500 ti: 81c0 task.ti: 81c0
[ 47.479112] RIP: 0010:[] [] _base_interrupt+0x49f/0xa30 [mpt3sas]
[ 47.490466] RSP: 0018:88041d203e30 EFLAGS: 00010002
[ 47.497801] RAX: 0001 RBX: 880033f4c000 RCX: 0001
[ 47.506973] RDX: c90001f8006c RSI: 0082 RDI: 0082
[ 47.516141] RBP: 88041d203eb0 R08: 8804118e2820 R09: 0001
[ 47.525300] R10: 0001 R11: 100c R12:
[ 47.534457] R13: 880412c487e0 R14: 88041a8987d8 R15: 0001
[ 47.543632] FS: () GS:88041d20() knlGS:
[ 47.553796] CS: 0010 DS: ES: CR0: 80050033
[ 47.561632] CR2: c90001f8006c CR3: 01c06000 CR4: 000406f0
[ 47.570883] Stack:
[ 47.575015] 1d211228 88041d2100c0 8800c47d8130 0100
[ 47.584625] 8804100c 100c 88041a8992a0 88041a8987f8
[ 47.594230] 88041d203e00 8e55 038c 880414ad4280
[ 47.603862] Call Trace:
[ 47.608474]
[ 47.610413] [] ? call_timer_fn+0x35/0x120
[ 47.620539] [] handle_irq_event_percpu+0x7f/0x1c0
[ 47.629061] [] handle_irq_event+0x2c/0x50
[ 47.636859] [] handle_edge_irq+0x6f/0x130
[ 47.644654] [] handle_irq+0x73/0x120
[ 47.652011] [] ? atomic_notifier_call_chain+0x1a/0x20
[ 47.660854] [] do_IRQ+0x4b/0xd0
[ 47.66] [] common_interrupt+0x8c/0x8c
[ 47.675635]

Move the reply_post_host_index array setup into mpt3sas_base_map_resources(), which is also in the resume path.

Cc: sta...@vger.kernel.org
Signed-off-by: Greg Edwards
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 751f13e..750f82c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -2188,6 +2188,17 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
 	} else
 		ioc->msix96_vector = 0;
 
+	if (ioc->is_warpdrive) {
+		ioc->reply_post_host_index[0] = (resource_size_t __iomem *)
+		    &ioc->chip->ReplyPostHostIndex;
+
+		for (i = 1; i < ioc->cpu_msix_table_sz; i++)
+			ioc->reply_post_host_index[i] =
+			(resource_size_t __iomem *)
+			((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
+			* 4)));
+	}
+
 	list_for_each_entry(reply_q, &ioc->reply_queue_list, list)
 		pr_info(MPT3SAS_FMT "%s: IRQ %d\n",
 		    reply_q->name, ((ioc->msix_enable) ? "PCI-MSI-X enabled" :
@@ -5280,17 +5291,6 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
 	if (r)
 		goto out_free_resources;
 
-	if (ioc->is_warpdrive) {
-		ioc->reply_post_host_index[0] = (resource_size_t __iomem *)
-		    &ioc->chip->ReplyPostHostIndex;
-
-		for (i = 1; i < ioc->cpu_msix_table_sz; i++)
-			ioc->reply_post_host_index[i] =
-			(resource_size_t __iomem *)
-			((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
-			* 4)));
-	}
-
 	pci_set_drvdata(ioc->pdev, ioc->shost);
 	r = _base_get_ioc_facts(ioc, CAN_SLEEP);
 	if (r)
--
2.7.4
RE: [PATCH] mpt3sas: Don't spam logs if logging level is 0
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-Original Message-
From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Johannes Thumshirn
Sent: Wednesday, August 03, 2016 6:30 PM
To: Martin K . Petersen; James Bottomley
Cc: Linux SCSI Mailinglist; Linux Kernel Mailinglist; Sreekanth Reddy; Johannes Thumshirn
Subject: [PATCH] mpt3sas: Don't spam logs if logging level is 0

In _scsih_io_done() we test if the ioc->logging_level does _not_ have the MPT_DEBUG_REPLY bit set and if it hasn't we print the debug messages. This unfortunately is the wrong way around.

Note, the actual bug is older than af0094115 but this commit removed the CONFIG_SCSI_MPT3SAS_LOGGING Kconfig option which hid the bug.

Fixes: af0094115 'mpt2sas, mpt3sas: Remove SCSI_MPTXSAS_LOGGING entry from Kconfig'
Signed-off-by: Johannes Thumshirn
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 4a1cc85..a138690 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4693,7 +4693,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 		le16_to_cpu(mpi_reply->DevHandle));
 	mpt3sas_trigger_scsi(ioc, data.skey, data.asc, data.ascq);
-	if (!(ioc->logging_level & MPT_DEBUG_REPLY) &&
+	if ((ioc->logging_level & MPT_DEBUG_REPLY) &&
 	    ((scmd->sense_buffer[2] == UNIT_ATTENTION) ||
 	    (scmd->sense_buffer[2] == MEDIUM_ERROR) ||
 	    (scmd->sense_buffer[2] == HARDWARE_ERROR)))
-- 
1.8.5.6

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
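For anyone wanting to see the inverted test from the patch above in isolation, here is a minimal userspace sketch. The bit value DEMO_DEBUG_REPLY and both helper names are made up for illustration; the driver's real MPT_DEBUG_REPLY value is not quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value, for illustration only. The driver's real
 * MPT_DEBUG_REPLY constant is defined in its own debug header. */
#define DEMO_DEBUG_REPLY 0x0200u

/* Pre-patch logic: true when the reply-debug bit is NOT set,
 * so messages are printed exactly when debugging is off. */
static bool should_log_buggy(uint32_t logging_level)
{
	return !(logging_level & DEMO_DEBUG_REPLY);
}

/* Post-patch logic: true only when the reply-debug bit is set. */
static bool should_log_fixed(uint32_t logging_level)
{
	return (logging_level & DEMO_DEBUG_REPLY) != 0;
}
```

With a logging level of 0 the buggy helper returns true and the fixed one returns false, which matches the "spam logs if logging level is 0" symptom in the subject line.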
RE: [PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-Original Message-
From: Calvin Owens [mailto:calvinow...@fb.com]
Sent: Thursday, July 28, 2016 10:16 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated

We blindly trust the hardware to give us NUL-terminated strings, which is a bad idea because it doesn't always do that. For example:

[ 481.184784] mpt3sas_cm0: enclosure level(0x), connector name( \x3)

In this case, connector_name is four spaces. We got lucky here because the 2nd byte beyond our character array happens to be a NUL. Fix this by explicitly writing '\0' to the end of the string to ensure we don't run off the edge of the world in printk().

Signed-off-by: Calvin Owens
---
 drivers/scsi/mpt3sas/mpt3sas_base.h | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 10 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 892c9be..eb7f5b0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -478,7 +478,7 @@ struct _sas_device {
 	u8 pfa_led_on;
 	u8 pend_sas_rphy_add;
 	u8 enclosure_level;
-	u8 connector_name[4];
+	u8 connector_name[5];
 	struct kref refcount;
 };

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index cd91a68..acabe48 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -5380,8 +5380,9 @@ _scsih_check_device(struct MPT3SAS_ADAPTER *ioc,
 	    MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) {
 		sas_device->enclosure_level =
 		    le16_to_cpu(sas_device_pg0.EnclosureLevel);
-		memcpy(&sas_device->connector_name[0],
-		    &sas_device_pg0.ConnectorName[0], 4);
+		memcpy(sas_device->connector_name,
+		    sas_device_pg0.ConnectorName, 4);
+		sas_device->connector_name[4] = '\0';
 	} else {
 		sas_device->enclosure_level = 0;
 		sas_device->connector_name[0] = '\0';
@@ -5508,8 +5509,9 @@ _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle, u8 phy_num,
 	if (sas_device_pg0.Flags & MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) {
 		sas_device->enclosure_level =
 		    le16_to_cpu(sas_device_pg0.EnclosureLevel);
-		memcpy(&sas_device->connector_name[0],
-		    &sas_device_pg0.ConnectorName[0], 4);
+		memcpy(sas_device->connector_name,
+		    sas_device_pg0.ConnectorName, 4);
+		sas_device->connector_name[4] = '\0';
 	} else {
 		sas_device->enclosure_level = 0;
 		sas_device->connector_name[0] = '\0';
-- 
2.8.0.rc2

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
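The pattern in the patch above (grow the array by one byte, copy the fixed-width field, then write the terminator yourself) can be tried in plain C. This is only a sketch: struct demo_sas_device and demo_set_connector_name() are illustrative names, not driver symbols.

```c
#include <stddef.h>
#include <string.h>

#define CONNECTOR_NAME_LEN 4 /* bytes supplied by the device page */

/* Destination carries one extra byte so a terminator always fits. */
struct demo_sas_device {
	unsigned char connector_name[CONNECTOR_NAME_LEN + 1];
};

/* Copy the raw 4-byte field and terminate it explicitly, so the result
 * is a valid C string even when the source contains no NUL at all. */
static void demo_set_connector_name(struct demo_sas_device *dev,
				    const unsigned char *raw)
{
	memcpy(dev->connector_name, raw, CONNECTOR_NAME_LEN);
	dev->connector_name[CONNECTOR_NAME_LEN] = '\0';
}
```

Passing four spaces as the raw field, as in the log line quoted in the commit message, yields a string of length 4 regardless of what follows the array in memory.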
RE: [PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Friday, July 29, 2016 10:08 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm() This flag that conditionally acquires the mutex is confusing and prone to bugginess: refactor it into two separate function calls, and make the unlocked one complain if it's called outside the mutex. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.h | 16 +++-- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 5 ++- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 66 +--- 3 files changed, 38 insertions(+), 49 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index eb7f5b0..f0baafd 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -794,16 +794,6 @@ struct reply_post_struct { dma_addr_t reply_post_free_dma; }; -/** - * enum mutex_type - task management mutex type - * @TM_MUTEX_OFF: mutex is not required becuase calling function is acquiring it - * @TM_MUTEX_ON: mutex is required - */ -enum mutex_type { - TM_MUTEX_OFF = 0, - TM_MUTEX_ON = 1, -}; - typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct MPT3SAS_ADAPTER *ioc); /** * struct MPT3SAS_ADAPTER - per adapter struct @@ -1291,7 +1281,11 @@ void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase); int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, uint id, uint lun, u8 type, u16 smid_task, - ulong timeout, enum mutex_type m_type); + ulong timeout); +int mpt3sas_scsih_issue_locked_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, + uint channel, uint id, uint lun, u8 type, u16 smid_task, + ulong timeout); + 
void mpt3sas_scsih_set_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle); void mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle); void mpt3sas_expander_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address); diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c index 7d00f09..75ae533 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c +++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c @@ -1001,10 +1001,9 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg, ioc->name, le16_to_cpu(mpi_request->FunctionDependent1)); mpt3sas_halt_firmware(ioc); - mpt3sas_scsih_issue_tm(ioc, + mpt3sas_scsih_issue_locked_tm(ioc, le16_to_cpu(mpi_request->FunctionDependent1), 0, 0, - 0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0, 30, - TM_MUTEX_ON); + 0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0, 30); } else mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, FORCE_BIG_HAMMER); diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index acabe48..c93a7ba 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -2201,7 +2201,6 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle) * @type: MPI2_SCSITASKMGMT_TASKTYPE__XXX (defined in mpi2_init.h) * @smid_task: smid assigned to the task * @timeout: timeout in seconds - * @m_type: TM_MUTEX_ON or TM_MUTEX_OFF * Context: user * * A generic API for sending task management requests to firmware. 
@@ -2212,8 +2211,7 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle) */ int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, - uint id, uint lun, u8 type, u16 smid_task, ulong timeout, - enum mutex_type m_type) + uint id, uint lun, u8 type, u16 smid_task, ulong timeout) { Mpi2SCSITaskManagementRequest_t *mpi_request; Mpi2SCSITaskManagementReply_t *mpi_reply; @@ -2224,21 +,19 @@ mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, int rc; u16 msix_task = 0; - if (m_type == TM_MUTEX_ON) - mutex_lock(&ioc->tm_cmds.mutex); + lockdep_assert_held(&ioc->tm_cmds.mutex); + if (ioc->tm_cmds.status != MPT3_CMD_NOT_USED) { pr_info(MPT3SAS_FMT "%s: tm_cmd busy!!!\n", __func__, ioc->name); - rc = FAILED; - goto err_out; + return FAILED; } if (ioc->shost_recovery || ioc->remove_host || ioc->pci_error_recover
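The shape of this refactor, one core function that asserts the lock is held plus a thin wrapper that takes the lock, can be sketched in userspace. Everything below is a stand-in: struct demo_mutex is a single-threaded toy that only tracks ownership, and assert() plays the role that lockdep_assert_held() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded stand-in for a mutex: it only tracks whether it is held. */
struct demo_mutex {
	bool held;
};

static void demo_lock(struct demo_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void demo_unlock(struct demo_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct demo_ioc {
	struct demo_mutex tm_mutex;
	int tm_issued; /* counts task-management requests issued */
};

/* Core routine: the caller must already hold tm_mutex. */
static int demo_issue_tm(struct demo_ioc *ioc)
{
	assert(ioc->tm_mutex.held); /* complain if called outside the lock */
	ioc->tm_issued++;
	return 0;
}

/* Wrapper for callers that do not hold the lock yet. */
static int demo_issue_locked_tm(struct demo_ioc *ioc)
{
	int rc;

	demo_lock(&ioc->tm_mutex);
	rc = demo_issue_tm(ioc);
	demo_unlock(&ioc->tm_mutex);
	return rc;
}
```

Compared with a flag argument, the call site now states which locking contract it relies on, and a wrong choice trips the assertion instead of silently deadlocking or racing.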
RE: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-Original Message-
From: mpt-fusionlinux@broadcom.com [mailto:mpt-fusionlinux@broadcom.com] On Behalf Of Calvin Owens
Sent: Friday, July 29, 2016 10:08 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1

Trivial non-functional changes for a couple annoying things:

1) Functions local to files are not declared static, which is frustrating when reading the code because it's non-obvious at first glance what's actually called from other files.

2) Set-but-unused variables abound, presumably to mask -Wunused-result errors in the past. None of these are flagged today though (with one exception noted below), so remove them.

Fixing (2) exposed the fact that we improperly ignore the return value of scsi_device_reprobe() in _scsih_reprobe_lun(). Fixing the calling code to deal with the potential error is non-trivial, so for now just WARN().
Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 18 +++- drivers/scsi/mpt3sas/mpt3sas_config.c| 4 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 29 ++--- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 70 +++- drivers/scsi/mpt3sas/mpt3sas_transport.c | 16 ++-- 5 files changed, 56 insertions(+), 81 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 0956183..df95d1a 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -2039,7 +2039,7 @@ _base_enable_msix(struct MPT3SAS_ADAPTER *ioc) * mpt3sas_base_unmap_resources - free controller resources * @ioc: per adapter object */ -void +static void mpt3sas_base_unmap_resources(struct MPT3SAS_ADAPTER *ioc) { struct pci_dev *pdev = ioc->pdev; @@ -3884,7 +3884,6 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes, MPI2DefaultReply_t *default_reply = (MPI2DefaultReply_t *)reply; int i; u8 failed; - u16 dummy; __le32 *mfp; /* make sure doorbell is not in use */ @@ -3964,7 +3963,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes, return -EFAULT; } if (i >= reply_bytes/2) /* overflow case */ - dummy = readl(&ioc->chip->Doorbell); + readl(&ioc->chip->Doorbell); else reply[i] = le16_to_cpu(readl(&ioc->chip->Doorbell) & MPI2_DOORBELL_DATA_MASK); @@ -4009,7 +4008,6 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc, { u16 smid; u32 ioc_state; - unsigned long timeleft; bool issue_reset = false; int rc; void *request; @@ -4062,7 +4060,7 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc, ioc->ioc_link_reset_in_progress = 1; init_completion(&ioc->base_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, + wait_for_completion_timeout(&ioc->base_cmds.done, msecs_to_jiffies(1)); if ((mpi_request->Operation == MPI2_SAS_OP_PHY_HARD_RESET || mpi_request->Operation == MPI2_SAS_OP_PHY_LINK_RESET) && @@ 
-4112,7 +4110,6 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc, { u16 smid; u32 ioc_state; - unsigned long timeleft; bool issue_reset = false; int rc; void *request; @@ -4163,7 +4160,7 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc, memcpy(request, mpi_request, sizeof(Mpi2SepReply_t)); init_completion(&ioc->base_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, + wait_for_completion_timeout(&ioc->base_cmds.done, msecs_to_jiffies(1)); if (!(ioc->base_cmds.status & MPT3_CMD_COMPLETE)) { pr_err(MPT3SAS_FMT "%s: timeout\n", @@ -4548,7 +4545,6 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc) { Mpi2PortEnableRequest_t *mpi_request; Mpi2PortEnableReply_t *mpi_reply; - unsigned long timeleft; int r = 0; u16 smid; u16 ioc_status; @@ -4576,8 +4572,7 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc) init_completion(&ioc->port_enable_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->port_enable_cmds.done, - 300*HZ); + wait_for_completion_timeout(&ioc->port_enable_cmds.done, 300*HZ); if (
RE: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Friday, July 29, 2016 10:08 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code With the exception of a single call to wait_for_doorbell_int(), all this conditional sleeping code is dead. So delete it. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 241 +-- drivers/scsi/mpt3sas/mpt3sas_base.h | 6 +- drivers/scsi/mpt3sas/mpt3sas_config.c| 3 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 15 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 21 +-- drivers/scsi/mpt3sas/mpt3sas_transport.c | 12 +- 6 files changed, 120 insertions(+), 178 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 751f13e..0956183 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -98,7 +98,7 @@ MODULE_PARM_DESC(mpt3sas_fwfault_debug, " enable detection of firmware fault and halt firmware - (default=0)"); static int -_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc, int sleep_flag); +_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc); /** * _scsih_set_fwfault_debug - global setting of ioc->fwfault_debug. @@ -218,8 +218,7 @@ _base_fault_reset_work(struct work_struct *work) ioc->non_operational_loop = 0; if ((doorbell & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL) { - rc = mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, - FORCE_BIG_HAMMER); + rc = mpt3sas_base_hard_reset_handler(ioc, FORCE_BIG_HAMMER); pr_warn(MPT3SAS_FMT "%s: hard reset: %s\n", ioc->name, __func__, (rc == 0) ? 
"success" : "failed"); doorbell = mpt3sas_base_get_iocstate(ioc, 0); @@ -2145,7 +2144,7 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc) _base_mask_interrupts(ioc); - r = _base_get_ioc_facts(ioc, CAN_SLEEP); + r = _base_get_ioc_facts(ioc); if (r) goto out_fail; @@ -3172,12 +3171,11 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc) /** * _base_allocate_memory_pools - allocate start of day memory pools * @ioc: per adapter object - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 success, anything else error */ static int -_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc, int sleep_flag) +_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc) { struct mpt3sas_facts *facts; u16 max_sge_elements; @@ -3647,29 +3645,25 @@ mpt3sas_base_get_iocstate(struct MPT3SAS_ADAPTER *ioc, int cooked) * _base_wait_on_iocstate - waiting on a particular ioc state * @ioc_state: controller state { READY, OPERATIONAL, or RESET } * @timeout: timeout in second - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 for success, non-zero for failure. */ static int -_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int timeout, - int sleep_flag) +_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int +timeout) { u32 count, cntdn; u32 current_state; count = 0; - cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout; + cntdn = 1000 * timeout; do { current_state = mpt3sas_base_get_iocstate(ioc, 1); if (current_state == ioc_state) return 0; if (count && current_state == MPI2_IOC_STATE_FAULT) break; - if (sleep_flag == CAN_SLEEP) - usleep_range(1000, 1500); - else - udelay(500); + + usleep_range(1000, 1500); count++; } while (--cntdn); @@ -3681,24 +3675,22 @@ _base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int timeout, * a write to the doorbell) * @ioc: per adapter object * @timeout: timeout in second - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 for success, non-zero for failure. 
* * Notes: MPI2_HIS_IOC2SYS_DB_STATUS - set to one when IOC writes to doorbell. */ static int -_base_diag_reset(struct MPT3SAS_ADAPTER *ioc, int sleep_flag); +_base_diag_reset(struct MPT3SAS_ADAPTER *ioc); static int -_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout, - int sleep_flag) +_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout) { u32 cntdn, count; u32 int_status; count = 0; - cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout; + cntdn = 1000 * timeout; do {
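With the sleep_flag branches gone, the wait loops in this patch reduce to a plain "poll, sleep about 1 ms, count down" shape. Below is a rough userspace sketch of that shape. The state reader and the delay are injected as callbacks so the example stays portable; DEMO_STATE_FAULT, struct demo_fake_dev and all function names are hypothetical, not driver symbols.

```c
#include <stdint.h>

#define DEMO_STATE_FAULT 0x4u /* hypothetical fault-state value */

typedef uint32_t (*demo_read_fn)(void *ctx);
typedef void (*demo_sleep_fn)(void *ctx);

/* Poll until the device reports `wanted`, sleeping roughly 1 ms between
 * polls, for at most 1000 * timeout_s polls. Returns 0 on success and
 * -1 on timeout or when a fault state shows up after the first poll. */
static int demo_wait_on_state(demo_read_fn read_state, demo_sleep_fn sleep_1ms,
			      void *ctx, uint32_t wanted, uint32_t timeout_s)
{
	uint32_t count = 0;
	uint32_t cntdn = 1000u * timeout_s;

	while (cntdn--) {
		uint32_t cur = read_state(ctx);

		if (cur == wanted)
			return 0;
		if (count && cur == DEMO_STATE_FAULT)
			break;
		sleep_1ms(ctx); /* always sleeps; no busy-wait variant left */
		count++;
	}
	return -1;
}

/* Fake device for demonstration: reports ready_state once it has been
 * polled ready_after times, and counts how often the caller slept. */
struct demo_fake_dev {
	uint32_t polls;
	uint32_t ready_after;
	uint32_t ready_state;
	uint32_t sleeps;
};

static uint32_t demo_fake_read(void *ctx)
{
	struct demo_fake_dev *d = ctx;

	return (d->polls++ >= d->ready_after) ? d->ready_state : 0;
}

static void demo_fake_sleep(void *ctx)
{
	struct demo_fake_dev *d = ctx;

	d->sleeps++;
}
```

In a real program the sleep callback would wrap something like nanosleep(); in the kernel patch the equivalent call is usleep_range(1000, 1500).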
RE: Kernel panics while creating RAID volume on latest stable 4.6.2 kernel beacuse of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"
Any updates on this ??? Thanks, Chaitra -Original Message- From: Chaitra Basappa [mailto:chaitra.basa...@broadcom.com] Sent: Friday, June 17, 2016 4:04 PM To: linux-ker...@vger.kernel.org; Linux SCSI Mailinglist; James Bottomley Subject: Kernel panics while creating RAID volume on latest stable 4.6.2 kernel beacuse of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" Importance: High Hi, Try creating RAID volume on latest stable 4.6.2 kernel, as soon as the volume gets created kernel panics , below are the logs... Carried out same experimentation on 4.4.13 kernel, issue was not observed.After learning diff b/w 4.4.13 & 4.6.2 kernels "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" patch looks to be suspicious. commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293 And hence reverted above mentioned patch changes from 4.6.2 kernel and tried volume creation, volume created successfully and issue is not observed. >>Kernel panic logs: root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found sd 0:1:0:0: [sdw] Assuming drive cache: write through [ cut here ] kernel BUG at drivers/scsi/scsi_transport_sas.c:164! 
invalid opcode: [#1] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: speedstep_lib] CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE 4.6.2 #1 Hardware name: Dell Inc. PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014 Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas] task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000 RIP: 0010:[] [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP: 0018:8800c62cb8a8 EFLAGS: 00010282 RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4 RDX: 002b RSI: RDI: 8800b04ec800 RBP: 8800c62cb8a8 R08: R09: 0008 R10: R11: 0001 R12: 8800b04ec800 R13: R14: 8800b04ec998 R15: FS: () GS:88012f02() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ff600400 CR3: 01c06000 CR4: 000406e0 Stack: 8800c62cb8d8 a066bc62 8800b04ecc68 880128ee8000 8800c62cb938 a066bd5c 8800b04ecef8 81608333 8800b04ec800 8800b04ecc68 Call Trace: [] ses_match_to_enclosure+0x72/0x80 [ses] [] ses_intf_add+0xec/0x494 [ses] [] ? preempt_schedule_common+0x23/0x40 [] device_add+0x278/0x440 [] ? __pm_runtime_resume+0x6c/0x90 [] scsi_sysfs_add_sdev+0xee/0x2b0 [] scsi_add_lun+0x437/0x580 [] scsi_probe_and_add_lun+0x1bb/0x4e0 [] ? get_device+0x19/0x20 [] ? scsi_alloc_target+0x293/0x320 [] ? 
__pm_runtime_resume+0x6c/0x90 [] __scsi_add_device+0x10f/0x130 [] scsi_add_device+0x11/0x30 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas] [] _scsih_sas_ir_config_change_event+0xdb/0x210 [mpt3sas] [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas] [] ? pwq_dec_nr_in_flight+0x50/0xa0 [] _firmware_event_work+0x19/0x20 [mpt3sas] [] process_one_work+0x189/0x4e0 [] ? del_timer_sync+0x4c/0x60 [] ? maybe_create_worker+0x8e/0x110 [] ? schedule+0x40/0xb0 [] worker_thread+0x16d/0x520 [] ? default_wake_function+0x12/0x20 [] ? __wake_up_common+0x56/0x90 [] ? maybe_create_worker+0x110/0x110 [] ? schedule+0x40/0xb0 [] ? maybe_create_worker+0x110/0x110 [] kthread+0xcc/0xf0 [] ? schedule_tail+0x1e/0xc0 [] ret_from_fork+0x22/0x40 [] ? kthread_freezable_should_stop+0x70/0x70 Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b 40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 RIP [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP ---[ end trace c8c9da69e1dcb8a1 ]--- BUG: unable to handle kernel paging request at ffd8 IP: [] kthread_data+0x10/0x20 PGD 1c07067 PUD 1c09067 PMD 0 Oops: [#2] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
RE: [PATCH] mpt3sas: Fix panic when aer correct error occured
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-Original Message-
From: Kefeng Wang [mailto:wangkefeng.w...@huawei.com]
Sent: Tuesday, July 12, 2016 3:13 PM
To: martin.peter...@oracle.com; suganath-prabu.subram...@broadcom.com; mpt-fusionlinux@broadcom.com
Cc: linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; guohan...@huawei.com; Kefeng Wang; Sathya Prakash; Chaitra P B
Subject: [PATCH] mpt3sas: Fix panic when aer correct error occured

scsih_pci_mmio_enabled() is called only if scsih_pci_error_detected() returns PCI_ERS_RESULT_CAN_RECOVER. At that point reads and writes to the device still work, so there is no need to reset the slot. If a slot reset is requested anyway, mpt3sas_base_map_resources() in scsih_pci_slot_reset() will fail and iounmap ioc->chip, and we then hit a problem when ioc->chip is read in mpt3sas_base_get_iocstate() from _base_fault_reset_work().

Cc: Sathya Prakash
Cc: Chaitra P B
Cc: Suganath Prabu Subramani
Signed-off-by: Kefeng Wang
---
NOTE: I found this with an earlier kernel version, but the logic is not changed.

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6bff13e..eedd62e3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9033,8 +9033,11 @@ scsih_pci_mmio_enabled(struct pci_dev *pdev)

 	/* TODO - dump whatever for debugging purposes */

-	/* Request a slot reset. */
-	return PCI_ERS_RESULT_NEED_RESET;
+	/* This called only if scsih_pci_error_detected returns
+	 * PCI_ERS_RESULT_CAN_RECOVER, read/write to the device
+	 * still works, not need to reset slot.
+	 */
+	return PCI_ERS_RESULT_RECOVERED;
 }

 /*
-- 
1.7.12.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: DIF/DIX in mpt3sas
Zhang,

Thanks for your email. DIX is supported in mpt3sas, but it is not enabled by default. To enable DIX support, set the module parameter "prot_mask=0x7F" when loading the mpt3sas driver.

Thanks,
Chaitra

-Original Message-
From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Baoquan Zhang
Sent: Thursday, June 30, 2016 6:27 AM
To: linux-scsi@vger.kernel.org
Cc: Alireza Haghdoost; du; Raghu Raja Chandrasekar
Subject: DIF/DIX in mpt3sas

Hi there,

Thanks for reading my email. I am a Ph.D. student at the University of Minnesota, Twin Cities. I am currently working on a T10 PI-capable MD module. However, I have been unable to use DIX with the HBA/driver combination of LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02) / mpt3sas, even though I have formatted the drives with DIF protection type 1. May I know whether DIX (Data Integrity Extensions) is supported in the mpt3sas driver?

Thanks,
Baoquan Zhang

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
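As a reading aid for the prot_mask=0x7F value in the reply above, the sketch below splits a mask into its DIF and DIX parts. The bit positions are an assumption on my side: they follow my understanding of enum scsi_host_prot_capabilities in the kernel's include/scsi/scsi_host.h, and should be checked against your own tree. The DEMO_ names and helpers are illustrative only.

```c
#include <stdint.h>

/* Assumed bit layout, modelled on the SCSI host protection capability
 * flags (three DIF types, then four DIX types). Verify before relying
 * on it. */
enum demo_prot_caps {
	DEMO_DIF_TYPE1 = 1 << 0,
	DEMO_DIF_TYPE2 = 1 << 1,
	DEMO_DIF_TYPE3 = 1 << 2,
	DEMO_DIX_TYPE0 = 1 << 3,
	DEMO_DIX_TYPE1 = 1 << 4,
	DEMO_DIX_TYPE2 = 1 << 5,
	DEMO_DIX_TYPE3 = 1 << 6,
};

#define DEMO_DIF_ALL (DEMO_DIF_TYPE1 | DEMO_DIF_TYPE2 | DEMO_DIF_TYPE3)
#define DEMO_DIX_ALL (DEMO_DIX_TYPE0 | DEMO_DIX_TYPE1 | \
		      DEMO_DIX_TYPE2 | DEMO_DIX_TYPE3)

/* Return only the DIF-related bits of a protection mask. */
static uint32_t demo_dif_bits(uint32_t prot_mask)
{
	return prot_mask & DEMO_DIF_ALL;
}

/* Return only the DIX-related bits of a protection mask. */
static uint32_t demo_dix_bits(uint32_t prot_mask)
{
	return prot_mask & DEMO_DIX_ALL;
}
```

Under that assumed layout, 0x7F sets all seven bits, while a mask of 0x07 would carry DIF bits only and no DIX bits.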
Kernel panics while creating RAID volume on latest stable 4.6.2 kernel beacuse of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"
Hi, Try creating RAID volume on latest stable 4.6.2 kernel, as soon as the volume gets created kernel panics , below are the logs... Carried out same experimentation on 4.4.13 kernel, issue was not observed.After learning diff b/w 4.4.13 & 4.6.2 kernels "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" patch looks to be suspicious. commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293 And hence reverted above mentioned patch changes from 4.6.2 kernel and tried volume creation, volume created successfully and issue is not observed. >>Kernel panic logs: root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found sd 0:1:0:0: [sdw] Assuming drive cache: write through [ cut here ] kernel BUG at drivers/scsi/scsi_transport_sas.c:164! invalid opcode: [#1] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: speedstep_lib] CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE 4.6.2 #1 Hardware name: Dell Inc. 
PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014 Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas] task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000 RIP: 0010:[] [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP: 0018:8800c62cb8a8 EFLAGS: 00010282 RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4 RDX: 002b RSI: RDI: 8800b04ec800 RBP: 8800c62cb8a8 R08: R09: 0008 R10: R11: 0001 R12: 8800b04ec800 R13: R14: 8800b04ec998 R15: FS: () GS:88012f02() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ff600400 CR3: 01c06000 CR4: 000406e0 Stack: 8800c62cb8d8 a066bc62 8800b04ecc68 880128ee8000 8800c62cb938 a066bd5c 8800b04ecef8 81608333 8800b04ec800 8800b04ecc68 Call Trace: [] ses_match_to_enclosure+0x72/0x80 [ses] [] ses_intf_add+0xec/0x494 [ses] [] ? preempt_schedule_common+0x23/0x40 [] device_add+0x278/0x440 [] ? __pm_runtime_resume+0x6c/0x90 [] scsi_sysfs_add_sdev+0xee/0x2b0 [] scsi_add_lun+0x437/0x580 [] scsi_probe_and_add_lun+0x1bb/0x4e0 [] ? get_device+0x19/0x20 [] ? scsi_alloc_target+0x293/0x320 [] ? __pm_runtime_resume+0x6c/0x90 [] __scsi_add_device+0x10f/0x130 [] scsi_add_device+0x11/0x30 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas] [] _scsih_sas_ir_config_change_event+0xdb/0x210 [mpt3sas] [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas] [] ? pwq_dec_nr_in_flight+0x50/0xa0 [] _firmware_event_work+0x19/0x20 [mpt3sas] [] process_one_work+0x189/0x4e0 [] ? del_timer_sync+0x4c/0x60 [] ? maybe_create_worker+0x8e/0x110 [] ? schedule+0x40/0xb0 [] worker_thread+0x16d/0x520 [] ? default_wake_function+0x12/0x20 [] ? __wake_up_common+0x56/0x90 [] ? maybe_create_worker+0x110/0x110 [] ? schedule+0x40/0xb0 [] ? maybe_create_worker+0x110/0x110 [] kthread+0xcc/0xf0 [] ? schedule_tail+0x1e/0xc0 [] ret_from_fork+0x22/0x40 [] ? 
kthread_freezable_should_stop+0x70/0x70 Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b 40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 RIP [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP ---[ end trace c8c9da69e1dcb8a1 ]--- BUG: unable to handle kernel paging request at ffd8 IP: [] kthread_data+0x10/0x20 PGD 1c07067 PUD 1c09067 PMD 0 Oops: [#2] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mo
RE: mpt3sas: memory allocation for firmware upgrade DMA memory question
Johannes,

Could you please let us know which application is being used for the firmware upgrade? Is it your own customized application or one of the LSI-provided applications?

Our application uses a single contiguous memory buffer for ioctls, and hence the mpt3sas driver uses a single contiguous buffer for the DMA operation. If we use the GFP_KERNEL flag, the ioctl thread may hang or wait for a long time if it cannot get the required memory from the system.

We may need to test the patch below thoroughly, as I don't see allocation of several non-contiguous chunks of memory in it.

Thanks,
Chaitra

-Original Message-
From: Johannes Thumshirn [mailto:jthumsh...@suse.de]
Sent: Wednesday, May 25, 2016 2:49 PM
To: Chaitra P B; Suganath Prabu Subramani
Cc: Linux SCSI Mailinglist; Jeff Mahoney
Subject: mpt3sas: memory allocation for firmware upgrade DMA memory question

Hi Chaitra and Suganath,

I've got a question regarding mpt3sas' memory allocation used when doing a firmware upgrade. Currently you're doing a pci_alloc_consistent() which tries to allocate memory via GFP_ATOMIC. This memory then is passed as a single element on a sg_list. Jeff reported it returned -ENOMEM on his server due to highly fragmented memory.

Is it required to have the memory for the DMA operation contiguous, or can I just allocate several non-contiguous chunks of memory and map them to a sg_list? If not, is GFP_ATOMIC really needed? I've converted the driver to use GFP_KERNEL but I'm a bit reluctant to test the patch below on real hardware, so as not to brick the HBA.
Thanks,
Johannes

RFC patch for GFP_KERNEL allocation, though splitting into multiple sg mapped elements is the preferred fix here:

From 06e63654d887df7f740dc5abcb40d441a8da7fa5 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn
Date: Tue, 24 May 2016 17:25:59 +0200
Subject: [RFC PATCH] mpt3sas: Don't do atomic memory allocations for firmware update DMA

Currently mpt3sas uses pci_alloc_consistent() to allocate memory for the DMA used to do firmware updates. pci_alloc_consistent() in turn uses GFP_ATOMIC allocations. On a host with high memory fragmentation this can lead to page allocation failures, as the DMA buffer holds the complete firmware update and thus can need page allocations of higher orders.

As the firmware update code path may sleep, convert allocation to a normal kzalloc() call with GFP_KERNEL and map it to the DMA buffers.

Reported-by: Jeff Mahoney
Signed-off-by: Johannes Thumshirn
---
 drivers/scsi/mpt3sas/mpt3sas_ctl.c | 39 --
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
index 7d00f09..14be3cf 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -751,8 +751,7 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg,
 	/* obtain dma-able memory for data transfer */
 	if (data_out_sz) /* WRITE */ {
-		data_out = pci_alloc_consistent(ioc->pdev, data_out_sz,
-		    &data_out_dma);
+		data_out = kzalloc(data_out_sz, GFP_KERNEL);
 		if (!data_out) {
 			pr_err("failure at %s:%d/%s()!\n", __FILE__,
 			    __LINE__, __func__);
@@ -760,6 +759,14 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg,
 			mpt3sas_base_free_smid(ioc, smid);
 			goto out;
 		}
+		data_out_dma = pci_map_single(ioc->pdev, data_out,
+					      data_out_sz, PCI_DMA_TODEVICE);
+		if (dma_mapping_error(&ioc->pdev->dev, data_out_dma)) {
+			ret = -EINVAL;
+			mpt3sas_base_free_smid(ioc, smid);
+			goto out_free_data_out;
+		}
+
 		if (copy_from_user(data_out, karg.data_out_buf_ptr,
 			data_out_sz)) {
 			pr_err("failure at %s:%d/%s()!\n", __FILE__,
@@ -771,8 +778,7 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg,
 	}

 	if (data_in_sz) /* READ */ {
-		data_in = pci_alloc_consistent(ioc->pdev, data_in_sz,
-		    &data_in_dma);
+		data_in = kzalloc(data_in_sz, GFP_KERNEL);
 		if (!data_in) {
 			pr_err("failure at %s:%d/%s()!\n", __FILE__,
 			    __LINE__, __func__);
@@ -780,6 +786,13 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg,
 			mpt3sas_base_free_smid(ioc, smid);
 			goto out;
 		}
+		data_in_dma = pci_map_single(ioc->pdev, data_in,
+					     data_in_sz, PCI_DMA_FROMDEVICE);
+		if (dma_mapping_error(&ioc->pdev->dev, da
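
To make the "several non-contiguous chunks" alternative from the question concrete, here is a minimal user-space sketch. It is not driver code: `struct sg_chunk`, `sg_chunks_alloc()`, `sg_chunks_free()` and the 4096-byte `CHUNK_SIZE` are hypothetical names and values chosen for illustration, and `calloc()` stands in for a kernel allocator. It only shows how one large transfer could be covered by many small allocations instead of a single high-order one; whether the HBA firmware accepts such a list is exactly the open question in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096u /* assumed chunk size, standing in for one page */

/* Hypothetical stand-in for one scatter-gather element. */
struct sg_chunk {
	void *buf;	/* separately allocated, so chunks need not be adjacent */
	size_t len;	/* bytes of the transfer covered by this chunk */
};

/* Number of chunks needed to cover `total` bytes (rounds up). */
static size_t sg_chunk_count(size_t total)
{
	return (total + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Free the first `n` chunks of a list, then the list itself. */
static void sg_chunks_free(struct sg_chunk *list, size_t n)
{
	if (!list)
		return;
	for (size_t i = 0; i < n; i++)
		free(list[i].buf);
	free(list);
}

/*
 * Cover `total` bytes with independently allocated chunks.
 * Returns the list (NULL on allocation failure) and stores the
 * chunk count in *n_out.
 */
static struct sg_chunk *sg_chunks_alloc(size_t total, size_t *n_out)
{
	assert(n_out != NULL);

	size_t n = sg_chunk_count(total);
	struct sg_chunk *list = calloc(n ? n : 1, sizeof(*list));

	if (!list)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		size_t remaining = total - i * CHUNK_SIZE;

		list[i].len = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
		list[i].buf = calloc(1, list[i].len);
		if (!list[i].buf) {
			/* Undo the chunks allocated so far. */
			sg_chunks_free(list, i);
			return NULL;
		}
	}

	*n_out = n;
	return list;
}
```

Each request is at most `CHUNK_SIZE` bytes, so it does not depend on the allocator finding a large contiguous region, which is the failure Jeff reported.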
RE: [patch 2/2] mpt3sas: clean up indenting a bit
Hi Dan,

The patch below doesn't apply cleanly on the latest upstream kernel; we are seeing hunk failures.

Thanks,
Chaitra

-----Original Message-----
From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
Sent: Friday, May 13, 2016 2:09 AM
To: Sathya Prakash
Cc: Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; kernel-janit...@vger.kernel.org
Subject: [patch 2/2] mpt3sas: clean up indenting a bit

The indenting is slightly off in parts of this file so I have tidied it.

Signed-off-by: Dan Carpenter

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6bff13e..34d6f996 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -1865,7 +1865,7 @@ scsih_slave_configure(struct scsi_device *sdev)
 ds = "SSP";
 } else {
 qdepth = MPT3SAS_SATA_QUEUE_DEPTH;
-if (raid_device->device_info &
+ if (raid_device->device_info &
 MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
 ds = "SATA";
 else
@@ -3497,21 +3497,21 @@ _scsih_issue_delayed_event_ack(struct MPT3SAS_ADAPTER *ioc, u16 smid, u16 event,
 void
 _scsih_issue_delayed_sas_io_unit_ctrl(struct MPT3SAS_ADAPTER *ioc,
 u16 smid, u16 handle)
- {
- Mpi2SasIoUnitControlRequest_t *mpi_request;
- u32 ioc_state;
- int i = smid - ioc->internal_smid;
- unsigned long flags;
+{
+ Mpi2SasIoUnitControlRequest_t *mpi_request;
+ u32 ioc_state;
+ int i = smid - ioc->internal_smid;
+ unsigned long flags;

- if (ioc->remove_host) {
- dewtprintk(ioc, pr_info(MPT3SAS_FMT
- "%s: host has been removed\n",
-__func__, ioc->name));
- return;
- } else if (ioc->pci_error_recovery) {
- dewtprintk(ioc, pr_info(MPT3SAS_FMT
- "%s: host in pci error recovery\n",
- __func__, ioc->name));
+ if (ioc->remove_host) {
+ dewtprintk(ioc, pr_info(MPT3SAS_FMT
+ "%s: host has been removed\n",
+__func__, ioc->name));
+ return;
+ } else if (ioc->pci_error_recovery) {
+ dewtprintk(ioc, pr_info(MPT3SAS_FMT
+ "%s: host in pci error recovery\n",
+ __func__, ioc->name));
 return;
 }
 ioc_state = mpt3sas_base_get_iocstate(ioc, 1);
@@ -5173,7 +5173,7 @@ _scsih_expander_add(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 }

 _scsih_expander_node_add(ioc, sas_expander);
-return 0;
+ return 0;

 out_fail:

@@ -7774,9 +7774,9 @@ _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 break;
 case MPT3SAS_PORT_ENABLE_COMPLETE:
 ioc->start_scan = 0;
- if (missing_delay[0] != -1 && missing_delay[1] != -1)
+ if (missing_delay[0] != -1 && missing_delay[1] != -1)
 mpt3sas_base_update_missing_delay(ioc, missing_delay[0],
- missing_delay[1]);
+ missing_delay[1]);
 dewtprintk(ioc, pr_info(MPT3SAS_FMT
 "port enable: complete from worker thread\n",
 ioc->name));

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [patch 1/2] mpt3sas: add missing curly braces
Hi,

Please consider this patch as
Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
Sent: Friday, May 13, 2016 2:08 AM
To: Sathya Prakash; Chaitra P B
Cc: Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; kernel-janit...@vger.kernel.org
Subject: [patch 1/2] mpt3sas: add missing curly braces

There are some missing curly braces on this if statement, so we end up printing when we shouldn't.

Fixes: a470a51cd648 ('mpt3sas: Handle active cable exception event')
Signed-off-by: Dan Carpenter

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6a4df5a..6bff13e 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -7975,13 +7975,14 @@ mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index,
 		ActiveCableEventData =
 		    (Mpi26EventDataActiveCableExcept_t *) mpi_reply->EventData;
 		if (ActiveCableEventData->ReasonCode ==
-				MPI26_EVENT_ACTIVE_CABLE_INSUFFICIENT_POWER)
+				MPI26_EVENT_ACTIVE_CABLE_INSUFFICIENT_POWER) {
 			pr_info(MPT3SAS_FMT "Currently an active cable with ReceptacleID %d",
 			    ioc->name, ActiveCableEventData->ReceptacleID);
 			pr_info("cannot be powered and devices connected to this active cable");
 			pr_info("will not be seen. This active cable");
 			pr_info("requires %d mW of power",
 			    ActiveCableEventData->ActiveCablePowerRequirement);
+		}
 		break;
 	default: /* ignore the rest */
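
The bug fixed by this patch is a general C pitfall: without braces, an `if` guards only the single statement that follows it. The sketch below is a small user-space model, not driver code; the function names are made up for illustration and a counter stands in for the `pr_info()` calls.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the code before the patch: only the first "print" is
 * guarded by the condition, the remaining ones always run.
 */
static int prints_without_braces(bool insufficient_power)
{
	int printed = 0;

	if (insufficient_power)
		printed++;	/* the only statement the if controls */
	printed++;		/* runs unconditionally */
	printed++;		/* runs unconditionally */

	return printed;
}

/* Model of the code after the patch: the braces guard all three. */
static int prints_with_braces(bool insufficient_power)
{
	int printed = 0;

	if (insufficient_power) {
		printed++;
		printed++;
		printed++;
	}

	return printed;
}

/* Small self-check showing the difference between the two shapes. */
void check_brace_examples(void)
{
	assert(prints_without_braces(false) == 2); /* prints when it shouldn't */
	assert(prints_with_braces(false) == 0);
	assert(prints_without_braces(true) == prints_with_braces(true));
}
```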
RE: [PATCH] mpt3sas - remove unused fw_event_work delayed_work
Joe,

The patch mentioned below is an older one. I checked both the latest code and the code from before the mpt2sas/mpt3sas merge and did not find its changes in either. So I am searching for the patch that removed the functionality/changes of the patch below. I shall get back to you on this by Monday.

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Friday, April 15, 2016 8:13 AM
To: Joe Lawrence
Cc: Chaitra Basappa; Sathya Prakash Veerichetty; emi...@redhat.com; linux-scsi@vger.kernel.org; Suganath Prabu Subramani; Calvin Owens
Subject: Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

>>>>> "Joe" == Joe Lawrence writes:

Joe> Do we know why f1c35e6aea579 "mpt2sas: RESCAN Barrier work is added
Joe> in case of HBA reset" was unneeded for the mpt3 version? If that
Joe> is interesting, that info could be added to v2 commit message as
Joe> well.

Chaitra?

--
Martin K. Petersen	Oracle Linux Engineering
RE: [PATCH] mpt3sas - remove unused fw_event_work delayed_work
Hi,

Please consider this patch as
Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Sathya Prakash [mailto:sathya.prak...@broadcom.com]
Sent: Saturday, April 02, 2016 1:45 AM
To: emi...@redhat.com; Joe Lawrence
Cc: linux-scsi@vger.kernel.org; Chaitra Basappa; Suganath Prabu Subramani; Calvin Owens
Subject: RE: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

We will look into this early next week and provide a detailed response. At first look this is an ACK from Broadcom; we will reconfirm.

-----Original Message-----
From: Ewan D. Milne [mailto:emi...@redhat.com]
Sent: Friday, April 01, 2016 2:04 PM
To: Joe Lawrence
Cc: linux-scsi@vger.kernel.org; Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; Calvin Owens
Subject: Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

On Fri, 2016-04-01 at 15:13 -0400, Joe Lawrence wrote:
> On 04/01/2016 02:51 PM, Ewan D. Milne wrote:
> > On Fri, 2016-04-01 at 13:56 -0400, Joe Lawrence wrote:
> >> @@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
> >>  		/*
> >>  		 * Wait on the fw_event to complete. If this returns 1, then
> >>  		 * the event was never executed, and we need a put for the
> >> -		 * reference the delayed_work had on the fw_event.
> >> +		 * reference the work had on the fw_event.
> >>  		 *
> >>  		 * If it did execute, we wait for it to finish, and the put will
> >>  		 * happen from _firmware_event_work()
> >>  		 */
> >> -		if (cancel_delayed_work_sync(&fw_event->delayed_work))
> >> +		if (cancel_work_sync(&fw_event->work))
> >>  			fw_event_work_put(fw_event);
> >>
> >>  		fw_event_work_put(fw_event);
> >
> > Hmm... Fixes: 146b16c8 (mpt3sas: Refcount fw_events and fix unsafe
> > list usage)
>
> This could technically go back to f92363d12359 (mpt3sas: add new
> driver supporting 12GB SAS) ... but will probably only apply cleanly
> to _scsih_fw_event_cleanup_queue after 146b16c8 (mpt3sas: Refcount
> fw_events and fix unsafe list usage), so you're right.
>
> > Since mpt3sas uses ->work instead of _delayed_work this would seem
> > to be correct, however the deprecated mpt2sas driver had a commit
> > that changed the firmware event work mechanism to use ->delayed_work
> > instead of ->work:
> >
> > commit f1c35e6aea579d5bdb6dc02dfa99c67c7c3b3f67
> > Author: Kashyap, Desai
> > Date: Tue Mar 9 16:31:43 2010 +0530
>
> Okay, so this is pre-mpt3sas split.
>
> > [SCSI] mpt2sas: RESCAN Barrier work is added in case of HBA reset.
> >
> > Add the cancel_pending_work flag from the fw_event_work structure, and then to
> > set the flag during host reset, check the flag later from work threads
> > context and if cancel_pending_work_flag is set ingore those events.
>
> More unused mpt2 vestiges in the mpt3 version?
>
> % cd drivers/scsi/mpt3sas/
> % grep 'cancel_pending_work' *.{c,h}
> mpt3sas_scsih.c: * @cancel_pending_work: flag set during reset handling
> mpt3sas_scsih.c:u8 cancel_pending_work;
>
> > Now Rescan after host reset is changed.
> > Added special task MPT2SAS_RESCAN_AFTER_HOST_RESET. This task will be queued
> > at the time of HBA reset. this task is treated as barrier. All work after
> > MPT2SAS_RESCAN_AFTER_HOST_RESET will be treated as new work and will be
> > server by callback handle. If host_recovery is going on while running RESCAN
> > task, it will wait for shos_recovery_done completion which will be called
> > from HBA reset DONE context.
>
> FWIW, I don't see anything like this in today's mpt3sas driver.

Well, that's the question. Is there some functionality missing? Were the changes abandoned/replaced?

mpt2sas used delayed_work for something else, so maybe that's why the firmware event changes initially used it (albeit with a 0 delay) but it's hard to know...

cancel_delayed_work() will call del_timer() on delayed_work->timer, but it looks like kzalloc is used to allocate the fw_event_work objects so perhaps nothing bad will happen.

I was wondering, though, because I have seen dumps of hung systems with requests that should have timed out but are not on any timer list.

-Ewan

>
> Regards,
>
> -- Joe
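
The comment in the quoted hunk describes a reference-counting rule: if cancelling the work succeeds, the work never ran, so the cleanup path must drop the reference the work item was holding. The following single-threaded user-space model sketches that rule. Every name in it (`struct fw_event_sketch`, `ev_queue()`, `ev_cancel()` and so on) is hypothetical; it does not use the kernel workqueue API, and it sets a flag where a real driver would free the object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted firmware event. */
struct fw_event_sketch {
	int refcount;	/* outstanding references */
	bool pending;	/* queued but not yet executed */
	bool released;	/* set when the last reference is dropped */
};

static void ev_get(struct fw_event_sketch *ev)
{
	ev->refcount++;
}

static void ev_put(struct fw_event_sketch *ev)
{
	assert(ev->refcount > 0);
	if (--ev->refcount == 0)
		ev->released = true;	/* a real driver would free here */
}

/* Queueing hands one reference to the work item. */
static void ev_queue(struct fw_event_sketch *ev)
{
	ev_get(ev);
	ev->pending = true;
}

/* The worker drops the work item's reference when it finishes. */
static void ev_run(struct fw_event_sketch *ev)
{
	ev->pending = false;
	ev_put(ev);
}

/* Returns true if the work was still pending and is now cancelled. */
static bool ev_cancel(struct fw_event_sketch *ev)
{
	bool was_pending = ev->pending;

	ev->pending = false;
	return was_pending;
}

/* Cleanup path; the caller owns one reference (the list's). */
static void ev_cleanup(struct fw_event_sketch *ev)
{
	if (ev_cancel(ev))
		ev_put(ev);	/* work never ran: drop its reference for it */
	ev_put(ev);		/* drop the list's reference */
}
```

In both orderings (cancelled before running, or run before cleanup) the count reaches zero exactly once, which is the property the two `fw_event_work_put()` calls in the hunk rely on.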
RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization
Hi,

Please consider this patch as
Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

> "Calvin" == Calvin Owens writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review. Thanks!

--
Martin K. Petersen	Oracle Linux Engineering
RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization
Martin,

This patch is being reviewed; we shall get back with review comments by tomorrow.

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

> "Calvin" == Calvin Owens writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review. Thanks!

--
Martin K. Petersen	Oracle Linux Engineering
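
Calvin's description is of an off-by-one read: a loop that fetches the element for the next iteration at its bottom touches one index past the end on the final pass. The model below illustrates only that shape. It is a user-space sketch with hypothetical function names, and it records which indices would be read instead of performing the out-of-bounds access; it is not the code from `_base_make_ioc_operational()`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * "Fetch at the bottom" shape: the first element is read before the
 * loop and the next one at the end of every iteration, including the
 * last. Returns the highest index that would be read.
 */
static size_t max_index_fetch_at_bottom(size_t n_queues)
{
	size_t index = 0;
	size_t max_read = index;	/* initial fetch before the loop */

	for (size_t q = 0; q < n_queues; q++) {
		/* ... use the element fetched earlier ... */
		index++;
		max_read = index;	/* fetch for a next pass that may not exist */
	}

	return max_read;
}

/*
 * "Fetch at the top" shape: each iteration reads only the element
 * it is about to use.
 */
static size_t max_index_fetch_at_top(size_t n_queues)
{
	size_t max_read = 0;

	for (size_t q = 0; q < n_queues; q++)
		max_read = q;

	return max_read;
}

/* Self-check: with 4 entries the valid indices are 0..3. */
void check_fetch_examples(void)
{
	assert(max_index_fetch_at_bottom(4) == 4);	/* one past the end */
	assert(max_index_fetch_at_top(4) == 3);		/* stays in bounds */
}
```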
RE: [PATCH] mpt3sas: a correction in unmap_resources
Hi,

Please consider this patch as
Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Tomas Henzl [mailto:the...@redhat.com]
Sent: Wednesday, December 23, 2015 6:52 PM
To: linux-scsi@vger.kernel.org
Cc: kashyap.de...@avagotech.com; chaitra.basa...@avagotech.com; mlomb...@redhat.com
Subject: [PATCH] mpt3sas: a correction in unmap_resources

It might happen that we try to free an already freed pointer.

Tomas

Reported-by: Maurizio Lombardi
Signed-off-by: Tomas Henzl
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index d4f1dcdb83..3b09b3d09f 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1827,8 +1827,10 @@ mpt3sas_base_unmap_resources(struct MPT3SAS_ADAPTER *ioc)
 	_base_free_irq(ioc);
 	_base_disable_msix(ioc);

-	if (ioc->msix96_vector)
+	if (ioc->msix96_vector) {
 		kfree(ioc->replyPostRegisterIndex);
+		ioc->replyPostRegisterIndex = NULL;
+	}

 	if (ioc->chip_phys) {
 		iounmap(ioc->chip);
--
2.4.3
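
The fix follows the common "free, then clear the pointer" idiom, which makes a teardown routine safe to call more than once. A minimal user-space sketch, assuming a made-up `struct adapter_sketch` and using `free()` in place of `kfree()` (both accept a NULL pointer as a no-op):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the adapter structure. */
struct adapter_sketch {
	int *reply_post_index;	/* heap buffer released on teardown */
};

/*
 * Teardown that may run more than once. Clearing the pointer after
 * freeing turns any later call into free(NULL), which does nothing,
 * instead of a double free.
 */
static void unmap_resources_sketch(struct adapter_sketch *a)
{
	assert(a != NULL);

	free(a->reply_post_index);
	a->reply_post_index = NULL;
}
```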
RE: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage
From: Sreekanth Reddy [mailto:sreekanth.re...@avagotech.com]
Sent: Tuesday, September 08, 2015 5:26 PM
To: Nicholas A. Bellinger
Cc: linux-scsi; linux-kernel; James Bottomley; Calvin Owens; Christoph Hellwig; MPT-FusionLinux.pdl; kernel-team; Nicholas Bellinger; Chaitra Basappa
Subject: Re: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage

On Sun, Aug 30, 2015 at 1:24 PM, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger
>
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors
> _scsih_probe_sas() to use the sas_device_list in a safe way.
>
> This patch is a port of Calvin's PATCH-v4 for mpt2sas code, atop
> mpt3sas changes in scsi.git/for-next.
>
> Cc: Calvin Owens
> Cc: Christoph Hellwig
> Cc: Sreekanth Reddy
> Cc: MPT-FusionLinux.pdl
> Signed-off-by: Nicholas Bellinger
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h      |  25 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c     | 479 +--
>  drivers/scsi/mpt3sas/mpt3sas_transport.c |  18 +-
>  3 files changed, 364 insertions(+), 158 deletions(-)
>
> @@ -2763,7 +2874,7 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  	struct scsi_device *sdev;
>  	struct _sas_device *sas_device;
>

[Sreekanth] Here the sas_device_lock spin lock needs to be acquired before calling the __mpt3sas_get_sdev_by_addr() function.

[Chaitra] Here, calling mpt3sas_get_sdev_by_handle() instead of __mpt3sas_get_sdev_by_handle() fixes the "invalid page access" type of kernel panic.

> -	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +	sas_device = __mpt3sas_get_sdev_by_handle(ioc, handle);
>  	if (!sas_device)
>  		return;
>
> @@ -2779,6 +2890,8 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  			continue;
>  		_scsih_internal_device_block(sdev, sas_device_priv_data);
>  	}
> +
> +	sas_device_put(sas_device);
>  }
>

Regards,
Chaitra
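
The two review comments point at a common convention: a lookup helper that expects the caller to hold the list lock, paired with a wrapper that takes the lock itself and returns the object with a reference held. The sketch below models that convention in single-threaded user-space C. It assumes the non-underscore helper in the patch is such a locking wrapper, which the thread implies but does not show. All names are hypothetical, and a boolean with assertions stands in for `sas_device_lock`, so it demonstrates the calling rule only, not real synchronisation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device entry. */
struct dev_sketch {
	unsigned int handle;
	int refcount;
};

struct dev_table {
	bool lock_held;		/* models the list lock; a driver uses a spinlock */
	struct dev_sketch devs[4];
	size_t count;
};

static void table_lock(struct dev_table *t)
{
	assert(!t->lock_held);
	t->lock_held = true;
}

static void table_unlock(struct dev_table *t)
{
	assert(t->lock_held);
	t->lock_held = false;
}

/*
 * Caller-must-hold-lock variant: walks the table and takes a
 * reference on the match so it cannot go away after unlock.
 */
static struct dev_sketch *lookup_locked(struct dev_table *t, unsigned int handle)
{
	assert(t->lock_held);	/* the rule the first review comment states */

	for (size_t i = 0; i < t->count; i++) {
		if (t->devs[i].handle == handle) {
			t->devs[i].refcount++;
			return &t->devs[i];
		}
	}
	return NULL;
}

/* Self-locking wrapper: safe to call without holding the lock. */
static struct dev_sketch *lookup(struct dev_table *t, unsigned int handle)
{
	struct dev_sketch *dev;

	table_lock(t);
	dev = lookup_locked(t, handle);
	table_unlock(t);
	return dev;
}

/* Drop a reference obtained from either lookup variant. */
static void dev_put(struct dev_sketch *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}
```

A caller that is not already inside the locked region would use `lookup()` and later `dev_put()`, mirroring the `sas_device_put()` added at the end of the quoted function.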