smp-induced oops/NULL pointer dereference in mpt3sas, from kernel >= 4.11
Hi All, In testing kernel 4.11.1 and 4.11.6 we've hit an oops/ blown pointer issue in mpt3sas. It is easily reproducible on a system that contains expanders/enclosure connected behind SAS3 HBA. Soon after connecting expander / enclosure we observe below call trace. Jul 12 15:28:27 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 00dc Jul 12 15:28:27 localhost kernel: IP: _transport_smp_handler+0x8bb/0x10c0 [mpt3sas] Jul 12 15:28:27 localhost kernel: PGD 811abb067 Jul 12 15:28:27 localhost kernel: PUD 81c96a067 Jul 12 15:28:27 localhost kernel: PMD 0 Jul 12 15:28:27 localhost kernel: Jul 12 15:28:27 localhost kernel: Oops: 0002 [#1] SMP Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: Discovery: (stop) Jul 12 15:28:27 localhost kernel: Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: discovery event: (stop) Jul 12 15:28:27 localhost kernel: Jul 12 15:28:27 localhost kernel: Hardware name: Dell Inc. PowerEdge T620/0658N7, BIOS 2.5.4 01/22/2016 Jul 12 15:28:27 localhost kernel: task: 88081c1b8100 task.stack: c90006168000 Jul 12 15:28:27 localhost kernel: RIP: 0010:_transport_smp_handler+0x8bb/0x10c0 [mpt3sas] Jul 12 15:28:27 localhost kernel: RSP: 0018:c9000616bb38 EFLAGS: 00010286 Jul 12 15:28:27 localhost kernel: RAX: 00dc RBX: 88041c2ba7b0 RCX: 003c1a07ff00 Jul 12 15:28:27 localhost kernel: RDX: 88081a45c948 RSI: dead0200 RDI: dead0100 Jul 12 15:28:27 localhost kernel: RBP: c9000616bbf8 R08: c9000616bac0 R09: dead0200 Jul 12 15:28:27 localhost kernel: R10: R11: 0010 R12: 0105 Jul 12 15:28:27 localhost kernel: R13: 88041d631680 R14: 88041a6c6c38 R15: 0001 Jul 12 15:28:27 localhost kernel: FS: 7f1818ad1700() GS:88042f80() knlGS: Jul 12 15:28:27 localhost kernel: CS: 0010 DS: ES: CR0: 80050033 Jul 12 15:28:27 localhost kernel: CR2: 00dc CR3: 00081dad1000 CR4: 000406f0 Jul 12 15:28:27 localhost kernel: Call Trace: Jul 12 15:28:27 localhost kernel: ? blk_rq_bio_prep+0x3c/0x80 Jul 12 15:28:27 localhost kernel: ? 
blk_start_request+0x38/0x60 Jul 12 15:28:27 localhost kernel: sas_smp_request+0x5f/0xa0 [scsi_transport_sas] Jul 12 15:28:27 localhost kernel: sas_non_host_smp_request+0x4a/0x60 [scsi_transport_sas] Jul 12 15:28:27 localhost kernel: __blk_run_queue+0x37/0x50 Jul 12 15:28:27 localhost kernel: blk_execute_rq_nowait+0xeb/0x140 Jul 12 15:28:27 localhost kernel: blk_execute_rq+0x48/0x90 Jul 12 15:28:27 localhost kernel: bsg_ioctl+0x18a/0x1e0 Jul 12 15:28:27 localhost kernel: vfs_ioctl+0x18/0x30 Jul 12 15:28:27 localhost kernel: do_vfs_ioctl+0x14b/0x3f0 Jul 12 15:28:27 localhost kernel: ? security_file_ioctl+0x45/0x60 Jul 12 15:28:27 localhost kernel: SyS_ioctl+0x92/0xa0 Jul 12 15:28:27 localhost kernel: do_syscall_64+0x6c/0x160 Jul 12 15:28:27 localhost kernel: entry_SYSCALL64_slow_path+0x25/0x25 Jul 12 15:28:27 localhost kernel: RIP: 0033:0x35a88e0a77 Jul 12 15:28:27 localhost kernel: RSP: 002b:7ffded06f278 EFLAGS: 0246 ORIG_RAX: 0010 Jul 12 15:28:27 localhost kernel: RAX: ffda RBX: 7ffded06f370 RCX: 0035a88e0a77 Jul 12 15:28:27 localhost kernel: RDX: 7ffded06f280 RSI: 2285 RDI: 0003 Jul 12 15:28:27 localhost kernel: RBP: R08: 03ea R09: 8000 Jul 12 15:28:27 localhost kernel: R10: fff0 R11: 0246 R12: 0003 Jul 12 15:28:27 localhost kernel: R13: R14: 7ffded06f3a0 R15: Jul 12 15:28:27 localhost kernel: Code: 84 3e 02 00 00 48 8b 5d a8 85 d2 4c 8b ab f8 02 00 00 0f 85 e3 05 00 00 48 8b 55 98 49 8b 4d 00 48 81 c2 48 01 00 00 48 8b 42 28 <48> 89 08 49 8b 4d 08 48 89 48 08 49 8b 4d 10 48 89 48 10 41 8b Jul 12 15:28:27 localhost kernel: RIP: _transport_smp_handler+0x8bb/0x10c0 [mpt3sas] RSP: c9000616bb38 Jul 12 15:28:27 localhost kernel: CR2: 00dc Jul 12 15:28:27 localhost kernel: ---[ end trace d0a22e0e5a84886a ]--- Jul 12 15:28:28 localhost kernel: ses 4:0:0:0: Attached Enclosure device We analyzed this issue and could figure out it is not because of driver, its because the "sense" field of the 'struct scsi_request' is not being populated properly from the upper layer. 
For kernel versions >= 4.11, our driver references this "sense" member through 'struct scsi_request', as shown in the snippet below; for kernel versions < 4.11 the "sense" member was referenced via 'struct request'.

static int _transport_smp_handler(...)
{
	...
>>	memcpy(scsi_req(req)->sense, mpi_reply, sizeof(*mpi_reply));
	...
}

Hence the NULL pointer dereference call trace is seen at the above chunk of mpt3sas code. This needs to be addressed in the upper layer, so please help us in getting this resolved.

Thanks in advance for the support,
Regards,
Chaitra
RE: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Greg Edwards [mailto:gedwa...@fireweed.org] Sent: Saturday, July 30, 2016 9:36 PM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; Greg Edwards Subject: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards mpt3sas crashes on resume after suspend with WarpDrive flash cards. The reply_post_host_index array is not set back up after the resume, and we deference a stale pointer in _base_interrupt(). [ 47.309711] BUG: unable to handle kernel paging request at c90001f8006c [ 47.318289] IP: [] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.326749] PGD 41ccaa067 PUD 41ccab067 PMD 3466c067 PTE 0 [ 47.333848] Oops: 0002 [#1] SMP ... [ 47.452708] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0 #6 [ 47.460506] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18 09/24/2013 [ 47.469629] task: 81c0d500 ti: 81c0 task.ti: 81c0 [ 47.479112] RIP: 0010:[] [] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.490466] RSP: 0018:88041d203e30 EFLAGS: 00010002 [ 47.497801] RAX: 0001 RBX: 880033f4c000 RCX: 0001 [ 47.506973] RDX: c90001f8006c RSI: 0082 RDI: 0082 [ 47.516141] RBP: 88041d203eb0 R08: 8804118e2820 R09: 0001 [ 47.525300] R10: 0001 R11: 100c R12: [ 47.534457] R13: 880412c487e0 R14: 88041a8987d8 R15: 0001 [ 47.543632] FS: () GS:88041d20() knlGS: [ 47.553796] CS: 0010 DS: ES: CR0: 80050033 [ 47.561632] CR2: c90001f8006c CR3: 01c06000 CR4: 000406f0 [ 47.570883] Stack: [ 47.575015] 1d211228 88041d2100c0 8800c47d8130 0100 [ 47.584625] 8804100c 100c 88041a8992a0 88041a8987f8 [ 47.594230] 88041d203e00 8e55 038c 880414ad4280 [ 47.603862] Call Trace: [ 47.608474] [ 47.610413] [] ? 
call_timer_fn+0x35/0x120 [ 47.620539] [] handle_irq_event_percpu+0x7f/0x1c0 [ 47.629061] [] handle_irq_event+0x2c/0x50 [ 47.636859] [] handle_edge_irq+0x6f/0x130 [ 47.644654] [] handle_irq+0x73/0x120 [ 47.652011] [] ? atomic_notifier_call_chain+0x1a/0x20 [ 47.660854] [] do_IRQ+0x4b/0xd0 [ 47.66] [] common_interrupt+0x8c/0x8c [ 47.675635] Move the reply_post_host_index array setup into mpt3sas_base_map_resources(), which is also in the resume path. Cc: sta...@vger.kernel.org Signed-off-by: Greg Edwards --- drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 751f13e..750f82c 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -2188,6 +2188,17 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc) } else ioc->msix96_vector = 0; + if (ioc->is_warpdrive) { + ioc->reply_post_host_index[0] = (resource_size_t __iomem *) + &ioc->chip->ReplyPostHostIndex; + + for (i = 1; i < ioc->cpu_msix_table_sz; i++) + ioc->reply_post_host_index[i] = + (resource_size_t __iomem *) + ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1) + * 4))); + } + list_for_each_entry(reply_q, &ioc->reply_queue_list, list) pr_info(MPT3SAS_FMT "%s: IRQ %d\n", reply_q->name, ((ioc->msix_enable) ? "PCI-MSI-X enabled" : @@ -5280,17 +5291,6 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc) if (r) goto out_free_resources; - if (ioc->is_warpdrive) { - ioc->reply_post_host_index[0] = (resource_size_t __iomem *) - &ioc->chip->ReplyPostHostIndex; - - for (i = 1; i < ioc->cpu_msix_table_sz; i++) - ioc->reply_post_host_index[i] = - (resource_size_t __iomem *) - ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1) - * 4))); - } - pci_set_drvdata(ioc->pdev, ioc->shost); r = _base_get_ioc_facts(ioc, CAN_SLEEP); if (r) -- 2.7.4
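The address arithmetic that the patch moves into mpt3sas_base_map_resources() can be sketched in plain C. This is only a userspace illustration of the offset calculation for indexes 1 and up (index 0 uses the ReplyPostHostIndex register instead); DEMO_REPLY_POST_BASE and reply_post_offset() are hypothetical names chosen for this sketch, not driver symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Offset taken from the patch: entries for i >= 1 start 0x4000 bytes
 * past the Doorbell register. The macro name is made up for this sketch. */
#define DEMO_REPLY_POST_BASE 0x4000u

/* Byte offset from the Doorbell register for table index i (i >= 1).
 * Mirrors the expression (0x4000 + ((i - 1) * 4)) in the patch. */
static size_t reply_post_offset(unsigned int i)
{
	assert(i >= 1);	/* index 0 is handled separately in the patch */
	return DEMO_REPLY_POST_BASE + (size_t)(i - 1) * 4;
}
```

Because the table holds pointers derived from the mapped chip registers, it presumably has to be rebuilt whenever the registers are remapped, which is why the patch places the setup on the path shared by attach and resume.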
RE: [PATCH] mpt3sas: Don't spam logs if logging level is 0
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Johannes Thumshirn Sent: Wednesday, August 03, 2016 6:30 PM To: Martin K . Petersen; James Bottomley Cc: Linux SCSI Mailinglist; Linux Kernel Mailinglist; Sreekanth Reddy; Johannes Thumshirn Subject: [PATCH] mpt3sas: Don't spam logs if logging level is 0 In _scsih_io_done() we test if the ioc->logging_level does _not_ have the MPT_DEBUG_REPLY bit set and if it hasn't we print the debug messages. This unfortunately is the wrong way around. Note, the actual bug is older than af0094115 but this commit removed the CONFIG_SCSI_MPT3SAS_LOGGING Kconfig option which hid the bug. Fixes: af0094115 'mpt2sas, mpt3sas: Remove SCSI_MPTXSAS_LOGGING entry from Kconfig' Signed-off-by: Johannes Thumshirn --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 4a1cc85..a138690 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -4693,7 +4693,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply) le16_to_cpu(mpi_reply->DevHandle)); mpt3sas_trigger_scsi(ioc, data.skey, data.asc, data.ascq); - if (!(ioc->logging_level & MPT_DEBUG_REPLY) && + if ((ioc->logging_level & MPT_DEBUG_REPLY) && ((scmd->sense_buffer[2] == UNIT_ATTENTION) || (scmd->sense_buffer[2] == MEDIUM_ERROR) || (scmd->sense_buffer[2] == HARDWARE_ERROR))) -- 1.8.5.6
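The inverted test described in the commit message can be reproduced in a few lines of standalone C. This is a minimal sketch, not driver code: DEMO_DEBUG_REPLY is an illustrative bit value (the real MPT_DEBUG_REPLY is defined in the driver headers), and the three function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit only; the real MPT_DEBUG_REPLY lives in the driver. */
#define DEMO_DEBUG_REPLY 0x0002u

/* Pre-patch predicate: true when the debug bit is NOT set, so the
 * messages are printed precisely when logging is disabled. */
static bool buggy_should_log(uint32_t logging_level)
{
	return !(logging_level & DEMO_DEBUG_REPLY);
}

/* Post-patch predicate: true only when the debug bit is set. */
static bool fixed_should_log(uint32_t logging_level)
{
	return (logging_level & DEMO_DEBUG_REPLY) != 0;
}

/* Small self-check contrasting the two predicates. */
static void check_logging_predicates(void)
{
	assert(buggy_should_log(0));			/* spams at level 0 */
	assert(!fixed_should_log(0));			/* quiet at level 0 */
	assert(fixed_should_log(DEMO_DEBUG_REPLY));	/* logs when asked */
}
```

With logging_level equal to 0, the pre-patch predicate evaluates to true, which matches the log spam named in the subject line.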
RE: [PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Thursday, July 28, 2016 10:16 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated We blindly trust the hardware to give us NUL-terminated strings, which is a bad idea because it doesn't always do that. For example: [ 481.184784] mpt3sas_cm0: enclosure level(0x), connector name( \x3) In this case, connector_name is four spaces. We got lucky here because the 2nd byte beyond our character array happens to be a NUL. Fix this by explicitly writing '\0' to the end of the string to ensure we don't run off the edge of the world in printk(). Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.h | 2 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 10 ++ 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 892c9be..eb7f5b0 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -478,7 +478,7 @@ struct _sas_device { u8 pfa_led_on; u8 pend_sas_rphy_add; u8 enclosure_level; - u8 connector_name[4]; + u8 connector_name[5]; struct kref refcount; }; diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index cd91a68..acabe48 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -5380,8 +5380,9 @@ _scsih_check_device(struct MPT3SAS_ADAPTER *ioc, MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) { sas_device->enclosure_level = le16_to_cpu(sas_device_pg0.EnclosureLevel); - memcpy(&sas_device->connector_name[0], - &sas_device_pg0.ConnectorName[0], 4); + memcpy(sas_device->connector_name, + 
sas_device_pg0.ConnectorName, 4); + sas_device->connector_name[4] = '\0'; } else { sas_device->enclosure_level = 0; sas_device->connector_name[0] = '\0'; @@ -5508,8 +5509,9 @@ _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle, u8 phy_num, if (sas_device_pg0.Flags & MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) { sas_device->enclosure_level = le16_to_cpu(sas_device_pg0.EnclosureLevel); - memcpy(&sas_device->connector_name[0], - &sas_device_pg0.ConnectorName[0], 4); + memcpy(sas_device->connector_name, + sas_device_pg0.ConnectorName, 4); + sas_device->connector_name[4] = '\0'; } else { sas_device->enclosure_level = 0; sas_device->connector_name[0] = '\0'; -- 2.8.0.rc2
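The pattern used by the patch (grow the buffer by one byte, copy the fixed-width field, then terminate it explicitly) can be sketched in standalone C as follows. CONNECTOR_NAME_LEN and copy_connector_name() are hypothetical names for this illustration; only the 4-byte field width and the 5-byte destination come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Width of the field as supplied by the hardware, per the patch. */
#define CONNECTOR_NAME_LEN 4

/* Copy a fixed-width, possibly unterminated field into a buffer that is
 * one byte larger, then write the terminator ourselves so that string
 * functions never read past the end of the array. */
static void copy_connector_name(char dst[CONNECTOR_NAME_LEN + 1],
				const unsigned char src[CONNECTOR_NAME_LEN])
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, CONNECTOR_NAME_LEN);
	dst[CONNECTOR_NAME_LEN] = '\0';
}
```

With a source of four spaces, as in the log line quoted in the commit message, the destination is a valid C string of length 4 regardless of what follows the source field in memory.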
RE: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: mpt-fusionlinux@broadcom.com [mailto:mpt-fusionlinux@broadcom.com] On Behalf Of Calvin Owens Sent: Friday, July 29, 2016 10:08 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1 Trivial non-functional changes for a couple annoying things: 1) Functions local to files are not declared static, which is frustrating when reading the code because it's non-obvious at first glance what's actually called from other files. 2) Set-but-unused variables abound, presumably to mask -Wunused-result errors in the past. None of these are flagged today though (with one exception noted below), so remove them. Fixing (2) exposed the fact that we improperly ignore the return value of scsi_device_reprobe() in _scsih_reprobe_lun(). Fixing the calling code to deal with the potential error is non-trivial, so for now just WARN(). 
Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 18 +++- drivers/scsi/mpt3sas/mpt3sas_config.c| 4 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 29 ++--- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 70 +++- drivers/scsi/mpt3sas/mpt3sas_transport.c | 16 ++-- 5 files changed, 56 insertions(+), 81 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 0956183..df95d1a 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -2039,7 +2039,7 @@ _base_enable_msix(struct MPT3SAS_ADAPTER *ioc) * mpt3sas_base_unmap_resources - free controller resources * @ioc: per adapter object */ -void +static void mpt3sas_base_unmap_resources(struct MPT3SAS_ADAPTER *ioc) { struct pci_dev *pdev = ioc->pdev; @@ -3884,7 +3884,6 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes, MPI2DefaultReply_t *default_reply = (MPI2DefaultReply_t *)reply; int i; u8 failed; - u16 dummy; __le32 *mfp; /* make sure doorbell is not in use */ @@ -3964,7 +3963,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes, return -EFAULT; } if (i >= reply_bytes/2) /* overflow case */ - dummy = readl(&ioc->chip->Doorbell); + readl(&ioc->chip->Doorbell); else reply[i] = le16_to_cpu(readl(&ioc->chip->Doorbell) & MPI2_DOORBELL_DATA_MASK); @@ -4009,7 +4008,6 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc, { u16 smid; u32 ioc_state; - unsigned long timeleft; bool issue_reset = false; int rc; void *request; @@ -4062,7 +4060,7 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc, ioc->ioc_link_reset_in_progress = 1; init_completion(&ioc->base_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, + wait_for_completion_timeout(&ioc->base_cmds.done, msecs_to_jiffies(1)); if ((mpi_request->Operation == MPI2_SAS_OP_PHY_HARD_RESET || mpi_request->Operation == MPI2_SAS_OP_PHY_LINK_RESET) && @@ 
-4112,7 +4110,6 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc, { u16 smid; u32 ioc_state; - unsigned long timeleft; bool issue_reset = false; int rc; void *request; @@ -4163,7 +4160,7 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc, memcpy(request, mpi_request, sizeof(Mpi2SepReply_t)); init_completion(&ioc->base_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, + wait_for_completion_timeout(&ioc->base_cmds.done, msecs_to_jiffies(1)); if (!(ioc->base_cmds.status & MPT3_CMD_COMPLETE)) { pr_err(MPT3SAS_FMT "%s: timeout\n", @@ -4548,7 +4545,6 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc) { Mpi2PortEnableRequest_t *mpi_request; Mpi2PortEnableReply_t *mpi_reply; - unsigned long timeleft; int r = 0; u16 smid; u16 ioc_status; @@ -4576,8 +4572,7 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc) init_completion(&ioc->port_enable_cmds.done); mpt3sas_base_put_smid_default(ioc, smid); - timeleft = wait_for_completion_timeout(&ioc->port_enable_cmds.done, - 300*HZ); + wait_for_completion_timeout(&ioc->port_enable_cmds.done, 300*HZ); if (
RE: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Friday, July 29, 2016 10:08 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code With the exception of a single call to wait_for_doorbell_int(), all this conditional sleeping code is dead. So delete it. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 241 +-- drivers/scsi/mpt3sas/mpt3sas_base.h | 6 +- drivers/scsi/mpt3sas/mpt3sas_config.c| 3 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 15 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 21 +-- drivers/scsi/mpt3sas/mpt3sas_transport.c | 12 +- 6 files changed, 120 insertions(+), 178 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 751f13e..0956183 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -98,7 +98,7 @@ MODULE_PARM_DESC(mpt3sas_fwfault_debug, " enable detection of firmware fault and halt firmware - (default=0)"); static int -_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc, int sleep_flag); +_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc); /** * _scsih_set_fwfault_debug - global setting of ioc->fwfault_debug. @@ -218,8 +218,7 @@ _base_fault_reset_work(struct work_struct *work) ioc->non_operational_loop = 0; if ((doorbell & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL) { - rc = mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, - FORCE_BIG_HAMMER); + rc = mpt3sas_base_hard_reset_handler(ioc, FORCE_BIG_HAMMER); pr_warn(MPT3SAS_FMT "%s: hard reset: %s\n", ioc->name, __func__, (rc == 0) ? 
"success" : "failed"); doorbell = mpt3sas_base_get_iocstate(ioc, 0); @@ -2145,7 +2144,7 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc) _base_mask_interrupts(ioc); - r = _base_get_ioc_facts(ioc, CAN_SLEEP); + r = _base_get_ioc_facts(ioc); if (r) goto out_fail; @@ -3172,12 +3171,11 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc) /** * _base_allocate_memory_pools - allocate start of day memory pools * @ioc: per adapter object - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 success, anything else error */ static int -_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc, int sleep_flag) +_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc) { struct mpt3sas_facts *facts; u16 max_sge_elements; @@ -3647,29 +3645,25 @@ mpt3sas_base_get_iocstate(struct MPT3SAS_ADAPTER *ioc, int cooked) * _base_wait_on_iocstate - waiting on a particular ioc state * @ioc_state: controller state { READY, OPERATIONAL, or RESET } * @timeout: timeout in second - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 for success, non-zero for failure. */ static int -_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int timeout, - int sleep_flag) +_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int +timeout) { u32 count, cntdn; u32 current_state; count = 0; - cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout; + cntdn = 1000 * timeout; do { current_state = mpt3sas_base_get_iocstate(ioc, 1); if (current_state == ioc_state) return 0; if (count && current_state == MPI2_IOC_STATE_FAULT) break; - if (sleep_flag == CAN_SLEEP) - usleep_range(1000, 1500); - else - udelay(500); + + usleep_range(1000, 1500); count++; } while (--cntdn); @@ -3681,24 +3675,22 @@ _base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int timeout, * a write to the doorbell) * @ioc: per adapter object * @timeout: timeout in second - * @sleep_flag: CAN_SLEEP or NO_SLEEP * * Returns 0 for success, non-zero for failure. 
* * Notes: MPI2_HIS_IOC2SYS_DB_STATUS - set to one when IOC writes to doorbell. */ static int -_base_diag_reset(struct MPT3SAS_ADAPTER *ioc, int sleep_flag); +_base_diag_reset(struct MPT3SAS_ADAPTER *ioc); static int -_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout, - int sleep_flag) +_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout) { u32 cntdn, count; u32 int_status; count = 0; - cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout; + cntdn = 1000 * timeout; do {
RE: [PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Friday, July 29, 2016 10:08 AM To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens Subject: [PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm() This flag that conditionally acquires the mutex is confusing and prone to bugginess: refactor it into two separate function calls, and make the unlocked one complain if it's called outside the mutex. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.h | 16 +++-- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 5 ++- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 66 +--- 3 files changed, 38 insertions(+), 49 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index eb7f5b0..f0baafd 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -794,16 +794,6 @@ struct reply_post_struct { dma_addr_t reply_post_free_dma; }; -/** - * enum mutex_type - task management mutex type - * @TM_MUTEX_OFF: mutex is not required becuase calling function is acquiring it - * @TM_MUTEX_ON: mutex is required - */ -enum mutex_type { - TM_MUTEX_OFF = 0, - TM_MUTEX_ON = 1, -}; - typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct MPT3SAS_ADAPTER *ioc); /** * struct MPT3SAS_ADAPTER - per adapter struct @@ -1291,7 +1281,11 @@ void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase); int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, uint id, uint lun, u8 type, u16 smid_task, - ulong timeout, enum mutex_type m_type); + ulong timeout); +int mpt3sas_scsih_issue_locked_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, + uint channel, uint id, uint lun, u8 type, u16 smid_task, + ulong timeout); + 
void mpt3sas_scsih_set_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle); void mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle); void mpt3sas_expander_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address); diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c index 7d00f09..75ae533 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c +++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c @@ -1001,10 +1001,9 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg, ioc->name, le16_to_cpu(mpi_request->FunctionDependent1)); mpt3sas_halt_firmware(ioc); - mpt3sas_scsih_issue_tm(ioc, + mpt3sas_scsih_issue_locked_tm(ioc, le16_to_cpu(mpi_request->FunctionDependent1), 0, 0, - 0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0, 30, - TM_MUTEX_ON); + 0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0, 30); } else mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, FORCE_BIG_HAMMER); diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index acabe48..c93a7ba 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -2201,7 +2201,6 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle) * @type: MPI2_SCSITASKMGMT_TASKTYPE__XXX (defined in mpi2_init.h) * @smid_task: smid assigned to the task * @timeout: timeout in seconds - * @m_type: TM_MUTEX_ON or TM_MUTEX_OFF * Context: user * * A generic API for sending task management requests to firmware. 
@@ -2212,8 +2211,7 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle) */ int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, - uint id, uint lun, u8 type, u16 smid_task, ulong timeout, - enum mutex_type m_type) + uint id, uint lun, u8 type, u16 smid_task, ulong timeout) { Mpi2SCSITaskManagementRequest_t *mpi_request; Mpi2SCSITaskManagementReply_t *mpi_reply; @@ -2224,21 +,19 @@ mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, int rc; u16 msix_task = 0; - if (m_type == TM_MUTEX_ON) - mutex_lock(&ioc->tm_cmds.mutex); + lockdep_assert_held(&ioc->tm_cmds.mutex); + if (ioc->tm_cmds.status != MPT3_CMD_NOT_USED) { pr_info(MPT3SAS_FMT "%s: tm_cmd busy!!!\n", __func__, ioc->name); - rc = FAILED; - goto err_out; + return FAILED; } if (ioc->shost_recovery || ioc->remove_host || ioc->pci_error_recover
RE: Kernel panics while creating RAID volume on latest stable 4.6.2 kernel because of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"
Any updates on this? Thanks, Chaitra -----Original Message----- From: Chaitra Basappa [mailto:chaitra.basa...@broadcom.com] Sent: Friday, June 17, 2016 4:04 PM To: linux-kernel@vger.kernel.org; Linux SCSI Mailinglist; James Bottomley Subject: Kernel panics while creating RAID volume on latest stable 4.6.2 kernel because of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" Importance: High Hi, When creating a RAID volume on the latest stable 4.6.2 kernel, the kernel panics as soon as the volume is created; the logs are below. The same experiment on the 4.4.13 kernel did not show the issue. After studying the diff between the 4.4.13 and 4.6.2 kernels, the patch "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" (commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293) looks suspicious. We therefore reverted that patch on the 4.6.2 kernel and tried volume creation again: the volume was created successfully and the issue was not observed. >>Kernel panic logs: [root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found sd 0:1:0:0: [sdw] Assuming drive cache: write through ------------[ cut here ]------------ kernel BUG at drivers/scsi/scsi_transport_sas.c:164! 
invalid opcode: [#1] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: speedstep_lib] CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE 4.6.2 #1 Hardware name: Dell Inc. PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014 Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas] task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000 RIP: 0010:[] [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP: 0018:8800c62cb8a8 EFLAGS: 00010282 RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4 RDX: 002b RSI: RDI: 8800b04ec800 RBP: 8800c62cb8a8 R08: R09: 0008 R10: R11: 0001 R12: 8800b04ec800 R13: R14: 8800b04ec998 R15: FS: () GS:88012f02() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ff600400 CR3: 01c06000 CR4: 000406e0 Stack: 8800c62cb8d8 a066bc62 8800b04ecc68 880128ee8000 8800c62cb938 a066bd5c 8800b04ecef8 81608333 8800b04ec800 8800b04ecc68 Call Trace: [] ses_match_to_enclosure+0x72/0x80 [ses] [] ses_intf_add+0xec/0x494 [ses] [] ? preempt_schedule_common+0x23/0x40 [] device_add+0x278/0x440 [] ? __pm_runtime_resume+0x6c/0x90 [] scsi_sysfs_add_sdev+0xee/0x2b0 [] scsi_add_lun+0x437/0x580 [] scsi_probe_and_add_lun+0x1bb/0x4e0 [] ? get_device+0x19/0x20 [] ? scsi_alloc_target+0x293/0x320 [] ? 
__pm_runtime_resume+0x6c/0x90 [] __scsi_add_device+0x10f/0x130 [] scsi_add_device+0x11/0x30 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas] [] _scsih_sas_ir_config_change_event+0xdb/0x210 [mpt3sas] [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas] [] ? pwq_dec_nr_in_flight+0x50/0xa0 [] _firmware_event_work+0x19/0x20 [mpt3sas] [] process_one_work+0x189/0x4e0 [] ? del_timer_sync+0x4c/0x60 [] ? maybe_create_worker+0x8e/0x110 [] ? schedule+0x40/0xb0 [] worker_thread+0x16d/0x520 [] ? default_wake_function+0x12/0x20 [] ? __wake_up_common+0x56/0x90 [] ? maybe_create_worker+0x110/0x110 [] ? schedule+0x40/0xb0 [] ? maybe_create_worker+0x110/0x110 [] kthread+0xcc/0xf0 [] ? schedule_tail+0x1e/0xc0 [] ret_from_fork+0x22/0x40 [] ? kthread_freezable_should_stop+0x70/0x70 Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b 40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 RIP [] sas_get_address+0x26/0x30 [scsi_transport_sas] RSP ---[ end trace c8c9da69e1dcb8a1 ]--- BUG: unable to handle kernel paging request at ffd8 IP: [] kthread_data+0x10/0x20 PGD 1c07067 PUD 1c09067 PMD 0 Oops: [#2] SMP Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
RE: [PATCH] mpt3sas: Fix panic when aer correct error occurred
Hi, Please consider this patch as Acked-by: Chaitra P B Thanks, Chaitra -Original Message- From: Kefeng Wang [mailto:wangkefeng.w...@huawei.com] Sent: Tuesday, July 12, 2016 3:13 PM To: martin.peter...@oracle.com; suganath-prabu.subram...@broadcom.com; mpt-fusionlinux@broadcom.com Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; guohan...@huawei.com; Kefeng Wang; Sathya Prakash; Chaitra P B Subject: [PATCH] mpt3sas: Fix panic when aer correct error occured The _scsih_pci_mmio_enabled called if scsih_pci_error_detected returns PCI_ERS_RESULT_CAN_RECOVER, at this point, read/write to the device still works, no need to reset slot. Or the mpt3sas_base_map_resources in scsih_pci_slot_reset will fail, and iounamp ioc->chip, then we will meet issue when read ioc->chip in mpt3sas_base_get_iocstate from _base_fault_reset_work. Cc: Sathya Prakash Cc: Chaitra P B Cc: Suganath Prabu Subramani Signed-off-by: Kefeng Wang --- NOTE: I found this with an earlier kernel version, but the logic is not changed. drivers/scsi/mpt3sas/mpt3sas_scsih.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 6bff13e..eedd62e3 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -9033,8 +9033,11 @@ scsih_pci_mmio_enabled(struct pci_dev *pdev) /* TODO - dump whatever for debugging purposes */ - /* Request a slot reset. */ - return PCI_ERS_RESULT_NEED_RESET; + /* This called only if scsih_pci_error_detected returns +* PCI_ERS_RESULT_CAN_RECOVER, read/write to the device +* still works, not need to reset slot. +*/ + return PCI_ERS_RESULT_RECOVERED; } /* -- 1.7.12.4
Kernel panics while creating RAID volume on latest stable 4.6.2 kernel because of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"
Hi,

When creating a RAID volume on the latest stable 4.6.2 kernel, the kernel panics as soon as the volume is created; the logs are below. The same experiment on a 4.4.13 kernel did not show the issue.

After going through the diff between the 4.4.13 and 4.6.2 kernels, the patch "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures" (commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293) looks suspicious. We reverted that patch on the 4.6.2 kernel and tried volume creation again: the volume was created successfully and the issue was not observed.

>>Kernel panic logs:
root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found
sd 0:1:0:0: [sdw] Assuming drive cache: write through
[ cut here ]
kernel BUG at drivers/scsi/scsi_transport_sas.c:164!
invalid opcode: [#1] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: speedstep_lib]
CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE 4.6.2 #1
Hardware name: Dell Inc. PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014
Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas]
task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000
RIP: 0010:[] [] sas_get_address+0x26/0x30 [scsi_transport_sas]
RSP: 0018:8800c62cb8a8 EFLAGS: 00010282
RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4
RDX: 002b RSI: RDI: 8800b04ec800
RBP: 8800c62cb8a8 R08: R09: 0008
R10: R11: 0001 R12: 8800b04ec800
R13: R14: 8800b04ec998 R15:
FS: () GS:88012f02() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: ff600400 CR3: 01c06000 CR4: 000406e0
Stack:
 8800c62cb8d8 a066bc62 8800b04ecc68 880128ee8000
 8800c62cb938 a066bd5c 8800b04ecef8 81608333
 8800b04ec800 8800b04ecc68
Call Trace:
 [] ses_match_to_enclosure+0x72/0x80 [ses]
 [] ses_intf_add+0xec/0x494 [ses]
 [] ? preempt_schedule_common+0x23/0x40
 [] device_add+0x278/0x440
 [] ? __pm_runtime_resume+0x6c/0x90
 [] scsi_sysfs_add_sdev+0xee/0x2b0
 [] scsi_add_lun+0x437/0x580
 [] scsi_probe_and_add_lun+0x1bb/0x4e0
 [] ? get_device+0x19/0x20
 [] ? scsi_alloc_target+0x293/0x320
 [] ? __pm_runtime_resume+0x6c/0x90
 [] __scsi_add_device+0x10f/0x130
 [] scsi_add_device+0x11/0x30
 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas]
 [] _scsih_sas_ir_config_change_event+0xdb/0x210 [mpt3sas]
 [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas]
 [] ? pwq_dec_nr_in_flight+0x50/0xa0
 [] _firmware_event_work+0x19/0x20 [mpt3sas]
 [] process_one_work+0x189/0x4e0
 [] ? del_timer_sync+0x4c/0x60
 [] ? maybe_create_worker+0x8e/0x110
 [] ? schedule+0x40/0xb0
 [] worker_thread+0x16d/0x520
 [] ? default_wake_function+0x12/0x20
 [] ? __wake_up_common+0x56/0x90
 [] ? maybe_create_worker+0x110/0x110
 [] ? schedule+0x40/0xb0
 [] ? maybe_create_worker+0x110/0x110
 [] kthread+0xcc/0xf0
 [] ? schedule_tail+0x1e/0xc0
 [] ret_from_fork+0x22/0x40
 [] ? kthread_freezable_should_stop+0x70/0x70
Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b 40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66
RIP [] sas_get_address+0x26/0x30 [scsi_transport_sas]
 RSP
---[ end trace c8c9da69e1dcb8a1 ]---
BUG: unable to handle kernel paging request at ffd8
IP: [] kthread_data+0x10/0x20
PGD 1c07067 PUD 1c09067 PMD 0
Oops: [#2] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E) scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mo
RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization
Hi,

Please consider this patch as Acked-by: Chaitra P B

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

> "Calvin" == Calvin Owens writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
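
The pattern Calvin describes can be illustrated with a small user-space sketch in C. This is a model of the loop shape only, not the driver code: fetch(), assign_fetch_at_bottom() and assign_fetch_at_top() are hypothetical names, and the out-of-bounds read is recorded rather than performed. Moving the fetch to the top of the loop is just one way to keep the index in range; the thread excerpt does not show how the actual patch does it.

```c
#include <assert.h>
#include <stddef.h>

static size_t highest_index; /* largest array index any fetch asked for */

/* Bounds-checked stand-in for reading pool[i]; records the index requested. */
static int *fetch(int **pool, size_t nelems, size_t i)
{
	if (i > highest_index)
		highest_index = i;
	return i < nelems ? pool[i] : NULL;
}

/* Fetches the pointer for the NEXT pass at the bottom of the loop. */
static size_t assign_fetch_at_bottom(int **pool, int **queues, size_t n)
{
	size_t idx = 0, q;
	int *cur;

	assert(pool && queues);
	highest_index = 0;
	cur = fetch(pool, n, idx);
	for (q = 0; q < n; q++) {
		queues[q] = cur;
		cur = fetch(pool, n, ++idx); /* last pass asks for pool[n] */
	}
	return highest_index;
}

/* Fetches the pointer for the CURRENT pass at the top of the loop. */
static size_t assign_fetch_at_top(int **pool, int **queues, size_t n)
{
	size_t q;

	assert(pool && queues);
	highest_index = 0;
	for (q = 0; q < n; q++)
		queues[q] = fetch(pool, n, q); /* never asks past pool[n - 1] */
	return highest_index;
}
```

Both variants assign the same pointers to the queues; the difference is that the bottom-fetch variant touches one element past the end of the array on its final pass, which is the kind of access KASAN reports.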
RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization
Martin,

This patch is being reviewed; we will get back with review comments by tomorrow.

Thanks,
Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

> "Calvin" == Calvin Owens writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
RE: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage
From: Sreekanth Reddy [mailto:sreekanth.re...@avagotech.com]
Sent: Tuesday, September 08, 2015 5:26 PM
To: Nicholas A. Bellinger
Cc: linux-scsi; linux-kernel; James Bottomley; Calvin Owens; Christoph Hellwig; MPT-FusionLinux.pdl; kernel-team; Nicholas Bellinger; Chaitra Basappa
Subject: Re: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage

On Sun, Aug 30, 2015 at 1:24 PM, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger
>
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors
> _scsih_probe_sas() to use the sas_device_list in a safe way.
>
> This patch is a port of Calvin's PATCH-v4 for mpt2sas code, atop
> mpt3sas changes in scsi.git/for-next.
>
> Cc: Calvin Owens
> Cc: Christoph Hellwig
> Cc: Sreekanth Reddy
> Cc: MPT-FusionLinux.pdl
> Signed-off-by: Nicholas Bellinger
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h      |  25 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c     | 479 +--
>  drivers/scsi/mpt3sas/mpt3sas_transport.c |  18 +-
>  3 files changed, 364 insertions(+), 158 deletions(-)
>
> @@ -2763,7 +2874,7 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  	struct scsi_device *sdev;
>  	struct _sas_device *sas_device;
>

[Sreekanth] Here the sas_device_lock spin lock needs to be acquired before calling the __mpt3sas_get_sdev_by_addr() function.

[Chaitra] Here, calling mpt3sas_get_sdev_by_handle() instead of __mpt3sas_get_sdev_by_handle() fixes the "invalid page access" type of kernel panic.

> -	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +	sas_device = __mpt3sas_get_sdev_by_handle(ioc, handle);
>  	if (!sas_device)
>  		return;
>
> @@ -2779,6 +2890,8 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  			continue;
>  		_scsih_internal_device_block(sdev, sas_device_priv_data);
>  	}
> +
> +	sas_device_put(sas_device);
>  }
>

Regards,
Chaitra
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
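
The two review comments point at the same rule: the double-underscore lookup expects the list lock to be held, while the plain variant can be called without it. A minimal user-space sketch in C of that split follows. All names are hypothetical, a boolean flag stands in for the sas_device_lock spin lock, and the assumption that the plain function takes the lock internally is inferred from the comments above rather than quoted from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device entry with a reference count. */
struct sas_dev {
	unsigned int handle;
	int refcount;
};

#define NDEV 2
static struct sas_dev dev_list[NDEV] = { { 0x11, 1 }, { 0x12, 1 } };
static bool list_locked; /* stand-in for the sas_device_lock spin lock */

static void list_lock(void)   { assert(!list_locked); list_locked = true; }
static void list_unlock(void) { assert(list_locked);  list_locked = false; }

/* Models the "__" variant: the caller must already hold the lock. */
static struct sas_dev *get_dev_by_handle_nolock(unsigned int handle)
{
	size_t i;

	assert(list_locked); /* walking the list unlocked is the bug */
	for (i = 0; i < NDEV; i++) {
		if (dev_list[i].handle == handle) {
			dev_list[i].refcount++; /* reference for the caller */
			return &dev_list[i];
		}
	}
	return NULL;
}

/* Models the plain variant: takes the lock around the lookup itself. */
static struct sas_dev *get_dev_by_handle(unsigned int handle)
{
	struct sas_dev *d;

	list_lock();
	d = get_dev_by_handle_nolock(handle);
	list_unlock();
	return d;
}

/* Drops the reference taken by a successful lookup. */
static void dev_put(struct sas_dev *d)
{
	assert(d && d->refcount > 0);
	d->refcount--;
}
```

A caller such as the quoted _scsih_block_io_device() hunk would, in this model, call get_dev_by_handle(), use the device, and finish with dev_put(), so the entry cannot be freed while it is in use.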