smp-induced oops/NULL pointer dereference in mpt3sas, from kernel >= 4.11

2017-08-17 Thread Chaitra Basappa
Hi All,
 While testing kernels 4.11.1 and 4.11.6 we hit an oops (NULL pointer
dereference) in mpt3sas. It is easily reproducible on a system with
expanders or enclosures connected behind a SAS3 HBA.
Soon after connecting an expander or enclosure we observe the call trace below.


Jul 12 15:28:27 localhost kernel: BUG: unable to handle kernel NULL
pointer dereference at 00dc
Jul 12 15:28:27 localhost kernel: IP: _transport_smp_handler+0x8bb/0x10c0
[mpt3sas]
Jul 12 15:28:27 localhost kernel: PGD 811abb067
Jul 12 15:28:27 localhost kernel: PUD 81c96a067
Jul 12 15:28:27 localhost kernel: PMD 0
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: Oops: 0002 [#1] SMP
Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: Discovery: (stop)
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: mpt3sas_cm0: discovery event: (stop)
Jul 12 15:28:27 localhost kernel:
Jul 12 15:28:27 localhost kernel: Hardware name: Dell Inc. PowerEdge
T620/0658N7, BIOS 2.5.4 01/22/2016
Jul 12 15:28:27 localhost kernel: task: 88081c1b8100 task.stack:
c90006168000
Jul 12 15:28:27 localhost kernel: RIP:
0010:_transport_smp_handler+0x8bb/0x10c0 [mpt3sas]
Jul 12 15:28:27 localhost kernel: RSP: 0018:c9000616bb38 EFLAGS:
00010286
Jul 12 15:28:27 localhost kernel: RAX: 00dc RBX:
88041c2ba7b0 RCX: 003c1a07ff00
Jul 12 15:28:27 localhost kernel: RDX: 88081a45c948 RSI:
dead0200 RDI: dead0100
Jul 12 15:28:27 localhost kernel: RBP: c9000616bbf8 R08:
c9000616bac0 R09: dead0200
Jul 12 15:28:27 localhost kernel: R10:  R11:
0010 R12: 0105
Jul 12 15:28:27 localhost kernel: R13: 88041d631680 R14:
88041a6c6c38 R15: 0001
Jul 12 15:28:27 localhost kernel: FS:  7f1818ad1700()
GS:88042f80() knlGS:
Jul 12 15:28:27 localhost kernel: CS:  0010 DS:  ES:  CR0:
80050033
Jul 12 15:28:27 localhost kernel: CR2: 00dc CR3:
00081dad1000 CR4: 000406f0
Jul 12 15:28:27 localhost kernel: Call Trace:
Jul 12 15:28:27 localhost kernel: ? blk_rq_bio_prep+0x3c/0x80
Jul 12 15:28:27 localhost kernel: ? blk_start_request+0x38/0x60
Jul 12 15:28:27 localhost kernel: sas_smp_request+0x5f/0xa0
[scsi_transport_sas]
Jul 12 15:28:27 localhost kernel: sas_non_host_smp_request+0x4a/0x60
[scsi_transport_sas]
Jul 12 15:28:27 localhost kernel: __blk_run_queue+0x37/0x50
Jul 12 15:28:27 localhost kernel: blk_execute_rq_nowait+0xeb/0x140
Jul 12 15:28:27 localhost kernel: blk_execute_rq+0x48/0x90
Jul 12 15:28:27 localhost kernel: bsg_ioctl+0x18a/0x1e0
Jul 12 15:28:27 localhost kernel: vfs_ioctl+0x18/0x30
Jul 12 15:28:27 localhost kernel: do_vfs_ioctl+0x14b/0x3f0
Jul 12 15:28:27 localhost kernel: ? security_file_ioctl+0x45/0x60
Jul 12 15:28:27 localhost kernel: SyS_ioctl+0x92/0xa0
Jul 12 15:28:27 localhost kernel: do_syscall_64+0x6c/0x160
Jul 12 15:28:27 localhost kernel: entry_SYSCALL64_slow_path+0x25/0x25
Jul 12 15:28:27 localhost kernel: RIP: 0033:0x35a88e0a77
Jul 12 15:28:27 localhost kernel: RSP: 002b:7ffded06f278 EFLAGS:
0246 ORIG_RAX: 0010
Jul 12 15:28:27 localhost kernel: RAX: ffda RBX:
7ffded06f370 RCX: 0035a88e0a77
Jul 12 15:28:27 localhost kernel: RDX: 7ffded06f280 RSI:
2285 RDI: 0003
Jul 12 15:28:27 localhost kernel: RBP:  R08:
03ea R09: 8000
Jul 12 15:28:27 localhost kernel: R10: fff0 R11:
0246 R12: 0003
Jul 12 15:28:27 localhost kernel: R13:  R14:
7ffded06f3a0 R15: 
Jul 12 15:28:27 localhost kernel: Code: 84 3e 02 00 00 48 8b 5d a8 85 d2
4c 8b ab f8 02 00 00 0f 85 e3 05 00 00 48 8b 55 98 49 8b 4d 00 48 81 c2 48
01 00 00 48 8b 42 28 <48> 89 08 49 8b 4d 08 48 89 48 08 49 8b 4d 10 48 89
48 10 41 8b
Jul 12 15:28:27 localhost kernel: RIP: _transport_smp_handler+0x8bb/0x10c0
[mpt3sas] RSP: c9000616bb38
Jul 12 15:28:27 localhost kernel: CR2: 00dc
Jul 12 15:28:27 localhost kernel: ---[ end trace d0a22e0e5a84886a ]---
Jul 12 15:28:28 localhost kernel: ses 4:0:0:0: Attached Enclosure device




We analyzed this issue and found that it is not caused by the driver; it
occurs because the "sense" field of 'struct scsi_request' is not being
populated properly by the upper layer.
For kernel versions >= 4.11 our driver references this "sense" member
through 'struct scsi_request', whereas for kernels < 4.11 it was referenced
via 'struct request'. The relevant line is marked with ">>" in the snippet
below:


static int
_transport_smp_handler (.) {
.
.
>>memcpy(scsi_req(req)->sense, mpi_reply, sizeof(*mpi_reply));
.
.
}

Hence the NULL pointer dereference call trace is seen at the marked line of
mpt3sas. This needs to be addressed in the upper layer, so please help us
get it resolved.
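
For illustration only, here is a minimal user-space sketch of the failure mode described above. It is not the driver code and not a proposed fix: the struct and function names (demo_scsi_request, demo_reply, demo_copy_reply) are hypothetical stand-ins, and the NULL check merely shows where an unguarded memcpy() into an unset "sense" pointer would fault.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct scsi_request and the MPI reply frame. */
struct demo_scsi_request {
	void *sense;        /* may be NULL if the upper layer never set it */
	size_t sense_len;   /* bytes actually copied into sense */
};

struct demo_reply {
	unsigned char bytes[8];
};

/*
 * Copy the reply into the sense buffer only when a buffer exists.
 * Returns 0 on success, -1 if the sense pointer is NULL.
 */
static int demo_copy_reply(struct demo_scsi_request *rq,
			   const struct demo_reply *reply)
{
	if (rq->sense == NULL)
		return -1;  /* an unguarded memcpy() would fault here */

	memcpy(rq->sense, reply, sizeof(*reply));
	rq->sense_len = sizeof(*reply);
	return 0;
}
```

Calling demo_copy_reply() with a request whose sense pointer was never set returns -1 instead of writing through a NULL pointer; whether a guard like this belongs in the driver or the upper layer is the question raised in this mail.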

Thanks in advance for the support,

Regards,
 Chaitra


RE: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards

2016-08-11 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B


Thanks,
 Chaitra

-Original Message-
From: Greg Edwards [mailto:gedwa...@fireweed.org]
Sent: Saturday, July 30, 2016 9:36 PM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org;
linux-kernel@vger.kernel.org; Greg Edwards
Subject: [PATCH] mpt3sas: Fix resume on WarpDrive flash cards

mpt3sas crashes on resume after suspend with WarpDrive flash cards.  The
reply_post_host_index array is not set back up after the resume, and we
dereference a stale pointer in _base_interrupt().

[   47.309711] BUG: unable to handle kernel paging request at
c90001f8006c
[   47.318289] IP: [] _base_interrupt+0x49f/0xa30
[mpt3sas]
[   47.326749] PGD 41ccaa067 PUD 41ccab067 PMD 3466c067 PTE 0
[   47.333848] Oops: 0002 [#1] SMP
...
[   47.452708] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0 #6
[   47.460506] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18
09/24/2013
[   47.469629] task: 81c0d500 ti: 81c0 task.ti:
81c0
[   47.479112] RIP: 0010:[]  []
_base_interrupt+0x49f/0xa30 [mpt3sas]
[   47.490466] RSP: 0018:88041d203e30  EFLAGS: 00010002
[   47.497801] RAX: 0001 RBX: 880033f4c000 RCX:
0001
[   47.506973] RDX: c90001f8006c RSI: 0082 RDI:
0082
[   47.516141] RBP: 88041d203eb0 R08: 8804118e2820 R09:
0001
[   47.525300] R10: 0001 R11: 100c R12:

[   47.534457] R13: 880412c487e0 R14: 88041a8987d8 R15:
0001
[   47.543632] FS:  () GS:88041d20()
knlGS:
[   47.553796] CS:  0010 DS:  ES:  CR0: 80050033
[   47.561632] CR2: c90001f8006c CR3: 01c06000 CR4:
000406f0
[   47.570883] Stack:
[   47.575015]  1d211228 88041d2100c0 8800c47d8130
0100
[   47.584625]  8804100c 100c 88041a8992a0
88041a8987f8
[   47.594230]  88041d203e00 8e55 038c
880414ad4280
[   47.603862] Call Trace:
[   47.608474]  
[   47.610413]  [] ? call_timer_fn+0x35/0x120
[   47.620539]  [] handle_irq_event_percpu+0x7f/0x1c0
[   47.629061]  [] handle_irq_event+0x2c/0x50
[   47.636859]  [] handle_edge_irq+0x6f/0x130
[   47.644654]  [] handle_irq+0x73/0x120
[   47.652011]  [] ?
atomic_notifier_call_chain+0x1a/0x20
[   47.660854]  [] do_IRQ+0x4b/0xd0
[   47.66]  [] common_interrupt+0x8c/0x8c
[   47.675635]  

Move the reply_post_host_index array setup into
mpt3sas_base_map_resources(), which is also in the resume path.

Cc: sta...@vger.kernel.org
Signed-off-by: Greg Edwards 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 751f13e..750f82c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -2188,6 +2188,17 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER
*ioc)
} else
ioc->msix96_vector = 0;

+   if (ioc->is_warpdrive) {
+   ioc->reply_post_host_index[0] = (resource_size_t __iomem
*)
+   &ioc->chip->ReplyPostHostIndex;
+
+   for (i = 1; i < ioc->cpu_msix_table_sz; i++)
+   ioc->reply_post_host_index[i] =
+   (resource_size_t __iomem *)
+   ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 +
((i - 1)
+   * 4)));
+   }
+
list_for_each_entry(reply_q, &ioc->reply_queue_list, list)
pr_info(MPT3SAS_FMT "%s: IRQ %d\n",
reply_q->name,  ((ioc->msix_enable) ? "PCI-MSI-X
enabled" :
@@ -5280,17 +5291,6 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
if (r)
goto out_free_resources;

-   if (ioc->is_warpdrive) {
-   ioc->reply_post_host_index[0] = (resource_size_t __iomem
*)
-   &ioc->chip->ReplyPostHostIndex;
-
-   for (i = 1; i < ioc->cpu_msix_table_sz; i++)
-   ioc->reply_post_host_index[i] =
-   (resource_size_t __iomem *)
-   ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 +
((i - 1)
-   * 4)));
-   }
-
pci_set_drvdata(ioc->pdev, ioc->shost);
r = _base_get_ioc_facts(ioc, CAN_SLEEP);
if (r)
--
2.7.4
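
As a rough illustration of the address arithmetic the patch above moves, the following user-space sketch rebuilds a pointer table from a base address, as a resume path would have to. Only the constant 0x4000 and the (i - 1) * 4 stride are taken from the patch; the names demo_reply_post_offset and demo_setup_table, and the plain byte pointers used in place of __iomem registers, are assumptions made for the example.

```c
#include <stdint.h>

#define DEMO_MSIX_BASE 0x4000u  /* constant taken from the patch */

/* Byte offset from the doorbell register for table index i (i >= 1). */
static uint32_t demo_reply_post_offset(unsigned int i)
{
	return DEMO_MSIX_BASE + (i - 1u) * 4u;
}

/*
 * Fill table[0..n-1]. Index 0 uses the primary register address;
 * every other index is derived from the doorbell address.
 */
static void demo_setup_table(uint8_t *doorbell, uint8_t *primary,
			     uint8_t **table, unsigned int n)
{
	unsigned int i;

	if (n == 0)
		return;

	table[0] = primary;
	for (i = 1; i < n; i++)
		table[i] = doorbell + demo_reply_post_offset(i);
}
```

Because every entry is derived from the mapped base, the table has to be recomputed whenever the mapping is re-established, which is why the setup is placed in a function that also runs on resume.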


RE: [PATCH] mpt3sas: Don't spam logs if logging level is 0

2016-08-04 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B


Thanks,
 Chaitra

-Original Message-
From: linux-scsi-ow...@vger.kernel.org
[mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Johannes Thumshirn
Sent: Wednesday, August 03, 2016 6:30 PM
To: Martin K . Petersen; James Bottomley
Cc: Linux SCSI Mailinglist; Linux Kernel Mailinglist; Sreekanth Reddy;
Johannes Thumshirn
Subject: [PATCH] mpt3sas: Don't spam logs if logging level is 0

In _scsih_io_done() we test if the ioc->logging_level does _not_ have the
MPT_DEBUG_REPLY bit set and if it hasn't we print the debug messages. This
unfortunately is the wrong way around.

Note, the actual bug is older than af0094115 but this commit removed the
CONFIG_SCSI_MPT3SAS_LOGGING Kconfig option which hid the bug.

Fixes: af0094115 'mpt2sas, mpt3sas: Remove SCSI_MPTXSAS_LOGGING entry from
Kconfig'
Signed-off-by: Johannes Thumshirn 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 4a1cc85..a138690 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4693,7 +4693,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16
smid, u8 msix_index, u32 reply)
le16_to_cpu(mpi_reply->DevHandle));
mpt3sas_trigger_scsi(ioc, data.skey, data.asc, data.ascq);

-   if (!(ioc->logging_level & MPT_DEBUG_REPLY) &&
+   if ((ioc->logging_level & MPT_DEBUG_REPLY) &&
 ((scmd->sense_buffer[2] == UNIT_ATTENTION) ||
 (scmd->sense_buffer[2] == MEDIUM_ERROR) ||
 (scmd->sense_buffer[2] == HARDWARE_ERROR)))
--
1.8.5.6
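
The inverted test fixed by the patch above can be sketched in isolation. This is a simplified model, not the driver function: DEMO_DEBUG_REPLY is a placeholder bit value rather than the value from the driver headers, the function names are hypothetical, and the sense key is passed in directly instead of being read from a sense buffer.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DEBUG_REPLY    0x0200u  /* placeholder bit, not the driver's value */

/* SCSI sense key values */
#define DEMO_MEDIUM_ERROR   0x03
#define DEMO_HARDWARE_ERROR 0x04
#define DEMO_UNIT_ATTENTION 0x06

static bool demo_interesting(uint8_t sense_key)
{
	return sense_key == DEMO_UNIT_ATTENTION ||
	       sense_key == DEMO_MEDIUM_ERROR ||
	       sense_key == DEMO_HARDWARE_ERROR;
}

/* Old behaviour: logs when the debug bit is NOT set. */
static bool demo_should_log_buggy(uint32_t level, uint8_t sense_key)
{
	return !(level & DEMO_DEBUG_REPLY) && demo_interesting(sense_key);
}

/* Patched behaviour: logs only when the debug bit is set. */
static bool demo_should_log_fixed(uint32_t level, uint8_t sense_key)
{
	return (level & DEMO_DEBUG_REPLY) && demo_interesting(sense_key);
}
```

With a logging level of 0, the buggy variant reports every UNIT ATTENTION, MEDIUM ERROR and HARDWARE ERROR, while the fixed variant stays silent until the bit is enabled.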



RE: [PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated

2016-08-04 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B



Thanks,
 Chaitra

-Original Message-
From: Calvin Owens [mailto:calvinow...@fb.com]
Sent: Thursday, July 28, 2016 10:16 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org;
linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH] mpt3sas: Ensure the connector_name string is
NUL-terminated

We blindly trust the hardware to give us NUL-terminated strings, which is
a bad idea because it doesn't always do that. For example:

  [  481.184784] mpt3sas_cm0:   enclosure level(0x), connector name(
\x3)

In this case, connector_name is four spaces. We got lucky here because the
2nd byte beyond our character array happens to be a NUL. Fix this by
explicitly writing '\0' to the end of the string to ensure we don't run
off the edge of the world in printk().

Signed-off-by: Calvin Owens 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 +-
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 10 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 892c9be..eb7f5b0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -478,7 +478,7 @@ struct _sas_device {
u8  pfa_led_on;
u8  pend_sas_rphy_add;
u8  enclosure_level;
-   u8  connector_name[4];
+   u8  connector_name[5];
struct kref refcount;
 };

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index cd91a68..acabe48 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -5380,8 +5380,9 @@ _scsih_check_device(struct MPT3SAS_ADAPTER *ioc,
 MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) {
sas_device->enclosure_level =

le16_to_cpu(sas_device_pg0.EnclosureLevel);
-   memcpy(&sas_device->connector_name[0],
-   &sas_device_pg0.ConnectorName[0], 4);
+   memcpy(sas_device->connector_name,
+   sas_device_pg0.ConnectorName, 4);
+   sas_device->connector_name[4] = '\0';
} else {
sas_device->enclosure_level = 0;
sas_device->connector_name[0] = '\0';
@@ -5508,8 +5509,9 @@ _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle, u8 phy_num,
if (sas_device_pg0.Flags &
MPI2_SAS_DEVICE0_FLAGS_ENCL_LEVEL_VALID) {
sas_device->enclosure_level =
le16_to_cpu(sas_device_pg0.EnclosureLevel);
-   memcpy(&sas_device->connector_name[0],
-   &sas_device_pg0.ConnectorName[0], 4);
+   memcpy(sas_device->connector_name,
+   sas_device_pg0.ConnectorName, 4);
+   sas_device->connector_name[4] = '\0';
} else {
sas_device->enclosure_level = 0;
sas_device->connector_name[0] = '\0';
--
2.8.0.rc2
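
The pattern used by the patch above (a destination one byte larger than the fixed-width source field, terminated explicitly) can be shown with a small user-space sketch. The names demo_device and demo_set_connector_name are hypothetical; only the field width of 4 and the explicit '\0' come from the patch.

```c
#include <string.h>

#define DEMO_NAME_LEN 4  /* width of the fixed-size field, as in the patch */

struct demo_device {
	char connector_name[DEMO_NAME_LEN + 1];  /* one extra byte for NUL */
};

/* Copy a fixed-width, possibly unterminated field and terminate it. */
static void demo_set_connector_name(struct demo_device *dev,
				    const unsigned char raw[DEMO_NAME_LEN])
{
	memcpy(dev->connector_name, raw, DEMO_NAME_LEN);
	dev->connector_name[DEMO_NAME_LEN] = '\0';
}
```

After the call the name is always safe to pass to a "%s" format, even when the source field is four non-NUL bytes such as four spaces.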


RE: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1

2016-08-04 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B



Thanks,
 Chaitra

-Original Message-
From: mpt-fusionlinux@broadcom.com
[mailto:mpt-fusionlinux@broadcom.com] On Behalf Of Calvin Owens
Sent: Friday, July 29, 2016 10:08 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org;
linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH 3/3] mpt3sas: Fix warnings exposed by W=1

Trivial non-functional changes for a couple annoying things:

  1) Functions local to files are not declared static, which is
  frustrating when reading the code because it's non-obvious at first
  glance what's actually called from other files.

  2) Set-but-unused variables abound, presumably to mask -Wunused-result
  errors in the past. None of these are flagged today though (with one
  exception noted below), so remove them.

Fixing (2) exposed the fact that we improperly ignore the return value of
scsi_device_reprobe() in _scsih_reprobe_lun(). Fixing the calling code to
deal with the potential error is non-trivial, so for now just WARN().

Signed-off-by: Calvin Owens 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 18 +++-
 drivers/scsi/mpt3sas/mpt3sas_config.c|  4 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   | 29 ++---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 70
+++-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 16 ++--
 5 files changed, 56 insertions(+), 81 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 0956183..df95d1a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -2039,7 +2039,7 @@ _base_enable_msix(struct MPT3SAS_ADAPTER *ioc)
  * mpt3sas_base_unmap_resources - free controller resources
  * @ioc: per adapter object
  */
-void
+static void
 mpt3sas_base_unmap_resources(struct MPT3SAS_ADAPTER *ioc)
 {
struct pci_dev *pdev = ioc->pdev;
@@ -3884,7 +3884,6 @@ _base_handshake_req_reply_wait(struct
MPT3SAS_ADAPTER *ioc, int request_bytes,
MPI2DefaultReply_t *default_reply = (MPI2DefaultReply_t *)reply;
int i;
u8 failed;
-   u16 dummy;
__le32 *mfp;

/* make sure doorbell is not in use */
@@ -3964,7 +3963,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes,
return -EFAULT;
}
if (i >=  reply_bytes/2) /* overflow case */
-   dummy = readl(&ioc->chip->Doorbell);
+   readl(&ioc->chip->Doorbell);
else
reply[i] = le16_to_cpu(readl(&ioc->chip->Doorbell)
& MPI2_DOORBELL_DATA_MASK);
@@ -4009,7 +4008,6 @@ mpt3sas_base_sas_iounit_control(struct MPT3SAS_ADAPTER *ioc,
 {
u16 smid;
u32 ioc_state;
-   unsigned long timeleft;
bool issue_reset = false;
int rc;
void *request;
@@ -4062,7 +4060,7 @@ mpt3sas_base_sas_iounit_control(struct
MPT3SAS_ADAPTER *ioc,
ioc->ioc_link_reset_in_progress = 1;
init_completion(&ioc->base_cmds.done);
mpt3sas_base_put_smid_default(ioc, smid);
-   timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
+   wait_for_completion_timeout(&ioc->base_cmds.done,
msecs_to_jiffies(1));
if ((mpi_request->Operation == MPI2_SAS_OP_PHY_HARD_RESET ||
mpi_request->Operation == MPI2_SAS_OP_PHY_LINK_RESET) &&
@@ -4112,7 +4110,6 @@ mpt3sas_base_scsi_enclosure_processor(struct MPT3SAS_ADAPTER *ioc,
 {
u16 smid;
u32 ioc_state;
-   unsigned long timeleft;
bool issue_reset = false;
int rc;
void *request;
@@ -4163,7 +4160,7 @@ mpt3sas_base_scsi_enclosure_processor(struct
MPT3SAS_ADAPTER *ioc,
memcpy(request, mpi_request, sizeof(Mpi2SepReply_t));
init_completion(&ioc->base_cmds.done);
mpt3sas_base_put_smid_default(ioc, smid);
-   timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,
+   wait_for_completion_timeout(&ioc->base_cmds.done,
msecs_to_jiffies(1));
if (!(ioc->base_cmds.status & MPT3_CMD_COMPLETE)) {
pr_err(MPT3SAS_FMT "%s: timeout\n",
@@ -4548,7 +4545,6 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc)
{
Mpi2PortEnableRequest_t *mpi_request;
Mpi2PortEnableReply_t *mpi_reply;
-   unsigned long timeleft;
int r = 0;
u16 smid;
u16 ioc_status;
@@ -4576,8 +4572,7 @@ _base_send_port_enable(struct MPT3SAS_ADAPTER *ioc)

init_completion(&ioc->port_enable_cmds.done);
mpt3sas_base_put_smid_default(ioc, smid);
-   timeleft =
wait_for_completion_timeout(&ioc->port_enable_cmds.done,
-   300*HZ);
+   wait_for_completion_timeout(&ioc->port_enable_cmds.done, 300*HZ);
if (

RE: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code

2016-08-04 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B



Thanks,
 Chaitra

-Original Message-
From: Calvin Owens [mailto:calvinow...@fb.com]
Sent: Friday, July 29, 2016 10:08 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org;
linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code

With the exception of a single call to wait_for_doorbell_int(), all this
conditional sleeping code is dead. So delete it.
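
To illustrate what remains once the sleep_flag branches are removed, here is a simplified user-space model of a countdown polling loop that always sleeps between polls. It is a sketch under stated assumptions: demo_ioc, demo_get_state, demo_sleep_1ms and demo_wait_on_state are hypothetical names, the sleep is a stub, and the early exit on a fault state present in the driver loop is left out.

```c
enum demo_state { DEMO_RESET, DEMO_READY };

/* Fake controller: reports DEMO_READY after ready_after polls. */
struct demo_ioc {
	int polls;
	int ready_after;
};

static int demo_get_state(struct demo_ioc *ioc)
{
	return (ioc->polls++ >= ioc->ready_after) ? DEMO_READY : DEMO_RESET;
}

/* Stub standing in for a sleeping delay of about one millisecond. */
static void demo_sleep_1ms(void)
{
}

/*
 * Poll roughly once per millisecond for up to timeout_s seconds.
 * Returns 0 when the wanted state is seen, -1 on timeout.
 */
static int demo_wait_on_state(struct demo_ioc *ioc, int wanted, int timeout_s)
{
	long cntdn = 1000L * timeout_s;  /* single multiplier, no flag */

	while (cntdn-- > 0) {
		if (demo_get_state(ioc) == wanted)
			return 0;
		demo_sleep_1ms();
	}
	return -1;
}
```

With only one calling context left, the loop needs neither the flag parameter nor the separate busy-wait multiplier.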

Signed-off-by: Calvin Owens 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 241
+--
 drivers/scsi/mpt3sas/mpt3sas_base.h  |   6 +-
 drivers/scsi/mpt3sas/mpt3sas_config.c|   3 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |  15 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  21 +--
 drivers/scsi/mpt3sas/mpt3sas_transport.c |  12 +-
 6 files changed, 120 insertions(+), 178 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 751f13e..0956183 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -98,7 +98,7 @@ MODULE_PARM_DESC(mpt3sas_fwfault_debug,
" enable detection of firmware fault and halt firmware -
(default=0)");

 static int
-_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc, int sleep_flag);
+_base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc);

 /**
  * _scsih_set_fwfault_debug - global setting of ioc->fwfault_debug.
@@ -218,8 +218,7 @@ _base_fault_reset_work(struct work_struct *work)
ioc->non_operational_loop = 0;

if ((doorbell & MPI2_IOC_STATE_MASK) !=
MPI2_IOC_STATE_OPERATIONAL) {
-   rc = mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP,
-   FORCE_BIG_HAMMER);
+   rc = mpt3sas_base_hard_reset_handler(ioc,
FORCE_BIG_HAMMER);
pr_warn(MPT3SAS_FMT "%s: hard reset: %s\n", ioc->name,
__func__, (rc == 0) ? "success" : "failed");
doorbell = mpt3sas_base_get_iocstate(ioc, 0);
@@ -2145,7 +2144,7 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)

_base_mask_interrupts(ioc);

-   r = _base_get_ioc_facts(ioc, CAN_SLEEP);
+   r = _base_get_ioc_facts(ioc);
if (r)
goto out_fail;

@@ -3172,12 +3171,11 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER
*ioc)
 /**
  * _base_allocate_memory_pools - allocate start of day memory pools
  * @ioc: per adapter object
- * @sleep_flag: CAN_SLEEP or NO_SLEEP
  *
  * Returns 0 success, anything else error
  */
 static int
-_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc,  int sleep_flag)
+_base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 {
struct mpt3sas_facts *facts;
u16 max_sge_elements;
@@ -3647,29 +3645,25 @@ mpt3sas_base_get_iocstate(struct MPT3SAS_ADAPTER
*ioc, int cooked)
  * _base_wait_on_iocstate - waiting on a particular ioc state
  * @ioc_state: controller state { READY, OPERATIONAL, or RESET }
  * @timeout: timeout in second
- * @sleep_flag: CAN_SLEEP or NO_SLEEP
  *
  * Returns 0 for success, non-zero for failure.
  */
 static int
-_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int
timeout,
-   int sleep_flag)
+_base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 ioc_state, int
+timeout)
 {
u32 count, cntdn;
u32 current_state;

count = 0;
-   cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
+   cntdn = 1000 * timeout;
do {
current_state = mpt3sas_base_get_iocstate(ioc, 1);
if (current_state == ioc_state)
return 0;
if (count && current_state == MPI2_IOC_STATE_FAULT)
break;
-   if (sleep_flag == CAN_SLEEP)
-   usleep_range(1000, 1500);
-   else
-   udelay(500);
+
+   usleep_range(1000, 1500);
count++;
} while (--cntdn);

@@ -3681,24 +3675,22 @@ _base_wait_on_iocstate(struct MPT3SAS_ADAPTER
*ioc, u32 ioc_state, int timeout,
  * a write to the doorbell)
  * @ioc: per adapter object
  * @timeout: timeout in second
- * @sleep_flag: CAN_SLEEP or NO_SLEEP
  *
  * Returns 0 for success, non-zero for failure.
  *
  * Notes: MPI2_HIS_IOC2SYS_DB_STATUS - set to one when IOC writes to
doorbell.
  */
 static int
-_base_diag_reset(struct MPT3SAS_ADAPTER *ioc, int sleep_flag);
+_base_diag_reset(struct MPT3SAS_ADAPTER *ioc);

 static int
-_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout,
-   int sleep_flag)
+_base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout)
 {
u32 cntdn, count;
u32 int_status;

count = 0;
-   cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
+   cntdn = 1000 * timeout;
do {
 

RE: [PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()

2016-08-04 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B


Thanks,
 Chaitra

-Original Message-
From: Calvin Owens [mailto:calvinow...@fb.com]
Sent: Friday, July 29, 2016 10:08 AM
To: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen
Cc: mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org;
linux-kernel@vger.kernel.org; kernel-t...@fb.com; Calvin Owens
Subject: [PATCH 1/3] mpt3sas: Eliminate conditional locking in
mpt3sas_scsih_issue_tm()

This flag that conditionally acquires the mutex is confusing and prone to
bugginess: refactor it into two separate function calls, and make the
unlocked one complain if it's called outside the mutex.
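
The refactoring described above can be sketched as a single-threaded model: one function that requires the caller to hold the lock, and one wrapper that takes it. This is an illustration only. The names demo_tm, demo_issue_tm and demo_issue_locked_tm are hypothetical, a boolean stands in for the real mutex, and the unlocked variant returns an error where the kernel's lockdep assertion would only emit a warning.

```c
#include <stdbool.h>

struct demo_tm {
	bool mutex_held;  /* stands in for the task management mutex */
	int issued;       /* requests actually issued */
	int misuse;       /* calls made without holding the lock */
};

/* Caller must already hold the lock; complain if it does not. */
static int demo_issue_tm(struct demo_tm *tm)
{
	if (!tm->mutex_held) {
		tm->misuse++;  /* modelled complaint */
		return -1;
	}
	tm->issued++;
	return 0;
}

/* Convenience wrapper: take the lock, issue, release. */
static int demo_issue_locked_tm(struct demo_tm *tm)
{
	int rc;

	tm->mutex_held = true;   /* stands in for mutex_lock() */
	rc = demo_issue_tm(tm);
	tm->mutex_held = false;  /* stands in for mutex_unlock() */
	return rc;
}
```

Splitting the entry points this way makes the locking requirement visible at each call site instead of hiding it in a flag argument.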

Signed-off-by: Calvin Owens 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  | 16 +++--
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |  5 ++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 66
+---
 3 files changed, 38 insertions(+), 49 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index eb7f5b0..f0baafd 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -794,16 +794,6 @@ struct reply_post_struct {
dma_addr_t  reply_post_free_dma;
 };

-/**
- * enum mutex_type - task management mutex type
- * @TM_MUTEX_OFF: mutex is not required becuase calling function is
acquiring it
- * @TM_MUTEX_ON: mutex is required
- */
-enum mutex_type {
-   TM_MUTEX_OFF = 0,
-   TM_MUTEX_ON = 1,
-};
-
 typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct MPT3SAS_ADAPTER *ioc);
 /**
  * struct MPT3SAS_ADAPTER - per adapter struct
@@ -1291,7 +1281,11 @@ void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase);

 int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle,
uint channel, uint id, uint lun, u8 type, u16 smid_task,
-   ulong timeout, enum mutex_type m_type);
+   ulong timeout);
+int mpt3sas_scsih_issue_locked_tm(struct MPT3SAS_ADAPTER *ioc, u16
handle,
+   uint channel, uint id, uint lun, u8 type, u16 smid_task,
+   ulong timeout);
+
 void mpt3sas_scsih_set_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle);
void mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER *ioc, u16 handle);
void mpt3sas_expander_remove(struct MPT3SAS_ADAPTER *ioc, u64
sas_address);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
index 7d00f09..75ae533 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -1001,10 +1001,9 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc,
struct mpt3_ioctl_command karg,
ioc->name,

le16_to_cpu(mpi_request->FunctionDependent1));
mpt3sas_halt_firmware(ioc);
-   mpt3sas_scsih_issue_tm(ioc,
+   mpt3sas_scsih_issue_locked_tm(ioc,
le16_to_cpu(mpi_request->FunctionDependent1),
0, 0,
-   0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0,
30,
-   TM_MUTEX_ON);
+   0, MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET, 0,
30);
} else
mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP,
FORCE_BIG_HAMMER);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index acabe48..c93a7ba 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2201,7 +2201,6 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER
*ioc, u16 handle)
  * @type: MPI2_SCSITASKMGMT_TASKTYPE__XXX (defined in mpi2_init.h)
  * @smid_task: smid assigned to the task
  * @timeout: timeout in seconds
- * @m_type: TM_MUTEX_ON or TM_MUTEX_OFF
  * Context: user
  *
  * A generic API for sending task management requests to firmware.
@@ -2212,8 +2211,7 @@ mpt3sas_scsih_clear_tm_flag(struct MPT3SAS_ADAPTER
*ioc, u16 handle)
  */
 int
 mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint
channel,
-   uint id, uint lun, u8 type, u16 smid_task, ulong timeout,
-   enum mutex_type m_type)
+   uint id, uint lun, u8 type, u16 smid_task, ulong timeout)
 {
Mpi2SCSITaskManagementRequest_t *mpi_request;
Mpi2SCSITaskManagementReply_t *mpi_reply;
@@ -2224,21 +,19 @@ mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel,
int rc;
u16 msix_task = 0;

-   if (m_type == TM_MUTEX_ON)
-   mutex_lock(&ioc->tm_cmds.mutex);
+   lockdep_assert_held(&ioc->tm_cmds.mutex);
+
if (ioc->tm_cmds.status != MPT3_CMD_NOT_USED) {
pr_info(MPT3SAS_FMT "%s: tm_cmd busy!!!\n",
__func__, ioc->name);
-   rc = FAILED;
-   goto err_out;
+   return FAILED;
}

if (ioc->shost_recovery || ioc->remove_host ||
ioc->pci_error_recover

RE: Kernel panics while creating RAID volume on latest stable 4.6.2 kernel beacuse of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"

2016-08-03 Thread Chaitra Basappa
Any updates on this?

Thanks,
 Chaitra

-Original Message-
From: Chaitra Basappa [mailto:chaitra.basa...@broadcom.com]
Sent: Friday, June 17, 2016 4:04 PM
To: linux-kernel@vger.kernel.org; Linux SCSI Mailinglist; James Bottomley
Subject: Kernel panics while creating RAID volume on latest stable 4.6.2
kernel beacuse of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS
enclosures"
Importance: High

Hi,
 When we create a RAID volume on the latest stable 4.6.2 kernel, the kernel
panics as soon as the volume is created; the logs are below.

We carried out the same experiment on the 4.4.13 kernel and the issue was not
observed. After comparing the 4.4.13 and 4.6.2 kernels, the patch "[PATCH v2 3/3]
ses: fix discovery of SATA devices in SAS enclosures" looks suspicious
(commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293).

We therefore reverted that patch on the 4.6.2 kernel and retried volume
creation: the volume was created successfully and the issue was not observed.

>>Kernel panic logs:

[root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found
sd 0:1:0:0: [sdw] Assuming drive cache: write through
------------[ cut here ]------------
kernel BUG at drivers/scsi/scsi_transport_sas.c:164!
invalid opcode:  [#1] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables
xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan vhost
tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt iTCO_vendor_support
dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac edac_core wmi sg
lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma dca ext4(E)
mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E)
scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E)
dm_log(E) dm_mod(E) [last unloaded: speedstep_lib]
CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE   4.6.2 #1
Hardware name: Dell Inc. PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014
Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas]
task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000
RIP: 0010:[]  []
sas_get_address+0x26/0x30 [scsi_transport_sas]
RSP: 0018:8800c62cb8a8  EFLAGS: 00010282
RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4
RDX: 002b RSI:  RDI: 8800b04ec800
RBP: 8800c62cb8a8 R08:  R09: 0008
R10:  R11: 0001 R12: 8800b04ec800
R13:  R14: 8800b04ec998 R15: 
FS:  () GS:88012f02()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ff600400 CR3: 01c06000 CR4: 000406e0
Stack:
 8800c62cb8d8 a066bc62  
 8800b04ecc68 880128ee8000 8800c62cb938 a066bd5c
 8800b04ecef8 81608333 8800b04ec800 8800b04ecc68
Call Trace:
 [] ses_match_to_enclosure+0x72/0x80 [ses]
 [] ses_intf_add+0xec/0x494 [ses]
 [] ? preempt_schedule_common+0x23/0x40
 [] device_add+0x278/0x440
 [] ? __pm_runtime_resume+0x6c/0x90
 [] scsi_sysfs_add_sdev+0xee/0x2b0
 [] scsi_add_lun+0x437/0x580
 [] scsi_probe_and_add_lun+0x1bb/0x4e0
 [] ? get_device+0x19/0x20
 [] ? scsi_alloc_target+0x293/0x320
 [] ? __pm_runtime_resume+0x6c/0x90
 [] __scsi_add_device+0x10f/0x130
 [] scsi_add_device+0x11/0x30
 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas]
 [] _scsih_sas_ir_config_change_event+0xdb/0x210 [mpt3sas]
 [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas]
 [] ? pwq_dec_nr_in_flight+0x50/0xa0
 [] _firmware_event_work+0x19/0x20 [mpt3sas]
 [] process_one_work+0x189/0x4e0
 [] ? del_timer_sync+0x4c/0x60
 [] ? maybe_create_worker+0x8e/0x110
 [] ? schedule+0x40/0xb0
 [] worker_thread+0x16d/0x520
 [] ? default_wake_function+0x12/0x20
 [] ? __wake_up_common+0x56/0x90
 [] ? maybe_create_worker+0x110/0x110
 [] ? schedule+0x40/0xb0
 [] ? maybe_create_worker+0x110/0x110
 [] kthread+0xcc/0xf0
 [] ? schedule_tail+0x1e/0xc0
 [] ret_from_fork+0x22/0x40
 [] ? kthread_freezable_should_stop+0x70/0x70
Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b
40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe
66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66
RIP  [] sas_get_address+0x26/0x30 [scsi_transport_sas]
 RSP
---[ end trace c8c9da69e1dcb8a1 ]---
BUG: unable to handle kernel paging request at ffd8
IP: [] kthread_data+0x10/0x20
PGD 1c07067 PUD 1c09067 PMD 0
Oops:  [#2] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables
xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 

RE: [PATCH] mpt3sas: Fix panic when aer correct error occured

2016-07-14 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B


Thanks,
 Chaitra

-Original Message-
From: Kefeng Wang [mailto:wangkefeng.w...@huawei.com]
Sent: Tuesday, July 12, 2016 3:13 PM
To: martin.peter...@oracle.com; suganath-prabu.subram...@broadcom.com;
mpt-fusionlinux@broadcom.com
Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
guohan...@huawei.com; Kefeng Wang; Sathya Prakash; Chaitra P B
Subject: [PATCH] mpt3sas: Fix panic when aer correct error occured

_scsih_pci_mmio_enabled is called only if scsih_pci_error_detected returns
PCI_ERS_RESULT_CAN_RECOVER. At this point, read/write to the device still
works, so there is no need to reset the slot.

Otherwise, mpt3sas_base_map_resources in scsih_pci_slot_reset will fail and
iounmap ioc->chip, and we will then hit a problem when reading ioc->chip in
mpt3sas_base_get_iocstate from _base_fault_reset_work.

Cc: Sathya Prakash 
Cc: Chaitra P B 
Cc: Suganath Prabu Subramani 
Signed-off-by: Kefeng Wang 
---

NOTE: I found this with an earlier kernel version, but the logic has not
changed.

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6bff13e..eedd62e3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9033,8 +9033,11 @@ scsih_pci_mmio_enabled(struct pci_dev *pdev)

 	/* TODO - dump whatever for debugging purposes */

-	/* Request a slot reset. */
-	return PCI_ERS_RESULT_NEED_RESET;
+	/* This is called only if scsih_pci_error_detected returns
+	 * PCI_ERS_RESULT_CAN_RECOVER; read/write to the device
+	 * still works, so there is no need to reset the slot.
+	 */
+	return PCI_ERS_RESULT_RECOVERED;
 }

 /*
--
1.7.12.4
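
The recovery flow described in the patch above can be modelled in a few
lines of userspace C. This is only a sketch: the mock_ names, the enum
values and the remap_will_fail flag are assumptions made for illustration
and are not the kernel's definitions. It shows why returning "recovered"
from the mmio_enabled step keeps the slot-reset path, and the unmap of
ioc->chip that can follow a failed remap, from running at all.

```c
#include <stdbool.h>

/* Userspace mock of the PCI error-recovery results named in the patch.
 * These are NOT the kernel definitions; values are illustrative. */
enum mock_ers_result {
	MOCK_ERS_CAN_RECOVER,
	MOCK_ERS_NEED_RESET,
	MOCK_ERS_DISCONNECT,
	MOCK_ERS_RECOVERED,
};

/* Hypothetical stand-in for the adapter state involved. */
struct mock_ioc {
	bool chip_mapped;      /* models ioc->chip being a valid mapping */
	bool remap_will_fail;  /* models the resource mapping failing */
};

/* Models the mmio_enabled callback before and after the patch. */
static enum mock_ers_result mock_mmio_enabled(bool patched)
{
	return patched ? MOCK_ERS_RECOVERED : MOCK_ERS_NEED_RESET;
}

/* Models a slot reset whose remap fails and unmaps the chip. */
static enum mock_ers_result mock_slot_reset(struct mock_ioc *ioc)
{
	if (ioc->remap_will_fail) {
		ioc->chip_mapped = false;
		return MOCK_ERS_DISCONNECT;
	}
	ioc->chip_mapped = true;
	return MOCK_ERS_RECOVERED;
}

/* Walks the steps for an error that was reported as recoverable. */
static enum mock_ers_result mock_recover(struct mock_ioc *ioc, bool patched)
{
	enum mock_ers_result r = MOCK_ERS_CAN_RECOVER; /* from error_detected */

	if (r == MOCK_ERS_CAN_RECOVER)
		r = mock_mmio_enabled(patched);
	if (r == MOCK_ERS_NEED_RESET)
		r = mock_slot_reset(ioc);
	return r;
}
```

With patched set to true, mock_recover() never reaches mock_slot_reset(),
so chip_mapped stays true even when the remap would have failed.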


Kernel panics while creating RAID volume on latest stable 4.6.2 kernel because of "[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"

2016-06-17 Thread Chaitra Basappa
Hi,
 Creating a RAID volume on the latest stable 4.6.2 kernel causes a kernel
panic as soon as the volume is created; the logs are below.

The same experiment on the 4.4.13 kernel did not show the issue. After
studying the diff between the 4.4.13 and 4.6.2 kernels, the patch
"[PATCH v2 3/3] ses: fix discovery of SATA devices in SAS enclosures"
looks suspicious:
commit 3f8d6f2a0797e8c650a47e5c1b5c2601a46f4293

We therefore reverted the changes from that patch on the 4.6.2 kernel and
tried volume creation again; the volume was created successfully and the
issue was not observed.

>>Kernel panic logs:

[root@dhcp-135-24-192-112 ~]# sd 0:1:0:0: [sdw] No Caching mode page found
sd 0:1:0:0: [sdw] Assuming drive cache: write through
------------[ cut here ]------------
kernel BUG at drivers/scsi/scsi_transport_sas.c:164!
invalid opcode:  [#1] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables
xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan
vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt
iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac
edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma
dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E)
scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E)
dm_log(E) dm_mod(E) [last unloaded: speedstep_lib]
CPU: 1 PID: 375 Comm: kworker/u96:4 Tainted: GE   4.6.2 #1
Hardware name: Dell Inc. PowerEdge T420/03015M, BIOS 2.2.0 02/06/2014
Workqueue: fw_event_mpt3sas0 _firmware_event_work [mpt3sas]
task: 8800377f6480 ti: 8800c62c8000 task.ti: 8800c62c8000
RIP: 0010:[]  []
sas_get_address+0x26/0x30 [scsi_transport_sas]
RSP: 0018:8800c62cb8a8  EFLAGS: 00010282
RAX: 8800c6986208 RBX: 8800b04ec800 RCX: 8800b3deaac4
RDX: 002b RSI:  RDI: 8800b04ec800
RBP: 8800c62cb8a8 R08:  R09: 0008
R10:  R11: 0001 R12: 8800b04ec800
R13:  R14: 8800b04ec998 R15: 
FS:  () GS:88012f02()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ff600400 CR3: 01c06000 CR4: 000406e0
Stack:
 8800c62cb8d8 a066bc62  
 8800b04ecc68 880128ee8000 8800c62cb938 a066bd5c
 8800b04ecef8 81608333 8800b04ec800 8800b04ecc68
Call Trace:
 [] ses_match_to_enclosure+0x72/0x80 [ses]
 [] ses_intf_add+0xec/0x494 [ses]
 [] ? preempt_schedule_common+0x23/0x40
 [] device_add+0x278/0x440
 [] ? __pm_runtime_resume+0x6c/0x90
 [] scsi_sysfs_add_sdev+0xee/0x2b0
 [] scsi_add_lun+0x437/0x580
 [] scsi_probe_and_add_lun+0x1bb/0x4e0
 [] ? get_device+0x19/0x20
 [] ? scsi_alloc_target+0x293/0x320
 [] ? __pm_runtime_resume+0x6c/0x90
 [] __scsi_add_device+0x10f/0x130
 [] scsi_add_device+0x11/0x30
 [] _scsih_sas_volume_add+0xf9/0x1b0 [mpt3sas]
 [] _scsih_sas_ir_config_change_event+0xdb/0x210
[mpt3sas]
 [] _mpt3sas_fw_work+0xc1/0x480 [mpt3sas]
 [] ? pwq_dec_nr_in_flight+0x50/0xa0
 [] _firmware_event_work+0x19/0x20 [mpt3sas]
 [] process_one_work+0x189/0x4e0
 [] ? del_timer_sync+0x4c/0x60
 [] ? maybe_create_worker+0x8e/0x110
 [] ? schedule+0x40/0xb0
 [] worker_thread+0x16d/0x520
 [] ? default_wake_function+0x12/0x20
 [] ? __wake_up_common+0x56/0x90
 [] ? maybe_create_worker+0x110/0x110
 [] ? schedule+0x40/0xb0
 [] ? maybe_create_worker+0x110/0x110
 [] kthread+0xcc/0xf0
 [] ? schedule_tail+0x1e/0xc0
 [] ret_from_fork+0x22/0x40
 [] ? kthread_freezable_should_stop+0x70/0x70
Code: 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 8b 87 28 01 00 00 48 8b
40 28 83 b8 d0 02 00 00 01 75 09 48 8b 80 e0 02 00 00 c9 c3 <0f> 0b eb fe
66 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66
RIP  [] sas_get_address+0x26/0x30 [scsi_transport_sas]
 RSP 
---[ end trace c8c9da69e1dcb8a1 ]---
BUG: unable to handle kernel paging request at ffd8
IP: [] kthread_data+0x10/0x20
PGD 1c07067 PUD 1c09067 PMD 0
Oops:  [#2] SMP
Modules linked in: mptctl mptbase ses enclosure ebtable_nat ebtables
xt_CHECKSUM iptable_mangle bridge autofs4 8021q garp stp llc ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan
vhost tun kvm_intel kvm irqbypass uinput ipmi_devintf iTCO_wdt
iTCO_vendor_support dcdbas pcspkr ipmi_si ipmi_msghandler acpi_pad sb_edac
edac_core wmi sg lpc_ich mfd_core shpchp tg3 ptp pps_core joydev ioatdma
dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) mpt3sas(E)
scsi_transport_sas(E) raid_class(E) dm_mirror(E) dm_region_hash(E)
dm_log(E) dm_mo

RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

2016-03-23 Thread Chaitra Basappa
Hi,
 Please consider this patch as Acked-by: Chaitra P B


Thanks,
 Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com;
linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during
initialization

> "Calvin" == Calvin Owens  writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering
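
The overreach Calvin describes comes from fetching the array element for
the next iteration at the bottom of the loop, which reads one slot past
the end on the final pass. The sketch below is a userspace illustration,
not the driver code: queue_node, assign_reply_post and the slots array are
hypothetical names. It shows one way to avoid the problem, by reading the
slot at the top of the iteration, only once a list node is known to exist.

```c
#include <stddef.h>

/* Hypothetical list node; stands in for an entry of
 * ioc->reply_queue_list. */
struct queue_node {
	struct queue_node *next;
	int *reply_post;  /* pointer taken from the slots array */
};

/* Assign one array slot per list node and return the number of slots
 * consumed. The slot is read at the top of each iteration, so the
 * element after the last valid one is never touched. */
static size_t assign_reply_post(struct queue_node *head, int **slots,
				size_t nslots)
{
	size_t index = 0;

	for (struct queue_node *q = head; q != NULL; q = q->next) {
		if (index >= nslots)
			break;  /* more nodes than slots: stop safely */
		q->reply_post = slots[index++];
	}
	return index;
}
```

A loop that instead did the fetch after the body would read slots[nslots]
on the final iteration, which is the kind of access KASAN reports.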


RE: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

2016-03-22 Thread Chaitra Basappa
Martin,
 This patch is being reviewed; we shall get back with review comments by
tomorrow.

Thanks,
 Chaitra

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Tuesday, March 22, 2016 6:00 AM
To: Calvin Owens
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com;
linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
kernel-t...@fb.com; Sreekanth Reddy
Subject: Re: [PATCH] mpt3sas: Don't overreach ioc->reply_post[] during
initialization

> "Calvin" == Calvin Owens  writes:

Calvin> In _base_make_ioc_operational(), we walk ioc->reply_queue_list
Calvin> and pull a pointer out of successive elements of
Calvin> ioc->reply_post[] for each entry in that list if RDPQ is
Calvin> enabled.

Calvin> Since the code pulls the pointer for the next iteration at the
Calvin> bottom of the loop, it triggers a KASAN dump on the final
Calvin> iteration:

Broadcom folks, please review.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


RE: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage

2015-09-09 Thread Chaitra Basappa
From: Sreekanth Reddy [mailto:sreekanth.re...@avagotech.com]
Sent: Tuesday, September 08, 2015 5:26 PM
To: Nicholas A. Bellinger
Cc: linux-scsi; linux-kernel; James Bottomley; Calvin Owens; Christoph
Hellwig; MPT-FusionLinux.pdl; kernel-team; Nicholas Bellinger; Chaitra
Basappa
Subject: Re: [PATCH-v2 1/2] mpt3sas: Refcount sas_device objects and fix
unsafe list usage

On Sun, Aug 30, 2015 at 1:24 PM, Nicholas A. Bellinger 
wrote:
> From: Nicholas Bellinger 
>
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors
> _scsih_probe_sas() to use the sas_device_list in a safe way.
>
> This patch is a port of Calvin's PATCH-v4 for mpt2sas code, atop
> mpt3sas changes in scsi.git/for-next.
>
> Cc: Calvin Owens 
> Cc: Christoph Hellwig 
> Cc: Sreekanth Reddy 
> Cc: MPT-FusionLinux.pdl 
> Signed-off-by: Nicholas Bellinger 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  25 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 479
> +--
>  drivers/scsi/mpt3sas/mpt3sas_transport.c |  18 +-
>  3 files changed, 364 insertions(+), 158 deletions(-)
>
> @@ -2763,7 +2874,7 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc,
> u16 handle)
> struct scsi_device *sdev;
> struct _sas_device *sas_device;
>

[Sreekanth] Here the sas_device_lock spin lock needs to be acquired before
calling the __mpt3sas_get_sdev_by_addr() function.

[Chaitra] Here, calling "mpt3sas_get_sdev_by_handle()" instead of
"__mpt3sas_get_sdev_by_handle()" fixes the "invalid page access" type of
kernel panic.

> -   sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +   sas_device = __mpt3sas_get_sdev_by_handle(ioc, handle);
> if (!sas_device)
> return;
>
> @@ -2779,6 +2890,8 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc,
> u16 handle)
> continue;
> _scsih_internal_device_block(sdev, sas_device_priv_data);
> }
> +
> +   sas_device_put(sas_device);
>  }
>


Regards,
Chaitra
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
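
The two review comments above turn on the difference between a lookup that
assumes the list lock is already held and a wrapper that takes the lock
itself. The following is a minimal userspace sketch of that pattern, using
a pthread mutex in place of the sas_device_lock spin lock. All mock_ names
and the _nolock suffix are assumptions for illustration; they are not the
driver's API.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical refcounted device on a singly linked list. */
struct mock_device {
	unsigned short handle;
	int refcount;
	struct mock_device *next;
};

/* Hypothetical adapter; lock stands in for sas_device_lock. */
struct mock_adapter {
	pthread_mutex_t lock;
	struct mock_device *list;
};

/* Caller must hold a->lock. Takes a reference on the match so the
 * object cannot be freed while the caller uses it. Plays the role of
 * the double-underscore variant discussed in the review. */
static struct mock_device *
mock_get_by_handle_nolock(struct mock_adapter *a, unsigned short handle)
{
	for (struct mock_device *d = a->list; d != NULL; d = d->next) {
		if (d->handle == handle) {
			d->refcount++;
			return d;
		}
	}
	return NULL;
}

/* Locked wrapper: safe to call without holding a->lock. */
static struct mock_device *
mock_get_by_handle(struct mock_adapter *a, unsigned short handle)
{
	struct mock_device *d;

	pthread_mutex_lock(&a->lock);
	d = mock_get_by_handle_nolock(a, handle);
	pthread_mutex_unlock(&a->lock);
	return d;
}

/* Drop a reference taken by one of the lookups above. */
static void mock_device_put(struct mock_adapter *a, struct mock_device *d)
{
	pthread_mutex_lock(&a->lock);
	d->refcount--;
	pthread_mutex_unlock(&a->lock);
}
```

A caller that does not already hold the lock uses mock_get_by_handle() and
pairs it with mock_device_put(), mirroring the sas_device_put() call added
at the end of _scsih_block_io_device() in the quoted hunk.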