RE: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter

2015-07-02 Thread Seymour, Shane M
-Original Message-
From: linux-scsi-ow...@vger.kernel.org 
[mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Matthew R. Ochs
Sent: Monday, April 27, 2015 2:50 PM
To: linux-scsi@vger.kernel.org; james.bottom...@hansenpartnership.com; 
n...@linux-iscsi.org; brk...@linux.vnet.ibm.com
Cc: mi...@neuling.org; imun...@au1.ibm.com; Manoj N. Kumar
Subject: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter

 +static ssize_t cxlflash_show_port_status(struct device *dev,
 +  struct device_attribute *attr,
 +  char *buf)
 +{
 + struct Scsi_Host *shost = class_to_shost(dev);
 + struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
 + struct afu *afu = cxlflash-afu;
 +
 + char *disp_status;
 + int rc;
 + u32 port;
 + u64 status;
 + volatile u64 *fc_regs;
 +
 + rc = kstrtouint((attr-attr.name + 4), 10, port);
 + if (rc || (port  NUM_FC_PORTS))
 + return 0;
 +
 + fc_regs = afu-afu_map-global.fc_regs[port][0];
 + status =
 + (readq_be(fc_regs[FC_MTIP_STATUS / 8])  FC_MTIP_STATUS_MASK);
 +
 + if (status == FC_MTIP_STATUS_ONLINE)
 + disp_status = online;
 + else if (status == FC_MTIP_STATUS_OFFLINE)
 + disp_status = offline;
 + else
 + disp_status = unknown;
 +
 + return snprintf(buf, PAGE_SIZE, %s\n, disp_status);

You need to use scnprintf() instead of snprintf() here

 +static ssize_t cxlflash_show_lun_mode(struct device *dev,
 +   struct device_attribute *attr, char *buf)
 +{
 + struct Scsi_Host *shost = class_to_shost(dev);
 + struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
 + struct afu *afu = cxlflash-afu;
 +
 + return snprintf(buf, PAGE_SIZE, %u\n, afu-internal_lun);

See above about snprintf()

+static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
+static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
+static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
+  cxlflash_store_lun_mode);

Please use DEVICE_ATTR_RO and DEVICE_ATTR_RW as appropriate if you can, you 
will need to change the show/store function names. The main reason I know for 
doing this is (using DEVICE_ATTR_RO as an example) if someone sees a sysfs 
attribute called port0 or port1 they should be able to search the kernel source 
to find a function called port0_show or port1_show without having to spend any 
extra time searching to find out what the driver is and then look at the driver 
source to find that they need to look at cxlflash_show_port_status. Using 
DEVICE_ATTR_RO for port0 and port1 would probably change the code to looking 
something like this:

static ssize_t cxlflash_show_port_status(u32 port,
struct afu *afu, char *buf)
{
char *disp_status;
u64 status;
volatile u64 *fc_regs;

fc_regs = afu-afu_map-global.fc_regs[port][0];
/* I split this line into two because I had to really look at where
 * the brackets were and this made it more obvious to me
 * what it was doing but that could just be me. */
status = readq_be(fc_regs[FC_MTIP_STATUS / 8]);
status = FC_MTIP_STATUS_MASK;

if (status == FC_MTIP_STATUS_ONLINE)
disp_status = online;
else if (status == FC_MTIP_STATUS_OFFLINE)
disp_status = offline;
else
disp_status = unknown;

return scnprintf(buf, PAGE_SIZE, %s\n, disp_status);
}

Since you have fixed values you could use also sprintf here (although the 
documentation currently says to only use scnprintf) and that would make 
port0_show and port1_show to be:

static ssize_t port0_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
struct afu *afu = cxlflash-afu;

return cxlflash_show_port_status(0, afu, buf);
}

static ssize_t port1_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
struct afu *afu = cxlflash-afu;

return cxlflash_show_port_status(1, afu, buf);
}

I'm assuming that 0 and 1 are always valid for port number and don't need to be 
checked against NUM_FC_PORTS. That's just a suggestion and I haven't attempted 
to compile the above example and you'd need to test it to make sure that they 
still work as expected.

Shane
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] cxlflash: Base superpipe support

2015-07-02 Thread Michael Neuling
On Fri, 2015-06-19 at 17:37 -0500, Matthew R. Ochs wrote:
 Add superpipe supporting infrastructure to device driver for the IBM CXL
 Flash adapter. This patch allows userspace applications to take advantage
 of the accelerated I/O features that this adapter provides and bypass the
 traditional filesystem stack.

... bypass the traditional filesystem stack. interesting :-)

Other comments below.

 Signed-off-by: Matthew R. Ochs mro...@linux.vnet.ibm.com
 Signed-off-by: Manoj N. Kumar ma...@linux.vnet.ibm.com
 ---
  Documentation/powerpc/cxlflash.txt |  298 ++
  drivers/scsi/cxlflash/Makefile |2 +-
  drivers/scsi/cxlflash/common.h |   18 +
  drivers/scsi/cxlflash/main.c   |   12 +
  drivers/scsi/cxlflash/superpipe.c  | 1856 
 
  drivers/scsi/cxlflash/superpipe.h  |  210 
  include/uapi/scsi/cxlflash_ioctl.h |  159 +++
  7 files changed, 2554 insertions(+), 1 deletion(-)
  create mode 100644 Documentation/powerpc/cxlflash.txt
  create mode 100644 drivers/scsi/cxlflash/superpipe.c
  create mode 100644 drivers/scsi/cxlflash/superpipe.h
  create mode 100644 include/uapi/scsi/cxlflash_ioctl.h
 
 diff --git a/Documentation/powerpc/cxlflash.txt 
 b/Documentation/powerpc/cxlflash.txt
 new file mode 100644
 index 000..c4d3849
 --- /dev/null
 +++ b/Documentation/powerpc/cxlflash.txt
 @@ -0,0 +1,298 @@
 +Introduction
 +
 +
 +The IBM Power architecture provides support for CAPI (Coherent
 +Accelerator Power Interface), which is available to certain PCIe slots
 +on Power 8 systems. CAPI can be thought of as a special tunneling
 +protocol through PCIe that allow PCIe adapters to look like special
 +purpose co-processors which can read or write an application's
 +memory and generate page faults. As a result, the host interface to
 +an adapter running in CAPI mode does not require the data buffers to
 +be mapped to the device's memory (IOMMU bypass) nor does it require
 +memory to be pinned.
 +
 +On Linux, Coherent Accelerator (CXL) kernel services present CAPI
 +devices as a PCI device by implementing a virtual PCI host bridge.
 +This abstraction simplifies the infrastructure and programming
 +model, allowing for drivers to look similar to other native PCI
 +device drivers.
 +
 +CXL provides a mechanism by which user space applications can
 +directly talk to a device (network or storage) bypassing the typical
 +kernel/device driver stack. The CXL Flash Adapter Driver enables a
 +user space application direct access to Flash storage.
 +
 +The CXL Flash Adapter Driver is a kernel module that sits in the
 +SCSI stack as a low level device driver (below the SCSI disk and
 +protocol drivers) for the IBM CXL Flash Adapter. This driver is
 +responsible for the initialization of the adapter, setting up the
 +special path for user space access, and performing error recovery. It
 +communicates directly the Flash Accelerator Functional Unit (AFU)
 +as described in Documentation/powerpc/cxl.txt.
 +
 +The cxlflash driver supports two, mutually exclusive, modes of
 +operation at the device (LUN) level:
 +
 +- Any flash device (LUN) can be configured to be accessed as a
 +  regular disk device (i.e.: /dev/sdc). This is the default mode.
 +
 +- Any flash device (LUN) can be configured to be accessed from
 +  user space with a special block library. This mode further
 +  specifies the means of accessing the device and provides for
 +  either raw access to the entire LUN (referred to as direct
 +  or physical LUN access) or access to a kernel/AFU-mediated
 +  partition of the LUN (referred to as virtual LUN access). The
 +  segmentation of a disk device into virtual LUNs is assisted
 +  by special translation services provided by the Flash AFU.
 +
 +Overview
 +
 +
 +The Coherent Accelerator Interface Architecture (CAIA) introduces a
 +concept of a master context. A master typically has special privileges
 +granted to it by the kernel or hypervisor allowing it to perform AFU
 +wide management and control. The master may or may not be involved
 +directly in each user I/O, but at the minimum is involved in the
 +initial setup before the user application is allowed to send requests
 +directly to the AFU.
 +
 +The CXL Flash Adapter Driver establishes a master context with the
 +AFU. It uses memory mapped I/O (MMIO) for this control and setup. The
 +Adapter Problem Space Memory Map looks like this:
 +
 + +---+
 + |512 * 64 KB User MMIO  |
 + |(per context)  |
 + |   User Accessible |
 + +---+
 + |512 * 128 B per context|
 + 

Re: [PATCH 1/2] cxlflash: Base superpipe support

2015-07-02 Thread Michael Neuling
 +struct dk_cxlflash_attach {
 + struct dk_cxlflash_hdr hdr; /* Common fields */
 + __u64 num_interrupts;   /* Requested number of interrupts */
 + __u64 context_id;   /* Returned context */

Matt,

These ioctls seem to take a context id.  Why isn't that assumed from the
fd?  Why get it again from userspace?

Mikey

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] st: null pointer dereference panic caused by use after kref_put by st_open

2015-07-02 Thread Johannes Thumshirn
Seymour, Shane M shane.seym...@hp.com writes:

 Two SLES11 SP3 servers encountered similar crashes simultaneously
 following some kind of SAN/tape target issue:


Thanks,

Reviewed-by: Johannes Thumshirn jthumsh...@suse.com

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] st: null pointer dereference panic caused by use after kref_put by st_open

2015-07-02 Thread Seymour, Shane M

Two SLES11 SP3 servers encountered similar crashes simultaneously
following some kind of SAN/tape target issue:

...
qla2xxx [:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
qla2xxx [:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
qla2xxx [:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 
cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-800c:3: do_reset failed for cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed 
nexus=3:0:2 cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 
cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-800c:3: do_reset failed for cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed 
nexus=3:0:2 cmd=882f89c2c7c0.
qla2xxx [:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
qla2xxx [:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
qla2xxx [:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
qla2xxx [:81:00.0]-00af:3: Performing ISP error recovery - 
ha=88bf04d18000.
 rport-3:0-0: blocked FC remote port time out: removing target and saving 
binding
qla2xxx [:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
 rport-2:0-0: blocked FC remote port time out: removing target and saving 
binding
sg_rq_end_io: device detached
BUG: unable to handle kernel NULL pointer dereference at 02a8
IP: [8133b268] __pm_runtime_idle+0x28/0x90
PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
CPU 0
...
Supported: No, Proprietary modules are loaded [1739975.390463]
Pid: 27965, comm: ABCD Tainted: PF   X 3.0.101-0.29-default #1 HP 
ProLiant DL580 Gen8
RIP: 0010:[8133b268]  [8133b268] __pm_runtime_idle+0x28/0x90
RSP: 0018:8839dc1e7c68  EFLAGS: 00010202
RAX:  RBX: 883f0592fc00 RCX: 0090
RDX:  RSI: 0004 RDI: 0138
RBP: 0138 R08: 0010 R09: 81bd39d0
R10: 09c0 R11: 81025790 R12: 0001
R13: 883022212b80 R14: 0004 R15: 883022212b80
FS:  7f8e54560720() GS:88407f80() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 02a8 CR3: 007e6ced6000 CR4: 001407f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process ABCD (pid: 27965, threadinfo 8839dc1e6000, task 883592e0c640)
Stack:
 883f0592fc00 fffa 0001 883022212b80
 883eff772400 a03fa309  
 a04003a0 883f063196c0 887f0379a930 8115ea1e
Call Trace:
 [a03fa309] st_open+0x129/0x240 [st]
 [8115ea1e] chrdev_open+0x13e/0x200
 [811588a8] __dentry_open+0x198/0x310
 [81167d74] do_last+0x1f4/0x800
 [81168fe9] path_openat+0xd9/0x420
 [8116946c] do_filp_open+0x4c/0xc0
 [8115a00f] do_sys_open+0x17f/0x250
 [81468d92] system_call_fastpath+0x16/0x1b
 [7f8e4f617fd0] 0x7f8e4f617fcf
Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 
48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 f0 ff 8f 70 01 00 00 
0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
RIP  [8133b268] __pm_runtime_idle+0x28/0x90
 RSP 8839dc1e7c68
CR2: 02a8

Analysis reveals the cause of the crash to be due to STp-device
being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
calls scsi_tape_release(). In st_open() we jump to err_out after
scsi_block_when_processing_errors() completes and returns the
device as offline (sdev_state was SDEV_DEL):

1180 /* Open the device. Needs to take the BKL only because of incrementing the 
SCSI host
1181module count. */
1182 static int st_open(struct inode *inode, struct file *filp)
1183 {
1184 int i, retval = (-EIO);
1185 int resumed = 0;
1186 struct scsi_tape *STp;
1187 struct st_partstat *STps;
1188 int dev = TAPE_NR(inode);
1189 char *name;
...
1217 if (scsi_autopm_get_device(STp-device)  0) {
1218 retval = -EIO;
1219 goto err_out;
1220 }
1221 resumed = 1;
1222 if (!scsi_block_when_processing_errors(STp-device)) {
1223 retval = (-ENXIO);
1224 goto err_out;
1225 }
...
1264  err_out:
1265 normalize_buffer(STp-buffer);
1266 spin_lock(st_use_lock);
1267 STp-in_use = 0;
1268 spin_unlock(st_use_lock);
1269 scsi_tape_put(STp); -- STp-device = 0 after this
1270 if (resumed)
1271 scsi_autopm_put_device(STp-device);
1272 return retval;

The ref count for the 

Re: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter

2015-07-02 Thread Matthew R. Ochs
Shane,

All great suggestions.

I've gone ahead and implemented these and also applied your comments to our
other attributes as well.  Will include in our next push of fixes.


-matt

On Jul 2, 2015, at 3:08 AM, Seymour, Shane M wrote:

 -Original Message-
 From: linux-scsi-ow...@vger.kernel.org 
 [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Matthew R. Ochs
 Sent: Monday, April 27, 2015 2:50 PM
 To: linux-scsi@vger.kernel.org; james.bottom...@hansenpartnership.com; 
 n...@linux-iscsi.org; brk...@linux.vnet.ibm.com
 Cc: mi...@neuling.org; imun...@au1.ibm.com; Manoj N. Kumar
 Subject: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter
 
 +static ssize_t cxlflash_show_port_status(struct device *dev,
 + struct device_attribute *attr,
 + char *buf)
 +{
 +struct Scsi_Host *shost = class_to_shost(dev);
 +struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
 +struct afu *afu = cxlflash-afu;
 +
 +char *disp_status;
 +int rc;
 +u32 port;
 +u64 status;
 +volatile u64 *fc_regs;
 +
 +rc = kstrtouint((attr-attr.name + 4), 10, port);
 +if (rc || (port  NUM_FC_PORTS))
 +return 0;
 +
 +fc_regs = afu-afu_map-global.fc_regs[port][0];
 +status =
 +(readq_be(fc_regs[FC_MTIP_STATUS / 8])  FC_MTIP_STATUS_MASK);
 +
 +if (status == FC_MTIP_STATUS_ONLINE)
 +disp_status = online;
 +else if (status == FC_MTIP_STATUS_OFFLINE)
 +disp_status = offline;
 +else
 +disp_status = unknown;
 +
 +return snprintf(buf, PAGE_SIZE, %s\n, disp_status);
 
 You need to use scnprintf() instead of snprintf() here
 
 +static ssize_t cxlflash_show_lun_mode(struct device *dev,
 +  struct device_attribute *attr, char *buf)
 +{
 +struct Scsi_Host *shost = class_to_shost(dev);
 +struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
 +struct afu *afu = cxlflash-afu;
 +
 +return snprintf(buf, PAGE_SIZE, %u\n, afu-internal_lun);
 
 See above about snprintf()
 
 +static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
 +static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
 +static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
 +cxlflash_store_lun_mode);
 
 Please use DEVICE_ATTR_RO and DEVICE_ATTR_RW as appropriate if you can, you 
 will need to change the show/store function names. The main reason I know for 
 doing this is (using DEVICE_ATTR_RO as an example) if someone sees a sysfs 
 attribute called port0 or port1 they should be able to search the kernel 
 source to find a function called port0_show or port1_show without having to 
 spend any extra time searching to find out what the driver is and then look 
 at the driver source to find that they need to look at 
 cxlflash_show_port_status. Using DEVICE_ATTR_RO for port0 and port1 would 
 probably change the code to looking something like this:
 
 static ssize_t cxlflash_show_port_status(u32 port,
   struct afu *afu, char *buf)
 {
   char *disp_status;
   u64 status;
   volatile u64 *fc_regs;
 
   fc_regs = afu-afu_map-global.fc_regs[port][0];
   /* I split this line into two because I had to really look at where
* the brackets were and this made it more obvious to me
* what it was doing but that could just be me. */
   status = readq_be(fc_regs[FC_MTIP_STATUS / 8]);
   status = FC_MTIP_STATUS_MASK;
 
   if (status == FC_MTIP_STATUS_ONLINE)
   disp_status = online;
   else if (status == FC_MTIP_STATUS_OFFLINE)
   disp_status = offline;
   else
   disp_status = unknown;
 
   return scnprintf(buf, PAGE_SIZE, %s\n, disp_status);
 }
 
 Since you have fixed values you could use also sprintf here (although the 
 documentation currently says to only use scnprintf) and that would make 
 port0_show and port1_show to be:
 
 static ssize_t port0_show(struct device *dev,
   struct device_attribute *attr,
   char *buf)
 {
   struct Scsi_Host *shost = class_to_shost(dev);
   struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
   struct afu *afu = cxlflash-afu;
 
   return cxlflash_show_port_status(0, afu, buf);
 }
 
 static ssize_t port1_show(struct device *dev,
   struct device_attribute *attr,
   char *buf)
 {
   struct Scsi_Host *shost = class_to_shost(dev);
   struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata;
   struct afu *afu = cxlflash-afu;
 
   return cxlflash_show_port_status(1, afu, buf);
 }
 
 I'm assuming that 0 and 1 are always valid for port number and don't need to 
 be checked against NUM_FC_PORTS. That's just a suggestion and I haven't 
 attempted to compile the above example and you'd need to test it to make sure 
 that they still work as expected.
 
 Shane
 

--
To unsubscribe from 

Re: [PATCH 1/2] cxlflash: Base superpipe support

2015-07-02 Thread Michael Neuling
On Thu, 2015-07-02 at 17:00 +1000, Michael Neuling wrote:
  +struct dk_cxlflash_attach {
  +   struct dk_cxlflash_hdr hdr; /* Common fields */
  +   __u64 num_interrupts;   /* Requested number of interrupts */
  +   __u64 context_id;   /* Returned context */
 
 Matt,
 
 These ioctls seem to take a context id.  Why isn't that assumed from the
 fd?  Why get it again from userspace?

Sorry, ignore this.  I'm thinking about the wrong file descriptor.

Mikey
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv1] mpt2sas: setpci reset kernel oops fix

2015-07-02 Thread Nagarajkumar Narayanan
spinlock initialization modified as per comments, Could  anyone please
review and approve the patch




On Thu, Jun 25, 2015 at 4:24 PM, Nagarajkumar Narayanan
nagarajkumar.naraya...@seagate.com wrote:
 This patch contains the fix for kernel oops on issuing setpci reset
 along with sysfs, cli access

 spinlock initialization modified to DEFINE_SPINLOCK as per the
 comments for previous version of the patch


 From edf1c00ddcc9b045fd1527a23631f3d4b1eea599 Mon Sep 17 00:00:00 2001
 From: Nagarajkumar Narayanan nagarajkumar.naraya...@seagate.com
 Date: Thu, 25 Jun 2015 11:20:44 +0530
 Subject: [PATCH] mpt2sas setpci reset oops fix

 issuing setpci reset to nytro warpdrive card along with sysfs access and
 cli ioctl access resulted in kernel oops

 1. pci_access_mutex lock added to provide synchronization between IOCTL,
sysfs, PCI resource handling path

 2. gioc_lock spinlock to protect list operations over multiple controllers

 Signed-off-by: Nagarajkumar Narayanan nagarajkumar.naraya...@seagate.com
 ---
  drivers/scsi/mpt2sas/mpt2sas_base.c  |9 ++
  drivers/scsi/mpt2sas/mpt2sas_base.h  |   19 -
  drivers/scsi/mpt2sas/mpt2sas_ctl.c   |   48 +
  drivers/scsi/mpt2sas/mpt2sas_scsih.c |   16 ++-
  4 files changed, 84 insertions(+), 8 deletions(-)

 diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
 b/drivers/scsi/mpt2sas/mpt2sas_base.c
 index 11248de..9081f07 100644
 --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
 +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
 @@ -108,13 +108,17 @@ _scsih_set_fwfault_debug(const char *val, struct 
 kernel_param *kp)
  {
 int ret = param_set_int(val, kp);
 struct MPT2SAS_ADAPTER *ioc;
 +   unsigned long flags;

 if (ret)
 return ret;

 +   /* global ioc spinlock to protect controller list on list operations 
 */
 printk(KERN_INFO setting fwfault_debug(%d)\n, 
 mpt2sas_fwfault_debug);
 +   spin_lock_irqsave(gioc_lock, flags);
 list_for_each_entry(ioc, mpt2sas_ioc_list, list)
 ioc-fwfault_debug = mpt2sas_fwfault_debug;
 +   spin_unlock_irqrestore(gioc_lock, flags);
 return 0;
  }

 @@ -4436,6 +4440,9 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
 __func__));

 if (ioc-chip_phys  ioc-chip) {
 +   /* synchronizing freeing resource with pci_access_mutex lock 
 */
 +   if (ioc-is_warpdrive)
 +   mutex_lock(ioc-pci_access_mutex);
 _base_mask_interrupts(ioc);
 ioc-shost_recovery = 1;
 _base_make_ioc_ready(ioc, CAN_SLEEP, SOFT_RESET);
 @@ -4454,6 +4461,8 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
 pci_disable_pcie_error_reporting(pdev);
 pci_disable_device(pdev);
 }
 +   if (ioc-is_warpdrive)
 +   mutex_unlock(ioc-pci_access_mutex);
 return;
  }

 diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h 
 b/drivers/scsi/mpt2sas/mpt2sas_base.h
 index caff8d1..c82bdb3 100644
 --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
 +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
 @@ -799,6 +799,12 @@ typedef void (*MPT2SAS_FLUSH_RUNNING_CMDS)(struct 
 MPT2SAS_ADAPTER *ioc);
   * @delayed_tr_list: target reset link list
   * @delayed_tr_volume_list: volume target reset link list
   * @@temp_sensors_count: flag to carry the number of temperature sensors
 + * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
 + * pci resource handling. PCI resource freeing will lead to free
 + * vital hardware/memory resource, which might be in use by cli/sysfs
 + * path functions resulting in Null pointer reference followed by kernel
 + * crash. To avoid the above race condition we use mutex syncrhonization
 + * which ensures the syncrhonization between cli/sysfs_show path
   */
  struct MPT2SAS_ADAPTER {
 struct list_head list;
 @@ -1015,6 +1021,7 @@ struct MPT2SAS_ADAPTER {
 u8  mfg_pg10_hide_flag;
 u8  hide_drives;

 +   struct mutex pci_access_mutex;
  };

  typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 
 msix_index,
 @@ -1023,6 +1030,17 @@ typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER 
 *ioc, u16 smid, u8 msix_index,

  /* base shared API */
  extern struct list_head mpt2sas_ioc_list;
 +/* spinlock on list operations over IOCs
 + * Case: when multiple warpdrive cards(IOCs) are in use
 + * Each IOC will added to the ioc list stucture on initialization.
 + * Watchdog threads run at regular intervals to check IOC for any
 + * fault conditions which will trigger the dead_ioc thread to
 + * deallocate pci resource, resulting deleting the IOC netry from list,
 + * this deletion need to protected by spinlock to enusre that
 + * ioc removal is syncrhonized, if not synchronized it might lead to
 + * list_del corruption as the ioc list is traversed in cli path
 + */
 +extern spinlock_t 

Re: [PATCH] st: null pointer dereference panic caused by use after kref_put by st_open

2015-07-02 Thread Kai Mäkisara (Kolumbus)

 On 2.7.2015, at 15.01, Seymour, Shane M shane.seym...@hp.com wrote:
 
 
 Two SLES11 SP3 servers encountered similar crashes simultaneously
 following some kind of SAN/tape target issue:
 
...
 The crash is fixed by reordering the code so we no longer access
 the struct scsi_tape after the kref_put() is done on it in st_open().
 
 Signed-off-by: Shane Seymour shane.seym...@hp.com
 Signed-off-by: Darren Lavender darren.laven...@hp.com

Acked-by: Kai Mäkisara kai.makis...@kolumbus.fi

Thanks for finding this. No code should touch an object after put().

Thanks,
Kai

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] cxlflash: Base superpipe support

2015-07-02 Thread Matthew R. Ochs
Mikey,

Thanks for reviewing. Responses are inline below.


-matt

On Jul 2, 2015, at 1:39 AM, Michael Neuling wrote:
 On Fri, 2015-06-19 at 17:37 -0500, Matthew R. Ochs wrote:
 
 --- a/drivers/scsi/cxlflash/common.h
 +++ b/drivers/scsi/cxlflash/common.h
 @@ -103,6 +103,14 @@ struct cxlflash_cfg {
  struct pci_pool *cxlflash_cmd_pool;
  struct pci_dev *parent_dev;
 
 +spinlock_t ctx_tbl_slock;
 
 From the code, it's not clear to me what this lock is actually
 protecting.  Writes to the pointers can be atomic.  Are two pointers
 needed to be written atomically?
 
 So is it the contents of what it's pointing to?  That doesn't seem
 correct ether as the contents are written outside of the lock

So this lock is serving two purposes:

First, we use it to serialize updates to the table. There are cases (not 
included in
this patch) where we want to clear the entire table while avoiding updates to 
it.

Second, by using it to serialize the table in conjunction with an atomic run 
count,
it allows us to obtain a context reference without having to worry about a 
removal
thread yanking the memory from under us. When we return from get_context() with
a non-NULL reference, we are guaranteed that reference will not be freed until
that thread of execution completes (the run count [nrefs] in the context is 
atomically
decremented). 

 +#include sislite.h
 +#include common.h
 +#include superpipe.h
 +
 +struct cxlflash_global global;
 
 Can this be static?

Why yes it can...done!

 +/*
 + * Increment the reference count under lock so the context
 + * is not yanked from under us on a removal thread.
 + */
 +atomic_inc(ctx_info-nrefs);
 +spin_unlock_irqrestore(cfg-ctx_tbl_slock, flags);
 
 There is no memory barrier here to ensure ctx_info-pid has been
 flushed.

I don't think we need one. The pid is only set when the context is created,
thus, the user won't be trying to issue other ioctls using this context until
after a successful return from attach. 

 
 +ctxpid = ctx_info-pid;
 +if (likely(!(ctx_ctrl  CTX_CTRL_NOPID)))
 +if (pid != ctxpid)
 +goto denied;


 +
 +/* Free the context; note that rht_lun was allocated at same time */
 +kfree(ctx_info);
 +cfg-num_user_contexts--;
 
 This seems racey. If two people are calling the destroy ioctl at the
 same time they will race on updating cfg-num_user_contexts;

Sure, we can make this atomic.

 +ctx_info-pid = current-tgid; /* tgid = pid */
 +ctx_info-ctx = ctx;
 +INIT_LIST_HEAD(ctx_info-luns);
 +INIT_LIST_HEAD(ctx_info-luns);
 
 Why this twice?

Because I goofed. Will fix.

 
 
 +atomic_set(ctx_info-nrefs, 1);
 +
 +cfg-num_user_contexts++;
 
 Does this need to be atomic too?  Can't two people call this ioctl at
 the same time?

Yep.

 +spin_lock_irqsave(cfg-ctx_tbl_slock, flags);
 +cfg-ctx_tbl[ctxid] = NULL;
 +spin_unlock_irqrestore(cfg-ctx_tbl_slock, flags);
 
 Not sure you need a lock here.  This NULL write should be atomic.

Covered in earlier comments.

 +
 +/* Translate read/write O_* flags from fcntl.h to AFU permission bits */
 +perms = SISL_RHT_PERM(attach-hdr.flags + 1);
 +
 +ctx_info = create_context(cfg, ctx, ctxid, fd, perms);
 
 I don't see a memory barrier between this create context and the
 insertion in the cfg-ctx_tbl table.  It concerns me we could have a
 race when accessing it.  same on the read side 

Regarding race conditions with the slot in the array, we're protected so long
as your code doesn't have a bug in doling out process elements (context id). =)

Regarding a memory barrier, aren't we not covered implicitly because of access
being tucked under a spin lock? Or was this concern in the context of us not
using a lock (per your previous comments).

 
 +/*
 + * No error paths after this point. Once the fd is installed it's
 + * visible to user space and can't be undone safely on this thread.
 + */
 +list_add(lun_access-list, ctx_info-luns);
 +cfg-ctx_tbl[ctxid] = ctx_info;

To be consistent we should be locking here.

 +fd_install(fd, file);
 +
 +out_attach:
 +attach-hdr.return_flags = 0;
 +attach-context_id = ctx_info-ctxid;
 +attach-block_size = lun_info-blk_len;
 +attach-mmio_size = sizeof(afu-afu_map-hosts[0].harea);
 +attach-last_lba = lun_info-max_lba;
 +attach-max_xfer = (sdev-host-max_sectors * 512) / lun_info-blk_len;

 +/*
 + * No error paths after this point. Once the fd is installed it's
 + * visible to user space and can't be undone safely on this thread.
 + */
 +ctx_info-ctxid = ENCODE_CTXID(ctx_info, ctxid);
 +ctx_info-lfd = fd;
 +ctx_info-ctx = ctx;
 
 No memory barrier here.  Seem like we could race in get_context() when
 we dereference ctx_info;
 
 +cfg-ctx_tbl[ctxid] = ctx_info;

Same comment as in attach, I 

Re: [PATCH 0/6] Fixes for memory corruption in mpt2sas

2015-07-02 Thread Jens Axboe

On 05/14/2015 09:41 PM, Calvin Owens wrote:

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. Porting this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this for a few days on a big storage box that seemed to be
very susceptible to the panics, and so far it seems to have eliminated
them.


Guys, can someone outside of FB please review this? We're hitting random 
memory corruptions without these fixes.


--
Jens Axboe

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] libiscsi: Fix iscsi_check_transport_timeouts possible infinite loop

2015-07-02 Thread Mike Christie

On 6/30/15, 9:55 AM, Sagi Grimberg wrote:

From: Ariel Nahum ari...@mellanox.com

Connection last_ping is not being updated when iscsi_send_nopout fails.
Not updating the last_ping will cause firing a timer to a past time
(last_ping + ping_tmo  current_time) which triggers an infinite loop of
iscsi_check_transport_timeouts() and hogs the cpu.

Fix this issue by checking the return value of iscsi_send_nopout.
If it fails set the next_timeout to one second later.

Signed-off-by: Ariel Nahum ari...@mellanox.com
Signed-off-by: Sagi Grimberg sa...@mellanox.com
---
  drivers/scsi/libiscsi.c |   15 ++-
  1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 8053f24..1ea4213 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -979,13 +979,13 @@ static void iscsi_tmf_rsp(struct iscsi_conn *conn, struct 
iscsi_hdr *hdr)
wake_up(conn-ehwait);
  }

-static void iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin 
*rhdr)
+static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
  {
  struct iscsi_nopout hdr;
struct iscsi_task *task;

if (!rhdr  conn-ping_task)
-   return;
+   return -EINVAL;

memset(hdr, 0, sizeof(struct iscsi_nopout));
hdr.opcode = ISCSI_OP_NOOP_OUT | ISCSI_OP_IMMEDIATE;
@@ -999,13 +999,16 @@ static void iscsi_send_nopout(struct iscsi_conn *conn, 
struct iscsi_nopin *rhdr)
hdr.ttt = RESERVED_ITT;

task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)hdr, NULL, 0);
-   if (!task)
+   if (!task) {



Are you hitting the failure case in the first chunk of the patch or the 
failure right above? If the latter, why is it failing?



iscsi_conn_printk(KERN_ERR, conn, Could not send nopout\n);
+   return -EIO;
+   }
else if (!rhdr) {


I think the coding style is wrong. It should be:

if () {

} else if {

}


/* only track our nops */
conn-ping_task = task;
conn-last_ping = jiffies;
}
+   return 0;
  }

  static int iscsi_nop_out_rsp(struct iscsi_task *task,
@@ -2095,8 +2098,10 @@ static void iscsi_check_transport_timeouts(unsigned long 
data)
if (time_before_eq(last_recv + recv_timeout, jiffies)) {
/* send a ping to try to provoke some traffic */
ISCSI_DBG_CONN(conn, Sending nopout as ping\n);
-   iscsi_send_nopout(conn, NULL);
-   next_timeout = conn-last_ping + (conn-ping_timeout * HZ);
+   if (iscsi_send_nopout(conn, NULL))
+   next_timeout = jiffies + (1 * HZ);
+   else
+   next_timeout = conn-last_ping + (conn-ping_timeout * 
HZ);
} else
next_timeout = last_recv + recv_timeout;



n
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas

2015-07-02 Thread Bart Van Assche

On 06/08/2015 08:50 PM, Calvin Owens wrote:

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. Porting this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this on a handful of large storage boxes over the past few
weeks, so far it seems to have completely eliminated the memory
corruption panics.


If you have to repost this series please convert 
BUG_ON(!spin_is_locked(ioc-sas_device_lock)); into 
lockdep_is_held(...). Otherwise, for the whole series:


Reviewed-by: Bart Van Assche bart.vanass...@sandisk.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] libiscsi: Fix host busy blocking during connection teardown

2015-07-02 Thread Mike Christie

On 6/23/15, 8:11 PM, John Soni Jose wrote:

  Issue:
  In case of hw iscsi offload, an host can have N-number of active
  connections. There can be IO's running on some connections which
  make host-host_busy always TRUE. Now if logout from a connection
  is tried then the code gets into an infinite loop as host-host_busy
  is always TRUE.

  iscsi_conn_teardown()
  {
.
 /*
  * Block until all in-progress commands for this connection
  * time out or fail.
  */
  for (;;) {
   spin_lock_irqsave(session-host-host_lock, flags);
   if (!atomic_read(session-host-host_busy)) { /* OK for ERL == 0 */
  spin_unlock_irqrestore(session-host-host_lock, flags);
   break;
   }
  spin_unlock_irqrestore(session-host-host_lock, flags);
  msleep_interruptible(500);
  iscsi_conn_printk(KERN_INFO, conn, iscsi conn_destroy(): 
  host_busy %d host_failed %d\n,
  atomic_read(session-host-host_busy),
  session-host-host_failed);


...
  }
   }
   This is not an issue with software-iscsi/iser as each cxn is a separate
   host.

  Fix:
  Acquiring eh_mutex in iscsi_conn_teardown() before setting
  session-state = ISCSI_STATE_TERMINATE.

Signed-off-by: John Soni Jose sony.j...@aavagotech.com
---
  drivers/scsi/libiscsi.c | 25 ++---
  1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 8053f24..98d9bb6 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2941,10 +2941,10 @@ void iscsi_conn_teardown(struct iscsi_cls_conn 
*cls_conn)
  {
struct iscsi_conn *conn = cls_conn-dd_data;
struct iscsi_session *session = conn-session;
-   unsigned long flags;

del_timer_sync(conn-transport_timer);

+   mutex_lock(session-eh_mutex);
spin_lock_bh(session-frwd_lock);
conn-c_stage = ISCSI_CONN_CLEANUP_WAIT;
if (session-leadconn == conn) {
@@ -2956,28 +2956,6 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
}
spin_unlock_bh(session-frwd_lock);

-   /*
-* Block until all in-progress commands for this connection
-* time out or fail.
-*/
-   for (;;) {
-   spin_lock_irqsave(session-host-host_lock, flags);
-   if (!atomic_read(session-host-host_busy)) { /* OK for ERL == 
0 */
-   spin_unlock_irqrestore(session-host-host_lock, flags);
-   break;
-   }
-   spin_unlock_irqrestore(session-host-host_lock, flags);
-   msleep_interruptible(500);
-   iscsi_conn_printk(KERN_INFO, conn, iscsi conn_destroy(): 
- host_busy %d host_failed %d\n,
- atomic_read(session-host-host_busy),
- session-host-host_failed);
-   /*
-* force eh_abort() to unblock
-*/
-   wake_up(conn-ehwait);
-   }
-
/* flush queued up work because we free the connection below */
iscsi_suspend_tx(conn);

@@ -2994,6 +2972,7 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
if (session-leadconn == conn)
session-leadconn = NULL;
spin_unlock_bh(session-frwd_lock);
+   mutex_unlock(session-eh_mutex);

iscsi_destroy_conn(cls_conn);
  }


It looks ok.

Reviewed-by: Mike Christie micha...@cs.wisc.edu

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] we added changes in fnic driver patch 1.6.0.16 to acquire io_req_lock in fnic_queuecommand() before issuing I/O so that io completion is serialized. But when releasing the lock we check fo

2015-07-02 Thread Hiral Shah
Signed-off-by: Hiral Shah his...@cisco.com
Signed-off-by: Sesidhar Baddela sebad...@cisco.com
Signed-off-by: Anil Chintalapati achin...@cisco.com
---
 drivers/scsi/fnic/fnic.h  | 2 +-
 drivers/scsi/fnic/fnic_scsi.c | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index 26270c3..ce129e5 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME   fnic
 #define DRV_DESCRIPTIONCisco FCoE HBA Driver
-#define DRV_VERSION1.6.0.17
+#define DRV_VERSION1.6.0.17a
 #define PFXDRV_NAME : 
 #define DFX DRV_NAME %d: 
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 155b286..25436cd 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -425,6 +425,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void 
(*done)(struct scsi_
unsigned long ptr;
struct fc_rport_priv *rdata;
spinlock_t *io_lock = NULL;
+   int io_lock_acquired = 0;
 
if (unlikely(fnic_chk_state_flags_locked(fnic, FNIC_FLAGS_IO_BLOCKED)))
return SCSI_MLQUEUE_HOST_BUSY;
@@ -518,6 +519,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void 
(*done)(struct scsi_
spin_lock_irqsave(io_lock, flags);
 
/* initialize rest of io_req */
+   io_lock_acquired = 1;
io_req-port_id = rport-port_id;
io_req-start_time = jiffies;
CMD_STATE(sc) = FNIC_IOREQ_CMD_PENDING;
@@ -571,7 +573,7 @@ out:
  (((u64)CMD_FLAGS(sc)  32) | CMD_STATE(sc)));
 
/* if only we issued IO, will we have the io lock */
-   if (CMD_FLAGS(sc)  FNIC_IO_INITIALIZED)
+   if (io_lock_acquired)
spin_unlock_irqrestore(io_lock, flags);
 
atomic_dec(fnic-in_flight);
-- 
1.8.5.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html