RE: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter
-Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Matthew R. Ochs Sent: Monday, April 27, 2015 2:50 PM To: linux-scsi@vger.kernel.org; james.bottom...@hansenpartnership.com; n...@linux-iscsi.org; brk...@linux.vnet.ibm.com Cc: mi...@neuling.org; imun...@au1.ibm.com; Manoj N. Kumar Subject: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter +static ssize_t cxlflash_show_port_status(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct Scsi_Host *shost = class_to_shost(dev); + struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; + struct afu *afu = cxlflash-afu; + + char *disp_status; + int rc; + u32 port; + u64 status; + volatile u64 *fc_regs; + + rc = kstrtouint((attr-attr.name + 4), 10, port); + if (rc || (port NUM_FC_PORTS)) + return 0; + + fc_regs = afu-afu_map-global.fc_regs[port][0]; + status = + (readq_be(fc_regs[FC_MTIP_STATUS / 8]) FC_MTIP_STATUS_MASK); + + if (status == FC_MTIP_STATUS_ONLINE) + disp_status = online; + else if (status == FC_MTIP_STATUS_OFFLINE) + disp_status = offline; + else + disp_status = unknown; + + return snprintf(buf, PAGE_SIZE, %s\n, disp_status); You need to use scnprintf() instead of snprintf() here +static ssize_t cxlflash_show_lun_mode(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct Scsi_Host *shost = class_to_shost(dev); + struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; + struct afu *afu = cxlflash-afu; + + return snprintf(buf, PAGE_SIZE, %u\n, afu-internal_lun); See above about snprintf() +static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL); +static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL); +static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode, + cxlflash_store_lun_mode); Please use DEVICE_ATTR_RO and DEVICE_ATTR_RW as appropriate if you can, you will need to change the show/store function names. The main reason I know for doing this is (using DEVICE_ATTR_RO as an example) if someone sees a sysfs attribute called port0 or port1 they should be able to search the kernel source to find a function called port0_show or port1_show without having to spend any extra time searching to find out what the driver is and then look at the driver source to find that they need to look at cxlflash_show_port_status. Using DEVICE_ATTR_RO for port0 and port1 would probably change the code to looking something like this: static ssize_t cxlflash_show_port_status(u32 port, struct afu *afu, char *buf) { char *disp_status; u64 status; volatile u64 *fc_regs; fc_regs = afu-afu_map-global.fc_regs[port][0]; /* I split this line into two because I had to really look at where * the brackets were and this made it more obvious to me * what it was doing but that could just be me. */ status = readq_be(fc_regs[FC_MTIP_STATUS / 8]); status = FC_MTIP_STATUS_MASK; if (status == FC_MTIP_STATUS_ONLINE) disp_status = online; else if (status == FC_MTIP_STATUS_OFFLINE) disp_status = offline; else disp_status = unknown; return scnprintf(buf, PAGE_SIZE, %s\n, disp_status); } Since you have fixed values you could use also sprintf here (although the documentation currently says to only use scnprintf) and that would make port0_show and port1_show to be: static ssize_t port0_show(struct device *dev, struct device_attribute *attr, char *buf) { struct Scsi_Host *shost = class_to_shost(dev); struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; struct afu *afu = cxlflash-afu; return cxlflash_show_port_status(0, afu, buf); } static ssize_t port1_show(struct device *dev, struct device_attribute *attr, char *buf) { struct Scsi_Host *shost = class_to_shost(dev); struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; struct afu *afu = cxlflash-afu; return cxlflash_show_port_status(1, afu, buf); } I'm assuming that 0 and 1 are always valid for port number and don't need to be checked against NUM_FC_PORTS. That's just a suggestion and I haven't attempted to compile the above example and you'd need to test it to make sure that they still work as expected. Shane -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] cxlflash: Base superpipe support
On Fri, 2015-06-19 at 17:37 -0500, Matthew R. Ochs wrote: Add superpipe supporting infrastructure to device driver for the IBM CXL Flash adapter. This patch allows userspace applications to take advantage of the accelerated I/O features that this adapter provides and bypass the traditional filesystem stack. ... bypass the traditional filesystem stack. interesting :-) Other comments below. Signed-off-by: Matthew R. Ochs mro...@linux.vnet.ibm.com Signed-off-by: Manoj N. Kumar ma...@linux.vnet.ibm.com --- Documentation/powerpc/cxlflash.txt | 298 ++ drivers/scsi/cxlflash/Makefile |2 +- drivers/scsi/cxlflash/common.h | 18 + drivers/scsi/cxlflash/main.c | 12 + drivers/scsi/cxlflash/superpipe.c | 1856 drivers/scsi/cxlflash/superpipe.h | 210 include/uapi/scsi/cxlflash_ioctl.h | 159 +++ 7 files changed, 2554 insertions(+), 1 deletion(-) create mode 100644 Documentation/powerpc/cxlflash.txt create mode 100644 drivers/scsi/cxlflash/superpipe.c create mode 100644 drivers/scsi/cxlflash/superpipe.h create mode 100644 include/uapi/scsi/cxlflash_ioctl.h diff --git a/Documentation/powerpc/cxlflash.txt b/Documentation/powerpc/cxlflash.txt new file mode 100644 index 000..c4d3849 --- /dev/null +++ b/Documentation/powerpc/cxlflash.txt @@ -0,0 +1,298 @@ +Introduction + + +The IBM Power architecture provides support for CAPI (Coherent +Accelerator Power Interface), which is available to certain PCIe slots +on Power 8 systems. CAPI can be thought of as a special tunneling +protocol through PCIe that allow PCIe adapters to look like special +purpose co-processors which can read or write an application's +memory and generate page faults. As a result, the host interface to +an adapter running in CAPI mode does not require the data buffers to +be mapped to the device's memory (IOMMU bypass) nor does it require +memory to be pinned. + +On Linux, Coherent Accelerator (CXL) kernel services present CAPI +devices as a PCI device by implementing a virtual PCI host bridge. +This abstraction simplifies the infrastructure and programming +model, allowing for drivers to look similar to other native PCI +device drivers. + +CXL provides a mechanism by which user space applications can +directly talk to a device (network or storage) bypassing the typical +kernel/device driver stack. The CXL Flash Adapter Driver enables a +user space application direct access to Flash storage. + +The CXL Flash Adapter Driver is a kernel module that sits in the +SCSI stack as a low level device driver (below the SCSI disk and +protocol drivers) for the IBM CXL Flash Adapter. This driver is +responsible for the initialization of the adapter, setting up the +special path for user space access, and performing error recovery. It +communicates directly the Flash Accelerator Functional Unit (AFU) +as described in Documentation/powerpc/cxl.txt. + +The cxlflash driver supports two, mutually exclusive, modes of +operation at the device (LUN) level: + +- Any flash device (LUN) can be configured to be accessed as a + regular disk device (i.e.: /dev/sdc). This is the default mode. + +- Any flash device (LUN) can be configured to be accessed from + user space with a special block library. This mode further + specifies the means of accessing the device and provides for + either raw access to the entire LUN (referred to as direct + or physical LUN access) or access to a kernel/AFU-mediated + partition of the LUN (referred to as virtual LUN access). The + segmentation of a disk device into virtual LUNs is assisted + by special translation services provided by the Flash AFU. + +Overview + + +The Coherent Accelerator Interface Architecture (CAIA) introduces a +concept of a master context. A master typically has special privileges +granted to it by the kernel or hypervisor allowing it to perform AFU +wide management and control. The master may or may not be involved +directly in each user I/O, but at the minimum is involved in the +initial setup before the user application is allowed to send requests +directly to the AFU. + +The CXL Flash Adapter Driver establishes a master context with the +AFU. It uses memory mapped I/O (MMIO) for this control and setup. The +Adapter Problem Space Memory Map looks like this: + + +---+ + |512 * 64 KB User MMIO | + |(per context) | + | User Accessible | + +---+ + |512 * 128 B per context| +
Re: [PATCH 1/2] cxlflash: Base superpipe support
+struct dk_cxlflash_attach { + struct dk_cxlflash_hdr hdr; /* Common fields */ + __u64 num_interrupts; /* Requested number of interrupts */ + __u64 context_id; /* Returned context */ Matt, These ioctls seem to take a context id. Why isn't that assumed from the fd? Why get it again from userspace? Mikey -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] st: null pointer dereference panic caused by use after kref_put by st_open
Seymour, Shane M shane.seym...@hp.com writes: Two SLES11 SP3 servers encountered similar crashes simultaneously following some kind of SAN/tape target issue: Thanks, Reviewed-by: Johannes Thumshirn jthumsh...@suse.com -- Johannes Thumshirn Storage jthumsh...@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] st: null pointer dereference panic caused by use after kref_put by st_open
Two SLES11 SP3 servers encountered similar crashes simultaneously following some kind of SAN/tape target issue: ... qla2xxx [:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002. qla2xxx [:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002. qla2xxx [:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=882f89c2c7c0. qla2xxx [:81:00.0]-800c:3: do_reset failed for cmd=882f89c2c7c0. qla2xxx [:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=882f89c2c7c0. qla2xxx [:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=882f89c2c7c0. qla2xxx [:81:00.0]-800c:3: do_reset failed for cmd=882f89c2c7c0. qla2xxx [:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=882f89c2c7c0. qla2xxx [:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2. qla2xxx [:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2. qla2xxx [:81:00.0]-505f:3: Link is operational (8 Gbps). qla2xxx [:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2. qla2xxx [:81:00.0]-00af:3: Performing ISP error recovery - ha=88bf04d18000. rport-3:0-0: blocked FC remote port time out: removing target and saving binding qla2xxx [:81:00.0]-505f:3: Link is operational (8 Gbps). qla2xxx [:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2. rport-2:0-0: blocked FC remote port time out: removing target and saving binding sg_rq_end_io: device detached BUG: unable to handle kernel NULL pointer dereference at 02a8 IP: [8133b268] __pm_runtime_idle+0x28/0x90 PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP CPU 0 ... Supported: No, Proprietary modules are loaded [1739975.390463] Pid: 27965, comm: ABCD Tainted: PF X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8 RIP: 0010:[8133b268] [8133b268] __pm_runtime_idle+0x28/0x90 RSP: 0018:8839dc1e7c68 EFLAGS: 00010202 RAX: RBX: 883f0592fc00 RCX: 0090 RDX: RSI: 0004 RDI: 0138 RBP: 0138 R08: 0010 R09: 81bd39d0 R10: 09c0 R11: 81025790 R12: 0001 R13: 883022212b80 R14: 0004 R15: 883022212b80 FS: 7f8e54560720() GS:88407f80() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 02a8 CR3: 007e6ced6000 CR4: 001407f0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process ABCD (pid: 27965, threadinfo 8839dc1e6000, task 883592e0c640) Stack: 883f0592fc00 fffa 0001 883022212b80 883eff772400 a03fa309 a04003a0 883f063196c0 887f0379a930 8115ea1e Call Trace: [a03fa309] st_open+0x129/0x240 [st] [8115ea1e] chrdev_open+0x13e/0x200 [811588a8] __dentry_open+0x198/0x310 [81167d74] do_last+0x1f4/0x800 [81168fe9] path_openat+0xd9/0x420 [8116946c] do_filp_open+0x4c/0xc0 [8115a00f] do_sys_open+0x17f/0x250 [81468d92] system_call_fastpath+0x16/0x1b [7f8e4f617fd0] 0x7f8e4f617fcf Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 f0 ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0 RIP [8133b268] __pm_runtime_idle+0x28/0x90 RSP 8839dc1e7c68 CR2: 02a8 Analysis reveals the cause of the crash to be due to STp-device being NULL. The pointer was NULLed via scsi_tape_put(STp) when it calls scsi_tape_release(). In st_open() we jump to err_out after scsi_block_when_processing_errors() completes and returns the device as offline (sdev_state was SDEV_DEL): 1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host 1181module count. */ 1182 static int st_open(struct inode *inode, struct file *filp) 1183 { 1184 int i, retval = (-EIO); 1185 int resumed = 0; 1186 struct scsi_tape *STp; 1187 struct st_partstat *STps; 1188 int dev = TAPE_NR(inode); 1189 char *name; ... 1217 if (scsi_autopm_get_device(STp-device) 0) { 1218 retval = -EIO; 1219 goto err_out; 1220 } 1221 resumed = 1; 1222 if (!scsi_block_when_processing_errors(STp-device)) { 1223 retval = (-ENXIO); 1224 goto err_out; 1225 } ... 1264 err_out: 1265 normalize_buffer(STp-buffer); 1266 spin_lock(st_use_lock); 1267 STp-in_use = 0; 1268 spin_unlock(st_use_lock); 1269 scsi_tape_put(STp); -- STp-device = 0 after this 1270 if (resumed) 1271 scsi_autopm_put_device(STp-device); 1272 return retval; The ref count for the
Re: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter
Shane, All great suggestions. I've gone ahead and implemented these and also applied your comments to our other attributes as well. Will include in our next push of fixes. -matt On Jul 2, 2015, at 3:08 AM, Seymour, Shane M wrote: -Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Matthew R. Ochs Sent: Monday, April 27, 2015 2:50 PM To: linux-scsi@vger.kernel.org; james.bottom...@hansenpartnership.com; n...@linux-iscsi.org; brk...@linux.vnet.ibm.com Cc: mi...@neuling.org; imun...@au1.ibm.com; Manoj N. Kumar Subject: [PATCH RFC 1/2] cxlflash: Base support for IBM CXL Flash Adapter +static ssize_t cxlflash_show_port_status(struct device *dev, + struct device_attribute *attr, + char *buf) +{ +struct Scsi_Host *shost = class_to_shost(dev); +struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; +struct afu *afu = cxlflash-afu; + +char *disp_status; +int rc; +u32 port; +u64 status; +volatile u64 *fc_regs; + +rc = kstrtouint((attr-attr.name + 4), 10, port); +if (rc || (port NUM_FC_PORTS)) +return 0; + +fc_regs = afu-afu_map-global.fc_regs[port][0]; +status = +(readq_be(fc_regs[FC_MTIP_STATUS / 8]) FC_MTIP_STATUS_MASK); + +if (status == FC_MTIP_STATUS_ONLINE) +disp_status = online; +else if (status == FC_MTIP_STATUS_OFFLINE) +disp_status = offline; +else +disp_status = unknown; + +return snprintf(buf, PAGE_SIZE, %s\n, disp_status); You need to use scnprintf() instead of snprintf() here +static ssize_t cxlflash_show_lun_mode(struct device *dev, + struct device_attribute *attr, char *buf) +{ +struct Scsi_Host *shost = class_to_shost(dev); +struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; +struct afu *afu = cxlflash-afu; + +return snprintf(buf, PAGE_SIZE, %u\n, afu-internal_lun); See above about snprintf() +static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL); +static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL); +static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode, +cxlflash_store_lun_mode); Please use DEVICE_ATTR_RO and DEVICE_ATTR_RW as appropriate if you can, you will need to change the show/store function names. The main reason I know for doing this is (using DEVICE_ATTR_RO as an example) if someone sees a sysfs attribute called port0 or port1 they should be able to search the kernel source to find a function called port0_show or port1_show without having to spend any extra time searching to find out what the driver is and then look at the driver source to find that they need to look at cxlflash_show_port_status. Using DEVICE_ATTR_RO for port0 and port1 would probably change the code to looking something like this: static ssize_t cxlflash_show_port_status(u32 port, struct afu *afu, char *buf) { char *disp_status; u64 status; volatile u64 *fc_regs; fc_regs = afu-afu_map-global.fc_regs[port][0]; /* I split this line into two because I had to really look at where * the brackets were and this made it more obvious to me * what it was doing but that could just be me. */ status = readq_be(fc_regs[FC_MTIP_STATUS / 8]); status = FC_MTIP_STATUS_MASK; if (status == FC_MTIP_STATUS_ONLINE) disp_status = online; else if (status == FC_MTIP_STATUS_OFFLINE) disp_status = offline; else disp_status = unknown; return scnprintf(buf, PAGE_SIZE, %s\n, disp_status); } Since you have fixed values you could use also sprintf here (although the documentation currently says to only use scnprintf) and that would make port0_show and port1_show to be: static ssize_t port0_show(struct device *dev, struct device_attribute *attr, char *buf) { struct Scsi_Host *shost = class_to_shost(dev); struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; struct afu *afu = cxlflash-afu; return cxlflash_show_port_status(0, afu, buf); } static ssize_t port1_show(struct device *dev, struct device_attribute *attr, char *buf) { struct Scsi_Host *shost = class_to_shost(dev); struct cxlflash *cxlflash = (struct cxlflash *)shost-hostdata; struct afu *afu = cxlflash-afu; return cxlflash_show_port_status(1, afu, buf); } I'm assuming that 0 and 1 are always valid for port number and don't need to be checked against NUM_FC_PORTS. That's just a suggestion and I haven't attempted to compile the above example and you'd need to test it to make sure that they still work as expected. Shane -- To unsubscribe from
Re: [PATCH 1/2] cxlflash: Base superpipe support
On Thu, 2015-07-02 at 17:00 +1000, Michael Neuling wrote: +struct dk_cxlflash_attach { + struct dk_cxlflash_hdr hdr; /* Common fields */ + __u64 num_interrupts; /* Requested number of interrupts */ + __u64 context_id; /* Returned context */ Matt, These ioctls seem to take a context id. Why isn't that assumed from the fd? Why get it again from userspace? Sorry, ignore this. I'm thinking about the wrong file descriptor. Mikey -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv1] mpt2sas: setpci reset kernel oops fix
spinlock initialization modified as per comments, Could anyone please review and approve the patch On Thu, Jun 25, 2015 at 4:24 PM, Nagarajkumar Narayanan nagarajkumar.naraya...@seagate.com wrote: This patch contains the fix for kernel oops on issuing setpci reset along with sysfs, cli access spinlock initialization modified to DEFINE_SPINLOCK as per the comments for previous version of the patch From edf1c00ddcc9b045fd1527a23631f3d4b1eea599 Mon Sep 17 00:00:00 2001 From: Nagarajkumar Narayanan nagarajkumar.naraya...@seagate.com Date: Thu, 25 Jun 2015 11:20:44 +0530 Subject: [PATCH] mpt2sas setpci reset oops fix issuing setpci reset to nytro warpdrive card along with sysfs access and cli ioctl access resulted in kernel oops 1. pci_access_mutex lock added to provide synchronization between IOCTL, sysfs, PCI resource handling path 2. gioc_lock spinlock to protect list operations over multiple controllers Signed-off-by: Nagarajkumar Narayanan nagarajkumar.naraya...@seagate.com --- drivers/scsi/mpt2sas/mpt2sas_base.c |9 ++ drivers/scsi/mpt2sas/mpt2sas_base.h | 19 - drivers/scsi/mpt2sas/mpt2sas_ctl.c | 48 + drivers/scsi/mpt2sas/mpt2sas_scsih.c | 16 ++- 4 files changed, 84 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c index 11248de..9081f07 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.c +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c @@ -108,13 +108,17 @@ _scsih_set_fwfault_debug(const char *val, struct kernel_param *kp) { int ret = param_set_int(val, kp); struct MPT2SAS_ADAPTER *ioc; + unsigned long flags; if (ret) return ret; + /* global ioc spinlock to protect controller list on list operations */ printk(KERN_INFO setting fwfault_debug(%d)\n, mpt2sas_fwfault_debug); + spin_lock_irqsave(gioc_lock, flags); list_for_each_entry(ioc, mpt2sas_ioc_list, list) ioc-fwfault_debug = mpt2sas_fwfault_debug; + spin_unlock_irqrestore(gioc_lock, flags); return 0; } @@ -4436,6 +4440,9 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc) __func__)); if (ioc-chip_phys ioc-chip) { + /* synchronizing freeing resource with pci_access_mutex lock */ + if (ioc-is_warpdrive) + mutex_lock(ioc-pci_access_mutex); _base_mask_interrupts(ioc); ioc-shost_recovery = 1; _base_make_ioc_ready(ioc, CAN_SLEEP, SOFT_RESET); @@ -4454,6 +4461,8 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc) pci_disable_pcie_error_reporting(pdev); pci_disable_device(pdev); } + if (ioc-is_warpdrive) + mutex_unlock(ioc-pci_access_mutex); return; } diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h index caff8d1..c82bdb3 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.h +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h @@ -799,6 +799,12 @@ typedef void (*MPT2SAS_FLUSH_RUNNING_CMDS)(struct MPT2SAS_ADAPTER *ioc); * @delayed_tr_list: target reset link list * @delayed_tr_volume_list: volume target reset link list * @@temp_sensors_count: flag to carry the number of temperature sensors + * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and + * pci resource handling. PCI resource freeing will lead to free + * vital hardware/memory resource, which might be in use by cli/sysfs + * path functions resulting in Null pointer reference followed by kernel + * crash. To avoid the above race condition we use mutex syncrhonization + * which ensures the syncrhonization between cli/sysfs_show path */ struct MPT2SAS_ADAPTER { struct list_head list; @@ -1015,6 +1021,7 @@ struct MPT2SAS_ADAPTER { u8 mfg_pg10_hide_flag; u8 hide_drives; + struct mutex pci_access_mutex; }; typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, @@ -1023,6 +1030,17 @@ typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, /* base shared API */ extern struct list_head mpt2sas_ioc_list; +/* spinlock on list operations over IOCs + * Case: when multiple warpdrive cards(IOCs) are in use + * Each IOC will added to the ioc list stucture on initialization. + * Watchdog threads run at regular intervals to check IOC for any + * fault conditions which will trigger the dead_ioc thread to + * deallocate pci resource, resulting deleting the IOC netry from list, + * this deletion need to protected by spinlock to enusre that + * ioc removal is syncrhonized, if not synchronized it might lead to + * list_del corruption as the ioc list is traversed in cli path + */ +extern spinlock_t
Re: [PATCH] st: null pointer dereference panic caused by use after kref_put by st_open
On 2.7.2015, at 15.01, Seymour, Shane M shane.seym...@hp.com wrote: Two SLES11 SP3 servers encountered similar crashes simultaneously following some kind of SAN/tape target issue: ... The crash is fixed by reordering the code so we no longer access the struct scsi_tape after the kref_put() is done on it in st_open(). Signed-off-by: Shane Seymour shane.seym...@hp.com Signed-off-by: Darren Lavender darren.laven...@hp.com Acked-by: Kai Mäkisara kai.makis...@kolumbus.fi Thanks for finding this. No code should touch an object after put(). Thanks, Kai -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] cxlflash: Base superpipe support
Mikey, Thanks for reviewing. Responses are inline below. -matt On Jul 2, 2015, at 1:39 AM, Michael Neuling wrote: On Fri, 2015-06-19 at 17:37 -0500, Matthew R. Ochs wrote: --- a/drivers/scsi/cxlflash/common.h +++ b/drivers/scsi/cxlflash/common.h @@ -103,6 +103,14 @@ struct cxlflash_cfg { struct pci_pool *cxlflash_cmd_pool; struct pci_dev *parent_dev; +spinlock_t ctx_tbl_slock; From the code, it's not clear to me what this lock is actually protecting. Writes to the pointers can be atomic. Are two pointers needed to be written atomically? So is it the contents of what it's pointing to? That doesn't seem correct ether as the contents are written outside of the lock So this lock is serving two purposes: First, we use it to serialize updates to the table. There are cases (not included in this patch) where we want to clear the entire table while avoiding updates to it. Second, by using it to serialize the table in conjunction with an atomic run count, it allows us to obtain a context reference without having to worry about a removal thread yanking the memory from under us. When we return from get_context() with a non-NULL reference, we are guaranteed that reference will not be freed until that thread of execution completes (the run count [nrefs] in the context is atomically decremented). +#include sislite.h +#include common.h +#include superpipe.h + +struct cxlflash_global global; Can this be static? Why yes it can...done! +/* + * Increment the reference count under lock so the context + * is not yanked from under us on a removal thread. + */ +atomic_inc(ctx_info-nrefs); +spin_unlock_irqrestore(cfg-ctx_tbl_slock, flags); There is no memory barrier here to ensure ctx_info-pid has been flushed. I don't think we need one. The pid is only set when the context is created, thus, the user won't be trying to issue other ioctls using this context until after a successful return from attach. +ctxpid = ctx_info-pid; +if (likely(!(ctx_ctrl CTX_CTRL_NOPID))) +if (pid != ctxpid) +goto denied; + +/* Free the context; note that rht_lun was allocated at same time */ +kfree(ctx_info); +cfg-num_user_contexts--; This seems racey. If two people are calling the destroy ioctl at the same time they will race on updating cfg-num_user_contexts; Sure, we can make this atomic. +ctx_info-pid = current-tgid; /* tgid = pid */ +ctx_info-ctx = ctx; +INIT_LIST_HEAD(ctx_info-luns); +INIT_LIST_HEAD(ctx_info-luns); Why this twice? Because I goofed. Will fix. +atomic_set(ctx_info-nrefs, 1); + +cfg-num_user_contexts++; Does this need to be atomic too? Can't two people call this ioctl at the same time? Yep. +spin_lock_irqsave(cfg-ctx_tbl_slock, flags); +cfg-ctx_tbl[ctxid] = NULL; +spin_unlock_irqrestore(cfg-ctx_tbl_slock, flags); Not sure you need a lock here. This NULL write should be atomic. Covered in earlier comments. + +/* Translate read/write O_* flags from fcntl.h to AFU permission bits */ +perms = SISL_RHT_PERM(attach-hdr.flags + 1); + +ctx_info = create_context(cfg, ctx, ctxid, fd, perms); I don't see a memory barrier between this create context and the insertion in the cfg-ctx_tbl table. It concerns me we could have a race when accessing it. same on the read side Regarding race conditions with the slot in the array, we're protected so long as your code doesn't have a bug in doling out process elements (context id). =) Regarding a memory barrier, aren't we not covered implicitly because of access being tucked under a spin lock? Or was this concern in the context of us not using a lock (per your previous comments). +/* + * No error paths after this point. Once the fd is installed it's + * visible to user space and can't be undone safely on this thread. + */ +list_add(lun_access-list, ctx_info-luns); +cfg-ctx_tbl[ctxid] = ctx_info; To be consistent we should be locking here. +fd_install(fd, file); + +out_attach: +attach-hdr.return_flags = 0; +attach-context_id = ctx_info-ctxid; +attach-block_size = lun_info-blk_len; +attach-mmio_size = sizeof(afu-afu_map-hosts[0].harea); +attach-last_lba = lun_info-max_lba; +attach-max_xfer = (sdev-host-max_sectors * 512) / lun_info-blk_len; +/* + * No error paths after this point. Once the fd is installed it's + * visible to user space and can't be undone safely on this thread. + */ +ctx_info-ctxid = ENCODE_CTXID(ctx_info, ctxid); +ctx_info-lfd = fd; +ctx_info-ctx = ctx; No memory barrier here. Seem like we could race in get_context() when we dereference ctx_info; +cfg-ctx_tbl[ctxid] = ctx_info; Same comment as in attach, I
Re: [PATCH 0/6] Fixes for memory corruption in mpt2sas
On 05/14/2015 09:41 PM, Calvin Owens wrote: Hello all, This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. I will provide a similar set of fixes for mpt3sas, since we see similar issues there as well. Porting this to mpt3sas will be trivial since the part of the driver I'm touching is nearly identical between the two, so I thought it would be simpler to review a patch against mpt2sas alone at first. I've tested this for a few days on a big storage box that seemed to be very susceptible to the panics, and so far it seems to have eliminated them. Guys, can someone outside of FB please review this? We're hitting random memory corruptions without these fixes. -- Jens Axboe -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libiscsi: Fix iscsi_check_transport_timeouts possible infinite loop
On 6/30/15, 9:55 AM, Sagi Grimberg wrote: From: Ariel Nahum ari...@mellanox.com Connection last_ping is not being updated when iscsi_send_nopout fails. Not updating the last_ping will cause firing a timer to a past time (last_ping + ping_tmo current_time) which triggers an infinite loop of iscsi_check_transport_timeouts() and hogs the cpu. Fix this issue by checking the return value of iscsi_send_nopout. If it fails set the next_timeout to one second later. Signed-off-by: Ariel Nahum ari...@mellanox.com Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/scsi/libiscsi.c | 15 ++- 1 files changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 8053f24..1ea4213 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -979,13 +979,13 @@ static void iscsi_tmf_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr) wake_up(conn-ehwait); } -static void iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr) +static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr) { struct iscsi_nopout hdr; struct iscsi_task *task; if (!rhdr conn-ping_task) - return; + return -EINVAL; memset(hdr, 0, sizeof(struct iscsi_nopout)); hdr.opcode = ISCSI_OP_NOOP_OUT | ISCSI_OP_IMMEDIATE; @@ -999,13 +999,16 @@ static void iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr) hdr.ttt = RESERVED_ITT; task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)hdr, NULL, 0); - if (!task) + if (!task) { Are you hitting the failure case in the first chunk of the patch or the failure right above? If the latter, why is it failing? iscsi_conn_printk(KERN_ERR, conn, Could not send nopout\n); + return -EIO; + } else if (!rhdr) { I think the coding style is wrong. It should be: if () { } else if { } /* only track our nops */ conn-ping_task = task; conn-last_ping = jiffies; } + return 0; } static int iscsi_nop_out_rsp(struct iscsi_task *task, @@ -2095,8 +2098,10 @@ static void iscsi_check_transport_timeouts(unsigned long data) if (time_before_eq(last_recv + recv_timeout, jiffies)) { /* send a ping to try to provoke some traffic */ ISCSI_DBG_CONN(conn, Sending nopout as ping\n); - iscsi_send_nopout(conn, NULL); - next_timeout = conn-last_ping + (conn-ping_timeout * HZ); + if (iscsi_send_nopout(conn, NULL)) + next_timeout = jiffies + (1 * HZ); + else + next_timeout = conn-last_ping + (conn-ping_timeout * HZ); } else next_timeout = last_recv + recv_timeout; n -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas
On 06/08/2015 08:50 PM, Calvin Owens wrote: This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. I will provide a similar set of fixes for mpt3sas, since we see similar issues there as well. Porting this to mpt3sas will be trivial since the part of the driver I'm touching is nearly identical between the two, so I thought it would be simpler to review a patch against mpt2sas alone at first. I've tested this on a handful of large storage boxes over the past few weeks, so far it seems to have completely eliminated the memory corruption panics. If you have to repost this series please convert BUG_ON(!spin_is_locked(ioc-sas_device_lock)); into lockdep_is_held(...). Otherwise, for the whole series: Reviewed-by: Bart Van Assche bart.vanass...@sandisk.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libiscsi: Fix host busy blocking during connection teardown
On 6/23/15, 8:11 PM, John Soni Jose wrote: Issue: In case of hw iscsi offload, an host can have N-number of active connections. There can be IO's running on some connections which make host-host_busy always TRUE. Now if logout from a connection is tried then the code gets into an infinite loop as host-host_busy is always TRUE. iscsi_conn_teardown() { . /* * Block until all in-progress commands for this connection * time out or fail. */ for (;;) { spin_lock_irqsave(session-host-host_lock, flags); if (!atomic_read(session-host-host_busy)) { /* OK for ERL == 0 */ spin_unlock_irqrestore(session-host-host_lock, flags); break; } spin_unlock_irqrestore(session-host-host_lock, flags); msleep_interruptible(500); iscsi_conn_printk(KERN_INFO, conn, iscsi conn_destroy(): host_busy %d host_failed %d\n, atomic_read(session-host-host_busy), session-host-host_failed); ... } } This is not an issue with software-iscsi/iser as each cxn is a separate host. Fix: Acquiring eh_mutex in iscsi_conn_teardown() before setting session-state = ISCSI_STATE_TERMINATE. Signed-off-by: John Soni Jose sony.j...@aavagotech.com --- drivers/scsi/libiscsi.c | 25 ++--- 1 file changed, 2 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 8053f24..98d9bb6 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -2941,10 +2941,10 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn) { struct iscsi_conn *conn = cls_conn-dd_data; struct iscsi_session *session = conn-session; - unsigned long flags; del_timer_sync(conn-transport_timer); + mutex_lock(session-eh_mutex); spin_lock_bh(session-frwd_lock); conn-c_stage = ISCSI_CONN_CLEANUP_WAIT; if (session-leadconn == conn) { @@ -2956,28 +2956,6 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn) } spin_unlock_bh(session-frwd_lock); - /* -* Block until all in-progress commands for this connection -* time out or fail. -*/ - for (;;) { - spin_lock_irqsave(session-host-host_lock, flags); - if (!atomic_read(session-host-host_busy)) { /* OK for ERL == 0 */ - spin_unlock_irqrestore(session-host-host_lock, flags); - break; - } - spin_unlock_irqrestore(session-host-host_lock, flags); - msleep_interruptible(500); - iscsi_conn_printk(KERN_INFO, conn, iscsi conn_destroy(): - host_busy %d host_failed %d\n, - atomic_read(session-host-host_busy), - session-host-host_failed); - /* -* force eh_abort() to unblock -*/ - wake_up(conn-ehwait); - } - /* flush queued up work because we free the connection below */ iscsi_suspend_tx(conn); @@ -2994,6 +2972,7 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn) if (session-leadconn == conn) session-leadconn = NULL; spin_unlock_bh(session-frwd_lock); + mutex_unlock(session-eh_mutex); iscsi_destroy_conn(cls_conn); } It looks ok. Reviewed-by: Mike Christie micha...@cs.wisc.edu -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] we added changes in fnic driver patch 1.6.0.16 to acquire io_req_lock in fnic_queuecommand() before issuing I/O so that io completion is serialized. But when releasing the lock we check fo
Signed-off-by: Hiral Shah his...@cisco.com Signed-off-by: Sesidhar Baddela sebad...@cisco.com Signed-off-by: Anil Chintalapati achin...@cisco.com --- drivers/scsi/fnic/fnic.h | 2 +- drivers/scsi/fnic/fnic_scsi.c | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h index 26270c3..ce129e5 100644 --- a/drivers/scsi/fnic/fnic.h +++ b/drivers/scsi/fnic/fnic.h @@ -39,7 +39,7 @@ #define DRV_NAME fnic #define DRV_DESCRIPTIONCisco FCoE HBA Driver -#define DRV_VERSION1.6.0.17 +#define DRV_VERSION1.6.0.17a #define PFXDRV_NAME : #define DFX DRV_NAME %d: diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c index 155b286..25436cd 100644 --- a/drivers/scsi/fnic/fnic_scsi.c +++ b/drivers/scsi/fnic/fnic_scsi.c @@ -425,6 +425,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_ unsigned long ptr; struct fc_rport_priv *rdata; spinlock_t *io_lock = NULL; + int io_lock_acquired = 0; if (unlikely(fnic_chk_state_flags_locked(fnic, FNIC_FLAGS_IO_BLOCKED))) return SCSI_MLQUEUE_HOST_BUSY; @@ -518,6 +519,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_ spin_lock_irqsave(io_lock, flags); /* initialize rest of io_req */ + io_lock_acquired = 1; io_req-port_id = rport-port_id; io_req-start_time = jiffies; CMD_STATE(sc) = FNIC_IOREQ_CMD_PENDING; @@ -571,7 +573,7 @@ out: (((u64)CMD_FLAGS(sc) 32) | CMD_STATE(sc))); /* if only we issued IO, will we have the io lock */ - if (CMD_FLAGS(sc) FNIC_IO_INITIALIZED) + if (io_lock_acquired) spin_unlock_irqrestore(io_lock, flags); atomic_dec(fnic-in_flight); -- 1.8.5.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html