Re: Spurious DISK_EVENT_MEDIA_CHANGE on USB DVD hotplug?

2017-08-17 Thread Joe Lawrence
On 08/14/2017 02:30 PM, Tejun Heo wrote:
> Hello, Joe.
> 
> On Thu, Aug 10, 2017 at 10:45:54AM -0400, Joe Lawrence wrote:
>> In the case of my USB DVD -> laptop example, there is no media in my
>> device, however I still see the DISK_EVENT_MEDIA_CHANGE event.  This is
>> a bit confusing, and I was wondering if there was an explanation for
>> the following:
>>
>> drivers/scsi/sr.c :: sr_probe()
>>
>> disk->events = DISK_EVENT_MEDIA_CHANGE | DISK_EVENT_EJECT_REQUEST;
>> ...
>> cd->media_present = 1;
>>
>> DISK_EVENT_MEDIA_CHANGE events will pass through to userspace and
>> for some reason cd->media_present defaults to 1?  More on that below...
> 
> I don't have any concrete ideas but I think the only thing it's trying
> to do is to always generate at least one changed event no matter what.
> 
> ...
>> sr_check_events() compares the previous (in this case, default)
>> media_present value with what the TUR returns.  If it has changed, then
>> turn on the DISK_EVENT_MEDIA_CHANGE event bit.
>>
>> In my laptop USB DVD case, !scsi_status_is_good and sshdr.asc == 0x3a,
>> so last_present (1) and cd->media_present (0) mis-compare and the change
>> event is set.  That does not seem intuitive to me, is this a bug?
> 
> It's not incorrect.  We can try to change the behavior to avoid double
> notifications but given that this has been like this for a really long
> time and that it isn't technically incorrect, I'm not sure changing it
> is a good idea.  It might as well break other things.

Without a definition of DISK_EVENT_MEDIA_CHANGE or its udev
DISK_MEDIA_CHANGE counterpart it's kinda hard to say :)  But I agree
that changing this behavior could have inadvertent effects.

>> Bringing this back to the reported BMC case, which presumably does have
>> "media" present in the virtual device... is it reasonable to expect a
>> DISK_EVENT_MEDIA_CHANGE even for a new device that contains media?  (I
>> haven't verified, but in this case GET_EVENT_STATUS_NOTIFICATION might
>> be enough to set media present.)
> 
> Yeah, I think so.
> 
>> If there is documentation that explains DISK_EVENT_MEDIA_CHANGE conditions
>> somewhere, feel free to point me there.
> 
> AFAIK, there isn't any.  The only thing it tries to do is generating
> at least one event after media change.  Given that media is reported
> present after the last notification, I think userspace should be able
> to do the right thing (and must have been doing the right thing until
> recently).

I have no idea if udev or other userspace has changed in this respect,
or this is simply a timing window that this particular user has fallen
into.  This is a simulated device, so it might be fast/slower than any
real hardware.

Thanks for chiming in,

-- Joe



Spurious DISK_EVENT_MEDIA_CHANGE on USB DVD hotplug?

2017-08-10 Thread Joe Lawrence
Hi Tejun, Kay,

I'm investigating a customer report which manifests itself all the way
up in gnome-session when a BMC hotplug-adds a simulated DVD device.  The
user logs into their server's BMC and enables "media redirection", an
emulated DVD device + .iso is dynamically added to the bus... in the
past this has worked well, however, they are now noticing a timing
condition on RHEL7 that prevents gnome from successfully auto-mounting
the DVD media.

With Harald's help, I've done some debugging and we've found out that on
hotplug-add, the kernel sends two uevents (ADD, CHANGE) in short
succession.

(Example with an ordinary, physical USB DVD device on my laptop, is
very similar):

% udevadm monitor -k -e

  KERNEL[2409061.130338] add  
/devices/pci:00/:00:14.0/usb3/3-9/3-9.3/3-9.3:1.0/host20/target20:0:0/20:0:0:0/block/sr1
 (block)
  ACTION=add
  DEVNAME=/dev/sr1
  
DEVPATH=/devices/pci:00/:00:14.0/usb3/3-9/3-9.3/3-9.3:1.0/host20/target20:0:0/20:0:0:0/block/sr1
  DEVTYPE=disk
  MAJOR=11
  MINOR=1
  SEQNUM=5885
  SUBSYSTEM=block

  ...

  KERNEL[2409061.134076] change   
/devices/pci:00/:00:14.0/usb3/3-9/3-9.3/3-9.3:1.0/host20/target20:0:0/20:0:0:0/block/sr1
 (block)
  ACTION=change
  DEVNAME=/dev/sr1
  
DEVPATH=/devices/pci:00/:00:14.0/usb3/3-9/3-9.3/3-9.3:1.0/host20/target20:0:0/20:0:0:0/block/sr1
  DEVTYPE=disk
> DISK_MEDIA_CHANGE=1
  MAJOR=11
  MINOR=1
  SEQNUM=5889
  SUBSYSTEM=block

(Both of these events trigger a call out to the 'cdrom_id' userspace
program, the latter of which interferes with the gnome-session
auto-mounting feature.)

With a systemtap probe, I can also see that there are four userspace
openers of the cdrom when it is added:

  (parent)-> (child) : system-tap probe-point
---
> systemd-udevd(849)  -> systemd-udevd(6783) : 
> module("cdrom").function("cdrom_open@drivers/cdrom/cdrom.c:980")
  systemd-udevd(6783) -> cdrom_id(6791)  : 
module("cdrom").function("cdrom_open@drivers/cdrom/cdrom.c:980")
  systemd-udevd(849)  -> systemd-udevd(6783) : 
module("cdrom").function("cdrom_open@drivers/cdrom/cdrom.c:980")
  systemd-udevd(6783) -> cdrom_id(6794)  : 
module("cdrom").function("cdrom_open@drivers/cdrom/cdrom.c:980")

where on the first opener, the kernel eventually invokes
sr_mod::sr_check_events() and gets a DISK_EVENT_MEDIA_CHANGE return
code.

In the case of my USB DVD -> laptop example, there is no media in my
device, however I still see the DISK_EVENT_MEDIA_CHANGE event.  This is
a bit confusing, and I was wondering if there was an explanation for
the following:

drivers/scsi/sr.c :: sr_probe()

disk->events = DISK_EVENT_MEDIA_CHANGE | DISK_EVENT_EJECT_REQUEST;
...
cd->media_present = 1;

DISK_EVENT_MEDIA_CHANGE events will pass through to userspace and
for some reason cd->media_present defaults to 1?  More on that below...


drivers/scsi/sr.c :: sr_check_events()

...
do_tur:
/* let's see whether the media is there with TUR */
last_present = cd->media_present;
ret = scsi_test_unit_ready(cd->device, SR_TIMEOUT, MAX_RETRIES, &sshdr);

/*
 * Media is considered to be present if TUR succeeds or fails with
 * sense data indicating something other than media-not-present
 * (ASC 0x3a).
 */
cd->media_present = scsi_status_is_good(ret) ||
(scsi_sense_valid(&sshdr) && sshdr.asc != 0x3a);

if (last_present != cd->media_present)
cd->device->changed = 1;

if (cd->device->changed) {
events |= DISK_EVENT_MEDIA_CHANGE;
cd->device->changed = 0;
cd->tur_changed = true;
}
...

sr_check_events() compares the previous (in this case, default)
media_present value with what the TUR returns.  If it has changed, then
turn on the DISK_EVENT_MEDIA_CHANGE event bit.

In my laptop USB DVD case, !scsi_status_is_good and sshdr.asc == 0x3a,
so last_present (1) and cd->media_present (0) mis-compare and the change
event is set.  That does not seem intuitive to me, is this a bug?

Bringing this back to the reported BMC case, which presumably does have
"media" present in the virtual device... is it reasonable to expect a
DISK_EVENT_MEDIA_CHANGE even for a new device that contains media?  (I
haven't verified, but in this case GET_EVENT_STATUS_NOTIFICATION might
be enough to set media present.)

If there is documentation that explains DISK_EVENT_MEDIA_CHANGE conditions
somewhere, feel free to point me there.

Thanks,

-- Joe


[PATCH 2/2] mpt3sas - avoid mpt3sas_transport_port_add NULL parent_dev

2016-05-25 Thread Joe Lawrence
If _scsih_sas_host_add's call to mpt3sas_config_get_sas_iounit_pg0
fails, ioc->sas_hba.parent_dev may be left uninitialized.  A later
device probe could invoke mpt3sas_transport_port_add which will call
sas_port_alloc_num [scsi_transport_sas] with a NULL parent_dev pointer.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c 
b/drivers/scsi/mpt3sas/mpt3sas_transport.c
index 6a84b82d71bb..ff93286bc32f 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
@@ -705,6 +705,11 @@ mpt3sas_transport_port_add(struct MPT3SAS_ADAPTER *ioc, 
u16 handle,
goto out_fail;
}
 
+   if (!sas_node->parent_dev) {
+   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
+   ioc->name, __FILE__, __LINE__, __func__);
+   goto out_fail;
+   }
port = sas_port_alloc_num(sas_node->parent_dev);
if ((sas_port_add(port))) {
pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] mpt3sas - set num_phys after allocating phy[] space

2016-05-25 Thread Joe Lawrence
In _scsih_sas_host_add, the number of HBA phys are determined and then
later used to allocate an array of struct  _sas_phy's.  If the routine
sets ioc->sas_hba.num_phys, but then fails to allocate the
ioc->sas_hba.phy array (by kcalloc error or other intermediate
error/exit path), ioc->sas_hba is left in a dangerous state: all readers
of ioc->sas_hba.phy[] do so by indexing it from 0..ioc->sas_hba.num_phys
without checking that the space was ever allocated.

Modify _scsih_sas_host_add to set ioc->sas_hba.num_phys only after
successfully allocating ioc->sas_hba.phy[].

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index f2139e5604a3..6e36d20c9e0b 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4893,13 +4893,22 @@ _scsih_sas_host_add(struct MPT3SAS_ADAPTER *ioc)
u16 ioc_status;
u16 sz;
u8 device_missing_delay;
+   u8 num_phys;
 
-   mpt3sas_config_get_number_hba_phys(ioc, &ioc->sas_hba.num_phys);
-   if (!ioc->sas_hba.num_phys) {
+   mpt3sas_config_get_number_hba_phys(ioc, &num_phys);
+   if (!num_phys) {
pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
return;
}
+   ioc->sas_hba.phy = kcalloc(num_phys,
+   sizeof(struct _sas_phy), GFP_KERNEL);
+   if (!ioc->sas_hba.phy) {
+   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
+   ioc->name, __FILE__, __LINE__, __func__);
+   goto out;
+   }
+   ioc->sas_hba.num_phys = num_phys;
 
/* sas_iounit page 0 */
sz = offsetof(Mpi2SasIOUnitPage0_t, PhyData) + (ioc->sas_hba.num_phys *
@@ -4959,13 +4968,6 @@ _scsih_sas_host_add(struct MPT3SAS_ADAPTER *ioc)
MPI2_SASIOUNIT1_REPORT_MISSING_TIMEOUT_MASK;
 
ioc->sas_hba.parent_dev = &ioc->shost->shost_gendev;
-   ioc->sas_hba.phy = kcalloc(ioc->sas_hba.num_phys,
-   sizeof(struct _sas_phy), GFP_KERNEL);
-   if (!ioc->sas_hba.phy) {
-   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
-   ioc->name, __FILE__, __LINE__, __func__);
-   goto out;
-   }
for (i = 0; i < ioc->sas_hba.num_phys ; i++) {
if ((mpt3sas_config_get_phy_pg0(ioc, &mpi_reply, &phy_pg0,
i))) {
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] _scsih_sas_host_add early exits can crash system

2016-05-25 Thread Joe Lawrence
There are many error paths in _scsih_sas_host_add that lead to an early
exit and a few that leave IOC resources uninitialized, setting the stage
for a later crash.

This can be emulated using a systemtap script like:

  % stap -g -e \
  'probe module("mpt3sas").function("mpt3sas_config_get_sas_iounit_pg0").return 
{ $return = -1 }'

to force early exit, while remove/re-adding an MPT3 adapter:

  % lspci -D | grep MPT
  :54:00.0 Mass storage controller: LSI Logic / Symbios Logic SAS3008 
PCI-Express Fusion-MPT SAS-3 (rev 02)

  % SYSFS=$(find /sys/devices -name :54:00.0)
  % SYSFS_PARENT=$(dirname $SYSFS)

  % echo 1 > $SYSFS/remove
  % sleep 1m
  % echo 1 > $SYSFS_PARENT/rescan

These two patches fix:
  1) referencing unallocated ioc->sas_hba.phy[] space
  2) passing a NULL ioc->sas_hba.parent_dev to the scsi_transport_sas
 layer.

Note: these changes don't improve or retry adapter initialization, but
  only try to prevent the system from crashing

Joe Lawrence (2):
  mpt3sas - set num_phys after allocating phy[] space
  mpt3sas - avoid mpt3sas_transport_port_add NULL parent_dev

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 20 +++-
 drivers/scsi/mpt3sas/mpt3sas_transport.c |  5 +
 2 files changed, 16 insertions(+), 9 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Fix a bdi reregistration race, v3

2016-05-09 Thread Joe Lawrence
On 05/05/2016 04:40 PM, Joe Lawrence wrote:
> On 05/05/2016 03:58 PM, Bart Van Assche wrote:
>> On 03/28/2016 02:29 PM, Bart Van Assche wrote:
>>> Avoid that the sd driver registers a BDI device with a name that
>>> is still in use. This patch avoids that the following warning gets
>>> triggered:
>>>
>>> [ ... ]
>>
>> (replying to my own e-mail)
>>
>> If anyone could review this patch that would be very welcome.
> 
> Hi Bart,
> 
> I *think* I may be hitting this same problem running some tests here at 
> Stratus
> ... snip...


Hi Bart,

Good news = With your v3 patch, I didn't see the "sysfs: cannot create
duplicate filename '/devices/virtual/bdi/65:0'" warning during my
weekend testing (573 surprise disk HBA removals).

Bad news = I still crashed in add_disk > sysfs_create_link >
sysfs_do_create_link_sd on a NULL target_kobj->sd ... unfortunately I
don't have kdump working, so all I have is a serial console output to
work with for now.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Fix a bdi reregistration race, v3

2016-05-05 Thread Joe Lawrence
On 05/05/2016 03:58 PM, Bart Van Assche wrote:
> On 03/28/2016 02:29 PM, Bart Van Assche wrote:
>> Avoid that the sd driver registers a BDI device with a name that
>> is still in use. This patch avoids that the following warning gets
>> triggered:
>>
>> [ ... ]
> 
> (replying to my own e-mail)
> 
> If anyone could review this patch that would be very welcome.

Hi Bart,

I *think* I may be hitting this same problem running some tests here at Stratus 
-- I've got slub_debugging and other kernel debugging turned on while 
repeatedly hotplugging controllers at various timings during probe.  It takes 
hours to hit, but I've seen it twice in the past week.  I've got some other 
testing on my plate at the moment, but when I get a chance I can give your v3 a 
spin.

Here's an example (sorry for the wide width) ...

[ cut here ]
WARNING: CPU: 5 PID: 30702 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
sysfs: cannot create duplicate filename '/devices/virtual/bdi/65:0'
Modules linked in: btrfs xor raid6_pq msdos ext4 jbd2 mbcache matroxfb(OE) 
ccmod(POE) ftmod(OE) videosw(OE) ipmi_devintf fuse xt_CHECKSUM ipt_MASQUERADE 
nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_conntrack bonding ebtable_nat ebtable_broute bridge stp llc 
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter 
ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter vfat 
fat x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul 
crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul 
ablk_helper cryptd dm_service_time pcspkr ses enclosure sg wmi nfsd auth_rpcgss 
nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c raid1 sr_mod 
cdrom crc32c_intel qla2xxx(OE) scsi_transport_fc ixgbe(OE) mdio igb(OE) ptp 
pps_core mpt3sas(OE) 
raid_class i2c_algo_bit sra_sense(OE) i2c_core sd_mod(OE) scsi_transport_sas 
dca scsi_hbas(OE) fjes ipmi_msghandler usb_storage dm_mirror dm_region_hash 
dm_log dm_mod [last unloaded: ipmi_devintf]
CPU: 5 PID: 30702 Comm: kworker/u97:7 Tainted: PW  OE   4.6.0-rc6+ #37
Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.1:64 02/03/2016
Workqueue: events_unbound async_run_entry_fn
 0286 e136f6f8 88038ae03a50 8134359f
 88038ae03aa0  88038ae03a90 8108af31
 001f5bdb0398 880067e0e7e8 88058f92def0 88084bd2c008
Call Trace:
 [] dump_stack+0x63/0x84
 [] __warn+0xd1/0xf0
 [] warn_slowpath_fmt+0x5f/0x80
 [] sysfs_warn_dup+0x64/0x80
 [] sysfs_create_dir_ns+0x7e/0x90
 [] kobject_add_internal+0xaa/0x320
 [] ? device_private_init+0x23/0x70
 [] kobject_add+0x75/0xd0
 [] ? mutex_lock+0x12/0x2f
 [] device_add+0x125/0x610
 [] device_create_groups_vargs+0xd8/0x100
 [] device_create_vargs+0x1c/0x20
 [] bdi_register+0x8c/0x180
 [] bdi_register_dev+0x27/0x30
 [] add_disk+0x17f/0x4a0
 [] ? update_autosuspend+0x55/0x60
 [] ? __pm_runtime_use_autosuspend+0x54/0x70
 [] sd_probe_async+0x115/0x1d0 [sd_mod]
 [] async_run_entry_fn+0x4a/0x140
 [] process_one_work+0x16e/0x420
 [] worker_thread+0x125/0x4b0
 [] ? __schedule+0x2ad/0x8a0
 [] ? rescuer_thread+0x380/0x380
 [] kthread+0xd8/0xf0
 [] ret_from_fork+0x22/0x40
 [] ? kthread_park+0x60/0x60
---[ end trace 353add4e78cb2a75 ]---
[ cut here ]
WARNING: CPU: 5 PID: 30702 at lib/kobject.c:240 kobject_add_internal+0x262/0x320
kobject_add_internal failed for 65:0 with -EEXIST, don't try to register things 
with the same name in the same directory.
Modules linked in: btrfs xor raid6_pq msdos ext4 jbd2 mbcache matroxfb(OE) 
ccmod(POE) ftmod(OE) videosw(OE) ipmi_devintf fuse xt_CHECKSUM ipt_MASQUERADE 
nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_conntrack bonding ebtable_nat ebtable_broute bridge stp llc 
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter 
ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter vfat 
fat x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul 
crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul 
ablk_helper cryptd dm_service_time pcspkr ses enclosure sg wmi nfsd auth_rpcgss 
nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c raid1 sr_mod 
cdrom crc32c_intel qla2xxx(OE) scsi_transport_fc ixgbe(OE) mdio igb(OE) ptp 
pps_core mpt3sas(OE) 
raid_class i2c_algo_bit sra_sense(OE) i2c_core sd_mod(OE) scsi_transport_sas 
dca scsi_hbas(OE) fjes ipmi_msghandler usb_storage dm_mirror dm_region_hash 
dm_log dm_mod [last unloaded: ipmi_devintf]
CPU: 5 PID: 30702 Comm: kworker/u97:7 Tainted: PW  OE   4.

[PATCH v2] mpt3sas - remove unused fw_event_work elements

2016-04-18 Thread Joe Lawrence
Firmware events are queued up using the fw_event_work's struct work, not
its delayed_work member.  The initial driver for SAS2 controllers had
handled firmware reset using the rescan barrier and was later redesigned
through "mpt2sas: [Resend] Host Reset code cleanup".  The delayed_work
variables are now unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS
"assert_init not available" false warnings in
_scsih_fw_event_cleanup_queue.

Cleanup fw_event_work's unused entries, update it's kerneldoc, and update
_scsih_fw_event_cleanup_queue accordingly.

Fixes: 146b16c8071f (mpt3sas: Refcount fw_events and fix unsafe list usage)
Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e0e4920d0fa6..f2139e5604a3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -174,13 +174,13 @@ struct sense_info {
  * struct fw_event_work - firmware event struct
  * @list: link list framework
  * @work: work object (ioc->fault_reset_work_q)
- * @cancel_pending_work: flag set during reset handling
  * @ioc: per adapter object
  * @device_handle: device handle
  * @VF_ID: virtual function id
  * @VP_ID: virtual port id
  * @ignore: flag meaning this event has been marked to ignore
- * @event: firmware event MPI2_EVENT_XXX defined in mpt2_ioc.h
+ * @event: firmware event MPI2_EVENT_XXX defined in mpi2_ioc.h
+ * @refcount: kref for this event
  * @event_data: reply event data payload follows
  *
  * This object stored on ioc->fw_event_list.
@@ -188,8 +188,6 @@ struct sense_info {
 struct fw_event_work {
struct list_headlist;
struct work_struct  work;
-   u8  cancel_pending_work;
-   struct delayed_work delayed_work;
 
struct MPT3SAS_ADAPTER *ioc;
u16 device_handle;
@@ -2804,12 +2802,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER 
*ioc)
/*
 * Wait on the fw_event to complete. If this returns 1, then
 * the event was never executed, and we need a put for the
-* reference the delayed_work had on the fw_event.
+* reference the work had on the fw_event.
 *
 * If it did execute, we wait for it to finish, and the put will
 * happen from _firmware_event_work()
 */
-   if (cancel_delayed_work_sync(&fw_event->delayed_work))
+   if (cancel_work_sync(&fw_event->work))
fw_event_work_put(fw_event);
 
fw_event_work_put(fw_event);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

2016-04-11 Thread Joe Lawrence
Hi Chaitra, while discussing this patch with Ewan, I realized that 
cancel_pending_work is unused as well.  I can send a v2 with that and 
the associated kerneldoc update.


Do we know why f1c35e6aea579 "mpt2sas: RESCAN Barrier work is added in 
case of HBA reset" was unneeded for the mpt3 version?  If that is 
interesting, that info could be added to v2 commit message as well.


Thanks,

-- Joe


On 04/11/2016 07:13 AM, Chaitra Basappa wrote:

Hi,
  Please consider this patch as Ack-by: Chaitra P B


Thanks,
  Chaitra

-Original Message-
From: Sathya Prakash [mailto:sathya.prak...@broadcom.com]
Sent: Saturday, April 02, 2016 1:45 AM
To: emi...@redhat.com; Joe Lawrence
Cc: linux-scsi@vger.kernel.org; Chaitra Basappa; Suganath Prabu Subramani;
Calvin Owens
Subject: RE: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

We will look into this early next week and provide a detailed response.  On
the first look this is ACK from Broadcom, will reconfirm.

-Original Message-
From: Ewan D. Milne [mailto:emi...@redhat.com]
Sent: Friday, April 01, 2016 2:04 PM
To: Joe Lawrence
Cc: linux-scsi@vger.kernel.org; Sathya Prakash; Chaitra P B; Suganath Prabu
Subramani; Calvin Owens
Subject: Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

On Fri, 2016-04-01 at 15:13 -0400, Joe Lawrence wrote:

On 04/01/2016 02:51 PM, Ewan D. Milne wrote:

On Fri, 2016-04-01 at 13:56 -0400, Joe Lawrence wrote:

@@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct
MPT3SAS_ADAPTER *ioc)
/*
 * Wait on the fw_event to complete. If this returns 1, then
 * the event was never executed, and we need a put for the
-* reference the delayed_work had on the fw_event.
+* reference the work had on the fw_event.
 *
 * If it did execute, we wait for it to finish, and the put will
 * happen from _firmware_event_work()
 */
-   if (cancel_delayed_work_sync(&fw_event->delayed_work))
+   if (cancel_work_sync(&fw_event->work))
fw_event_work_put(fw_event);

fw_event_work_put(fw_event);


Hmm... Fixes: 146b16c8 (mpt3sas: Refcount fw_events and fix unsafe
list usage)


This could technically go back to f92363d12359 (mpt3sas: add new
driver supporting 12GB SAS) ...  but will probably only apply cleanly
to _scsih_fw_event_cleanup_queue after 146b16c8 (mpt3sas: Refcount
fw_events and fix unsafe list usage), so you're right.


Since mpt3sas uses ->work instead of _delayed_work this would seem
to be correct, however the deprecated mpt2sas driver had a commit
that changed the firmware event work mechanism to use ->delayed_work
instead of ->work:

commit f1c35e6aea579d5bdb6dc02dfa99c67c7c3b3f67
Author: Kashyap, Desai 
Date:   Tue Mar 9 16:31:43 2010 +0530


Okay, so this is pre-mpt3sas split.


 [SCSI] mpt2sas: RESCAN Barrier work is added in case of HBA reset.

 Add the cancel_pending_work flag from the fw_event_work
structure, and then to
 set the flag during host reset, check the flag later from work
threads
 context and if cancel_pending_work_flag is set ingore those events.


More unused mpt2 vestiges in the mpt3 version?

% cd drivers/scsi/mpt3sas/
% grep 'cancel_pending_work' *.{c,h}
mpt3sas_scsih.c: * @cancel_pending_work: flag set during reset handling
mpt3sas_scsih.c:u8  cancel_pending_work;


 Now Rescan after host reset is changed.
 Added special task MPT2SAS_RESCAN_AFTER_HOST_RESET. This task
will be queued
 at the time of HBA reset. this task is treated as barrier. All
work after
 MPT2SAS_RESCAN_AFTER_HOST_RESET will be treated as new work and
will be
 server by callback handle. If host_recovery is going on while
running RESCAN
 task, it will wait for shos_recovery_done completion which will
be called
 from HBA reset DONE context.


FWIW, I don't see anything like this in today's mpt3sas driver.


Well, that's the question.  Is there some functionality missing?  Were the
changes abandoned/replaced?  mpt2sas used delayed_work for something else,
so maybe that's why the firmware event changes initially used it (albeit
with a 0 delay) but it's hard to know...

cancel_delayed_work() will call del_timer() on delayed_work->timer, but it
looks like kzalloc is used to allocate the fw_event_work objects so perhaps
nothing bad will happen.  I was wondering, though, because I have seen dumps
of hung systems with requests that should have timed out but are not on any
timer list.

-Ewan



Regards,

-- Joe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

2016-04-01 Thread Joe Lawrence
On 04/01/2016 02:51 PM, Ewan D. Milne wrote:
> On Fri, 2016-04-01 at 13:56 -0400, Joe Lawrence wrote:
>> @@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER 
>> *ioc)
>>  /*
>>   * Wait on the fw_event to complete. If this returns 1, then
>>   * the event was never executed, and we need a put for the
>> - * reference the delayed_work had on the fw_event.
>> + * reference the work had on the fw_event.
>>   *
>>   * If it did execute, we wait for it to finish, and the put will
>>   * happen from _firmware_event_work()
>>   */
>> -if (cancel_delayed_work_sync(&fw_event->delayed_work))
>> +if (cancel_work_sync(&fw_event->work))
>>  fw_event_work_put(fw_event);
>>  
>>  fw_event_work_put(fw_event);
> 
> Hmm... Fixes: 146b16c8 (mpt3sas: Refcount fw_events and fix unsafe list usage)

This could technically go back to f92363d12359 (mpt3sas: add new driver
supporting 12GB SAS) ...  but will probably only apply cleanly to
_scsih_fw_event_cleanup_queue after 146b16c8 (mpt3sas: Refcount
fw_events and fix unsafe list usage), so you're right.

> Since mpt3sas uses ->work instead of _delayed_work this would seem to be
> correct, however the deprecated mpt2sas driver had a commit that changed
> the firmware event work mechanism to use ->delayed_work instead of ->work:
> 
> commit f1c35e6aea579d5bdb6dc02dfa99c67c7c3b3f67
> Author: Kashyap, Desai 
> Date:   Tue Mar 9 16:31:43 2010 +0530

Okay, so this is pre-mpt3sas split.

> [SCSI] mpt2sas: RESCAN Barrier work is added in case of HBA reset.
> 
> Add the cancel_pending_work flag from the fw_event_work structure, and 
> then to
> set the flag during host reset, check the flag later from work threads
> context and if cancel_pending_work_flag is set ingore those events.

More unused mpt2 vestiges in the mpt3 version?

% cd drivers/scsi/mpt3sas/
% grep 'cancel_pending_work' *.{c,h}
mpt3sas_scsih.c: * @cancel_pending_work: flag set during reset handling
mpt3sas_scsih.c:u8  cancel_pending_work;

> Now Rescan after host reset is changed.
> Added special task MPT2SAS_RESCAN_AFTER_HOST_RESET. This task will be 
> queued
> at the time of HBA reset. this task is treated as barrier. All work after
> MPT2SAS_RESCAN_AFTER_HOST_RESET will be treated as new work and will be
> server by callback handle. If host_recovery is going on while running 
> RESCAN
> task, it will wait for shos_recovery_done completion which will be called
> from HBA reset DONE context.

FWIW, I don't see anything like this in today's mpt3sas driver.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] mpt3sas - remove unused fw_event_work delayed_work

2016-04-01 Thread Joe Lawrence
The driver's fw events are queued up using the the fw_event_work's
struct work, not its delayed_work member.  The latter appears to be
unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS "assert_init not
available" false warnings in _scsih_fw_event_cleanup_queue.  Remove it
and update _scsih_fw_event_cleanup_queue accordingly.

Signed-off-by: Joe Lawrence 
---

I think this goes all the way back to the introduction of the mpt3sas
driver.  The previous generation mpt2sas driver uses delayed_work, so
perhaps it was simply copied and pasted into the mpt3sas but never
updated.

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e0e4920d0fa6..67643602efbc 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -189,7 +189,6 @@ struct fw_event_work {
struct list_headlist;
struct work_struct  work;
u8  cancel_pending_work;
-   struct delayed_work delayed_work;
 
struct MPT3SAS_ADAPTER *ioc;
u16 device_handle;
@@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER 
*ioc)
/*
 * Wait on the fw_event to complete. If this returns 1, then
 * the event was never executed, and we need a put for the
-* reference the delayed_work had on the fw_event.
+* reference the work had on the fw_event.
 *
 * If it did execute, we wait for it to finish, and the put will
 * happen from _firmware_event_work()
 */
-   if (cancel_delayed_work_sync(&fw_event->delayed_work))
+   if (cancel_work_sync(&fw_event->work))
fw_event_work_put(fw_event);
 
fw_event_work_put(fw_event);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-13 Thread Joe Lawrence
On 07/12/2015 12:24 AM, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
> 
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
> 
> Cc: Christoph Hellwig 
> Cc: Bart Van Assche 
> Signed-off-by: Calvin Owens 
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h  |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 
> ---
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 315 insertions(+), 153 deletions(-)

[ ... snip ... ]

> @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
>   }
>  
>   spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>  sas_device_priv_data->sas_target->sas_address);
>   if (!sas_device) {
>   spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
>   if (!ssp_target)
>   _scsih_display_sata_capabilities(ioc, handle, sdev);
>  
> -
>   _scsih_change_queue_depth(sdev, qdepth);
>  
>   if (ssp_target) {
>   sas_read_port_mode_page(sdev);
>   _scsih_enable_tlr(ioc, sdev);
>   }
> +
> + sas_device_put(sas_device);
>   return 0;
>  }

Hi Calvin,

Any reason why this sas_device_put is placed outside the sas_device
lock?  Most other instances in this patch were called just before unlocking.

BTW I attempted testing, but needed to port to mpt3 and ended up with a
driver that didn't boot :(   Hopefully I can retry later this week, or
find an older mpt2 box lying around.

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [RESEND] qla2xxx: prevent board_disable from running during EEH

2015-06-26 Thread Joe Lawrence
On 06/26/2015 12:33 PM, Mauricio Faria de Oliveira wrote:
> Commit f3ddac1918fe963bcbf8d407a3a3c0881b47248b ("[SCSI] qla2xxx:
> Disable adapter when we encounter a PCI disconnect.") has introduced a
> code that disables the board, releasing some resources, when reading
> 0x.
> 
> In case this happens when there is an EEH, this read will trigger EEH
> detection and set PCI channel offline. EEH will be able to recover the
> card from this state by doing a reset, so it's a better option than
> simply disabling the card.
> 
> Since eeh_check_failure will mark the channel as offline before
> returning the read value, in case there really was an EEH, we can simply
> check for pci_channel_offline, preventing the board_disable code from
> running if it's true.
> 
> Without this patch, EEH code will try to access those same resources
> that board_disable will try to free. This race can cause EEH recovery to
> fail.
> 
> [  504.370577] EEH: Notify device driver to resume
> [  504.370580] qla2xxx [0001:07:00.0]-9002:2: The device failed to
> resume I/O from slot/link_reset.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 
> Cc: 
> Cc: 
> ---
>  drivers/scsi/qla2xxx/qla_isr.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c
> b/drivers/scsi/qla2xxx/qla_isr.c
> index a04a1b1..8132926 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -116,7 +116,7 @@ bool
>  qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, uint32_t reg)
>  {
>  /* Check for PCI disconnection */
> -if (reg == 0x) {
> +if (reg == 0x && !pci_channel_offline(vha->hw->pdev)) {
>  if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
>  !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags) &&
>  !test_bit(PFLG_DRIVER_PROBING, &vha->pci_flags)) {

Hi Mauricio,

Re: signed-off-by chain -- I believe if you are (re)sending another
person's patch, you will need a:  "From: Thadeu Lima de Souza Cascardo
"  tag at the top of the message body to
retain original authorship and then your own:  "Signed-off-by: Mauricio
Faria de Oliveira " tag below Thadeu's
since the patch has passed through you.

Re: the patch -- I did some work last year to harden board_disable
against various races, but without having EEH hardware available, I was
uncertain about EEH behavior.  For example:

pci_read
   EEH -> marks channel offline
pci_read returns ~0

When do the EEH error handler callbacks run?

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mpt2sas: Abort initialization if no memory I/O resources, detected

2015-06-23 Thread Joe Lawrence


On 06/21/2015 02:46 PM, Timothy Pearson wrote:
> On 06/16/2015 01:49 PM, Timothy Pearson wrote:
>> On 06/16/2015 12:42 PM, Joe Lawrence wrote:
>>> On 06/16/2015 12:28 PM, Timothy Pearson wrote:
>>>> On 06/12/2015 05:05 PM, Timothy Pearson wrote:
>>>>> The mpt2sas driver crashes if the BIOS does not set up at least one
>>>>> memory I/O resource. This failure can happen if the device is too
>>>>> slow to respond during POST and is missed by the BIOS, but Linux
>>>>> then detects the device later in the boot process.
>>>>>
>>>>> This patch aborts initialization and prints a warning if no memory I/O
>>>>> resources are found.
>>>>>
>>>>> Signed-off-by: Timothy Pearson
>>>>> Tested-by: Timothy Pearson
>>>>> ---
>>>>> drivers/scsi/mpt2sas/mpt2sas_base.c | 9 +
>>>>> 1 file changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
>>>>> b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>>>> index 11248de..15c9504 100644
>>>>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
>>>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>>>> @@ -6,6 +6,8 @@
>>>>> * Copyright (C) 2007-2014 LSI Corporation
>>>>> * Copyright (C) 20013-2014 Avago Technologies
>>>>> * (mailto: mpt-fusionlinux@avagotech.com)
>>>>> + * Copyright (C) 2015 Raptor Engineering
>>>>> + * (mailto: supp...@araptorengineeringinc.com)
>>>>> *
>>>>> * This program is free software; you can redistribute it and/or
>>>>> * modify it under the terms of the GNU General Public License
>>>>> @@ -1582,6 +1584,13 @@ mpt2sas_base_map_resources(struct
>>>>> MPT2SAS_ADAPTER
>>>>> *ioc)
>>>>> }
>>>>> }
>>>>>
>>>>> + if (ioc->chip == NULL) {
>>>>> + printk(MPT2SAS_ERR_FMT "unable to map "
>>>>> + "adapter memory (resource not found)!\n", ioc->name);
>>>>> + r = -EINVAL;
>>>>> + goto out_fail;
>>>>> + }
>>>>> +
>>>>> _base_mask_interrupts(ioc);
>>>>>
>>>>> r = _base_get_ioc_facts(ioc, CAN_SLEEP);
>>>>
>>>> Just following up on this patch as I have not yet received any
>>>> response.
>>>>
>>>> Thanks!
>>>
>>> Hi Tim -- just curious, why was the similar check on ioc->chip just a
>>> few lines above the one added by the patch insufficient?
>>>
>>> That loop block sets memap_sz when it finds an IORESOURCE_MEM so that it
>>> only sets ioc->chip once. I wonder if the fix might be simpler if the
>>> existing ioc->chip check relocated entirely to where you put it (maybe
>>> also pulling the entire error text onto one line for easier grepping).
>>>
>>> Regards,
>>>
>>> -- Joe
>>
>> If there are no IORESOURCE_MEM resources allocated by the BIOS (i.e. if
>> the BIOS does not run resource allocation on the mpt2sas device) then
>> the check you are referring to is not executed, and the driver attempts
>> to perform operations on a null ioc->chip pointer.
>>
>> I can relocate the check if desired.
>>
> 
> On looking more closely at the code I think having the two separate
> checks is useful for debugging.  The first error message is triggered if
> the resource exists but Linux cannot map it for some reason, and the
> second error message is triggered if the resource does not exist at all
> (buggy BIOS).

Fair enough.  The two error messages are unique, so grepping will lead
to the correct check.

Only other nits would be the addition of a copyright (up to Avago I
guess) and an mpt3sas version (these drivers are 99% the same code).

Acked-by: Joe Lawrence 

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: OOPS: unplugging western digital passport drive

2015-06-22 Thread Joe Lawrence
Hi Alan -- last I heard from Stan on this bug was April 14 -- he was on
"long holidays" at that time :)

-- Joe

On 06/17/2015 03:09 PM, Alan Stern wrote:
> Joe or Stanisナ^ツaw:
> 
> I never heard anything back about this.  Does the patch fix the crash?
> 
> Alan Stern
> 
> 
> On Wed, 18 Mar 2015, Alan Stern wrote:
> 
>> On Wed, 18 Mar 2015, Alan Stern wrote:
>>
>>> On Tue, 17 Mar 2015, Joe Lawrence wrote:
>>>
>>>> On 03/11/2015 12:25 AM, Stanisław Pitucha wrote:
>>>>> Hi linux-scsi,
>>>>> I've got another case of reproducible crash when unplugging western
>>>>> digital passport drives. This was mentioned before in
>>>>> http://www.spinics.net/lists/linux-scsi/msg82603.html
>>>
>>> Like it says in that thread, the problem is somehow related to the ses
>>> driver.  If you remove or blacklist that driver, there won't be any
>>> more crashes.
>>
>> Looks like I spoke too soon.  I think the patch below will fix the 
>> problem.  Let me know what happens.
>>
>> Alan Stern
>>
>>
>>
>>
>> Index: usb-4.0/drivers/scsi/scsi_pm.c
>> ===
>> --- usb-4.0.orig/drivers/scsi/scsi_pm.c
>> +++ usb-4.0/drivers/scsi/scsi_pm.c
>> @@ -217,14 +217,18 @@ static int sdev_runtime_suspend(struct d
>>  {
>>  const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>>  struct scsi_device *sdev = to_scsi_device(dev);
>> -int err;
>> +struct device *blk_rpm_dev = sdev->request_queue->dev;
>> +int err = 0;
>>  
>> -err = blk_pre_runtime_suspend(sdev->request_queue);
>> -if (err)
>> -return err;
>> +if (blk_rpm_dev) {
>> +err = blk_pre_runtime_suspend(sdev->request_queue);
>> +if (err)
>> +return err;
>> +}
>>  if (pm && pm->runtime_suspend)
>>  err = pm->runtime_suspend(dev);
>> -blk_post_runtime_suspend(sdev->request_queue, err);
>> +if (blk_rpm_dev)
>> +blk_post_runtime_suspend(sdev->request_queue, err);
>>  
>>  return err;
>>  }
>> @@ -246,12 +250,15 @@ static int sdev_runtime_resume(struct de
>>  {
>>  struct scsi_device *sdev = to_scsi_device(dev);
>>  const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>> +struct device *blk_rpm_dev = sdev->request_queue->dev;
>>  int err = 0;
>>  
>> -blk_pre_runtime_resume(sdev->request_queue);
>> +if (blk_rpm_dev)
>> +blk_pre_runtime_resume(sdev->request_queue);
>>  if (pm && pm->runtime_resume)
>>  err = pm->runtime_resume(dev);
>> -blk_post_runtime_resume(sdev->request_queue, err);
>> +if (blk_rpm_dev)
>> +blk_post_runtime_resume(sdev->request_queue, err);
>>  
>>  return err;
>>  }
>>
>>
>>
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in


Re: [PATCH 17/20 v1] [SCSI] mpt3sas: Use alloc_ordered_workqueue() API instead of create_singlethread_workqueue() API

2015-06-18 Thread Joe Lawrence
On 06/18/2015 09:06 AM, Sreekanth Reddy wrote:
> On Thu, Jun 18, 2015 at 5:40 PM, Joe Lawrence  
> wrote:
>> On 06/16/2015 01:37 AM, Sreekanth Reddy wrote:
>>> Created a thread using alloc_ordered_workqueue() API in order to process
>>> the works from firmware Work-queue sequentially instead of
>>> create_singlethread_workqueue() API.
>>>
>>> Changes in v1:
>>> No need to check for backport compatibility in the upstream kernel.
>>> so removing the else section where driver use
>>> create_singlethread_workqueue() API if alloc_ordered_workqueue() API is
>>> not defined, This else section is not required since in the latest upstream
>>> kernel this alloc_ordered_workqueue() API is always defined.
>>>
>>> Signed-off-by: Sreekanth Reddy 
>>> ---
>>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
>>> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>> index b848458..7e5926c 100644
>>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>> @@ -8085,8 +8085,8 @@ _scsih_probe(struct pci_dev *pdev, const struct 
>>> pci_device_id *id)
>>>   /* event thread */
>>>   snprintf(ioc->firmware_event_name, sizeof(ioc->firmware_event_name),
>>>   "fw_event%d", ioc->id);
>>> - ioc->firmware_event_thread = create_singlethread_workqueue(
>>> - ioc->firmware_event_name);
>>> + ioc->firmware_event_thread = alloc_ordered_workqueue(
>>> + ioc->firmware_event_name, WQ_MEM_RECLAIM);
>>>   if (!ioc->firmware_event_thread) {
>>>   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
>>>   ioc->name, __FILE__, __LINE__, __func__);
>>>
>>
>> Hi Sreekanth,
>>
>> Is this change still needed after e09c2c2954684 workqueue: apply
>> __WQ_ORDERED to create_singlethread_workqueue() ?  (3.17+)
> 
> I won't say that it is compulsory required, but I feel it is better if
> these changes are included. since initially we thought that thread
> created by using create_singlethread_workqueue() will process the
> works sequentially but in-between  it has broken and then it is fixed
> by Tejun.  So I thought it is better to directly use the
> alloc_ordered_workqueue() as create_singlethead_workqueue() API also
> invoked the same API.

Ok, I was just wondering if maybe create_singlethread_workqueue was
fixed after this patch was initially written.  Since it's effectively
the same...

Reviewed-by: Joe Lawrence 

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 17/20 v1] [SCSI] mpt3sas: Use alloc_ordered_workqueue() API instead of create_singlethread_workqueue() API

2015-06-18 Thread Joe Lawrence
On 06/16/2015 01:37 AM, Sreekanth Reddy wrote:
> Created a thread using alloc_ordered_workqueue() API in order to process
> the works from firmware Work-queue sequentially instead of
> create_singlethread_workqueue() API.
> 
> Changes in v1:
> No need to check for backport compatibility in the upstream kernel.
> so removing the else section where driver use 
> create_singlethread_workqueue() API if alloc_ordered_workqueue() API is
> not defined, This else section is not required since in the latest upstream
> kernel this alloc_ordered_workqueue() API is always defined.
> 
> Signed-off-by: Sreekanth Reddy 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b848458..7e5926c 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -8085,8 +8085,8 @@ _scsih_probe(struct pci_dev *pdev, const struct 
> pci_device_id *id)
>   /* event thread */
>   snprintf(ioc->firmware_event_name, sizeof(ioc->firmware_event_name),
>   "fw_event%d", ioc->id);
> - ioc->firmware_event_thread = create_singlethread_workqueue(
> - ioc->firmware_event_name);
> + ioc->firmware_event_thread = alloc_ordered_workqueue(
> + ioc->firmware_event_name, WQ_MEM_RECLAIM);
>   if (!ioc->firmware_event_thread) {
>   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
>   ioc->name, __FILE__, __LINE__, __func__);
> 

Hi Sreekanth,

Is this change still needed after e09c2c2954684 workqueue: apply
__WQ_ORDERED to create_singlethread_workqueue() ?  (3.17+)

In upstream, this change looks cosmetic (unless Tejun has a preference
for one over the other), but maybe converting to alloc_ordered_workqueue
keeps your in house version in closer sync?

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mpt2sas: Abort initialization if no memory I/O resources, detected

2015-06-16 Thread Joe Lawrence
On 06/16/2015 12:28 PM, Timothy Pearson wrote:
> On 06/12/2015 05:05 PM, Timothy Pearson wrote:
>> The mpt2sas driver crashes if the BIOS does not set up at least one
>> memory I/O resource. This failure can happen if the device is too
>> slow to respond during POST and is missed by the BIOS, but Linux
>> then detects the device later in the boot process.
>>
>> This patch aborts initialization and prints a warning if no memory I/O
>> resources are found.
>>
>> Signed-off-by: Timothy Pearson 
>> Tested-by: Timothy Pearson 
>> ---
>> drivers/scsi/mpt2sas/mpt2sas_base.c | 9 +
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
>> b/drivers/scsi/mpt2sas/mpt2sas_base.c
>> index 11248de..15c9504 100644
>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
>> @@ -6,6 +6,8 @@
>> * Copyright (C) 2007-2014 LSI Corporation
>> * Copyright (C) 20013-2014 Avago Technologies
>> * (mailto: mpt-fusionlinux@avagotech.com)
>> + * Copyright (C) 2015 Raptor Engineering
>> + * (mailto: supp...@araptorengineeringinc.com)
>> *
>> * This program is free software; you can redistribute it and/or
>> * modify it under the terms of the GNU General Public License
>> @@ -1582,6 +1584,13 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER
>> *ioc)
>> }
>> }
>>
>> + if (ioc->chip == NULL) {
>> + printk(MPT2SAS_ERR_FMT "unable to map "
>> + "adapter memory (resource not found)!\n", ioc->name);
>> + r = -EINVAL;
>> + goto out_fail;
>> + }
>> +
>> _base_mask_interrupts(ioc);
>>
>> r = _base_get_ioc_facts(ioc, CAN_SLEEP);
> 
> Just following up on this patch as I have not yet received any response.
> 
> Thanks!

Hi Tim -- just curious, why was the similar check on ioc->chip just a
few lines above the one added by the patch insufficient?

That loop block sets memap_sz when it finds an IORESOURCE_MEM so that it
only sets ioc->chip once.  I wonder if the fix might be simpler if the
existing ioc->chip check relocated entirely to where you put it (maybe
also pulling the entire error text onto one line for easier grepping).

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 17/20] [SCSI] mpt3sas: Use alloc_ordered_workqueue() API instead of create_singlethread_workqueue() API

2015-06-12 Thread Joe Lawrence
On 06/12/2015 05:42 AM, Sreekanth Reddy wrote:
...
> +#if defined(alloc_ordered_workqueue)
> + ioc->firmware_event_thread = alloc_ordered_workqueue(
> + ioc->firmware_event_name, WQ_MEM_RECLAIM);
> +#else
> + ioc->firmware_event_thread = create_singlethread_workqueue(
>   ioc->firmware_event_name);
> +#endif

Hi Sreekanth,

I think the upstream version of this code can safely assume
alloc_ordered_workqueue is defined, no?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] mpt2sas,mpt3sas - PCI master abort fixups

2015-04-13 Thread Joe Lawrence
On 04/13/2015 10:38 AM, James Bottomley wrote:
> On Mon, 2015-04-13 at 10:06 -0400, Joe Lawrence wrote:
>> On 04/12/2015 08:54 PM, James Bottomley wrote:
>>> On Sun, 2015-04-12 at 20:11 -0400, Joe Lawrence wrote:
>>>> On 12/30/2014 09:07 AM, Joe Lawrence wrote:
>>>>> A colleague noticed that the mpt2 and mpt3sas drivers do not correctly
>>>>> check the PCI master abort pattern in _base_wait_for_doorbell_ack.  This
>>>>> pattern should be checked *prior* to any valid bit patterns, which would
>>>>> always return true since a PCI read on master abort sets all bits high.
>>>>>
>>>>> The second patch adds similar checking to _base_wait_for_doorbell_int and
>>>>> _base_wait_for_doorbell_not_used to avoid potentially long loops around
>>>>> PCI reads.
>>>>>
>>>>> Joe Lawrence (2):
>>>>>   mpt2sas,mpt3sas: correct master-abort checking in doorbell ack
>>>>>   mpt2sas,mpt3sas: additional master abort checks
>>>>>
>>>>>  drivers/scsi/mpt2sas/mpt2sas_base.c |   17 -
>>>>>  drivers/scsi/mpt3sas/mpt3sas_base.c |   17 -
>>>>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>>>>
>>>>
>>>> Avago ping?
>>>>
>>>> This one was pretty straightforward: check 0x *before* any
>>>> individual bit(s), i.e. before reading the doorbell register.
>>>
>>> OK, Joe, explain why this patch is important: what problems could result
>>> from it not being present?  If you convince everyone then no more mpt2/3
>>> sas patches until this is at least commented on and a plan of action
>>> proposed.
>>
>> Hi James,
>>
>> As currently coded:  If the PCI read returns a master abort,
>> _base_wait_for_doorbell_ack will loop until it exhausts its timeout (up
>> to 15 seconds).  Other parts of the driver, like the periodic watchdog
>> or EEH, may detect a similar problem before such a long time and cleanup
>> the mess.  However, complete device removal may be stalled until whoever
>> called _base_wait_for_doorbell_ack is satisfied that it has finished.
> 
> I think we all picked this up from the changelog.  What I meant was in
> what situations might a card get a master abort ... because that's when
> the problem becomes user visible.  It sounds like it's something that
> might not occur very often or is a bit theoretical, is that right?

Got it -- yes, master abort should be an infrequent event.  On Stratus
HW it typically indicates PCI hotplug removal of a device.  We run many
automated removal/add tests, which find corner case bugs and other race
conditions in driver error paths and whatnot.

>> This behavior is not really a bug, but feels like one in the making.
>> Should additional code be introduced, copy/pasted, etc. it may not do
>> what was intended.
>>
>> For future reference, would a repost have been more appropriate?  This
>> changeset was so small that I figured a status ping would have sufficed.
> 
> Either works.  I was just trying to work out what sort of attention
> needs to be paid to the fix based on what sort of problem it fixes for
> the end user.

Ok.  This particular piece of code isn't going to crash the machine or
cause corruption, etc.  It only seemed proper to fix the check that was
already there.

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] mpt2sas,mpt3sas - PCI master abort fixups

2015-04-13 Thread Joe Lawrence
On 04/12/2015 08:54 PM, James Bottomley wrote:
> On Sun, 2015-04-12 at 20:11 -0400, Joe Lawrence wrote:
>> On 12/30/2014 09:07 AM, Joe Lawrence wrote:
>>> A colleague noticed that the mpt2 and mpt3sas drivers do not correctly
>>> check the PCI master abort pattern in _base_wait_for_doorbell_ack.  This
>>> pattern should be checked *prior* to any valid bit patterns, which would
>>> always return true since a PCI read on master abort sets all bits high.
>>>
>>> The second patch adds similar checking to _base_wait_for_doorbell_int and
>>> _base_wait_for_doorbell_not_used to avoid potentially long loops around
>>> PCI reads.
>>>
>>> Joe Lawrence (2):
>>>   mpt2sas,mpt3sas: correct master-abort checking in doorbell ack
>>>   mpt2sas,mpt3sas: additional master abort checks
>>>
>>>  drivers/scsi/mpt2sas/mpt2sas_base.c |   17 -
>>>  drivers/scsi/mpt3sas/mpt3sas_base.c |   17 -
>>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>>
>>
>> Avago ping?
>>
>> This one was pretty straightforward: check 0x *before* any
>> individual bit(s), i.e. before reading the doorbell register.
> 
> OK, Joe, explain why this patch is important: what problems could result
> from it not being present?  If you convince everyone then no more mpt2/3
> sas patches until this is at least commented on and a plan of action
> proposed.

Hi James,

As currently coded:  If the PCI read returns a master abort,
_base_wait_for_doorbell_ack will loop until it exhausts its timeout (up
to 15 seconds).  Other parts of the driver, like the periodic watchdog
or EEH, may detect a similar problem before such a long time and cleanup
the mess.  However, complete device removal may be stalled until whoever
called _base_wait_for_doorbell_ack is satisfied that it has finished.

This behavior is not really a bug, but feels like one in the making.
Should additional code be introduced, copy/pasted, etc. it may not do
what was intended.

For future reference, would a repost have been more appropriate?  This
changeset was so small that I figured a status ping would have sufficed.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] mpt2sas,mpt3sas - PCI master abort fixups

2015-04-12 Thread Joe Lawrence
On 12/30/2014 09:07 AM, Joe Lawrence wrote:
> A colleague noticed that the mpt2 and mpt3sas drivers do not correctly
> check the PCI master abort pattern in _base_wait_for_doorbell_ack.  This
> pattern should be checked *prior* to any valid bit patterns, which would
> always return true since a PCI read on master abort sets all bits high.
> 
> The second patch adds similar checking to _base_wait_for_doorbell_int and
> _base_wait_for_doorbell_not_used to avoid potentially long loops around
> PCI reads.
> 
> Joe Lawrence (2):
>   mpt2sas,mpt3sas: correct master-abort checking in doorbell ack
>   mpt2sas,mpt3sas: additional master abort checks
> 
>  drivers/scsi/mpt2sas/mpt2sas_base.c |   17 -
>  drivers/scsi/mpt3sas/mpt3sas_base.c |   17 -
>  2 files changed, 24 insertions(+), 10 deletions(-)
> 

Avago ping?

This one was pretty straightforward: check 0x *before* any
individual bit(s), i.e. before reading the doorbell register.

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: OOPS: unplugging western digital passport drive

2015-03-17 Thread Joe Lawrence
On 03/11/2015 12:25 AM, Stanisław Pitucha wrote:
> Hi linux-scsi,
> I've got another case of reproducible crash when unplugging western
> digital passport drives. This was mentioned before in
> http://www.spinics.net/lists/linux-scsi/msg82603.html
> 
> Is there any way I can get a more correct stacktrace of the original
> bug without changing/recompiling the kernel? (or with?)
> 
> My last log from oops is:
> 
> [ 3343.469871]: usb 4-1: USB disconnect, device number 3
> [ 3343.521948]: BUG: unable to handle kernel NULL pointer dereference
> at 01a0
> [ 3343.522052]: IP: [] blk_post_runtime_resume+0x65/0x80
> [ 3343.522131]: PGD 0
> [ 3343.522159]: Oops: 0002 [#1] PREEMPT SMP
> [ 3343.522215]: Modules linked in: nls_utf8 hfsplus ses enclosure uas
> usb_storage rfcomm fuse ctr ccm bnep ecb btusb bluetooth uvcvideo
> videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev
> joydev media mousedev iTCO_wdt iTCO_vendor_support arc4 coretemp
> intel_rapl x86_pkg_temp_thermal intel_powerclamp iwldvm kvm_intel
> mac80211 kvm evdev mac_hid psmouse serio_raw pcspkr snd_hda_codec_hdmi
> iwlwifi snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic
> rtsx_pci_ms i915 snd_hda_intel memstick snd_hda_controller cfg80211
> snd_hda_codec r8169 snd_hwdep mii snd_pcm drm_kms_helper snd_timer
> lpc_ich shpchp drm thinkpad_acpi intel_gtt i2c_algo_bit nvram mei_me
> i2c_core snd mei soundcore led_class rfkill hwmon tpm_tis tpm battery
> wmi thermal video button processor ac sch_fq_codel usbhid
> hid_logitech_dj
> [ 3343.523406]:  hid_generic hid nfs lockd grace sunrpc fscache ext4
> crc16 mbcache jbd2 sha256_ssse3 sha256_generic algif_skcipher af_alg
> dm_crypt dm_mod sr_mod cdrom sd_mod rtsx_pci_sdmmc atkbd mmc_core
> libps2 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
> ahci libahci aesni_intel aes_x86_64 lrw gf128mul glue_helper
> ablk_helper cryptd libata xhci_pci ehci_pci scsi_mod xhci_hcd ehci_hcd
> rtsx_pci usbcore usb_common i8042 serio
> [ 3343.523825]: CPU: 1 PID: 22490 Comm: kworker/1:0 Not tainted 3.18.6-1-ARCH 
> #1
> [ 3343.523882]: Hardware name: LENOVO 2481CTO/2481CTO, BIOS
> G3ET36WW(1.10) 06/20/2012
> [ 3343.523953]: Workqueue: usb_hub_wq hub_event [usbcore]
> [ 3343.523997]: task: 880211fd5080 ti: 8801971bc000 task.ti:
> 8801971bc000
> [ 3343.524054]: RIP: 0010:[]  []
> blk_post_runtime_resume+0x65/0x80
> [ 3343.524131]: RSP: 0018:8801971bf8c8  EFLAGS: 00010092
> [ 3343.524173]: RAX:  RBX: 880211ca1698 RCX:
> 
> [ 3343.524228]: RDX: 0001000dea0f RSI: 0009 RDI:
> 880211ca1698
> [ 3343.524283]: RBP: 8801971bf8d8 R08:  R09:
> 880216c3f000
> [ 3343.524338]: R10: 88021f256db0 R11: ea0002987d40 R12:
> 
> [ 3343.524392]: R13:  R14: 88020df7b428 R15:
> 0004
> [ 3343.524448]: FS:  () GS:88021f24()
> knlGS:
> [ 3343.524509]: CS:  0010 DS:  ES:  CR0: 80050033
> [ 3343.524554]: CR2: 01a0 CR3: 01811000 CR4:
> 001407e0
> [ 3343.524609]: Stack:
> [ 3343.524628]:  8802123fe968  8801971bf908
> a0119564
> [ 3343.524695]:  8801971bf910 8802123fe968 8802123fea16
> a01194e0
> [ 3343.524762]:  8801971bf938 813c4636 8800a61f5780
> 8802123fe968
> [ 3343.524828]: Call Trace:
> [ 3343.524866]:  [] scsi_runtime_resume+0x84/0xd0 [scsi_mod]
> [ 3343.524930]:  [] ?
> scsi_autopm_put_device+0x20/0x20 [scsi_mod]
> [ 3343.524990]:  [] __rpm_callback+0x36/0x90
> [ 3343.525035]:  [] rpm_callback+0x26/0xa0
> [ 3343.525079]:  [] rpm_resume+0x4b1/0x690
> [ 3343.525124]:  [] __pm_runtime_resume+0x40/0x60
> [ 3343.525177]:  [] __device_release_driver+0x29/0xf0
> [ 3343.525228]:  [] device_release_driver+0x23/0x30
> [ 3343.525277]:  [] bus_remove_device+0x108/0x180
> [ 3343.525325]:  [] device_del+0x129/0x1f0
> [ 3343.525378]:  [] __scsi_remove_device+0xcd/0xe0 
> [scsi_mod]
> [ 3343.525442]:  [] scsi_forget_host+0x64/0x70 [scsi_mod]
> [ 3343.525502]:  [] scsi_remove_host+0x7b/0x130 [scsi_mod]
> [ 3343.525558]:  [] usb_stor_disconnect+0x59/0xd0
> [usb_storage]
> [ 3343.525625]:  [] usb_unbind_interface+0x1f8/0x2c0 
> [usbcore]
> [ 3343.525682]:  [] ? rpm_idle+0x23/0x340
> [ 3343.525729]:  [] __device_release_driver+0x7f/0xf0
> [ 3343.525779]:  [] device_release_driver+0x23/0x30
> [ 3343.525828]:  [] bus_remove_device+0x108/0x180
> [ 3343.525876]:  [] device_del+0x129/0x1f0
> [ 3343.528574]:  [] usb_disable_device+0x91/0x290 [usbcore]
> [ 3343.531302]:  [] usb_disconnect+0x94/0x2d0 [usbcore]
> [ 3343.533981]:  [] hub_event+0x66a/0x1640 [usbcore]
> [ 3343.536657]:  [] ? __schedule+0x3e8/0xa50
> [ 3343.539275]:  [] process_one_work+0x145/0x400
> [ 3343.541919]:  [] worker_thread+0x6b/0x480
> [ 3343.544524]:  [] ? init_pwq.part.22+0x10/0x10
> [ 3343.547108]:  [] kthread+0xea/0x100
> [ 3343.549708]:  [

Re: usb-storage URB use-after-free

2015-01-30 Thread Joe Lawrence
On Thu, 29 Jan 2015 11:42:18 -0500
Alan Stern  wrote:

> On Wed, 28 Jan 2015, Joe Lawrence wrote:
> 
> > This one should have gone over to linux-usb.
> > 
> > -- Joe
> > 
> > On 01/28/2015 05:04 PM, Joe Lawrence wrote:
> > > Hello linux-usb,
> > > 
> > > We've hit a USB use-after-free on Stratus HW during device removal tests.
> > > We're running fio disk I/O to a scsi disk hanging off USB when the USB
> > > controller is hotplug removed.  This crash is very consistent (usually the
> > > first device pull during testing).  Without I/O, it may take days to
> > > encounter.
> > > 
> > > general protection fault:  [#1] SMP
> > > ...
> > > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  
> > > O--   3.10.0-210.el7.dev02.x86_64 #1
> > > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 
> > > 11/12/2014
> > > Workqueue: scsi_tmf_872 scmd_eh_abort_handler
> > > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
> > > RIP: 0010:[]  [] kobject_put+0xd/0x60
> > > RSP: 0018:88098131bd28  EFLAGS: 00010002
> > > RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
> > > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
> > > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
> > > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
> > > R13: ff98 R14: 0086 R15: 0c20
> > > FS:  () GS:88085fce() 
> > > knlGS:
> > > CS:  0010 DS:  ES:  CR0: 80050033
> > > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
> > > DR0:  DR1:  DR2: 
> > > DR3:  DR6: 0ff0 DR7: 0400
> > > Stack:
> > >   88098131bd40 813cd327 88098131bd50
> > >  8140a65a 88098131bd78 81416795 
> > >  880414791968 880414791978 88098131bd88 814175f6
> > > Call Trace:
> > >  [] put_device+0x17/0x20
> > >  [] usb_put_dev+0x1a/0x20
> > >  [] usb_hcd_unlink_urb+0x65/0xe0
> > >  [] usb_unlink_urb+0x26/0x50
> > >  [] usb_sg_cancel+0x82/0xe0
> > >  [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
> > >  [] command_abort+0xcd/0xe0 [usb_storage]
> > >  [] scmd_eh_abort_handler+0xbf/0x480
> > >  [] process_one_work+0x17b/0x470
> > >  [] worker_thread+0x11b/0x400
> > >  [] ? rescuer_thread+0x400/0x400
> > >  [] kthread+0xcf/0xe0
> > >  [] ? kthread_create_on_node+0x140/0x140
> > >  [] ret_from_fork+0x7c/0xb0
> > >  [] ? kthread_create_on_node+0x140/0x140
> > > 
> > > from another crash, we know that the URB itself has been freed:
> 
> > > but the underlying usb_device->device is still present:
> 
> > > It looks like usb_hcd_unlink_urb takes a reference out on the underlying
> > > device but not the URB, which in our test case consistently gets freed
> > > just before usb_hcd_unlink_urb tries to return a reference on 
> > > urb->device. 
> 
> The analysis looks correct.
> 
> > > Maybe the URB is a reference count short when usb_hcd_unlink_urb calls
> > > unlink1?  Somewhere usb-storage missed taking out a ref?
> 
> No; it's a simple use-after-free error.
> 
> > > A tourniquet hack-patch follows (probably inside out, as the URB should 
> > > stay
> > > intact while usb_hcd_unlink_urb does its thing) and has been running a
> > > little over 30 surprise device removals under I/O without incident.
> > > 
> > > Any better suggestions?
> 
> Please try this instead.
> 
> Alan Stern
> 
> 
> 
> Index: usb-3.19/drivers/usb/core/hcd.c
> ===
> --- usb-3.19.orig/drivers/usb/core/hcd.c
> +++ usb-3.19/drivers/usb/core/hcd.c
> @@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd,
>  int usb_hcd_unlink_urb (struct urb *urb, int status)
>  {
>   struct usb_hcd  *hcd;
> + struct usb_device   *udev = urb->dev;
>   int retval = -EIDRM;
>   unsigned long   flags;
>  
> @@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb,
>   spin_lock_irqsave(&hcd_urb_unlink_lock, flags);
>   if (atomic_read(&urb->use_

Re: usb-storage URB use-after-free

2015-01-28 Thread Joe Lawrence
This one should have gone over to linux-usb.

-- Joe

On 01/28/2015 05:04 PM, Joe Lawrence wrote:
> Hello linux-usb,
> 
> We've hit a USB use-after-free on Stratus HW during device removal tests.
> We're running fio disk I/O to a scsi disk hanging off USB when the USB
> controller is hotplug removed.  This crash is very consistent (usually the
> first device pull during testing).  Without I/O, it may take days to
> encounter.
> 
> general protection fault:  [#1] SMP
> ...
> CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  O--   
> 3.10.0-210.el7.dev02.x86_64 #1
> Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 
> 11/12/2014
> Workqueue: scsi_tmf_872 scmd_eh_abort_handler
> task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
> RIP: 0010:[]  [] kobject_put+0xd/0x60
> RSP: 0018:88098131bd28  EFLAGS: 00010002
> RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
> RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
> RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
> R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
> R13: ff98 R14: 0086 R15: 0c20
> FS:  () GS:88085fce() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Stack:
>   88098131bd40 813cd327 88098131bd50
>  8140a65a 88098131bd78 81416795 
>  880414791968 880414791978 88098131bd88 814175f6
> Call Trace:
>  [] put_device+0x17/0x20
>  [] usb_put_dev+0x1a/0x20
>  [] usb_hcd_unlink_urb+0x65/0xe0
>  [] usb_unlink_urb+0x26/0x50
>  [] usb_sg_cancel+0x82/0xe0
>  [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
>  [] command_abort+0xcd/0xe0 [usb_storage]
>  [] scmd_eh_abort_handler+0xbf/0x480
>  [] process_one_work+0x17b/0x470
>  [] worker_thread+0x11b/0x400
>  [] ? rescuer_thread+0x400/0x400
>  [] kthread+0xcf/0xe0
>  [] ? kthread_create_on_node+0x140/0x140
>  [] ret_from_fork+0x7c/0xb0
>  [] ? kthread_create_on_node+0x140/0x140
> 
> from another crash, we know that the URB itself has been freed:
> 
> crash> struct -o urb | grep SIZE
> SIZE: 0xc0
> crash> p 0xc0/8
> $2 = 0x18
> crash> rd 880846df9248 0x18
> 880846df9248:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9258:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9268:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9278:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9288:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9298:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92a8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92b8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92c8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92d8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92e8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92f8:  6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b   kkk.
> 
> but the underlying usb_device->device is still present:
> 
> crash> struct usb_device 88084fd22e68
> ...
>   dev = {
> parent = 0x881050c3d418, 
> p = 0x88084ff27ae8, 
> kobj = {
>   name = 0x88084ff071e0 "6-1", 
>   entry = {
> next = 0x88084ff09568, 
> prev = 0x88084ff09000
>   }, 
>   parent = 0x881050c3d428, 
>   kset = 0x8808540dc9f0, 
>   ktype = 0x819cc220 , 
>   sd = 0x88084ff1b8b8, 
>   kref = {
> refcount = {
>   counter = 0x8
> }
>   }, 
>   state_initialized = 0x1, 
>   state_in_sysfs = 0x1, 
>   state_add_uevent_sent = 0x1, 
>   state_remove_uevent_sent = 0x0, 
>   uevent_suppress = 0x0
> },
> ...
> 
> I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy
> showing the URB and the calling function, then crashed again:
> 
> ... snip ...
>  usb-storage-687   [000]   350.013212: bprint:   
> usb_alloc_urb : JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28)
>  usb-storage-687   [000]   350.013212: bprint:   usb_get_urb 
> : JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28)
>

usb-storage URB use-after-free

2015-01-28 Thread Joe Lawrence
Hello linux-usb,

We've hit a USB use-after-free on Stratus HW during device removal tests.
We're running fio disk I/O to a scsi disk hanging off USB when the USB
controller is hotplug removed.  This crash is very consistent (usually the
first device pull during testing).  Without I/O, it may take days to
encounter.

general protection fault:  [#1] SMP
...
CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  O--   
3.10.0-210.el7.dev02.x86_64 #1
Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 11/12/2014
Workqueue: scsi_tmf_872 scmd_eh_abort_handler
task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
RIP: 0010:[]  [] kobject_put+0xd/0x60
RSP: 0018:88098131bd28  EFLAGS: 00010002
RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
R13: ff98 R14: 0086 R15: 0c20
FS:  () GS:88085fce() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Stack:
  88098131bd40 813cd327 88098131bd50
 8140a65a 88098131bd78 81416795 
 880414791968 880414791978 88098131bd88 814175f6
Call Trace:
 [] put_device+0x17/0x20
 [] usb_put_dev+0x1a/0x20
 [] usb_hcd_unlink_urb+0x65/0xe0
 [] usb_unlink_urb+0x26/0x50
 [] usb_sg_cancel+0x82/0xe0
 [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
 [] command_abort+0xcd/0xe0 [usb_storage]
 [] scmd_eh_abort_handler+0xbf/0x480
 [] process_one_work+0x17b/0x470
 [] worker_thread+0x11b/0x400
 [] ? rescuer_thread+0x400/0x400
 [] kthread+0xcf/0xe0
 [] ? kthread_create_on_node+0x140/0x140
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_create_on_node+0x140/0x140

from another crash, we know that the URB itself has been freed:

crash> struct -o urb | grep SIZE
SIZE: 0xc0
crash> p 0xc0/8
$2 = 0x18
crash> rd 880846df9248 0x18
880846df9248:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9258:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9268:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9278:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9288:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9298:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92a8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92b8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92c8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92d8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92e8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92f8:  6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b   kkk.

but the underlying usb_device->device is still present:

crash> struct usb_device 88084fd22e68
...
  dev = {
parent = 0x881050c3d418, 
p = 0x88084ff27ae8, 
kobj = {
  name = 0x88084ff071e0 "6-1", 
  entry = {
next = 0x88084ff09568, 
prev = 0x88084ff09000
  }, 
  parent = 0x881050c3d428, 
  kset = 0x8808540dc9f0, 
  ktype = 0x819cc220 , 
  sd = 0x88084ff1b8b8, 
  kref = {
refcount = {
  counter = 0x8
}
  }, 
  state_initialized = 0x1, 
  state_in_sysfs = 0x1, 
  state_add_uevent_sent = 0x1, 
  state_remove_uevent_sent = 0x0, 
  uevent_suppress = 0x0
},
...

I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy
showing the URB and the calling function, then crashed again:

... snip ...
 usb-storage-687   [000]   350.013212: bprint:   usb_alloc_urb 
: JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28)
 usb-storage-687   [000]   350.013212: bprint:   usb_get_urb : 
JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28)
   kworker/u97:2-703   [006]   380.656791: bprint:   usb_free_urb : 
JL: __usb_hcd_giveback_urb -> usb_free_urb(0x8808489b0a28)
 usb-storage-687   [000]   380.656793: bprint:   usb_free_urb : 
JL: sg_clean+0x33 -> usb_free_urb(0x8808489b0a28)
 usb-storage-687   [000]   380.656794: bprint:   urb_destroy : 
JL: usb_free_urb+0x40 -> urb_destroy(0x8808489b0a28)

crash> bt 703
PID: 703TASK: 8808497f8000  CPU: 6   COMMAND: "kworker/u97:2"
 #0 [8808497f7c10] die at 8101736b
 #1 [8808497f7c40] do_general_protection at 8160be4e
 #2 [8808497f7c70] general_protection at 8160b768
[

Re: [SCIS] mpt3sas: wait_for_completion_timeout timeout not reported

2014-12-30 Thread Joe Lawrence
On 12/30/2014 11:19 AM, Nicholas Mc Guire wrote:
> wait_for_completion_timeout reaching timeout was being ignored,
> this also should fail if timeout condition occurs. 
> 
> Thanks to Joe Lawrence  for confirmation.

How about this instead:

Acked-by: Joe Lawrence 

You still probably want a review from the Avago folks though.

BTW, don't worry about the "[SCSI]" subject prefix to the commit,
"[PATCH]" is fine.  I believe the former is a convention that the
maintainer applies to patches as he collects them to indicate that they
originated via the SCSI tree.

Thanks,

-- Joe

> this was only compile tested with
> x86_64_defconfig + CONFIG_SCSI_LOWLEVEL=y + CONFIG_SCSI_MPT3SAS=m
> 
> patch is against linux-next 3.19.0-rc1 -next-20141226
> Signed-off-by: Nicholas Mc Guire 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_config.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_config.c 
> b/drivers/scsi/mpt3sas/mpt3sas_config.c
> index 4472c2a..04ff21b 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_config.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_config.c
> @@ -393,7 +393,7 @@ _config_request(struct MPT3SAS_ADAPTER *ioc, 
> Mpi2ConfigRequest_t
>   mpt3sas_base_put_smid_default(ioc, smid);
>   timeleft = wait_for_completion_timeout(&ioc->config_cmds.done,
>   timeout*HZ);
> - if (!(ioc->config_cmds.status & MPT3_CMD_COMPLETE)) {
> + if (timeleft == 0 || !(ioc->config_cmds.status & MPT3_CMD_COMPLETE)) {
>   pr_err(MPT3SAS_FMT "%s: timeout\n",
>   ioc->name, __func__);
>   _debug_dump_mf(mpi_request,
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] mpt2sas,mpt3sas: additional master abort checks

2014-12-30 Thread Joe Lawrence
Add checks for PCI master abort reads in _base_wait_for_doorbell_int and
_base_wait_for_doorbell_not_used, which contain potentially long loops
around readl calls.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c |   10 --
 drivers/scsi/mpt3sas/mpt3sas_base.c |   10 --
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index 16d6fd5e037e..6ad1268cc57b 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -2942,7 +2942,9 @@ _base_wait_for_doorbell_int(struct MPT2SAS_ADAPTER *ioc, 
int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
int_status = readl(&ioc->chip->HostInterruptStatus);
-   if (int_status & MPI2_HIS_IOC2SYS_DB_STATUS) {
+   if (int_status == 0x)
+   goto out;
+   else if (int_status & MPI2_HIS_IOC2SYS_DB_STATUS) {
dhsprintk(ioc, printk(MPT2SAS_INFO_FMT "%s: "
"successful count(%d), timeout(%d)\n", ioc->name,
__func__, count, timeout));
@@ -2955,6 +2957,7 @@ _base_wait_for_doorbell_int(struct MPT2SAS_ADAPTER *ioc, 
int timeout,
count++;
} while (--cntdn);
 
+ out:
printk(MPT2SAS_ERR_FMT "%s: failed due to timeout count(%d), "
"int_status(%x)!\n", ioc->name, __func__, count, int_status);
return -EFAULT;
@@ -3032,7 +3035,9 @@ _base_wait_for_doorbell_not_used(struct MPT2SAS_ADAPTER 
*ioc, int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
doorbell_reg = readl(&ioc->chip->Doorbell);
-   if (!(doorbell_reg & MPI2_DOORBELL_USED)) {
+   if (doorbell_reg == 0x)
+   goto out;
+   else if (!(doorbell_reg & MPI2_DOORBELL_USED)) {
dhsprintk(ioc, printk(MPT2SAS_INFO_FMT "%s: "
"successful count(%d), timeout(%d)\n", ioc->name,
__func__, count, timeout));
@@ -3045,6 +3050,7 @@ _base_wait_for_doorbell_not_used(struct MPT2SAS_ADAPTER 
*ioc, int timeout,
count++;
} while (--cntdn);
 
+ out:
printk(MPT2SAS_ERR_FMT "%s: failed due to timeout count(%d), "
"doorbell_reg(%x)!\n", ioc->name, __func__, count, doorbell_reg);
return -EFAULT;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 878bf6ddd2a0..d3b6549640c7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -3115,7 +3115,9 @@ _base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, 
int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
int_status = readl(&ioc->chip->HostInterruptStatus);
-   if (int_status & MPI2_HIS_IOC2SYS_DB_STATUS) {
+   if (int_status == 0x)
+   goto out;
+   else if (int_status & MPI2_HIS_IOC2SYS_DB_STATUS) {
dhsprintk(ioc, pr_info(MPT3SAS_FMT
"%s: successful count(%d), timeout(%d)\n",
ioc->name, __func__, count, timeout));
@@ -3128,6 +3130,7 @@ _base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, 
int timeout,
count++;
} while (--cntdn);
 
+ out:
pr_err(MPT3SAS_FMT
"%s: failed due to timeout count(%d), int_status(%x)!\n",
ioc->name, __func__, count, int_status);
@@ -3207,7 +3210,9 @@ _base_wait_for_doorbell_not_used(struct MPT3SAS_ADAPTER 
*ioc, int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
doorbell_reg = readl(&ioc->chip->Doorbell);
-   if (!(doorbell_reg & MPI2_DOORBELL_USED)) {
+   if (doorbell_reg == 0x)
+   goto out;
+   else if (!(doorbell_reg & MPI2_DOORBELL_USED)) {
dhsprintk(ioc, pr_info(MPT3SAS_FMT
"%s: successful count(%d), timeout(%d)\n",
ioc->name, __func__, count, timeout));
@@ -3220,6 +3225,7 @@ _base_wait_for_doorbell_not_used(struct MPT3SAS_ADAPTER 
*ioc, int timeout,
count++;
} while (--cntdn);
 
+ out:
pr_err(MPT3SAS_FMT
"%s: failed due to timeout count(%d), doorbell_reg(%x)!\n",
ioc->name, __func__, count, doorbell_reg);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] mpt2sas,mpt3sas - PCI master abort fixups

2014-12-30 Thread Joe Lawrence
A colleague noticed that the mpt2 and mpt3sas drivers do not correctly
check the PCI master abort pattern in _base_wait_for_doorbell_ack.  This
pattern should be checked *prior* to any valid bit patterns, which would
always return true since a PCI read on master abort sets all bits high.

The second patch adds similar checking to _base_wait_for_doorbell_int and
_base_wait_for_doorbell_not_used to avoid potentially long loops around
PCI reads.

Joe Lawrence (2):
  mpt2sas,mpt3sas: correct master-abort checking in doorbell ack
  mpt2sas,mpt3sas: additional master abort checks

 drivers/scsi/mpt2sas/mpt2sas_base.c |   17 -
 drivers/scsi/mpt3sas/mpt3sas_base.c |   17 -
 2 files changed, 24 insertions(+), 10 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] mpt2sas,mpt3sas: correct master-abort checking in doorbell ack

2014-12-30 Thread Joe Lawrence
Test against the invalid data pattern ~0 before testing valid data
patterns.

Reported-by: Derek Shute 
Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c |7 ---
 drivers/scsi/mpt3sas/mpt3sas_base.c |7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index 3fb01d1883c6..16d6fd5e037e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -2983,7 +2983,9 @@ _base_wait_for_doorbell_ack(struct MPT2SAS_ADAPTER *ioc, 
int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
int_status = readl(&ioc->chip->HostInterruptStatus);
-   if (!(int_status & MPI2_HIS_SYS2IOC_DB_STATUS)) {
+   if (int_status == 0x)
+   goto out;
+   else if (!(int_status & MPI2_HIS_SYS2IOC_DB_STATUS)) {
dhsprintk(ioc, printk(MPT2SAS_INFO_FMT "%s: "
"successful count(%d), timeout(%d)\n", ioc->name,
__func__, count, timeout));
@@ -2995,8 +2997,7 @@ _base_wait_for_doorbell_ack(struct MPT2SAS_ADAPTER *ioc, 
int timeout,
mpt2sas_base_fault_info(ioc , doorbell);
return -EFAULT;
}
-   } else if (int_status == 0x)
-   goto out;
+   }
 
if (sleep_flag == CAN_SLEEP)
msleep(1);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index b9c27398e206..878bf6ddd2a0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -3157,7 +3157,9 @@ _base_wait_for_doorbell_ack(struct MPT3SAS_ADAPTER *ioc, 
int timeout,
cntdn = (sleep_flag == CAN_SLEEP) ? 1000*timeout : 2000*timeout;
do {
int_status = readl(&ioc->chip->HostInterruptStatus);
-   if (!(int_status & MPI2_HIS_SYS2IOC_DB_STATUS)) {
+   if (int_status == 0x)
+   goto out;
+   else if (!(int_status & MPI2_HIS_SYS2IOC_DB_STATUS)) {
dhsprintk(ioc, pr_info(MPT3SAS_FMT
"%s: successful count(%d), timeout(%d)\n",
ioc->name, __func__, count, timeout));
@@ -3169,8 +3171,7 @@ _base_wait_for_doorbell_ack(struct MPT3SAS_ADAPTER *ioc, 
int timeout,
mpt3sas_base_fault_info(ioc , doorbell);
return -EFAULT;
}
-   } else if (int_status == 0x)
-   goto out;
+   }
 
if (sleep_flag == CAN_SLEEP)
usleep_range(1000, 1500);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [SCSI RFC] mpt2sas: wait_for_completion_timeout timeout not reported

2014-12-30 Thread Joe Lawrence
On 12/29/2014 12:25 PM, Nicholas Mc Guire wrote:
> wait_for_completion_timeout reaching timeout was being ignored,
> this probably also should fail if timeout condition occurs ?
> 
> this was only compile tested with
> x86_64_defconfig + CONFIG_SCSI_LOWLEVEL=y + CONFIG_SCSI_MPT2SAS=m
> 
> patch is against linux-next 3.19.0-rc1 -next-20141226
> 
> Signed-off-by: Nicholas Mc Guire 
> ---
>  drivers/scsi/mpt2sas/mpt2sas_config.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_config.c 
> b/drivers/scsi/mpt2sas/mpt2sas_config.c
> index c72a2ff..02dc2d8 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_config.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_config.c
> @@ -391,7 +391,7 @@ _config_request(struct MPT2SAS_ADAPTER *ioc, 
> Mpi2ConfigRequest_t
>   mpt2sas_base_put_smid_default(ioc, smid);
>   timeleft = wait_for_completion_timeout(&ioc->config_cmds.done,
>   timeout*HZ);
> - if (!(ioc->config_cmds.status & MPT2_CMD_COMPLETE)) {
> + if (timeleft == 0 || !(ioc->config_cmds.status & MPT2_CMD_COMPLETE)) {
>   printk(MPT2SAS_ERR_FMT "%s: timeout\n",
>   ioc->name, __func__);
>   _debug_dump_mf(mpi_request,
> 

Hi Nicholas,

This would look like the correct thing to do here, for it eliminates a
small window where we timeout when mpt2sas_config_done had set
MPT2_CMD_COMPLETE and MPT3_CMD_REPLY_VALID, but has not finished copying
the reply data into config_cmds.reply.

It looks like the same patch should be applied to the mpt3sas driver as
well.

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 31/35] qla2xxx: Disable PCI device in shutdown handler.

2014-09-24 Thread Joe Lawrence
On Wed, 24 Sep 2014 11:37:55 -0400
Chad Dupuis  wrote:

> 
> 
> On Wed, 24 Sep 2014, Joe Lawrence wrote:
> 
> > On Wed, 24 Sep 2014 03:08:34 -0400
> > Saurav Kashyap  wrote:
> >
> >> From: Chad Dupuis 
> >>
> >> Disable the PCI device during shutdown to prevent any races with
> >> other PCI code such as the AER handling code.
> >>
> >> Signed-off-by: Chad Dupuis 
> >> Signed-off-by: Saurav Kashyap 
> >> ---
> >>  drivers/scsi/qla2xxx/qla_os.c |3 +++
> >>  1 files changed, 3 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> >> index 08c8e9c..70445bc 100644
> >> --- a/drivers/scsi/qla2xxx/qla_os.c
> >> +++ b/drivers/scsi/qla2xxx/qla_os.c
> >> @@ -3037,6 +3037,9 @@ qla2x00_shutdown(struct pci_dev *pdev)
> >>qla2x00_free_irqs(vha);
> >>
> >>qla2x00_free_fw_dump(ha);
> >> +
> >> +  pci_disable_pcie_error_reporting(pdev);
> >> +  pci_disable_device(pdev);
> >>  }
> >>
> >>  /* Deletes all the virtual ports for a given ha */
> >
> > Hi Saurav,
> >
> > Just curious if there were any bug report instances to go along with
> > this patch?  (Or a short explanation of the race.)
> 
> Joe,
> 
> The race here was during reboot after taking a crash dump.  We hit an AER 
> condition so the driver's AER handler is called after we've already 
> shutdown the adapter which causes us to try to free some already freed 
> structs.  To prevent this we disable the PCI function from the shutdown 
> handler.

Hi Chad,

Could the driver's AER handler get invoked after qla2x00_free_fw_dump,
but before pci_disable_pcie_error_reporting (ie, is there anything to
hold off .shutdown from running at the same time as the AER callbacks)?

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 31/35] qla2xxx: Disable PCI device in shutdown handler.

2014-09-24 Thread Joe Lawrence
On Wed, 24 Sep 2014 03:08:34 -0400
Saurav Kashyap  wrote:

> From: Chad Dupuis 
> 
> Disable the PCI device during shutdown to prevent any races with
> other PCI code such as the AER handling code.
> 
> Signed-off-by: Chad Dupuis 
> Signed-off-by: Saurav Kashyap 
> ---
>  drivers/scsi/qla2xxx/qla_os.c |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 08c8e9c..70445bc 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -3037,6 +3037,9 @@ qla2x00_shutdown(struct pci_dev *pdev)
>   qla2x00_free_irqs(vha);
>  
>   qla2x00_free_fw_dump(ha);
> +
> + pci_disable_pcie_error_reporting(pdev);
> + pci_disable_device(pdev);
>  }
>  
>  /* Deletes all the virtual ports for a given ha */

Hi Saurav,

Just curious if there were any bug report instances to go along with
this patch?  (Or a short explanation of the race.)

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 20/35] qla2xxx: Unload of qla2xxx driver crashes the machine.

2014-09-24 Thread Joe Lawrence
Hi Saurav,

Will these changes conflict with those submitted in August [1] to
Christoph's drivers-for-3.18 branch?

In particular, "qla2xxx: Fix shost use-after-free on device
removal" [2] fixed this same driver unload issue in a slightly
different manner.  That change was marked for stable as the bug was
introduced by fe1b806f4f71 ("qla2xxx: Refactor shutdown code so some
functionality can be reused") in previous releases.

Regards,

-- Joe

[1] http://thread.gmane.org/gmane.linux.scsi/93859
[2] 
http://git.infradead.org/users/hch/scsi-queue.git/commitdiff/db7157d4cfce6edf052452fb1d327d4d11b67f4c

On Wed, 24 Sep 2014 03:08:23 -0400
Saurav Kashyap  wrote:

> From: Arun Easi 
> 
> Signed-off-by: Arun Easi 
> Signed-off-by: Saurav Kashyap 
> ---
>  drivers/scsi/qla2xxx/qla_os.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 8846883..6539513 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -3190,10 +3190,10 @@ qla2x00_remove_one(struct pci_dev *pdev)
>  
>   qla2x00_free_device(base_vha);
>  
> - scsi_host_put(base_vha->host);
> -
>   qla2x00_clear_drv_active(base_vha);
>  
> + scsi_host_put(base_vha->host);
> +
>   qla2x00_unmap_iobases(ha);
>  
>   pci_release_selected_regions(ha->pdev, ha->bars);

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] mpt2sas: remove duplicate disable_discovery MODULE_PARAM

2014-09-05 Thread Joe Lawrence
The disable_discovery module parameter is declared and only used by
mpt2sas_scsih.c. Remove the extra copy in mpt2sas_base.c.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c |4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index cf51b48..bb8dcb0 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -84,10 +84,6 @@ static int mpt2sas_fwfault_debug;
 MODULE_PARM_DESC(mpt2sas_fwfault_debug, " enable detection of firmware fault "
"and halt firmware - (default=0)");
 
-static int disable_discovery = -1;
-module_param(disable_discovery, int, 0);
-MODULE_PARM_DESC(disable_discovery, " disable discovery ");
-
 /**
  * _scsih_set_fwfault_debug - global setting of ioc->fwfault_debug.
  *
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 6/6] qla2xxx: Prevent probe and board_disable race

2014-08-26 Thread Joe Lawrence
The PCI register read checking introduced in commit fe1b806f4f71
("qla2xxx: Disable adapter when we encounter a PCI disconnect") is
active during driver probe. Hold off scheduling any board removal until
the driver probe has completed. This ensures that the the board_disable
work structure is initialized and more importantly, avoids racing
qla2x00_probe_one.

Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_def.h |1 +
 drivers/scsi/qla2xxx/qla_isr.c |3 ++-
 drivers/scsi/qla2xxx/qla_os.c  |2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 1398818..de5a9c4 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3405,6 +3405,7 @@ typedef struct scsi_qla_host {
unsigned long   pci_flags;
 #define PFLG_DISCONNECTED  0   /* PCI device removed */
 #define PFLG_DRIVER_REMOVING   1   /* PCI driver .remove */
+#define PFLG_DRIVER_PROBING2   /* PCI driver .probe */
 
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 341c64d..a0992bf 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -118,7 +118,8 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
/* Check for PCI disconnection */
if (reg == 0x) {
if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
-   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags)) {
+   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags) &&
+   !test_bit(PFLG_DRIVER_PROBING, &vha->pci_flags)) {
/*
 * Schedule this (only once) on the default system
 * workqueue so that all the adapter workqueues and the
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 84d4df6..f937859 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2632,6 +2632,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct 
pci_device_id *id)
}
 
pci_set_drvdata(pdev, base_vha);
+   set_bit(PFLG_DRIVER_PROBING, &base_vha->pci_flags);
 
host = base_vha->host;
base_vha->req = req;
@@ -2928,6 +2929,7 @@ skip_dpc:
 
qlt_add_target(ha, base_vha);
 
+   clear_bit(PFLG_DRIVER_PROBING, &base_vha->pci_flags);
return 0;
 
 probe_init_failed:
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 3/6] qla2xxx: Collect PCI register checks and board_disable scheduling

2014-08-26 Thread Joe Lawrence
Add an uint16_t variant of qla2x00_check_reg_for_disconnect and use
these routines to check and schedule a PCI-disconnected board from a
centralized place.

Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_gbl.h |3 ++-
 drivers/scsi/qla2xxx/qla_isr.c |   28 +---
 drivers/scsi/qla2xxx/qla_mr.c  |2 +-
 drivers/scsi/qla2xxx/qla_nx.c  |6 +++---
 drivers/scsi/qla2xxx/qla_os.c  |8 +---
 5 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index d646540..43ef0db 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -475,7 +475,8 @@ extern uint8_t *qla25xx_read_nvram_data(scsi_qla_host_t *, 
uint8_t *, uint32_t,
 extern int qla25xx_write_nvram_data(scsi_qla_host_t *, uint8_t *, uint32_t,
uint32_t);
 extern int qla2x00_is_a_vp_did(scsi_qla_host_t *, uint32_t);
-bool qla2x00_check_reg_for_disconnect(scsi_qla_host_t *, uint32_t);
+bool qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *, uint32_t);
+bool qla2x00_check_reg16_for_disconnect(scsi_qla_host_t *, uint16_t);
 
 extern int qla2x00_beacon_on(struct scsi_qla_host *);
 extern int qla2x00_beacon_off(struct scsi_qla_host *);
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index bea0e74..016ebed 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -56,16 +56,8 @@ qla2100_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
hccr = RD_REG_WORD(®->hccr);
-   /* Check for PCI disconnection */
-   if (hccr == 0x) {
-   /*
-* Schedule this on the default system workqueue so that
-* all the adapter workqueues and the DPC thread can be
-* shutdown cleanly.
-*/
-   schedule_work(&ha->board_disable);
+   if (qla2x00_check_reg16_for_disconnect(vha, hccr))
break;
-   }
if (hccr & HCCR_RISC_PAUSE) {
if (pci_channel_offline(ha->pdev))
break;
@@ -121,7 +113,7 @@ qla2100_intr_handler(int irq, void *dev_id)
 }
 
 bool
-qla2x00_check_reg_for_disconnect(scsi_qla_host_t *vha, uint32_t reg)
+qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
@@ -136,6 +128,12 @@ qla2x00_check_reg_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
return false;
 }
 
+bool
+qla2x00_check_reg16_for_disconnect(scsi_qla_host_t *vha, uint16_t reg)
+{
+   return qla2x00_check_reg32_for_disconnect(vha, 0x | reg);
+}
+
 /**
  * qla2300_intr_handler() - Process interrupts for the ISP23xx and ISP63xx.
  * @irq:
@@ -174,7 +172,7 @@ qla2300_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
stat = RD_REG_DWORD(®->u.isp2300.host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
break;
if (stat & HSR_RISC_PAUSED) {
if (unlikely(pci_channel_offline(ha->pdev)))
@@ -2631,7 +2629,7 @@ qla24xx_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
stat = RD_REG_DWORD(®->host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
break;
if (stat & HSRX_RISC_PAUSED) {
if (unlikely(pci_channel_offline(ha->pdev)))
@@ -2721,7 +2719,7 @@ qla24xx_msix_rsp_q(int irq, void *dev_id)
 * we process the response queue.
 */
stat = RD_REG_DWORD(®->host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
goto out;
qla24xx_process_response_queue(vha, rsp);
if (!ha->flags.disable_msix_handshake) {
@@ -2761,7 +2759,7 @@ qla25xx_msix_rsp_q(int irq, void *dev_id)
hccr = RD_REG_DWORD_RELAXED(®->hccr);
spin_unlock_irqrestore(&ha->hardware_lock, flags);
}
-   if (qla2x00_check_reg_for_disconnect(vha, hccr))
+   if (qla2x00_check_reg32_for_disconnect(vha, hccr))
goto out;
queue_work_on((int) (rsp->id - 1), ha->wq, &rsp->q_work);
 
@@ -2796,7 +2794,7 @@ qla24xx_msix_default(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
  

[PATCH RESEND 4/6] qla2xxx: Schedule board_disable only once

2014-08-26 Thread Joe Lawrence
There are various callers of qla2x00_check_reg{32,16}_for_disconnect
that may schedule board removal on PCI-disconnect. Test-and-set a
dedicated flag before scheduling board_disable so it is invoked only
once.

Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_def.h |3 +++
 drivers/scsi/qla2xxx/qla_isr.c |   14 --
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index b643991..4267ec6 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3402,6 +3402,9 @@ typedef struct scsi_qla_host {
 #define FX00_CRITEMP_RECOVERY  25
 #define FX00_HOST_INFO_RESEND  26
 
+   unsigned long   pci_flags;
+#define PFLG_DISCONNECTED  0   /* PCI device removed */
+
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
 #define DFLG_NO_CABLE  BIT_1
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 016ebed..fd75b91 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -117,12 +117,14 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
-   /*
-* Schedule this on the default system workqueue so that all the
-* adapter workqueues and the DPC thread can be shutdown
-* cleanly.
-*/
-   schedule_work(&vha->hw->board_disable);
+   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags)) {
+   /*
+* Schedule this (only once) on the default system
+* workqueue so that all the adapter workqueues and the
+* DPC thread can be shutdown cleanly.
+*/
+   schedule_work(&vha->hw->board_disable);
+   }
return true;
} else
return false;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 5/6] qla2xxx: Prevent removal and board_disable race

2014-08-26 Thread Joe Lawrence
Introduce mutual exclusion between the qla2xxx_remove_one PCI driver
callback and qla2x00_disable_board_on_pci_error, which is scheduled as
board_disable work by qla2x00_check_reg{32,16}_for_disconnect:

* Leave the driver-specific data attached to the underlying PCI device
intact in qla2x00_disable_board_on_pci_error, so that qla2x00_remove_one
has enough breadcrumbs to determine that any board_disable work has been
completed.

* In qla2xxx_remove_one, set a bit to prevent any subsequent
board_disable work from scheduling, then cancel and wait until pending
work has completed.

* Reuse the PCI device enable count check in qla2x00_remove_one to
determine if board_disable has occured. The original purpose of this
check was unnecessary since the driver remove function wasn't called
when the probe fails.

Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_def.h |1 +
 drivers/scsi/qla2xxx/qla_isr.c |3 ++-
 drivers/scsi/qla2xxx/qla_os.c  |   31 +++
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 4267ec6..1398818 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3404,6 +3404,7 @@ typedef struct scsi_qla_host {
 
unsigned long   pci_flags;
 #define PFLG_DISCONNECTED  0   /* PCI device removed */
+#define PFLG_DRIVER_REMOVING   1   /* PCI driver .remove */
 
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index fd75b91..341c64d 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -117,7 +117,8 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
-   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags)) {
+   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
+   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags)) {
/*
 * Schedule this (only once) on the default system
 * workqueue so that all the adapter workqueues and the
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 3bfa89d..84d4df6 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3131,15 +3131,25 @@ qla2x00_remove_one(struct pci_dev *pdev)
scsi_qla_host_t *base_vha;
struct qla_hw_data  *ha;
 
+   base_vha = pci_get_drvdata(pdev);
+   ha = base_vha->hw;
+
+   /* Indicate device removal to prevent future board_disable and wait
+* until any pending board_disable has completed. */
+   set_bit(PFLG_DRIVER_REMOVING, &base_vha->pci_flags);
+   cancel_work_sync(&ha->board_disable);
+
/*
-* If the PCI device is disabled that means that probe failed and any
-* resources should be have cleaned up on probe exit.
+* If the PCI device is disabled then there was a PCI-disconnect and
+* qla2x00_disable_board_on_pci_error has taken care of most of the
+* resources.
 */
-   if (!atomic_read(&pdev->enable_cnt))
+   if (!atomic_read(&pdev->enable_cnt)) {
+   scsi_host_put(base_vha->host);
+   kfree(ha);
+   pci_set_drvdata(pdev, NULL);
return;
-
-   base_vha = pci_get_drvdata(pdev);
-   ha = base_vha->hw;
+   }
 
qla2x00_wait_for_hba_ready(base_vha);
 
@@ -4799,18 +4809,15 @@ qla2x00_disable_board_on_pci_error(struct work_struct 
*work)
qla82xx_md_free(base_vha);
qla2x00_free_queues(ha);
 
-   scsi_host_put(base_vha->host);
-
qla2x00_unmap_iobases(ha);
 
pci_release_selected_regions(ha->pdev, ha->bars);
-   kfree(ha);
-   ha = NULL;
-
pci_disable_pcie_error_reporting(pdev);
pci_disable_device(pdev);
-   pci_set_drvdata(pdev, NULL);
 
+   /*
+* Let qla2x00_remove_one cleanup qla_hw_data on device removal.
+*/
 }
 
 /**
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 2/6] qla2xxx: Use qla2x00_clear_drv_active on probe failure

2014-08-26 Thread Joe Lawrence
Take advantage of commit fe1b806f4f71 ("qla2xxx: Refactor shutdown code
so some functionality can be reused") to remove an inlined copy of
qla2x00_clear_drv_active in the driver's probe hardware error path.

Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_os.c |   13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 8252c0e..53449d7 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -238,6 +238,7 @@ static int qla2xxx_eh_host_reset(struct scsi_cmnd *);
 
 static int qla2x00_change_queue_depth(struct scsi_device *, int, int);
 static int qla2x00_change_queue_type(struct scsi_device *, int);
+static void qla2x00_clear_drv_active(struct qla_hw_data *);
 static void qla2x00_free_device(scsi_qla_host_t *);
 
 struct scsi_host_template qla2xxx_driver_template = {
@@ -2954,16 +2955,8 @@ probe_failed:
scsi_host_put(base_vha->host);
 
 probe_hw_failed:
-   if (IS_QLA82XX(ha)) {
-   qla82xx_idc_lock(ha);
-   qla82xx_clear_drv_active(ha);
-   qla82xx_idc_unlock(ha);
-   }
-   if (IS_QLA8044(ha)) {
-   qla8044_idc_lock(ha);
-   qla8044_clear_drv_active(ha);
-   qla8044_idc_unlock(ha);
-   }
+   qla2x00_clear_drv_active(ha);
+
 iospace_config_failed:
if (IS_P3P_TYPE(ha)) {
if (!ha->nx_pcibase)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 1/6] qla2xxx: Fix shost use-after-free on device removal

2014-08-26 Thread Joe Lawrence
Once calling scsi_host_put, be careful to not access qla_hw_data through
the Scsi_Host private data (ie, scsi_qla_host base_vha).

Fixes: fe1b806f4f71 ("qla2xxx: Refactor shutdown code so some functionality can 
be reused") 
Cc: sta...@vger.kernel.org # 3.14, 3.15, 3.16
Signed-off-by: Joe Lawrence 
Acked-by: Chad Dupuis 
---
 drivers/scsi/qla2xxx/qla_os.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index be9698d..8252c0e 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3119,10 +3119,8 @@ qla2x00_unmap_iobases(struct qla_hw_data *ha)
 }
 
 static void
-qla2x00_clear_drv_active(scsi_qla_host_t *vha)
+qla2x00_clear_drv_active(struct qla_hw_data *ha)
 {
-   struct qla_hw_data *ha = vha->hw;
-
if (IS_QLA8044(ha)) {
qla8044_idc_lock(ha);
qla8044_clear_drv_active(ha);
@@ -3193,7 +3191,7 @@ qla2x00_remove_one(struct pci_dev *pdev)
 
scsi_host_put(base_vha->host);
 
-   qla2x00_clear_drv_active(base_vha);
+   qla2x00_clear_drv_active(ha);
 
qla2x00_unmap_iobases(ha);
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND 0/6] qla2xxx device removal fixups

2014-08-26 Thread Joe Lawrence
Resending patchset against scsi-queue/drivers-for-3.18 with Acked-by and
Cc stable annotations.

Joe Lawrence (6):
  qla2xxx: Fix shost use-after-free on device removal
  qla2xxx: Use qla2x00_clear_drv_active on probe failure
  qla2xxx: Collect PCI register checks and board_disable scheduling
  qla2xxx: Schedule board_disable only once
  qla2xxx: Prevent removal and board_disable race
  qla2xxx: Prevent probe and board_disable race

 drivers/scsi/qla2xxx/qla_def.h |5 
 drivers/scsi/qla2xxx/qla_gbl.h |3 +-
 drivers/scsi/qla2xxx/qla_isr.c |   44 +++--
 drivers/scsi/qla2xxx/qla_mr.c  |2 +-
 drivers/scsi/qla2xxx/qla_nx.c  |6 ++--
 drivers/scsi/qla2xxx/qla_os.c  |   60 ++--
 6 files changed, 61 insertions(+), 59 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] qla2xxx device removal fixups

2014-08-26 Thread Joe Lawrence
On Wed, 18 Jun 2014 10:02:13 -0400
Joe Lawrence  wrote:

> Hi Chad, Giri, et al.
> 
> Stratus has been testing the upstream qla2xxx driver against surprise
> device removal and has found a few minor issues along the way.  With
> this patchset, results have been good.  Although the following changes
> may be most interesting on a Stratus platform, they should be applicable
> to general hotplug scenarios on other hardware.
> 
> The first two patches are independent from the others and can be
> reviewed as such.  One fixes up a use-after-free and the other
> consolidates some code.

Patch 1 corrects a bug introduced by commit fe1b806, which has made its
way into three linux-stable releases:

% git tag --contains fe1b806 | grep '^v3.[0-9]*$'
v3.14
v3.15
v3.16

> The final four patches are more invasive and modify the scsi_qla_host
> structure to avoid device removal race conditions.  These changes were
> written to demonstrate the problem at hand and minimally fix them in
> order to continue testing.  (For example, adding completely a new
> pci_flags member to scsi_qla_host is probably overkill.) 
> 
> We would welcome any feedback or questions about our testing.

Chad ack'd the series on 28 Jul 2014.  Any chance this can make into
hch's queue for 3.18?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] qla2xxx: Prevent removal and board_disable race

2014-07-25 Thread Joe Lawrence
On Fri, 25 Jul 2014 11:23:02 -0400
Chad Dupuis  wrote:

> 
> 
> On Wed, 18 Jun 2014, Joe Lawrence wrote:
> 
> > Introduce mutual exclusion between the qla2xxx_remove_one PCI driver
> > callback and qla2x00_disable_board_on_pci_error, which is scheduled as
> > board_disable work by qla2x00_check_reg{32,16}_for_disconnect:
> >
> > * Leave the driver-specific data attached to the underlying PCI device
> > intact in qla2x00_disable_board_on_pci_error, so that qla2x00_remove_one
> > has enough breadcrumbs to determine that any board_disable work has been
> > completed.
> >
> > * In qla2xxx_remove_one, set a bit to prevent any subsequent
> > board_disable work from scheduling, then cancel and wait until pending
> > work has completed.
> >
> > * Reuse the PCI device enable count check in qla2x00_remove_one to
> > determine if board_disable has occured.  The original purpose of this
> > check was unnecessary since the driver remove function wasn't called
> > when the probe fails.
> >
> > Signed-off-by: Joe Lawrence 
> > ---
> > drivers/scsi/qla2xxx/qla_def.h |1 +
> > drivers/scsi/qla2xxx/qla_isr.c |3 ++-
> > drivers/scsi/qla2xxx/qla_os.c  |   31 +++
> > 3 files changed, 22 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
> > index 1267b11..7c441c9 100644
> > --- a/drivers/scsi/qla2xxx/qla_def.h
> > +++ b/drivers/scsi/qla2xxx/qla_def.h
> > @@ -3404,6 +3404,7 @@ typedef struct scsi_qla_host {
> >
> > unsigned long   pci_flags;
> > #define PFLG_DISCONNECTED   0   /* PCI device removed */
> > +#define PFLG_DRIVER_REMOVING   1   /* PCI driver .remove */
> >
> > uint32_tdevice_flags;
> > #define SWITCH_FOUNDBIT_0
> > diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> > index 2741ad8..ee5eef4 100644
> > --- a/drivers/scsi/qla2xxx/qla_isr.c
> > +++ b/drivers/scsi/qla2xxx/qla_isr.c
> > @@ -117,7 +117,8 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t 
> > *vha, uint32_t reg)
> > {
> > /* Check for PCI disconnection */
> > if (reg == 0x) {
> > -   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags)) {
> > +   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
> > +   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags)) {
> 
> In our remove function we set a bit that we are unloading:
> 
> set_bit (UNLOADING, &base_vha->dpc_flags);
> 
> could this be used instead?

Hi Chad,

Thanks for the comments.

I think with a little bit of code shuffling this could be accomplished.
My goal with the patchset was to try and present the problem/fix as
plain as possible.  It was easiest to collect all the atomic bits I
needed inside a single variable.  Should I be tacking on such flags to
'dpc_flags' ?

> 
> > /*
> >  * Schedule this (only once) on the default system
> >  * workqueue so that all the adapter workqueues and the
> > diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> > index 39c9953..51cba37 100644
> > --- a/drivers/scsi/qla2xxx/qla_os.c
> > +++ b/drivers/scsi/qla2xxx/qla_os.c
> > @@ -3123,15 +3123,25 @@ qla2x00_remove_one(struct pci_dev *pdev)
> > scsi_qla_host_t *base_vha;
> > struct qla_hw_data  *ha;
> >
> > +   base_vha = pci_get_drvdata(pdev);
> > +   ha = base_vha->hw;
> > +
> > +   /* Indicate device removal to prevent future board_disable and wait
> > +* until any pending board_disable has completed. */
> > +   set_bit(PFLG_DRIVER_REMOVING, &base_vha->pci_flags);
> > +   cancel_work_sync(&ha->board_disable);
> > +
> > /*
> > -* If the PCI device is disabled that means that probe failed and any
> > -* resources should be have cleaned up on probe exit.
> > +* If the PCI device is disabled then there was a PCI-disconnect and
> > +* qla2x00_disable_board_on_pci_error has taken care of most of the
> > +* resources.
> >  */
> > -   if (!atomic_read(&pdev->enable_cnt))
> > +   if (!atomic_read(&pdev->enable_cnt)) {
> 
> Should we also add a check here that this is a disconnection before 
> freeing these structs?  The original intent of the check for 
> pdev->enable_cnt is to make sure we don't try to dereference an already 
> freed struct if probe failed.

I'm not exactly

Re: [PATCH] lpfc: Avoid to disable pci_dev twice

2014-07-17 Thread Joe Lawrence
[ +cc linux-pci and Bjorn, comments inline/below ... ]

On Thu, 17 Jul 2014 02:32:31 -0400
Mike Qiu  wrote:

> In IBM Power servers, when hardware error occurs during probe
> state, EEH subsystem will call driver's error_detected interface,
> which will call pci_disable_device(). But driver's probe function also
> call pci_disable_device() in this situation.
> 
> So pci_dev will be disabled twice:
> 
> Device lpfc disabling already-disabled device
> [ cut here ]
> WARNING: at drivers/pci/pci.c:1407
> CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW
> 3.10.42-2002.pkvm2_1_1.6.ppc64 #1
> Workqueue: events .work_for_cpu_fn
> task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
> NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
> REGS: c027d395b650 TRAP: 0700   Tainted: GW 
> (3.10.42-2002.pkvm2_1_1.6.ppc64)
> MSR: 900100029032   CR: 28b52b44  XER: 2000
> CFAR: c0879ab8 SOFTE: 1
> ...
> NIP .pci_disable_device+0xcc/0xe0
> LR  .pci_disable_device+0xc8/0xe0
> Call Trace:
> .pci_disable_device+0xc8/0xe0 (unreliable)
> .lpfc_disable_pci_dev+0x50/0x80 [lpfc]
> .lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
> .local_pci_probe+0x68/0xb0
> .work_for_cpu_fn+0x38/0x60
> .process_one_work+0x1a4/0x4d0
> .worker_thread+0x37c/0x490
> .kthread+0xf0/0x100
> .ret_from_kernel_thread+0x5c/0x80
> 
> Signed-off-by: Mike Qiu 
> ---
>  drivers/scsi/lpfc/lpfc.h  |  1 +
>  drivers/scsi/lpfc/lpfc_init.c | 59 
> +++
>  2 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
> index 434e903..0c7bad9 100644
> --- a/drivers/scsi/lpfc/lpfc.h
> +++ b/drivers/scsi/lpfc/lpfc.h
> @@ -813,6 +813,7 @@ struct lpfc_hba {
>  #define VPD_MASK0xf /* mask for any vpd data */
>  
>   uint8_t soft_wwn_enable;
> + uint8_t probe_done;
>  
>   struct timer_list fcp_poll_timer;
>   struct timer_list eratt_poll;
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 06f9a5b..c2e67ae 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const 
> struct pci_device_id *pid)
>   }
>   }
>  
> + /* Set the probe flag */
> + phba->probe_done = 1;
> +
>   /* Perform post initialization setup */
>   lpfc_post_init_setup(phba);
>  
> @@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)
>  static void
>  lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
>  {
> + if (phba)
> + return;
> +

Should that be "if *not* phba" like the others below?

>   lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
>   "2710 PCI channel disable preparing for reset\n");
>  
> @@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
>  
>   /* Disable interrupt and pci device */
>   lpfc_sli_disable_intr(phba);
> - pci_disable_device(phba->pcidev);
> + if (phba->probe_done && phba->pcidev)
> + pci_disable_device(phba->pcidev);
>  }
>  
>  /**
> @@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const 
> struct pci_device_id *pid)
>   goto out_disable_intr;
>   }
>  
> + /* Set probe_done flag */
> + phba->probe_done = 1;
> +
>   /* Log the current active interrupt mode */
>   phba->intr_mode = intr_mode;
>   lpfc_log_intr_mode(phba, intr_mode);
> @@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
>  static void
>  lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
>  {
> + if (!phba)
> + return;
> +
>   lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
>   "2826 PCI channel disable preparing for reset\n");
>  
> @@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
>   /* Disable interrupt and pci device */
>   lpfc_sli4_disable_intr(phba);
>   lpfc_sli4_queue_destroy(phba);
> - pci_disable_device(phba->pcidev);
> +
> + if (phba->probe_done && phba->pcidev)
> + pci_disable_device(phba->pcidev);
>  }
>  
>  /**
> @@ -10893,9 +10908,21 @@ static pci_ers_result_t
>  lpfc_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
>  {
>   struct Scsi_Host *shost = pci_get_drvdata(pdev);
> - struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
> + struct lpfc_hba *phba;
>   pci_ers_result_t rc = PCI_ERS_RESULT_DISCONNECT;
>  
> + if (!shost)
> + /* Run here means it may during probe state and
> +  * Scsi_Host has not been created and We can do nothing
> +  * in this state so call for hotplug*/
> + return PCI_ERS_RESULT_NONE;

Is it possible to get here during device removal, ie
lpfc_pci_remove_one?  If so, we may have shost in hand now, but can
these routi

Re: mpt2sas stuck installing

2014-07-14 Thread Joe Lawrence
On Sat, 12 Jul 2014 03:13:07 +
"Elliott, Robert (Server Storage)"  wrote:

> 
> 
> > -Original Message-
> > From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> > ow...@vger.kernel.org] On Behalf Of Joe Lawrence
> ...
> > In your crash stack trace, the scsi error handler has issued a host
> > reset, but then crashed in mpt2sas_base_get_iocstate.  Reading through
> > _scsih_shutdown, I don't believe that the mpt2sas .shutdown path takes
> > any precaution before heading down mpt2sas_base_detach to free adapter
> > resources.  The ordinary .remove path looks similar, though it does
> > call sas_remove_host before freeing resources, *then* scsi_remove_host
> > and scsi_host_put.
> > 
> > I wonder if this ordering needs to be reversed (and added to
> > _scsih_shutdown) to properly de-register from the SCSI midlayer prior
> > to removing the controller instance.
> > 
> > Regards,
> > 
> > -- Joe
> 
> Nagalakshmi was working on an mpt3sas patch for the scsi-mq tree 
> to do just that.  I don't recall if the patch has made it into the 
> scsi-mq tree yet. Apparently it's also needed for non-mq and mpt2sas.
> 
> It is making this change:
> > sas_remove_host(shost);
> > +   scsi_remove_host(shost);
> > mpt3sas_base_detach(ioc);
> > list_del(&ioc->list);
> > -   scsi_remove_host(shost);
> > scsi_host_put(shost);
> 
> We are making a similar change in hpsa.  Doing so led to a general 
> protection fault, which unveiled that we also needed to change 
> cancel_delayed_work() calls to cancel_delayed_work_sync() to 
> ensure there are no worker functions still active after the 
> scsi_host structure is unregistered.

It looks like Sreekanth posted those patches today -- the new ordering
seems correct, but is there any reason why _scsih_shutdown skips
{sas,scsi}_remove_host calls?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt2sas: mpt2sas_base.c: Fix for possible null pointer dereference

2014-07-03 Thread Joe Lawrence
On Wed, 28 May 2014, Christoph Hellwig wrote:

> > -   ioc_status = le16_to_cpu(mpi_reply->IOCStatus) & MPI2_IOCSTATUS_MASK;
> > +   if (mpi_reply) {
> > +   ioc_status = le16_to_cpu(mpi_reply->IOCStatus) & 
> > MPI2_IOCSTATUS_MASK;
> > +   }
> >  
> > if (ioc_status != MPI2_IOCSTATUS_SUCCESS)
> > ioc->port_enable_failed = 1;
> 
> ioc_status isn't initialized without the reply and used here as well
> as later in the function.  I think we'll need input from LSI or others
> with the spec on what to do when we didn't get a reply.

Any update on this? 

The mpt3 version checks for !mpi_reply and returns 1.  Which leads to 
another question -- should mpt{2,3}sas_port_enable_done ever return 0 (as their 
respective comments describe)?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] drivers: message: fusion: Simplify rounding

2014-07-03 Thread Joe Lawrence
ALIGN is certainly more readable to me.

Reviewed-by: Joe Lawrence 

-- Joe

On Tue, 1 Jul 2014, Rasmus Villemoes wrote:

> Rounding up to a multiple of 4 should be done using the ALIGN
> macro. As a bonus, this also makes the generated code smaller.
> 
> In GetIocFacts(), sz is assigned to a few lines below without being
> read in the meantime, so it is ok that it doesn't end up with the same
> value as facts->FWImageSize.
> 
> Signed-off-by: Rasmus Villemoes 
> ---
>  drivers/message/fusion/mptbase.c | 7 +--
>  drivers/message/fusion/mptctl.c  | 7 +--
>  2 files changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/message/fusion/mptbase.c 
> b/drivers/message/fusion/mptbase.c
> index ebc0af7..10b16a1 100644
> --- a/drivers/message/fusion/mptbase.c
> +++ b/drivers/message/fusion/mptbase.c
> @@ -3175,12 +3175,7 @@ GetIocFacts(MPT_ADAPTER *ioc, int sleepFlag, int 
> reason)
>   facts->FWImageSize = le32_to_cpu(facts->FWImageSize);
>   }
>  
> - sz = facts->FWImageSize;
> - if ( sz & 0x01 )
> - sz += 1;
> - if ( sz & 0x02 )
> - sz += 2;
> - facts->FWImageSize = sz;
> + facts->FWImageSize = ALIGN(facts->FWImageSize, 4);
>  
>   if (!facts->RequestFrameSize) {
>   /*  Something is wrong!  */
> diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c
> index 8a050e8..1004392 100644
> --- a/drivers/message/fusion/mptctl.c
> +++ b/drivers/message/fusion/mptctl.c
> @@ -1749,12 +1749,7 @@ mptctl_replace_fw (unsigned long arg)
>  
>   /* Allocate memory for the new FW image
>*/
> - newFwSize = karg.newImageSize;
> -
> - if (newFwSize & 0x01)
> - newFwSize += 1;
> - if (newFwSize & 0x02)
> - newFwSize += 2;
> + newFwSize = ALIGN(karg.newImageSize, 4);
>  
>   mpt_alloc_fw_memory(ioc, newFwSize);
>   if (ioc->cached_fw == NULL)
> -- 
> 1.9.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 10/11] mptfusion: combine fw_event_work and its event_data

2014-06-25 Thread Joe Lawrence
Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation.  Rework fw_event_work
allocation calculations to include the event_data size where
appropriate.

This clarifies the code a bit and avoids the following smatch warnings:

  drivers/message/fusion/mptsas.c:1003 mptsas_queue_device_delete()
error: memcpy() 'fw_event->event_data' too small (29 vs 36)

  drivers/message/fusion/mptsas.c:1017 mptsas_queue_rescan() error: not
allocating enough data 168 vs 160

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
Reviewed-by: Christoph Hellwig 
---
 drivers/message/fusion/mptsas.c |   16 ++--
 drivers/message/fusion/mptsas.h |2 +-
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 54b51a9..5454d838 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -990,11 +990,10 @@ mptsas_queue_device_delete(MPT_ADAPTER *ioc,
MpiEventDataSasDeviceStatusChange_t *sas_event_data)
 {
struct fw_event_work *fw_event;
-   int sz;
 
-   sz = offsetof(struct fw_event_work, event_data) +
-   sizeof(MpiEventDataSasDeviceStatusChange_t);
-   fw_event = kzalloc(sz, GFP_ATOMIC);
+   fw_event = kzalloc(sizeof(*fw_event) +
+  sizeof(MpiEventDataSasDeviceStatusChange_t),
+  GFP_ATOMIC);
if (!fw_event) {
printk(MYIOC_s_WARN_FMT "%s: failed at (line=%d)\n",
ioc->name, __func__, __LINE__);
@@ -1011,10 +1010,8 @@ static void
 mptsas_queue_rescan(MPT_ADAPTER *ioc)
 {
struct fw_event_work *fw_event;
-   int sz;
 
-   sz = offsetof(struct fw_event_work, event_data);
-   fw_event = kzalloc(sz, GFP_ATOMIC);
+   fw_event = kzalloc(sizeof(*fw_event), GFP_ATOMIC);
if (!fw_event) {
printk(MYIOC_s_WARN_FMT "%s: failed at (line=%d)\n",
ioc->name, __func__, __LINE__);
@@ -4983,7 +4980,7 @@ static int
 mptsas_event_process(MPT_ADAPTER *ioc, EventNotificationReply_t *reply)
 {
u32 event = le32_to_cpu(reply->Event);
-   int sz, event_data_sz;
+   int event_data_sz;
struct fw_event_work *fw_event;
unsigned long delay;
 
@@ -5093,8 +5090,7 @@ mptsas_event_process(MPT_ADAPTER *ioc, 
EventNotificationReply_t *reply)
 
event_data_sz = ((reply->MsgLength * 4) -
offsetof(EventNotificationReply_t, Data));
-   sz = offsetof(struct fw_event_work, event_data) + event_data_sz;
-   fw_event = kzalloc(sz, GFP_ATOMIC);
+   fw_event = kzalloc(sizeof(*fw_event) + event_data_sz, GFP_ATOMIC);
if (!fw_event) {
printk(MYIOC_s_WARN_FMT "%s: failed at (line=%d)\n", ioc->name,
 __func__, __LINE__);
diff --git a/drivers/message/fusion/mptsas.h b/drivers/message/fusion/mptsas.h
index 57e86ab..c396483 100644
--- a/drivers/message/fusion/mptsas.h
+++ b/drivers/message/fusion/mptsas.h
@@ -110,7 +110,7 @@ struct fw_event_work {
MPT_ADAPTER *ioc;
u32 event;
u8  retries;
-   u8  __attribute__((aligned(4))) event_data[1];
+   charevent_data[0] __aligned(4);
 };
 
 struct mptsas_discovery_event {
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 02/11] mpt2sas: combine fw_event_work and its event_data

2014-06-25 Thread Joe Lawrence
Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation.  This matches the
convention in the mptfusion driver and simplifies the code.

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Christoph Hellwig 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   52 --
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c 
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 13e49c3..e7801ff 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -173,7 +173,7 @@ struct fw_event_work {
u8  VP_ID;
u8  ignore;
u16 event;
-   void*event_data;
+   charevent_data[0] __aligned(4);
 };
 
 /* raid transport support */
@@ -2834,7 +2834,6 @@ _scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct 
fw_event_work
 
spin_lock_irqsave(&ioc->fw_event_lock, flags);
list_del(&fw_event->list);
-   kfree(fw_event->event_data);
kfree(fw_event);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
@@ -3520,7 +3519,8 @@ _scsih_check_topo_delete_events(struct MPT2SAS_ADAPTER 
*ioc,
if (fw_event->event != MPI2_EVENT_SAS_TOPOLOGY_CHANGE_LIST ||
fw_event->ignore)
continue;
-   local_event_data = fw_event->event_data;
+   local_event_data = (Mpi2EventDataSasTopologyChangeList_t *)
+   fw_event->event_data;
if (local_event_data->ExpStatus ==
MPI2_EVENT_SAS_TOPO_ES_ADDED ||
local_event_data->ExpStatus ==
@@ -5504,7 +5504,9 @@ _scsih_sas_topology_change_event(struct MPT2SAS_ADAPTER 
*ioc,
u64 sas_address;
unsigned long flags;
u8 link_rate, prev_link_rate;
-   Mpi2EventDataSasTopologyChangeList_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasTopologyChangeList_t *event_data =
+   (Mpi2EventDataSasTopologyChangeList_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5699,7 +5701,8 @@ _scsih_sas_device_status_change_event(struct 
MPT2SAS_ADAPTER *ioc,
u64 sas_address;
unsigned long flags;
Mpi2EventDataSasDeviceStatusChange_t *event_data =
-   fw_event->event_data;
+   (Mpi2EventDataSasDeviceStatusChange_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5794,6 +5797,7 @@ _scsih_sas_enclosure_dev_status_change_event(struct 
MPT2SAS_ADAPTER *ioc,
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
_scsih_sas_enclosure_dev_status_change_event_debug(ioc,
+(Mpi2EventDataSasEnclDevStatusChange_t *)
 fw_event->event_data);
 #endif
 }
@@ -5818,7 +5822,9 @@ _scsih_sas_broadcast_primitive_event(struct 
MPT2SAS_ADAPTER *ioc,
u32 termination_count;
u32 query_count;
Mpi2SCSITaskManagementReply_t *mpi_reply;
-   Mpi2EventDataSasBroadcastPrimitive_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasBroadcastPrimitive_t *event_data =
+   (Mpi2EventDataSasBroadcastPrimitive_t *)
+   fw_event->event_data;
u16 ioc_status;
unsigned long flags;
int r;
@@ -5969,7 +5975,9 @@ static void
 _scsih_sas_discovery_event(struct MPT2SAS_ADAPTER *ioc,
 struct fw_event_work *fw_event)
 {
-   Mpi2EventDataSasDiscovery_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasDiscovery_t *event_data =
+   (Mpi2EventDataSasDiscovery_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK) {
@@ -6357,7 +6365,9 @@ _scsih_sas_ir_config_change_event(struct MPT2SAS_ADAPTER 
*ioc,
Mpi2EventIrConfigElement_t *element;
int i;
u8 foreign_config;
-   Mpi2EventDataIrConfigChangeList_t *event_data = fw_event->event_data;
+   Mpi2EventDataIrConfigChangeList_t *event_data =
+   (Mpi2EventDataIrConfigChangeList_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
if ((ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -6425,7 +6435,9 @@ _scsih_sas_ir_volume_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle;
u32 state;
int rc;
-   Mpi2EventDataIrVolume_t *event_data = fw_event->event_data;
+   Mpi2EventDataIrVolume_t *event_data =
+   (Mpi2EventDataIrVolume_t *)
+ 

[PATCH v2 11/11] mptfusion: tweak null pointer checks

2014-06-25 Thread Joe Lawrence
Fixes the following smatch warnings:

  drivers/message/fusion/mptbase.c:652 mptbase_reply() warn: variable
dereferenced before check 'reply' (see line 639)

  [JL: No-brainer, the enclosing switch statement dereferences
   reply, so we can't get here unless reply is valid.]

  drivers/message/fusion/mptsas.c:1255 mptsas_taskmgmt_complete() error:
we previously assumed 'pScsiTmReply' could be null (see line 1227)

  [HCH: Reading the code in mptsas_taskmgmt_complete it's pretty
   obvious that it can't do anything useful if mr/pScsiTmReply are
   NULL, so I suspect it would be best to just return at the
   beginning of the function.

   I'd love to understand if it actually could ever be zero, which I
   doubt.  Maybe the LSI people can shed some light on that?]

  drivers/message/fusion/mptsas.c:3888 mptsas_not_responding_devices()
error: we previously assumed 'port_info->phy_info' could be null
(see line 3875)

  [HCH: It's pretty obvious from reading mptsas_sas_io_unit_pg0 that
   we never register a port_info with a NULL phy_info in the lists,
   so all NULL checks on it could be deleted.]

  drivers/message/fusion/mptscsih.c:1284 mptscsih_info() error:
we previously assumed 'h' could be null (see line 1274)

  [HCH: shost_priv can't return NULL, so the if (h) should be
   removed.]

  drivers/message/fusion/mptscsih.c:1388 mptscsih_qcmd() error: we
previously assumed 'vdevice' could be null (see line 1373)

  [HCH: vdevice can't ever be NULL here, it's allocated in
   ->slave_alloc and thus guaranteed to be around when
   ->queuecommand is called.]

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptbase.c  |   10 +++
 drivers/message/fusion/mptsas.c   |   52 ++---
 drivers/message/fusion/mptscsih.c |   19 ++
 3 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index 9d4c782..a896d94 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -649,12 +649,10 @@ mptbase_reply(MPT_ADAPTER *ioc, MPT_FRAME_HDR *req, 
MPT_FRAME_HDR *reply)
case MPI_FUNCTION_CONFIG:
case MPI_FUNCTION_SAS_IO_UNIT_CONTROL:
ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_COMMAND_GOOD;
-   if (reply) {
-   ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_RF_VALID;
-   memcpy(ioc->mptbase_cmds.reply, reply,
-   min(MPT_DEFAULT_FRAME_SIZE,
-   4 * reply->u.reply.MsgLength));
-   }
+   ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_RF_VALID;
+   memcpy(ioc->mptbase_cmds.reply, reply,
+   min(MPT_DEFAULT_FRAME_SIZE,
+   4 * reply->u.reply.MsgLength));
if (ioc->mptbase_cmds.status & MPT_MGMT_STATUS_PENDING) {
ioc->mptbase_cmds.status &= ~MPT_MGMT_STATUS_PENDING;
complete(&ioc->mptbase_cmds.done);
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 5454d838..e3494a6 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1203,27 +1203,28 @@ mptsas_taskmgmt_complete(MPT_ADAPTER *ioc, 
MPT_FRAME_HDR *mf, MPT_FRAME_HDR *mr)
"(mf = %p, mr = %p)\n", ioc->name, mf, mr));
 
pScsiTmReply = (SCSITaskMgmtReply_t *)mr;
-   if (pScsiTmReply) {
-   dtmprintk(ioc, printk(MYIOC_s_DEBUG_FMT
-   "\tTaskMgmt completed: fw_channel = %d, fw_id = %d,\n"
-   "\ttask_type = 0x%02X, iocstatus = 0x%04X "
-   "loginfo = 0x%08X,\n\tresponse_code = 0x%02X, "
-   "term_cmnds = %d\n", ioc->name,
-   pScsiTmReply->Bus, pScsiTmReply->TargetID,
-   pScsiTmReply->TaskType,
-   le16_to_cpu(pScsiTmReply->IOCStatus),
-   le32_to_cpu(pScsiTmReply->IOCLogInfo),
-   pScsiTmReply->ResponseCode,
-   le32_to_cpu(pScsiTmReply->TerminationCount)));
-
-   if (pScsiTmReply->ResponseCode)
-   mptscsih_taskmgmt_response_code(ioc,
-   pScsiTmReply->ResponseCode);
-   }
-
-   if (pScsiTmReply && (pScsiTmReply->TaskType ==
+   if (!pScsiTmReply)
+   return 0;
+
+   dtmprintk(ioc, printk(MYIOC_s_DEBUG_FMT
+   "\tTaskMgmt completed: fw_channel = %d, fw_id = %d,\n"
+   "\ttask_type = 0x%02X, iocstatus = 0

[PATCH v2 09/11] mptfusion: make adapter prod_name[] a pointer

2014-06-25 Thread Joe Lawrence
The struct _MPT_ADAPTER doesn't need a full copy of the product string,
so prod_name can point to the string literal storage that the driver
already provides.

Avoids the following smatch warning:

  drivers/message/fusion/mptbase.c:2858 MptDisplayIocCapabilities()
warn: this array is probably non-NULL. 'ioc->prod_name'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
Reviewed-by: Christoph Hellwig 
---
 drivers/message/fusion/mptbase.c |   11 +--
 drivers/message/fusion/mptbase.h |2 +-
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index ea30033..9d4c782 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -1408,8 +1408,8 @@ mpt_verify_adapter(int iocid, MPT_ADAPTER **iocpp)
  * in /proc/mpt/summary and /sysfs/class/scsi_host/host/version_product
  *
  **/
-static void
-mpt_get_product_name(u16 vendor, u16 device, u8 revision, char *prod_name)
+static const char*
+mpt_get_product_name(u16 vendor, u16 device, u8 revision)
 {
char *product_str = NULL;
 
@@ -1635,8 +1635,7 @@ mpt_get_product_name(u16 vendor, u16 device, u8 revision, 
char *prod_name)
}
 
  out:
-   if (product_str)
-   sprintf(prod_name, "%s", product_str);
+   return product_str;
 }
 
 /**
@@ -1887,8 +1886,8 @@ mpt_attach(struct pci_dev *pdev, const struct 
pci_device_id *id)
dinitprintk(ioc, printk(MYIOC_s_INFO_FMT "facts @ %p, pfacts[0] @ %p\n",
ioc->name, &ioc->facts, &ioc->pfacts[0]));
 
-   mpt_get_product_name(pdev->vendor, pdev->device, pdev->revision,
-ioc->prod_name);
+   ioc->prod_name = mpt_get_product_name(pdev->vendor, pdev->device,
+ pdev->revision);
 
switch (pdev->device)
{
diff --git a/drivers/message/fusion/mptbase.h b/drivers/message/fusion/mptbase.h
index 76c05bc..78c8104 100644
--- a/drivers/message/fusion/mptbase.h
+++ b/drivers/message/fusion/mptbase.h
@@ -605,7 +605,7 @@ typedef struct _MPT_ADAPTER
int  id;/* Unique adapter id N 
{0,1,2,...} */
int  pci_irq;   /* This irq   */
char name[MPT_NAME_LENGTH]; /* "iocN" */
-   char prod_name[MPT_NAME_LENGTH];/* "LSIFC9x9"   
  */
+   const char   *prod_name;/* "LSIFC9x9" */
 #ifdef CONFIG_FUSION_LOGGING
/* used in mpt_display_event_info */
char evStr[EVENT_DESCR_STR_SZ];
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 08/11] mptfusion: use memdup_user

2014-06-25 Thread Joe Lawrence
Let memdup_user handle the kmalloc, copy_from_user and error checking
kfree code.

Spotted by the following smatch (false positive) warning:

  drivers/message/fusion/mptctl.c:1369 mptctl_getiocinfo() warn:
possible info leak 'karg'

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptctl.c |   18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c
index 8a050e8..b0a892a 100644
--- a/drivers/message/fusion/mptctl.c
+++ b/drivers/message/fusion/mptctl.c
@@ -1261,19 +1261,11 @@ mptctl_getiocinfo (unsigned long arg, unsigned int 
data_size)
else
return -EFAULT;
 
-   karg = kmalloc(data_size, GFP_KERNEL);
-   if (karg == NULL) {
-   printk(KERN_ERR MYNAM "%s::mpt_ioctl_iocinfo() @%d - no memory 
available!\n",
-   __FILE__, __LINE__);
-   return -ENOMEM;
-   }
-
-   if (copy_from_user(karg, uarg, data_size)) {
-   printk(KERN_ERR MYNAM "%s@%d::mptctl_getiocinfo - "
-   "Unable to read in mpt_ioctl_iocinfo struct @ %p\n",
-   __FILE__, __LINE__, uarg);
-   kfree(karg);
-   return -EFAULT;
+   karg = memdup_user(uarg, data_size);
+   if (IS_ERR(karg)) {
+   printk(KERN_ERR MYNAM "%s@%d::mpt_ioctl_iocinfo() - memdup_user 
returned error [%ld]\n",
+   __FILE__, __LINE__, PTR_ERR(karg));
+   return PTR_ERR(karg);
}
 
if (((iocnum = mpt_verify_adapter(karg->hdr.iocnum, &ioc)) < 0) ||
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 07/11] mptfusion: remove redundant kfree checks

2014-06-25 Thread Joe Lawrence
Fixes the following smatch warnings:

  drivers/message/fusion/mptfc.c:529 mptfc_target_destroy() info:
redundant null check on starget->hostdata calling kfree()

  drivers/message/fusion/mptspi.c:465 mptspi_target_destroy() info:
redundant null check on starget->hostdata calling kfree()

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
Reviewed-by: Christoph Hellwig 
---
 drivers/message/fusion/mptfc.c  |3 +--
 drivers/message/fusion/mptspi.c |3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/message/fusion/mptfc.c b/drivers/message/fusion/mptfc.c
index 02a3eef..21ac845 100644
--- a/drivers/message/fusion/mptfc.c
+++ b/drivers/message/fusion/mptfc.c
@@ -525,8 +525,7 @@ mptfc_target_destroy(struct scsi_target *starget)
if (ri) /* better be! */
ri->starget = NULL;
}
-   if (starget->hostdata)
-   kfree(starget->hostdata);
+   kfree(starget->hostdata);
starget->hostdata = NULL;
 }
 
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 7b4db9a..787933d 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -461,8 +461,7 @@ static int mptspi_target_alloc(struct scsi_target *starget)
 static void
 mptspi_target_destroy(struct scsi_target *starget)
 {
-   if (starget->hostdata)
-   kfree(starget->hostdata);
+   kfree(starget->hostdata);
starget->hostdata = NULL;
 }
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 06/11] mptfusion: mark file-private functions as static

2014-06-25 Thread Joe Lawrence
Fixes the following sparse warnings:

  drivers/message/fusion/mptbase.c:7011:1: warning: symbol
'mpt_SoftResetHandler' was not declared. Should it be static?

  drivers/message/fusion/mptsas.c:1578:23: warning: symbol
'mptsas_refreshing_device_handles' was not declared. Should it be
static?

  drivers/message/fusion/mptsas.c:3653:24: warning: symbol
'mptsas_expander_add' was not declared. Should it be static?

  drivers/message/fusion/mptsas.c:5327:1: warning: symbol
'mptsas_shutdown' was not declared. Should it be static?

  drivers/message/fusion/mptspi.c:624:1: warning: symbol
'mptscsih_quiesce_raid' was not declared. Should it be static?

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
Reviewed-by: Christoph Hellwig 
---
 drivers/message/fusion/mptbase.c |2 +-
 drivers/message/fusion/mptsas.c  |6 +++---
 drivers/message/fusion/mptspi.c  |2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index ebc0af7..ea30033 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -7007,7 +7007,7 @@ EXPORT_SYMBOL(mpt_halt_firmware);
  * IOC doesn't reply to any outstanding request. This will transfer IOC
  * to READY state.
  **/
-int
+static int
 mpt_SoftResetHandler(MPT_ADAPTER *ioc, int sleepFlag)
 {
int  rc;
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 711fcb5..54b51a9 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1575,7 +1575,7 @@ mptsas_del_end_device(MPT_ADAPTER *ioc, struct 
mptsas_phyinfo *phy_info)
mptsas_port_delete(ioc, phy_info->port_details);
 }
 
-struct mptsas_phyinfo *
+static struct mptsas_phyinfo *
 mptsas_refreshing_device_handles(MPT_ADAPTER *ioc,
struct mptsas_devinfo *sas_device)
 {
@@ -3648,7 +3648,7 @@ mptsas_send_expander_event(struct fw_event_work *fw_event)
  * @handle:
  *
  */
-struct mptsas_portinfo *
+static struct mptsas_portinfo *
 mptsas_expander_add(MPT_ADAPTER *ioc, u16 handle)
 {
struct mptsas_portinfo buffer, *port_info;
@@ -5321,7 +5321,7 @@ mptsas_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
return error;
 }
 
-void
+static void
 mptsas_shutdown(struct pci_dev *pdev)
 {
MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 49d1133..7b4db9a 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -620,7 +620,7 @@ static void mptspi_read_parameters(struct scsi_target 
*starget)
spi_width(starget) = (nego & MPI_SCSIDEVPAGE0_NP_WIDE) ? 1 : 0;
 }
 
-int
+static int
 mptscsih_quiesce_raid(MPT_SCSI_HOST *hd, int quiesce, u8 channel, u8 id)
 {
MPT_ADAPTER *ioc = hd->ioc;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 05/11] mpt3sas: combine fw_event_work and its event_data

2014-06-25 Thread Joe Lawrence
Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation.  This matches the
convention in the mptfusion driver and simplifies the code.

This avoids the following smatch warning:

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:2519
mpt3sas_send_trigger_data_event() warn: possible memory leak of
'fw_event'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
Cc: Christoph Hellwig 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   56 +++---
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index f3ee3b4..a14be8f 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -190,7 +190,7 @@ struct fw_event_work {
u8  VP_ID;
u8  ignore;
u16 event;
-   void*event_data;
+   charevent_data[0] __aligned(4);
 };
 
 /* raid transport support */
@@ -2492,7 +2492,6 @@ _scsih_fw_event_free(struct MPT3SAS_ADAPTER *ioc, struct 
fw_event_work
 
spin_lock_irqsave(&ioc->fw_event_lock, flags);
list_del(&fw_event->list);
-   kfree(fw_event->event_data);
kfree(fw_event);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
@@ -2513,12 +2512,10 @@ mpt3sas_send_trigger_data_event(struct MPT3SAS_ADAPTER 
*ioc,
 
if (ioc->is_driver_loading)
return;
-   fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+   fw_event = kzalloc(sizeof(*fw_event) + sizeof(*event_data),
+  GFP_ATOMIC);
if (!fw_event)
return;
-   fw_event->event_data = kzalloc(sizeof(*event_data), GFP_ATOMIC);
-   if (!fw_event->event_data)
-   return;
fw_event->event = MPT3SAS_PROCESS_TRIGGER_DIAG;
fw_event->ioc = ioc;
memcpy(fw_event->event_data, event_data, sizeof(*event_data));
@@ -3213,7 +3210,8 @@ _scsih_check_topo_delete_events(struct MPT3SAS_ADAPTER 
*ioc,
if (fw_event->event != MPI2_EVENT_SAS_TOPOLOGY_CHANGE_LIST ||
fw_event->ignore)
continue;
-   local_event_data = fw_event->event_data;
+   local_event_data = (Mpi2EventDataSasTopologyChangeList_t *)
+  fw_event->event_data;
if (local_event_data->ExpStatus ==
MPI2_EVENT_SAS_TOPO_ES_ADDED ||
local_event_data->ExpStatus ==
@@ -5045,7 +5043,9 @@ _scsih_sas_topology_change_event(struct MPT3SAS_ADAPTER 
*ioc,
u64 sas_address;
unsigned long flags;
u8 link_rate, prev_link_rate;
-   Mpi2EventDataSasTopologyChangeList_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasTopologyChangeList_t *event_data =
+   (Mpi2EventDataSasTopologyChangeList_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5243,7 +5243,8 @@ _scsih_sas_device_status_change_event(struct 
MPT3SAS_ADAPTER *ioc,
u64 sas_address;
unsigned long flags;
Mpi2EventDataSasDeviceStatusChange_t *event_data =
-   fw_event->event_data;
+   (Mpi2EventDataSasDeviceStatusChange_t *)
+   fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5339,6 +5340,7 @@ _scsih_sas_enclosure_dev_status_change_event(struct 
MPT3SAS_ADAPTER *ioc,
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
_scsih_sas_enclosure_dev_status_change_event_debug(ioc,
+(Mpi2EventDataSasEnclDevStatusChange_t *)
 fw_event->event_data);
 #endif
 }
@@ -5363,7 +5365,9 @@ _scsih_sas_broadcast_primitive_event(struct 
MPT3SAS_ADAPTER *ioc,
u32 termination_count;
u32 query_count;
Mpi2SCSITaskManagementReply_t *mpi_reply;
-   Mpi2EventDataSasBroadcastPrimitive_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasBroadcastPrimitive_t *event_data =
+   (Mpi2EventDataSasBroadcastPrimitive_t *)
+   fw_event->event_data;
u16 ioc_status;
unsigned long flags;
int r;
@@ -5515,7 +5519,8 @@ static void
 _scsih_sas_discovery_event(struct MPT3SAS_ADAPTER *ioc,
struct fw_event_work *fw_event)
 {
-   Mpi2EventDataSasDiscovery_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasDiscovery_t *event_data =
+   (Mpi2EventDataSasDiscovery_t *) fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_

[PATCH v2 03/11] mpt2sas: annotate ioc->reply_post_host_index as __iomem

2014-06-25 Thread Joe Lawrence
The MPT2SAS_ADAPTER reply_post_host_index[] holds calculated addresses
in memory mapped register space.  Add an "__iomem" annotation to silence
the following sparse warnings:

  drivers/scsi/mpt2sas/mpt2sas_base.c:1006:43:
warning: incorrect type in argument 2 (different address spaces)
   expected void volatile [noderef] *addr
   got unsigned long long [usertype] *

  drivers/scsi/mpt2sas/mpt2sas_base.c:4299:22:
warning: cast removes address space of expression
  drivers/scsi/mpt2sas/mpt2sas_base.c:4303:27:
warning: cast removes address space of expression

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Reviewed-by: Christoph Hellwig 
Acked-by: Sreekanth Reddy 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c |9 +
 drivers/scsi/mpt2sas/mpt2sas_base.h |2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index 8b88118..a31397c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -4295,12 +4295,13 @@ mpt2sas_base_attach(struct MPT2SAS_ADAPTER *ioc)
goto out_free_resources;
 
if (ioc->is_warpdrive) {
-   ioc->reply_post_host_index[0] =
-   (resource_size_t *)&ioc->chip->ReplyPostHostIndex;
+   ioc->reply_post_host_index[0] = (resource_size_t __iomem *)
+   &ioc->chip->ReplyPostHostIndex;
 
for (i = 1; i < ioc->cpu_msix_table_sz; i++)
-   ioc->reply_post_host_index[i] = (resource_size_t *)
-   ((u8 *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
+   ioc->reply_post_host_index[i] =
+   (resource_size_t __iomem *)
+   ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
* 4)));
}
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h 
b/drivers/scsi/mpt2sas/mpt2sas_base.h
index fd3b998..0ac5815 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -837,7 +837,7 @@ struct MPT2SAS_ADAPTER {
u8  msix_enable;
u16 msix_vector_count;
u8  *cpu_msix_table;
-   resource_size_t **reply_post_host_index;
+   resource_size_t __iomem **reply_post_host_index;
u16 cpu_msix_table_sz;
u32 ioc_reset_count;
MPT2SAS_FLUSH_RUNNING_CMDS schedule_dead_ioc_flush_running_cmds;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 04/11] mpt3sas: correct scsi_{target,device} hostdata allocation

2014-06-25 Thread Joe Lawrence
In _scsih_{slave,target}_alloc, an incorrect structure type is passed
to sizeof() when allocating storage for hostdata.  Luckily larger
structure types were used, so at least the wrong sizes were safe:

  struct scsi_device (1784 bytes) > struct MPT3SAS_DEVICE (24 bytes)
  struct scsi_target (760 bytes)  > struct MPT3SAS_TARGET (32 bytes)

This fixes the following smatch warnings:

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:1166 _scsih_target_alloc()
warn: struct type mismatch 'MPT3SAS_TARGET vs scsi_target'

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:1280 _scsih_slave_alloc()
warn: struct type mismatch 'MPT3SAS_DEVICE vs scsi_device'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Reviewed-by: Christoph Hellwig 
Acked-by: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 18e713d..f3ee3b4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -1163,7 +1163,8 @@ _scsih_target_alloc(struct scsi_target *starget)
unsigned long flags;
struct sas_rphy *rphy;
 
-   sas_target_priv_data = kzalloc(sizeof(struct scsi_target), GFP_KERNEL);
+   sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data),
+  GFP_KERNEL);
if (!sas_target_priv_data)
return -ENOMEM;
 
@@ -1277,7 +1278,8 @@ _scsih_slave_alloc(struct scsi_device *sdev)
struct _sas_device *sas_device;
unsigned long flags;
 
-   sas_device_priv_data = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
+   sas_device_priv_data = kzalloc(sizeof(*sas_device_priv_data),
+  GFP_KERNEL);
if (!sas_device_priv_data)
return -ENOMEM;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 01/11] mpt2sas: correct scsi_{target,device} hostdata allocation

2014-06-25 Thread Joe Lawrence
In _scsih_{slave,target}_alloc, an incorrect structure type is passed
to sizeof() when allocating storage for hostdata.  Luckily larger
structure types were used, so at least the wrong sizes were safe:

  struct scsi_device (1784 bytes) > struct MPT2SAS_DEVICE (24 bytes)
  struct scsi_target (760 bytes)  > struct MPT2SAS_TARGET (40 bytes)

This fixes the following smatch warnings:

  drivers/scsi/mpt2sas/mpt2sas_scsih.c:1295 _scsih_target_alloc()
warn: struct type mismatch 'MPT2SAS_TARGET vs scsi_target'

  drivers/scsi/mpt2sas/mpt2sas_scsih.c:1409 _scsih_slave_alloc()
warn: struct type mismatch 'MPT2SAS_DEVICE vs scsi_device'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Reviewed-by: Christoph Hellwig 
Acked-by: Sreekanth Reddy 
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c 
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 5055f92..13e49c3 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -1292,7 +1292,8 @@ _scsih_target_alloc(struct scsi_target *starget)
unsigned long flags;
struct sas_rphy *rphy;
 
-   sas_target_priv_data = kzalloc(sizeof(struct scsi_target), GFP_KERNEL);
+   sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data),
+  GFP_KERNEL);
if (!sas_target_priv_data)
return -ENOMEM;
 
@@ -1406,7 +1407,8 @@ _scsih_slave_alloc(struct scsi_device *sdev)
struct _sas_device *sas_device;
unsigned long flags;
 
-   sas_device_priv_data = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
+   sas_device_priv_data = kzalloc(sizeof(*sas_device_priv_data),
+  GFP_KERNEL);
if (!sas_device_priv_data)
return -ENOMEM;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 00/11] mptfusion/mpt2/mpt3 static checker fixups

2014-06-25 Thread Joe Lawrence
v2:

Combined the mptfusion and mpt{2,3}sas static checker patches,
re-ordering them by driver.  Updated Reviewed-by and Acked-by tags as
well as Sreekanth's email address.  Based off v3.16-rc2, compile tested.

patches dropped:

* mpt3sas: fix possible memory leak in mpt3sas_send_trigger_data_event

  Christoph suggested combining this into a single allocation, so this
  patch was transformed into two new patches (removing the Reviewed-by
  and Acked-by tags):

  mpt2sas-combine-fw_event_work-and-its-event_data.patch
  mpt3sas-combine-fw_event_work-and-its-event_data.patch

* mptfusion: initChainBuffers should return errno

  Christoph noted that the whole callchain uses -1 instead of errno.
  Let it be.


patches modified:

* mptfusion: zero kernel-space source of copy_to_user

  A static checker false-positive brought me here, Christoph suggested
  using memdup_user.

* mptfusion: combine fw_event_work and its event_data

  Remove the unnecessary sz variables.

* mptfusion: tweak null pointer checks

  Moved commentry from myself (JL) and Christoph (HCH) into the commit
  message.


In the mpt{2,3}sas-combine-fw_event_work-and-its-event_data patches, I
was wondering if the alignment attribute should be:

  __attribute__ ((aligned (sizeof(unsigned long

instead of:

  char event_data[0] __aligned(4)

Regards,

Joe Lawrence (11):
  mpt2sas: correct scsi_{target,device} hostdata allocation
  mpt2sas: combine fw_event_work and its event_data
  mpt2sas: annotate ioc->reply_post_host_index as __iomem
  mpt3sas: correct scsi_{target,device} hostdata allocation
  mpt3sas: combine fw_event_work and its event_data
  mptfusion: mark file-private functions as static
  mptfusion: remove redundant kfree checks
  mptfusion: use memdup_user
  mptfusion: make adapter prod_name[] a pointer
  mptfusion: combine fw_event_work and its event_data
  mptfusion: tweak null pointer checks

 drivers/message/fusion/mptbase.c |   23 +--
 drivers/message/fusion/mptbase.h |2 +-
 drivers/message/fusion/mptctl.c  |   18 +++--
 drivers/message/fusion/mptfc.c   |3 +-
 drivers/message/fusion/mptsas.c  |   74 --
 drivers/message/fusion/mptsas.h  |2 +-
 drivers/message/fusion/mptscsih.c|   19 -
 drivers/message/fusion/mptspi.c  |5 +--
 drivers/scsi/mpt2sas/mpt2sas_base.c  |9 +++--
 drivers/scsi/mpt2sas/mpt2sas_base.h  |2 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   58 +++---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   62 +++-
 12 files changed, 137 insertions(+), 140 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] mptfusion static checker fixups

2014-06-25 Thread Joe Lawrence
Hi Christoph, Sreekanth,

Would it be easier to combine this set with the mpt2/mpt3 changes and
repost a V2?  There were a few changes Christoph request in each set,
so I can post those up later today.

-- Joe

On Wed, 25 Jun 2014 13:01:50 +0200
Christoph Hellwig  wrote:

> Can I get another review for this series?
> 
> On Wed, Jun 04, 2014 at 12:49:42PM -0400, Joe Lawrence wrote:
> > While reviewing the mpt2/mpt3 static checker fixup patchset, Christoph
> > inquired about mptfusion.  None of the sparse / smatch warnings from the
> > earlier patchset directly apply to fusion, but there were a few easy to
> > fix warnings (compile tested only).
> > 
> > The patchset is ordered from the smallest/easiest change up to the last
> > three, which are bit more involved and should be reviewed by LSI,
> > especially the last one "mptfusion: tweak null pointer checks".  See the
> > commentary in those patches after the signed-off-by line.
> > 
> > Cc: Christoph Hellwig 
> > Cc: Dan Carpenter 
> > Cc: Sreekanth Reddy 
> > 
> > Joe Lawrence (7):
> >   mptfusion: mark file-private functions as static
> >   mptfusion: remove redundant kfree checks
> >   mptfusion: initChainBuffers should return errno
> >   mptfusion: zero kernel-space source of copy_to_user
> >   mptfusion: make adapter prod_name[] a pointer
> >   mptfusion: combine fw_event_work and its event_data
> >   mptfusion: tweak null pointer checks
> > 
> >  drivers/message/fusion/mptbase.c  |   29 +--
> >  drivers/message/fusion/mptbase.h  |2 +-
> >  drivers/message/fusion/mptctl.c   |2 +-
> >  drivers/message/fusion/mptfc.c|3 +-
> >  drivers/message/fusion/mptsas.c   |   57 
> > +++--
> >  drivers/message/fusion/mptsas.h   |2 +-
> >  drivers/message/fusion/mptscsih.c |7 +++--
> >  drivers/message/fusion/mptspi.c   |5 ++--
> >  8 files changed, 52 insertions(+), 55 deletions(-)
> > 
> > -- 
> > 1.7.10.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> ---end quoted text---
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/6] qla2xxx: Prevent probe and board_disable race

2014-06-18 Thread Joe Lawrence
The PCI register read checking introduced in commit f3ddac19 "qla2xxx:
Disable adapter when we encounter a PCI disconnect" is active during
driver probe.  Hold off scheduling any board removal until the driver
probe has completed.  This ensures that the the board_disable work
structure is initialized and more importantly, avoids racing
qla2x00_probe_one.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_def.h |1 +
 drivers/scsi/qla2xxx/qla_isr.c |3 ++-
 drivers/scsi/qla2xxx/qla_os.c  |2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 7c441c9..253a89d 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3405,6 +3405,7 @@ typedef struct scsi_qla_host {
unsigned long   pci_flags;
 #define PFLG_DISCONNECTED  0   /* PCI device removed */
 #define PFLG_DRIVER_REMOVING   1   /* PCI driver .remove */
+#define PFLG_DRIVER_PROBING2   /* PCI driver .probe */
 
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index ee5eef4..4ea8252 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -118,7 +118,8 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
/* Check for PCI disconnection */
if (reg == 0x) {
if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
-   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags)) {
+   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags) &&
+   !test_bit(PFLG_DRIVER_PROBING, &vha->pci_flags)) {
/*
 * Schedule this (only once) on the default system
 * workqueue so that all the adapter workqueues and the
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 51cba37..0b39425 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2629,6 +2629,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct 
pci_device_id *id)
}
 
pci_set_drvdata(pdev, base_vha);
+   set_bit(PFLG_DRIVER_PROBING, &base_vha->pci_flags);
 
host = base_vha->host;
base_vha->req = req;
@@ -2920,6 +2921,7 @@ skip_dpc:
 
qlt_add_target(ha, base_vha);
 
+   clear_bit(PFLG_DRIVER_PROBING, &base_vha->pci_flags);
return 0;
 
 probe_init_failed:
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6] qla2xxx: Prevent removal and board_disable race

2014-06-18 Thread Joe Lawrence
Introduce mutual exclusion between the qla2xxx_remove_one PCI driver
callback and qla2x00_disable_board_on_pci_error, which is scheduled as
board_disable work by qla2x00_check_reg{32,16}_for_disconnect:

* Leave the driver-specific data attached to the underlying PCI device
intact in qla2x00_disable_board_on_pci_error, so that qla2x00_remove_one
has enough breadcrumbs to determine that any board_disable work has been
completed.

* In qla2xxx_remove_one, set a bit to prevent any subsequent
board_disable work from scheduling, then cancel and wait until pending
work has completed.

* Reuse the PCI device enable count check in qla2x00_remove_one to
determine if board_disable has occured.  The original purpose of this
check was unnecessary since the driver remove function wasn't called
when the probe fails.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_def.h |1 +
 drivers/scsi/qla2xxx/qla_isr.c |3 ++-
 drivers/scsi/qla2xxx/qla_os.c  |   31 +++
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 1267b11..7c441c9 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3404,6 +3404,7 @@ typedef struct scsi_qla_host {
 
unsigned long   pci_flags;
 #define PFLG_DISCONNECTED  0   /* PCI device removed */
+#define PFLG_DRIVER_REMOVING   1   /* PCI driver .remove */
 
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 2741ad8..ee5eef4 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -117,7 +117,8 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
-   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags)) {
+   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) &&
+   !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags)) {
/*
 * Schedule this (only once) on the default system
 * workqueue so that all the adapter workqueues and the
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 39c9953..51cba37 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3123,15 +3123,25 @@ qla2x00_remove_one(struct pci_dev *pdev)
scsi_qla_host_t *base_vha;
struct qla_hw_data  *ha;
 
+   base_vha = pci_get_drvdata(pdev);
+   ha = base_vha->hw;
+
+   /* Indicate device removal to prevent future board_disable and wait
+* until any pending board_disable has completed. */
+   set_bit(PFLG_DRIVER_REMOVING, &base_vha->pci_flags);
+   cancel_work_sync(&ha->board_disable);
+
/*
-* If the PCI device is disabled that means that probe failed and any
-* resources should be have cleaned up on probe exit.
+* If the PCI device is disabled then there was a PCI-disconnect and
+* qla2x00_disable_board_on_pci_error has taken care of most of the
+* resources.
 */
-   if (!atomic_read(&pdev->enable_cnt))
+   if (!atomic_read(&pdev->enable_cnt)) {
+   scsi_host_put(base_vha->host);
+   kfree(ha);
+   pci_set_drvdata(pdev, NULL);
return;
-
-   base_vha = pci_get_drvdata(pdev);
-   ha = base_vha->hw;
+   }
 
qla2x00_wait_for_hba_ready(base_vha);
 
@@ -4791,18 +4801,15 @@ qla2x00_disable_board_on_pci_error(struct work_struct 
*work)
qla82xx_md_free(base_vha);
qla2x00_free_queues(ha);
 
-   scsi_host_put(base_vha->host);
-
qla2x00_unmap_iobases(ha);
 
pci_release_selected_regions(ha->pdev, ha->bars);
-   kfree(ha);
-   ha = NULL;
-
pci_disable_pcie_error_reporting(pdev);
pci_disable_device(pdev);
-   pci_set_drvdata(pdev, NULL);
 
+   /*
+* Let qla2x00_remove_one cleanup qla_hw_data on device removal.
+*/
 }
 
 /**
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/6] qla2xxx: Collect PCI register checks and board_disable scheduling

2014-06-18 Thread Joe Lawrence
Add an uint16_t variant of qla2x00_check_reg_for_disconnect and use
these routines to check and schedule a PCI-disconnected board from a
centralized place.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_gbl.h |3 ++-
 drivers/scsi/qla2xxx/qla_isr.c |   28 +---
 drivers/scsi/qla2xxx/qla_mr.c  |2 +-
 drivers/scsi/qla2xxx/qla_nx.c  |6 +++---
 drivers/scsi/qla2xxx/qla_os.c  |8 +---
 5 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index d48dea8..a60c1b8 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -475,7 +475,8 @@ extern uint8_t *qla25xx_read_nvram_data(scsi_qla_host_t *, 
uint8_t *, uint32_t,
 extern int qla25xx_write_nvram_data(scsi_qla_host_t *, uint8_t *, uint32_t,
uint32_t);
 extern int qla2x00_is_a_vp_did(scsi_qla_host_t *, uint32_t);
-bool qla2x00_check_reg_for_disconnect(scsi_qla_host_t *, uint32_t);
+bool qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *, uint32_t);
+bool qla2x00_check_reg16_for_disconnect(scsi_qla_host_t *, uint16_t);
 
 extern int qla2x00_beacon_on(struct scsi_qla_host *);
 extern int qla2x00_beacon_off(struct scsi_qla_host *);
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index a56825c..e1b16ad 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -56,16 +56,8 @@ qla2100_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
hccr = RD_REG_WORD(®->hccr);
-   /* Check for PCI disconnection */
-   if (hccr == 0x) {
-   /*
-* Schedule this on the default system workqueue so that
-* all the adapter workqueues and the DPC thread can be
-* shutdown cleanly.
-*/
-   schedule_work(&ha->board_disable);
+   if (qla2x00_check_reg16_for_disconnect(vha, hccr))
break;
-   }
if (hccr & HCCR_RISC_PAUSE) {
if (pci_channel_offline(ha->pdev))
break;
@@ -121,7 +113,7 @@ qla2100_intr_handler(int irq, void *dev_id)
 }
 
 bool
-qla2x00_check_reg_for_disconnect(scsi_qla_host_t *vha, uint32_t reg)
+qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
@@ -136,6 +128,12 @@ qla2x00_check_reg_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
return false;
 }
 
+bool
+qla2x00_check_reg16_for_disconnect(scsi_qla_host_t *vha, uint16_t reg)
+{
+   return qla2x00_check_reg32_for_disconnect(vha, 0x | reg);
+}
+
 /**
  * qla2300_intr_handler() - Process interrupts for the ISP23xx and ISP63xx.
  * @irq:
@@ -174,7 +172,7 @@ qla2300_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
stat = RD_REG_DWORD(®->u.isp2300.host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
break;
if (stat & HSR_RISC_PAUSED) {
if (unlikely(pci_channel_offline(ha->pdev)))
@@ -2633,7 +2631,7 @@ qla24xx_intr_handler(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
for (iter = 50; iter--; ) {
stat = RD_REG_DWORD(®->host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
break;
if (stat & HSRX_RISC_PAUSED) {
if (unlikely(pci_channel_offline(ha->pdev)))
@@ -2723,7 +2721,7 @@ qla24xx_msix_rsp_q(int irq, void *dev_id)
 * we process the response queue.
 */
stat = RD_REG_DWORD(®->host_status);
-   if (qla2x00_check_reg_for_disconnect(vha, stat))
+   if (qla2x00_check_reg32_for_disconnect(vha, stat))
goto out;
qla24xx_process_response_queue(vha, rsp);
if (!ha->flags.disable_msix_handshake) {
@@ -2763,7 +2761,7 @@ qla25xx_msix_rsp_q(int irq, void *dev_id)
hccr = RD_REG_DWORD_RELAXED(®->hccr);
spin_unlock_irqrestore(&ha->hardware_lock, flags);
}
-   if (qla2x00_check_reg_for_disconnect(vha, hccr))
+   if (qla2x00_check_reg32_for_disconnect(vha, hccr))
goto out;
queue_work_on((int) (rsp->id - 1), ha->wq, &rsp->q_work);
 
@@ -2798,7 +2796,7 @@ qla24xx_msix_default(int irq, void *dev_id)
vha = pci_get_drvdata(ha->pdev);
do {
stat = RD_REG_DWORD(®

[PATCH 2/6] qla2xxx: Use qla2x00_clear_drv_active on probe failure

2014-06-18 Thread Joe Lawrence
Take advantage of commit fe1b806f "qla2xxx: Refactor shutdown code so
some functionality can be reused" to remove an inlined copy of
qla2x00_clear_drv_active in the driver's probe hardware error path.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_os.c |   13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 6a512b7..31913bb 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -238,6 +238,7 @@ static int qla2xxx_eh_host_reset(struct scsi_cmnd *);
 
 static int qla2x00_change_queue_depth(struct scsi_device *, int, int);
 static int qla2x00_change_queue_type(struct scsi_device *, int);
+static void qla2x00_clear_drv_active(struct qla_hw_data *);
 static void qla2x00_free_device(scsi_qla_host_t *);
 
 struct scsi_host_template qla2xxx_driver_template = {
@@ -2946,16 +2947,8 @@ probe_failed:
scsi_host_put(base_vha->host);
 
 probe_hw_failed:
-   if (IS_QLA82XX(ha)) {
-   qla82xx_idc_lock(ha);
-   qla82xx_clear_drv_active(ha);
-   qla82xx_idc_unlock(ha);
-   }
-   if (IS_QLA8044(ha)) {
-   qla8044_idc_lock(ha);
-   qla8044_clear_drv_active(ha);
-   qla8044_idc_unlock(ha);
-   }
+   qla2x00_clear_drv_active(ha);
+
 iospace_config_failed:
if (IS_P3P_TYPE(ha)) {
if (!ha->nx_pcibase)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/6] qla2xxx device removal fixups

2014-06-18 Thread Joe Lawrence
Hi Chad, Giri, et al.

Stratus has been testing the upstream qla2xxx driver against surprise
device removal and has found a few minor issues along the way.  With
this patchset, results have been good.  Although the following changes
may be most interesting on a Stratus platform, they should be applicable
to general hotplug scenarios on other hardware.

The first two patches are independent from the others and can be
reviewed as such.  One fixes up a use-after-free and the other
consolidates some code.

The final four patches are more invasive and modify the scsi_qla_host
structure to avoid device removal race conditions.  These changes were
written to demonstrate the problem at hand and minimally fix them in
order to continue testing.  (For example, adding completely a new
pci_flags member to scsi_qla_host is probably overkill.) 

We would welcome any feedback or questions about our testing.

Regards,

-- Joe


Joe Lawrence (6):

  qla2xxx: Fix shost use-after-free on device removal

scsi_qla_host *base_vha is allocated as part of the Scsi_Host
private hostdata[] by qla2x00_create_host.  When the last reference
on the host is put by qla2x00_remove_one, it's released along with
any hostdata[], rendering base_vha unsafe to dereference.  
  
To safely complete the adapter cleanup in qla2x00_remove_one,
use the scsi_qla_host_t *qla_hw_data pointer that is already in
hand.
  
To reproduce, set kernel boot option slub_debug=FZPU and remove
an adapter instance.


  qla2xxx: Use qla2x00_clear_drv_active on probe failure

Clean up some duplicate code along the way.


  qla2xxx: Collect PCI register checks and board_disable scheduling

PCI register read checks are performed throughout the driver to
detect disconnected hardware.  There is currently a function that
verifies 32-bit values and schedules board_removal if needed. Other
16-bit registers are checked and board_removal scheduling scattered
around the driver.

This change pulls all of these together such that a new 16-bit
register check function wrappers the existing 32-bit version and
centralizes the scheduling of board_disable work to a single
invocation.

The following patches depend upon this change.


  qla2xxx: Schedule board_disable only once

In automated hot-plug device removal testing, occasionally
qla2x00_disable_board_on_pci_error would run twice.  I believe this
occurred by a second qla2x00_check_reg_for_disconnect scheduling
board_disable while the first instance was still running.

The easiest solution seemed to be adding a per-adapter flag that
could be test_and_set before scheduling board_disable. 


  qla2xxx: Prevent removal and board_disable race

This race was discovered through many iterations of automated PCI
hotplug.  The automated test inserts faults on the Stratus platform
to simulate device removal.  After removal, it re-adds the devices,
simulating PCI hotplug insertion.  Rinse, repeat.

The race occurs when a Stratus proprietary hotplug driver detects
that PCI devices have gone missing and calls into the kernel PCI
subsystem to remove them.  At nearly the same time, the qla2xxx
driver figures out the same issue and schedules a board_disable.  It
takes a few hundred device removals to hit, but it seemed that the
problem was that qla2x00_disable_board_on_pci_error started, then
qla2xxx_remove_one was invoked, so one of the two started touching
already freed resources.

Although the bug manifested on a Stratus platform running modified
drivers, the window for qla2xxx_remove_one to race
qla2x00_disable_board_on_pci_error is still present if running stock
kernel/drivers on commodity hardware.  It may be difficult to hit,
but one could theoretically invoke PCI device removal via sysfs (or
simply unload the qla2xxx driver) and for board_disable to run at
the same time.

The fix leaves the Scsi_Host intact during board_disable so that
qla2x00_remove_one can safely stop the main timer and
cancel_work_sync on any outstanding board_disable work.  A
PFLG_DRIVER_REMOVING bit is also set to prevent any other
board_disable instances from scheduling.  Once the asynchronous
players are out of the way, qla2x00_remove_one can move forward
cleaning up whatever remaining resources are still attached.

The PCI device enable count check in qla2x00_remove_one was
re-purposed to determine if board_disable had run.  Its original
intention to detect qla2x00_probe_one failure was invalid.  As long
as the driver .probe routine returns an -ERRNO, then its counterpart
.remove routine is *not* called.


  qla2xxx: Prevent probe and board_disable race

An automated Stratus device removal test hotplug removes devices
shortly during/after their initialization.  The test inserts
devices, then proceeds to remove th

[PATCH 4/6] qla2xxx: Schedule board_disable only once

2014-06-18 Thread Joe Lawrence
There are various callers of qla2x00_check_reg{32,16}_for_disconnect
that may schedule board removal on PCI-disconnect.  Test-and-set a
dedicated flag before scheduling board_disable so it is invoked only
once.

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_def.h |3 +++
 drivers/scsi/qla2xxx/qla_isr.c |   14 --
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 1fa0104..1267b11 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3402,6 +3402,9 @@ typedef struct scsi_qla_host {
 #define FX00_CRITEMP_RECOVERY  25
 #define FX00_HOST_INFO_RESEND  26
 
+   unsigned long   pci_flags;
+#define PFLG_DISCONNECTED  0   /* PCI device removed */
+
uint32_tdevice_flags;
 #define SWITCH_FOUND   BIT_0
 #define DFLG_NO_CABLE  BIT_1
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index e1b16ad..2741ad8 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -117,12 +117,14 @@ qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, 
uint32_t reg)
 {
/* Check for PCI disconnection */
if (reg == 0x) {
-   /*
-* Schedule this on the default system workqueue so that all the
-* adapter workqueues and the DPC thread can be shutdown
-* cleanly.
-*/
-   schedule_work(&vha->hw->board_disable);
+   if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags)) {
+   /*
+* Schedule this (only once) on the default system
+* workqueue so that all the adapter workqueues and the
+* DPC thread can be shutdown cleanly.
+*/
+   schedule_work(&vha->hw->board_disable);
+   }
return true;
} else
return false;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] qla2xxx: Fix shost use-after-free on device removal

2014-06-18 Thread Joe Lawrence
Once calling scsi_host_put, be careful to not access qla_hw_data through
the Scsi_Host private data (ie, scsi_qla_host base_vha).

Signed-off-by: Joe Lawrence 
---
 drivers/scsi/qla2xxx/qla_os.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 5a430c7..6a512b7 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3111,10 +3111,8 @@ qla2x00_unmap_iobases(struct qla_hw_data *ha)
 }
 
 static void
-qla2x00_clear_drv_active(scsi_qla_host_t *vha)
+qla2x00_clear_drv_active(struct qla_hw_data *ha)
 {
-   struct qla_hw_data *ha = vha->hw;
-
if (IS_QLA8044(ha)) {
qla8044_idc_lock(ha);
qla8044_clear_drv_active(ha);
@@ -3185,7 +3183,7 @@ qla2x00_remove_one(struct pci_dev *pdev)
 
scsi_host_put(base_vha->host);
 
-   qla2x00_clear_drv_active(base_vha);
+   qla2x00_clear_drv_active(ha);
 
qla2x00_unmap_iobases(ha);
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] mptfusion: combine fw_event_work and its event_data

2014-06-06 Thread Joe Lawrence
On Thu, 5 Jun 2014 02:34:29 -0700
Christoph Hellwig  wrote:

> > diff --git a/drivers/message/fusion/mptsas.h 
> > b/drivers/message/fusion/mptsas.h
> > index 57e86ab..c396483 100644
> > --- a/drivers/message/fusion/mptsas.h
> > +++ b/drivers/message/fusion/mptsas.h
> > @@ -110,7 +110,7 @@ struct fw_event_work {
> > MPT_ADAPTER *ioc;
> > u32 event;
> > u8  retries;
> > -   u8  __attribute__((aligned(4))) event_data[1];
> > +   charevent_data[0] __aligned(4);
> >  };

Is this alignment necessary and if so, should it be more like:
  __attribute__ ((aligned (sizeof(unsigned long

> >  
> > -   sz = offsetof(struct fw_event_work, event_data) +
> > -   sizeof(MpiEventDataSasDeviceStatusChange_t);
> > +   sz = sizeof(*fw_event) +
> > +   sizeof(MpiEventDataSasDeviceStatusChange_t);
> > fw_event = kzalloc(sz, GFP_ATOMIC);
> 
> Seems like there is no point in keeping the sz variable here and at
> the other occurances.  Not that it really matters, but if we make a pass
> over this code we might as well fix that up, too.
 
I'll take a look and post up a v2 next week, adding these to a v2
mpt2/mpt3 set.

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] mptfusion: zero kernel-space source of copy_to_user

2014-06-04 Thread Joe Lawrence
On Wed, 4 Jun 2014 12:49:46 -0400
Joe Lawrence  wrote:

> Fixes the following smatch warning:
> 
>   drivers/message/fusion/mptctl.c:1369 mptctl_getiocinfo() warn:
> possible info leak 'karg'
> 
> Signed-off-by: Joe Lawrence 
> Cc: Christoph Hellwig 
> Cc: Dan Carpenter 
> Cc: Sreekanth Reddy 
> ---
>  drivers/message/fusion/mptctl.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c
> index dcc8385..e6d8935 100644
> --- a/drivers/message/fusion/mptctl.c
> +++ b/drivers/message/fusion/mptctl.c
> @@ -1261,7 +1261,7 @@ mptctl_getiocinfo (unsigned long arg, unsigned int 
> data_size)
>   else
>   return -EFAULT;
>  
> - karg = kmalloc(data_size, GFP_KERNEL);
> + karg = kzalloc(data_size, GFP_KERNEL);
>   if (karg == NULL) {
>   printk(KERN_ERR MYNAM "%s::mpt_ioctl_iocinfo() @%d - no memory 
> available!\n",
>   __FILE__, __LINE__);

Hi Dan,

kzalloc silenced that smatch warning, but the code looks like:

  (calculate data_size)
  ...
  karg = kmalloc(data_size, GFP_KERNEL);
  ...
  if (copy_from_user(karg, uarg, data_size)) {
  ...
  if (copy_to_user((char __user *)arg, karg, data_size)) {

where 'data_size' once calculated, is unchanged.  Since the size
allocated is the same copied from the user and the same copied back out
to the user, would this really be considered an info leak?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] mptfusion: tweak null pointer checks

2014-06-04 Thread Joe Lawrence
Fixes the following smatch warnings:

  drivers/message/fusion/mptbase.c:652 mptbase_reply() warn: variable
dereferenced before check 'reply' (see line 639)

  drivers/message/fusion/mptsas.c:1255 mptsas_taskmgmt_complete() error:
we previously assumed 'pScsiTmReply' could be null (see line 1227)


  drivers/message/fusion/mptsas.c:3888 mptsas_not_responding_devices()
error: we previously assumed 'port_info->phy_info' could be null
(see line 3875)

  drivers/message/fusion/mptscsih.c:1284 mptscsih_info() error:
we previously assumed 'h' could be null (see line 1274)

  drivers/message/fusion/mptscsih.c:1388 mptscsih_qcmd() error: we
previously assumed 'vdevice' could be null (see line 1373)

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---

JL: Comments on the above warnings, in order:

No-brainer, the enclosing switch statement dereferences 'reply',
so we can't get here unless 'reply' is valid.

"Retry target reset" needs to know the target ID and channel, so
enclose that entire block inside a if (pScsiTmReply) block.

The code before the loop checks for 'port_info->phy_info', but not
many other places in the driver bother.  Not sure what to do here.

'hostdata' is checked and then immediately dereferenced after
that block.  How about a default return string of "N/A" to indicate
one isn't available? 

Not sure what to do here.  'vdevice' is needed to "set up the
message frame" target ID and channel, so it should probably return
an error in in this case.

 drivers/message/fusion/mptbase.c  |   10 --
 drivers/message/fusion/mptsas.c   |   27 ++-
 drivers/message/fusion/mptscsih.c |7 ---
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index 03263d6..c68a951 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -649,12 +649,10 @@ mptbase_reply(MPT_ADAPTER *ioc, MPT_FRAME_HDR *req, 
MPT_FRAME_HDR *reply)
case MPI_FUNCTION_CONFIG:
case MPI_FUNCTION_SAS_IO_UNIT_CONTROL:
ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_COMMAND_GOOD;
-   if (reply) {
-   ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_RF_VALID;
-   memcpy(ioc->mptbase_cmds.reply, reply,
-   min(MPT_DEFAULT_FRAME_SIZE,
-   4 * reply->u.reply.MsgLength));
-   }
+   ioc->mptbase_cmds.status |= MPT_MGMT_STATUS_RF_VALID;
+   memcpy(ioc->mptbase_cmds.reply, reply,
+   min(MPT_DEFAULT_FRAME_SIZE,
+   4 * reply->u.reply.MsgLength));
if (ioc->mptbase_cmds.status & MPT_MGMT_STATUS_PENDING) {
ioc->mptbase_cmds.status &= ~MPT_MGMT_STATUS_PENDING;
complete(&ioc->mptbase_cmds.done);
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 35c87b1..72e464fc 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1252,17 +1252,19 @@ mptsas_taskmgmt_complete(MPT_ADAPTER *ioc, 
MPT_FRAME_HDR *mf, MPT_FRAME_HDR *mr)
ioc->name, jiffies_to_msecs(jiffies -
target_reset_list->time_count)/1000));
 
-   id = pScsiTmReply->TargetID;
-   channel = pScsiTmReply->Bus;
-   target_reset_list->time_count = jiffies;
+   if (pScsiTmReply) {
+   id = pScsiTmReply->TargetID;
+   channel = pScsiTmReply->Bus;
+   target_reset_list->time_count = jiffies;
 
-   /*
-* retry target reset
-*/
-   if (!target_reset_list->target_reset_issued) {
-   if (mptsas_target_reset(ioc, channel, id))
-   target_reset_list->target_reset_issued = 1;
-   return 1;
+   /*
+* retry target reset
+*/
+   if (!target_reset_list->target_reset_issued) {
+   if (mptsas_target_reset(ioc, channel, id))
+   target_reset_list->target_reset_issued = 1;
+   return 1;
+   }
}
 
/*
@@ -3872,9 +3874,8 @@ retry_page:
  redo_expander_scan:
list_for_each_entry(port_info, &ioc->sas_topology, list) {
 
-   if (port_info->phy_info &&
-   (!(port_info->phy_info[0].identify.device_info &
-   MPI_SAS_DEVICE_INFO_SMP_TARGET)))
+   if (!(port_info->phy_info[0].identify.device_info &
+   MPI_SAS_DEVICE_INFO_SMP_TARGET))

[PATCH 6/7] mptfusion: combine fw_event_work and its event_data

2014-06-04 Thread Joe Lawrence
Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation.  Rework fw_event_work
allocation calculations to include the event_data size where
appropriate.

This clarifies the code a bit and avoids the following smatch warnings:

  drivers/message/fusion/mptsas.c:1003 mptsas_queue_device_delete()
error: memcpy() 'fw_event->event_data' too small (29 vs 36)

  drivers/message/fusion/mptsas.c:1017 mptsas_queue_rescan() error: not
allocating enough data 168 vs 160

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---

JL: In the existing code, struct fw_event_work includes an aligned
char event_data[1] and when event data payload is required, a
larger number of bytes is allocated.  This is calculated by
subtracting the offsetof(struct fw_event_work, event_data) from
the sizeof(struct fw_event_work) and adding the required payload
size.  Was a single byte required to guarantee the alignment?
If not, it seems clearer to just use a 0-length array and avoid the
offsetof part.

 drivers/message/fusion/mptsas.c |   24 
 drivers/message/fusion/mptsas.h |2 +-
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 8bb580d..35c87b1 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -992,15 +992,15 @@ mptsas_queue_device_delete(MPT_ADAPTER *ioc,
struct fw_event_work *fw_event;
int sz;
 
-   sz = offsetof(struct fw_event_work, event_data) +
-   sizeof(MpiEventDataSasDeviceStatusChange_t);
+   sz = sizeof(*fw_event) +
+   sizeof(MpiEventDataSasDeviceStatusChange_t);
fw_event = kzalloc(sz, GFP_ATOMIC);
if (!fw_event) {
printk(MYIOC_s_WARN_FMT "%s: failed at (line=%d)\n",
ioc->name, __func__, __LINE__);
return;
}
-   memcpy(fw_event->event_data, sas_event_data,
+   memcpy(&fw_event->event_data, sas_event_data,
sizeof(MpiEventDataSasDeviceStatusChange_t));
fw_event->event = MPI_EVENT_SAS_DEVICE_STATUS_CHANGE;
fw_event->ioc = ioc;
@@ -1013,7 +1013,7 @@ mptsas_queue_rescan(MPT_ADAPTER *ioc)
struct fw_event_work *fw_event;
int sz;
 
-   sz = offsetof(struct fw_event_work, event_data);
+   sz = sizeof(*fw_event);
fw_event = kzalloc(sz, GFP_ATOMIC);
if (!fw_event) {
printk(MYIOC_s_WARN_FMT "%s: failed at (line=%d)\n",
@@ -3617,7 +3617,7 @@ mptsas_send_expander_event(struct fw_event_work *fw_event)
 
ioc = fw_event->ioc;
expander_data = (MpiEventDataSasExpanderStatusChange_t *)
-   fw_event->event_data;
+   &fw_event->event_data;
memcpy(&sas_address, &expander_data->SASAddress, sizeof(__le64));
sas_address = le64_to_cpu(sas_address);
port_info = mptsas_find_portinfo_by_sas_address(ioc, sas_address);
@@ -3694,7 +3694,7 @@ mptsas_send_link_status_event(struct fw_event_work 
*fw_event)
u8 link_rate;
 
ioc = fw_event->ioc;
-   link_data = (MpiEventDataSasPhyLinkStatus_t *)fw_event->event_data;
+   link_data = (MpiEventDataSasPhyLinkStatus_t *) &fw_event->event_data;
 
memcpy(&sas_address, &link_data->SASAddress, sizeof(__le64));
sas_address = le64_to_cpu(sas_address);
@@ -4043,7 +4043,7 @@ mptsas_handle_queue_full_event(struct fw_event_work 
*fw_event)
 
 
ioc = fw_event->ioc;
-   qfull_data = (EventDataQueueFull_t *)fw_event->event_data;
+   qfull_data = (EventDataQueueFull_t *) &fw_event->event_data;
fw_id = qfull_data->TargetID;
fw_channel = qfull_data->Bus;
current_depth = le16_to_cpu(qfull_data->CurrentDepth);
@@ -4584,7 +4584,7 @@ mptsas_send_sas_event(struct fw_event_work *fw_event)
 
ioc = fw_event->ioc;
sas_event_data = (EVENT_DATA_SAS_DEVICE_STATUS_CHANGE *)
-   fw_event->event_data;
+   &fw_event->event_data;
device_info = le32_to_cpu(sas_event_data->DeviceInfo);
 
if ((device_info &
@@ -4652,7 +4652,7 @@ mptsas_send_raid_event(struct fw_event_work *fw_event)
RaidPhysDiskPage0_t phys_disk;
 
ioc = fw_event->ioc;
-   raid_event_data = (EVENT_DATA_RAID *)fw_event->event_data;
+   raid_event_data = (EVENT_DATA_RAID *) &fw_event->event_data;
status = le32_to_cpu(raid_event_data->SettingsStatus);
state = (status >> 8) & 0xff;
 
@@ -4950,7 +4950,7 @@ mptsas_send_ir2_event(struct fw_event_work *fw_event)
RaidPhysDiskPage0_t phys_disk;
 
ioc = fw_event->ioc;
-   ir2_data = (MPI_EVENT_DATA_IR2 *)fw_event->event_data;
+   ir2_data = (M

[PATCH 5/7] mptfusion: make adapter prod_name[] a pointer

2014-06-04 Thread Joe Lawrence
The struct _MPT_ADAPTER doesn't need a full copy of the product string,
so prod_name can point to the string literal storage that the driver
already provides.

Avoids the following sparse warning:

  drivers/message/fusion/mptbase.c:2858 MptDisplayIocCapabilities()
warn: this array is probably non-NULL. 'ioc->prod_name'

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptbase.c |   11 +--
 drivers/message/fusion/mptbase.h |2 +-
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index a70b873..03263d6 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -1408,8 +1408,8 @@ mpt_verify_adapter(int iocid, MPT_ADAPTER **iocpp)
  * in /proc/mpt/summary and /sysfs/class/scsi_host/host/version_product
  *
  **/
-static void
-mpt_get_product_name(u16 vendor, u16 device, u8 revision, char *prod_name)
+static const char*
+mpt_get_product_name(u16 vendor, u16 device, u8 revision)
 {
char *product_str = NULL;
 
@@ -1635,8 +1635,7 @@ mpt_get_product_name(u16 vendor, u16 device, u8 revision, 
char *prod_name)
}
 
  out:
-   if (product_str)
-   sprintf(prod_name, "%s", product_str);
+   return product_str;
 }
 
 /**
@@ -1887,8 +1886,8 @@ mpt_attach(struct pci_dev *pdev, const struct 
pci_device_id *id)
dinitprintk(ioc, printk(MYIOC_s_INFO_FMT "facts @ %p, pfacts[0] @ %p\n",
ioc->name, &ioc->facts, &ioc->pfacts[0]));
 
-   mpt_get_product_name(pdev->vendor, pdev->device, pdev->revision,
-ioc->prod_name);
+   ioc->prod_name = mpt_get_product_name(pdev->vendor, pdev->device,
+ pdev->revision);
 
switch (pdev->device)
{
diff --git a/drivers/message/fusion/mptbase.h b/drivers/message/fusion/mptbase.h
index 76c05bc..78c8104 100644
--- a/drivers/message/fusion/mptbase.h
+++ b/drivers/message/fusion/mptbase.h
@@ -605,7 +605,7 @@ typedef struct _MPT_ADAPTER
int  id;/* Unique adapter id N 
{0,1,2,...} */
int  pci_irq;   /* This irq   */
char name[MPT_NAME_LENGTH]; /* "iocN" */
-   char prod_name[MPT_NAME_LENGTH];/* "LSIFC9x9"   
  */
+   const char   *prod_name;/* "LSIFC9x9" */
 #ifdef CONFIG_FUSION_LOGGING
/* used in mpt_display_event_info */
char evStr[EVENT_DESCR_STR_SZ];
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] mptfusion: mark file-private functions as static

2014-06-04 Thread Joe Lawrence
Fixes the following sparse warnings:

  drivers/message/fusion/mptbase.c:7011:1: warning: symbol
'mpt_SoftResetHandler' was not declared. Should it be static?

  drivers/message/fusion/mptsas.c:1578:23: warning: symbol
'mptsas_refreshing_device_handles' was not declared. Should it be
static?

  drivers/message/fusion/mptsas.c:3653:24: warning: symbol
'mptsas_expander_add' was not declared. Should it be static?

  drivers/message/fusion/mptsas.c:5327:1: warning: symbol
'mptsas_shutdown' was not declared. Should it be static?

  drivers/message/fusion/mptspi.c:624:1: warning: symbol
'mptscsih_quiesce_raid' was not declared. Should it be static?

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptbase.c |2 +-
 drivers/message/fusion/mptsas.c  |6 +++---
 drivers/message/fusion/mptspi.c  |2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index 570b18a..5ef3f89 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -7007,7 +7007,7 @@ EXPORT_SYMBOL(mpt_halt_firmware);
  * IOC doesn't reply to any outstanding request. This will transfer IOC
  * to READY state.
  **/
-int
+static int
 mpt_SoftResetHandler(MPT_ADAPTER *ioc, int sleepFlag)
 {
int  rc;
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 00d339c..8bb580d 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1575,7 +1575,7 @@ mptsas_del_end_device(MPT_ADAPTER *ioc, struct 
mptsas_phyinfo *phy_info)
mptsas_port_delete(ioc, phy_info->port_details);
 }
 
-struct mptsas_phyinfo *
+static struct mptsas_phyinfo *
 mptsas_refreshing_device_handles(MPT_ADAPTER *ioc,
struct mptsas_devinfo *sas_device)
 {
@@ -3650,7 +3650,7 @@ mptsas_send_expander_event(struct fw_event_work *fw_event)
  * @handle:
  *
  */
-struct mptsas_portinfo *
+static struct mptsas_portinfo *
 mptsas_expander_add(MPT_ADAPTER *ioc, u16 handle)
 {
struct mptsas_portinfo buffer, *port_info;
@@ -5323,7 +5323,7 @@ mptsas_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
return error;
 }
 
-void
+static void
 mptsas_shutdown(struct pci_dev *pdev)
 {
MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 5653e50..ed8de0a 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -620,7 +620,7 @@ static void mptspi_read_parameters(struct scsi_target 
*starget)
spi_width(starget) = (nego & MPI_SCSIDEVPAGE0_NP_WIDE) ? 1 : 0;
 }
 
-int
+static int
 mptscsih_quiesce_raid(MPT_SCSI_HOST *hd, int quiesce, u8 channel, u8 id)
 {
MPT_ADAPTER *ioc = hd->ioc;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] mptfusion: remove redundant kfree checks

2014-06-04 Thread Joe Lawrence
Fixes the following smatch warnings:

  drivers/message/fusion/mptfc.c:529 mptfc_target_destroy() info:
redundant null check on starget->hostdata calling kfree()

  drivers/message/fusion/mptspi.c:465 mptspi_target_destroy() info:
redundant null check on starget->hostdata calling kfree()

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptfc.c  |3 +--
 drivers/message/fusion/mptspi.c |3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/message/fusion/mptfc.c b/drivers/message/fusion/mptfc.c
index fd75108..4bac2ad 100644
--- a/drivers/message/fusion/mptfc.c
+++ b/drivers/message/fusion/mptfc.c
@@ -525,8 +525,7 @@ mptfc_target_destroy(struct scsi_target *starget)
if (ri) /* better be! */
ri->starget = NULL;
}
-   if (starget->hostdata)
-   kfree(starget->hostdata);
+   kfree(starget->hostdata);
starget->hostdata = NULL;
 }
 
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index ed8de0a..7ce9a5e 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -461,8 +461,7 @@ static int mptspi_target_alloc(struct scsi_target *starget)
 static void
 mptspi_target_destroy(struct scsi_target *starget)
 {
-   if (starget->hostdata)
-   kfree(starget->hostdata);
+   kfree(starget->hostdata);
starget->hostdata = NULL;
 }
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] mptfusion: zero kernel-space source of copy_to_user

2014-06-04 Thread Joe Lawrence
Fixes the following smatch warning:

  drivers/message/fusion/mptctl.c:1369 mptctl_getiocinfo() warn:
possible info leak 'karg'

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptctl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c
index dcc8385..e6d8935 100644
--- a/drivers/message/fusion/mptctl.c
+++ b/drivers/message/fusion/mptctl.c
@@ -1261,7 +1261,7 @@ mptctl_getiocinfo (unsigned long arg, unsigned int 
data_size)
else
return -EFAULT;
 
-   karg = kmalloc(data_size, GFP_KERNEL);
+   karg = kzalloc(data_size, GFP_KERNEL);
if (karg == NULL) {
printk(KERN_ERR MYNAM "%s::mpt_ioctl_iocinfo() @%d - no memory 
available!\n",
__FILE__, __LINE__);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] mptfusion: initChainBuffers should return errno

2014-06-04 Thread Joe Lawrence
The lone caller of initChainBuffers checks the return code for < 0, so
it is safe to appease smatch and return the proper errno value.

Fixes the following smatch warnings:

  drivers/message/fusion/mptbase.c:4328 initChainBuffers() warn:
returning -1 instead of -ENOMEM is sloppy

  drivers/message/fusion/mptbase.c:4335 initChainBuffers() warn:
returning -1 instead of -ENOMEM is sloppy

  drivers/message/fusion/mptbase.c:4402 initChainBuffers() warn:
returning -1 instead of -ENOMEM is sloppy

Signed-off-by: Joe Lawrence 
Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/message/fusion/mptbase.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c
index 5ef3f89..a70b873 100644
--- a/drivers/message/fusion/mptbase.c
+++ b/drivers/message/fusion/mptbase.c
@@ -4325,14 +4325,14 @@ initChainBuffers(MPT_ADAPTER *ioc)
sz = ioc->req_depth * sizeof(int);
mem = kmalloc(sz, GFP_ATOMIC);
if (mem == NULL)
-   return -1;
+   return -ENOMEM;
 
ioc->ReqToChain = (int *) mem;
dinitprintk(ioc, printk(MYIOC_s_DEBUG_FMT "ReqToChain alloc  @ 
%p, sz=%d bytes\n",
ioc->name, mem, sz));
mem = kmalloc(sz, GFP_ATOMIC);
if (mem == NULL)
-   return -1;
+   return -ENOMEM;
 
ioc->RequestNB = (int *) mem;
dinitprintk(ioc, printk(MYIOC_s_DEBUG_FMT "RequestNB alloc  @ 
%p, sz=%d bytes\n",
@@ -4399,7 +4399,7 @@ initChainBuffers(MPT_ADAPTER *ioc)
if (ioc->ChainToChain == NULL) {
mem = kmalloc(sz, GFP_ATOMIC);
if (mem == NULL)
-   return -1;
+   return -ENOMEM;
 
ioc->ChainToChain = (int *) mem;
dinitprintk(ioc, printk(MYIOC_s_DEBUG_FMT "ChainToChain alloc @ 
%p, sz=%d bytes\n",
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/7] mptfusion static checker fixups

2014-06-04 Thread Joe Lawrence
While reviewing the mpt2/mpt3 static checker fixup patchset, Christoph
inquired about mptfusion.  None of the sparse / smatch warnings from the
earlier patchset directly apply to fusion, but there were a few easy to
fix warnings (compile tested only).

The patchset is ordered from the smallest/easiest change up to the last
three, which are bit more involved and should be reviewed by LSI,
especially the last one "mptfusion: tweak null pointer checks".  See the
commentary in those patches after the signed-off-by line.

Cc: Christoph Hellwig 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 

Joe Lawrence (7):
  mptfusion: mark file-private functions as static
  mptfusion: remove redundant kfree checks
  mptfusion: initChainBuffers should return errno
  mptfusion: zero kernel-space source of copy_to_user
  mptfusion: make adapter prod_name[] a pointer
  mptfusion: combine fw_event_work and its event_data
  mptfusion: tweak null pointer checks

 drivers/message/fusion/mptbase.c  |   29 +--
 drivers/message/fusion/mptbase.h  |2 +-
 drivers/message/fusion/mptctl.c   |2 +-
 drivers/message/fusion/mptfc.c|3 +-
 drivers/message/fusion/mptsas.c   |   57 +++--
 drivers/message/fusion/mptsas.h   |2 +-
 drivers/message/fusion/mptscsih.c |7 +++--
 drivers/message/fusion/mptspi.c   |5 ++--
 8 files changed, 52 insertions(+), 55 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Christoph Hellwig wrote:

2014-06-02 Thread Joe Lawrence
> On Mon, Jun 02, 2014 at 10:37:26AM -0400, Joe Lawrence wrote:
> > If mpt3sas_send_trigger_data_event exits early without inserting a
> > fw_event, be sure to undo any prior allocations.
> 
> Looks good, but why don't we just allocate the two in a single
> allocation?

Hi Christoph,

The following routines already handle two allocations safely:

 mpt2sas_scsih_event_callback
 mpt3sas_scsih_event_callback

but if we wanted to merge them, it could look something like this (introducing
a bunch of UglyCamelCaseCastings).  Compile tested only...  and doesn't include
mpt3sas or fusion versions.

-- >8 --

>From b34615dfb103613f228a82eb4eb6644a04036256 Mon Sep 17 00:00:00 2001
From: Joe Lawrence 
Date: Sun, 1 Jun 2014 22:36:45 -0400
Subject: [PATCH] mpt3sas: combine fw_event_work and its event_data

Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation.

This fixes the following smatch warning:

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:2522
mpt3sas_send_trigger_data_event() warn: possible memory leak of
    'fw_event'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   54 +-
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e6f0720..8c2aa5d 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -190,7 +190,7 @@ struct fw_event_work {
u8  VP_ID;
u8  ignore;
u16 event;
-   void*event_data;
+   charevent_data[0];
 };
 
 /* raid transport support */
@@ -2495,7 +2495,6 @@ _scsih_fw_event_free(struct MPT3SAS_ADAPTER *ioc, struct 
fw_event_work
 
spin_lock_irqsave(&ioc->fw_event_lock, flags);
list_del(&fw_event->list);
-   kfree(fw_event->event_data);
kfree(fw_event);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
@@ -2516,12 +2515,10 @@ mpt3sas_send_trigger_data_event(struct MPT3SAS_ADAPTER 
*ioc,
 
if (ioc->is_driver_loading)
return;
-   fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+   fw_event = kzalloc(sizeof(*fw_event) + sizeof(*event_data),
+  GFP_ATOMIC);
if (!fw_event)
return;
-   fw_event->event_data = kzalloc(sizeof(*event_data), GFP_ATOMIC);
-   if (!fw_event->event_data)
-   return;
fw_event->event = MPT3SAS_PROCESS_TRIGGER_DIAG;
fw_event->ioc = ioc;
memcpy(fw_event->event_data, event_data, sizeof(*event_data));
@@ -3216,7 +3213,8 @@ _scsih_check_topo_delete_events(struct MPT3SAS_ADAPTER 
*ioc,
if (fw_event->event != MPI2_EVENT_SAS_TOPOLOGY_CHANGE_LIST ||
fw_event->ignore)
continue;
-   local_event_data = fw_event->event_data;
+   local_event_data = (Mpi2EventDataSasTopologyChangeList_t *)
+  &fw_event->event_data;
if (local_event_data->ExpStatus ==
MPI2_EVENT_SAS_TOPO_ES_ADDED ||
local_event_data->ExpStatus ==
@@ -5051,7 +5049,8 @@ _scsih_sas_topology_change_event(struct MPT3SAS_ADAPTER 
*ioc,
u64 sas_address;
unsigned long flags;
u8 link_rate, prev_link_rate;
-   Mpi2EventDataSasTopologyChangeList_t *event_data = fw_event->event_data;
+   Mpi2EventDataSasTopologyChangeList_t *event_data =
+   (Mpi2EventDataSasTopologyChangeList_t *) &fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5249,7 +5248,7 @@ _scsih_sas_device_status_change_event(struct 
MPT3SAS_ADAPTER *ioc,
u64 sas_address;
unsigned long flags;
Mpi2EventDataSasDeviceStatusChange_t *event_data =
-   fw_event->event_data;
+   (Mpi2EventDataSasDeviceStatusChange_t *) &fw_event->event_data;
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
@@ -5345,7 +5344,7 @@ _scsih_sas_enclosure_dev_status_change_event(struct 
MPT3SAS_ADAPTER *ioc,
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
if (ioc->logging_level & MPT_DEBUG_EVENT_WORK_TASK)
_scsih_sas_enclosure_dev_status_change_event_debug(ioc,
-fw_event->event_data);
+&fw_event->event_data);
 #endif
 }
 
@@ -5369,7 +5368,8 @@ _scsih_sas_broadcast_primitive_event(struct 
MPT3SAS_ADAPTER *ioc,
u32 termination_count;
u32 query_count;
Mpi2SCSITaskManagementReply_t *mpi_reply;
-   Mpi2EventDataSasBroa

Re: [PATCH 0/4] mpt2/mpt3 static checker fixups

2014-06-02 Thread Joe Lawrence
On Mon, 2 Jun 2014 09:33:07 -0700
Christoph Hellwig  wrote:

> On Mon, Jun 02, 2014 at 10:34:28AM -0400, Joe Lawrence wrote:
> > Hello Sreekanth, Dan,
> > 
> > These are a few minor smatch and sparse static checker fixes for the LSI
> > mpt2 and mpt3 drivers.  The first three fix real potential bugs and the
> > last cleans up a noisy complaint from sparse.
> 
> Can you check if any of these also applies to drivers/message/fusion?
> There is tons of copy & paste between all three drivers unfortunately.

The fixes in this patchset don't apply to drivers/message/fusion,
however there are a half-dozen or so warnings that I could take a look
at if the driver is still maintained.

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] mpt2sas: annotate ioc->reply_post_host_index as __iomem

2014-06-02 Thread Joe Lawrence
The MPT2SAS_ADAPTER reply_post_host_index[] holds calculated addresses
in memory mapped register space.  Add an "__iomem" annotation to silence
the following sparse warnings:

  drivers/scsi/mpt2sas/mpt2sas_base.c:1006:43:
warning: incorrect type in argument 2 (different address spaces)
   expected void volatile [noderef] *addr
   got unsigned long long [usertype] *

  drivers/scsi/mpt2sas/mpt2sas_base.c:4299:22:
warning: cast removes address space of expression
  drivers/scsi/mpt2sas/mpt2sas_base.c:4303:27:
warning: cast removes address space of expression

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c |9 +
 drivers/scsi/mpt2sas/mpt2sas_base.h |2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c 
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index bde63f7..a6e477e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -4295,12 +4295,13 @@ mpt2sas_base_attach(struct MPT2SAS_ADAPTER *ioc)
goto out_free_resources;
 
if (ioc->is_warpdrive) {
-   ioc->reply_post_host_index[0] =
-   (resource_size_t *)&ioc->chip->ReplyPostHostIndex;
+   ioc->reply_post_host_index[0] = (resource_size_t __iomem *)
+   &ioc->chip->ReplyPostHostIndex;
 
for (i = 1; i < ioc->cpu_msix_table_sz; i++)
-   ioc->reply_post_host_index[i] = (resource_size_t *)
-   ((u8 *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
+   ioc->reply_post_host_index[i] =
+   (resource_size_t __iomem *)
+   ((u8 __iomem *)&ioc->chip->Doorbell + (0x4000 + ((i - 1)
* 4)));
}
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h 
b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 1f2ac3a..23d0c85 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -837,7 +837,7 @@ struct MPT2SAS_ADAPTER {
u8  msix_enable;
u16 msix_vector_count;
u8  *cpu_msix_table;
-   resource_size_t **reply_post_host_index;
+   resource_size_t __iomem **reply_post_host_index;
u16 cpu_msix_table_sz;
u32 ioc_reset_count;
MPT2SAS_FLUSH_RUNNING_CMDS schedule_dead_ioc_flush_running_cmds;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] mpt3sas: correct scsi_{target,device} hostdata allocation

2014-06-02 Thread Joe Lawrence
In _scsih_{slave,target}_alloc, an incorrect structure type is passed
to sizeof() when allocating storage for hostdata.  Luckily larger
structure types were used, so at least the wrong sizes were safe:

  struct scsi_device (1784 bytes) > struct MPT3SAS_DEVICE (24 bytes)
  struct scsi_target (760 bytes)  > struct MPT3SAS_TARGET (32 bytes)

This fixes the following smatch warnings:

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:1166 _scsih_target_alloc()
warn: struct type mismatch 'MPT3SAS_TARGET vs scsi_target'

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:1280 _scsih_slave_alloc()
warn: struct type mismatch 'MPT3SAS_DEVICE vs scsi_device'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index a961fe1..e6f0720 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -1163,7 +1163,8 @@ _scsih_target_alloc(struct scsi_target *starget)
unsigned long flags;
struct sas_rphy *rphy;
 
-   sas_target_priv_data = kzalloc(sizeof(struct scsi_target), GFP_KERNEL);
+   sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data),
+  GFP_KERNEL);
if (!sas_target_priv_data)
return -ENOMEM;
 
@@ -1277,7 +1278,8 @@ _scsih_slave_alloc(struct scsi_device *sdev)
struct _sas_device *sas_device;
unsigned long flags;
 
-   sas_device_priv_data = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
+   sas_device_priv_data = kzalloc(sizeof(*sas_device_priv_data),
+  GFP_KERNEL);
if (!sas_device_priv_data)
return -ENOMEM;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] mpt3sas: fix possible memory leak in mpt3sas_send_trigger_data_event

2014-06-02 Thread Joe Lawrence
If mpt3sas_send_trigger_data_event exits early without inserting a
fw_event, be sure to undo any prior allocations.

This fixes the following smatch warning:

  drivers/scsi/mpt3sas/mpt3sas_scsih.c:2522
mpt3sas_send_trigger_data_event() warn: possible memory leak of
'fw_event'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e6f0720..25e3a22 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2520,8 +2520,10 @@ mpt3sas_send_trigger_data_event(struct MPT3SAS_ADAPTER 
*ioc,
if (!fw_event)
return;
fw_event->event_data = kzalloc(sizeof(*event_data), GFP_ATOMIC);
-   if (!fw_event->event_data)
+   if (!fw_event->event_data) {
+   kfree(fw_event);
return;
+   }
fw_event->event = MPT3SAS_PROCESS_TRIGGER_DIAG;
fw_event->ioc = ioc;
memcpy(fw_event->event_data, event_data, sizeof(*event_data));
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] mpt2sas: correct scsi_{target,device} hostdata allocation

2014-06-02 Thread Joe Lawrence
In _scsih_{slave,target}_alloc, an incorrect structure type is passed
to sizeof() when allocating storage for hostdata.  Luckily larger
structure types were used, so at least the wrong sizes were safe:

  struct scsi_device (1784 bytes) > struct MPT2SAS_DEVICE (24 bytes)
  struct scsi_target (760 bytes)  > struct MPT2SAS_TARGET (40 bytes)

This fixes the following smatch warnings:

  drivers/scsi/mpt2sas/mpt2sas_scsih.c:1295 _scsih_target_alloc()
warn: struct type mismatch 'MPT2SAS_TARGET vs scsi_target'

  drivers/scsi/mpt2sas/mpt2sas_scsih.c:1409 _scsih_slave_alloc()
warn: struct type mismatch 'MPT2SAS_DEVICE vs scsi_device'

Signed-off-by: Joe Lawrence 
Cc: Dan Carpenter 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c 
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 6fd7d40..deb0bd2 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -1292,7 +1292,8 @@ _scsih_target_alloc(struct scsi_target *starget)
unsigned long flags;
struct sas_rphy *rphy;
 
-   sas_target_priv_data = kzalloc(sizeof(struct scsi_target), GFP_KERNEL);
+   sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data),
+  GFP_KERNEL);
if (!sas_target_priv_data)
return -ENOMEM;
 
@@ -1406,7 +1407,8 @@ _scsih_slave_alloc(struct scsi_device *sdev)
struct _sas_device *sas_device;
unsigned long flags;
 
-   sas_device_priv_data = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
+   sas_device_priv_data = kzalloc(sizeof(*sas_device_priv_data),
+  GFP_KERNEL);
if (!sas_device_priv_data)
return -ENOMEM;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] mpt2/mpt3 static checker fixups

2014-06-02 Thread Joe Lawrence
Hello Sreekanth, Dan,

These are a few minor smatch and sparse static checker fixes for the LSI
mpt2 and mpt3 drivers.  The first three fix real potential bugs and the
last cleans up a noisy complaint from sparse.

Joe Lawrence (4):
  mpt2sas: correct scsi_{target,device} hostdata allocation
  mpt3sas: correct scsi_{target,device} hostdata allocation
  mpt3sas: fix possible memory leak in mpt3sas_send_trigger_data_event
  mpt2sas: annotate ioc->reply_post_host_index as __iomem

 drivers/scsi/mpt2sas/mpt2sas_base.c  |9 +
 drivers/scsi/mpt2sas/mpt2sas_base.h  |2 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |6 --
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   10 +++---
 4 files changed, 17 insertions(+), 10 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make SCSI error handler code easier to understand

2014-05-28 Thread Joe Lawrence
On Mon, 26 May 2014 17:12:27 +0200
Bart Van Assche  wrote:

> Every now and then someone asks how it is avoided that the SCSI error
> handler and the SCSI completion handler are invoked concurrently for
> the same SCSI command. Hence this patch series that should make the SCSI
> error handler code a little easier to understand.

Hi Bart,

We talked about REQ_ATOM_COMPLETE a while back (so I may be one of
those people who periodically ask about SCSI error handling /
completion), and I just thought I'd chime in here to clarify the
scenario we were worried about.

Before d555a2ab "[SCSI] Fix spurious request sense in error handling"
we saw the following sequence of events:

  [ 1394.345991] sd 2:0:1:0: [sdw] Done:
  [ 1394.361790] TIMEOUT
  [ 1394.374148] sd 2:0:1:0: [sdw]
  [ 1394.388775] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [ 1394.409068] sd 2:0:1:0: [sdw] CDB:
  [ 1394.424782] Read(10): 28 00 00 00 00 18 00 00 08 00
  [ 1394.443772] sd 2:0:1:0: [sdw] scmd 88104dce02c0 abort scheduled
  [ 1394.474026] sd 2:0:1:0: [sdw] aborting command 88104dce02c0
  [ 1394.573338] qla2xxx [:46:00.0]-801c:2: Abort command issued 
nexus=2:1:0 --  1 2002.
  [ 1394.609313] sd 2:0:1:0: [sdw] scmd 88104dce02c0 retry aborted command
  
  [ 1425.397434] sd 2:0:1:0: [sdw] Done:
  [ 1425.417802] TIMEOUT
  [ 1425.431914] sd 2:0:1:0: [sdw]
  [ 1425.448247] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
  [ 1425.470736] sd 2:0:1:0: [sdw] CDB:
  [ 1425.488005] Read(10): 28 00 00 00 00 18 00 00 08 00
  [ 1425.508472] sd 2:0:1:0: [sdw] scmd 88104dce02c0 previous abort failed
  [ 1425.533299] scsi_eh_2: waking up 0/1/1
  [ 1425.551189] sd 2:0:1:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
  [ 1425.575897] Total of 1 commands on 1 devices require eh work
  [ 1425.597970] sd 2:0:1:0: [sdw] scsi_eh_2: requesting sense

  ie, something like:

  CMD -> .queuecommand ... timeout
  ABORT -> .eh_abort_handler
  CMD (retry) -> .queuecommand ... timeout
  ABORT -> scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD)
  ... into scsi_error_handler ...
  REQUEST SENSE -> .queuecommand

We eventually found our way into scsi_unjam_host, which "*knows*
that all commands on the bus have either completed, failed or timed
out.", and then scsi_send_eh_cmnd, which "hijacks" a SCSI command
structure.  Presumably the second comment follows from the first.

But in the case of timeout, when was the LLDD informed to abort or
forget about that scsi command structure?

In our testing, it appeared that the LLDD's .queuecommand was called
back-to-back with the same scsi_cmnd pointer.  The driver in question
was qla2xxx, who keeps a driver-private structure to scsi_cmnd
mapping (assumed to be one-to-one) that got confused by this sequence
of events.

After d555a2ab, in this scenario scsi_unjam_host skips the request
sense and escalates to the next level (scsi_eh_abort_cmds), skipping
straight to the abort as we would have expected...  This avoids the
scsi_cmnd "hijack"... but was it ever safe to do so in the first place?

Or should LLDD expect to gracefully handle the same scsi_cmnd pointer
coming into .queuecommand w/o an intermediate abort or other
scsi_host_template callback?

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2014-05-22 Thread Joe Lawrence
The recent change in sysfs, bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
"sysfs: make __sysfs_remove_dir() recursive" revealed an asymmetric
rphy device creation/deletion sequence in scsi_transport_sas:

  modprobe mpt2sas
sas_rphy_add
  device_add A   rphy->dev
  device_add B   sas_device transport class
  device_add C   sas_end_device transport class
  device_add D   bsg class

  rmmod mpt2sas
sas_rphy_delete
  sas_rphy_remove
device_del B
device_del C
device_del A
  sysfs_remove_group recursive sysfs dir removal
  sas_rphy_free
device_del D warning

  where device A is the parent of B, C, and D.

When sas_rphy_free tries to unregister the bsg request queue (device D
above), the ensuing sysfs cleanup discovers that its sysfs group has
already been removed and emits a warning, "sysfs group... not found for
kobject 'end_device-X:0'".

Since bsg creation is a side effect of sas_rphy_add, move its
complementary removal call into sas_rphy_remove. This imposes the
following tear-down order for the devices above: D, B, C, A.

Note the sas_device and sas_end_device transport class devices (B and C
above) are created and destroyed both via the list match traversal in
attribute_container_device_trigger, so the order in which they are
handled is fixed. This is fine as long as they are deleted before their
parent device.

Signed-off-by: Joe Lawrence 
Cc: "James E.J. Bottomley" 
Cc: Christoph Hellwig 
Acked-by: Dan Williams 
---
 drivers/scsi/scsi_transport_sas.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_sas.c 
b/drivers/scsi/scsi_transport_sas.c
index 1b68142..c341f85 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -1621,8 +1621,6 @@ void sas_rphy_free(struct sas_rphy *rphy)
list_del(&rphy->list);
mutex_unlock(&sas_host->lock);
 
-   sas_bsg_remove(shost, rphy);
-
transport_destroy_device(dev);
 
put_device(dev);
@@ -1681,6 +1679,7 @@ sas_rphy_remove(struct sas_rphy *rphy)
}
 
sas_rphy_unlink(rphy);
+   sas_bsg_remove(NULL, rphy);
transport_remove_device(dev);
device_del(dev);
 }
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2014-05-02 Thread Joe Lawrence
On Wed, 2 Apr 2014 16:48:52 -0400
Joe Lawrence  wrote:

> On Wed, 2 Apr 2014 16:40:41 -0400
> Joe Lawrence  wrote:
> 
> > Since bsg creation is a side effect of sas_rphy_add, move its
> > complementary removal call into sas_rphy_remove. 
> 
> Hello James,
> 
> This a resend of a patch posted to quiet an end-device sysfs warning in
> its device deletion path when removing the mpt2sas driver:
> 
>   (initial report)
>   sysfs group not found for kobject on mpt2sas unload
>   http://marc.info/?t=13849746004
>   
>   (patch + ACK + comments)
>   http://marc.info/?t=13860945551
> 
>   (gentoo, LSI repro)
>   mpt2sas driver barfs when force removing a drive on 3.13.1
>   http://marc.info/?t=13912235131
> 
> Dan Williams had a few other suggestions for cleanup in this area, but
> those could be handled in a subsequent patch.  This one missed 3.14
> inclusion and the warning still occurs in Linus's tree as of today
> (post scsi/block merge) hence the repost.

Ping? Just reproduced on 3.15-rc3.  Also found this:

  (~500 repro count on Fedora Problem Tracker)
  http://retrace.fedoraproject.org/faf/problems/1623421/

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 3.15-mw: Oops Workqueue: writeback bdi_writeback_workfn (flush-8:16) RIP: e030:[] [] kobject_put+0x11/0x70

2014-04-14 Thread Joe Lawrence
On Mon, 14 Apr 2014 04:30:15 -0700
Christoph Hellwig  wrote:

> On Sat, Apr 12, 2014 at 01:34:31PM +0200, Sander Eikelenboom wrote:
> > Hi,
> > 
> > I just ran into the oops belowafter some uptime.
> 
> Classic use after free introduced by my recent changes, sorry.
> 
> This should fix it:
> 
> ---
> From: Christoph Hellwig 
> Subject: scsi: don't reference freed command in scsi_init_sgtable
> 
> When scsi_init_io fails we have to release our device reference, but
> we do this trying to reference the just freed command.  Add a local
> scsi_device pointer to fix this.
> 
> Reported-by: Sander Eikelenboom 
> Signed-off-by: Christoph Hellwig 
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 65a123d..54eff6a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1044,6 +1044,7 @@ static int scsi_init_sgtable(struct request *req, 
> struct scsi_data_buffer *sdb,
>   */
>  int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask)
>  {
> + struct scsi_device *sdev = cmd->device;
>   struct request *rq = cmd->request;
>  
>   int error = scsi_init_sgtable(rq, &cmd->sdb, gfp_mask);
> @@ -1091,7 +1092,7 @@ err_exit:
>   scsi_release_buffers(cmd);
>   cmd->request->special = NULL;
>   scsi_put_command(cmd);
> - put_device(&cmd->device->sdev_gendev);
> + put_device(&sdev->sdev_gendev);
>   return error;
>  }
>  EXPORT_SYMBOL(scsi_init_io);

Hi Christoph,

I hit a similar crash last week on a franken-kernel here (3.14 + scsi
misc + qlogic patches + out of tree drivers + terriblethingsIknow).  I
think there is one other similar use-after-free that's been in place
for a while now:

int scsi_prep_return(struct request_queue *q, struct request *req, int ret)
{
struct scsi_device *sdev = q->queuedata;

switch (ret) {
case BLKPREP_KILL:
req->errors = DID_NO_CONNECT << 16;
/* release the command and kill it */
if (req->special) {
struct scsi_cmnd *cmd = req->special;
scsi_release_buffers(cmd);
scsi_put_command(cmd);<<
put_device(&cmd->device->sdev_gendev);<<
req->special = NULL;
}
break;
...

and the backtrace looked like:

general protection fault:  [#1] SMP 
Modules linked in: ccmod(POF) ftmod(OF) ipmi_devintf ipmi_msghandler bonding sg 
x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel ixgbe(OF) aesni_intel igb(OF) ptp lrw pps_core gf128mul 
glue_helper i2c_algo_bit mdio ablk_helper cryptd pcspkr dca ntb i2c_core 
matroxfb(OF) videosw(OF) nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c 
dm_service_time sd_mod(OF) crc_t10dif crct10dif_common qla2xxx(OF) 
scsi_transport_fc mpt2sas(OF) raid_class scsi_tgt scsi_transport_sas dm_mirror 
dm_region_hash dm_log dm_multipath dm_mod
CPU: 8 PID: 23976 Comm: systemd-udevd Tainted: PF   W  O 3.14.0+ #2
Hardware name: Stratus ftServer 6400/G7LAY, BIOS BIOS Version 6.3:57 12/25/2013
task: 880420b3d7c0 ti: 880729138000 task.ti: 880729138000
RIP: 0010:[]  [] kobject_put+0xd/0x60
RSP: 0018:880729139a80  EFLAGS: 00010002
RAX:  RBX: 6b6b6b6b6b6b6ce3 RCX: 0001002e0003
RDX: 002e RSI: ea0021370400 RDI: 6b6b6b6b6b6b6ce3
RBP: 880729139a88 R08: 88084dc16300 R09: 0001002e0002
R10: 88104f603a80 R11: ea0021370400 R12: 88084dc16300
R13: 0001 R14: 881026935388 R15: 880ff56b3a18
FS:  7f2eed940880() GS:88085fd0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f2eed0e2c20 CR3: 00084d6c8000 CR4: 000407e0
Stack:
 88076248d0a8 880729139a98 813e57d7 880729139ac0
 8141bf3e 00018141be66 88076248d0a8 88104def6840
 880729139b20 a00a46e0 88104def6840 88076248d0a8
Call Trace:
 [] put_device+0x17/0x20
 [] scsi_prep_return+0x9e/0xc0
 [] sd_prep_fn+0x70/0xcd0 [sd_mod]
 [] blk_peek_request+0x12f/0x250
 [] scsi_request_fn+0x48/0x570
 [] __blk_run_queue+0x33/0x40
 [] queue_unplugged+0x2a/0xa0
 [] blk_flush_plug_list+0x1d8/0x230
 [] blk_finish_plug+0x14/0x40
 [] __do_page_cache_readahead+0x209/0x290
 [] force_page_cache_readahead+0x6d/0xa0
 [] page_cache_sync_readahead+0x43/0x50
 [] generic_file_aio_read+0x4f5/0x720
 [] blkdev_aio_read+0x4b/0x70
 [] do_sync_read+0x67/0xa0
 [] vfs_read+0x9b/0x160
 [] SyS_read+0x55/0xd0
 [] system_call_fastpath+0x16/0x1b

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2014-04-02 Thread Joe Lawrence
The recent change in sysfs, bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
"sysfs: make __sysfs_remove_dir() recursive" revealed an asymmetric
rphy device creation/deletion sequence in scsi_transport_sas:

  modprobe mpt2sas
sas_rphy_add
  device_add A   rphy->dev
  device_add B   sas_device transport class
  device_add C   sas_end_device transport class
  device_add D   bsg class

  rmmod mpt2sas
sas_rphy_delete
  sas_rphy_remove
device_del B
device_del C
device_del A
  sysfs_remove_group recursive sysfs dir removal
  sas_rphy_free
device_del D warning

  where device A is the parent of B, C, and D.

When sas_rphy_free tries to unregister the bsg request queue (device D
above), the ensuing sysfs cleanup discovers that its sysfs group has
already been removed and emits a warning, "sysfs group... not found for
kobject 'end_device-X:0'".

Since bsg creation is a side effect of sas_rphy_add, move its
complementary removal call into sas_rphy_remove. This imposes the
following tear-down order for the devices above: D, B, C, A.

Note the sas_device and sas_end_device transport class devices (B and C
above) are created and destroyed both via the list match traversal in
attribute_container_device_trigger, so the order in which they are
handled is fixed. This is fine as long as they are deleted before their
parent device.

Signed-off-by: Joe Lawrence 
Cc: "James E.J. Bottomley" 
Cc: Dan Williams 
Cc: "Reddy, Sreekanth" 
---
 drivers/scsi/scsi_transport_sas.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_sas.c 
b/drivers/scsi/scsi_transport_sas.c
index 1b68142..c341f85 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -1621,8 +1621,6 @@ void sas_rphy_free(struct sas_rphy *rphy)
list_del(&rphy->list);
mutex_unlock(&sas_host->lock);
 
-   sas_bsg_remove(shost, rphy);
-
transport_destroy_device(dev);
 
put_device(dev);
@@ -1681,6 +1679,7 @@ sas_rphy_remove(struct sas_rphy *rphy)
}
 
sas_rphy_unlink(rphy);
+   sas_bsg_remove(NULL, rphy);
transport_remove_device(dev);
device_del(dev);
 }
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2014-04-02 Thread Joe Lawrence
On Wed, 2 Apr 2014 16:40:41 -0400
Joe Lawrence  wrote:

> Since bsg creation is a side effect of sas_rphy_add, move its
> complementary removal call into sas_rphy_remove. 

Hello James,

This a resend of a patch posted to quiet an end-device sysfs warning in
its device deletion path when removing the mpt2sas driver:

  (initial report)
  sysfs group not found for kobject on mpt2sas unload
  http://marc.info/?t=13849746004
  
  (patch + ACK + comments)
  http://marc.info/?t=13860945551

  (gentoo, LSI repro)
  mpt2sas driver barfs when force removing a drive on 3.13.1
  http://marc.info/?t=13912235131

Dan Williams had a few other suggestions for cleanup in this area, but
those could be handled in a subsequent patch.  This one missed 3.14
inclusion and the warning still occurs in Linus's tree as of today
(post scsi/block merge) hence the repost.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2014-02-21 Thread Joe Lawrence
On Wed, 12 Feb 2014 11:13:22 -0800
Dan Williams  wrote:
> 
> Acked-by: Dan Williams 
> 
> I think this is the right move, and some follow-up cleanups / fixes
> come to mind:

Dan -- Thanks for taking a look.  A few comments below.

> 1/ Why is sas_bsg_remove() multiplexing 2 use cases?  Let's just have
> a sas_host_bsg_remove() and a sas_device_bsg_remove().

Splitting the two use cases reduces them into a simple conditional call
to bsg_unregister_queue(), that could just be done in place.  On the
other hand, sas_bsg_initialize() looks like it handles both cases as
well.

> 2/ I think sas_rphy_free() mishandles the reference counts in the
> sas_rphy_add() failing case.
> sas_rphy_free() does:
> transport_destroy_device(dev); /* which is just put_device(dev) */
> put_device(dev);
> 
> ...but if sas_rphy_add() failed we never took the reference from
> transport_add_device() and I expect we only have the one reference
> from device_initialize() to drop.
> 
> We also ignore the return code from transport_add_device() which is
> another hole.

Ugh, I think fixing this properly would require piping the return code
of the trigger function back out of the attribute container code:

void transport_add_device()
  void attribute_container_device_trigger()
loop through the attribute_container_list
  if match
int transport_add_class_device()
  ... return error ...

Not to mention several callers of sas_rphy_add keep on truckin' when it
does return failure.

Perhaps this can be handled in another way?

> Commenting on this mail should refresh this in James' queue.  But you
> may want to resend anyways given this is a few months old.

Whatever is easier for James.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mpt2sas driver barfs when force removing a drive on 3.13.1

2014-02-12 Thread Joe Lawrence
Hi Sreekanth,

There hasn't been much recent activity in
drivers/scsi/scsi_transport_sas.c, so I wasn't sure who could best
review the patch.

Dan Williams added sas_rphy_unlink() back in 2011, perhaps he can
remember why sas_bsg_remove wasn't performed there?

Regards,

-- Joe


On Wed, 12 Feb 2014 11:43:58 +
"Reddy, Sreekanth"  wrote:

> Hi Joe,
> 
> May I known the status of this patch having subject
> "[PATCH] scsi_transport_sas: move bsg destructor into sas_rphy_remove".
> 
> I need the status of this patch as I am also observing the same WARNING call 
> trace on latest 3.14.0-rc2+ kernel whenever we unload the mpt2sas/mpt3sas 
> driver or whenever we unplug the drive attached to the HBA.
> But after applying yours patch, these WARNING call trace is not observed.
> 
> Regards,
> Sreekanth
> 
> >-Original Message-
> >From: Joe Lawrence [mailto:joe.lawre...@stratus.com]
> >Sent: Monday, February 03, 2014 7:54 PM
> >To: prometheanf...@gentoo.org
> >Cc: Nandigama, Nagalakshmi; Reddy, Sreekanth; Support; DL-MPT Fusion
> >Linux; linux-scsi@vger.kernel.org; jbottom...@parallels.com; linux-
> >ker...@vger.kernel.org; ker...@gentoo.org
> >Subject: Re: mpt2sas driver barfs when force removing a drive on 3.13.1
> >
> >On Fri, 31 Jan 2014 20:49:39 -0600
> >Matthew Thode  wrote:
> >
> >> I decided to pull a drive while it was in use out of laziness (it was
> >> open via luks, but not in actual use).  Got a fun trace as a result.
> >> Just thought you'd like to know :D
> >>
> >
> >Hi Matthew,
> >
> >The first trace looks a lot like what I see on mpt2sas driver removal [1].  I
> >posted a suggested fix back in Dec [2], which you might try, however it is 
> >not
> >reviewed at this point.
> >
> >[1] http://thread.gmane.org/gmane.linux.scsi/86237
> >[2] https://github.com/joe-
> >lawrence/linux/compare/scsi_transport_sas_sysfs_warning.patch
> >
> >Regards,
> >
> >-- Joe
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mpt2sas driver barfs when force removing a drive on 3.13.1

2014-02-03 Thread Joe Lawrence
On Fri, 31 Jan 2014 20:49:39 -0600
Matthew Thode  wrote:

> I decided to pull a drive while it was in use out of laziness (it was
> open via luks, but not in actual use).  Got a fun trace as a result.
> Just thought you'd like to know :D
> 

Hi Matthew, 

The first trace looks a lot like what I see on mpt2sas driver removal
[1].  I posted a suggested fix back in Dec [2], which you might try,
however it is not reviewed at this point.

[1] http://thread.gmane.org/gmane.linux.scsi/86237
[2] 
https://github.com/joe-lawrence/linux/compare/scsi_transport_sas_sysfs_warning.patch

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] scsi_transport_sas: move bsg destructor into sas_rphy_remove

2013-12-03 Thread Joe Lawrence
The recent change in sysfs, bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
"sysfs: make __sysfs_remove_dir() recursive" revealed an asymmetric
rphy device creation/deletion sequence in scsi_transport_sas:

  modprobe mpt2sas
sas_rphy_add
  device_add A   rphy->dev
  device_add B   sas_device transport class
  device_add C   sas_end_device transport class
  device_add D   bsg class

  rmmod mpt2sas
sas_rphy_delete
  sas_rphy_remove
device_del B
device_del C
device_del A
  sysfs_remove_group recursive sysfs dir removal
  sas_rphy_free
device_del D warning

  where device A is the parent of B, C, and D.

When sas_rphy_free tries to unregister the bsg request queue (device D
above), the ensuing sysfs cleanup discovers that its sysfs group has
already been removed and emits a warning, "sysfs group... not found for
kobject 'end_device-X:0'".

Since bsg creation is a side effect of sas_rphy_add, move its
complementary removal call into sas_rphy_remove. This imposes the
following tear-down order for the devices above: D, B, C, A.

Note the sas_device and sas_end_device transport class devices (B and C
above) are created and destroyed both via the list match traversal in
attribute_container_device_trigger, so the order in which they are
handled is fixed. This is fine as long as they are deleted before their
parent device.

Signed-off-by: Joe Lawrence 
Cc: "James E.J. Bottomley" 
---
 drivers/scsi/scsi_transport_sas.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_sas.c 
b/drivers/scsi/scsi_transport_sas.c
index 1b681427dde0..c341f855fadc 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -1621,8 +1621,6 @@ void sas_rphy_free(struct sas_rphy *rphy)
list_del(&rphy->list);
mutex_unlock(&sas_host->lock);
 
-   sas_bsg_remove(shost, rphy);
-
transport_destroy_device(dev);
 
put_device(dev);
@@ -1681,6 +1679,7 @@ sas_rphy_remove(struct sas_rphy *rphy)
}
 
sas_rphy_unlink(rphy);
+   sas_bsg_remove(NULL, rphy);
transport_remove_device(dev);
device_del(dev);
 }
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: sysfs group not found for kobject on mpt2sas unload

2013-12-03 Thread Joe Lawrence
On Mon, 25 Nov 2013 11:23:39 -0500
Joe Lawrence  wrote:

> On Wed, 20 Nov 2013 14:08:40 -0500
> Joe Lawrence  wrote:
> 
> > Starting in 3.12, when loading and unloading the mpt2sas driver, I see
> > the following warning:
> > 
> > [ cut here ]
> > WARNING: CPU: 20 PID: 19096 at fs/sysfs/group.c:214 
> > sysfs_remove_group+0xc6/0xd0()
> > sysfs group 81ca2f40 not found for kobject 'end_device-30:0'
> > Modules linked in: mpt2sas(-) 
> > stap_edcc1781e2697fc53c3d320bc2530218_19063(OF) ebtable_nat osst 
> > nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat 
> > nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
> > iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 
> > nf_defrag_ipv4 xt_conntrack nf_conntrack bonding ebtable_filter ebtables 
> > ip6table_filter ip6_tables ixgbe igb x86_pkg_temp_thermal ptp coretemp 
> > pps_core joydev mdio crc32_pclmul crc32c_intel raid_class pcspkr 
> > ghash_clmulni_intel scsi_transport_sas ipmi_si dca ipmi_msghandler ntb 
> > uinput dm_round_robin sd_mod qla2xxx syscopyarea sysfillrect sysimgblt 
> > i2c_algo_bit drm_kms_helper ttm drm scsi_transport_fc scsi_tgt usb_storage 
> > i2c_core dm_multipath [last unloaded: 
> > stap_2da929a187c82c607a23237c27bf2d06_18803]
> > CPU: 20 PID: 19096 Comm: rmmod Tainted: GF   W  O 3.12.0+ #4
> > Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.2:52 
> > 04/09/2013
> >  0009 88083ad3bb88 8165265e 88083ad3bbd0
> >  88083ad3bbc0 8105514d  81ca2f40
> >  880851f46ef8 88007a73cf38 88083fb8c6e8 88083ad3bc20
> > Call Trace:
> >  [] dump_stack+0x4d/0x66
> >  [] warn_slowpath_common+0x7d/0xa0
> >  [] warn_slowpath_fmt+0x4c/0x50
> >  [] ? sysfs_get_dirent_ns+0x4e/0x70
> >  [] sysfs_remove_group+0xc6/0xd0
> >  [] dpm_sysfs_remove+0x43/0x50
> >  [] device_del+0x45/0x1c0
> >  [] device_unregister+0x1e/0x60
> >  [] bsg_unregister_queue+0x5e/0xa0
> >  [] sas_rphy_free+0x7a/0xb0 [scsi_transport_sas]
> >  [] sas_port_delete+0x35/0x160 [scsi_transport_sas]
> >  [] ? sysfs_remove_link+0x23/0x30
> >  [] mpt2sas_transport_port_remove+0x19a/0x1e0 [mpt2sas]
> >  [] _scsih_remove_device+0xb0/0x100 [mpt2sas]
> >  [] 
> > mpt2sas_device_remove_by_sas_address.part.54+0x59/0x80 [mpt2sas]
> >  [] _scsih_remove+0xf9/0x210 [mpt2sas]
> >  [] pci_device_remove+0x3b/0xb0
> >  [] __device_release_driver+0x7f/0xf0
> >  [] driver_detach+0xc0/0xd0
> >  [] bus_remove_driver+0x55/0xd0
> >  [] driver_unregister+0x2c/0x50
> >  [] pci_unregister_driver+0x23/0x80
> >  [] _scsih_exit+0x25/0x912 [mpt2sas]
> >  [] SyS_delete_module+0x16d/0x2d0
> >  [] ? do_page_fault+0xe/0x10
> >  [] system_call_fastpath+0x16/0x1b
> > ---[ end trace b4eef98870c871fd ]---
> > 
> > Instrumenting the module loading/unloading cycle with systemtap, it
> > reports the following sequence of events:
> > 
> > modprobe
> >   device_add(A)
> >   device_add(B child of A)
> >   device_add(C child of A)
> >   device_add(D child of A)
> > 
> > rmmod
> >   device_del(B child of A)
> >   device_del(C child of A)
> >   device_del(A)
> >   device_del(D child of A) << WARNING
> > 
> > The same sequence of device_add/del events occur in 3.11, but without
> > the warning.  Git bisect shows bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
> > "sysfs: make __sysfs_remove_dir() recursive" as the first bad commit.
> > 
> 
> FWIW, I applied Mika's patch that Tejun posted the other day, "sysfs:
> handle duplicate removal attempts in sysfs_remove_group()" [1] and the
> warning goes away on mpt2sas unload.
> 
> [1] 
> http://git.kernel.org/cgit/linux/kernel/git/tj/misc.git/commit/?h=review-sysfs-fixes&id=a69cc96d8c434c6cb64847f37caa890af705fc5c
> 
> -- Joe

Hi James,

(Correction, the sysfs change introducing the warning is 3.13-rc1+, not
3.12.)

Also, it seems that in 3.13-rc2, the patch from Mika that I had
referenced was reverted and that bug handled separately.

  81440e7 Revert "sysfs: handle duplicate removal attempts in 
sysfs_remove_group()"
  54d7114 sysfs: handle duplicate removal attempts in sysfs_remove_group()

So in 3.13-rc2, the warning persists when loading and unloading the
mpt2sas driver.

I looked at the code again, and I'm not sure why the sas_bsg_remove call
was placed in sas_rphy_free and not sas_rphy_remove.

The sas_rphy_remove function is tasked with undoing the actions of

Re: sysfs group not found for kobject on mpt2sas unload

2013-11-25 Thread Joe Lawrence
On Wed, 20 Nov 2013 14:08:40 -0500
Joe Lawrence  wrote:

> Starting in 3.12, when loading and unloading the mpt2sas driver, I see
> the following warning:
> 
> [ cut here ]
> WARNING: CPU: 20 PID: 19096 at fs/sysfs/group.c:214 
> sysfs_remove_group+0xc6/0xd0()
> sysfs group 81ca2f40 not found for kobject 'end_device-30:0'
> Modules linked in: mpt2sas(-) stap_edcc1781e2697fc53c3d320bc2530218_19063(OF) 
> ebtable_nat osst nf_conntrack_netbios_ns nf_conntrack_broadcast 
> ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT 
> nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat 
> iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
> bonding ebtable_filter ebtables ip6table_filter ip6_tables ixgbe igb 
> x86_pkg_temp_thermal ptp coretemp pps_core joydev mdio crc32_pclmul 
> crc32c_intel raid_class pcspkr ghash_clmulni_intel scsi_transport_sas ipmi_si 
> dca ipmi_msghandler ntb uinput dm_round_robin sd_mod qla2xxx syscopyarea 
> sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm scsi_transport_fc 
> scsi_tgt usb_storage i2c_core dm_multipath [last unloaded: 
> stap_2da929a187c82c607a23237c27bf2d06_18803]
> CPU: 20 PID: 19096 Comm: rmmod Tainted: GF   W  O 3.12.0+ #4
> Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.2:52 
> 04/09/2013
>  0009 88083ad3bb88 8165265e 88083ad3bbd0
>  88083ad3bbc0 8105514d  81ca2f40
>  880851f46ef8 88007a73cf38 88083fb8c6e8 88083ad3bc20
> Call Trace:
>  [] dump_stack+0x4d/0x66
>  [] warn_slowpath_common+0x7d/0xa0
>  [] warn_slowpath_fmt+0x4c/0x50
>  [] ? sysfs_get_dirent_ns+0x4e/0x70
>  [] sysfs_remove_group+0xc6/0xd0
>  [] dpm_sysfs_remove+0x43/0x50
>  [] device_del+0x45/0x1c0
>  [] device_unregister+0x1e/0x60
>  [] bsg_unregister_queue+0x5e/0xa0
>  [] sas_rphy_free+0x7a/0xb0 [scsi_transport_sas]
>  [] sas_port_delete+0x35/0x160 [scsi_transport_sas]
>  [] ? sysfs_remove_link+0x23/0x30
>  [] mpt2sas_transport_port_remove+0x19a/0x1e0 [mpt2sas]
>  [] _scsih_remove_device+0xb0/0x100 [mpt2sas]
>  [] mpt2sas_device_remove_by_sas_address.part.54+0x59/0x80 
> [mpt2sas]
>  [] _scsih_remove+0xf9/0x210 [mpt2sas]
>  [] pci_device_remove+0x3b/0xb0
>  [] __device_release_driver+0x7f/0xf0
>  [] driver_detach+0xc0/0xd0
>  [] bus_remove_driver+0x55/0xd0
>  [] driver_unregister+0x2c/0x50
>  [] pci_unregister_driver+0x23/0x80
>  [] _scsih_exit+0x25/0x912 [mpt2sas]
>  [] SyS_delete_module+0x16d/0x2d0
>  [] ? do_page_fault+0xe/0x10
>  [] system_call_fastpath+0x16/0x1b
> ---[ end trace b4eef98870c871fd ]---
> 
> Instrumenting the module loading/unloading cycle with systemtap, it
> reports the following sequence of events:
> 
> modprobe
>   device_add(A)
>   device_add(B child of A)
>   device_add(C child of A)
>   device_add(D child of A)
> 
> rmmod
>   device_del(B child of A)
>   device_del(C child of A)
>   device_del(A)
>   device_del(D child of A) << WARNING
> 
> The same sequence of device_add/del events occur in 3.11, but without
> the warning.  Git bisect shows bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
> "sysfs: make __sysfs_remove_dir() recursive" as the first bad commit.
> 

FWIW, I applied Mika's patch that Tejun posted the other day, "sysfs:
handle duplicate removal attempts in sysfs_remove_group()" [1] and the
warning goes away on mpt2sas unload.

[1] 
http://git.kernel.org/cgit/linux/kernel/git/tj/misc.git/commit/?h=review-sysfs-fixes&id=a69cc96d8c434c6cb64847f37caa890af705fc5c

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


sysfs group not found for kobject on mpt2sas unload

2013-11-20 Thread Joe Lawrence
Starting in 3.12, when loading and unloading the mpt2sas driver, I see
the following warning:

[ cut here ]
WARNING: CPU: 20 PID: 19096 at fs/sysfs/group.c:214 
sysfs_remove_group+0xc6/0xd0()
sysfs group 81ca2f40 not found for kobject 'end_device-30:0'
Modules linked in: mpt2sas(-) stap_edcc1781e2697fc53c3d320bc2530218_19063(OF) 
ebtable_nat osst nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE 
ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 
nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack bonding ebtable_filter ebtables 
ip6table_filter ip6_tables ixgbe igb x86_pkg_temp_thermal ptp coretemp pps_core 
joydev mdio crc32_pclmul crc32c_intel raid_class pcspkr ghash_clmulni_intel 
scsi_transport_sas ipmi_si dca ipmi_msghandler ntb uinput dm_round_robin sd_mod 
qla2xxx syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm 
scsi_transport_fc scsi_tgt usb_storage i2c_core dm_multipath [last unloaded: 
stap_2da929a187c82c607a23237c27bf2d06_18803]
CPU: 20 PID: 19096 Comm: rmmod Tainted: GF   W  O 3.12.0+ #4
Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.2:52 04/09/2013
 0009 88083ad3bb88 8165265e 88083ad3bbd0
 88083ad3bbc0 8105514d  81ca2f40
 880851f46ef8 88007a73cf38 88083fb8c6e8 88083ad3bc20
Call Trace:
 [] dump_stack+0x4d/0x66
 [] warn_slowpath_common+0x7d/0xa0
 [] warn_slowpath_fmt+0x4c/0x50
 [] ? sysfs_get_dirent_ns+0x4e/0x70
 [] sysfs_remove_group+0xc6/0xd0
 [] dpm_sysfs_remove+0x43/0x50
 [] device_del+0x45/0x1c0
 [] device_unregister+0x1e/0x60
 [] bsg_unregister_queue+0x5e/0xa0
 [] sas_rphy_free+0x7a/0xb0 [scsi_transport_sas]
 [] sas_port_delete+0x35/0x160 [scsi_transport_sas]
 [] ? sysfs_remove_link+0x23/0x30
 [] mpt2sas_transport_port_remove+0x19a/0x1e0 [mpt2sas]
 [] _scsih_remove_device+0xb0/0x100 [mpt2sas]
 [] mpt2sas_device_remove_by_sas_address.part.54+0x59/0x80 
[mpt2sas]
 [] _scsih_remove+0xf9/0x210 [mpt2sas]
 [] pci_device_remove+0x3b/0xb0
 [] __device_release_driver+0x7f/0xf0
 [] driver_detach+0xc0/0xd0
 [] bus_remove_driver+0x55/0xd0
 [] driver_unregister+0x2c/0x50
 [] pci_unregister_driver+0x23/0x80
 [] _scsih_exit+0x25/0x912 [mpt2sas]
 [] SyS_delete_module+0x16d/0x2d0
 [] ? do_page_fault+0xe/0x10
 [] system_call_fastpath+0x16/0x1b
---[ end trace b4eef98870c871fd ]---

Instrumenting the module loading/unloading cycle with systemtap, it
reports the following sequence of events:

modprobe
  device_add(A)
  device_add(B child of A)
  device_add(C child of A)
  device_add(D child of A)

rmmod
  device_del(B child of A)
  device_del(C child of A)
  device_del(A)
  device_del(D child of A) << WARNING

The same sequence of device_add/del events occur in 3.11, but without
the warning.  Git bisect shows bcdde7e221a8750f9b62b6d0bd31b72ea4ad9309
"sysfs: make __sysfs_remove_dir() recursive" as the first bad commit.

The systemtap script and its output (from mpt2sas calls only) below:

% stap -d scsi_transport_sas -d mpt2sas -e 'probe 
kernel.function("device_add"),kernel.function("device_del") { printf("%s 
dev=%p\n%s", pp(), $dev, $$parms$$); print_backtrace(); printf("\n"); }  probe 
begin { printf("loaded\n") }'

% modprobe mpt2sas

kernel.function("device_add@drivers/base/core.c:976") dev=0x88007a73cf38
dev={.parent=0x88007a73bf60, .p=0x0, .kobj={.name="end_device-30:0", 
.entry={.next=0x88007a73cf50, .prev=0x88007a73cf50}, .parent=0x0, 
.kset=0x88085f878848, .ktype=0x81ca2300, .sd=0x0, 
.kref={.refcount={.counter=3}}, .state_initialized=1, .state_in_sysfs=0, 
.state_add_uevent_sent=0, .state_remove_uevent_sent=0, .uevent_suppress=0}, 
.init_name="", .type=0x0, .mutex={.count={.counter=1}, 
.wait_lock={={.rlock={.raw_lock={={.head_tail=0, 
.tickets={.head='\000', .t 0x813e4760 : device_add+0x0/0x640 [kernel]
 0xa00dcd54 : sas_rphy_add+0x64/0x170 [scsi_transport_sas]
 0xa0235c73 : mpt2sas_transport_port_add+0x273/0xa90 [mpt2sas]
 0xa022aace : _scsih_scan_finished+0x20e/0x2c0 [mpt2sas]
 0x8141dd77 : do_scsi_scan_host+0x77/0xa0 [kernel]
 0x8141df6c : do_scan_async+0x1c/0x150 [kernel]
 0x8107d829 : async_run_entry_fn+0x39/0x120 [kernel]
 0x81070507 : process_one_work+0x177/0x460 [kernel]
 0x810711eb : worker_thread+0x11b/0x3a0 [kernel]
 0x81077822 : kthread+0xd2/0xf0 [kernel]
 0x81661d2c : ret_from_fork+0x7c/0xb0 [kernel]

kernel.function("device_add@drivers/base/core.c:976") dev=0x88007a73b4f8
dev={.parent=0x88007a73cf38, .p=0x0, .kobj={.name="end_device-30:0", 
.entry={.next=0x88007a73b510, .prev=0x88007a73b510}, .parent=0x0, 
.kset=0x88085f878848, .ktype=0x81ca2300, .sd=0x0, 
.kref={.refcount={.counter=2}}, .state_initialized=1, .state_in_sysfs=0, 
.state_add_uevent_sent=0, .state_remove_uevent_sen

Re: PATCH: st.c: Fix blk_get_queue usage

2013-11-15 Thread Joe Lawrence
On Fri, 15 Nov 2013 12:23:06 +0100
Bodo Stroesser  wrote:

> On 14.11.2013, at 22.50, Joe Lawrence  wrote:
> 
> > On Thu, 14 Nov 2013 20:22:40 +0100
> > Bodo Stroesser  wrote:
> > 
> > > On 14.11.2013, at 20.05, Kai M??kisara (Kolumbus) 
> > >  wrote:
> > > 
> > > > 
> > > > On 14.11.2013, at 16.48, Bodo Stroesser  
> > > > wrote:
> > > > 
> 
> < snip >
>  
> > > > 
> > > > With this patch, blk_get_queue() is not called with the correct 
> > > > argument.
> > > > Maybe change the call to blk_get_queue(SDp->request_queue) ?
> > > > 
> > > > >   if (blk_get_queue(disk->queue))
> > > 
> > > Yes, thank you. You are obviously right. Below is the revised patch.
> > > Sorry for the mistake.
> > > 
> > > Bodo
> > > 
> > > > >   goto out_put_disk;
> > > > > + disk->queue = SDp->request_queue;
> > > > >   tpnt->driver = &st_template;
> > > > > 
> > > > >   tpnt->device = SDp--
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" 
> > > > > in
> > > > > the body of a message to majord...@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > > ;
> > > > 
> > > > Thanks,
> > > > Kai
> > > > 
> > > 
> > > From: Bodo Stroesser 
> > > Date: Thu, 14 Nov 2013 14:35:00 +0100
> > > Subject: [PATCH] sg: fix blk_get_queue usage
> > > 
> > > If blk_queue_get() in st_probe fails, disk->queue must not
> > > be set to SDp->request_queue, as that would result in
> > > put_disk() dropping a not taken reference.
> > > 
> > > Thus, disk->queue should be set only after a successful
> > > blk_queue_get().
> > > Revised patch due to a hint from Kai Makisara.
> > > 
> > > Signed-off-by: Bodo Stroesser 
> > > 
> > > ---
> > > 
> > > --- a/drivers/scsi/st.c   2013-11-14 14:10:40.0 +0100
> > > +++ b/drivers/scsi/st.c   2013-11-14 14:10:57.0 +0100
> > > @@ -4107,11 +4107,11 @@
> > >   kref_init(&tpnt->kref);
> > >   tpnt->disk = disk;
> > >   disk->private_data = &tpnt->driver;
> > > - disk->queue = SDp->request_queue;
> > >   /* SCSI tape doesn't register this gendisk via add_disk().  Manually
> > >* take queue reference that release_disk() expects. */
> > > - if (blk_get_queue(disk->queue))
> > > + if (blk_get_queue(SDp->request_queue))
> > >   goto out_put_disk;
> > > + disk->queue = SDp->request_queue;
> > >   tpnt->driver = &st_template;
> > >  
> > >   tpnt->device = SDp;
> > >  ??{.n?+???+%??lzwm??b??r??zX??  ?(?? 
> > > }?z?&j:+v???zZ+??+zf???h???~i???z? ?w?&?)?? f
> > 
> > Hi Bodo,
> > 
> > Minor nit, wasn't blk_get_queue modified to return false on failure?
> > 
> > 09ac46c42946 "block: misc updates to blk_get_queue()"
> > 
> > Just curious what tree this patch was tested against.
> > 
> > Thanks,
> > 
> > -- Joe
> > 
> 
> Hi Joe,
> 
> we are intensively using SuSE SLES11 (currently kernel 3.0.80-x). At the 
> moment
> I'm not involved with mainline.
> 
> The patch resulted from the hunt for another problem and was accepted by 
> SuSE. They
> told me that this would be relevant for mainline also. So I posted it but 
> didn't
> check mainline for changes :-(
> 
> Thus, here is the third revision of the patch that now is suited for mainline.
> Sorry for the noise.
> 
> Thank you.
> 
> Bodo
> 
> ---
> 
> From: Bodo Stroesser 
> Date: Thu, 14 Nov 2013 14:35:00 +0100
> Subject: [PATCH] sg: fix blk_get_queue usage
> 
> If blk_queue_get() in st_probe fails, disk->queue must not
> be set to SDp->request_queue, as that would result in
> put_disk() dropping a not taken reference.
> 
> Thus, disk->queue should be set only after a successful
> blk_queue_get().
> 
> 2nd revised patch due to hints from Kai Makisara and Joe Lawrence.
> 
> Now this should be suited for mainline.
> 
> Signed-off-by: Bodo Stroesser 
> 
> ---
> 
> --- a/drivers/scsi/st.c   2013-11-15 11:58:39.

Re: PATCH: st.c: Fix blk_get_queue usage

2013-11-14 Thread Joe Lawrence
On Thu, 14 Nov 2013 20:22:40 +0100
Bodo Stroesser  wrote:

> On 14.11.2013, at 20.05, Kai M??kisara (Kolumbus)  
> wrote:
> 
> > 
> > On 14.11.2013, at 16.48, Bodo Stroesser  wrote:
> > 
> > > Hi,
> > > 
> > > in st_probe(), st.c I stumbled across what I'd call a minor problem.
> > > 
> > > So I'd like to suggest the following patch.
> > > 
> > > Best Regards,
> > > Bodo
> > > 
> > > P.S.: Please CC me, I'm not on the list.
> > > 
> > > -
> > > 
> > > 
> > > From: Bodo Stroesser 
> > > Date: Thu, 14 Nov 2013 14:35:00 +0100
> > > Subject: [PATCH] sg: fix blk_get_queue usage
> > > 
> > > If blk_queue_get() in st_probe fails, disk->queue must not
> > > be set to SDp->request_queue, as that would result in
> > > put_disk() dropping a not taken reference.
> > > 
> > > Thus, disk->queue should be set only after a successful
> > > blk_queue_get().
> > > 
> > > Signed-off-by: Bodo Stroesser 
> > > 
> > > ---
> > > 
> > > --- a/drivers/scsi/st.c   2013-11-14 14:10:40.0 +0100
> > > +++ b/drivers/scsi/st.c   2013-11-14 14:10:57.0 +0100
> > > @@ -4107,11 +4107,11 @@
> > >   kref_init(&tpnt->kref);
> > >   tpnt->disk = disk;
> > >   disk->private_data = &tpnt->driver;
> > > - disk->queue = SDp->request_queue;
> > >   /* SCSI tape doesn't register this gendisk via add_disk().  Manually
> > >* take queue reference that release_disk() expects. */
> > 
> > With this patch, blk_get_queue() is not called with the correct argument.
> > Maybe change the call to blk_get_queue(SDp->request_queue) ?
> > 
> > >   if (blk_get_queue(disk->queue))
> 
> Yes, thank you. You are obviously right. Below is the revised patch.
> Sorry for the mistake.
> 
> Bodo
> 
> > >   goto out_put_disk;
> > > + disk->queue = SDp->request_queue;
> > >   tpnt->driver = &st_template;
> > > 
> > >   tpnt->device = SDp--
> > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > > the body of a message to majord...@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > ;
> > 
> > Thanks,
> > Kai
> > 
> 
> From: Bodo Stroesser 
> Date: Thu, 14 Nov 2013 14:35:00 +0100
> Subject: [PATCH] sg: fix blk_get_queue usage
> 
> If blk_queue_get() in st_probe fails, disk->queue must not
> be set to SDp->request_queue, as that would result in
> put_disk() dropping a not taken reference.
> 
> Thus, disk->queue should be set only after a successful
> blk_queue_get().
> Revised patch due to a hint from Kai Makisara.
> 
> Signed-off-by: Bodo Stroesser 
> 
> ---
> 
> --- a/drivers/scsi/st.c   2013-11-14 14:10:40.0 +0100
> +++ b/drivers/scsi/st.c   2013-11-14 14:10:57.0 +0100
> @@ -4107,11 +4107,11 @@
>   kref_init(&tpnt->kref);
>   tpnt->disk = disk;
>   disk->private_data = &tpnt->driver;
> - disk->queue = SDp->request_queue;
>   /* SCSI tape doesn't register this gendisk via add_disk().  Manually
>* take queue reference that release_disk() expects. */
> - if (blk_get_queue(disk->queue))
> + if (blk_get_queue(SDp->request_queue))
>   goto out_put_disk;
> + disk->queue = SDp->request_queue;
>   tpnt->driver = &st_template;
>  
>   tpnt->device = SDp;
> ??{.n?+???+%??lzwm??b??r??zX???(??}?z?&j:+v???zZ+??+zf???h???~i???z??w?&?)??f

Hi Bodo,

Minor nit, wasn't blk_get_queue modified to return false on failure?

09ac46c42946 "block: misc updates to blk_get_queue()"

Just curious what tree this patch was tested against.

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >