Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-18 Thread Sreekanth Reddy
rphan what lies beneath it:
>> >
>> > [  955.119328] kobject: '6:0:0:0' (88074e2e1008): kobject_uevent_env
>> > [  955.133890] kobject: '6:0:0:0' (88074e2e1008): fill_kobj_path: path 
>> > = '/end_device-6:0:0/target6:0:0/6:0:0:0/bsg/6:0:0:0'
>> > 
>>
>> [Sreekanth] Calvin, Now with your patch we are observing same type of
>>  while unregistered the drive with scsi_transport layer
>> after calling scsi_remove_host() on the same driver unload path,
>>
>> kobject: 'end_device-2:0' (8800dad0c010): kobject_uevent_env
>> kobject: 'end_device-2:0' (8800dad0c010): fill_kobj_path: path =
>> '/host2/port-2:0/end_device-2:0/bsg/end_device-2:0'
>>
>> As scsi_remove_host() has freed it's parent
>> "/devices/pci:00/:00:07.0" we are observing these warnings.
>>
>> here is it's complete path
>>
>>  fill_kobj_path: path =
>> '/devices/pci:00/:00:07.0/host2/port-2:0/end_device-2:0/bsg/end_device-2:0',
>
> Could you post the stacktraces for your WARN/OOPS?
>
>> Until on  ~3.19/4.0 kernel all the thing are working fine, we haven't
>> seen these types of kernel warning. Also we haven't done any changes
>> in mpt3sas driver w.r.t to this after 4.0 kernel.
>
> I'll dig around more. In any case, my patch seems to be papering over
> the problem rather than solving it, so we need a different fix.
>
> Thanks,
> Calvin
>
>> >
>> > As the above from CONFIG_DEBUG_KOBJECT suggests, "end_device-6:0:0" no 
>> > longer
>> > has a parent. The path to the enclosured device is:
>> >
>> > /sys/devices/pci:00/:00:01.0/:01:00.0/host6/port-6:0/expander-6:0/port-6:0:15/end_device-6:0:15/target6:0:15/6:0:15:0/enclosure/6:0:15:0/ArrayDevice09
>> >
>> > Enclosure wants to re-add the device outside the enclosure:
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/misc/enclosure.c#n404
>> >
>> > We melt when trying to re-add "ArrayDevice09" to the "6:0:15:0" directory
>> > above the "enclosure" directory best I can tell. This is *incredibly*
>> > confusing because kobj_add_internal() actually clobbers the parent pointer
>> > if it's NULL:
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/kobject.c#n216
>> >
>> > So even though it appears that the parent was "6:0:15:0" here, it was
>> > actually NULL when passed into kobj_add_internal(). Thus, we try to do
>> > the add with the kset as our parent, and we get -ENOENT because that
>> > node has been marked as inactive in kernfs as per here:
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/kernfs/dir.c#n743
>> >
>> > Why is it inactive? I think, because __kernfs_remove() deleted an ancestor:
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/kernfs/dir.c#n1243
>> >
>> > ...which brings us full circle.
>> >
>> >> Doesn't this need to be taken care in enclosure_removal?
>> >
>> > Even with enclosure compiled out, I still get a WARN like the following
>> > for every block device being removed at rmmod (-dirty here is just from me
>> > applying a fix to SES I sent to the list a few days ago):
>> >
>> > [  224.057286] [ cut here ]
>> > [  224.067696] WARNING: CPU: 1 PID: 5412 at fs/sysfs/group.c:237 
>> > sysfs_remove_group+0x10f/0x150
>> > [  224.088007] sysfs group 83109ba0 not found for kobject 'sdc'
>> > [  224.102283] Modules linked in: mpt3sas(-) 
>> > [  224.200483] CPU: 0 PID: 5412 Comm: rmmod Not tainted 4.6.0-dirty #4
>> > [  224.219797] Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS 
>> > HBM6.71 02/03/2016
>> > [  224.237001]  ffff82af6280 8806e826f8b0 ffff81e13e54 
>> > 8806e826f928
>> > [  224.253811]   8806e826f8f8 81127822 
>> > 831a5968
>> > [  224.270647]  880600ed ed00dd04df21  
>> > 880754ae3078
>> > [  224.287462] Call Trace:
>> > [  224.292953]  [] dump_stack+0x68/0x94
>> > [  224.304433]  [] __warn+0x172/0x1b0
>> > [  224.315514]  [] warn_slowpath_fmt+0x97/0xb0
>> > [  224.351845]  [] sysfs_remove_group+0x10f/0x150
>> > [  224.36525

Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-18 Thread Calvin Owens
t; [  224.200483] CPU: 0 PID: 5412 Comm: rmmod Not tainted 4.6.0-dirty #4
> > [  224.219797] Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS 
> > HBM6.71 02/03/2016
> > [  224.237001]  82af6280 8806e826f8b0 81e13e54 
> > 8806e826f928
> > [  224.253811]   8806e826f8f8 81127822 
> > 831a5968
> > [  224.270647]  880600ed ed00dd04df21  
> > 880754ae3078
> > [  224.287462] Call Trace:
> > [  224.292953]  [] dump_stack+0x68/0x94
> > [  224.304433]  [] __warn+0x172/0x1b0
> > [  224.315514]  [] warn_slowpath_fmt+0x97/0xb0
> > [  224.351845]  [] sysfs_remove_group+0x10f/0x150
> > [  224.365252]  [] blk_trace_remove_sysfs+0x14/0x20
> > [  224.379021]  [] blk_unregister_queue+0xcf/0x120
> > [  224.392589]  [] del_gendisk+0x26e/0x600
> > [  224.432345]  [] sd_remove+0xf5/0x1b0
> > [  224.443808]  [] __device_release_driver+0x160/0x3a0
> > [  224.458151]  [] device_release_driver+0x25/0x40
> > [  224.471734]  [] bus_remove_device+0x2e1/0x4b0
> > [  224.484926]  [] device_del+0x37d/0x680
> > [  224.555423]  [] __scsi_remove_device+0x1e5/0x250
> > [  224.569196]  [] scsi_forget_host+0x12c/0x1e0
> > [  224.582197]  [] scsi_remove_host+0x10c/0x300
> > [  224.595227]  [] scsih_remove+0x339/0x670 [mpt3sas]
> > [  224.609383]  [] pci_device_remove+0x70/0x110
> > [  224.622379]  [] __device_release_driver+0x160/0x3a0
> > [  224.636724]  [] driver_detach+0x183/0x200
> > [  224.649166]  [] bus_remove_driver+0xdf/0x200
> > [  224.662185]  [] driver_unregister+0x67/0xa0
> > [  224.675009]  [] pci_unregister_driver+0x1e/0xe0
> > [  224.688613]  [] _mpt3sas_exit+0x23/0x5e9 [mpt3sas]
> > [  224.702782]  [] SyS_delete_module+0x2eb/0x390
> > [  224.743359]  [] entry_SYSCALL_64_fastpath+0x18/0xa8
> > [  224.759036] ---[ end trace 13f1919bf2ec7bb0 ]---
> >
> > My patch fixes that too :)
> >
> > The only driver besides mpt*sas that calls sas_delete_port() explicitly is
> > HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host()
> > first.
> >
> > Thanks,
> > Calvin
> >
> >> -Original Message-
> >> From: Calvin Owens [mailto:calvinow...@fb.com]
> >> Sent: Monday, May 16, 2016 2:25 PM
> >> To: Elliott, Robert (Persistent Memory)
> >> Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
> >> Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com;
> >> linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> kernel-t...@fb.com; calvinow...@fb.com
> >> Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
> >> objects
> >>
> >> On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
> >> >
> >> >
> >> > > -Original Message-
> >> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> >> > > ow...@vger.kernel.org] On Behalf Of Calvin Owens
> >> > > Sent: Friday, May 13, 2016 3:28 PM
> >> > ...
> >> > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS
> >> > > PHY objects
> >> > >
> >> > ...
> >> > > The issue is that enclosure_remove_device() expects to be able to
> >> > > re-add the device that was previously enclosured: so with SES
> >> > > running, the order we unwind things matters in a way it otherwise
> >> > > wouldn't.
> >> > >
> >> > > Since mpt3sas deletes the SAS objects before the SCSI hosts,
> >> > > enclosure ends up trying to re-parent the SCSI device from the
> >> > > enclosure to the SAS PHY which has already been deleted. This obviously
> >> > > ends in sadness.
> >> > >
> >> > > The fix appears to be simple: just call scsi_remove_host() before we
> >> > > call
> >> > > sas_port_delete() and/or sas_remove_host().
> >> > >
> >> > ...
> >> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> >> > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
> >> > >   _scsih_raid_device_remove(ioc, raid_device);
> >> > >   }
> >> > >
> >> > > + scsi_remove_host(shost);
> >> > > +
> >> > >   /* free ports attached to the sas_host */
> >> > >   list_for_each_entry_safe(mpt3sas_port, next_port,
> >> > >  &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@
> >> > > void scsih_remove(struct pci_dev *pdev)
> >> > >   }
> >> > >
> >> > >   sas_remove_host(shost);
> >> > > - scsi_remove_host(shost);
> >> > >   mpt3sas_base_detach(ioc);
> >> > >   spin_lock(&gioc_lock);
> >> > >   list_del(&ioc->list);
> >> >
> >> > That change matches the pattern of all other drivers calling
> >> > sas_remove_host, except for one:
> >> > drivers/message/fusion/mptsas.c
> >> >
> >> > That consensus means this comment in drivers/scsi/scsi_transport_sas.c
> >> > is wrong:
> >> >
> >> > /**
> >> >  * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
> >> >  * @shost:  Scsi Host that is torn down
> >> >  *
> >> >  * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
> >> >  * Must be called just before scsi_remove_host for SAS HBAs.
> >> >  */
> >>
> >> Yes, exactly: that comment appears to be backwards, and as you say most
> >> drivers appear to do the opposite (I looked at HPSA at least, which calls
> >> sas_port_delete() before scsi_remove_host()).
> >>
> >> I suppose I should send a patch to fix the comment as well? It'd be nice to
> >> get some testing to be certain I'm not breaking somebody else's setup by
> >> fixing mine though...
> >>
> >> Thanks,
> >> Calvin


Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-18 Thread Sreekanth Reddy
 [] blk_trace_remove_sysfs+0x14/0x20
> [  224.379021]  [] blk_unregister_queue+0xcf/0x120
> [  224.392589]  [] del_gendisk+0x26e/0x600
> [  224.432345]  [] sd_remove+0xf5/0x1b0
> [  224.443808]  [] __device_release_driver+0x160/0x3a0
> [  224.458151]  [] device_release_driver+0x25/0x40
> [  224.471734]  [] bus_remove_device+0x2e1/0x4b0
> [  224.484926]  [] device_del+0x37d/0x680
> [  224.555423]  [] __scsi_remove_device+0x1e5/0x250
> [  224.569196]  [] scsi_forget_host+0x12c/0x1e0
> [  224.582197]  [] scsi_remove_host+0x10c/0x300
> [  224.595227]  [] scsih_remove+0x339/0x670 [mpt3sas]
> [  224.609383]  [] pci_device_remove+0x70/0x110
> [  224.622379]  [] __device_release_driver+0x160/0x3a0
> [  224.636724]  [] driver_detach+0x183/0x200
> [  224.649166]  [] bus_remove_driver+0xdf/0x200
> [  224.662185]  [] driver_unregister+0x67/0xa0
> [  224.675009]  [] pci_unregister_driver+0x1e/0xe0
> [  224.688613]  [] _mpt3sas_exit+0x23/0x5e9 [mpt3sas]
> [  224.702782]  [] SyS_delete_module+0x2eb/0x390
> [  224.743359]  [] entry_SYSCALL_64_fastpath+0x18/0xa8
> [  224.759036] ---[ end trace 13f1919bf2ec7bb0 ]---
>
> My patch fixes that too :)
>
> The only driver besides mpt*sas that calls sas_delete_port() explicitly is
> HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host()
> first.
>
> Thanks,
> Calvin
>
>> -Original Message-
>> From: Calvin Owens [mailto:calvinow...@fb.com]
>> Sent: Monday, May 16, 2016 2:25 PM
>> To: Elliott, Robert (Persistent Memory)
>> Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
>> Bottomley; Martin K. Petersen; mpt-fusionlinux....@broadcom.com;
>> linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
>> kernel-t...@fb.com; calvinow...@fb.com
>> Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
>> objects
>>
>> On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
>> >
>> >
>> > > -Original Message-
>> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
>> > > ow...@vger.kernel.org] On Behalf Of Calvin Owens
>> > > Sent: Friday, May 13, 2016 3:28 PM
>> > ...
>> > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS
>> > > PHY objects
>> > >
>> > ...
>> > > The issue is that enclosure_remove_device() expects to be able to
>> > > re-add the device that was previously enclosured: so with SES
>> > > running, the order we unwind things matters in a way it otherwise
>> > > wouldn't.
>> > >
>> > > Since mpt3sas deletes the SAS objects before the SCSI hosts,
>> > > enclosure ends up trying to re-parent the SCSI device from the
>> > > enclosure to the SAS PHY which has already been deleted. This obviously
>> > > ends in sadness.
>> > >
>> > > The fix appears to be simple: just call scsi_remove_host() before we
>> > > call
>> > > sas_port_delete() and/or sas_remove_host().
>> > >
>> > ...
>> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
>> > >   _scsih_raid_device_remove(ioc, raid_device);
>> > >   }
>> > >
>> > > + scsi_remove_host(shost);
>> > > +
>> > >   /* free ports attached to the sas_host */
>> > >   list_for_each_entry_safe(mpt3sas_port, next_port,
>> > >  &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@
>> > > void scsih_remove(struct pci_dev *pdev)
>> > >   }
>> > >
>> > >   sas_remove_host(shost);
>> > > - scsi_remove_host(shost);
>> > >   mpt3sas_base_detach(ioc);
>> > >   spin_lock(&gioc_lock);
>> > >   list_del(&ioc->list);
>> >
>> > That change matches the pattern of all other drivers calling
>> > sas_remove_host, except for one:
>> > drivers/message/fusion/mptsas.c
>> >
>> > That consensus means this comment in drivers/scsi/scsi_transport_sas.c
>> > is wrong:
>> >
>> > /**
>> >  * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
>> >  * @shost:  Scsi Host that is torn down
>> >  *
>> >  * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
>> >  * Must be called just before scsi_remove_host for SAS HBAs.
>> >  */
>>
>> Yes, exactly: that comment appears to be backwards, and as you say most
>> drivers appear to do the opposite (I looked at HPSA at least, which calls
>> sas_port_delete() before scsi_remove_host()).
>>
>> I suppose I should send a patch to fix the comment as well? It'd be nice to
>> get some testing to be certain I'm not breaking somebody else's setup by
>> fixing mine though...
>>
>> Thanks,
>> Calvin


Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-16 Thread Calvin Owens
gt; From: Calvin Owens [mailto:calvinow...@fb.com]
> Sent: Monday, May 16, 2016 2:25 PM
> To: Elliott, Robert (Persistent Memory)
> Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
> Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com;
> linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
> kernel-t...@fb.com; calvinow...@fb.com
> Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
> objects
> 
> On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
> >
> >
> > > -Original Message-
> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > ow...@vger.kernel.org] On Behalf Of Calvin Owens
> > > Sent: Friday, May 13, 2016 3:28 PM
> > ...
> > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS
> > > PHY objects
> > >
> > ...
> > > The issue is that enclosure_remove_device() expects to be able to
> > > re-add the device that was previously enclosured: so with SES
> > > running, the order we unwind things matters in a way it otherwise
> > > wouldn't.
> > >
> > > Since mpt3sas deletes the SAS objects before the SCSI hosts,
> > > enclosure ends up trying to re-parent the SCSI device from the
> > > enclosure to the SAS PHY which has already been deleted. This obviously
> > > ends in sadness.
> > >
> > > The fix appears to be simple: just call scsi_remove_host() before we
> > > call
> > > sas_port_delete() and/or sas_remove_host().
> > >
> > ...
> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
> > >   _scsih_raid_device_remove(ioc, raid_device);
> > >   }
> > >
> > > + scsi_remove_host(shost);
> > > +
> > >   /* free ports attached to the sas_host */
> > >   list_for_each_entry_safe(mpt3sas_port, next_port,
> > >  &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@
> > > void scsih_remove(struct pci_dev *pdev)
> > >   }
> > >
> > >   sas_remove_host(shost);
> > > - scsi_remove_host(shost);
> > >   mpt3sas_base_detach(ioc);
> > >   spin_lock(&gioc_lock);
> > >   list_del(&ioc->list);
> >
> > That change matches the pattern of all other drivers calling
> > sas_remove_host, except for one:
> > drivers/message/fusion/mptsas.c
> >
> > That consensus means this comment in drivers/scsi/scsi_transport_sas.c
> > is wrong:
> >
> > /**
> >  * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
> >  * @shost:  Scsi Host that is torn down
> >  *
> >  * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
> >  * Must be called just before scsi_remove_host for SAS HBAs.
> >  */
> 
> Yes, exactly: that comment appears to be backwards, and as you say most
> drivers appear to do the opposite (I looked at HPSA at least, which calls
> sas_port_delete() before scsi_remove_host()).
> 
> I suppose I should send a patch to fix the comment as well? It'd be nice to
> get some testing to be certain I'm not breaking somebody else's setup by
> fixing mine though...
> 
> Thanks,
> Calvin


RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-16 Thread Sathya Prakash Veerichetty
Our understanding is the relationship between the SCSI host and SAS end
devices is a parent-child and before ripping of SCSI host we need to rip of
all the children. Why the enclosure ends up trying to re-parent the SCSI
device from the enclosure to the SAS PHY even after we remove the SAS Phy?.
Doesn't this need to be taken care in enclosure_removal?

-Original Message-
From: Calvin Owens [mailto:calvinow...@fb.com]
Sent: Monday, May 16, 2016 2:25 PM
To: Elliott, Robert (Persistent Memory)
Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J.
Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com;
linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
kernel-t...@fb.com; calvinow...@fb.com
Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
objects

On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
>
>
> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > ow...@vger.kernel.org] On Behalf Of Calvin Owens
> > Sent: Friday, May 13, 2016 3:28 PM
> ...
> > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS
> > PHY objects
> >
> ...
> > The issue is that enclosure_remove_device() expects to be able to
> > re-add the device that was previously enclosured: so with SES
> > running, the order we unwind things matters in a way it otherwise
> > wouldn't.
> >
> > Since mpt3sas deletes the SAS objects before the SCSI hosts,
> > enclosure ends up trying to re-parent the SCSI device from the
> > enclosure to the SAS PHY which has already been deleted. This obviously
> > ends in sadness.
> >
> > The fix appears to be simple: just call scsi_remove_host() before we
> > call
> > sas_port_delete() and/or sas_remove_host().
> >
> ...
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
> > _scsih_raid_device_remove(ioc, raid_device);
> > }
> >
> > +   scsi_remove_host(shost);
> > +
> > /* free ports attached to the sas_host */
> > list_for_each_entry_safe(mpt3sas_port, next_port,
> >&ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@
> > void scsih_remove(struct pci_dev *pdev)
> > }
> >
> > sas_remove_host(shost);
> > -   scsi_remove_host(shost);
> > mpt3sas_base_detach(ioc);
> > spin_lock(&gioc_lock);
> > list_del(&ioc->list);
>
> That change matches the pattern of all other drivers calling
> sas_remove_host, except for one:
>   drivers/message/fusion/mptsas.c
>
> That consensus means this comment in drivers/scsi/scsi_transport_sas.c
> is wrong:
>
> /**
>  * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
>  * @shost:  Scsi Host that is torn down
>  *
>  * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
>  * Must be called just before scsi_remove_host for SAS HBAs.
>  */

Yes, exactly: that comment appears to be backwards, and as you say most
drivers appear to do the opposite (I looked at HPSA at least, which calls
sas_port_delete() before scsi_remove_host()).

I suppose I should send a patch to fix the comment as well? It'd be nice to
get some testing to be certain I'm not breaking somebody else's setup by
fixing mine though...

Thanks,
Calvin


Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-16 Thread Calvin Owens
On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
> 
> 
> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > ow...@vger.kernel.org] On Behalf Of Calvin Owens
> > Sent: Friday, May 13, 2016 3:28 PM
> ...
> > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
> > objects
> > 
> ...
> > The issue is that enclosure_remove_device() expects to be able to re-add
> > the device that was previously enclosured: so with SES running, the order
> > we unwind things matters in a way it otherwise wouldn't.
> > 
> > Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure
> > ends up trying to re-parent the SCSI device from the enclosure to the SAS
> > PHY which has already been deleted. This obviously ends in sadness.
> > 
> > The fix appears to be simple: just call scsi_remove_host() before we call
> > sas_port_delete() and/or sas_remove_host().
> > 
> ...
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
> > _scsih_raid_device_remove(ioc, raid_device);
> > }
> > 
> > +   scsi_remove_host(shost);
> > +
> > /* free ports attached to the sas_host */
> > list_for_each_entry_safe(mpt3sas_port, next_port,
> >&ioc->sas_hba.sas_port_list, port_list) {
> > @@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev)
> > }
> > 
> > sas_remove_host(shost);
> > -   scsi_remove_host(shost);
> > mpt3sas_base_detach(ioc);
> > spin_lock(&gioc_lock);
> > list_del(&ioc->list);
> 
> That change matches the pattern of all other drivers calling
> sas_remove_host, except for one:
>   drivers/message/fusion/mptsas.c
> 
> That consensus means this comment in drivers/scsi/scsi_transport_sas.c
> is wrong:
> 
> /**
>  * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
>  * @shost:  Scsi Host that is torn down
>  *
>  * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
>  * Must be called just before scsi_remove_host for SAS HBAs.
>  */

Yes, exactly: that comment appears to be backwards, and as you say most
drivers appear to do the opposite (I looked at HPSA at least, which calls
sas_port_delete() before scsi_remove_host()).

I suppose I should send a patch to fix the comment as well? It'd be nice
to get some testing to be certain I'm not breaking somebody else's setup
by fixing mine though...

Thanks,
Calvin


RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-13 Thread Elliott, Robert (Persistent Memory)


> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Calvin Owens
> Sent: Friday, May 13, 2016 3:28 PM
...
> Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY
> objects
> 
...
> The issue is that enclosure_remove_device() expects to be able to re-add
> the device that was previously enclosured: so with SES running, the order
> we unwind things matters in a way it otherwise wouldn't.
> 
> Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure
> ends up trying to re-parent the SCSI device from the enclosure to the SAS
> PHY which has already been deleted. This obviously ends in sadness.
> 
> The fix appears to be simple: just call scsi_remove_host() before we call
> sas_port_delete() and/or sas_remove_host().
> 
...
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
>   _scsih_raid_device_remove(ioc, raid_device);
>   }
> 
> + scsi_remove_host(shost);
> +
>   /* free ports attached to the sas_host */
>   list_for_each_entry_safe(mpt3sas_port, next_port,
>  &ioc->sas_hba.sas_port_list, port_list) {
> @@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev)
>   }
> 
>   sas_remove_host(shost);
> - scsi_remove_host(shost);
>   mpt3sas_base_detach(ioc);
>   spin_lock(&gioc_lock);
>   list_del(&ioc->list);

That change matches the pattern of all other drivers calling
sas_remove_host, except for one:
drivers/message/fusion/mptsas.c

That consensus means this comment in drivers/scsi/scsi_transport_sas.c
is wrong:

/**
 * sas_remove_host  -  tear down a Scsi_Host's SAS data structures
 * @shost:  Scsi Host that is torn down
 *
 * Removes all SAS PHYs and remote PHYs for a given Scsi_Host.
 * Must be called just before scsi_remove_host for SAS HBAs.
 */




[PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-13 Thread Calvin Owens
On the hardware I'm testing on, simply removing the mpt3sas module
triggers a litany of WARNs culminating in an OOPS:

  [ cut here ]
  WARNING: CPU: 5 PID: 13348 at lib/kobject.c:244 
kobject_add_internal+0x359/0x8a0
  kobject_add_internal failed for ArrayDevice09 (error: -2 parent: 6:0:15:0)
  CPU: 5 PID: 13348 Comm: rmmod Not tainted 
4.6.0-rc2-mpt3sas-debug-1-g7e7e6f4 #2
  Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS HBP6.6 11/20/2015
   82b88ec0 8806ce76f820 81deff03 8806ce76f898
    8806ce76f868 811191e2 880749819af8
   00f4 ed00d9cedf0f 880749819b08 88074c9b8168
  Call Trace:
   [] dump_stack+0x67/0x94
   [] __warn+0x172/0x1b0
   [] warn_slowpath_fmt+0x97/0xb0
   [] kobject_add_internal+0x359/0x8a0
   [] kobject_add+0x10e/0x1c0
   [] device_add+0x30a/0x1490
   [] enclosure_remove_device+0x172/0x1cc [enclosure]
   [] ses_intf_remove+0x1c4/0x270 [ses]
   [] device_del+0x2ab/0x680
   [] device_unregister+0x12/0x30
   [] __scsi_remove_device+0x1d5/0x250
   [] scsi_forget_host+0x12c/0x1e0
   [] scsi_remove_host+0x10c/0x300
   [] scsih_remove+0x321/0x680 [mpt3sas]
   [] pci_device_remove+0x70/0x110
   [] __device_release_driver+0x160/0x3a0
   [] driver_detach+0x183/0x200
   [] bus_remove_driver+0xdf/0x200
   [] driver_unregister+0x67/0xa0
   [] pci_unregister_driver+0x1e/0xe0
   [] _mpt3sas_exit+0x23/0x219 [mpt3sas]
   [] SyS_delete_module+0x2ee/0x390
   [] entry_SYSCALL_64_fastpath+0x18/0xa8
  ---[ end trace fe163024b624f4af ]---

  general protection fault:  [#1] SMP KASAN
  CPU: 6 PID: 17388 Comm: rmmod Tainted: GW   
4.6.0-rc2-1-g7e7e6f4 #1
  Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS HBP6.6 11/20/2015
  task: 880753731740 ti: 8806896f task.ti: 8806896f
  RIP: 0010:[]  [] klist_put+0x1f/0x160
  RSP: 0018:8806896f7a10  EFLAGS: 00010202
  RAX: dc00 RBX:  RCX: 880753731f48
  RDX: 000b RSI: 0001 RDI: 0058
  RBP: 8806896f7a30 R08: 0006 R09: 
  R10:  R11:  R12: 880752bec000
  R13: 0001 R14: dc00 R15: 880752bec2b0
  FS:  7feef8ea6700() GS:88075ef8() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 7fe45044d020 CR3: 0006f7092000 CR4: 001006e0
  Stack:
    880752bec000  dc00
   8806896f7a40 8290987e 8806896f7b00 820aa3bd
   8806896f7aa0 1100d12def4f 880752bec390 829183ca
  Call Trace:
   [] klist_del+0xe/0x10
   [] device_del+0x12d/0x680
   [] device_unregister+0x12/0x30
   [] enclosure_unregister+0xe0/0x170 [enclosure]
   [] ses_intf_remove+0x190/0x270 [ses]
   [] device_del+0x2ab/0x680
   [] device_unregister+0x12/0x30
   [] __scsi_remove_device+0x1d5/0x250
   [] scsi_forget_host+0x12c/0x1e0
   [] scsi_remove_host+0x10c/0x300
   [] scsih_remove+0x321/0x680 [mpt3sas]
   [] pci_device_remove+0x70/0x110
   [] __device_release_driver+0x160/0x3a0
   [] driver_detach+0x183/0x200
   [] bus_remove_driver+0xdf/0x200
   [] driver_unregister+0x67/0xa0
   [] pci_unregister_driver+0x1e/0xe0
   [] _mpt3sas_exit+0x23/0xa89 [mpt3sas]
   [] SyS_delete_module+0x2ee/0x390
   [] entry_SYSCALL_64_fastpath+0x18/0xa8

The issue is that enclosure_remove_device() expects to be able to re-add
the device that was previously enclosured: so with SES running, the order
we unwind things matters in a way it otherwise wouldn't.

Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure
ends up trying to re-parent the SCSI device from the enclosure to the SAS
PHY which has already been deleted. This obviously ends in sadness.

The fix appears to be simple: just call scsi_remove_host() before we call
sas_port_delete() and/or sas_remove_host().

Signed-off-by: Calvin Owens 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e0e4920..4aa128a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev)
_scsih_raid_device_remove(ioc, raid_device);
}
 
+   scsi_remove_host(shost);
+
/* free ports attached to the sas_host */
list_for_each_entry_safe(mpt3sas_port, next_port,
   &ioc->sas_hba.sas_port_list, port_list) {
@@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev)
}
 
sas_remove_host(shost);
-   scsi_remove_host(shost);
mpt3sas_base_detach(ioc);
spin_lock(&gioc_lock);
list_del(&ioc->list);
-- 
2.8.0.rc2