Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
rphan what lies beneath it: >> > >> > [ 955.119328] kobject: '6:0:0:0' (88074e2e1008): kobject_uevent_env >> > [ 955.133890] kobject: '6:0:0:0' (88074e2e1008): fill_kobj_path: path >> > = '/end_device-6:0:0/target6:0:0/6:0:0:0/bsg/6:0:0:0' >> > >> >> [Sreekanth] Calvin, Now with your patch we are observing same type of >> while unregistered the drive with scsi_transport layer >> after calling scsi_remove_host() on the same driver unload path, >> >> kobject: 'end_device-2:0' (8800dad0c010): kobject_uevent_env >> kobject: 'end_device-2:0' (8800dad0c010): fill_kobj_path: path = >> '/host2/port-2:0/end_device-2:0/bsg/end_device-2:0' >> >> As scsi_remove_host() has freed it's parent >> "/devices/pci:00/:00:07.0" we are observing these warnings. >> >> here is it's complete path >> >> fill_kobj_path: path = >> '/devices/pci:00/:00:07.0/host2/port-2:0/end_device-2:0/bsg/end_device-2:0', > > Could you post the stacktraces for your WARN/OOPS? > >> Until on ~3.19/4.0 kernel all the thing are working fine, we haven't >> seen these types of kernel warning. Also we haven't done any changes >> in mpt3sas driver w.r.t to this after 4.0 kernel. > > I'll dig around more. In any case, my patch seems to be papering over > the problem rather than solving it, so we need a different fix. > > Thanks, > Calvin > >> > >> > As the above from CONFIG_DEBUG_KOBJECT suggests, "end_device-6:0:0" no >> > longer >> > has a parent. The path to the enclosured device is: >> > >> > /sys/devices/pci:00/:00:01.0/:01:00.0/host6/port-6:0/expander-6:0/port-6:0:15/end_device-6:0:15/target6:0:15/6:0:15:0/enclosure/6:0:15:0/ArrayDevice09 >> > >> > Enclosure wants to re-add the device outside the enclosure: >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/misc/enclosure.c#n404 >> > >> > We melt when trying to re-add "ArrayDevice09" to the "6:0:15:0" directory >> > above the "enclosure" directory best I can tell. This is *incredibly* >> > confusing because kobj_add_internal() actually clobbers the parent pointer >> > if it's NULL: >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/kobject.c#n216 >> > >> > So even though it appears that the parent was "6:0:15:0" here, it was >> > actually NULL when passed into kobj_add_internal(). Thus, we try to do >> > the add with the kset as our parent, and we get -ENOENT because that >> > node has been marked as inactive in kernfs as per here: >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/kernfs/dir.c#n743 >> > >> > Why is it inactive? I think, because __kernfs_remove() deleted an ancestor: >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/kernfs/dir.c#n1243 >> > >> > ...which brings us full circle. >> > >> >> Doesn't this need to be taken care in enclosure_removal? >> > >> > Even with enclosure compiled out, I still get a WARN like the following >> > for every block device being removed at rmmod (-dirty here is just from me >> > applying a fix to SES I sent to the list a few days ago): >> > >> > [ 224.057286] [ cut here ] >> > [ 224.067696] WARNING: CPU: 1 PID: 5412 at fs/sysfs/group.c:237 >> > sysfs_remove_group+0x10f/0x150 >> > [ 224.088007] sysfs group 83109ba0 not found for kobject 'sdc' >> > [ 224.102283] Modules linked in: mpt3sas(-) >> > [ 224.200483] CPU: 0 PID: 5412 Comm: rmmod Not tainted 4.6.0-dirty #4 >> > [ 224.219797] Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS >> > HBM6.71 02/03/2016 >> > [ 224.237001] ffff82af6280 8806e826f8b0 ffff81e13e54 >> > 8806e826f928 >> > [ 224.253811] 8806e826f8f8 81127822 >> > 831a5968 >> > [ 224.270647] 880600ed ed00dd04df21 >> > 880754ae3078 >> > [ 224.287462] Call Trace: >> > [ 224.292953] [] dump_stack+0x68/0x94 >> > [ 224.304433] [] __warn+0x172/0x1b0 >> > [ 224.315514] [] warn_slowpath_fmt+0x97/0xb0 >> > [ 224.351845] [] sysfs_remove_group+0x10f/0x150 >> > [ 224.36525
Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
t; [ 224.200483] CPU: 0 PID: 5412 Comm: rmmod Not tainted 4.6.0-dirty #4 > > [ 224.219797] Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS > > HBM6.71 02/03/2016 > > [ 224.237001] 82af6280 8806e826f8b0 81e13e54 > > 8806e826f928 > > [ 224.253811] 8806e826f8f8 81127822 > > 831a5968 > > [ 224.270647] 880600ed ed00dd04df21 > > 880754ae3078 > > [ 224.287462] Call Trace: > > [ 224.292953] [] dump_stack+0x68/0x94 > > [ 224.304433] [] __warn+0x172/0x1b0 > > [ 224.315514] [] warn_slowpath_fmt+0x97/0xb0 > > [ 224.351845] [] sysfs_remove_group+0x10f/0x150 > > [ 224.365252] [] blk_trace_remove_sysfs+0x14/0x20 > > [ 224.379021] [] blk_unregister_queue+0xcf/0x120 > > [ 224.392589] [] del_gendisk+0x26e/0x600 > > [ 224.432345] [] sd_remove+0xf5/0x1b0 > > [ 224.443808] [] __device_release_driver+0x160/0x3a0 > > [ 224.458151] [] device_release_driver+0x25/0x40 > > [ 224.471734] [] bus_remove_device+0x2e1/0x4b0 > > [ 224.484926] [] device_del+0x37d/0x680 > > [ 224.555423] [] __scsi_remove_device+0x1e5/0x250 > > [ 224.569196] [] scsi_forget_host+0x12c/0x1e0 > > [ 224.582197] [] scsi_remove_host+0x10c/0x300 > > [ 224.595227] [] scsih_remove+0x339/0x670 [mpt3sas] > > [ 224.609383] [] pci_device_remove+0x70/0x110 > > [ 224.622379] [] __device_release_driver+0x160/0x3a0 > > [ 224.636724] [] driver_detach+0x183/0x200 > > [ 224.649166] [] bus_remove_driver+0xdf/0x200 > > [ 224.662185] [] driver_unregister+0x67/0xa0 > > [ 224.675009] [] pci_unregister_driver+0x1e/0xe0 > > [ 224.688613] [] _mpt3sas_exit+0x23/0x5e9 [mpt3sas] > > [ 224.702782] [] SyS_delete_module+0x2eb/0x390 > > [ 224.743359] [] entry_SYSCALL_64_fastpath+0x18/0xa8 > > [ 224.759036] ---[ end trace 13f1919bf2ec7bb0 ]--- > > > > My patch fixes that too :) > > > > The only driver besides mpt*sas that calls sas_delete_port() explicitly is > > HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host() > > first. > > > > Thanks, > > Calvin > > > >> -Original Message- > >> From: Calvin Owens [mailto:calvinow...@fb.com] > >> Sent: Monday, May 16, 2016 2:25 PM > >> To: Elliott, Robert (Persistent Memory) > >> Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. > >> Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; > >> linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; > >> kernel-t...@fb.com; calvinow...@fb.com > >> Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY > >> objects > >> > >> On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > >> > > >> > > >> > > -Original Message- > >> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > >> > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > >> > > Sent: Friday, May 13, 2016 3:28 PM > >> > ... > >> > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS > >> > > PHY objects > >> > > > >> > ... > >> > > The issue is that enclosure_remove_device() expects to be able to > >> > > re-add the device that was previously enclosured: so with SES > >> > > running, the order we unwind things matters in a way it otherwise > >> > > wouldn't. > >> > > > >> > > Since mpt3sas deletes the SAS objects before the SCSI hosts, > >> > > enclosure ends up trying to re-parent the SCSI device from the > >> > > enclosure to the SAS PHY which has already been deleted. This obviously > >> > > ends in sadness. > >> > > > >> > > The fix appears to be simple: just call scsi_remove_host() before we > >> > > call > >> > > sas_port_delete() and/or sas_remove_host(). > >> > > > >> > ... > >> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > >> > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > >> > > _scsih_raid_device_remove(ioc, raid_device); > >> > > } > >> > > > >> > > + scsi_remove_host(shost); > >> > > + > >> > > /* free ports attached to the sas_host */ > >> > > list_for_each_entry_safe(mpt3sas_port, next_port, > >> > > &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ > >> > > void scsih_remove(struct pci_dev *pdev) > >> > > } > >> > > > >> > > sas_remove_host(shost); > >> > > - scsi_remove_host(shost); > >> > > mpt3sas_base_detach(ioc); > >> > > spin_lock(&gioc_lock); > >> > > list_del(&ioc->list); > >> > > >> > That change matches the pattern of all other drivers calling > >> > sas_remove_host, except for one: > >> > drivers/message/fusion/mptsas.c > >> > > >> > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > >> > is wrong: > >> > > >> > /** > >> > * sas_remove_host - tear down a Scsi_Host's SAS data structures > >> > * @shost: Scsi Host that is torn down > >> > * > >> > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > >> > * Must be called just before scsi_remove_host for SAS HBAs. > >> > */ > >> > >> Yes, exactly: that comment appears to be backwards, and as you say most > >> drivers appear to do the opposite (I looked at HPSA at least, which calls > >> sas_port_delete() before scsi_remove_host()). > >> > >> I suppose I should send a patch to fix the comment as well? It'd be nice to > >> get some testing to be certain I'm not breaking somebody else's setup by > >> fixing mine though... > >> > >> Thanks, > >> Calvin
Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
[] blk_trace_remove_sysfs+0x14/0x20 > [ 224.379021] [] blk_unregister_queue+0xcf/0x120 > [ 224.392589] [] del_gendisk+0x26e/0x600 > [ 224.432345] [] sd_remove+0xf5/0x1b0 > [ 224.443808] [] __device_release_driver+0x160/0x3a0 > [ 224.458151] [] device_release_driver+0x25/0x40 > [ 224.471734] [] bus_remove_device+0x2e1/0x4b0 > [ 224.484926] [] device_del+0x37d/0x680 > [ 224.555423] [] __scsi_remove_device+0x1e5/0x250 > [ 224.569196] [] scsi_forget_host+0x12c/0x1e0 > [ 224.582197] [] scsi_remove_host+0x10c/0x300 > [ 224.595227] [] scsih_remove+0x339/0x670 [mpt3sas] > [ 224.609383] [] pci_device_remove+0x70/0x110 > [ 224.622379] [] __device_release_driver+0x160/0x3a0 > [ 224.636724] [] driver_detach+0x183/0x200 > [ 224.649166] [] bus_remove_driver+0xdf/0x200 > [ 224.662185] [] driver_unregister+0x67/0xa0 > [ 224.675009] [] pci_unregister_driver+0x1e/0xe0 > [ 224.688613] [] _mpt3sas_exit+0x23/0x5e9 [mpt3sas] > [ 224.702782] [] SyS_delete_module+0x2eb/0x390 > [ 224.743359] [] entry_SYSCALL_64_fastpath+0x18/0xa8 > [ 224.759036] ---[ end trace 13f1919bf2ec7bb0 ]--- > > My patch fixes that too :) > > The only driver besides mpt*sas that calls sas_delete_port() explicitly is > HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host() > first. > > Thanks, > Calvin > >> -Original Message- >> From: Calvin Owens [mailto:calvinow...@fb.com] >> Sent: Monday, May 16, 2016 2:25 PM >> To: Elliott, Robert (Persistent Memory) >> Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. >> Bottomley; Martin K. Petersen; mpt-fusionlinux....@broadcom.com; >> linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; >> kernel-t...@fb.com; calvinow...@fb.com >> Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY >> objects >> >> On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: >> > >> > >> > > -Original Message- >> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- >> > > ow...@vger.kernel.org] On Behalf Of Calvin Owens >> > > Sent: Friday, May 13, 2016 3:28 PM >> > ... >> > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS >> > > PHY objects >> > > >> > ... >> > > The issue is that enclosure_remove_device() expects to be able to >> > > re-add the device that was previously enclosured: so with SES >> > > running, the order we unwind things matters in a way it otherwise >> > > wouldn't. >> > > >> > > Since mpt3sas deletes the SAS objects before the SCSI hosts, >> > > enclosure ends up trying to re-parent the SCSI device from the >> > > enclosure to the SAS PHY which has already been deleted. This obviously >> > > ends in sadness. >> > > >> > > The fix appears to be simple: just call scsi_remove_host() before we >> > > call >> > > sas_port_delete() and/or sas_remove_host(). >> > > >> > ... >> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c >> > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) >> > > _scsih_raid_device_remove(ioc, raid_device); >> > > } >> > > >> > > + scsi_remove_host(shost); >> > > + >> > > /* free ports attached to the sas_host */ >> > > list_for_each_entry_safe(mpt3sas_port, next_port, >> > > &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ >> > > void scsih_remove(struct pci_dev *pdev) >> > > } >> > > >> > > sas_remove_host(shost); >> > > - scsi_remove_host(shost); >> > > mpt3sas_base_detach(ioc); >> > > spin_lock(&gioc_lock); >> > > list_del(&ioc->list); >> > >> > That change matches the pattern of all other drivers calling >> > sas_remove_host, except for one: >> > drivers/message/fusion/mptsas.c >> > >> > That consensus means this comment in drivers/scsi/scsi_transport_sas.c >> > is wrong: >> > >> > /** >> > * sas_remove_host - tear down a Scsi_Host's SAS data structures >> > * @shost: Scsi Host that is torn down >> > * >> > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. >> > * Must be called just before scsi_remove_host for SAS HBAs. >> > */ >> >> Yes, exactly: that comment appears to be backwards, and as you say most >> drivers appear to do the opposite (I looked at HPSA at least, which calls >> sas_port_delete() before scsi_remove_host()). >> >> I suppose I should send a patch to fix the comment as well? It'd be nice to >> get some testing to be certain I'm not breaking somebody else's setup by >> fixing mine though... >> >> Thanks, >> Calvin
Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
gt; From: Calvin Owens [mailto:calvinow...@fb.com] > Sent: Monday, May 16, 2016 2:25 PM > To: Elliott, Robert (Persistent Memory) > Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. > Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; > linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; > kernel-t...@fb.com; calvinow...@fb.com > Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY > objects > > On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > > > > -Original Message- > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > > Sent: Friday, May 13, 2016 3:28 PM > > ... > > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS > > > PHY objects > > > > > ... > > > The issue is that enclosure_remove_device() expects to be able to > > > re-add the device that was previously enclosured: so with SES > > > running, the order we unwind things matters in a way it otherwise > > > wouldn't. > > > > > > Since mpt3sas deletes the SAS objects before the SCSI hosts, > > > enclosure ends up trying to re-parent the SCSI device from the > > > enclosure to the SAS PHY which has already been deleted. This obviously > > > ends in sadness. > > > > > > The fix appears to be simple: just call scsi_remove_host() before we > > > call > > > sas_port_delete() and/or sas_remove_host(). > > > > > ... > > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > > > _scsih_raid_device_remove(ioc, raid_device); > > > } > > > > > > + scsi_remove_host(shost); > > > + > > > /* free ports attached to the sas_host */ > > > list_for_each_entry_safe(mpt3sas_port, next_port, > > > &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ > > > void scsih_remove(struct pci_dev *pdev) > > > } > > > > > > sas_remove_host(shost); > > > - scsi_remove_host(shost); > > > mpt3sas_base_detach(ioc); > > > spin_lock(&gioc_lock); > > > list_del(&ioc->list); > > > > That change matches the pattern of all other drivers calling > > sas_remove_host, except for one: > > drivers/message/fusion/mptsas.c > > > > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > > is wrong: > > > > /** > > * sas_remove_host - tear down a Scsi_Host's SAS data structures > > * @shost: Scsi Host that is torn down > > * > > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > > * Must be called just before scsi_remove_host for SAS HBAs. > > */ > > Yes, exactly: that comment appears to be backwards, and as you say most > drivers appear to do the opposite (I looked at HPSA at least, which calls > sas_port_delete() before scsi_remove_host()). > > I suppose I should send a patch to fix the comment as well? It'd be nice to > get some testing to be certain I'm not breaking somebody else's setup by > fixing mine though... > > Thanks, > Calvin
RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
Our understanding is the relationship between the SCSI host and SAS end devices is a parent-child and before ripping of SCSI host we need to rip of all the children. Why the enclosure ends up trying to re-parent the SCSI device from the enclosure to the SAS PHY even after we remove the SAS Phy?. Doesn't this need to be taken care in enclosure_removal? -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Monday, May 16, 2016 2:25 PM To: Elliott, Robert (Persistent Memory) Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; calvinow...@fb.com Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > Sent: Friday, May 13, 2016 3:28 PM > ... > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS > > PHY objects > > > ... > > The issue is that enclosure_remove_device() expects to be able to > > re-add the device that was previously enclosured: so with SES > > running, the order we unwind things matters in a way it otherwise > > wouldn't. > > > > Since mpt3sas deletes the SAS objects before the SCSI hosts, > > enclosure ends up trying to re-parent the SCSI device from the > > enclosure to the SAS PHY which has already been deleted. This obviously > > ends in sadness. > > > > The fix appears to be simple: just call scsi_remove_host() before we > > call > > sas_port_delete() and/or sas_remove_host(). > > > ... > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > > _scsih_raid_device_remove(ioc, raid_device); > > } > > > > + scsi_remove_host(shost); > > + > > /* free ports attached to the sas_host */ > > list_for_each_entry_safe(mpt3sas_port, next_port, > >&ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ > > void scsih_remove(struct pci_dev *pdev) > > } > > > > sas_remove_host(shost); > > - scsi_remove_host(shost); > > mpt3sas_base_detach(ioc); > > spin_lock(&gioc_lock); > > list_del(&ioc->list); > > That change matches the pattern of all other drivers calling > sas_remove_host, except for one: > drivers/message/fusion/mptsas.c > > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > is wrong: > > /** > * sas_remove_host - tear down a Scsi_Host's SAS data structures > * @shost: Scsi Host that is torn down > * > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > * Must be called just before scsi_remove_host for SAS HBAs. > */ Yes, exactly: that comment appears to be backwards, and as you say most drivers appear to do the opposite (I looked at HPSA at least, which calls sas_port_delete() before scsi_remove_host()). I suppose I should send a patch to fix the comment as well? It'd be nice to get some testing to be certain I'm not breaking somebody else's setup by fixing mine though... Thanks, Calvin
Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > Sent: Friday, May 13, 2016 3:28 PM > ... > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY > > objects > > > ... > > The issue is that enclosure_remove_device() expects to be able to re-add > > the device that was previously enclosured: so with SES running, the order > > we unwind things matters in a way it otherwise wouldn't. > > > > Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure > > ends up trying to re-parent the SCSI device from the enclosure to the SAS > > PHY which has already been deleted. This obviously ends in sadness. > > > > The fix appears to be simple: just call scsi_remove_host() before we call > > sas_port_delete() and/or sas_remove_host(). > > > ... > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > > _scsih_raid_device_remove(ioc, raid_device); > > } > > > > + scsi_remove_host(shost); > > + > > /* free ports attached to the sas_host */ > > list_for_each_entry_safe(mpt3sas_port, next_port, > >&ioc->sas_hba.sas_port_list, port_list) { > > @@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev) > > } > > > > sas_remove_host(shost); > > - scsi_remove_host(shost); > > mpt3sas_base_detach(ioc); > > spin_lock(&gioc_lock); > > list_del(&ioc->list); > > That change matches the pattern of all other drivers calling > sas_remove_host, except for one: > drivers/message/fusion/mptsas.c > > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > is wrong: > > /** > * sas_remove_host - tear down a Scsi_Host's SAS data structures > * @shost: Scsi Host that is torn down > * > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > * Must be called just before scsi_remove_host for SAS HBAs. > */ Yes, exactly: that comment appears to be backwards, and as you say most drivers appear to do the opposite (I looked at HPSA at least, which calls sas_port_delete() before scsi_remove_host()). I suppose I should send a patch to fix the comment as well? It'd be nice to get some testing to be certain I'm not breaking somebody else's setup by fixing mine though... Thanks, Calvin
RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > ow...@vger.kernel.org] On Behalf Of Calvin Owens > Sent: Friday, May 13, 2016 3:28 PM ... > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY > objects > ... > The issue is that enclosure_remove_device() expects to be able to re-add > the device that was previously enclosured: so with SES running, the order > we unwind things matters in a way it otherwise wouldn't. > > Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure > ends up trying to re-parent the SCSI device from the enclosure to the SAS > PHY which has already been deleted. This obviously ends in sadness. > > The fix appears to be simple: just call scsi_remove_host() before we call > sas_port_delete() and/or sas_remove_host(). > ... > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > _scsih_raid_device_remove(ioc, raid_device); > } > > + scsi_remove_host(shost); > + > /* free ports attached to the sas_host */ > list_for_each_entry_safe(mpt3sas_port, next_port, > &ioc->sas_hba.sas_port_list, port_list) { > @@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev) > } > > sas_remove_host(shost); > - scsi_remove_host(shost); > mpt3sas_base_detach(ioc); > spin_lock(&gioc_lock); > list_del(&ioc->list); That change matches the pattern of all other drivers calling sas_remove_host, except for one: drivers/message/fusion/mptsas.c That consensus means this comment in drivers/scsi/scsi_transport_sas.c is wrong: /** * sas_remove_host - tear down a Scsi_Host's SAS data structures * @shost: Scsi Host that is torn down * * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. * Must be called just before scsi_remove_host for SAS HBAs. */
[PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
On the hardware I'm testing on, simply removing the mpt3sas module triggers a litany of WARNs culminating in an OOPS: [ cut here ] WARNING: CPU: 5 PID: 13348 at lib/kobject.c:244 kobject_add_internal+0x359/0x8a0 kobject_add_internal failed for ArrayDevice09 (error: -2 parent: 6:0:15:0) CPU: 5 PID: 13348 Comm: rmmod Not tainted 4.6.0-rc2-mpt3sas-debug-1-g7e7e6f4 #2 Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS HBP6.6 11/20/2015 82b88ec0 8806ce76f820 81deff03 8806ce76f898 8806ce76f868 811191e2 880749819af8 00f4 ed00d9cedf0f 880749819b08 88074c9b8168 Call Trace: [] dump_stack+0x67/0x94 [] __warn+0x172/0x1b0 [] warn_slowpath_fmt+0x97/0xb0 [] kobject_add_internal+0x359/0x8a0 [] kobject_add+0x10e/0x1c0 [] device_add+0x30a/0x1490 [] enclosure_remove_device+0x172/0x1cc [enclosure] [] ses_intf_remove+0x1c4/0x270 [ses] [] device_del+0x2ab/0x680 [] device_unregister+0x12/0x30 [] __scsi_remove_device+0x1d5/0x250 [] scsi_forget_host+0x12c/0x1e0 [] scsi_remove_host+0x10c/0x300 [] scsih_remove+0x321/0x680 [mpt3sas] [] pci_device_remove+0x70/0x110 [] __device_release_driver+0x160/0x3a0 [] driver_detach+0x183/0x200 [] bus_remove_driver+0xdf/0x200 [] driver_unregister+0x67/0xa0 [] pci_unregister_driver+0x1e/0xe0 [] _mpt3sas_exit+0x23/0x219 [mpt3sas] [] SyS_delete_module+0x2ee/0x390 [] entry_SYSCALL_64_fastpath+0x18/0xa8 ---[ end trace fe163024b624f4af ]--- general protection fault: [#1] SMP KASAN CPU: 6 PID: 17388 Comm: rmmod Tainted: GW 4.6.0-rc2-1-g7e7e6f4 #1 Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS HBP6.6 11/20/2015 task: 880753731740 ti: 8806896f task.ti: 8806896f RIP: 0010:[] [] klist_put+0x1f/0x160 RSP: 0018:8806896f7a10 EFLAGS: 00010202 RAX: dc00 RBX: RCX: 880753731f48 RDX: 000b RSI: 0001 RDI: 0058 RBP: 8806896f7a30 R08: 0006 R09: R10: R11: R12: 880752bec000 R13: 0001 R14: dc00 R15: 880752bec2b0 FS: 7feef8ea6700() GS:88075ef8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fe45044d020 CR3: 0006f7092000 CR4: 001006e0 Stack: 880752bec000 dc00 8806896f7a40 8290987e 8806896f7b00 820aa3bd 8806896f7aa0 1100d12def4f 880752bec390 829183ca Call Trace: [] klist_del+0xe/0x10 [] device_del+0x12d/0x680 [] device_unregister+0x12/0x30 [] enclosure_unregister+0xe0/0x170 [enclosure] [] ses_intf_remove+0x190/0x270 [ses] [] device_del+0x2ab/0x680 [] device_unregister+0x12/0x30 [] __scsi_remove_device+0x1d5/0x250 [] scsi_forget_host+0x12c/0x1e0 [] scsi_remove_host+0x10c/0x300 [] scsih_remove+0x321/0x680 [mpt3sas] [] pci_device_remove+0x70/0x110 [] __device_release_driver+0x160/0x3a0 [] driver_detach+0x183/0x200 [] bus_remove_driver+0xdf/0x200 [] driver_unregister+0x67/0xa0 [] pci_unregister_driver+0x1e/0xe0 [] _mpt3sas_exit+0x23/0xa89 [mpt3sas] [] SyS_delete_module+0x2ee/0x390 [] entry_SYSCALL_64_fastpath+0x18/0xa8 The issue is that enclosure_remove_device() expects to be able to re-add the device that was previously enclosured: so with SES running, the order we unwind things matters in a way it otherwise wouldn't. Since mpt3sas deletes the SAS objects before the SCSI hosts, enclosure ends up trying to re-parent the SCSI device from the enclosure to the SAS PHY which has already been deleted. This obviously ends in sadness. The fix appears to be simple: just call scsi_remove_host() before we call sas_port_delete() and/or sas_remove_host(). Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index e0e4920..4aa128a 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) _scsih_raid_device_remove(ioc, raid_device); } + scsi_remove_host(shost); + /* free ports attached to the sas_host */ list_for_each_entry_safe(mpt3sas_port, next_port, &ioc->sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ void scsih_remove(struct pci_dev *pdev) } sas_remove_host(shost); - scsi_remove_host(shost); mpt3sas_base_detach(ioc); spin_lock(&gioc_lock); list_del(&ioc->list); -- 2.8.0.rc2