On Fri, Aug 31, 2018 at 2:25 PM, Lukas Wunner <lu...@wunner.de> wrote:
> [cc += linux-pci, benh]
>
> On Fri, Aug 31, 2018 at 7:37 AM Suganath Prabu S 
> <suganath-prabu.subram...@broadcom.com> wrote:
>> Posting below set of patches to support PCIe Hot Plug surprise removal,
>> and few defect fixes.
>
> Please cross-post to linux-pci in the future.
>
>
> Regarding [PATCH 1/7] mpt3sas: Introduce mpt3sas_base_pci_device_is_unplugged:
> https://www.spinics.net/lists/linux-scsi/msg122962.html
>
> * mpt3sas_base_pci_device_is_unplugged() is a duplication of the existing
>   pci_device_is_present().

Thanks for pointing out the pci_device_is_present() API. We will replace
mpt3sas_base_pci_device_is_unplugged() with pci_device_is_present().

>
> * Just reading the vendor ID may not be sufficient to detect unplug,
>   it may also read as "all ones" if the link is down due to error
>   recovery by DPC.

So, is there any other way to detect PCI device unplug apart from
reading the vendor ID? I mean, do we have to check any other flags, etc.?
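
To illustrate the ambiguity in question, here is a minimal user-space
sketch (not kernel code). It assumes that a config read from a
non-responding device completes with all bits set, so the vendor ID
read alone looks the same for a surprise removal and for a link that
is down during DPC recovery. All names and the example ID value are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reasons why a device might not answer a config read. */
enum absence_cause {
	CAUSE_NONE,          /* device is present and responding */
	CAUSE_UNPLUGGED,     /* surprise removal */
	CAUSE_LINK_DOWN_DPC, /* link held down during DPC error recovery */
};

/*
 * Simulated read of the vendor/device ID dword. Assumption: any read
 * that the device does not answer comes back as all ones.
 */
static uint32_t simulated_id_read(enum absence_cause cause, uint32_t real_id)
{
	return cause == CAUSE_NONE ? real_id : 0xFFFFFFFFu;
}

/* The only thing a plain vendor ID check can observe. */
static bool id_read_is_all_ones(uint32_t id_dword)
{
	return id_dword == 0xFFFFFFFFu;
}
```

Both CAUSE_UNPLUGGED and CAUSE_LINK_DOWN_DPC yield the same result from
id_read_is_all_ones(), which is why the read by itself cannot tell the
two cases apart.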

>
>
> Regarding [PATCH 2/7] mpt3sas: Add HBA hot plug watchdog thread:
> https://www.spinics.net/lists/linux-scsi/msg122963.html
>
> * I don't see why you need to poll for the device's removal from a
>   watchdog thread.  pciehp will invoke your driver's ->remove hook
>   once the device is gone.

If we have three or four PCI devices and all of them are hot
unplugged simultaneously, we observed that the driver's ->remove hook
is called for each device sequentially, so it takes some time before
the ->remove hook is called for the fourth PCI device. During this
time we want all the outstanding commands to be gracefully terminated,
and hence we added this watchdog thread to quickly detect the HBA
unplug and take the necessary steps, such as gracefully terminating
the outstanding I/Os and no longer accepting further I/Os on it.
Later, when the PCI subsystem calls the driver's ->remove hook, the
driver can quickly release the resources allocated for this unplugged
device.

>
> * A recent discussion initiated by Benjamin Herrenschmidt came to the
>   conclusion that device removal should be treated as a type of
>   error state (either pci_channel_io_perm_failure or another, newly
>   introduced state).  It will then be possible to detect the device's
>   inaccessibility with pci_channel_offline().  Please help work towards
>   such a future solution in the PCI core instead of solutions localized
>   to a single device driver.  Sorry, the discussion was lengthy, it is
>   available here:
>   https://www.spinics.net/lists/linux-pci/msg75425.html

Oh great, sure. We have very limited knowledge of the PCI subsystem, but
we will try our best in the future to provide solutions in the PCI core.
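
As a rough picture of how a driver could use such a state, below is a
user-space mock (not kernel code). It assumes that
pci_channel_offline() treats every state other than the normal one as
offline. The mock_ names, the enum values and the queue function are
hypothetical and only stand in for the real PCI core types.

```c
#include <stdbool.h>

/* Mock channel states; a removal would map to the permanent failure
 * state or to a newly introduced one, as suggested in the thread. */
enum mock_channel_state {
	MOCK_IO_NORMAL = 1,
	MOCK_IO_FROZEN,
	MOCK_IO_PERM_FAILURE,
};

struct mock_pci_dev {
	enum mock_channel_state error_state;
};

/* Assumed behaviour: anything but the normal state counts as offline. */
static bool mock_channel_offline(const struct mock_pci_dev *pdev)
{
	return pdev->error_state != MOCK_IO_NORMAL;
}

/* Hypothetical submission path: fail fast instead of touching a
 * device that is no longer reachable. */
static int mock_queue_command(const struct mock_pci_dev *pdev)
{
	if (mock_channel_offline(pdev))
		return -1; /* reject the command without issuing I/O */
	return 0;          /* device reachable, command may be issued */
}
```

With a check like this in the submission path, new I/O would be
rejected as soon as the core marks the device, without a
driver-private polling thread.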

>
> Thanks,
>
> Lukas
