On Wed, Oct 25, 2017 at 02:29:40PM +0300, Mika Westerberg wrote:
> During surprise hot-unplug the device is not accessible anymore and
> register reads return 0xffffffff. When that happens 
> pciehp_unconfigure_device()
> may inadvertently think the device below the bridge may be a display
> device of somesort as reading PCI_BRIDGE_CONTROL register also returns
> 0xff. This results failure of the remove operation:
> 
>   pciehp 0000:00:1c.0:pcie004: Slot(0): Link Down
>   pciehp 0000:00:1c.0:pcie004: Slot(0): Card present
>   pciehp 0000:00:1c.0:pcie004: Cannot remove display device 0000:01:00.0
> 
> Because of this the hierarchy is left untouched preventing further
> hotplug operations.
> 
> Now, it is not clear why the check is there in the first place and why
> we would like to prevent removing a bridge if it has PCI_BRIDGE_CTL_VGA
> set. In case of surprise hot-unplug, it would not even be possible to
> prevent the removal. It may be due to the fact that pciehp_pci.c pretty
> much copies similar implementation from shpchp_pci.c and this check was
> just left there in the code without further thinking if it is actually
> needed at all.
> 
> Given this and the issue described above, I think it makes sense to drop
> the whole PCI_BRIDGE_CONTROL check from pciehp_unconfigure_device()
> which is what this patch does.
> 
> Signed-off-by: Mika Westerberg <[email protected]>
> ---
> The previous version of the patch can be found here:
> 
>   https://patchwork.kernel.org/patch/10024145/
> 
> Changes from the previous version:
> 
>   * Drop the whole PCI_BRIDGE_CONTROL check
>   * Update patch subject to reflect that
> 
>  drivers/pci/hotplug/pciehp_pci.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_pci.c 
> b/drivers/pci/hotplug/pciehp_pci.c
> index 2a1ca020cf5a..acc360d1a1fb 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -79,7 +79,6 @@ int pciehp_configure_device(struct slot *p_slot)
>  int pciehp_unconfigure_device(struct slot *p_slot)
>  {
>       int rc = 0;
> -     u8 bctl = 0;
>       u8 presence = 0;
>       struct pci_dev *dev, *temp;
>       struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
> @@ -101,17 +100,6 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>       list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
>                                        bus_list) {
>               pci_dev_get(dev);
> -             if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
> -                     pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
> -                     if (bctl & PCI_BRIDGE_CTL_VGA) {
> -                             ctrl_err(ctrl,
> -                                      "Cannot remove display device %s\n",
> -                                      pci_name(dev));
> -                             pci_dev_put(dev);
> -                             rc = -EINVAL;
> -                             break;
> -                     }
> -             }
>               if (!presence) {
>                       pci_dev_set_disconnected(dev, NULL);
>                       if (pci_has_subordinate(dev))

As you mention, shpchp_unconfigure_device() contains the same code; is
there any reason not to remove it from there as well?

I know you probably can't test shpchp, and neither can I, but it looks
like it has the same issue, and I don't want to avoid fixing it there
just for want of testing.

I wish we knew more about why that PCI_BRIDGE_CTL_VGA check was there
in the first place.  I think it was added by Dely Sy in 2004 [1].
LinkedIn thinks he's still at Intel; any chance you could ping him and
see if he has any insight?  My guess is that it was originally in a
path that didn't have to worry about surprise unplug.  But I still
don't know why it would be a problem.

Bjorn

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=c16b4b14d980

Reply via email to