答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-08-18 Thread TimGuo-oc
> -Original Mail-
> Sender : Alan Stern 
> Time : 2020/7/24  0:39
> Receiver : Alex Williamson 
> CC : Weitao Wang(BJ-RD) ; Greg KH
> ; WeitaoWang-oc
> ; mathias.ny...@linux.intel.com;
> ulf.hans...@linaro.org; vk...@kernel.org; hsleste...@gmail.com;
> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
> carsten_sch...@mentor.com; efre...@linux.com; Tony W. Wang(XA-RD)
> ; Cobe Chen(BJ-RD) ;
> Tim Guo(BJ-RD) ; wwt8...@163.com
> Subject : Re: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form 
> vfio-pci
> 
> On Thu, Jul 23, 2020 at 10:17:35AM -0600, Alex Williamson wrote:
> > The IOMMU grouping restriction does solve the hardware issue, so long
> > as one driver doesn't blindly assume the driver private data for
> > another device and modify it.
> 
> Correction: The IOMMU grouping restriction solves the hardware issue for
> vfio-pci.  It won't necessarily help if some other driver comes along and 
> wants
> to bind to this hardware.
> 
> >   I do agree that your solution would
> > work, requiring all devices are unbound before any can be bound, but
> > it also seems difficult to manage.  The issue is largely unique to USB
> > AFAIK.  On the other hand, drivers coordinating with each other to
> > register their _private_ data as share-able within a set of drivers
> > seems like a much more direct and explicit interaction between the
> > drivers.  Thanks,
> 
> Yes, that makes sense.  But it would have to be implemented in the driver 
> core,
> not in particular subsystems like USB or PCI.  And it might be seen as 
> overkill,
> given that only UHCI/OHCI/EHCI devices require this sort of sharing AFAIK.
> 
> Also, when you think about it, what form would such coordination among drivers
> take?  From your description, it sounds like the drivers would agree to avoid
> accessing each other's private data if the proper registration wasn't in 
> place.
> 
> On the other hand, a stronger and perhaps more robust approach would be to
> enforce the condition that non-cooperating drivers are never bound to devices
> in the same group at the same time.  That's basically what I'm proposing here
> -- the question is whether the enforcement should be instituted in the kernel 
> or
> should merely be part of a standard protocol followed by userspace drivers.
> 
> Given that it's currently needed in only one place, it seems reasonable to 
> leave
> this as a "gentlemen's agreement" in userspace for the time being instead of
> adding it to the kernel.
But Qemu and Virt-manager don't comply with this "gentlemen's agreement" even 
if the EHCI and UHCI sit in the same IOMMU group. 
And the current behavior of Qemu and linux USB driver will cause a catastrophe. 
Maybe the hackers can use this BUG to attack the computer.
We shouldn't believe the VMM always comply with the "gentlemen's agreement", I 
think.
> 
> Alan Stern


答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-27 Thread WeitaoWang-oc

On Fri, 24 Jul 2020 19:17:49 + Alex wrote:
> On Fri, 24 Jul 2020 12:57:49 +
> WeitaoWang-oc  wrote:
> 
> > On Thu, 23 Jul 2020 12:38:21 -0400, Alan wrote:
> > > On Thu, Jul 23, 2020 at 10:17:35AM -0600, Alex Williamson wrote:
> > > > The IOMMU grouping restriction does solve the hardware issue, so long
> > > > as one driver doesn't blindly assume the driver private data for
> > > > another device and modify it.
> > >
> > > Correction: The IOMMU grouping restriction solves the hardware issue for
> > > vfio-pci.  It won't necessarily help if some other driver comes along
> > > and wants to bind to this hardware.
> > >
> > > >   I do agree that your solution would
> > > > work, requiring all devices are unbound before any can be bound, but it
> > > > also seems difficult to manage.  The issue is largely unique to USB
> > > > AFAIK.  On the other hand, drivers coordinating with each other to
> > > > register their _private_ data as share-able within a set of drivers
> > > > seems like a much more direct and explicit interaction between the
> > > > drivers.  Thanks,
> > >
> > > Yes, that makes sense.  But it would have to be implemented in the
> > > driver core, not in particular subsystems like USB or PCI.  And it might
> > > be seen as overkill, given that only UHCI/OHCI/EHCI devices require this
> > > sort of sharing AFAIK.
> > >
> > > Also, when you think about it, what form would such coordination among
> > > drivers take?  From your description, it sounds like the drivers would
> > > agree to avoid accessing each other's private data if the proper
> > > registration wasn't in place.
> > >
> > > On the other hand, a stronger and perhaps more robust approach would be
> > > to enforce the condition that non-cooperating drivers are never bound to
> > > devices in the same group at the same time.  That's basically what I'm
> > > proposing here -- the question is whether the enforcement should be
> > > instituted in the kernel or should merely be part of a standard protocol
> > > followed by userspace drivers.
> > >
> > > Given that it's currently needed in only one place, it seems reasonable
> > > to leave this as a "gentlemen's agreement" in userspace for the time
> > > being instead of adding it to the kernel.
> > >
> >
> > Provided that EHCI and UHCI host controller declare not support P2P and
> > ACS. So, we can assign EHCI and UHCI host controller to different IOMMU
> > group separately. We assign EHCI host controller to host and assign UHCI
> > host controller to VM. Then, ehci_hcd driver load/unload operation in host
> > will cause the same issue as discussed
> 
> And you have an example of such a device?  I expect these do not exist,
> nor should they.  It seems like it would be an improper use of ACS.
> Thanks,


In kernel source code tree drivers/pci/quirks.c,
There is a device list declared by pci_dev_acs_enabled. In which list, such as 
multi-function device without acs capability not allow peer-to-peer bewteen 
functions. Those device can be assign to to different IOMMU group separately.

Thnaks
weitao



答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-24 Thread WeitaoWang-oc

On Thu, 23 Jul 2020 12:38:21 -0400, Alan wrote:
> On Thu, Jul 23, 2020 at 10:17:35AM -0600, Alex Williamson wrote:
> > The IOMMU grouping restriction does solve the hardware issue, so long
> > as one driver doesn't blindly assume the driver private data for
> > another device and modify it.
> 
> Correction: The IOMMU grouping restriction solves the hardware issue for
> vfio-pci.  It won't necessarily help if some other driver comes along
> and wants to bind to this hardware.
> 
> >   I do agree that your solution would
> > work, requiring all devices are unbound before any can be bound, but it
> > also seems difficult to manage.  The issue is largely unique to USB
> > AFAIK.  On the other hand, drivers coordinating with each other to
> > register their _private_ data as share-able within a set of drivers
> > seems like a much more direct and explicit interaction between the
> > drivers.  Thanks,
> 
> Yes, that makes sense.  But it would have to be implemented in the
> driver core, not in particular subsystems like USB or PCI.  And it might
> be seen as overkill, given that only UHCI/OHCI/EHCI devices require this
> sort of sharing AFAIK.
> 
> Also, when you think about it, what form would such coordination among
> drivers take?  From your description, it sounds like the drivers would
> agree to avoid accessing each other's private data if the proper
> registration wasn't in place.
> 
> On the other hand, a stronger and perhaps more robust approach would be
> to enforce the condition that non-cooperating drivers are never bound to
> devices in the same group at the same time.  That's basically what I'm
> proposing here -- the question is whether the enforcement should be
> instituted in the kernel or should merely be part of a standard protocol
> followed by userspace drivers.
> 
> Given that it's currently needed in only one place, it seems reasonable
> to leave this as a "gentlemen's agreement" in userspace for the time
> being instead of adding it to the kernel.
>   

Provided that EHCI and UHCI host controller declare not support P2P and
ACS. So, we can assign EHCI and UHCI host controller to different IOMMU 
group separately. We assign EHCI host controller to host and assign UHCI
host controller to VM. Then, ehci_hcd driver load/unload operation in host
will cause the same issue as discussed

Thanks
Weitao



Re: 答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-23 Thread gre...@linuxfoundation.org
On Thu, Jul 23, 2020 at 08:36:25AM +, WeitaoWang-oc wrote:
> 
> On Thu,23 July 2020 04:18:00 + Alex wrote:
> > On Wed, 22 Jul 2020 19:57:48 +0800
> > WeitaoWangoc  wrote:
> > 
> > >  drivers/usb/core/hcd-pci.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
> > > index 1547aa6..484f2a0 100644
> > > --- a/drivers/usb/core/hcd-pci.c
> > > +++ b/drivers/usb/core/hcd-pci.c
> > > @@ -34,6 +34,7 @@ static DECLARE_RWSEM(companions_rwsem);
> > >  #define CL_OHCIPCI_CLASS_SERIAL_USB_OHCI
> > >  #define CL_EHCIPCI_CLASS_SERIAL_USB_EHCI
> > >
> > > +#define PCI_DEV_DRV_FLAG   2
> > >  static inline int is_ohci_or_uhci(struct pci_dev *pdev)  {
> > > return pdev->class == CL_OHCI || pdev->class == CL_UHCI; @@
> > > -68,6 +69,8 @@ static void for_each_companion(struct pci_dev *pdev, struct
> > usb_hcd *hcd,
> > > if (companion->class != CL_UHCI && companion->class !=
> > CL_OHCI &&
> > > companion->class != CL_EHCI)
> > > continue;
> > > +   if (!(companion->priv_flags & PCI_DEV_DRV_FLAG))
> > 
> > But pci_dev.priv_flags is private data for the driver that currently
> > owns the device, which could be vfio-pci.  This is really no different
> > than assuming the structure at device.driver_data.  If vfio-pci were to
> > make legitimate use of pci_dev.priv_flags, this could simply blow up
> > again.  Should there instead be some sort of registration interface
> > where hcd complaint drivers register their devices and only those
> > registered devices can have their driver private data arbitrarily poked
> > by another driver?  Thanks,
> 
> Thanks for your explanation. Set pci_dev.priv_flags is really not a 
> reasonable approach. Are there any more detailed suggestions 
> to patch this issue?

This is not a kernel issue, it is a "do not do this in this way from
userspace" issue :)

thanks,

greg k-h


答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-23 Thread WeitaoWang-oc

On Thu,23 July 2020 04:18:00 + Alex wrote:
> On Wed, 22 Jul 2020 19:57:48 +0800
> WeitaoWangoc  wrote:
> 
> >  drivers/usb/core/hcd-pci.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
> > index 1547aa6..484f2a0 100644
> > --- a/drivers/usb/core/hcd-pci.c
> > +++ b/drivers/usb/core/hcd-pci.c
> > @@ -34,6 +34,7 @@ static DECLARE_RWSEM(companions_rwsem);
> >  #define CL_OHCIPCI_CLASS_SERIAL_USB_OHCI
> >  #define CL_EHCIPCI_CLASS_SERIAL_USB_EHCI
> >
> > +#define PCI_DEV_DRV_FLAG   2
> >  static inline int is_ohci_or_uhci(struct pci_dev *pdev)  {
> > return pdev->class == CL_OHCI || pdev->class == CL_UHCI; @@
> > -68,6 +69,8 @@ static void for_each_companion(struct pci_dev *pdev, struct
> usb_hcd *hcd,
> > if (companion->class != CL_UHCI && companion->class !=
> CL_OHCI &&
> > companion->class != CL_EHCI)
> > continue;
> > +   if (!(companion->priv_flags & PCI_DEV_DRV_FLAG))
> 
> But pci_dev.priv_flags is private data for the driver that currently
> owns the device, which could be vfio-pci.  This is really no different
> than assuming the structure at device.driver_data.  If vfio-pci were to
> make legitimate use of pci_dev.priv_flags, this could simply blow up
> again.  Should there instead be some sort of registration interface
> where hcd complaint drivers register their devices and only those
> registered devices can have their driver private data arbitrarily poked
> by another driver?  Thanks,

Thanks for your explanation. Set pci_dev.priv_flags is really not a 
reasonable approach. Are there any more detailed suggestions 
to patch this issue?

Thanks
Weitaowang


Re: 答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-22 Thread Greg KH
On Thu, Jul 23, 2020 at 02:59:55AM +, Weitao Wang(BJ-RD) wrote:
> CONFIDENTIAL NOTE:
> This email contains confidential or legally privileged information and is for 
> the sole use of its intended recipient. Any unauthorized review, use, copying 
> or forwarding of this email or the content of this email is strictly 
> prohibited.


This footer is not compatible with Linux mailing lists, sorry, I am not
allowed to respond to it.

greg k-h


答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-22 Thread Weitao Wang(BJ-RD)

On , Jul 22, 2020 at 02:44:14PM +0200, Alan wrote:
> On Wed, Jul 22, 2020 at 02:44:14PM +0200, Greg KH wrote:
> > On Wed, Jul 22, 2020 at 07:57:48PM +0800, WeitaoWangoc wrote:
> > > This bug is found in Zhaoxin platform, but it's a commom code bug.
> > > Fail sequence:
> > > step1: Unbind UHCI controller from native driver;
> > > step2: Bind UHCI controller to vfio-pci, which will put UHCI controller 
> > > in one
> vfio
> > >group's device list and set UHCI's dev->driver_data to struct
> vfio-pci(for UHCI)
> >
> > Hah, that works?  How do you do that properly?  What code does that?
>
> Yeah, that can't possibly work.  The USB core expects that any host
> controller device (or at least, any PCI host controller device) has its
> driver_data set to point to a struct usb_hcd.  It doesn't expect a host
> controller to be bound to anything other than a host controller driver.
>
> Things could easily go very wrong here.  For example, suppose at this
> point the ehci-hcd driver just happens to bind to the EHCI controller.
> When this happens, the EHCI controller hardware takes over all the USB
> connections that were routed to the UHCI controller.  How will vfio-pci
> deal with that?  Pretty ungracefully, I imagine.
>
> The only way to make this work at all is to unbind both uhci-hcd and
> ehci-hcd first.  Then after both are finished you can safely bind
> vfio-pci to the EHCI controller and the UHCI controllers (in that
> order).
>
I'm agree with you, unbind both uhci-hcd and ehci-hcd first then bind to
vfio-pci is a more reasonable sequence. Our experiments prove that this
sequence is indeed good as expected.
However, I did not find a formal document to prescribe this order.
Unfortunately, some application software such as virt-manager/qemu assign
UHCI/EHCI to guest OS has the same bind/unbind sequence as test “by hand”.
Do we need to consider compatibility with this application scenario?

The following log is captured when starting then shutdown the
virtual machine.

/* starting virtual machine*/
[ 2785.250001] audit: type=1400 audit(1594375837.191:48): apparmor="STATUS" 
operation="profile_load" profile="unconfined" 
name="libvirt-fa674e73-67a2-4672-8524-e62dea8a3c6c" pid=2008 
comm="apparmor_parser"
[ 2785.467510] uhci_hcd :00:10.0: remove, state 4
[ 2785.472426] usb usb1: USB disconnect, device number 1
/*bind :00:10.0 to vfio-pci*/
[ 2785.478798] uhci_hcd :00:10.0: USB bus 1 deregistered
[ 2785.568741] uhci_hcd :00:10.1: remove, state 1
[ 2785.573562] usb usb2: USB disconnect, device number 1
[ 2785.578793] usb 2-2: USB disconnect, device number 3
[ 2785.758016] uhci_hcd :00:10.1: USB bus 2 deregistered
/*bind :00:10.1 to vfio-pci*/
[ 2786.037448] ehci-pci :00:10.7: remove, state 4
[ 2786.042460] usb usb3: USB disconnect, device number 1
[ 2786.048700] ehci-pci :00:10.7: USB bus 3 deregistered
/*bind :00:10.7 to vfio-pci*/
[ 2787.518041] audit: type=1400 audit(1594375839.459:49): apparmor="STATUS" 
operation="profile_replace" profile="unconfined" 
name="libvirt-fa674e73-67a2-4672-8524-e62dea8a3c6c" pid=2034 
comm="apparmor_parser"
[ 2788.290706] audit: type=1400 audit(1594375840.231:50): apparmor="STATUS" 
operation="profile_replace" profile="unconfined" 
name="libvirt-fa674e73-67a2-4672-8524-e62dea8a3c6c" pid=2037 
comm="apparmor_parser"
[ 2788.960070] audit: type=1400 audit(1594375840.899:51): apparmor="STATUS" 
operation="profile_replace" info="same as current profile, skipping" 
profile="unconfined" name="libvirt-fa674e73-67a2-4672-8524-e62dea8a3c6c" 
pid=2040 comm="apparmor_parser"
[ 2788.968821] virbr0: port 2(vnet0) entered blocking state
[ 2788.988159] virbr0: port 2(vnet0) entered disabled state
[ 2788.993711] device vnet0 entered promiscuous mode
[ 2788.999453] virbr0: port 2(vnet0) entered blocking state
[ 2789.005053] virbr0: port 2(vnet0) entered listening state
[ 2789.098717] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 2789.564241] audit: type=1400 audit(1594375841.507:52): apparmor="STATUS" 
operation="profile_replace" profile="unconfined" 
name="libvirt-fa674e73-67a2-4672-8524-e62dea8a3c6c" pid=2065 
comm="apparmor_parser"
[ 2791.028028] virbr0: port 2(vnet0) entered learning state
[ 2793.047999] virbr0: port 2(vnet0) entered forwarding state
[ 2793.053449] virbr0: topology change detected, propagating
[ 2793.433604] vfio_cap_init: :00:10.7 hiding cap 0xa

/* shutdown virtual machine*/
[ 3838.772058] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 3838.815819] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 3838.871002] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 3838.884606] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 3838.894514] systemd-journald[286]: Successfully sent stream file descriptor 
to service manager.
[ 3838.896791] rfkill: 

答复: [PATCH] USB:Fix kernel NULL pointer when unbind UHCI form vfio-pci

2020-07-22 Thread Weitao Wang(BJ-RD)
  
On Wed, Jul 22, 2020 at 02:44:14PM +0200, Greg KH wrote:
> On Wed, Jul 22, 2020 at 07:57:48PM +0800, WeitaoWangoc wrote:
> > This bug is found in Zhaoxin platform, but it's a commom code bug.
> > Fail sequence:
> > step1: Unbind UHCI controller from native driver;
> > step2: Bind UHCI controller to vfio-pci, which will put UHCI controller in 
> > one
> vfio
> >group's device list and set UHCI's dev->driver_data to struct
> vfio-pci(for UHCI)
>
> Hah, that works?  How do you do that properly?  What code does that?

Drivers/vfio/vfio.c
The function vfio_group_create_device will set UHCI dev->driver_data
to vfio-device struct.

> > step3: Unbind EHCI controller from native driver, will try to tell UHCI 
> > native
> driver
> >that "I'm removed by set companion_hcd->self.hs_companion to
> NULL. However,
> >companion_hcd get from UHCI's dev->driver_data that has modified
> by vfio-pci
> >already.So, the vfio-pci structure will be damaged!
>
> Because you are messing around with bind/unbind "by hand", which is
> never guaranteed to work properly.
>
> It's amazing any of that works at all...

If adjust the sequence of UHCI/EHCI unbind form native driver, that
can avoid Null Pointer dereference. Eg. Step3-->stet4-->step1-->step2.
Our experiments prove that this sequence is indeed good as expected.
However, some application software such as virt-manager/qemu assign
/EHCI to guest OS has the same bind/unbind sequence as test “by hand”.

> 4.19.65-2020051917-rainos #1
>
> 4.19 is a bit old, what about 5.7?

5.7.7 has the same issue.

> > +#define PCI_DEV_DRV_FLAG   2
>
> Why is the USB core allowed to touch a private PCI structure field?
>
> I do not think this is allowed.
>
> And why is this a USB host controller-specific issue, shouldn't this
> type of thing be fixed in the PCI core?

When ehci hcd_driver bind or unbind to ehci, it will touch/modify UHCI usb_hcd
that get from UHCI's dev->driver_data. However, UHCI maybe bind to a
another driver(vfio-pci) and it's driver_data will be rewritten. what we
should do is to prevent ehci_hcd to modify UHCI's dev->driver_data when UHCI
has already bound to vfio-pci.

Thanks
weitaowang



保密声明:
本邮件含有保密或专有信息,仅供指定收件人使用。严禁对本邮件或其内容做任何未经授权的查阅、使用、复制或转发。
CONFIDENTIAL NOTE:
This email contains confidential or legally privileged information and is for 
the sole use of its intended recipient. Any unauthorized review, use, copying 
or forwarding of this email or the content of this email is strictly prohibited.