Re: [Xen-devel] [PATCH] xen/pt: fix some pass-thru devices don't work across reboot

2018-11-12 Thread Jan Beulich
>>> On 12.11.18 at 05:58,  wrote:
> On Fri, Nov 09, 2018 at 02:14:04AM -0700, Jan Beulich wrote:
> On 09.11.18 at 01:11,  wrote:
>>> I find some pass-thru devices don't work any more across guest
>>> reboot. Assigning it to another domain also meets the same issue. And
>>> the only way to make it work again is un-binding and binding it to
>>> pciback. Someone reported this issue one year ago [1].
>>
>>I find "some" above and in the title misleading. According to the
>>following description, the issue ought to affect exactly all MSI-X
>>capable devices.
> 
> Some devices whose driver doesn't disable MSI-X when shutdown. But Xen can't
> rely on the guest driver to do this.

The please clarify this in the description.

> That's why I think this issue should be handled in Xen.

I fully agree.

>>Next I'm weary of you clearing the device's maskall
>>bit without also clearing the enable bit (or ASSERT()ing that it's
>>cleared, if that's guaranteed).
> 
> I don't get why claring maskall bit without clearing enable bit would be an
> issue.

Because this means that not individually masked MSI-X entries would
become usable for interrupt delivery by the device again. A misbehaving
device (perhaps by having been programmed wrongly or maliciously)
would then become a risk to the entire system.

>>I also don't follow why you OR in
>>->guest_maskall: Isn't it supposed to be clear anyway?
> 
> Guest's first write to msixctrl register would override this value.
> So don't clear it is also fine. Considering that do_pci_add() issues QMP
> command to add pci device to qemu prior to xc_assign_device(), the question
> here is whether there is any chance qemu's first write has happened.

If some more general re-work (as discussed further down) was not
going to happen, then this would need to be investigated in more
detail. For now I'll leave this aside.

>>From a more general perspective, forcing ->host_maskall to true
>>in msi_set_mask_bit() when memory decoding is disabled is a
>>safety measure. I'd like to see justification (in the description) why
>>it is safe to clear the bit again at a later point.
> 
> Currently, no idea how to prove it. My throught is simple: make sure
> all status is benign. The host_maskall bit is set due to some actions of the
> last owner. Clear it to avoid it affecting the new domain.

But that's safe only if the device as a whole is in a sane state. One
reason could therefore be that the clearing of the bit happens
strictly after the device was reset.

>>Thing is that _if_ it
>>is safe to clear the bit when assigning the device to another guest,
>>why wouldn't it even more so be safe to do so when assigning it
>>back to Dom0 (i.e. in deassign_device())?
> 
> I considered it. But deassign_device() happens before destroying domain during
> which host_maskall is set.

Meaning at that point the owner is Dom0 again. Perhaps there'd be
a way for pciback to signal it has taken control of the device again
and (as per above) reset it?

>>And why would it not be safe to clear the bit perhaps already at
>>the point MSI-X gets disabled?
> 
> Yes, I also thought it might be an option. But doing this definitely
> conflicts with the intention of the first if() in __pci_disable_msix() and
> msi_set_mask_bit() which is trying to enable MSI-x to access MSI-x table to
> set the per-vector mask bit. If it is safe, we don't need that also. 

But that's transient enabling/disabling. I'm only talking about the
permanent disabling case.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/pt: fix some pass-thru devices don't work across reboot

2018-11-11 Thread Chao Gao
On Fri, Nov 09, 2018 at 02:14:04AM -0700, Jan Beulich wrote:
 On 09.11.18 at 01:11,  wrote:
>> I find some pass-thru devices don't work any more across guest
>> reboot. Assigning it to another domain also meets the same issue. And
>> the only way to make it work again is un-binding and binding it to
>> pciback. Someone reported this issue one year ago [1].
>
>I find "some" above and in the title misleading. According to the
>following description, the issue ought to affect exactly all MSI-X
>capable devices.

Some devices whose driver doesn't disable MSI-X when shutdown. But Xen can't
rely on the guest driver to do this. That's why I think this issue should be
handled in Xen.

If QEMU is killed/crashed before destroying a VM, all MSI-x capable devices
would suffer the same issue.

>
>> The root cause is that xen sets the maskall bit in MSI-x capability
>> during guest shutdown.
>
>That's in __pci_disable_msix()? Please help readers by being
>specific.

I think the call trace is:

->arch_domain_destroy
->unmap_domain_pirq (if guest doesn't disable MSI-x)
->pirq_guest_force_unbind
->__pirq_guest_unbind
->mask_msi_irq(=desc->handler->disable())
->the warning in msi_set_mask_bit()

>
>> @@ -1439,7 +1440,27 @@ static int assign_device(struct domain *d, u16 seg, 
>> u8 bus, u8 devfn, u32 flag)
>>  }
>>  
>>  if ( pdev->msix )
>> +{
>> +uint8_t slot = PCI_SLOT(devfn), func = PCI_FUNC(devfn);
>> +uint8_t pos = pci_find_cap_offset(seg, bus, slot, func,
>> +  PCI_CAP_ID_MSIX);
>> +uint16_t control = pci_conf_read16(seg, bus, slot, func,
>> +   msix_control_reg(pos));
>> +uint16_t new_control;
>> +
>> +/* Reset status owned by Xen */
>> +pdev->msix->host_maskall = false;
>> +pdev->msix->warned = DOMID_INVALID;
>> +
>> +/* Update 'maskall' bit in MSI-X Capability */
>> +new_control = (control & ~PCI_MSIX_FLAGS_MASKALL) |
>> +  pdev->msix->guest_maskall;
>> +if ( new_control != control )
>> +pci_conf_write16(seg, bus, slot, func, msix_control_reg(pos),
>> + control);
>> +
>>  msixtbl_init(d);
>> +}
>
>First of all, all maskall bit management is in x86/msi.c. Please keep it
>that way, by introducing a function there, called from here (if need
>be, read on).

Will do

>Next I'm weary of you clearing the device's maskall
>bit without also clearing the enable bit (or ASSERT()ing that it's
>cleared, if that's guaranteed).

I don't get why claring maskall bit without clearing enable bit would be an
issue.

>I also don't follow why you OR in
>->guest_maskall: Isn't it supposed to be clear anyway?

Guest's first write to msixctrl register would override this value.
So don't clear it is also fine. Considering that do_pci_add() issues QMP
command to add pci device to qemu prior to xc_assign_device(), the question
here is whether there is any chance qemu's first write has happened.

>
>From a more general perspective, forcing ->host_maskall to true
>in msi_set_mask_bit() when memory decoding is disabled is a
>safety measure. I'd like to see justification (in the description) why
>it is safe to clear the bit again at a later point.

Currently, no idea how to prove it. My throught is simple: make sure
all status is benign. The host_maskall bit is set due to some actions of the
last owner. Clear it to avoid it affecting the new domain.

>Thing is that _if_ it
>is safe to clear the bit when assigning the device to another guest,
>why wouldn't it even more so be safe to do so when assigning it
>back to Dom0 (i.e. in deassign_device())?

I considered it. But deassign_device() happens before destroying domain during
which host_maskall is set.

>
>And why would it not be safe to clear the bit perhaps already at
>the point MSI-X gets disabled?

Yes, I also thought it might be an option. But doing this definitely
conflicts with the intention of the first if() in __pci_disable_msix() and
msi_set_mask_bit() which is trying to enable MSI-x to access MSI-x table to
set the per-vector mask bit. If it is safe, we don't need that also. 

Thanks
Chao

>It would seem to me that
>_pci_cleanup_msix() could call msix_set_enable(, false), and
>msix_set_enable() might legitimately clear host_maskall in that
>case (but please double check; another possibility could be that
>pci_cleanup_msi() needs a second call site added somewhere).
>In the end the state after disabling MSI-X should match that
>before enabling it the first time.
>
>Jan
>
>

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/pt: fix some pass-thru devices don't work across reboot

2018-11-09 Thread Jan Beulich
>>> On 09.11.18 at 01:11,  wrote:
> I find some pass-thru devices don't work any more across guest
> reboot. Assigning it to another domain also meets the same issue. And
> the only way to make it work again is un-binding and binding it to
> pciback. Someone reported this issue one year ago [1].

I find "some" above and in the title misleading. According to the
following description, the issue ought to affect exactly all MSI-X
capable devices.

> The root cause is that xen sets the maskall bit in MSI-x capability
> during guest shutdown.

That's in __pci_disable_msix()? Please help readers by being
specific.

> @@ -1439,7 +1440,27 @@ static int assign_device(struct domain *d, u16 seg, u8 
> bus, u8 devfn, u32 flag)
>  }
>  
>  if ( pdev->msix )
> +{
> +uint8_t slot = PCI_SLOT(devfn), func = PCI_FUNC(devfn);
> +uint8_t pos = pci_find_cap_offset(seg, bus, slot, func,
> +  PCI_CAP_ID_MSIX);
> +uint16_t control = pci_conf_read16(seg, bus, slot, func,
> +   msix_control_reg(pos));
> +uint16_t new_control;
> +
> +/* Reset status owned by Xen */
> +pdev->msix->host_maskall = false;
> +pdev->msix->warned = DOMID_INVALID;
> +
> +/* Update 'maskall' bit in MSI-X Capability */
> +new_control = (control & ~PCI_MSIX_FLAGS_MASKALL) |
> +  pdev->msix->guest_maskall;
> +if ( new_control != control )
> +pci_conf_write16(seg, bus, slot, func, msix_control_reg(pos),
> + control);
> +
>  msixtbl_init(d);
> +}

First of all, all maskall bit management is in x86/msi.c. Please keep it
that way, by introducing a function there, called from here (if need
be, read on). Next I'm weary of you clearing the device's maskall
bit without also clearing the enable bit (or ASSERT()ing that it's
cleared, if that's guaranteed). I also don't follow why you OR in
->guest_maskall: Isn't it supposed to be clear anyway?

From a more general perspective, forcing ->host_maskall to true
in msi_set_mask_bit() when memory decoding is disabled is a
safety measure. I'd like to see justification (in the description) why
it is safe to clear the bit again at a later point. Thing is that _if_ it
is safe to clear the bit when assigning the device to another guest,
why wouldn't it even more so be safe to do so when assigning it
back to Dom0 (i.e. in deassign_device())?

And why would it not be safe to clear the bit perhaps already at
the point MSI-X gets disabled? It would seem to me that
_pci_cleanup_msix() could call msix_set_enable(, false), and
msix_set_enable() might legitimately clear host_maskall in that
case (but please double check; another possibility could be that
pci_cleanup_msi() needs a second call site added somewhere).
In the end the state after disabling MSI-X should match that
before enabling it the first time.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen/pt: fix some pass-thru devices don't work across reboot

2018-11-09 Thread Chao Gao
I find some pass-thru devices don't work any more across guest
reboot. Assigning it to another domain also meets the same issue. And
the only way to make it work again is un-binding and binding it to
pciback. Someone reported this issue one year ago [1].

The root cause is that xen sets the maskall bit in MSI-x capability
during guest shutdown. The maskall bit will be kept even after the
device being assigned to another domain and no guest's operation can
unmask it.

If driver doesn't disable MSI-X during shutdown, the related pirq
won't be unmapped. Then xen will unmap all pirq. But pciback has
already disabled meory decoding before xen unmapping pirq. Then when
Xen is disabling a MSI of the device, it has to sets the maskall bit
to mask a MSI rather than sets maskbit in MSI-x table (seeing the
warning in msi_set_mask_bit()).

To fix this, host_maskall flag is reset and MSI-x maskall bit is
updated. Also 'msix->warned' is initialized to DOMID_INVALID to avoid
warnings missing for Dom0.

[1]: https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html

Signed-off-by: Chao Gao 
---
 xen/drivers/passthrough/pci.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e5b9602..10b8bed 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -327,6 +327,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 
bus, u8 devfn)
 return NULL;
 }
 spin_lock_init(>table_lock);
+msix->warned = DOMID_INVALID;
 pdev->msix = msix;
 }
 
@@ -1439,7 +1440,27 @@ static int assign_device(struct domain *d, u16 seg, u8 
bus, u8 devfn, u32 flag)
 }
 
 if ( pdev->msix )
+{
+uint8_t slot = PCI_SLOT(devfn), func = PCI_FUNC(devfn);
+uint8_t pos = pci_find_cap_offset(seg, bus, slot, func,
+  PCI_CAP_ID_MSIX);
+uint16_t control = pci_conf_read16(seg, bus, slot, func,
+   msix_control_reg(pos));
+uint16_t new_control;
+
+/* Reset status owned by Xen */
+pdev->msix->host_maskall = false;
+pdev->msix->warned = DOMID_INVALID;
+
+/* Update 'maskall' bit in MSI-X Capability */
+new_control = (control & ~PCI_MSIX_FLAGS_MASKALL) |
+  pdev->msix->guest_maskall;
+if ( new_control != control )
+pci_conf_write16(seg, bus, slot, func, msix_control_reg(pos),
+ control);
+
 msixtbl_init(d);
+}
 
 pdev->fault.count = 0;
 
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel