On 2016-05-03 05:22, Peter Xu wrote:
> On Fri, Apr 29, 2016 at 09:52:14PM +0200, Radim Krčmář wrote:
>> 2016-04-28 17:18+0800, Peter Xu:
>>> On Thu, Apr 28, 2016 at 09:19:28AM +0200, Jan Kiszka wrote:
>>>> Instead of fiddling with irq routes for the IOAPIC - where we don't need
>>>> it -, I would suggest to do the following: Send IOAPIC events via
>>>> kvm_irqchip_send_msi to the kernel. Only irqfd users (vhost, vfio)
>>>> should use the pattern you are now applying to the IOAPIC: establish
>>>> static routes as an irqfd is set up, and that route should be translated
>>>> by the iommu first, register an IEC notifier to update any affected
>>>> route when the iommu translation changes.
>>>
>>> Yes, maybe that's the right thing to do. Or say, when split irqchip,
>>> IOAPIC can avoid using GSI routes any more. If with that, I should
>>> also remove lots of things, like: IEC notifiers for IOAPIC, and all
>>> things related to msi route sync-up in IOAPIC codes with KVM (so I
>>> suppose we will save 24 gsi route entries for KVM, which sounds
>>> good).
>>
>> Sadly, we can't get rid of those GSI routes.  KVM uses them to decide
>> whether it should forward EOI to userspace.  And QEMU also has to remap
>> them, because KVM uses dest_id for that decision.
> 
> Thanks to point out.  Actually I have started to test the new patch
> and did encountered issue when using split irqchip mode. The above
> explains. :)
> 
> So I think I will keep this part of codes un-touched in this
> series, and I can move forward with IR changes.
> 
> However, I am still wondering whether we can avoid this in KVM when
> when using split irqchip mode? E.g., what if we do not do this check
> in KVM and let all EOI be passed to userspace? Like:
> 
> ---
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 36591fa..c0a6e51 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -936,10 +936,6 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, 
> int vector)
>  {
>         int trigger_mode;
> 
> -       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
> -       if (!kvm_ioapic_handles_vector(apic, vector))
> -               return;
> -
>         /* Request a KVM exit to inform the userspace IOAPIC. */
>         if (irqchip_split(apic->vcpu->kvm)) {
>                 apic->vcpu->arch.pending_ioapic_eoi = vector;
> @@ -947,6 +943,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, 
> int vector)
>                 return;
>         }
> 
> +       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
> +       if (!kvm_ioapic_handles_vector(apic, vector))
> +               return;
> +
>         if (apic_test_vector(vector, apic->regs + APIC_TMR))
>                 trigger_mode = IOAPIC_LEVEL_TRIG;
>         else
> 
> ---
> 
> I believe this will brings more user-space context switches, I just
> do not know how this will impact the system (only for curiosity :).

Yes, this doesn't look good from the performance POV. Given that most
EOIs of the APICs will not trigger a message to an IOAPIC and userspace
exits are expensive, that should be measurable.

But you should optimize route updating a bit: I noticed real delays
(almost a second) when reprogramming all IOAPIC pins during Jailhouse
handover. That's because of the heavyweight synchronize_srcu we have on
each route change. Even if that is an extreme case, you should try to
reduce route updates to only those that were actually affected by an
IRTE change.

Jan

Reply via email to