On 04/05/17 09:38, Christoffer Dall wrote:
> On Thu, May 04, 2017 at 09:28:50AM +0100, Marc Zyngier wrote:
>> On 04/05/17 09:13, Christoffer Dall wrote:
>>> On Thu, May 04, 2017 at 09:09:47AM +0100, Marc Zyngier wrote:
>>>> On 03/05/17 19:32, Christoffer Dall wrote:
>>>>> Since we got support for devices in userspace which allows reporting the
>>>>> PMU overflow output status to userspace, we should actually allow
>>>>> creating the PMU on systems without an in-kernel irqchip, which in turn
>>>>> requires us to slightly clarify error codes for the ABI and move things
>>>>> around for the initialization phase.
>>>>>
>>>>> Signed-off-by: Christoffer Dall <cd...@linaro.org>
>>>>> ---
>>>>>  Documentation/virtual/kvm/devices/vcpu.txt | 16 +++++++++-------
>>>>>  virt/kvm/arm/pmu.c                         | 27 
>>>>> +++++++++++++++++----------
>>>>>  2 files changed, 26 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt 
>>>>> b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> index 02f5068..352af6e 100644
>>>>> --- a/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> @@ -16,7 +16,9 @@ Parameters: in kvm_device_attr.addr the address for PMU 
>>>>> overflow interrupt is a
>>>>>  Returns: -EBUSY: The PMU overflow interrupt is already set
>>>>>           -ENXIO: The overflow interrupt not set when attempting to get it
>>>>>           -ENODEV: PMUv3 not supported
>>>>> -         -EINVAL: Invalid PMU overflow interrupt number supplied
>>>>> +         -EINVAL: Invalid PMU overflow interrupt number supplied or
>>>>> +                  trying to set the IRQ number without using an in-kernel
>>>>> +                  irqchip.
>>>>>  
>>>>>  A value describing the PMUv3 (Performance Monitor Unit v3) overflow 
>>>>> interrupt
>>>>>  number for this vcpu. This interrupt could be a PPI or SPI, but the 
>>>>> interrupt
>>>>> @@ -25,11 +27,11 @@ all vcpus, while as an SPI it must be a separate 
>>>>> number per vcpu.
>>>>>  
>>>>>  1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
>>>>>  Parameters: no additional parameter in kvm_device_attr.addr
>>>>> -Returns: -ENODEV: PMUv3 not supported
>>>>> -         -ENXIO: PMUv3 not properly configured as required prior to 
>>>>> calling this
>>>>> -                 attribute
>>>>> +Returns: -ENODEV: PMUv3 not supported or GIC not initialized
>>>>> +         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
>>>>> +                 conigured as required prior to calling this attribute
>>>>>           -EBUSY: PMUv3 already initialized
>>>>>  
>>>>> -Request the initialization of the PMUv3.  This must be done after 
>>>>> creating the
>>>>> -in-kernel irqchip.  Creating a PMU with a userspace irqchip is currently 
>>>>> not
>>>>> -supported.
>>>>> +Request the initialization of the PMUv3.  If using the PMUv3 with an 
>>>>> in-kernel
>>>>> +virtual GIC implementation, this must be done after initializing the 
>>>>> in-kernel
>>>>> +irqchip.
>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>> index 4b43e7f..f046b08 100644
>>>>> --- a/virt/kvm/arm/pmu.c
>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>> @@ -456,21 +456,25 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu 
>>>>> *vcpu)
>>>>>   if (!kvm_arm_support_pmu_v3())
>>>>>           return -ENODEV;
>>>>>  
>>>>> - /*
>>>>> -  * We currently require an in-kernel VGIC to use the PMU emulation,
>>>>> -  * because we do not support forwarding PMU overflow interrupts to
>>>>> -  * userspace yet.
>>>>> -  */
>>>>> - if (!irqchip_in_kernel(vcpu->kvm) || !vgic_initialized(vcpu->kvm))
>>>>> -         return -ENODEV;
>>>>> -
>>>>> - if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features) ||
>>>>> -     !kvm_arm_pmu_irq_initialized(vcpu))
>>>>> + if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
>>>>>           return -ENXIO;
>>>>>  
>>>>>   if (kvm_arm_pmu_v3_ready(vcpu))
>>>>>           return -EBUSY;
>>>>>  
>>>>> + if (irqchip_in_kernel(vcpu->kvm)) {
>>>>> +         /*
>>>>> +          * If using the PMU with an in-kernel virtual GIC
>>>>> +          * implementation, we require the GIC to be already
>>>>> +          * initialized when initializing the PMU.
>>>>> +          */
>>>>> +         if (!vgic_initialized(vcpu->kvm))
>>>>> +                 return -ENODEV;
>>>>> +
>>>>> +         if (!kvm_arm_pmu_irq_initialized(vcpu))
>>>>> +                 return -ENXIO;
>>>>> + }
>>>>> +
>>>>>   kvm_pmu_vcpu_reset(vcpu);
>>>>>   vcpu->arch.pmu.ready = true;
>>>>>  
>>>>> @@ -512,6 +516,9 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, 
>>>>> struct kvm_device_attr *attr)
>>>>>           int __user *uaddr = (int __user *)(long)attr->addr;
>>>>>           int irq;
>>>>>  
>>>>> +         if (!irqchip_in_kernel(vcpu->kvm))
>>>>> +                 return -EINVAL;
>>>>> +
>>>>
>>>> Shouldn't we fail the same way for {get,has}_attr? get_attr is going to
>>>> generate a -ENXIO, and has_attr is going to lie about the availability
>>>> of KVM_ARM_VCPU_PMU_V3_IRQ...
>>>>
>>>
>>> Here's the text from api.txt:
>>>
>>>   Tests whether a device supports a particular attribute.  A successful
>>>   return indicates the attribute is implemented.  It does not necessarily
>>>   indicate that the attribute can be read or written in the device's
>>>   current state.  "addr" is ignored.
>>>
>>> My interpretation therefore is that QEMU can use this ioctl to figure
>>> out if the feature is supported (sort of like a capability), but that
>>> doesn't mean that the configuration of the VM is such that the attribute
>>> can be get or set at that moment.
>>>
>>> For example, there will also alway be situations where you can get an
>>> attr, but not set an attr, what should the has_attr return then?
>>
>> My issue here is that whether we can get/set the interrupt or not is not
>> a function of the device itself, but of the way it is "wired". No matter
>> what "the device's current state" is, we'll never be able to get/set the
>> interrupt.
>>
>> I'd tend to err on the side of caution and return something that is
>> unambiguous, be maybe I have too strict an interpretation of the API.
>>
> 
> Hmm, I see the has_attr as a method for userspace to discover "does this
> kernel have this feature".  If we make has_attr return a value depending
> on the VM having an in-kernel gic or not, we implicitly require
> userspace to create a VM with an in-kernel GIC to discover if this
> kernel has that feature, and therefore also impose an ordering
> requirement of figuring out the capabilities of the kernel (i.e. create
> the GIC before checking this API).
> 
> I think QEMU should be able to do:
> 
>   if (has_attr()) {
>      kvm_supports_set_timer_irq = true;
>      vtimer_irq = foo;
>   } else {
>      kvm_supports_set_timer_irq = false;
>      vtimer_irq = 27; /* default, we're stuck with it */
>   }
> 
>   create_board_definition();
>   create_dt();
>   create_acpi();
> 
>   /* do whatever */
> 
>   if (kvm_supports_set_timer_irq && kvm_irqchip_in_kernel()) {
>       kvm_arm_timer_set_irq(...);
>   }
> 
> And all this should not be coupled to when we create the irqchip device.
> 
> But I may be failing to see the case where the current implementation
> creates a problem for userspace, in which case we should document the
> ordering requirement.

I'm not sure it would create any problem, at least not for the PMU
(there is no working code that would have created a PMU without an irqchip).

It is just that we have a slightly diverging interpretation of what
has_attr means. You see it as "attribute that the device supports", and
I see it as "attribute that the device supports in this configuration".
I'm happy to use your semantics from now on.

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to