On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote:
>
> On 25.04.2013, at 11:43, Gleb Natapov wrote:
>
> > On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
> >> Currently, devices that are emulated inside KVM are configured in a
> >> hardcoded manner based on an assumption that any given architecture
> >> only has one way to do it. If there's any need to access device state,
> >> it is done through inflexible one-purpose-only IOCTLs (e.g.
> >> KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is
> >> cumbersome and depletes a limited numberspace.
> >>
> >> This API provides a mechanism to instantiate a device of a certain
> >> type, returning an ID that can be used to set/get attributes of the
> >> device. Attributes may include configuration parameters (e.g.
> >> register base address), device state, operational commands, etc. It
> >> is similar to the ONE_REG API, except that it acts on devices rather
> >> than vcpus.
> >>
> >> Both device types and individual attributes can be tested without having
> >> to create the device or get/set the attribute, without the need for
> >> separately managing enumerated capabilities.
> >>
> >> Signed-off-by: Scott Wood <[email protected]>
> >> ---
> >> v4:
> >> - Move some boilerplate back into generic code, as requested by Gleb.
> >> File descriptor management and reference counting is no longer the
> >> concern of the device implementation.
> >>
> >> - Don't hold kvm->lock during create. The original reasons
> >> for doing so have vanished as for as MPIC is concerned, and
> >> this avoids needing to answer the question of whether to
> >> hold the lock during destroy as well.
> >>
> >> Paul, you may need to acquire the lock yourself in kvm_create_xics()
> >> to protect the -EEXIST check.
> >>
> >> v3: remove some changes that were merged into this patch by accident,
> >> and fix the error documentation for KVM_CREATE_DEVICE.
> >> ---
> >> Documentation/virtual/kvm/api.txt | 70 ++++++++++++++++
> >> Documentation/virtual/kvm/devices/README | 1 +
> >> include/linux/kvm_host.h | 35 ++++++++
> >> include/uapi/linux/kvm.h | 27 +++++++
> >> virt/kvm/kvm_main.c | 129
> >> ++++++++++++++++++++++++++++++
> >> 5 files changed, 262 insertions(+)
> >> create mode 100644 Documentation/virtual/kvm/devices/README
> >>
> >> diff --git a/Documentation/virtual/kvm/api.txt
> >> b/Documentation/virtual/kvm/api.txt
> >> index 976eb65..d52f3f9 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents
> >> from the data
> >> written, then `n_invalid' invalid entries, invalidating any previously
> >> valid entries found.
> >>
> >> +4.79 KVM_CREATE_DEVICE
> >> +
> >> +Capability: KVM_CAP_DEVICE_CTRL
> >> +Type: vm ioctl
> >> +Parameters: struct kvm_create_device (in/out)
> >> +Returns: 0 on success, -1 on error
> >> +Errors:
> >> + ENODEV: The device type is unknown or unsupported
> >> + EEXIST: Device already created, and this type of device may not
> >> + be instantiated multiple times
> >> +
> >> + Other error conditions may be defined by individual device types or
> >> + have their standard meanings.
> >> +
> >> +Creates an emulated device in the kernel. The file descriptor returned
> >> +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
> >> +
> >> +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
> >> +device type is supported (not necessarily whether it can be created
> >> +in the current vm).
> >> +
> >> +Individual devices should not define flags. Attributes should be used
> >> +for specifying any behavior that is not implied by the device type
> >> +number.
> >> +
> >> +struct kvm_create_device {
> >> + __u32 type; /* in: KVM_DEV_TYPE_xxx */
> >> + __u32 fd; /* out: device handle */
> >> + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
> >> +};
> > Should we add __u32 padding here to make struct size multiple of u64?
>
> Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't
> and ppc64 doesn't either.
>
Not really. I just notices that we pad some structures to that effect.
> >
> >> +
> >> +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
> >> +
> >> +Capability: KVM_CAP_DEVICE_CTRL
> >> +Type: device ioctl
> >> +Parameters: struct kvm_device_attr
> >> +Returns: 0 on success, -1 on error
> >> +Errors:
> >> + ENXIO: The group or attribute is unknown/unsupported for this device
> >> + EPERM: The attribute cannot (currently) be accessed this way
> >> + (e.g. read-only attribute, or attribute that only makes
> >> + sense when the device is in a different state)
> >> +
> >> + Other error conditions may be defined by individual device types.
> >> +
> >> +Gets/sets a specified piece of device configuration and/or state. The
> >> +semantics are device-specific. See individual device documentation in
> >> +the "devices" directory. As with ONE_REG, the size of the data
> >> +transferred is defined by the particular attribute.
> >> +
> >> +struct kvm_device_attr {
> >> + __u32 flags; /* no flags currently defined */
> >> + __u32 group; /* device-defined */
> >> + __u64 attr; /* group-defined */
> >> + __u64 addr; /* userspace address of attr data */
> >> +};
> >> +
> >> +4.81 KVM_HAS_DEVICE_ATTR
> >> +
> >> +Capability: KVM_CAP_DEVICE_CTRL
> >> +Type: device ioctl
> >> +Parameters: struct kvm_device_attr
> >> +Returns: 0 on success, -1 on error
> >> +Errors:
> >> + ENXIO: The group or attribute is unknown/unsupported for this device
> >> +
> >> +Tests whether a device supports a particular attribute. A successful
> >> +return indicates the attribute is implemented. It does not necessarily
> >> +indicate that the attribute can be read or written in the device's
> >> +current state. "addr" is ignored.
> >>
> >> 4.77 KVM_ARM_VCPU_INIT
> >>
> >> diff --git a/Documentation/virtual/kvm/devices/README
> >> b/Documentation/virtual/kvm/devices/README
> >> new file mode 100644
> >> index 0000000..34a6983
> >> --- /dev/null
> >> +++ b/Documentation/virtual/kvm/devices/README
> >> @@ -0,0 +1 @@
> >> +This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
> >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >> index 20d77d2..8fce9bc 100644
> >> --- a/include/linux/kvm_host.h
> >> +++ b/include/linux/kvm_host.h
> >> @@ -1063,6 +1063,41 @@ static inline bool kvm_check_request(int req,
> >> struct kvm_vcpu *vcpu)
> >>
> >> extern bool kvm_rebooting;
> >>
> >> +struct kvm_device_ops;
> >> +
> >> +struct kvm_device {
> >> + struct kvm_device_ops *ops;
> >> + struct kvm *kvm;
> >> + atomic_t users;
> >> + void *private;
> >> +};
> >> +
> >> +/* create, destroy, and name are mandatory */
> >> +struct kvm_device_ops {
> >> + const char *name;
> >> + int (*create)(struct kvm_device *dev, u32 type);
> >> +
> >> + /*
> >> + * Destroy is responsible for freeing dev.
> >> + *
> >> + * Destroy may be called before or after destructors are called
> >> + * on emulated I/O regions, depending on whether a reference is
> >> + * held by a vcpu or other kvm component that gets destroyed
> >> + * after the emulated I/O.
> >> + */
> >> + void (*destroy)(struct kvm_device *dev);
> >> +
> >> + int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
> >> + int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
> >> + int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
> >> + long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
> >> + unsigned long arg);
> >> +};
> >> +
> >> +void kvm_device_get(struct kvm_device *dev);
> >> +void kvm_device_put(struct kvm_device *dev);
> >> +struct kvm_device *kvm_device_from_filp(struct file *filp);
> >> +
> >> #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
> >>
> >> static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool
> >> val)
> >> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >> index 74d0ff3..20ce2d2 100644
> >> --- a/include/uapi/linux/kvm.h
> >> +++ b/include/uapi/linux/kvm.h
> >> @@ -668,6 +668,7 @@ struct kvm_ppc_smmu_info {
> >> #define KVM_CAP_PPC_EPR 86
> >> #define KVM_CAP_ARM_PSCI 87
> >> #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
> >> +#define KVM_CAP_DEVICE_CTRL 89
> >>
> >> #ifdef KVM_CAP_IRQ_ROUTING
> >>
> >> @@ -909,6 +910,32 @@ struct kvm_s390_ucas_mapping {
> >> #define KVM_ARM_SET_DEVICE_ADDR _IOW(KVMIO, 0xab, struct
> >> kvm_arm_device_addr)
> >>
> >> /*
> >> + * Device control API, available with KVM_CAP_DEVICE_CTRL
> >> + */
> >> +#define KVM_CREATE_DEVICE_TEST 1
> >> +
> >> +struct kvm_create_device {
> >> + __u32 type; /* in: KVM_DEV_TYPE_xxx */
> >> + __u32 fd; /* out: device handle */
> >> + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
> >> +};
> >> +
> >> +struct kvm_device_attr {
> >> + __u32 flags; /* no flags currently defined */
> >> + __u32 group; /* device-defined */
> >> + __u64 attr; /* group-defined */
> >> + __u64 addr; /* userspace address of attr data */
> >> +};
> > Please move struct definitions and KVM_CREATE_DEVICE_TEST define out
> > from ioctl definition block.
>
> Let me change that in my tree...
>
So are you sending this via your tree and I should not apply it directly?
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html