On 05.06.14 15:10, Alexey Kardashevskiy wrote:
On 06/05/2014 11:06 PM, Alexander Graf wrote:
On 05.06.14 08:43, Alexey Kardashevskiy wrote:
On 06/05/2014 03:49 PM, Alexey Kardashevskiy wrote:
POWER KVM supports a KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handling H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts the TCE into the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and the corresponding table, it will put the TCE into
the table and complete hypercall execution. User space will not be
notified.
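The lookup-and-store behaviour described above can be sketched roughly as
follows. This is a toy model, not the kernel code: every name here (the
toy_ prefix, the result codes, the 4K page shift, the table size) is a
hypothetical stand-in chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_ENTRIES 16   /* hypothetical table size */
#define TOY_PAGE_SHIFT    12   /* assumes 4K IOMMU pages */

/* Hypothetical result codes, not the real hypercall return values. */
enum toy_result {
    TOY_HANDLED_IN_KERNEL,   /* TCE stored, user space is not notified */
    TOY_PASS_TO_USERSPACE,   /* no table for this LIOBN, let QEMU handle it */
    TOY_BAD_PARAMETER        /* I/O bus address is outside the table */
};

struct toy_tce_table {
    uint32_t liobn;
    uint64_t entries[TOY_TABLE_ENTRIES];
};

/* Find the table registered for a LIOBN, or NULL if there is none. */
static struct toy_tce_table *toy_find_table(struct toy_tce_table *tables,
                                            size_t n, uint32_t liobn)
{
    for (size_t i = 0; i < n; i++) {
        if (tables[i].liobn == liobn) {
            return &tables[i];
        }
    }
    return NULL;
}

/* Model of the in-kernel handler: store the TCE if a table exists. */
enum toy_result toy_h_put_tce(struct toy_tce_table *tables, size_t n,
                              uint32_t liobn, uint64_t ioba, uint64_t tce)
{
    assert(tables != NULL || n == 0);

    struct toy_tce_table *t = toy_find_table(tables, n, liobn);
    if (!t) {
        return TOY_PASS_TO_USERSPACE;
    }

    uint64_t idx = ioba >> TOY_PAGE_SHIFT;
    if (idx >= TOY_TABLE_ENTRIES) {
        return TOY_BAD_PARAMETER;
    }

    /* The value is copied as-is; no guest-to-host translation happens. */
    t->entries[idx] = tce;
    return TOY_HANDLED_IN_KERNEL;
}
```

The plain copy in the last step is the part that is fine for emulated
devices but, as explained below, not sufficient for VFIO.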
Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is required.
So until the host kernel gets VFIO support for H_PUT_TCE, we had better not
register VFIO's TCE tables in the host.
This adds a bool @kvm_accel flag to the sPAPRTCETable device telling
whether sPAPRTCETable should try allocating the TCE table in the host kernel.
If the flag is false, the table will be created in QEMU.
This adds a kvm_accel parameter to spapr_tce_new_table() to let users
choose whether to use acceleration or not. At the moment it is enabled
for VIO and emulated PCI. Upcoming VFIO support will set it to false.
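A minimal sketch of the allocation choice this flag controls, assuming a
fallback to a QEMU-side table whenever the host kernel allocation is not
attempted or fails. The struct and the toy_ functions are hypothetical
stand-ins and do not reproduce the actual spapr_iommu code or the real
spapr_tce_new_table() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the sPAPRTCETable state. */
struct toy_tcet {
    bool kvm_accel;      /* caller asked for in-kernel acceleration */
    bool in_kernel;      /* table ended up being kernel-backed */
    size_t nb_entries;
    uint64_t *table;
};

/* Stand-in for the host kernel allocation; NULL means "not available". */
static uint64_t *toy_kernel_alloc(bool kvm_available, size_t nb_entries)
{
    return kvm_available ? calloc(nb_entries, sizeof(uint64_t)) : NULL;
}

/* Returns true if a table was allocated somewhere. */
bool toy_tcet_init(struct toy_tcet *t, bool kvm_accel, bool kvm_available,
                   size_t nb_entries)
{
    assert(t != NULL);

    t->kvm_accel = kvm_accel;
    t->in_kernel = false;
    t->nb_entries = nb_entries;
    t->table = NULL;

    /* Only try the host kernel when the caller asked for acceleration. */
    if (kvm_accel) {
        t->table = toy_kernel_alloc(kvm_available, nb_entries);
        t->in_kernel = (t->table != NULL);
    }

    /* Otherwise, or on failure, keep the table in QEMU's own memory. */
    if (!t->table) {
        t->table = calloc(nb_entries, sizeof(uint64_t));
    }
    return t->table != NULL;
}
```

In this model VIO and emulated PCI would pass kvm_accel = true, and a
VFIO caller would pass false so that its table always stays in QEMU.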
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
This is a workaround but it lets me have one IOMMU device for VIO, emulated
PCI and VFIO which is a good thing.
The alternative would be a new KVM_CAP_SPAPR_TCE_VFIO capability but
this needs a kernel update.
Never mind, I'll make it a capability. I'll post capability reservation
patch separately.
Just rename the flag from "kvm_accel" to "vfio_accel", set it to true for
vfio and false for emulated devices. Then the spapr_iommu file can check on
the capability (and default to false for now, since it doesn't exist yet).
Is that OK if the flag does not have anything to do with VFIO per se? :)
The flag means "use in-kernel acceleration if the vfio coupling
capability is available", no?
That way you don't have to reserve a CAP today.
Why exactly can we not do that today?
Because the CAP namespace isn't a garbage bin we can just throw IDs at.
Maybe we realize during patch review that we need completely different CAPs.
Alex