[Qemu-devel] [V17 0/4] AMD IOMMU

2016-08-31 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu. 

Changes since v16 - this is mainly supposed to come as a ping :-)
   -minor endian-ness fixes

Changes since v15
   -Endian-ness issue fix
   -cleaned up unused macros
   -removed guest frame number(gfn) from cache entry

Changes since v14
   -MMIO register reading/write bug fix [Peter]
   -Endian-ness issue fix[Peter]
   -Bitfields layouts in IOMMU commands fix[Peter]
   -IVRS changed IVHD device entry from type 3 to 1 to save a few bytes
   -coding style issues, comment grammar and other miscellaneous fixes.

Changes since v13
   -Added an error to make AMD IOMMU incompatible with device assignment.[Alex]
   -Converted AMD IOMMU into a composite PCI and System Bus device. This helps
    in two ways:
      -We can now inherit from the X86 IOMMU base class (which is implemented
       as a System Bus device).
      -We can now reserve an MMIO region for the IOMMU without a BAR register
       and without a hack.

Changes since v12

   -Coding style fixes [Jan, Michael]
   -Error logging fix to avoid using a macro[Jan]
   -moved some PCI macros to PCI header[Jan]
   -Use a lookup table for MMIO register names when tracing[Jan]

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on
    Marcel's patches).
   -IOMMU commands are represented using bitfields, which is less error prone
    and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages, including an obscure AMD IOMMU feature that allows
    default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has
    BusMaster- and is therefore not able to generate interrupts like any other
    PCI device. We have resorted to writing directly to the system address
    space, but this could be fixed by some patches which have not been merged
    yet.

Changes since v9

   -amd_iommu prefixes have been renamed to the shorter 'amdvi', both in the
    macros and in the functions/code. The register macros have not been moved
    to the implementation file since the header consists almost entirely of
    macros, and I reckoned renaming them should suffice.
   -Taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is
    configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been
    fixed[Marcel]
   
You can test[1] these patches by starting QEMU with parameters such as
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
and emulating whatever devices you want.

Not passing any command line parameters to Linux should be enough to test these
patches, since the devices are basically passed through, but to the 'host'
(L1 guest). You can still go ahead and pass the command line parameters
'iommu=pt iommu=1' and try to pass a device to an L2 guest. This can also be
done without passing any IOMMU-related parameters to the kernel.

David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  hw/i386/trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c         |    2 +-
 hw/i386/Makefile.objs       |    1 +
 hw/i386/acpi-build.c        |   76 ++-
 hw/i386/amd_iommu.c         | 1383 +++
 hw/i386/amd_iommu.h         |  289 +
 hw/i386/intel_iommu.c       |    1 +
 hw/i386/trace-events        |   29 +
 hw/i386/x86-iommu.c         |    6 +
 include/hw/acpi/aml-build.h |    1 +
 include/hw/i386/x86-iommu.h |   12 +
 include/hw/pci/pci.h        |    4 +-
 11 files changed, 1793 insertions(+), 11 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4
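
Several of the change-log entries above concern byte order when reading guest
memory (for example the 'dma_memory_read' item). As a rough illustration only,
and not code from this series, the sketch below decodes a little-endian 64-bit
value from a raw byte buffer without depending on host endianness. The name
`read_le64` is hypothetical; inside QEMU one would normally rely on existing
helpers such as le64_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a little-endian 64-bit value byte by byte,
 * so the result is identical on little- and big-endian hosts. */
static uint64_t read_le64(const uint8_t *buf, size_t len)
{
    uint64_t val = 0;

    assert(buf != NULL && len >= 8);
    for (int i = 7; i >= 0; i--) {
        val = (val << 8) | buf[i];   /* most significant byte is buf[7] */
    }
    return val;
}
```

A buffer filled by a DMA read could be passed straight to such a helper before
any field of a table entry is inspected.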




Re: [Qemu-devel] [V2 0/6] AMD IOMMU interrupt remapping

2016-08-15 Thread David Kiarie
On Mon, Aug 15, 2016 at 7:32 PM, David Kiarie 
wrote:

> Hello all,
>
> The following patchset implements AMD-Vi interrupt remapping logic and
> hooks it onto the existing IR infrastructure.
>
> I have bundled this patchset together with the "Explicit SID for IOAPIC"
> patches. "Explicit SID for IOAPIC" associates MSI routes with a requester
> ID and, if present, a PCI device, which enables platform devices like the
> IOAPIC to make interrupt requests using an explicit SID, as required by
> both VT-d and AMD-Vi.
>

This depends on the AMD IOMMU patchset, but for quick testing see
https://github.com/aslaq/qemu IR


>
> David Kiarie (6):
>   hw/msi: Allow platform devices to use explicit SID
>   hw/i386: enforce SID verification
>   hw/iommu: Prepare for AMD IOMMU interrupt remapping
>   hw/iommu: AMD IOMMU interrupt remapping
>   hw/acpi: report IOAPIC on IVRS
>   hw/iommu: share common code between IOMMUs
>
>  hw/i386/acpi-build.c  |   2 +
>  hw/i386/amd_iommu.c   | 244 +++++-
>  hw/i386/amd_iommu.h   |  40 +++
>  hw/i386/intel_iommu.c |  89 +++---
>  hw/i386/kvm/pci-assign.c  |  12 +-
>  hw/i386/x86-iommu.c   |   8 ++
>  hw/intc/ioapic.c  |  25 +++-
>  hw/misc/ivshmem.c |   6 +-
>  hw/vfio/pci.c |   6 +-
>  hw/virtio/virtio-pci.c|   7 +-
>  include/hw/i386/ioapic_internal.h |   1 +
>  include/hw/i386/x86-iommu.h   |   1 +
>  include/sysemu/kvm.h  |  25 ++--
>  kvm-all.c |  10 +-
>  kvm-stub.c|   5 +-
>  target-i386/kvm.c |  15 ++-
>  16 files changed, 386 insertions(+), 110 deletions(-)
>
> --
> 2.1.4
>
>


[Qemu-devel] [V2 4/6] hw/iommu: AMD IOMMU interrupt remapping

2016-08-15 Thread David Kiarie
Introduce AMD IOMMU interrupt remapping and hook it onto
the existing interrupt remapping infrastructure

Signed-off-by: David Kiarie 
---
 hw/i386/amd_iommu.c | 244 +++-
 hw/i386/amd_iommu.h |   4 +-
 2 files changed, 243 insertions(+), 5 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 19da365..08d6dae 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -18,11 +18,10 @@
  * with this program; if not, see <http://www.gnu.org/licenses/>.
  *
  * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
  */
 #include "qemu/osdep.h"
-#include 
-#include "hw/pci/msi.h"
-#include "hw/i386/pc.h"
+#include "qemu/error-report.h"
 #include "hw/i386/amd_iommu.h"
 #include "hw/pci/pci_bus.h"
 #include "trace.h"
@@ -270,6 +269,31 @@ typedef struct QEMU_PACKED {
 #endif /* __BIG_ENDIAN_BITFIELD */
 } CMDCompletePPR;
 
+typedef union IRTE {
+struct {
+#ifdef HOST_WORDS_BIGENDIAN
+uint32_t destination:8;
+uint32_t rsvd_1:1;
+uint32_t dm:1;
+uint32_t rq_eoi:1;
+uint32_t int_type:3;
+uint32_t no_fault:1;
+uint32_t valid:1;
+#else
+uint32_t valid:1;
+uint32_t no_fault:1;
+uint32_t int_type:3;
+uint32_t rq_eoi:1;
+uint32_t dm:1;
+uint32_t rsvd_1:1;
+uint32_t destination:8;
+#endif
+uint32_t vector:8;
+uint32_t rsvd_2:8;
+} bits;
+uint32_t data;
+} IRTE;
+
 /* configure MMIO registers at startup/reset */
 static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
uint64_t romask, uint64_t w1cmask)
@@ -660,6 +684,11 @@ static void amdvi_inval_inttable(AMDVIState *s, CMDInvalIntrTable *inval)
 amdvi_log_illegalcom_error(s, inval->type, s->cmdbuf + s->cmdbuf_head);
 return;
 }
+
+if (s->ir_cache) {
+x86_iommu_iec_notify_all(X86_IOMMU_DEVICE(s), true, 0, 0);
+}
+
 trace_amdvi_intr_inval();
 }
 
@@ -1221,6 +1250,197 @@ static IOMMUTLBEntry amdvi_translate(MemoryRegion *iommu, hwaddr addr,
 return ret;
 }
 
+static inline int amdvi_ir_handle_non_vectored(MSIMessage *src,
+   MSIMessage *dst, uint8_t bitpos,
+   uint64_t dte)
+{
+if ((dte & (1ULL << bitpos))) {
+/* passing interrupt enabled */
+memcpy(dst, src, sizeof(*dst));
+} else {
+/* should be target aborted */
+return -AMDVI_TARGET_ABORT;
+}
+return 0;
+}
+
+static int amdvi_remap_ir_intctl(uint64_t dte, IRTE irte,
+ MSIMessage *src, MSIMessage *dst)
+{
+int ret = 0;
+
+switch ((dte >> AMDVI_DTE_INTCTL_RSHIFT) & 3UL) {
+case AMDVI_INTCTL_PASS:
+/* pass */
+memcpy(dst, src, sizeof(*dst));
+break;
+case AMDVI_INTCTL_REMAP:
+/* remap */
+if (irte.bits.valid) {
+/* LOCAL APIC address */
+dst->address = AMDVI_LOCAL_APIC_ADDR;
+/* destination mode */
+dst->address |= ((uint64_t)irte.bits.dm) <<
+AMDVI_MSI_ADDR_DM_RSHIFT;
+/* RH */
+dst->address |= ((uint64_t)irte.bits.rq_eoi) <<
+AMDVI_MSI_ADDR_RH_RSHIFT;
+/* Destination ID */
+dst->address |= ((uint64_t)irte.bits.destination) <<
+AMDVI_MSI_ADDR_DEST_RSHIFT;
+/* construct data - vector */
+dst->data |= irte.bits.vector;
+/* Interrupt type */
+dst->data |= ((uint64_t)irte.bits.int_type) <<
+ AMDVI_MSI_DATA_DM_RSHIFT;
+} else  {
+ret = -AMDVI_TARGET_ABORT;
+}
+break;
+case AMDVI_INTCTL_ABORT:
+case AMDVI_INTCTL_RSVD:
+ret = -AMDVI_TARGET_ABORT;
+}
+return ret;
+}
+
+static int amdvi_irte_get(AMDVIState *s, MSIMessage *src, IRTE *irte,
+  uint64_t *dte, uint16_t devid)
+{
+uint64_t irte_root, offset = devid * AMDVI_DEVTAB_ENTRY_SIZE,
+ ir_table_size;
+
+irte_root = dte[2] & AMDVI_IRTEROOT_MASK;
+offset = (src->data & AMDVI_IRTE_INDEX_MASK) << 2;
+ir_table_size = 1UL << (dte[2] & AMDVI_IR_TABLE_SIZE_MASK);
+/* enforce IR table size */
+if (offset > (ir_table_size * AMDVI_DEFAULT_IRTE_SIZE)) {
+trace_amdvi_invalid_irte_entry(offset, ir_table_size);
+return -AMDVI_TARGET_ABORT;
+}
+/* read IRTE */
+if (dma_memory_read(&address_space_memory, irte_root + offset,
+irte, sizeof(*irte))) {
+trace_amdvi_irte_get_fail(irte_root, offset);
+return -AMDVI_DEV_TAB_HW;
+}
+return 0;
+}
+
+static int amdvi_int_re
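
The INTCTL handling in amdvi_remap_ir_intctl() above can be summarised in a
small standalone sketch. The macro values are copied from patch 3/6;
`enum ir_action` and `intctl_action` are hypothetical names used only here, and
the sketch uses ULL constants so that the 64-bit shift behaves the same on
every host. It illustrates the decision logic and is not the patch's
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the amd_iommu.h hunk of patch 3/6. */
#define AMDVI_DTE_INTCTL_RSHIFT 60
#define AMDVI_INTCTL_ABORT  0x0
#define AMDVI_INTCTL_PASS   0x1
#define AMDVI_INTCTL_REMAP  0x2
#define AMDVI_INTCTL_RSVD   0x3

/* Hypothetical result type, for this sketch only. */
enum ir_action { IR_ABORT, IR_PASS, IR_REMAP };

/* Decide what to do with an interrupt given the DTE quadword that holds
 * IntCtl and whether the matching IRTE has its valid bit set. */
static enum ir_action intctl_action(uint64_t dte_qword, int irte_valid)
{
    switch ((dte_qword >> AMDVI_DTE_INTCTL_RSHIFT) & 3ULL) {
    case AMDVI_INTCTL_PASS:
        return IR_PASS;                 /* forward the message unchanged */
    case AMDVI_INTCTL_REMAP:
        return irte_valid ? IR_REMAP : IR_ABORT;
    case AMDVI_INTCTL_ABORT:
    case AMDVI_INTCTL_RSVD:
        return IR_ABORT;
    default:
        assert(0 && "unreachable: value is masked to two bits");
        return IR_ABORT;
    }
}
```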

[Qemu-devel] [V2 5/6] hw/acpi: report IOAPIC on IVRS

2016-08-15 Thread David Kiarie
Report the IOAPIC via IVRS, which allows the Linux AMD-Vi
driver to enable interrupt remapping.

Signed-off-by: David Kiarie 
---
 hw/i386/acpi-build.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 49bd183..c2559ff 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2615,6 +2615,8 @@ build_amd_iommu(GArray *table_data, BIOSLinker *linker)
  *   Refer to Spec - Table 95:IVHD Device Entry Type Codes(4-byte)
  */
 build_append_int_noprefix(table_data, 0x001, 4);
+/* IOAPIC represented as an 8-byte entry. Spec v2.62 Table 97 */
+build_append_int_noprefix(table_data, 0x0100a000cf48, 8);
 
 build_header(linker, table_data, (void *)(table_data->data + iommu_start),
  "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
-- 
2.1.4
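
The hunk above emits the IVHD entry as a single integer of a given byte width.
As a hedged illustration of what that means for the bytes that end up in the
table, the standalone sketch below writes the low `size` bytes of a value,
least significant byte first, which is my understanding of how
build_append_int_noprefix() behaves. `append_int_le` is a hypothetical name and
works on a plain array rather than a GArray.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a table builder: write the low `size` bytes of
 * `value` into `out`, least significant byte first. Returns bytes written. */
static size_t append_int_le(uint8_t *out, uint64_t value, size_t size)
{
    assert(out != NULL && size <= sizeof(value));
    for (size_t i = 0; i < size; i++) {
        out[i] = (uint8_t)(value & 0xff);
        value >>= 8;
    }
    return size;
}
```

With this ordering the lowest byte of the integer literal becomes the first
byte of the entry, so the entry type code has to sit in the low byte of the
constant.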




[Qemu-devel] [V2 2/6] hw/i386: enforce SID verification

2016-08-15 Thread David Kiarie
Platform devices are now able to make interrupt requests with
explicit SIDs, hence we can safely expect the triggered AddressSpace ID
to match the requesting ID.

Signed-off-by: David Kiarie 
---
 hw/i386/intel_iommu.c | 77 ++-
 1 file changed, 39 insertions(+), 38 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 496d836..e4bad6a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2043,43 +2043,41 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
 return -VTD_FR_IR_IRTE_RSVD;
 }
 
-if (sid != X86_IOMMU_SID_INVALID) {
-/* Validate IRTE SID */
-source_id = le32_to_cpu(entry->irte.source_id);
-switch (entry->irte.sid_vtype) {
-case VTD_SVT_NONE:
-VTD_DPRINTF(IR, "No SID validation for IRTE index %d", index);
-break;
-
-case VTD_SVT_ALL:
-mask = vtd_svt_mask[entry->irte.sid_q];
-if ((source_id & mask) != (sid & mask)) {
-VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
-"%d failed (reqid 0x%04x sid 0x%04x)", index,
-sid, source_id);
-return -VTD_FR_IR_SID_ERR;
-}
-break;
+/* Validate IRTE SID */
+source_id = le32_to_cpu(entry->irte.source_id);
+switch (entry->irte.sid_vtype) {
+case VTD_SVT_NONE:
+VTD_DPRINTF(IR, "No SID validation for IRTE index %d", index);
+break;
 
-case VTD_SVT_BUS:
-bus_max = source_id >> 8;
-bus_min = source_id & 0xff;
-bus = sid >> 8;
-if (bus > bus_max || bus < bus_min) {
-VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
-"failed (bus %d outside %d-%d)", index, bus,
-bus_min, bus_max);
-return -VTD_FR_IR_SID_ERR;
-}
-break;
+case VTD_SVT_ALL:
+mask = vtd_svt_mask[entry->irte.sid_q];
+if ((source_id & mask) != (sid & mask)) {
+VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
+"%d failed (reqid 0x%04x sid 0x%04x)", index,
+sid, source_id);
+return -VTD_FR_IR_SID_ERR;
+}
+break;
 
-default:
-VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
-"%d", entry->irte.sid_vtype, index);
-/* Take this as verification failure. */
+case VTD_SVT_BUS:
+bus_max = source_id >> 8;
+bus_min = source_id & 0xff;
+bus = sid >> 8;
+if (bus > bus_max || bus < bus_min) {
+VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
+"failed (bus %d outside %d-%d)", index, bus,
+bus_min, bus_max);
 return -VTD_FR_IR_SID_ERR;
-break;
 }
+break;
+
+default:
+VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
+"%d", entry->irte.sid_vtype, index);
+/* Take this as verification failure. */
+return -VTD_FR_IR_SID_ERR;
+break;
 }
 
 return 0;
@@ -2252,14 +2250,17 @@ static MemTxResult vtd_mem_ir_write(void *opaque, hwaddr addr,
 {
 int ret = 0;
 MSIMessage from = {}, to = {};
-uint16_t sid = X86_IOMMU_SID_INVALID;
+VTDAddressSpace *as = opaque;
+uint16_t sid = PCI_BUILD_BDF(pci_bus_num(as->bus), as->devfn);
 
 from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
 from.data = (uint32_t) value;
 
-if (!attrs.unspecified) {
-/* We have explicit Source ID */
-sid = attrs.requester_id;
+if (attrs.requester_id != sid) {
+VTD_DPRINTF(GENERAL, "int remap request for sid 0x%04x"
+" requester_id 0x%04x couldn't be verified",
+sid, attrs.requester_id);
+return MEMTX_ERROR;
 }
 
 ret = vtd_interrupt_remap_msi(opaque, &from, &to, sid);
@@ -2325,7 +2326,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
 memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
  &s->iommu_ops, "intel_iommu", UINT64_MAX);
 memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
-  &vtd_mem_ir_ops, s, "intel_iommu_ir",
+  &vtd_mem_ir_ops, vtd_dev_as, "intel_iommu_ir",
   VTD_INTERRUPT_ADDR_SIZE);
 memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
 &vtd_dev_as->iommu_ir);
-- 
2.1.4
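
For readers who want the two SID checks from vtd_irte_get() in isolation, here
is a minimal sketch. The function names are hypothetical, the mask is passed in
as a parameter instead of being looked up in vtd_svt_mask[], and the bus-range
layout (upper byte is the highest bus, lower byte the lowest) is taken from the
hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masked comparison: only the bits selected by `mask` must agree. */
static bool sid_match_masked(uint16_t source_id, uint16_t sid, uint16_t mask)
{
    assert(mask != 0);   /* a zero mask would accept every requester */
    return (source_id & mask) == (sid & mask);
}

/* Bus-range check: source_id packs the highest allowed bus in its upper
 * byte and the lowest allowed bus in its lower byte. */
static bool sid_bus_in_range(uint16_t source_id, uint16_t sid)
{
    uint8_t bus_max = source_id >> 8;
    uint8_t bus_min = source_id & 0xff;
    uint8_t bus = sid >> 8;

    return bus >= bus_min && bus <= bus_max;
}
```

A caller would pick one of the two checks from the IRTE's validation type and
treat a false result as a verification failure.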




[Qemu-devel] [V2 6/6] hw/iommu: share common code between IOMMUs

2016-08-15 Thread David Kiarie
Enabling interrupt remapping with kernel_irqchip=on should result
in an error for both VT-d and AMD-Vi

Signed-off-by: David Kiarie 
---
 hw/i386/intel_iommu.c | 9 -
 hw/i386/x86-iommu.c   | 8 
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e4bad6a..bf86dcc 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -30,7 +30,6 @@
 #include "hw/boards.h"
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
-#include "sysemu/kvm.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -2472,14 +2471,6 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 Q35_PSEUDO_DEVFN_IOAPIC);
 /* Pseudo address space under root PCI bus. */
 pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
-
-/* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
-if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
-!kvm_irqchip_is_split()) {
-error_report("Intel Interrupt Remapping cannot work with "
- "kernel-irqchip=on, please use 'split|off'.");
-exit(1);
-}
 }
 
 static void vtd_class_init(ObjectClass *klass, void *data)
diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
index 2278af7..66510f7 100644
--- a/hw/i386/x86-iommu.c
+++ b/hw/i386/x86-iommu.c
@@ -21,6 +21,7 @@
 #include "hw/sysbus.h"
 #include "hw/boards.h"
 #include "hw/i386/x86-iommu.h"
+#include "sysemu/kvm.h"
 #include "qemu/error-report.h"
 #include "trace.h"
 
@@ -84,6 +85,13 @@ static void x86_iommu_realize(DeviceState *dev, Error **errp)
 if (x86_class->realize) {
 x86_class->realize(dev, errp);
 }
+/* Currently IOMMU IR only support "kernel-irqchip={off|split}" */
+if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
+!kvm_irqchip_is_split()) {
+error_report("Interrupt Remapping cannot work with "
+ "kernel-irqchip=on, please use 'split|off'.");
+exit(1);
+}
 
 x86_iommu_set_default(X86_IOMMU_DEVICE(dev));
 }
-- 
2.1.4
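
The condition moved into x86_iommu_realize() boils down to a small truth table.
The sketch below restates it outside QEMU; `enum irqchip_mode` and
`ir_config_allowed` are hypothetical names, whereas the real code queries KVM
through kvm_irqchip_in_kernel() and kvm_irqchip_is_split().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum for this sketch; it mirrors kernel-irqchip=off|split|on. */
enum irqchip_mode { IRQCHIP_OFF, IRQCHIP_SPLIT, IRQCHIP_ON };

/* Interrupt remapping is usable unless the irqchip is fully in the kernel. */
static bool ir_config_allowed(bool intr_supported, enum irqchip_mode mode)
{
    assert(mode == IRQCHIP_OFF || mode == IRQCHIP_SPLIT || mode == IRQCHIP_ON);
    if (!intr_supported) {
        return true;   /* nothing to remap, any irqchip mode is fine */
    }
    return mode != IRQCHIP_ON;
}
```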




[Qemu-devel] [V2 0/6] AMD IOMMU interrupt remapping

2016-08-15 Thread David Kiarie
Hello all,

The following patchset implements AMD-Vi interrupt remapping logic and hooks it
onto the existing IR infrastructure.

I have bundled this patchset together with the "Explicit SID for IOAPIC"
patches. "Explicit SID for IOAPIC" associates MSI routes with a requester ID
and, if present, a PCI device, which enables platform devices like the IOAPIC
to make interrupt requests using an explicit SID, as required by both VT-d and
AMD-Vi.

David Kiarie (6):
  hw/msi: Allow platform devices to use explicit SID
  hw/i386: enforce SID verification
  hw/iommu: Prepare for AMD IOMMU interrupt remapping
  hw/iommu: AMD IOMMU interrupt remapping
  hw/acpi: report IOAPIC on IVRS
  hw/iommu: share common code between IOMMUs

 hw/i386/acpi-build.c  |   2 +
 hw/i386/amd_iommu.c   | 244 +-
 hw/i386/amd_iommu.h   |  40 +++
 hw/i386/intel_iommu.c |  89 +++---
 hw/i386/kvm/pci-assign.c  |  12 +-
 hw/i386/x86-iommu.c   |   8 ++
 hw/intc/ioapic.c  |  25 +++-
 hw/misc/ivshmem.c |   6 +-
 hw/vfio/pci.c |   6 +-
 hw/virtio/virtio-pci.c|   7 +-
 include/hw/i386/ioapic_internal.h |   1 +
 include/hw/i386/x86-iommu.h   |   1 +
 include/sysemu/kvm.h  |  25 ++--
 kvm-all.c |  10 +-
 kvm-stub.c|   5 +-
 target-i386/kvm.c |  15 ++-
 16 files changed, 386 insertions(+), 110 deletions(-)

-- 
2.1.4




[Qemu-devel] [V2 1/6] hw/msi: Allow platform devices to use explicit SID

2016-08-15 Thread David Kiarie
When using an IOMMU, platform devices like the IOAPIC are required to make
interrupt remapping requests using an explicit SID. We associate an MSI
route with a requester ID and, if present, a PCI device, which ensures
that platform devices can call IOMMU interrupt remapping code with an
explicit SID while maintaining compatibility with the original code,
which mainly dealt with PCI devices.

Signed-off-by: David Kiarie 
---
 hw/i386/intel_iommu.c |  3 +++
 hw/i386/kvm/pci-assign.c  | 12 
 hw/intc/ioapic.c  | 25 +
 hw/misc/ivshmem.c |  6 --
 hw/vfio/pci.c |  6 --
 hw/virtio/virtio-pci.c|  7 +--
 include/hw/i386/ioapic_internal.h |  1 +
 include/hw/i386/x86-iommu.h   |  1 +
 include/sysemu/kvm.h  | 25 ++---
 kvm-all.c | 10 ++
 kvm-stub.c|  5 +++--
 target-i386/kvm.c | 15 +--
 12 files changed, 79 insertions(+), 37 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d6e02c8..496d836 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2466,6 +2466,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 vtd_init(s);
 sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
 pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
+/* IOMMU expected IOAPIC SID */
+x86_iommu->ioapic_bdf = PCI_BUILD_BDF(Q35_PSEUDO_DEVFN_IOAPIC,
+Q35_PSEUDO_DEVFN_IOAPIC);
 /* Pseudo address space under root PCI bus. */
 pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
 
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 8238fbc..3f26be1 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -976,7 +976,8 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
 if (ctrl_byte & PCI_MSI_FLAGS_ENABLE) {
 int virq;
 
-virq = kvm_irqchip_add_msi_route(kvm_state, 0, pci_dev);
+virq = kvm_irqchip_add_msi_route(kvm_state, 0, pci_dev,
+ pci_requester_id(pci_dev));
 if (virq < 0) {
 perror("assigned_dev_update_msi: kvm_irqchip_add_msi_route");
 return;
@@ -1014,7 +1015,8 @@ static void assigned_dev_update_msi_msg(PCIDevice *pci_dev)
 }
 
 kvm_irqchip_update_msi_route(kvm_state, assigned_dev->msi_virq[0],
- msi_get_message(pci_dev, 0), pci_dev);
+ msi_get_message(pci_dev, 0), pci_dev,
+ pci_requester_id(pci_dev));
 kvm_irqchip_commit_routes(kvm_state);
 }
 
@@ -1078,7 +1080,8 @@ static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
 continue;
 }
 
-r = kvm_irqchip_add_msi_route(kvm_state, i, pci_dev);
+r = kvm_irqchip_add_msi_route(kvm_state, i, pci_dev,
+  pci_requester_id(pci_dev));
 if (r < 0) {
 return r;
 }
@@ -1599,7 +1602,8 @@ static void assigned_dev_msix_mmio_write(void *opaque, hwaddr addr,
 
 ret = kvm_irqchip_update_msi_route(kvm_state,
adev->msi_virq[i], msg,
-   pdev);
+   pdev,
+   pci_requester_id(pdev));
 if (ret) {
 error_report("Error updating irq routing entry (%d)", ret);
 }
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 31791b0..b8b2f33 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -95,9 +95,17 @@ static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
 (info->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
 }
 
-static void ioapic_service(IOAPICCommonState *s)
+static void ioapic_as_write(IOAPICCommonState *s, uint32_t data, uint64_t addr)
 {
 AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+MemTxAttrs attrs = {};
+
+attrs.requester_id = s->devid;
+address_space_stl_le(ioapic_as, addr, data, attrs, NULL);
+}
+
+static void ioapic_service(IOAPICCommonState *s)
+{
 struct ioapic_entry_info info;
 uint8_t i;
 uint32_t mask;
@@ -141,7 +149,7 @@ static void ioapic_service(IOAPICCommonState *s)
  * the IOAPIC message into a MSI one, and its
  * address space will decide whether we need a
  * translation. */
-stl_le_phys(ioapic_as, info.addr, info.data);
+ioapic_as_write(s, info.data, info.addr);
 }
 }
 }
@@ -197,7 +205,7 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
 ioapic_ent

[Qemu-devel] [V2 3/6] hw/iommu: Prepare for AMD IOMMU interrupt remapping

2016-08-15 Thread David Kiarie
Introduce macros and trace events for use in AMD IOMMU
interrupt remapping

Signed-off-by: David Kiarie 
---
 hw/i386/amd_iommu.h | 38 --
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 2f4ac55..6f62e3a 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -187,11 +187,6 @@
 #define AMDVI_MT_LINT1  0xb
 #define AMDVI_MT_LINT0  0xe
 
-/* Ext reg, GA support */
-#define AMDVI_GASUP(1UL << 7)
-/* MMIO control GA enable bits */
-#define AMDVI_GAEN (1UL << 17)
-
 /* MSI interrupt type mask */
 #define AMDVI_IR_TYPE_MASK 0x300
 
@@ -204,12 +199,18 @@
 /* bits determining whether specific interrupts should be passed
  * split DTE into 64-bit chunks
  */
-#define AMDVI_DTE_INTPASS   56
-#define AMDVI_DTE_EINTPASS  57
-#define AMDVI_DTE_NMIPASS   58
-#define AMDVI_DTE_INTCTL60
-#define AMDVI_DTE_LINT0PASS 62
-#define AMDVI_DTE_LINT1PASS 63
+#define AMDVI_DTE_INTPASS_LSHIFT   56
+#define AMDVI_DTE_EINTPASS_LSHIFT  57
+#define AMDVI_DTE_NMIPASS_LSHIFT   58
+#define AMDVI_DTE_INTCTL_RSHIFT60
+#define AMDVI_DTE_LINT0PASS_LSHIFT 62
+#define AMDVI_DTE_LINT1PASS_LSHIFT 63
+
+/* INTCTL expected values */
+#define AMDVI_INTCTL_ABORT  0x0
+#define AMDVI_INTCTL_PASS   0x1
+#define AMDVI_INTCTL_REMAP  0x2
+#define AMDVI_INTCTL_RSVD   0x3
 
 /* interrupt data valid */
 #define AMDVI_IR_VALID  (1UL << 0)
@@ -220,17 +221,6 @@
 /* default IRTE size */
 #define AMDVI_DEFAULT_IRTE_SIZE 0x4
 
-/* IRTE size with GASup enabled */
-#define AMDVI_IRTE_SIZE_GASUP   0x10
-
-#define AMDVI_IRTE_VECTOR_MASK(0xffU << 16)
-#define AMDVI_IRTE_DEST_MASK  (0xffU << 8)
-#define AMDVI_IRTE_DM_MASK(0x1U << 6)
-#define AMDVI_IRTE_RQEOI_MASK (0x1U << 5)
-#define AMDVI_IRTE_INTTYPE_MASK   (0x7U << 2)
-#define AMDVI_IRTE_SUPIOPF_MASK   (0x1U << 1)
-#define AMDVI_IRTE_REMAP_MASK (0x1U << 0)
-
 #define AMDVI_IR_TABLE_SIZE_MASK 0xfe
 
 /* offsets into MSI data */
@@ -243,6 +233,10 @@
 #define AMDVI_MSI_ADDR_RH_RSHIFT   0x3
 #define AMDVI_MSI_ADDR_DEST_RSHIFT 0xc
 
+#define AMDVI_BUS_NUM  0x0
+/* AMD-Vi specific IOAPIC Device function */
+#define AMDVI_DEVFN_IOAPIC 0xa0
+
 #define AMDVI_LOCAL_APIC_ADDR 0xfee0
 
 /* extended feature support */
-- 
2.1.4
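
The AMDVI_IRTE_*_MASK macros removed in the hunk above describe the layout of a
32-bit interrupt remapping table entry; patch 4/6 expresses the same layout with
bitfields. As a sketch of the mask-based alternative, which does not depend on
how the compiler orders bitfields, the code below decodes an entry with the
removed masks. `struct irte_fields` and `irte_decode` are hypothetical names,
and the shift amounts are simply read off the mask definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field masks as they appear (removed) in the amd_iommu.h hunk above. */
#define AMDVI_IRTE_VECTOR_MASK    (0xffU << 16)
#define AMDVI_IRTE_DEST_MASK      (0xffU << 8)
#define AMDVI_IRTE_DM_MASK        (0x1U << 6)
#define AMDVI_IRTE_RQEOI_MASK     (0x1U << 5)
#define AMDVI_IRTE_INTTYPE_MASK   (0x7U << 2)
#define AMDVI_IRTE_SUPIOPF_MASK   (0x1U << 1)
#define AMDVI_IRTE_REMAP_MASK     (0x1U << 0)

/* Hypothetical decoded form, for this sketch only. */
struct irte_fields {
    uint8_t vector;
    uint8_t destination;
    uint8_t int_type;
    uint8_t dm;
    uint8_t rq_eoi;
    uint8_t no_fault;
    uint8_t valid;
};

/* Split a host-order 32-bit IRTE into its fields using shifts and masks. */
static void irte_decode(uint32_t raw, struct irte_fields *f)
{
    assert(f != NULL);
    f->vector      = (raw & AMDVI_IRTE_VECTOR_MASK) >> 16;
    f->destination = (raw & AMDVI_IRTE_DEST_MASK) >> 8;
    f->dm          = (raw & AMDVI_IRTE_DM_MASK) >> 6;
    f->rq_eoi      = (raw & AMDVI_IRTE_RQEOI_MASK) >> 5;
    f->int_type    = (raw & AMDVI_IRTE_INTTYPE_MASK) >> 2;
    f->no_fault    = (raw & AMDVI_IRTE_SUPIOPF_MASK) >> 1;
    f->valid       = raw & AMDVI_IRTE_REMAP_MASK;
}
```

The raw word would first have to be converted from little-endian guest memory
to host order before being handed to such a decoder.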




[Qemu-devel] [V16 3/4] hw/i386: Introduce AMD IOMMU

2016-08-13 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation, error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee0-0xfeef from translation
as it is the q35 interrupt region.

We advertise features that are not yet implemented to please
the Linux IOMMU driver.

The IOTLB aims at implementing commands found on real IOMMUs, which is
essential for debugging, and may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1390 +
 hw/i386/amd_iommu.h   |  289 ++
 3 files changed, 1680 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 90e94ff..909ead6 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += x86-iommu.o intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 000..27e68e0
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1390 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ */
+#include "qemu/osdep.h"
+#include "hw/i386/amd_iommu.h"
+#include "trace.h"
+
+/* used AMD-Vi MMIO registers */
+const char *amdvi_mmio_low[] = {
+"AMDVI_MMIO_DEVTAB_BASE",
+"AMDVI_MMIO_CMDBUF_BASE",
+"AMDVI_MMIO_EVTLOG_BASE",
+"AMDVI_MMIO_CONTROL",
+"AMDVI_MMIO_EXCL_BASE",
+"AMDVI_MMIO_EXCL_LIMIT",
+"AMDVI_MMIO_EXT_FEATURES",
+"AMDVI_MMIO_PPR_BASE",
+"UNHANDLED"
+};
+const char *amdvi_mmio_high[] = {
+"AMDVI_MMIO_COMMAND_HEAD",
+"AMDVI_MMIO_COMMAND_TAIL",
+"AMDVI_MMIO_EVTLOG_HEAD",
+"AMDVI_MMIO_EVTLOG_TAIL",
+"AMDVI_MMIO_STATUS",
+"AMDVI_MMIO_PPR_HEAD",
+"AMDVI_MMIO_PPR_TAIL",
+"UNHANDLED"
+};
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's address translation region  */
+MemoryRegion iommu_ir;  /* Device's interrupt remapping region  */
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint16_t domid; /* assigned domain id  */
+uint16_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+/* serialize IOMMU command processing */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t type:4;   /* command type   */
+uint64_t reserved:8;
+uint64_t store_addr:49;/* addr to write  */
+uint64_t completion_flush:1;   /* allow more executions  */
+uint64_t completion_int:1; /* set MMIOWAITINT*/
+uint64_t completion_store:1;   /* write data to address  */
+#else
+uint64_t completion_store:1;
+uint64_t completion_int:1;
+uint64_t completion_flush:1;
+uint64_t store_addr:49;
+uint64_t reserved:8;
+uint64_t type:4;
+#endif /* HOST_WORDS_BIGENDIAN */
+uint64_t store_data;   /* data to write  */
+} CMDCompletionWait;
+
+/* invalidate internal caches for devid */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t devid:16; /* device to invalidate   */
+uint64_t reserved_1:44;
+uint64_t type:4;  
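
The little-endian branch of CMDCompletionWait above lists its fields from bit 0
upwards: three one-bit flags, a 49-bit store address field, 8 reserved bits and
a 4-bit type, 64 bits in total. Assuming that least-significant-first layout,
the sketch below pulls the same fields out of the first quadword with shifts
and masks. `struct completion_wait` and `completion_wait_decode` are
hypothetical names, and the input is expected to be in host byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the first quadword of the command. */
struct completion_wait {
    uint8_t  type;             /* bits 63:60 */
    uint64_t store_addr;       /* bits 51:3, kept as the raw field value */
    uint8_t  completion_flush; /* bit 2 */
    uint8_t  completion_int;   /* bit 1 */
    uint8_t  completion_store; /* bit 0 */
};

static void completion_wait_decode(uint64_t raw, struct completion_wait *cmd)
{
    assert(cmd != NULL);
    cmd->completion_store = raw & 1;
    cmd->completion_int   = (raw >> 1) & 1;
    cmd->completion_flush = (raw >> 2) & 1;
    cmd->store_addr       = (raw >> 3) & ((1ULL << 49) - 1);
    cmd->type             = (raw >> 60) & 0xf;
}
```

Since the field starts at bit 3, shifting `store_addr` left by three yields an
8-byte aligned address, under the assumption that the field holds address bits
51:3.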

[Qemu-devel] [V16 1/4] hw/pci: Prepare for AMD IOMMU

2016-08-13 Thread David Kiarie
Introduce PCI macros for use by AMD IOMMU

Signed-off-by: David Kiarie 
---
 include/hw/pci/pci.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 929ec2f..5ff92de 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -11,11 +11,13 @@
 #include "hw/pci/pcie.h"
 
 /* PCI bus */
-
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
+#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
+#define PCI_BUS_MAX 256
+#define PCI_DEVFN_MAX   256
 #define PCI_SLOT_MAX32
 #define PCI_FUNC_MAX8
 
-- 
2.1.4
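
To show how the macros touched by this patch fit together, here is a short
round-trip sketch. The macro bodies are copied from pci.h as it reads with the
hunk applied; `make_bdf` is a hypothetical helper added only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as they read in include/hw/pci/pci.h with the hunk applied. */
#define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_BUS_NUM(x)          (((x) >> 8) & 0xff)
#define PCI_SLOT(devfn)         (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)         ((devfn) & 0x07)
#define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))

/* Hypothetical helper: pack bus/slot/function into a 16-bit requester ID. */
static uint16_t make_bdf(uint8_t bus, uint8_t slot, uint8_t func)
{
    assert(slot < 32 && func < 8);
    return (uint16_t)PCI_BUILD_BDF(bus, PCI_DEVFN(slot, func));
}
```

For instance, bus 0, slot 0x14, function 0 packs to 0x00a0, and PCI_BUS_NUM(),
PCI_SLOT() and PCI_FUNC() recover the three components again.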




[Qemu-devel] [V16 2/4] hw/i386/trace-events: Add AMD IOMMU trace events

2016-08-13 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 hw/i386/trace-events | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index b4882c1..592de3a 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -13,3 +13,32 @@ mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
 
 # hw/i386/x86-iommu.c
 x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
+
+# hw/i386/amd_iommu.c
+amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64 " +  offset 0x%"PRIx32
+amdvi_cache_update(uint16_t domid, uint32_t bus, uint32_t slot, uint32_t func, uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_completion_wait_fail(uint64_t addr) "error: fail to write at address 0x%"PRIx64
+amdvi_mmio_write(const char *reg, uint64_t addr, unsigned size, uint64_t val, uint64_t offset) "%s write addr 0x%"PRIx64 ", size %d, val 0x%"PRIx64 ", offset 0x%"PRIx64
+amdvi_mmio_read(const char *reg, uint64_t addr, unsigned size, uint64_t offset) "%s read addr 0x%"PRIx64", size %d offset 0x%"PRIx64
+amdvi_command_error(uint64_t status) "error: Executing commands with command buffer disabled 0x%"PRIx64
+amdvi_command_read_fail(uint64_t addr, uint32_t head) "error: fail to access memory at 0x%"PRIx64" + 0x%"PRIx32
+amdvi_command_exec(uint32_t head, uint32_t tail, uint64_t buf) "command buffer head at 0x%"PRIx32 " command buffer tail at 0x%"PRIx32" command buffer base at 0x%" PRIx64
+amdvi_unhandled_command(uint8_t type) "unhandled command %d"
+amdvi_intr_inval(void) "Interrupt table invalidated"
+amdvi_iotlb_inval(void) "IOTLB pages invalidated"
+amdvi_prefetch_pages(void) "Pre-fetch of AMD-Vi pages requested"
+amdvi_pages_inval(uint16_t domid) "AMD-Vi pages for domain 0x%"PRIx16 " invalidated"
+amdvi_all_inval(void) "Invalidation of all AMD-Vi cache requested "
+amdvi_ppr_exec(void) "Execution of PPR queue requested "
+amdvi_devtab_inval(uint16_t bus, uint16_t slot, uint16_t func) "device table entry for devid: %02x:%02x.%x invalidated"
+amdvi_completion_wait(uint64_t addr, uint64_t data) "completion wait requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_control_status(uint64_t val) "MMIO_STATUS state 0x%"PRIx64
+amdvi_iotlb_reset(void) "IOTLB exceed size limit - reset "
+amdvi_completion_wait_exec(uint64_t addr, uint64_t data) "completion wait requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_dte_get_fail(uint64_t addr, uint32_t offset) "error: failed to access Device Entry devtab 0x%"PRIx64" offset 0x%"PRIx32
+amdvi_invalid_dte(uint64_t addr) "PTE entry at 0x%"PRIx64" is invalid "
+amdvi_get_pte_hwerror(uint64_t addr) "hardware error accessing PTE at addr 0x%"PRIx64
+amdvi_mode_invalid(unsigned level, uint64_t addr) "error: translation level 0x%x translating addr 0x%"PRIx64
+amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical address 0x%"PRIx64
+amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
-- 
2.1.4




[Qemu-devel] [V16 0/4] AMD IOMMU

2016-08-13 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu. 

Changes since v15
   -Endian-ness issue fix
   -cleaned up unused macros
   -removed guest frame number(gfn) from cache entry

Changes since v14
   -MMIO register reading/write bug fix [Peter]
   -Endian-ness issue fix[Peter]
   -Bitfields layouts in IOMMU commands fix[Peter]
   -IVRS changed IVHD device entry from type 3 to 1 to save a few bytes
   -coding style issues, comment grammar and other miscellaneous fixes.

Changes since v13
   -Added an error to make AMD IOMMU incompatible with device assignment.[Alex]
   -Converted AMD IOMMU into a composite PCI and System Bus device. This helps 
with:
  -We can now inherit from X86 IOMMU base class(which is implemented as a 
System Bus device).
  -We can now reserve MMIO region for IOMMU without a BAR register and 
without a hack.

Changes since v12

   -Coding style fixes [Jan, Michael]
   -Error logging fix to avoid using a macro[Jan]
   -moved some PCI macros to PCI header[Jan]
   -Use a lookup table for MMIO register names when tracing[Jan]

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on 
Marcel's patches).
   -IOMMU commands are represented using bitfields which is less error prone 
and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages including some obscure AMD IOMMU feature that allows 
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has 
BusMaster- and is therefore not able to generate interrupts like any other PCI 
device. We have resorted to writing directly to the system address, but this 
could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the 
implementation file since almost everything there is a macro anyway and I 
reckoned renaming them should suffice.
   -taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is 
configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been 
fixed[Marcel]
   
You can test[1] these patches by starting QEMU with the parameters 
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
emulating whatever devices you want.

Booting Linux without any IOMMU-related command line parameters should be 
enough to test these patches, since the devices are basically passed through, 
but to the 'host' (L1 guest). You can still go ahead and pass the command line 
parameters 'iommu=pt iommu=1' and try to pass a device to an L2 guest. This 
can also be done without passing any IOMMU-related parameters to the kernel. 

David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  hw/i386/trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c |2 +-
 hw/i386/Makefile.objs   |1 +
 hw/i386/acpi-build.c|   76 ++-
 hw/i386/amd_iommu.c | 1392 +++
 hw/i386/amd_iommu.h |  289 +
 hw/i386/intel_iommu.c   |1 +
 hw/i386/trace-events|   29 +
 hw/i386/x86-iommu.c |6 +
 include/hw/acpi/aml-build.h |1 +
 include/hw/i386/x86-iommu.h |   12 +
 include/hw/pci/pci.h|4 +-
 11 files changed, 1802 insertions(+), 11 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




Re: [Qemu-devel] [V1 2/4] hw/iommu: AMD IOMMU interrupt remapping

2016-08-12 Thread David Kiarie
On Fri, Aug 12, 2016 at 11:08 PM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> On 11.08.2016 00:42, David Kiarie wrote:
>
>> Introduce AMD IOMMU interrupt remapping and hook it onto
>
>
>> +static inline int amdvi_ir_pass(MSIMessage *src, MSIMessage *dst,
>> uint8_t bit,
>> +uint64_t dte)
>>
> The name is misleading. Actually, this function handles non-vectored
> interrupts (either passes or target aborts them). Maybe call it
> amdvi_ir_handle_non_vectored() ?
>
> +{
>> +if ((dte & (1UL << bit))) {
>> +/* passing interrupt enabled */
>> +dst->address = src->address;
>> +dst->data = src->data;
>> +} else {
>> +/* should be target aborted */
>> +return -AMDVI_TARGET_ABORT;
>> +}
>> +return 0;
>> +}
>> +
>> +static int amdvi_remap_ir_intctl(uint64_t dte, uint32_t irte,
>> + MSIMessage *src, MSIMessage *dst)
>> +{
>> +int ret = 0;
>> +
>> +switch ((dte >> AMDVI_DTE_INTCTL ) & 3UL) {
>>
> AMDVI_DTE_INTCTL_SHIFT? Yes, I should have mentioned it in a previous
> patch, sorry. Maybe also introduce macros for 3UL and 1, 2, 3 in switch
> branches below.
>
> +case 1:
>> +/* pass */
>> +memcpy(dst, src, sizeof(*dst));
>> +break;
>> +case 2:
>> +/* remap */
>> +if (irte & AMDVI_IRTE_REMAP_MASK) {
>> +/* LOCAL APIC address */
>> +dst->address = AMDVI_LOCAL_APIC_ADDR;
>> +/* destination mode */
>> +dst->address |= (((irte & AMDVI_IRTE_DM_MASK) >> 6) <<
>> +AMDVI_MSI_ADDR_DM_RSHIFT);
>> +/* RH */
>> +dst->address |= ((irte & AMDVI_IRTE_RQEOI_MASK) >> 5) <<
>> +AMDVI_MSI_ADDR_RH_RSHIFT;
>> +/* Destination ID */
>> +dst->address |= ((irte & AMDVI_IRTE_DEST_MASK) >> 8) <<
>> +AMDVI_MSI_ADDR_DEST_RSHIFT;
>> +/* construct data - vector */
>> +dst->data |= (irte & AMDVI_IRTE_VECTOR_MASK) >> 16;
>> +/* Interrupt type */
>> +dst->data |= ((irte & AMDVI_IRTE_INTTYPE_MASK) >> 2) <<
>> +AMDVI_MSI_DATA_DM_RSHIFT;
>>
> These bit operations look scary. Did you consider using bitfields or
> wrapping them in macros?


Will look at introducing macros to cover this comment and the previous one.
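
One way such macros could look is sketched below. The masks are copied from
the posted patch; the helper amdvi_irte_field() and the AMDVI_IRTE_* accessor
names are hypothetical, made up for illustration and not taken from any posted
revision. The helper shifts a masked field down to bit 0 by dividing by the
lowest set bit of the mask, so callers no longer repeat the shift counts.

```c
#include <stdint.h>

/* IRTE field masks as posted in the patch */
#define AMDVI_IRTE_VECTOR_MASK   (0xffU << 16)
#define AMDVI_IRTE_DEST_MASK     (0xffU << 8)
#define AMDVI_IRTE_DM_MASK       (0x1U << 6)
#define AMDVI_IRTE_RQEOI_MASK    (0x1U << 5)
#define AMDVI_IRTE_INTTYPE_MASK  (0x7U << 2)
#define AMDVI_IRTE_REMAP_MASK    (0x1U << 0)

/*
 * Hypothetical helper: return the field selected by `mask`, shifted down so
 * that its least significant bit lands at bit 0. `mask` must be non-zero.
 * (mask & (~mask + 1U)) isolates the lowest set bit of the mask, which is a
 * power of two, so the division is equivalent to a right shift.
 */
static inline uint32_t amdvi_irte_field(uint32_t irte, uint32_t mask)
{
    return (irte & mask) / (mask & (~mask + 1U));
}

/* Hypothetical accessors, one per IRTE field */
#define AMDVI_IRTE_VECTOR(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_VECTOR_MASK)
#define AMDVI_IRTE_DEST(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_DEST_MASK)
#define AMDVI_IRTE_DM(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_DM_MASK)
#define AMDVI_IRTE_RQEOI(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_RQEOI_MASK)
#define AMDVI_IRTE_INTTYPE(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_INTTYPE_MASK)
#define AMDVI_IRTE_REMAP(irte) \
    amdvi_irte_field((irte), AMDVI_IRTE_REMAP_MASK)
```

With accessors like these, the remap branch could read along the lines of
dst->data |= AMDVI_IRTE_VECTOR(irte). In-tree code could probably use the
existing extract32() helper from qemu/bitops.h instead of a private one.
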


>
> +} else  {
>> +ret = -AMDVI_TARGET_ABORT;
>> +}
>> +break;
>> +case 0:
>> +case 3:
>>
> In fact, you should report this as event when IR == 1.
>
> +default:
>> +ret = -AMDVI_TARGET_ABORT;
>> +}
>> +return ret;
>> +}
>> +/*
>> + * We don't support guest virtual APIC so IRTE size will most likely
>> always be 4
>> + */
>> +static int amdvi_irte_get(AMDVIState *s, MSIMessage *src, uint32_t *irte,
>> +  uint64_t *dte, uint16_t devid)
>> +{
>> +uint64_t irte_root, offset = devid * AMDVI_DEVTAB_ENTRY_SIZE,
>> + irte_size = AMDVI_DEFAULT_IRTE_SIZE,
>> + ir_table_size;
>> +
>> +/* check for GASup and if it's enabled */
>> +if ((amdvi_readq(s, AMDVI_EXT_FEATURES) & AMDVI_GASUP)
>> +&& (amdvi_readq(s, AMDVI_MMIO_CONTROL) & AMDVI_GAEN)) {
>> +/* set a different IRTE size */
>> +irte_size = AMDVI_IRTE_SIZE_GASUP;
>> +}
>>
> As I said, this is likely the only place where we account for Virtual
> APIC. You don't seem to handle Virtual APIC Root in DTE, for instance.
> Maybe drop this incomplete support altogether, and print some warning here
> instead?


I'll drop everything and print a warning instead.


>
>
> +if (dma_memory_read(&address_space_memory, s->devtab + offset, dte,
>> +AMDVI_DEVTAB_ENTRY_SIZE)) {
>> +trace_amdvi_dte_get_fail(s->devtab, offset);
>> +return -AMDVI_DEV_TAB_HW;
>> +}
>> +
>> +irte_root = dte[2] & AMDVI_IRTEROOT_MASK;
>> +offset = (src->data & AMDVI_IRTE_INDEX_MASK) << 2;
>> +ir_table_size = pow(2, dte[2] & AMDVI_IR_TABLE_SIZE_MASK);
>>
> 1 << dte[2] & AMDVI_IR_TABLE_SIZE_MASK ?
>
>
> +/* enforce IR table size */
>> +if 
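
A note on the shift suggested above in place of pow(): in C, << binds tighter
than &, so "1 << dte[2] & MASK" parses as "(1 << dte[2]) & MASK" and the mask
needs its own parentheses. A minimal sketch of the integer-only form follows.
Both function names are hypothetical, the exponent is assumed to have been
extracted from the DTE already, and the strict "<" comparison is this sketch's
choice rather than what the posted patch does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of entries in a table whose length is encoded
 * as a power-of-two exponent. The caller must keep `exponent` below 64.
 */
static inline uint64_t ir_table_entries(unsigned exponent)
{
    return UINT64_C(1) << exponent;
}

/*
 * Hypothetical bounds check: returns 1 when `offset` (in bytes) falls inside
 * a table of 2^exponent entries of `irte_size` bytes each, 0 otherwise.
 */
static inline int irte_offset_in_range(uint64_t offset, unsigned exponent,
                                       uint64_t irte_size)
{
    return offset < ir_table_entries(exponent) * irte_size;
}
```

This avoids the round trip through floating point that pow() implies and
keeps the whole computation in 64-bit integers.
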

Re: [Qemu-devel] [V1 3/4] hw/acpi: report IOAPIC on IVRS

2016-08-12 Thread David Kiarie
On Fri, Aug 12, 2016 at 11:14 PM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> On 11.08.2016 00:42, David Kiarie wrote:
>
>> Report IOAPIC via IVRS which effectively allows linux AMD-Vi
>> driver to enable interrupt remapping
>>
>> Signed-off-by: David Kiarie 
>> ---
>>  hw/i386/acpi-build.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index 49bd183..da602c3 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -2615,6 +2615,8 @@ build_amd_iommu(GArray *table_data, BIOSLinker
>> *linker)
>>   *   Refer to Spec - Table 95:IVHD Device Entry Type Codes(4-byte)
>>   */
>>  build_append_int_noprefix(table_data, 0x001, 4);
>> +/* IOAPIC represented as an 8-byte entry. Spec v2.62 Tables 97 */
>> +build_append_int_noprefix(table_data, 0x0100a000ff48, 8);
>>
> Nit: bit 3 in DTE Setting is reserved (Table 97 in the spec), while you
> set them all with 0xff.


Noted, thanks!


>
>
>
>>  build_header(linker, table_data, (void *)(table_data->data +
>> iommu_start),
>>   "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
>>
>>
> Valentine
>
>


Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-12 Thread David Kiarie
On Fri, Aug 12, 2016 at 10:10 PM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> Hi David,
>
> On 02.08.2016 13:39, David Kiarie wrote:
>
>> Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
>> The IOMMU does basic translation, error checking and has a
>> minimal IOTLB implementation. This IOMMU bypassed the need
>> for target aborts by responding with IOMMU_NONE access rights
>> and exempts the region 0xfee0-0xfeef from translation
>> as it is the q35 interrupt region.
>>
>> We advertise features that are not yet implemented to please
>> the Linux IOMMU driver.
>>
>> IOTLB aims at implementing commands on real IOMMUs which is
>> essential for debugging and may not offer any performance
>> benefits
>>
>> Signed-off-by: David Kiarie 
>> ---
>> +
>> +/* IRTE size with GASup enabled */
>> +#define AMDVI_IRTE_SIZE_GASUP   0x10
>> +
>> +#define AMDVI_IRTE_VECTOR_MASK(0xffU << 16)
>> +#define AMDVI_IRTE_DEST_MASK  (0xffU << 8)
>> +#define AMDVI_IRTE_DM_MASK(0x1U << 6)
>> +#define AMDVI_IRTE_RQEOI_MASK (0x1U << 5)
>> +#define AMDVI_IRTE_INTTYPE_MASK   (0x7U << 2)
>> +#define AMDVI_IRTE_SUPIOPF_MASK   (0x1U << 1)
>> +#define AMDVI_IRTE_REMAP_MASK (0x1U << 0)
>> +
>> +#define AMDVI_IR_TABLE_SIZE_MASK 0xfe
>> +
>> +/* offsets into MSI data */
>> +#define AMDVI_MSI_DATA_DM_RSHIFT   0x8
>> +#define AMDVI_MSI_DATA_LEVEL_RSHIFT0xe
>> +#define AMDVI_MSI_DATA_TRM_RSHIFT  0xf
>> +
>> +/* offsets into MSI address */
>> +#define AMDVI_MSI_ADDR_DM_RSHIFT   0x2
>> +#define AMDVI_MSI_ADDR_RH_RSHIFT   0x3
>> +#define AMDVI_MSI_ADDR_DEST_RSHIFT 0xc
>> +
>> +#define AMDVI_LOCAL_APIC_ADDR 0xfee0
>> +
>> +/* extended feature support */
>> +#define AMDVI_EXT_FEATURES (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR |
>> \
>> +AMDVI_FEATURE_IA | AMDVI_FEATURE_GT | AMDVI_FEATURE_GA | \
>>
>

> Came across this when reviewing your IR series.
> Do you really support Guest Translation in your code? I'm also not sure if
> QEMU emulates Virtual APIC. So I'd skip the last two bits.


You mean GT and GA? I could do a bit more research about GA, but as I
mentioned in the commit message, the Linux AMD-Vi driver (which is the
primary target) checks for some of these features when deciding the version
of this IOMMU; otherwise it defaults to IOMMU version 1.


>
> Valentine
>
>
> +AMDVI_FEATURE_HE | AMDVI_GATS_MODE | AMDVI_HATS_MODE)
>> +
>> +/* capabilities header */
>> +#define AMDVI_CAPAB_FEATURES (AMDVI_CAPAB_FLAT_EXT | \
>> +AMDVI_CAPAB_FLAG_NPCACHE | AMDVI_CAPAB_FLAG_IOTLBSUP \
>> +| AMDVI_CAPAB_ID_SEC | AMDVI_CAPAB_INIT_TYPE | \
>> +AMDVI_CAPAB_FLAG_HTTUNNEL |  AMDVI_CAPAB_EFR_SUP)
>> +
>> +/* AMDVI default address */
>> +#define AMDVI_BASE_ADDR 0xfed8
>>
>
>


Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-11 Thread David Kiarie
On Thu, Aug 11, 2016 at 11:23 AM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> Hi,
>
>
> On 02.08.2016 13:39, David Kiarie wrote:
>
>> +static void amdvi_writeq_raw(AMDVIState *s, uint64_t val, hwaddr addr)
>> +{+
>> +static void amdvi_generate_msi_interrupt(AMDVIState *s)
>> +{
>> +MSIMessage msg;
>> +if (msi_enabled(&s->pci.dev)) {
>> +msg = msi_get_message(&s->pci.dev, 0);
>> +address_space_stl_le(&address_space_memory, msg.address,
>> msg.data,
>> + MEMTXATTRS_UNSPECIFIED, NULL);
>>
> Nit: don't you want to set the requester ID to the IOMMU's BDF here?


We could, though I overlooked that because IOMMU interrupt requests are not
processed by the IOMMU.


>
>
> Valentine
>
>>
>>


Re: [Qemu-devel] [V1 0/4] AMD-Vi Interrupt Remapping

2016-08-11 Thread David Kiarie
On Thu, Aug 11, 2016 at 9:39 AM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> Hi David,
>
> On 11.08.2016 00:42, David Kiarie wrote:
>
>> Hello all,
>>
>> The following patchset adds AMD-Vi interrupt remapping logic
>> to Qemu and hooks it onto existing interrupt remapping infrastructure.It
>> has
>> a dependency on "Explicit SID for IOAPIC" patchset though.
>>
>> I would appreciate your feedback!
>>
> Nice work! I'll look into it deeper in a few days.
>
> In the meantime, I'm not able to apply this on top of your V15 IOMMU
> series applied to the current master. Am I missing anything?


I think it got stale already. You can test from here though
https://github.com/aslaq/qemu v15


>
> Valentine
>
>
>
>> For quick testing https://github.com/aslaq/qemu IR
>>
>> David Kiarie (4):
>>   hw/iommu: Prepare for AMD IOMMU interrupt remapping
>>   hw/iommu: AMD IOMMU interrupt remapping
>>   hw/acpi: report IOAPIC on IVRS
>>   hw/iommu: share common between IOMMUs
>>
>>  hw/i386/acpi-build.c  |   2 +
>>  hw/i386/amd_iommu.c   | 226 ++
>> +++-
>>  hw/i386/amd_iommu.h   |  74 +
>>  hw/i386/intel_iommu.c |   9 --
>>  hw/i386/trace-events  |   7 ++
>>  hw/i386/x86-iommu.c   |   8 ++
>>  6 files changed, 316 insertions(+), 10 deletions(-)
>>
>>


[Qemu-devel] [V1 2/4] hw/iommu: AMD IOMMU interrupt remapping

2016-08-10 Thread David Kiarie
Introduce AMD IOMMU interrupt remapping and hook it onto
the existing interrupt remapping infrastructure

Signed-off-by: David Kiarie 
---
 hw/i386/amd_iommu.c | 226 +++-
 hw/i386/amd_iommu.h |   2 +
 2 files changed, 227 insertions(+), 1 deletion(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 5fab9aa..825159b 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -18,12 +18,14 @@
  * with this program; if not, see <http://www.gnu.org/licenses/>.
  *
  * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
  */
 #include "qemu/osdep.h"
 #include 
 #include "hw/pci/msi.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/amd_iommu.h"
+#include "hw/i386/ioapic_internal.h"
 #include "hw/pci/pci_bus.h"
 #include "trace.h"
 
@@ -660,6 +662,11 @@ static void amdvi_inval_inttable(AMDVIState *s, 
CMDInvalIntrTable *inval)
 amdvi_log_illegalcom_error(s, inval->type, s->cmdbuf + s->cmdbuf_head);
 return;
 }
+
+if (s->ir_cache) {
+x86_iommu_iec_notify_all(X86_IOMMU_DEVICE(s), true, 0, 0);
+}
+
 trace_amdvi_intr_inval();
 }
 
@@ -1221,6 +1228,207 @@ static IOMMUTLBEntry amdvi_translate(MemoryRegion 
*iommu, hwaddr addr,
 return ret;
 }
 
+static inline int amdvi_ir_pass(MSIMessage *src, MSIMessage *dst, uint8_t bit,
+uint64_t dte)
+{
+if ((dte & (1UL << bit))) {
+/* passing interrupt enabled */
+dst->address = src->address;
+dst->data = src->data;
+} else {
+/* should be target aborted */
+return -AMDVI_TARGET_ABORT;
+}
+return 0;
+}
+
+static int amdvi_remap_ir_intctl(uint64_t dte, uint32_t irte,
+ MSIMessage *src, MSIMessage *dst)
+{
+int ret = 0;
+
+switch ((dte >> AMDVI_DTE_INTCTL ) & 3UL) {
+case 1:
+/* pass */
+memcpy(dst, src, sizeof(*dst));
+break;
+case 2:
+/* remap */
+if (irte & AMDVI_IRTE_REMAP_MASK) {
+/* LOCAL APIC address */
+dst->address = AMDVI_LOCAL_APIC_ADDR;
+/* destination mode */
+dst->address |= (((irte & AMDVI_IRTE_DM_MASK) >> 6) <<
+AMDVI_MSI_ADDR_DM_RSHIFT);
+/* RH */
+dst->address |= ((irte & AMDVI_IRTE_RQEOI_MASK) >> 5) <<
+AMDVI_MSI_ADDR_RH_RSHIFT;
+/* Destination ID */
+dst->address |= ((irte & AMDVI_IRTE_DEST_MASK) >> 8) <<
+AMDVI_MSI_ADDR_DEST_RSHIFT;
+/* construct data - vector */
+dst->data |= (irte & AMDVI_IRTE_VECTOR_MASK) >> 16;
+/* Interrupt type */
+dst->data |= ((irte & AMDVI_IRTE_INTTYPE_MASK) >> 2) <<
+AMDVI_MSI_DATA_DM_RSHIFT;
+} else  {
+ret = -AMDVI_TARGET_ABORT;
+}
+break;
+case 0:
+case 3:
+default:
+ret = -AMDVI_TARGET_ABORT;
+}
+return ret;
+}
+/*
+ * We don't support guest virtual APIC so IRTE size will most likely always be 
4
+ */
+static int amdvi_irte_get(AMDVIState *s, MSIMessage *src, uint32_t *irte,
+  uint64_t *dte, uint16_t devid)
+{
+uint64_t irte_root, offset = devid * AMDVI_DEVTAB_ENTRY_SIZE,
+ irte_size = AMDVI_DEFAULT_IRTE_SIZE,
+ ir_table_size;
+
+/* check for GASup and if it's enabled */
+if ((amdvi_readq(s, AMDVI_EXT_FEATURES) & AMDVI_GASUP)
+&& (amdvi_readq(s, AMDVI_MMIO_CONTROL) & AMDVI_GAEN)) {
+/* set a different IRTE size */
+irte_size = AMDVI_IRTE_SIZE_GASUP;
+}
+if (dma_memory_read(&address_space_memory, s->devtab + offset, dte,
+AMDVI_DEVTAB_ENTRY_SIZE)) {
+trace_amdvi_dte_get_fail(s->devtab, offset);
+return -AMDVI_DEV_TAB_HW;
+}
+
+irte_root = dte[2] & AMDVI_IRTEROOT_MASK;
+offset = (src->data & AMDVI_IRTE_INDEX_MASK) << 2;
+ir_table_size = pow(2, dte[2] & AMDVI_IR_TABLE_SIZE_MASK);
+/* enforce IR table size */
+if (offset > (ir_table_size * irte_size)) {
+trace_amdvi_invalid_irte_entry(offset, ir_table_size);
+return -AMDVI_TARGET_ABORT;
+}
+/* read IRTE */
+if (dma_memory_read(&address_space_memory, irte_root + offset,
+irte, sizeof(*irte))) {
+trace_amdvi_irte_get_fail(irte_root, offset);
+return -AMDVI_DEV_TAB_HW;
+}
+return 0;
+}
+
+static int amdvi_int_remap(X86IOMMUState *iommu, MSIMessage *src,
+   MSIMessage *dst, uint16_t sid)
+{
+trace_amdvi_ir_request(src->data, src->address, sid);
+
+AMDVIState *s = AMD_IOMMU_DEVICE(iommu);
+in

[Qemu-devel] [V1 4/4] hw/iommu: share common between IOMMUs

2016-08-10 Thread David Kiarie
Enabling interrupt remapping with kernel_irqchip=on should
result in error for both VT-d and AMD-Vi

Signed-off-by: David Kiarie 
---
 hw/i386/intel_iommu.c | 9 -
 hw/i386/x86-iommu.c   | 8 
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 7c80907..857f9f1 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -30,7 +30,6 @@
 #include "hw/boards.h"
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
-#include "sysemu/kvm.h"
 
 #define DEBUG_INTEL_IOMMU
 #ifdef DEBUG_INTEL_IOMMU
@@ -2472,14 +2471,6 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 Q35_PSEUDO_DEVFN_IOAPIC;
 /* Pseudo address space under root PCI bus. */
 pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
-
-/* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
-if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
-!kvm_irqchip_is_split()) {
-error_report("Intel Interrupt Remapping cannot work with "
- "kernel-irqchip=on, please use 'split|off'.");
-exit(1);
-}
 }
 
 static void vtd_class_init(ObjectClass *klass, void *data)
diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
index 2278af7..66510f7 100644
--- a/hw/i386/x86-iommu.c
+++ b/hw/i386/x86-iommu.c
@@ -21,6 +21,7 @@
 #include "hw/sysbus.h"
 #include "hw/boards.h"
 #include "hw/i386/x86-iommu.h"
+#include "sysemu/kvm.h"
 #include "qemu/error-report.h"
 #include "trace.h"
 
@@ -84,6 +85,13 @@ static void x86_iommu_realize(DeviceState *dev, Error **errp)
 if (x86_class->realize) {
 x86_class->realize(dev, errp);
 }
+/* Currently IOMMU IR only support "kernel-irqchip={off|split}" */
+if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
+!kvm_irqchip_is_split()) {
+error_report("Interrupt Remapping cannot work with "
+ "kernel-irqchip=on, please use 'split|off'.");
+exit(1);
+}
 
 x86_iommu_set_default(X86_IOMMU_DEVICE(dev));
 }
-- 
2.1.4




Re: [Qemu-devel] [V1 0/4] AMD-Vi Interrupt Remapping

2016-08-10 Thread David Kiarie
On Wed, Aug 10, 2016 at 10:42 PM, David Kiarie 
wrote:

Sorry, forgot to cc Peter.

Hello all,
>
> The following patchset adds AMD-Vi interrupt remapping logic
> to Qemu and hooks it onto existing interrupt remapping infrastructure.It
> has
> a dependency on "Explicit SID for IOAPIC" patchset though.
>
> I would appreciate your feedback!
>
> For quick testing https://github.com/aslaq/qemu IR
>
> David Kiarie (4):
>   hw/iommu: Prepare for AMD IOMMU interrupt remapping
>   hw/iommu: AMD IOMMU interrupt remapping
>   hw/acpi: report IOAPIC on IVRS
>   hw/iommu: share common between IOMMUs
>
>  hw/i386/acpi-build.c  |   2 +
>  hw/i386/amd_iommu.c   | 226 ++
> +++-
>  hw/i386/amd_iommu.h   |  74 +
>  hw/i386/intel_iommu.c |   9 --
>  hw/i386/trace-events  |   7 ++
>  hw/i386/x86-iommu.c   |   8 ++
>  6 files changed, 316 insertions(+), 10 deletions(-)
>
> --
> 2.1.4
>
>


[Qemu-devel] [V1 3/4] hw/acpi: report IOAPIC on IVRS

2016-08-10 Thread David Kiarie
Report IOAPIC via IVRS which effectively allows linux AMD-Vi
driver to enable interrupt remapping

Signed-off-by: David Kiarie 
---
 hw/i386/acpi-build.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 49bd183..da602c3 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2615,6 +2615,8 @@ build_amd_iommu(GArray *table_data, BIOSLinker *linker)
  *   Refer to Spec - Table 95:IVHD Device Entry Type Codes(4-byte)
  */
 build_append_int_noprefix(table_data, 0x001, 4);
+/* IOAPIC represented as an 8-byte entry. Spec v2.62 Tables 97 */
+build_append_int_noprefix(table_data, 0x0100a000ff48, 8);
 
 build_header(linker, table_data, (void *)(table_data->data + iommu_start),
  "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
-- 
2.1.4
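
For readers checking the entry bytes against the spec: as far as I understand
it, build_append_int_noprefix() emits the low-order bytes of the value, least
significant byte first, so the rightmost hex digits of the literal become the
first byte of the entry. A stand-alone sketch of that behaviour is below.
append_int_le() is a hypothetical stand-in written for illustration, not the
QEMU function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a little-endian integer append: writes the
 * `size` low-order bytes of `value` into `buf`, least significant byte
 * first, and returns the number of bytes written.
 */
static size_t append_int_le(uint8_t *buf, uint64_t value, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        buf[i] = (uint8_t)(value & 0xff);
        value >>= 8;
    }
    return size;
}
```

Under that assumption, the low byte of the 8-byte literal is the first byte of
the entry and the remaining fields follow in increasing byte order.
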




[Qemu-devel] [V1 1/4] hw/iommu: Prepare for AMD IOMMU interrupt remapping

2016-08-10 Thread David Kiarie
Introduce macros and trace events for use in AMD IOMMU
interrupt remapping

Signed-off-by: David Kiarie 
---
 hw/i386/amd_iommu.h  | 72 
 hw/i386/trace-events |  7 +
 2 files changed, 79 insertions(+)

diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 5d0b91d..2a7f19e 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -177,6 +177,78 @@
 #define AMDVI_IOTLB_MAX_SIZE 1024
 #define AMDVI_DEVID_SHIFT36
 
+/* interrupt types */
+#define AMDVI_MT_FIXED  0x0
+#define AMDVI_MT_ARBIT  0x1
+#define AMDVI_MT_SMI0x2
+#define AMDVI_MT_NMI0x3
+#define AMDVI_MT_INIT   0x4
+#define AMDVI_MT_EXTINT 0x6
+#define AMDVI_MT_LINT1  0xb
+#define AMDVI_MT_LINT0  0xe
+
+/* Ext reg, GA support */
+#define AMDVI_GASUP(1UL << 7)
+/* MMIO control GA enable bits */
+#define AMDVI_GAEN (1UL << 17)
+
+/* MSI interrupt type mask */
+#define AMDVI_IR_TYPE_MASK 0x300
+
+/* interrupt destination mode */
+#define AMDVI_IRDEST_MODE_MASK 0x2
+
+/* select MSI data 10:0 bits */
+#define AMDVI_IRTE_INDEX_MASK 0x7ff
+
+/* bits determining whether specific interrupts should be passed
+ * split DTE into 64-bit chunks
+ */
+#define AMDVI_DTE_INTPASS   56
+#define AMDVI_DTE_EINTPASS  57
+#define AMDVI_DTE_NMIPASS   58
+#define AMDVI_DTE_INTCTL60
+#define AMDVI_DTE_LINT0PASS 62
+#define AMDVI_DTE_LINT1PASS 63
+
+/* interrupt data valid */
+#define AMDVI_IR_VALID  (1UL << 0)
+
+/* interrupt root table mask */
+#define AMDVI_IRTEROOT_MASK 0xc0
+
+/* default IRTE size */
+#define AMDVI_DEFAULT_IRTE_SIZE 0x4
+
+/* IRTE size with GASup enabled */
+#define AMDVI_IRTE_SIZE_GASUP   0x10
+
+#define AMDVI_IRTE_VECTOR_MASK(0xffU << 16)
+#define AMDVI_IRTE_DEST_MASK  (0xffU << 8)
+#define AMDVI_IRTE_DM_MASK(0x1U << 6)
+#define AMDVI_IRTE_RQEOI_MASK (0x1U << 5)
+#define AMDVI_IRTE_INTTYPE_MASK   (0x7U << 2)
+#define AMDVI_IRTE_SUPIOPF_MASK   (0x1U << 1)
+#define AMDVI_IRTE_REMAP_MASK (0x1U << 0)
+
+#define AMDVI_IR_TABLE_SIZE_MASK 0xfe
+
+/* offsets into MSI data */
+#define AMDVI_MSI_DATA_DM_RSHIFT   0x8
+#define AMDVI_MSI_DATA_LEVEL_RSHIFT0xe
+#define AMDVI_MSI_DATA_TRM_RSHIFT  0xf
+
+/* offsets into MSI address */
+#define AMDVI_MSI_ADDR_DM_RSHIFT   0x2
+#define AMDVI_MSI_ADDR_RH_RSHIFT   0x3
+#define AMDVI_MSI_ADDR_DEST_RSHIFT 0xc
+
+#define AMDVI_BUS_NUM  0x0
+/* AMD-Vi specific IOAPIC Device function */
+#define AMDVI_DEVFN_IOAPIC 0xa0
+
+#define AMDVI_LOCAL_APIC_ADDR 0xfee0
+
 /* extended feature support */
 #define AMDVI_EXT_FEATURES (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR | \
 AMDVI_FEATURE_IA | AMDVI_FEATURE_GT | AMDVI_FEATURE_GA | \
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index 592de3a..5c12c10 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -42,3 +42,10 @@ amdvi_mode_invalid(unsigned level, uint64_t addr)"error: 
translation level 0x%"P
 amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical 
address 0x%"PRIx64
 amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, 
uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
 amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t 
addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_irte_get_fail(uint64_t addr, uint64_t offset) "couldn't access device 
table entry 0x%"PRIx64" + offset 0x%"PRIx64
+amdvi_invalid_irte_entry(uint16_t devid, uint64_t offset) "devid %x requested 
IRTE offset 0x%"PRIx64" Outside IR table range"
+amdvi_ir_request(uint32_t data, uint64_t addr, uint16_t sid) "IR request data 
0x%"PRIx32" address 0x%"PRIx64" SID %x"
+amdvi_ir_remap(uint32_t data, uint64_t addr, uint16_t sid) "IR remap data 
0x%"PRIx32" address 0x%"PRIx64" SID %x"
+amdvi_ir_target_abort(uint32_t data, uint64_t addr, uint16_t sid) "IR target 
abort data 0x%"PRIx32" address 0x%"PRIx64" SID %x"
+amdvi_ir_write_fail(uint64_t addr, uint32_t data) "fail to write to addr 
0x%"PRIx64 " value 0x%"PRIx32
+amdvi_ir_read_fail(uint64_t addr) " fail to read from addr 0x%"PRIx64
-- 
2.1.4




[Qemu-devel] [V1 0/4] AMD-Vi Interrupt Remapping

2016-08-10 Thread David Kiarie
Hello all,

The following patchset adds AMD-Vi interrupt remapping logic
to Qemu and hooks it onto the existing interrupt remapping infrastructure. It
has a dependency on the "Explicit SID for IOAPIC" patchset though.

I would appreciate your feedback!

For quick testing https://github.com/aslaq/qemu IR

David Kiarie (4):
  hw/iommu: Prepare for AMD IOMMU interrupt remapping
  hw/iommu: AMD IOMMU interrupt remapping
  hw/acpi: report IOAPIC on IVRS
  hw/iommu: share common between IOMMUs

 hw/i386/acpi-build.c  |   2 +
 hw/i386/amd_iommu.c   | 226 +-
 hw/i386/amd_iommu.h   |  74 +
 hw/i386/intel_iommu.c |   9 --
 hw/i386/trace-events  |   7 ++
 hw/i386/x86-iommu.c   |   8 ++
 6 files changed, 316 insertions(+), 10 deletions(-)

-- 
2.1.4




Re: [Qemu-devel] [V15 0/4] AMD IOMMU

2016-08-10 Thread David Kiarie
On Tue, Aug 9, 2016 at 11:27 PM, David Kiarie 
wrote:

> Hi all,
>
> This patchset adds basic AMD IOMMU emulation support to Qemu.
>
> Change since v14
>-MMIO register reading/write bug fix [Peter]
>-Endian-ness issue fix[Peter]
>-Bitfields layouts in IOMMU commands fix[Peter]
>

I seem to have left out a few of these.

   -IVRS changed IVHD device entry from type 3 to 1 to save a few bytes
>-coding style issues, comment grammer and other miscellaneous fixes.
>
> Changes since v13
>-Added an error to make AMD IOMMU incompatible with device
> assignment.[Alex]
>-Converted AMD IOMMU into a composite PCI and System Bus device. This
> helps with:
>   -We can now inherit from X86 IOMMU base class(which is implemented
> as a System Bus device).
>   -We can now reserve MMIO region for IOMMU without a BAR register and
> without a hack.
>
> Changes since v12
>
>-Coding style fixes [Jan, Michael]
>-Error logging fix to avoid using a macro[Jan]
>-moved some PCI macros to PCI header[Jan]
>-Use a lookup table for MMIO register names when tracing[Jan]
>
> Changes since V11
>-AMD IOMMU is not started with -device amd-iommu (with a dependency on
> Marcel's patches).
>-IOMMU commands are represented using bitfields which is less error
> prone and more readable[Peter]
>-Changed from debug fprintfs to tracing[Jan]
>
> Changes since V10
>
>-Support for huge pages including some obscure AMD IOMMU feature that
> allows default page size override[Jan].
>-Fixed an issue with generation of interrupts. We noted that AMD IOMMU
> has BusMaster- and is therefore not able to generate interrupts like any
> other PCI device. We have resulted in writing directly to system address
> but this could be fixed by some patches which have not been merged yet.
>
> Changes since v9
>
>-amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the
> macros
> and in the functions/code. The register macros have not been moved to
> the
> implementation file since almost the macros there are basically macros
> and I
> reckoned renaming them should suffice.
>-taken care of byte order in the use of 'dma_memory_read'[Michael]
>-Taken care of invalid DTE entries to ensure no DMA unless a device is
> configured to allow it.
>-An issue with the emulate IOMMU defaulting to AMD_IOMMU has been
> fixed[Marcel]
>
> You can test[1] this patches by starting with parameters
> qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu
> host -hda file.img -soundhw ac97
> emulating whatever devices you want.
>
> Not passing any command line parameters to linux should be enough to test
> this patches since the devices are basically
> passes-through but to the 'host' (l1 guest). You can still go ahead pass
> command line parameter 'iommu=pt iommu=1'
> and try to pass a device to L2 guest. This can also done without passing
> any iommu related parameters to the kernel.
>
> David Kiarie (4):
>   hw/pci: Prepare for AMD IOMMU
>   hw/i386/trace-events: Add AMD IOMMU trace events
>   hw/i386: Introduce AMD IOMMU
>   hw/i386: AMD IOMMU IVRS table
>
>  hw/acpi/aml-build.c |2 +-
>  hw/i386/Makefile.objs   |1 +
>  hw/i386/acpi-build.c|   76 ++-
>  hw/i386/amd_iommu.c | 1401 ++
> +
>  hw/i386/amd_iommu.h |  390 
>  hw/i386/intel_iommu.c   |1 +
>  hw/i386/trace-events|   36 ++
>  hw/i386/x86-iommu.c |6 +
>  include/hw/acpi/aml-build.h |1 +
>  include/hw/i386/x86-iommu.h |   12 +
>  include/hw/pci/pci.h|4 +-
>  11 files changed, 1919 insertions(+), 11 deletions(-)
>  create mode 100644 hw/i386/amd_iommu.c
>  create mode 100644 hw/i386/amd_iommu.h
>
> --
> 2.1.4
>
>


Re: [Qemu-devel] [RFC 1/2] hw/msi: Allow platform devices to use explicit SID

2016-08-09 Thread David Kiarie
On Wed, Aug 10, 2016 at 8:41 AM, Peter Xu  wrote:

> On Tue, Aug 09, 2016 at 05:32:16PM +0300, David Kiarie wrote:
> > When using IOMMU platform devices like IOAPIC are required to make
> > interrupt remapping requests using explicit SID.We affiliate an MSI
> > route with a requester ID and a PCI device if present which ensures
> > that platform devices can call IOMMU interrupt remapping code with
> > explicit SID while maintaining compatibility with the original code
> > which mainly dealt with PCI devices.
> >
> > Signed-off-by: David Kiarie 
>
> Hi,
>
> This idea is good to me overall, with some tiny comments below.
>
> [...]
>
> > -static void ioapic_service(IOAPICCommonState *s)
> > +static void ioapic_write_ioapic_as(IOAPICCommonState *s, uint32_t
> data, uint64_t addr)
>
> Rename to ioapic_as_write()?
>
> [...]
>
> > @@ -385,12 +393,23 @@ static void ioapic_machine_done_notify(Notifier
> *notifier, void *data)
> >
> >  if (kvm_irqchip_is_split()) {
> >  X86IOMMUState *iommu = x86_iommu_get_default();
> > +MSIMessage msg = {0, 0};
> > +int i;
> > +
> >  if (iommu) {
> >  /* Register this IOAPIC with IOMMU IEC notifier, so that
> >   * when there are IR invalidates, we can be notified to
> >   * update kernel IR cache. */
> > -x86_iommu_iec_register_notifier(iommu,
> ioapic_iec_notifier, s);
> > +s->devid = iommu->ioapic_bdf;
> > +/* update IOAPIC routes to the right SID */
> > +for (i = 0; i < IOAPIC_NUM_PINS; i++) {
> > +kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL,
> s->devid);
> > +}
> > +kvm_irqchip_commit_routes(kvm_state);
>
> Here, not sure whether it'll be better if we remove
> kvm_irqchip_add_msi_route() in kvm_arch_init_irq_routing() directly
> and call them here. So no extra update needed.
>

I thought about that too, but I was worried another device might reserve
routes before the IOAPIC does.

>
> >  }
> > +
> > +kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL,
> s->devid);
>
> What is this line used for?


This one shouldn't be here. It got left over.


>


> > +x86_iommu_iec_register_notifier(iommu, ioapic_iec_notifier, s);
> >  }
> >  #endif
> >  }
> > @@ -407,6 +426,7 @@ static void ioapic_realize(DeviceState *dev, Error
> **errp)
> >
> >  memory_region_init_io(&s->io_memory, OBJECT(s), &ioapic_io_ops, s,
> >"ioapic", 0x1000);
> > +s->devid = 0;
>
> Nit: We can remove this line.
>
> [...]
>
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 755f921..54e27fc 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -705,7 +705,8 @@ static int kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy
> *proxy,
> >  int ret;
> >
> >  if (irqfd->users == 0) {
> > -ret = kvm_irqchip_add_msi_route(kvm_state, vector,
> &proxy->pci_dev);
> > +ret = kvm_irqchip_add_msi_route(kvm_state, vector,
> &proxy->pci_dev,
> > +pci_requester_id(&proxy->pci_dev));
> >  if (ret < 0) {
> >  return ret;
> >  }
> > @@ -838,7 +839,8 @@ static int virtio_pci_vq_vector_unmask(VirtIOPCIProxy
> *proxy,
> >  irqfd = &proxy->vector_irqfd[vector];
> >  if (irqfd->msg.data != msg.data || irqfd->msg.address !=
> msg.address) {
> >  ret = kvm_irqchip_update_msi_route(kvm_state, irqfd->virq,
> msg,
> > -   &proxy->pci_dev);
> > +&proxy->pci_dev,
>
> Nit: Here you changed indentation, I would suggest keep it, as well in
> the next line.
>
> > +pci_requester_id(&proxy->pci_
> dev));
> >  if (ret < 0) {
> >  return ret;
> >  }
> > diff --git a/include/hw/i386/ioapic_internal.h b/include/hw/i386/ioapic_
> internal.h
> > index a11d86d..d68a24f 100644
> > --- a/include/hw/i386/ioapic_internal.h
> > +++ b/include/hw/i386/ioapic_internal.h
> > @@ -103,6 +103,7 @@ typedef struct IOAPICCommonClass {
> >  struct IOAPICCommonState {
> >  SysBusDevice busdev;
> >  MemoryRegion io_memory;
> > +uint16_t devid;
> >  uint8_t id

Re: [Qemu-devel] [RFC 2/2] hw/i386: enforce SID verification

2016-08-09 Thread David Kiarie
On Wed, Aug 10, 2016 at 8:49 AM, Peter Xu  wrote:

> On Tue, Aug 09, 2016 at 05:32:17PM +0300, David Kiarie wrote:
>
> [...]
>
> > @@ -2252,14 +2250,17 @@ static MemTxResult vtd_mem_ir_write(void
> *opaque, hwaddr addr,
> >  {
> >  int ret = 0;
> >  MSIMessage from = {}, to = {};
> > -uint16_t sid = X86_IOMMU_SID_INVALID;
> > +VTDAddressSpace *as = opaque;
> > +uint16_t sid = pci_bus_num(as->bus) << 8 | as->devfn;
>
> SID can be something not equals to BDF. E.g., when there are PCI
> bridges. See pci_requester_id(). However...
>
> >
> >  from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
> >  from.data = (uint32_t) value;
> >
> > -if (!attrs.unspecified) {
> > -/* We have explicit Source ID */
> > -sid = attrs.requester_id;
> > +if (attrs.requester_id != sid) {
> > +VTD_DPRINTF(GENERAL, "int remap request for sid 0x%04x"
> > +" requester_id 0x%04x couldn't be verified",
> > +sid, attrs.requester_id);
> > +return MEMTX_ERROR;
>
> ...I am not sure whether we need extra check here. In what case will
> attrs.requester_id != sid ?
>

Meaning I should remove this check?


>
> Though I agree to remove the original if().
>
> >  }
> >
> >  ret = vtd_interrupt_remap_msi(opaque, &from, &to, sid);
> > @@ -2325,7 +2326,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState
> *s, PCIBus *bus, int devfn)
> >  memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
> >   &s->iommu_ops, "intel_iommu",
> UINT64_MAX);
> >  memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
> > -  &vtd_mem_ir_ops, s, "intel_iommu_ir",
> > +  &vtd_mem_ir_ops, vtd_dev_as,
> "intel_iommu_ir",
> >VTD_INTERRUPT_ADDR_SIZE);
> >  memory_region_add_subregion(&vtd_dev_as->iommu,
> VTD_INTERRUPT_ADDR_FIRST,
> >  &vtd_dev_as->iommu_ir);
> > @@ -2465,6 +2466,9 @@ static void vtd_realize(DeviceState *dev, Error
> **errp)
> >  vtd_init(s);
> >  sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
> >  pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
> > +/* IOMMU expected IOAPIC SID */
> > +x86_iommu->ioapic_bdf = Q35_PSEUDO_DEVFN_IOAPIC << 8 |
> > +Q35_PSEUDO_DEVFN_IOAPIC;
>
> We can use PCI_BUILD_BDF() here.
>
> -- peterx
>
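
For reference, the bus/devfn packing that PCI_BUILD_BDF() and the related
macros perform can be sketched as below. This is only an illustration: the
example_* names are hypothetical and are not part of the patch, and, as noted
in the review, a requester ID is not always equal to the BDF (for example
behind PCI bridges).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins that mirror the usual PCI packing rules. */

/* devfn: 5-bit slot in bits 7:3, 3-bit function in bits 2:0 */
static uint8_t example_devfn(uint8_t slot, uint8_t func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* BDF: 8-bit bus number in bits 15:8, devfn in bits 7:0 */
static uint16_t example_build_bdf(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* Inverse helpers: split a BDF or devfn back into its parts. */
static uint8_t example_bus_num(uint16_t bdf)
{
    return (uint8_t)((bdf >> 8) & 0xff);
}

static uint8_t example_slot(uint8_t devfn)
{
    return (uint8_t)((devfn >> 3) & 0x1f);
}

static uint8_t example_func(uint8_t devfn)
{
    return (uint8_t)(devfn & 0x07);
}
```

With bus 0x3a, slot 0x1f and function 7, the devfn is 0xff and the packed
value is 0x3aff.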


Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
On Wed, Aug 10, 2016 at 5:08 AM, Peter Xu  wrote:

> On Tue, Aug 09, 2016 at 03:52:07PM +0300, David Kiarie wrote:
>
> [...]
>
> > > > +if (dma_memory_write(&address_space_memory, s->evtlog_len +
> > > s->evtlog_tail,
> > > > +&evt, AMDVI_EVENT_LEN)) {
> > >
> > > Check with MEMTX_OK?
> > >
> >
> > I'm not sure what exactly you mean here.
>
> I mean we have return code macros for these memory operations, like
> MEMTX_OK/MEMTX_ERROR/... However please feel free to ignore this
> comment since I see merely no place in current QEMU code that is doing
> the checking at all. Your call.
>
> >
> >
> > >
> > > [...]
> > >
> > > > +/*
> > > > + * AMDVi event structure
> > > > + *0:15   -> DeviceID
> > > > + *55:63  -> event type + miscellaneous info
> > > > + *64:127 -> related address
> > > > + */
> > > > +static void amdvi_encode_event(uint64_t *evt, uint16_t devid,
> uint64_t
> > > addr,
> > > > +   uint16_t info)
> > > > +{
> > > > +amdvi_setevent_bits(evt, devid, 0, 16);
> > > > +amdvi_setevent_bits(evt, info, 55, 8);
> > > > +amdvi_setevent_bits(evt, addr, 63, 64);
> > >   ^^
> > > should here be 64?
> > >
> > > Also, I am not sure whether we need this amdvi_setevent_bits() if it's
> > > only used in this function. Though not a big problem for me.
> > >
> >
> > It's only used in this function, but I actually wrote this mainly for
> > future use. The idea is that various events encode totally different
> > information, while the above is an over-simplified version that encodes
> > the information common to most events. If an event needs to encode more
> > information, this makes it much easier.
>
> Yes my above comment is "Nit" for sure. :) Please have it if you like.
>
> >
> >
> > >
> > > > +}
> > > > +/* log an error encountered page-walking
> > >
> > > "during page-walking"
> > >
> >
> > "encountered page-walking"  sounds right to me. "page-walking" is a verb,
> > in continuous tense, right ? how about I say "during hacking" ;-)
>
> I am not that good at English. I pointed that out since I "suspect"
> that is wrong (in case that would help). But if you are confident
> enough, please just ignore. I'm mostly ok with all comments as long as
> they are "understandable".
>

I changed that to "encountered during a page walk" - I'm sure no one has a
problem with that :-)


> >
> >
> > > > + *
> > > > + * @addr: virtual address in translation request
> > > > + */
> > > > +static void amdvi_page_fault(AMDVIState *s, uint16_t devid,
> > > > + hwaddr addr, uint16_t info)
> > > > +{
> > > > +uint64_t evt[4];
> > > > +
> > > > +info |= AMDVI_EVENT_IOPF_I | AMDVI_EVENT_IOPF;
> > > > +amdvi_encode_event(evt, devid, addr, info);
> > > > +amdvi_log_event(s, evt);
> > > > +pci_word_test_and_set_mask(s->pci.dev.config + PCI_STATUS,
> > > > +PCI_STATUS_SIG_TARGET_ABORT);
> > >
> > > Nit: maybe we can provide a function for setting this bit.
> > >
> >
> > I've actually been ignoring these since Qemu doesn't seem to care about
> > them.
> >
>
> Sorry I failed to understand your sentence.
>

I mean the Qemu PCI bus doesn't abort any transactions, regardless of whether
a device has set the abort status.
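
A helper of the kind suggested in the review might look roughly like the
sketch below. It is standalone and does not use the QEMU PCI API: the
example_* names are hypothetical, and the register offset and bit value are
the standard PCI status register (0x06) and "signaled target abort" bit
(0x0800).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PCI_STATUS                  0x06   /* status register offset */
#define EXAMPLE_PCI_STATUS_SIG_TARGET_ABORT 0x0800 /* signaled target abort  */

/*
 * Hypothetical helper: set 'mask' in the little-endian 16-bit word at
 * 'config' and report whether any of those bits were already set.
 */
static bool example_word_test_and_set(uint8_t *config, uint16_t mask)
{
    uint16_t val = (uint16_t)(config[0] | (config[1] << 8));
    uint16_t newval = (uint16_t)(val | mask);

    config[0] = (uint8_t)(newval & 0xff);
    config[1] = (uint8_t)(newval >> 8);
    return (val & mask) != 0;
}

/* Hypothetical wrapper that records a target abort in config space. */
static bool example_signal_target_abort(uint8_t *pci_config)
{
    return example_word_test_and_set(pci_config + EXAMPLE_PCI_STATUS,
                                     EXAMPLE_PCI_STATUS_SIG_TARGET_ABORT);
}
```

The wrapper only records the status bit; as said above, nothing on the bus
side acts on it.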


> -- peterx
>


[Qemu-devel] [V15 2/4] hw/i386/trace-events: Add AMD IOMMU trace events

2016-08-09 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 hw/i386/trace-events | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index b4882c1..592de3a 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -13,3 +13,32 @@ mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
 
 # hw/i386/x86-iommu.c
 x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC 
invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
+
+# hw/i386/amd_iommu.c
+amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 
0x%"PRIx64 " +  offset 0x%"PRIx32
+amdvi_cache_update(uint16_t domid, uint32_t bus, uint32_t slot, uint32_t func, 
uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: 
%02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_completion_wait_fail(uint64_t addr) "error: fail to write at address 
0x%"PRIx64
+amdvi_mmio_write(const char *reg, uint64_t addr, unsigned size, uint64_t val, 
unsigned long offset) "%s write addr 0x%"PRIx64 ", size %d, val 0x%"PRIx64 ", 
offset 0x%"PRIx64
+amdvi_mmio_read(const char *reg, uint64_t addr, unsigned size, uint64_t 
offset) "%s read addr 0x%"PRIx64", size %d offset 0x%"PRIx64
+amdvi_command_error(uint64_t status) "error: Executing commands with command 
buffer disabled 0x%"PRIx64
+amdvi_command_read_fail(uint64_t addr, uint32_t head) "error: fail to access 
memory at 0x%"PRIx64" + 0x%"PRIu32
+amdvi_command_exec(uint32_t head, uint32_t tail, uint64_t buf) "command buffer 
head at 0x%"PRIx32 " command buffer tail at 0x%"PRIx32" command buffer base at 
0x%" PRIx64
+amdvi_unhandled_command(uint8_t type) "unhandled command %d"
+amdvi_intr_inval(void) "Interrupt table invalidated"
+amdvi_iotlb_inval(void) "IOTLB pages invalidated"
+amdvi_prefetch_pages(void) "Pre-fetch of AMD-Vi pages requested"
+amdvi_pages_inval(uint16_t domid) "AMD-Vi pages for domain 0x%"PRIx16 " 
invalidated"
+amdvi_all_inval(void) "Invalidation of all AMD-Vi cache requested "
+amdvi_ppr_exec(void) "Execution of PPR queue requested "
+amdvi_devtab_inval(uint16_t bus, uint16_t slot, uint16_t func) "device table 
entry for devid: %02x:%02x.%x invalidated"
+amdvi_completion_wait(uint64_t addr, uint64_t data) "completion wait requested 
with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_control_status(uint64_t val) "MMIO_STATUS state 0x%"PRIx64
+amdvi_iotlb_reset(void) "IOTLB exceed size limit - reset "
+amdvi_completion_wait_exec(uint64_t addr, uint64_t data) "completion wait 
requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_dte_get_fail(uint64_t addr, uint32_t offset) "error: failed to access 
Device Entry devtab 0x%"PRIx64" offset 0x%"PRIx32
+amdvi_invalid_dte(uint64_t addr) "PTE entry at 0x%"PRIx64" is invalid "
+amdvi_get_pte_hwerror(uint64_t addr) "hardware error accessing PTE at addr 
0x%"PRIx64
+amdvi_mode_invalid(unsigned level, uint64_t addr)"error: translation level 
0x%"PRIu8" translating addr 0x%"PRIx64
+amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical 
address 0x%"PRIx64
+amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, 
uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t 
addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
-- 
2.1.4




[Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation and error checking, and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee0-0xfeef from translation
as it is the q35 interrupt region.

We advertise features that are not yet implemented to please
the Linux IOMMU driver.

The IOTLB aims at implementing commands found on real IOMMUs, which
is essential for debugging, and may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1399 +
 hw/i386/amd_iommu.h   |  390 ++
 hw/i386/trace-events  |7 +
 4 files changed, 1797 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 90e94ff..909ead6 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += x86-iommu.o intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 000..6016778c
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1399 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ */
+#include "qemu/osdep.h"
+#include 
+#include "hw/pci/msi.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "trace.h"
+
+/* used AMD-Vi MMIO registers */
+const char *amdvi_mmio_low[] = {
+"AMDVI_MMIO_DEVTAB_BASE",
+"AMDVI_MMIO_CMDBUF_BASE",
+"AMDVI_MMIO_EVTLOG_BASE",
+"AMDVI_MMIO_CONTROL",
+"AMDVI_MMIO_EXCL_BASE",
+"AMDVI_MMIO_EXCL_LIMIT",
+"AMDVI_MMIO_EXT_FEATURES",
+"AMDVI_MMIO_PPR_BASE",
+"UNHANDLED"
+};
+const char *amdvi_mmio_high[] = {
+"AMDVI_MMIO_COMMAND_HEAD",
+"AMDVI_MMIO_COMMAND_TAIL",
+"AMDVI_MMIO_EVTLOG_HEAD",
+"AMDVI_MMIO_EVTLOG_TAIL",
+"AMDVI_MMIO_STATUS",
+"AMDVI_MMIO_PPR_HEAD",
+"AMDVI_MMIO_PPR_TAIL",
+"UNHANDLED"
+};
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's address translation region  */
+MemoryRegion iommu_ir;  /* Device's interrupt remapping region  */
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint16_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+/* serialize IOMMU command processing */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t type:4;   /* command type   */
+uint64_t reserved:8;
+uint64_t store_addr:49;/* addr to write  */
+uint64_t completion_flush:1;   /* allow more executions  */
+uint64_t completion_int:1; /* set MMIOWAITINT*/
+uint64_t completion_store:1;   /* write data to address  */
+#else
+uint64_t completion_store:1;
+uint64_t completion_int:1;
+uint64_t completion_flush:1;
+uint64_t store_addr:49;
+uint64_t reserved:8;
+uint64_t type:4;
+#endif /* HOST_WORDS_BIGENDIAN */
+uint64_t store_data;   /* data to write  */
+} CMDCompletionWait;
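
As an aside, the same field layout can be read with explicit shifts and masks
instead of bitfields, which sidesteps host bitfield ordering. The sketch below
is not part of the patch: the example_* names are hypothetical, and it assumes
the 64-bit word has already been converted to host byte order and that the
fields are laid out as in the little-endian branch above (completion_store in
bit 0, type in bits 60..63).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the first 64-bit word of the command. */
typedef struct ExampleCompletionWait {
    bool     completion_store;  /* bit 0       */
    bool     completion_int;    /* bit 1       */
    bool     completion_flush;  /* bit 2       */
    uint64_t store_addr;        /* bits 3..51  */
    uint8_t  type;              /* bits 60..63 */
} ExampleCompletionWait;

/* Extract each field by shifting it down and masking to its width. */
static ExampleCompletionWait example_decode_completion_wait(uint64_t word)
{
    ExampleCompletionWait cmd;

    cmd.completion_store = (word >> 0) & 1;
    cmd.completion_int   = (word >> 1) & 1;
    cmd.completion_flush = (word >> 2) & 1;
    cmd.store_addr       = (word >> 3) & ((1ULL << 49) - 1);
    cmd.type             = (uint8_t)((word >> 60) & 0xf);
    return cmd;
}
```

The 8 reserved bits (52..59) are simply skipped by the decoder.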

[Qemu-devel] [V15 1/4] hw/pci: Prepare for AMD IOMMU

2016-08-09 Thread David Kiarie
Introduce PCI macros for use by AMD IOMMU

Signed-off-by: David Kiarie 
---
 include/hw/pci/pci.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 929ec2f..5ff92de 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -11,11 +11,13 @@
 #include "hw/pci/pcie.h"
 
 /* PCI bus */
-
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
+#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
+#define PCI_BUS_MAX 256
+#define PCI_DEVFN_MAX   256
 #define PCI_SLOT_MAX32
 #define PCI_FUNC_MAX8
 
-- 
2.1.4




[Qemu-devel] [V15 0/4] AMD IOMMU

2016-08-09 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu. 

Changes since v14
   -MMIO register reading/write bug fix [Peter]
   -Endian-ness issue fix[Peter]
   -Bitfields layouts in IOMMU commands fix[Peter]
   -IVRS changed IVHD device entry from type 3 to 1 to save a few bytes
   -coding style issues, comment grammar and other miscellaneous fixes.

Changes since v13
   -Added an error to make AMD IOMMU incompatible with device assignment.[Alex]
   -Converted AMD IOMMU into a composite PCI and System Bus device. This helps 
with:
  -We can now inherit from X86 IOMMU base class(which is implemented as a 
System Bus device).
  -We can now reserve MMIO region for IOMMU without a BAR register and 
without a hack.

Changes since v12

   -Coding style fixes [Jan, Michael]
   -Error logging fix to avoid using a macro[Jan]
   -moved some PCI macros to PCI header[Jan]
   -Use a lookup table for MMIO register names when tracing[Jan]

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on 
Marcel's patches).
   -IOMMU commands are represented using bitfields which is less error prone 
and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages including some obscure AMD IOMMU feature that allows 
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has 
BusMaster- and is therefore not able to generate interrupts like any other PCI 
device. We have resorted to writing directly to the system address space, but 
this could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the 
implementation file since almost everything in the header is basically a macro 
and I reckoned renaming them should suffice.
   -taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is 
configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been 
fixed[Marcel]
   
You can test[1] these patches by starting with parameters 
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
emulating whatever devices you want.

Not passing any command line parameters to Linux should be enough to test these 
patches since the devices are basically
passed through, but to the 'host' (L1 guest). You can still go ahead and pass 
the command line parameters 'iommu=pt iommu=1'
and try to pass a device to an L2 guest. This can also be done without passing 
any iommu related parameters to the kernel. 

David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  hw/i386/trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c |2 +-
 hw/i386/Makefile.objs   |1 +
 hw/i386/acpi-build.c|   76 ++-
 hw/i386/amd_iommu.c | 1401 +++
 hw/i386/amd_iommu.h |  390 
 hw/i386/intel_iommu.c   |1 +
 hw/i386/trace-events|   36 ++
 hw/i386/x86-iommu.c |6 +
 include/hw/acpi/aml-build.h |1 +
 include/hw/i386/x86-iommu.h |   12 +
 include/hw/pci/pci.h|4 +-
 11 files changed, 1919 insertions(+), 11 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




Re: [Qemu-devel] [RFC 2/2] hw/i386: enforce SID verification

2016-08-09 Thread David Kiarie
On Tue, Aug 9, 2016 at 9:41 PM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

>
>
> On 09.08.2016 19:32, David Kiarie wrote:
>
>> Platform devices are now able to make interrupt requests with
>> explicit SIDs, hence we can safely expect the triggered AddressSpace ID
>> to match the requesting ID.
>>
>> Signed-off-by: David Kiarie 
>> ---
>>  hw/i386/intel_iommu.c | 82 +++---
>> -
>>  1 file changed, 43 insertions(+), 39 deletions(-)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 28c31a2..153ac4e 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -32,7 +32,7 @@
>>  #include "hw/pci-host/q35.h"
>>  #include "sysemu/kvm.h"
>>
>> -/*#define DEBUG_INTEL_IOMMU*/
>> +#define DEBUG_INTEL_IOMMU
>>
> A leftover?


Yes, thanks. This shouldn't be present here.


>
>  #ifdef DEBUG_INTEL_IOMMU
>>  enum {
>>  DEBUG_GENERAL, DEBUG_CSR, DEBUG_INV, DEBUG_MMU, DEBUG_FLOG,
>> @@ -2043,43 +2043,41 @@ static int vtd_irte_get(IntelIOMMUState *iommu,
>> uint16_t index,
>>  return -VTD_FR_IR_IRTE_RSVD;
>>  }
>>
>> -if (sid != X86_IOMMU_SID_INVALID) {
>> -/* Validate IRTE SID */
>> -source_id = le32_to_cpu(entry->irte.source_id);
>> -switch (entry->irte.sid_vtype) {
>> -case VTD_SVT_NONE:
>> -VTD_DPRINTF(IR, "No SID validation for IRTE index %d",
>> index);
>> -break;
>> -
>> -case VTD_SVT_ALL:
>> -mask = vtd_svt_mask[entry->irte.sid_q];
>> -if ((source_id & mask) != (sid & mask)) {
>> -VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
>> -"%d failed (reqid 0x%04x sid 0x%04x)", index,
>> -sid, source_id);
>> -return -VTD_FR_IR_SID_ERR;
>> -}
>> -break;
>> +/* Validate IRTE SID */
>> +source_id = le32_to_cpu(entry->irte.source_id);
>> +switch (entry->irte.sid_vtype) {
>> +case VTD_SVT_NONE:
>> +VTD_DPRINTF(IR, "No SID validation for IRTE index %d", index);
>> +break;
>>
>> -case VTD_SVT_BUS:
>> -bus_max = source_id >> 8;
>> -bus_min = source_id & 0xff;
>> -bus = sid >> 8;
>> -if (bus > bus_max || bus < bus_min) {
>> -VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
>> -"failed (bus %d outside %d-%d)", index, bus,
>> -bus_min, bus_max);
>> -return -VTD_FR_IR_SID_ERR;
>> -}
>> -break;
>> +case VTD_SVT_ALL:
>> +mask = vtd_svt_mask[entry->irte.sid_q];
>> +if ((source_id & mask) != (sid & mask)) {
>> +VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
>> +"%d failed (reqid 0x%04x sid 0x%04x)", index,
>> +sid, source_id);
>> +return -VTD_FR_IR_SID_ERR;
>> +}
>> +break;
>>
>> -default:
>> -VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
>> -"%d", entry->irte.sid_vtype, index);
>> -/* Take this as verification failure. */
>> +case VTD_SVT_BUS:
>> +bus_max = source_id >> 8;
>> +bus_min = source_id & 0xff;
>> +bus = sid >> 8;
>> +if (bus > bus_max || bus < bus_min) {
>> +VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
>> +"failed (bus %d outside %d-%d)", index, bus,
>> +bus_min, bus_max);
>>  return -VTD_FR_IR_SID_ERR;
>> -break;
>>  }
>> +break;
>> +
>> +default:
>> +VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
>> +"%d", entry->irte.sid_vtype, index);
>> +/* Take this as verification failure. */
>> +return -VTD_FR_IR_SID_ERR;
>> +break;
>>  }
>>
>>  return 0;
>> @@ -2252,14 +2250,17 @@ static MemTxResult vtd_mem_ir_write(void *opaque,
>> hwadd

Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
On Tue, Aug 9, 2016 at 8:44 AM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 11:39:06AM +0300, David Kiarie wrote:
>
> [...]
>
> > +/* external write */
> > +static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
> > +{
> > +uint16_t romask = lduw_le_p(&s->romask[addr]);
> > +uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
> > +uint16_t oldval = lduw_le_p(&s->mmior[addr]);
> > +stw_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
>
> I think the above is problematic, e.g., what if we write 1 to one of
> the romask while it's 0 originally? In that case, the RO bit will be
> written to 1.
>
> Maybe we need:
>
>   stw_le_p(&s->mmior[addr], ((oldval & romask) | (val & ~romask)) & \
> (val & w1cmask));
>
> Same question to the below two functions.
>

It seems to me that your version doesn't handle the w1c bits correctly.

I think:

stw_le_p(&s->mmior[addr], ((oldval & romask) | (val & ~romask)) & \
   ~ (val & w1cmask));
should suffice.
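
To make the two mask rules concrete, here is a minimal standalone sketch of
the expression above. It is not the patch code: example_reg_write16() and the
mask values used with it are hypothetical, and it assumes that bits meant to
behave as write-1-to-clear are also covered by the read-only mask, so that
writing 0 to them leaves them unchanged.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper illustrating the expression discussed above.
 * Bits set in romask keep their old value, all other bits take the
 * written value, and any bit that is written as 1 and is also set in
 * w1cmask is cleared afterwards.
 */
static uint16_t example_reg_write16(uint16_t oldval, uint16_t val,
                                    uint16_t romask, uint16_t w1cmask)
{
    uint16_t merged = (uint16_t)((oldval & romask) | (val & ~romask));

    return (uint16_t)(merged & ~(val & w1cmask));
}
```

With romask 0xFF0F, w1cmask 0x000F, an old value of 0xAB05 and a write of
0x12F1, the result is 0xABF4: the read-only high byte is preserved, bits 7:4
take the written value, status bit 0 is cleared because a 1 was written to it,
and status bit 2 is left set because a 0 was written to it. Writing 1 to a
read-only bit that is currently 0 leaves it at 0.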


> > +}
> > +/*
> > + * AMDVi event structure
> > + *0:15   -> DeviceID
> > + *55:63  -> event type + miscellaneous info
> > + *64:127 -> related address
> > + */
> > +static void amdvi_encode_event(uint64_t *evt, uint16_t devid, uint64_t
> addr,
> > +   uint16_t info)
> > +{
> > +amdvi_setevent_bits(evt, devid, 0, 16);
> > +amdvi_setevent_bits(evt, info, 55, 8);
> > +amdvi_setevent_bits(evt, addr, 63, 64);
>   ^^
> should here be 64?
>

The code is correct but the comment above is misleading.
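
For readers following along, a bit-field setter of this general kind can be
sketched as below. This is not the patch's amdvi_setevent_bits():
example_set_bits() is a hypothetical name, it assumes a field never crosses a
64-bit word boundary, and the usage only covers the DeviceID and info fields
from the quoted comment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical bit-field setter for an event buffer made of 64-bit words.
 * 'start' is the absolute bit position and 'length' the field width.
 * The field must fit inside a single 64-bit word.
 */
static void example_set_bits(uint64_t *buf, uint64_t value,
                             unsigned start, unsigned length)
{
    unsigned index = start / 64;
    unsigned shift = start % 64;
    /* Avoid shifting by 64, which is undefined for a 64-bit type. */
    uint64_t field = (length == 64) ? ~0ULL : ((1ULL << length) - 1);

    assert(length >= 1 && shift + length <= 64);
    buf[index] &= ~(field << shift);          /* clear the old field */
    buf[index] |= (value & field) << shift;   /* insert the new one  */
}
```

Setting a 16-bit field at bit 0 and an 8-bit field at bit 55 touches only the
first word of the buffer.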


>
> Also, I am not sure whether we need this amdvi_setevent_bits() if it's
> only used in this function. Though not a big problem for me.
>
> > +}
> > +/* log an error encountered page-walking
>
> Thanks,
>
> -- peterx
>


[Qemu-devel] [RFC 2/2] hw/i386: enforce SID verification

2016-08-09 Thread David Kiarie
Platform devices are now able to make interrupt requests with
explicit SIDs, hence we can safely expect the triggered AddressSpace ID
to match the requesting ID.

Signed-off-by: David Kiarie 
---
 hw/i386/intel_iommu.c | 82 +++
 1 file changed, 43 insertions(+), 39 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 28c31a2..153ac4e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -32,7 +32,7 @@
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
 
-/*#define DEBUG_INTEL_IOMMU*/
+#define DEBUG_INTEL_IOMMU
 #ifdef DEBUG_INTEL_IOMMU
 enum {
 DEBUG_GENERAL, DEBUG_CSR, DEBUG_INV, DEBUG_MMU, DEBUG_FLOG,
@@ -2043,43 +2043,41 @@ static int vtd_irte_get(IntelIOMMUState *iommu, 
uint16_t index,
 return -VTD_FR_IR_IRTE_RSVD;
 }
 
-if (sid != X86_IOMMU_SID_INVALID) {
-/* Validate IRTE SID */
-source_id = le32_to_cpu(entry->irte.source_id);
-switch (entry->irte.sid_vtype) {
-case VTD_SVT_NONE:
-VTD_DPRINTF(IR, "No SID validation for IRTE index %d", index);
-break;
-
-case VTD_SVT_ALL:
-mask = vtd_svt_mask[entry->irte.sid_q];
-if ((source_id & mask) != (sid & mask)) {
-VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
-"%d failed (reqid 0x%04x sid 0x%04x)", index,
-sid, source_id);
-return -VTD_FR_IR_SID_ERR;
-}
-break;
+/* Validate IRTE SID */
+source_id = le32_to_cpu(entry->irte.source_id);
+switch (entry->irte.sid_vtype) {
+case VTD_SVT_NONE:
+VTD_DPRINTF(IR, "No SID validation for IRTE index %d", index);
+break;
 
-case VTD_SVT_BUS:
-bus_max = source_id >> 8;
-bus_min = source_id & 0xff;
-bus = sid >> 8;
-if (bus > bus_max || bus < bus_min) {
-VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
-"failed (bus %d outside %d-%d)", index, bus,
-bus_min, bus_max);
-return -VTD_FR_IR_SID_ERR;
-}
-break;
+case VTD_SVT_ALL:
+mask = vtd_svt_mask[entry->irte.sid_q];
+if ((source_id & mask) != (sid & mask)) {
+VTD_DPRINTF(GENERAL, "SID validation for IRTE index "
+"%d failed (reqid 0x%04x sid 0x%04x)", index,
+sid, source_id);
+return -VTD_FR_IR_SID_ERR;
+}
+break;
 
-default:
-VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
-"%d", entry->irte.sid_vtype, index);
-/* Take this as verification failure. */
+case VTD_SVT_BUS:
+bus_max = source_id >> 8;
+bus_min = source_id & 0xff;
+bus = sid >> 8;
+if (bus > bus_max || bus < bus_min) {
+VTD_DPRINTF(GENERAL, "SID validation for IRTE index %d "
+"failed (bus %d outside %d-%d)", index, bus,
+bus_min, bus_max);
 return -VTD_FR_IR_SID_ERR;
-break;
 }
+break;
+
+default:
+VTD_DPRINTF(GENERAL, "Invalid SVT bits (0x%x) in IRTE index "
+"%d", entry->irte.sid_vtype, index);
+/* Take this as verification failure. */
+return -VTD_FR_IR_SID_ERR;
+break;
 }
 
 return 0;
@@ -2252,14 +2250,17 @@ static MemTxResult vtd_mem_ir_write(void *opaque, 
hwaddr addr,
 {
 int ret = 0;
 MSIMessage from = {}, to = {};
-uint16_t sid = X86_IOMMU_SID_INVALID;
+VTDAddressSpace *as = opaque;
+uint16_t sid = pci_bus_num(as->bus) << 8 | as->devfn;
 
 from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
 from.data = (uint32_t) value;
 
-if (!attrs.unspecified) {
-/* We have explicit Source ID */
-sid = attrs.requester_id;
+if (attrs.requester_id != sid) {
+VTD_DPRINTF(GENERAL, "int remap request for sid 0x%04x"
+" requester_id 0x%04x couldn't be verified",
+sid, attrs.requester_id);
+return MEMTX_ERROR;
 }
 
 ret = vtd_interrupt_remap_msi(opaque, &from, &to, sid);
@@ -2325,7 +2326,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, 
PCIBus *bus, int devfn)
 memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
  &s->iommu_ops, "intel_iommu", UINT64_MAX);
 memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
-  &vtd_mem_ir_ops, s, &q

[Qemu-devel] [RFC 0/2] Explicit SID for IOAPIC

2016-08-09 Thread David Kiarie
IOMMUs require platform devices like the IOAPIC and possibly HPET to make
interrupt requests using explicit SIDs, which they currently don't.

These patches modify x86 code such that an MSI route entry is affiliated with 
a requester ID and, if present, a PCI device. This change doesn't seem to
have any side effects as far as I can tell.
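
The idea can be sketched roughly as follows. None of these types or functions
exist in QEMU; they are hypothetical stand-ins, and the sketch simplifies by
assuming a PCI device's requester ID equals its bus/devfn pair, which is not
always true (for example behind bridges).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not QEMU types. */
typedef struct ExamplePciDev {
    uint8_t bus;
    uint8_t devfn;
} ExamplePciDev;

typedef struct ExampleMsiRoute {
    uint16_t requester_id;     /* SID used for interrupt remapping */
    const ExamplePciDev *dev;  /* NULL for platform devices        */
} ExampleMsiRoute;

/* PCI callers pass the device together with its own requester ID. */
static ExampleMsiRoute example_route_for_pci(const ExamplePciDev *dev)
{
    ExampleMsiRoute r = {
        /* Simplification: requester ID taken to be bus << 8 | devfn. */
        .requester_id = (uint16_t)((dev->bus << 8) | dev->devfn),
        .dev = dev,
    };
    return r;
}

/* Platform devices such as the IOAPIC pass only an explicit SID. */
static ExampleMsiRoute example_route_for_platform(uint16_t sid)
{
    ExampleMsiRoute r = { .requester_id = sid, .dev = NULL };
    return r;
}
```

Either way the route always carries a requester ID, so the remapping code does
not need a special "invalid SID" case for devices without a PCI identity.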

David Kiarie (2):
  hw/msi: Allow platform devices to use explicit SID
  hw/i386: enforce SID verification

 hw/i386/intel_iommu.c | 82 ---
 hw/i386/kvm/pci-assign.c  | 12 --
 hw/intc/ioapic.c  | 28 +++--
 hw/misc/ivshmem.c |  6 ++-
 hw/vfio/pci.c |  6 ++-
 hw/virtio/virtio-pci.c|  6 ++-
 include/hw/i386/ioapic_internal.h |  1 +
 include/hw/i386/x86-iommu.h   |  1 +
 include/sysemu/kvm.h  |  7 ++--
 kvm-all.c | 10 +++--
 target-i386/kvm.c | 15 ---
 11 files changed, 108 insertions(+), 66 deletions(-)

-- 
2.1.4




[Qemu-devel] [RFC 1/2] hw/msi: Allow platform devices to use explicit SID

2016-08-09 Thread David Kiarie
When using an IOMMU, platform devices like the IOAPIC are required to
make interrupt remapping requests using an explicit SID. We affiliate
an MSI route with a requester ID and a PCI device, if present, which
ensures that platform devices can call IOMMU interrupt remapping code
with an explicit SID while maintaining compatibility with the original
code, which mainly dealt with PCI devices.

Signed-off-by: David Kiarie 
---
 hw/i386/kvm/pci-assign.c  | 12 
 hw/intc/ioapic.c  | 28 
 hw/misc/ivshmem.c |  6 --
 hw/vfio/pci.c |  6 --
 hw/virtio/virtio-pci.c|  6 --
 include/hw/i386/ioapic_internal.h |  1 +
 include/hw/i386/x86-iommu.h   |  1 +
 include/sysemu/kvm.h  |  7 ---
 kvm-all.c | 10 ++
 target-i386/kvm.c | 15 +--
 10 files changed, 65 insertions(+), 27 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 8238fbc..99547c5 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -976,7 +976,8 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
 if (ctrl_byte & PCI_MSI_FLAGS_ENABLE) {
 int virq;
 
-virq = kvm_irqchip_add_msi_route(kvm_state, 0, pci_dev);
+virq = kvm_irqchip_add_msi_route(kvm_state, 0, pci_dev,
+pci_requester_id(pci_dev));
 if (virq < 0) {
 perror("assigned_dev_update_msi: kvm_irqchip_add_msi_route");
 return;
@@ -1014,7 +1015,8 @@ static void assigned_dev_update_msi_msg(PCIDevice 
*pci_dev)
 }
 
 kvm_irqchip_update_msi_route(kvm_state, assigned_dev->msi_virq[0],
- msi_get_message(pci_dev, 0), pci_dev);
+ msi_get_message(pci_dev, 0), pci_dev,
+ pci_requester_id(pci_dev));
 kvm_irqchip_commit_routes(kvm_state);
 }
 
@@ -1078,7 +1080,8 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
*pci_dev)
 continue;
 }
 
-r = kvm_irqchip_add_msi_route(kvm_state, i, pci_dev);
+r = kvm_irqchip_add_msi_route(kvm_state, i, pci_dev,
+pci_requester_id(pci_dev));
 if (r < 0) {
 return r;
 }
@@ -1599,7 +1602,8 @@ static void assigned_dev_msix_mmio_write(void *opaque, 
hwaddr addr,
 
 ret = kvm_irqchip_update_msi_route(kvm_state,
adev->msi_virq[i], msg,
-   pdev);
+   pdev,
+   pci_requester_id(pdev));
 if (ret) {
 error_report("Error updating irq routing entry (%d)", ret);
 }
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 31791b0..cc7fb5d 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -95,9 +95,17 @@ static void ioapic_entry_parse(uint64_t entry, struct 
ioapic_entry_info *info)
 (info->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
 }
 
-static void ioapic_service(IOAPICCommonState *s)
+static void ioapic_write_ioapic_as(IOAPICCommonState *s, uint32_t data, 
uint64_t addr)
 {
 AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+MemTxAttrs attrs;
+
+attrs.requester_id = s->devid;
+address_space_stl_le(ioapic_as, addr, data, attrs, NULL);
+}
+
+static void ioapic_service(IOAPICCommonState *s)
+{
 struct ioapic_entry_info info;
 uint8_t i;
 uint32_t mask;
@@ -141,7 +149,7 @@ static void ioapic_service(IOAPICCommonState *s)
  * the IOAPIC message into a MSI one, and its
  * address space will decide whether we need a
  * translation. */
-stl_le_phys(ioapic_as, info.addr, info.data);
+ioapic_write_ioapic_as(s, info.data, info.addr);
 }
 }
 }
@@ -197,7 +205,7 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
 ioapic_entry_parse(s->ioredtbl[i], &info);
 msg.address = info.addr;
 msg.data = info.data;
-kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL, s->devid);
 }
 kvm_irqchip_commit_routes(kvm_state);
 }
@@ -385,12 +393,23 @@ static void ioapic_machine_done_notify(Notifier 
*notifier, void *data)
 
 if (kvm_irqchip_is_split()) {
 X86IOMMUState *iommu = x86_iommu_get_default();
+MSIMessage msg = {0, 0};
+int i;
+
 if (iommu) {
 /* Register this IOAPIC with IOMMU IEC notifier, so that
  * when there are IR invalidates, we can be notified to
  * update kernel IR cache. */
-x86_iommu_iec_register_n

Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
On Tue, Aug 9, 2016 at 4:01 PM, Valentine Sinitsyn <
valentine.sinit...@gmail.com> wrote:

> Hi all,
>
> On 09.08.2016 17:52, David Kiarie wrote:
>
>>
>>
>> On Tue, Aug 9, 2016 at 8:44 AM, Peter Xu > <mailto:pet...@redhat.com>> wrote:
>>
>> On Tue, Aug 02, 2016 at 11:39:06AM +0300, David Kiarie wrote:
>>
>> [...]
>>
>> > +/* invalidate internal caches for devid */
>> > +typedef struct QEMU_PACKED {
>> > +#ifdef HOST_WORDS_BIGENDIAN
>> > +uint64_t devid;/* device to invalidate   */
>> > +uint64_t reserved_1:44;
>> > +uint64_t type:4;   /* command type   */
>> > +#else
>> > +uint64_t devid;
>> > +uint64_t reserved_1:44;
>> > +uint64_t type:4;
>> > +#endif /* __BIG_ENDIAN_BITFIELD */
>>
>> Guess you forgot to reverse the order of fields in one of above block.
>>
>>
>> Yes, I forgot to reverse order of fields here.
>>
>>
>>
>> [...]
>>
>> > +/* load address translation info for devid into translation cache
>> */
>> > +typedef struct QEMU_PACKED {
>> > +#ifdef HOST_WORDS_BIGENDIAN
>> > +uint64_t type:4;  /* command type   */
>> > +uint64_t reserved_2:8;
>> > +uint64_t pasid_19_0:20;
>> > +uint64_t pfcount_7_0:8;
>> > +uint64_t reserved_1:8;
>> > +uint64_t devid;   /* related devid  */
>> > +#else
>> > +uint64_t devid;
>> > +uint64_t reserved_1:8;
>> > +uint64_t pfcount_7_0:8;
>> > +uint64_t pasid_19_0:20;
>> > +uint64_t reserved_2:8;
>> > +uint64_t type:4;
>> > +#endif /* __BIG_ENDIAN_BITFIELD */
>>
>> For this one, "devid" looks like a 16 bits field?
>>
>>
>> Right. should be 16 bits.
>>
>>
>>
>> [...]
>>
>> > +/* issue a PCIe completion packet for devid */
>> > +typedef struct QEMU_PACKED {
>> > +#ifdef HOST_WORDS_BIGENDIAN
>> > +uint32_t devid;   /* related devid  */
>> > +uint32_t reserved_1;
>> > +#else
>> > +uint32_t reserved_1;
>> > +uint32_t devid;
>> > +#endif /* __BIG_ENDIAN_BITFIELD */
>>
>> Here I am not sure we need this "#ifdef".
>>
>>
>> There's an error here but it's not with the #ifdef but instead I have
>> not set the right bit on the bitfields - for instance devid should be 16.
>>
>>
>>
>> [...]
>>
>> > +/* external write */
>> > +static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
>> > +{
>> > +uint16_t romask = lduw_le_p(&s->romask[addr]);
>> > +uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
>> > +uint16_t oldval = lduw_le_p(&s->mmior[addr]);
>> > +stw_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
>> oldval));
>>
>> I think the above is problematic, e.g., what if we write 1 to one of
>> the romask while it's 0 originally? In that case, the RO bit will be
>> written to 1.
>>
>> Maybe we need:
>>
>>   stw_le_p(&s->mmior[addr], ((oldval & romask) | (val & ~romask)) & \
>> (val & w1cmask));
>>
>> Same question to the below two functions.
>>
>>
>> Right. I was very determined to come up with my algo but failed horribly
>> ;-)
>>
>>
>>
>> > +}
>> > +
>> > +static void amdvi_writel(AMDVIState *s, hwaddr addr, uint32_t val)
>> > +{
>> > +uint32_t romask = ldl_le_p(&s->romask[addr]);
>> > +uint32_t w1cmask = ldl_le_p(&s->w1cmask[addr]);
>> > +uint32_t oldval = ldl_le_p(&s->mmior[addr]);
>> > +stl_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
>> oldval));
>> > +}
>> > +
>> > +static void amdvi_writeq(AMDVIState *s, hwaddr addr, uint64_t val)
>> > +{
>> > +uint64_t romask = ldq_le_p(&s->romask[addr]);
>> > +uint64_t w1cmask = ldq_le_p(&s->w1cmask[addr]);
>> > +uint32_t oldval = 

Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
On Tue, Aug 9, 2016 at 8:44 AM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 11:39:06AM +0300, David Kiarie wrote:
>
> [...]
>
> > +/* invalidate internal caches for devid */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint64_t devid;/* device to invalidate   */
> > +uint64_t reserved_1:44;
> > +uint64_t type:4;   /* command type   */
> > +#else
> > +uint64_t devid;
> > +uint64_t reserved_1:44;
> > +uint64_t type:4;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> Guess you forgot to reverse the order of fields in one of above block.
>

Yes, I forgot to reverse the order of the fields here.
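
One way to sidestep the field-ordering problem altogether would be to drop
the bitfields and decode the command word with shifts and masks after a
little-endian load. A minimal sketch, assuming devid sits in bits 15:0 and
the command type in bits 63:60 of the first quadword (which is what the
struct above suggests once devid is narrowed to 16 bits). The helper names
are hypothetical and not part of the patch:

```c
#include <stdint.h>

/* Assemble a 64-bit value from 8 little-endian bytes, independent of host order. */
static uint64_t cmd_load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Assumed layout: devid in bits 15:0 of the first quadword. */
static uint16_t cmd_devid(uint64_t w)
{
    return (uint16_t)(w & 0xffff);
}

/* Assumed layout: command type in bits 63:60 of the first quadword. */
static uint8_t cmd_type(uint64_t w)
{
    return (uint8_t)(w >> 60);
}
```

The same code then runs unchanged on big- and little-endian hosts, so no
#ifdef HOST_WORDS_BIGENDIAN block would be needed for this command.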


>
> [...]
>
> > +/* load address translation info for devid into translation cache */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint64_t type:4;  /* command type   */
> > +uint64_t reserved_2:8;
> > +uint64_t pasid_19_0:20;
> > +uint64_t pfcount_7_0:8;
> > +uint64_t reserved_1:8;
> > +uint64_t devid;   /* related devid  */
> > +#else
> > +uint64_t devid;
> > +uint64_t reserved_1:8;
> > +uint64_t pfcount_7_0:8;
> > +uint64_t pasid_19_0:20;
> > +uint64_t reserved_2:8;
> > +uint64_t type:4;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> For this one, "devid" looks like a 16 bits field?
>

Right, it should be 16 bits.


>
> [...]
>
> > +/* issue a PCIe completion packet for devid */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint32_t devid;   /* related devid  */
> > +uint32_t reserved_1;
> > +#else
> > +uint32_t reserved_1;
> > +uint32_t devid;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> Here I am not sure we need this "#ifdef".
>

There's an error here, but it's not with the #ifdef: I have not set the
right widths on the bitfields. For instance, devid should be 16 bits.


>
> [...]
>
> > +/* external write */
> > +static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
> > +{
> > +uint16_t romask = lduw_le_p(&s->romask[addr]);
> > +uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
> > +uint16_t oldval = lduw_le_p(&s->mmior[addr]);
> > +stw_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
>
> I think the above is problematic, e.g., what if we write 1 to one of
> the romask while it's 0 originally? In that case, the RO bit will be
> written to 1.
>
> Maybe we need:
>
>   stw_le_p(&s->mmior[addr], ((oldval & romask) | (val & ~romask)) & \
> (val & w1cmask));
>
> Same question to the below two functions.
>

Right. I was very determined to come up with my algo but failed horribly ;-)
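
For the record, a minimal sketch of one way the masks could be combined,
assuming romask marks read-only bits and w1cmask marks write-1-to-clear
bits, and that a W1C bit keeps its old value unless the guest writes 1 to
it. The function name is hypothetical and this is not necessarily what the
next version will do:

```c
#include <stdint.h>

/*
 * Compute the new 16-bit register value for a guest write.
 *  - read-only bits (romask) always keep their old value
 *  - W1C bits (w1cmask) keep their old value unless 1 is written, then clear
 *  - all remaining bits take the written value
 */
static uint16_t reg_write16(uint16_t oldval, uint16_t val,
                            uint16_t romask, uint16_t w1cmask)
{
    uint16_t keep = (uint16_t)(romask | w1cmask);
    uint16_t next = (uint16_t)((oldval & keep) | (val & (uint16_t)~keep));

    /* Writing 1 to a W1C bit clears it. */
    return (uint16_t)(next & (uint16_t)~(val & w1cmask));
}
```

Writing 1 to a read-only bit that is currently 0 leaves it at 0 here, which
is the case Peter pointed out.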


>
> > +}
> > +
> > +static void amdvi_writel(AMDVIState *s, hwaddr addr, uint32_t val)
> > +{
> > +uint32_t romask = ldl_le_p(&s->romask[addr]);
> > +uint32_t w1cmask = ldl_le_p(&s->w1cmask[addr]);
> > +uint32_t oldval = ldl_le_p(&s->mmior[addr]);
> > +stl_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
> > +}
> > +
> > +static void amdvi_writeq(AMDVIState *s, hwaddr addr, uint64_t val)
> > +{
> > +uint64_t romask = ldq_le_p(&s->romask[addr]);
> > +uint64_t w1cmask = ldq_le_p(&s->w1cmask[addr]);
> > +uint32_t oldval = ldq_le_p(&s->mmior[addr]);
> > +stq_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
> > +}
> > +
> > +/* OR a 64-bit register with a 64-bit value */
> > +static bool amdvi_orq(AMDVIState *s, hwaddr addr, uint64_t val)
>
> Nit: This function name gives me an illusion that it's a write op, not
> read. IMHO it'll be better we directly use amdvi_readq() for all the
> callers of this function, which is more clear to me.
>
> > +{
> > +return amdvi_readq(s, addr) | val;
> > +}
> > +
> > +/* OR a 64-bit register with a 64-bit value storing result in the
> register */
> > +static void amdvi_orassignq(AMDVIState *s, hwaddr addr, uint64_t val)
> > +{
> > +amdvi_writeq_raw(s, addr, amdvi_readq(s, addr) | val);
> > +}
> > +
> > +/* AND a 64-bit register with a 64-bit value storing result in the
> register */
> > +static void amdvi_and_assignq(AMDVIState *s, hwaddr addr, uint64_t val)
>

Re: [Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-09 Thread David Kiarie
On Tue, Aug 9, 2016 at 8:44 AM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 11:39:06AM +0300, David Kiarie wrote:
>
> [...]
>
>
Hi Peter.

Most of your comments are valid, though some are subjective :-). I'm
covering most if not all of them in the next version (which should be coming shortly).

> +/* invalidate internal caches for devid */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint64_t devid;/* device to invalidate   */
> > +uint64_t reserved_1:44;
> > +uint64_t type:4;   /* command type   */
> > +#else
> > +uint64_t devid;
> > +uint64_t reserved_1:44;
> > +uint64_t type:4;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> Guess you forgot to reverse the order of fields in one of above block.
>
> [...]
>
> > +/* load address translation info for devid into translation cache */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint64_t type:4;  /* command type   */
> > +uint64_t reserved_2:8;
> > +uint64_t pasid_19_0:20;
> > +uint64_t pfcount_7_0:8;
> > +uint64_t reserved_1:8;
> > +uint64_t devid;   /* related devid  */
> > +#else
> > +uint64_t devid;
> > +uint64_t reserved_1:8;
> > +uint64_t pfcount_7_0:8;
> > +uint64_t pasid_19_0:20;
> > +uint64_t reserved_2:8;
> > +uint64_t type:4;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> For this one, "devid" looks like a 16 bits field?
>
> [...]
>
> > +/* issue a PCIe completion packet for devid */
> > +typedef struct QEMU_PACKED {
> > +#ifdef HOST_WORDS_BIGENDIAN
> > +uint32_t devid;   /* related devid  */
> > +uint32_t reserved_1;
> > +#else
> > +uint32_t reserved_1;
> > +uint32_t devid;
> > +#endif /* __BIG_ENDIAN_BITFIELD */
>
> Here I am not sure we need this "#ifdef".
>
> [...]
>
> > +/* external write */
> > +static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
> > +{
> > +uint16_t romask = lduw_le_p(&s->romask[addr]);
> > +uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
> > +uint16_t oldval = lduw_le_p(&s->mmior[addr]);
> > +stw_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
>
> I think the above is problematic, e.g., what if we write 1 to one of
> the romask while it's 0 originally? In that case, the RO bit will be
> written to 1.
>
> Maybe we need:
>
>   stw_le_p(&s->mmior[addr], ((oldval & romask) | (val & ~romask)) & \
> (val & w1cmask));
>
> Same question to the below two functions.
>
> > +}
> > +
> > +static void amdvi_writel(AMDVIState *s, hwaddr addr, uint32_t val)
> > +{
> > +uint32_t romask = ldl_le_p(&s->romask[addr]);
> > +uint32_t w1cmask = ldl_le_p(&s->w1cmask[addr]);
> > +uint32_t oldval = ldl_le_p(&s->mmior[addr]);
> > +stl_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
> > +}
> > +
> > +static void amdvi_writeq(AMDVIState *s, hwaddr addr, uint64_t val)
> > +{
> > +uint64_t romask = ldq_le_p(&s->romask[addr]);
> > +uint64_t w1cmask = ldq_le_p(&s->w1cmask[addr]);
> > +uint32_t oldval = ldq_le_p(&s->mmior[addr]);
> > +stq_le_p(&s->mmior[addr], (val & ~(val & w1cmask)) | (romask &
> oldval));
> > +}
> > +
> > +/* OR a 64-bit register with a 64-bit value */
> > +static bool amdvi_orq(AMDVIState *s, hwaddr addr, uint64_t val)
>
> Nit: This function name gives me an illusion that it's a write op, not
> read. IMHO it'll be better we directly use amdvi_readq() for all the
> callers of this function, which is more clear to me.
>
> > +{
> > +return amdvi_readq(s, addr) | val;
> > +}
> > +
> > +/* OR a 64-bit register with a 64-bit value storing result in the
> register */
> > +static void amdvi_orassignq(AMDVIState *s, hwaddr addr, uint64_t val)
> > +{
> > +amdvi_writeq_raw(s, addr, amdvi_readq(s, addr) | val);
> > +}
> > +
> > +/* AND a 64-bit register with a 64-bit value storing result in the
> register */
> > +static void amdvi_and_assignq(AMDVIState *s, hwaddr addr, uint64_t val)
>
> Nit: the name is not matched with above:
>
>   amdvi_{or|and}assign[qw]
>
> Though I would prefer:
>
>   amdvi_assign_[qw]_{or|and}
>
> [...]
>

Re: [Qemu-devel] [PULL v5 29/57] intel_iommu: add SID validation for IR

2016-08-08 Thread David Kiarie
On Mon, Aug 8, 2016 at 12:06 PM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 03:17:20PM +0300, David Kiarie wrote:
> > On Tue, Aug 2, 2016 at 3:12 PM, Peter Xu  wrote:
> >
> > > On Tue, Aug 02, 2016 at 02:58:55PM +0300, David Kiarie wrote:
> > > > > Sure. David, so do you like to do it or I cook this patch? :)
> > > >
> > > > If there are no objections I will look at this employing Jan's
> approach:
> > > > associating a write with an address space.
> > >
> > > Do you mean to translate current stl_le_phys() into something like
> > > address_space_stl_le(), with MemTxAttrs? (in ioapic_service())
> > >
> >
> > I tried doing something like that but the write gets discarded
> somewhere. I
> > don't see the write from IOMMU side.
>
> Hi, Jan, David,
>
> Sorry to respond late, but what's the version of your guest kernel? I
> suspect there is bug in guest IOMMU codes with IR on EOI handling, and
> maybe you can try to boost IOAPIC version to 0x20 when with old
> kernels using "-global ioapic.version=0x20".
>

I'm using a mainline 4.7 kernel. I haven't experienced any issues yet.


>
> Thanks,
>
> -- peterx
>


Re: [Qemu-devel] [V15 1/4] hw/pci: Prepare for AMD IOMMU

2016-08-08 Thread David Kiarie
On Mon, Aug 8, 2016 at 12:01 PM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 11:39:04AM +0300, David Kiarie wrote:
> > Introduce PCI macros from for use by AMD IOMMU
> >
> > Signed-off-by: David Kiarie 
> > ---
> >  include/hw/pci/pci.h | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index 929ec2f..d47e0e6 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -11,11 +11,14 @@
> >  #include "hw/pci/pcie.h"
> >
> >  /* PCI bus */
> > -
> > +#define PCI_BDF(bus, devfn)     ((((uint16_t)(bus)) << 8) | (devfn))
>
> Seems the same as PCI_BUILD_BDF() below?
>

Yes, I noted that. It's one of the things I intend to fix in the next version.
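
To illustrate the overlap, here is a small self-contained sketch that copies
the macros from this patch and shows that PCI_BDF() and PCI_BUILD_BDF()
produce the same value for in-range inputs. The 0x00a0 value is the AMD
IOAPIC ID mentioned in the intel_iommu SID thread. This is only an
illustration, not proposed code:

```c
#include <stdint.h>

/* Macros as they appear in include/hw/pci/pci.h with this patch applied. */
#define PCI_BDF(bus, devfn)       ((((uint16_t)(bus)) << 8) | (devfn))
#define PCI_DEVFN(slot, func)     ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_BUS_NUM(x)            (((x) >> 8) & 0xff)
#define PCI_SLOT(devfn)           (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)           ((devfn) & 0x07)
#define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))

/* Compose a BDF from bus/slot/function; bus is bits 15:8, devfn is bits 7:0. */
static uint16_t make_bdf(uint8_t bus, uint8_t slot, uint8_t func)
{
    return (uint16_t)PCI_BDF(bus, PCI_DEVFN(slot, func));
}
```

For example, make_bdf(0, 0x14, 0) yields 0x00a0, i.e. device 00:14.0.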


> >  #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
> > +#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
> >  #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
> >  #define PCI_FUNC(devfn) ((devfn) & 0x07)
> >  #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
> > +#define PCI_BUS_MAX 256
> > +#define PCI_DEVFN_MAX   256
> >  #define PCI_SLOT_MAX32
> >  #define PCI_FUNC_MAX8
> >
> > --
> > 2.1.4
> >
>
> -- peterx
>


Re: [Qemu-devel] [PULL v5 29/57] intel_iommu: add SID validation for IR

2016-08-02 Thread David Kiarie
On Tue, Aug 2, 2016 at 3:16 PM, Jan Kiszka  wrote:

> On 2016-08-02 13:58, David Kiarie wrote:
> >
> >
> > On Tue, Aug 2, 2016 at 1:28 PM, Peter Xu  > <mailto:pet...@redhat.com>> wrote:
> >
> > On Tue, Aug 02, 2016 at 10:46:13AM +0200, Jan Kiszka wrote:
> > > On 2016-08-02 10:36, Peter Xu wrote:
> > > > On Mon, Aug 01, 2016 at 06:39:05PM +0200, Jan Kiszka wrote:
> > > >
> > > > [...]
> > > >
> > > >>>  static MemTxResult vtd_mem_ir_read(void *opaque, hwaddr addr,
> > > >>> @@ -2209,11 +2250,17 @@ static MemTxResult
> > vtd_mem_ir_write(void *opaque, hwaddr addr,
> > > >>>  {
> > > >>>  int ret = 0;
> > > >>>  MSIMessage from = {}, to = {};
> > > >>> +uint16_t sid = X86_IOMMU_SID_INVALID;
> > > >>>
> > > >>>  from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
> > > >>>  from.data = (uint32_t) value;
> > > >>>
> > > >>> -ret = vtd_interrupt_remap_msi(opaque, &from, &to);
> > > >>> +if (!attrs.unspecified) {
> > > >>> +/* We have explicit Source ID */
> > > >>> +sid = attrs.requester_id;
> > > >>> +}
> > > >>
> > > >> ...here you fall back to X86_IOMMU_SID_INVALID if writer to
> > this region
> > > >> has not provided some valid attrs. That is questionable, defeats
> > > >> validation of the IOAPIC e.g. (and you can see lots of
> > > >> X86_IOMMU_SID_INVALID in vtd_irte_get when booting a guest).
> > > >>
> > > >> The credits also go to David who noticed that he still doesn't
> > get a
> > > >> proper ID from the IOAPIC while implementing AMD IR. Looks like
> > we need
> > > >> to enlighten the IOAPIC MSI writes...
> > > >
> > > > Jan, David,
> > > >
> > > > At the time when drafting the patch, I skipped SID verification
> for
> > > > IOAPIC interrupts since it differs from generic PCI devices (no
> > > > natural requester ID, so need some hacky lines to enable it).
> > >
> > > It's not hacky at all if done properly. For Intel it is simply
> > > (Q35_PSEUDO_BUS_PLATFORM << 8) | Q35_PSEUDO_DEVFN_IOAPIC, but it
> > will be
> > > 0x00a0 (as constant as well) for AMD. So we need some interface to
> > tell
> > > those parameters to the IOMMU. Keep in mind that we will need a
> > similar
> > > interface for other platform devices, e.g. the HPET.
> >
> > Okay.
> >
> > >
> > > >
> > > > > I can try to cook another separate patch to enable it (for 2.8
> > > > possibly?). Thanks for pointing out this issue.
> > >
> > > David needs that IOAPIC ID as well in order to finish interrupt
> > > remapping on AMD. Please synchronize with him who will implement
> what.
> >
> > Sure. David, so do you like to do it or I cook this patch? :)
> >
> >
> > If there are no objections I will look at this employing Jan's approach:
> > associating a write with an address space.
> >
>
> That actually means making it an IOMMU-local thing (if write_requester
> == invalid -> derive ID from addressed memory region, probably via the
> opaque passed to the ops). In that case, everyone could do this on himself.
>

Yes, each person can actually do their part without affecting the other ;-)
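
A rough sketch of how I understand the IOMMU-local fallback, assuming the
IOMMU knows a default requester ID for the region being written (0x00a0 for
the IOAPIC on AMD, per Jan's note). TxAttrs below is a hypothetical stand-in
for MemTxAttrs, and resolve_sid() is not an existing QEMU function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two MemTxAttrs fields used here. */
typedef struct {
    bool unspecified;       /* writer supplied no attributes */
    uint16_t requester_id;  /* valid only when !unspecified   */
} TxAttrs;

/*
 * Pick the source ID for an interrupt write: use the explicit requester ID
 * if the writer provided one, otherwise fall back to the ID associated with
 * the addressed region.
 */
static uint16_t resolve_sid(TxAttrs attrs, uint16_t region_default_sid)
{
    if (!attrs.unspecified) {
        return attrs.requester_id;
    }
    return region_default_sid;
}
```

Since the fallback lives entirely inside each IOMMU model, the Intel and AMD
sides can pick their own default IDs independently.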


>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux
>


Re: [Qemu-devel] [PULL v5 29/57] intel_iommu: add SID validation for IR

2016-08-02 Thread David Kiarie
On Tue, Aug 2, 2016 at 3:12 PM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 02:58:55PM +0300, David Kiarie wrote:
> > > Sure. David, so do you like to do it or I cook this patch? :)
> >
> > If there are no objections I will look at this employing Jan's approach:
> > associating a write with an address space.
>
> Do you mean to translate current stl_le_phys() into something like
> address_space_stl_le(), with MemTxAttrs? (in ioapic_service())
>

I tried doing something like that, but the write gets discarded somewhere. I
don't see the write from the IOMMU side.

>
> Also, IIUC we also need to tweak a little bit more for split irqchip
> case in kvm_arch_fixup_msi_route(). Actually for this one, I think
> maybe we can assume the requester ID be IOAPIC's when dev == NULL,
> since HPET should not be using kernel irqchip, right?
>

I meant to have something like what is here:
http://git.kiszka.org/?p=qemu.git;a=commitdiff;h=4f27331e7769a571c7d7fb61cf75e1b2fe908f85



> Thanks,
>
> -- peterx
>


Re: [Qemu-devel] [PULL v5 29/57] intel_iommu: add SID validation for IR

2016-08-02 Thread David Kiarie
On Tue, Aug 2, 2016 at 1:28 PM, Peter Xu  wrote:

> On Tue, Aug 02, 2016 at 10:46:13AM +0200, Jan Kiszka wrote:
> > On 2016-08-02 10:36, Peter Xu wrote:
> > > On Mon, Aug 01, 2016 at 06:39:05PM +0200, Jan Kiszka wrote:
> > >
> > > [...]
> > >
> > >>>  static MemTxResult vtd_mem_ir_read(void *opaque, hwaddr addr,
> > >>> @@ -2209,11 +2250,17 @@ static MemTxResult vtd_mem_ir_write(void
> *opaque, hwaddr addr,
> > >>>  {
> > >>>  int ret = 0;
> > >>>  MSIMessage from = {}, to = {};
> > >>> +uint16_t sid = X86_IOMMU_SID_INVALID;
> > >>>
> > >>>  from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
> > >>>  from.data = (uint32_t) value;
> > >>>
> > >>> -ret = vtd_interrupt_remap_msi(opaque, &from, &to);
> > >>> +if (!attrs.unspecified) {
> > >>> +/* We have explicit Source ID */
> > >>> +sid = attrs.requester_id;
> > >>> +}
> > >>
> > >> ...here you fall back to X86_IOMMU_SID_INVALID if writer to this
> region
> > >> has not provided some valid attrs. That is questionable, defeats
> > >> validation of the IOAPIC e.g. (and you can see lots of
> > >> X86_IOMMU_SID_INVALID in vtd_irte_get when booting a guest).
> > >>
> > >> The credits also go to David who noticed that he still doesn't get a
> > >> proper ID from the IOAPIC while implementing AMD IR. Looks like we
> need
> > >> to enlighten the IOAPIC MSI writes...
> > >
> > > Jan, David,
> > >
> > > At the time when drafting the patch, I skipped SID verification for
> > > IOAPIC interrupts since it differs from generic PCI devices (no
> > > natural requester ID, so need some hacky lines to enable it).
> >
> > It's not hacky at all if done properly. For Intel it is simply
> > (Q35_PSEUDO_BUS_PLATFORM << 8) | Q35_PSEUDO_DEVFN_IOAPIC, but it will be
> > 0x00a0 (as constant as well) for AMD. So we need some interface to tell
> > those parameters to the IOMMU. Keep in mind that we will need a similar
> > interface for other platform devices, e.g. the HPET.
>
> Okay.
>
> >
> > >
> > > I can try to cook another separate patch to enable it (for 2.8
> > > possibly?). Thanks for pointing out this issue.
> >
> > David needs that IOAPIC ID as well in order to finish interrupt
> > remapping on AMD. Please synchronize with him who will implement what.
>
> Sure. David, so do you like to do it or I cook this patch? :)


If there are no objections I will look at this employing Jan's approach:
associating a write with an address space.


>


> Thanks,
>
> -- peterx
>


[Qemu-devel] [V15 3/4] hw/i386: Introduce AMD IOMMU

2016-08-02 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation, error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee00000-0xfeefffff from translation
as it is the q35 interrupt region.

We advertise features that are not yet implemented to please
the Linux IOMMU driver.

The IOTLB aims at implementing the commands found on real IOMMUs,
which is essential for debugging, and may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1397 +
 hw/i386/amd_iommu.h   |  390 ++
 hw/i386/trace-events  |7 +
 4 files changed, 1795 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 90e94ff..909ead6 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += x86-iommu.o intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 0000000..7b64dd7
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1397 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include 
+#include "hw/pci/msi.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "trace.h"
+
+/* used AMD-Vi MMIO registers */
+const char *amdvi_mmio_low[] = {
+"AMDVI_MMIO_DEVTAB_BASE",
+"AMDVI_MMIO_CMDBUF_BASE",
+"AMDVI_MMIO_EVTLOG_BASE",
+"AMDVI_MMIO_CONTROL",
+"AMDVI_MMIO_EXCL_BASE",
+"AMDVI_MMIO_EXCL_LIMIT",
+"AMDVI_MMIO_EXT_FEATURES",
+"AMDVI_MMIO_PPR_BASE",
+"UNHANDLED"
+};
+const char *amdvi_mmio_high[] = {
+"AMDVI_MMIO_COMMAND_HEAD",
+"AMDVI_MMIO_COMMAND_TAIL",
+"AMDVI_MMIO_EVTLOG_HEAD",
+"AMDVI_MMIO_EVTLOG_TAIL",
+"AMDVI_MMIO_STATUS",
+"AMDVI_MMIO_PPR_HEAD",
+"AMDVI_MMIO_PPR_TAIL",
+"UNHANDLED"
+};
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's address translation region  */
+MemoryRegion iommu_ir;  /* Device's interrupt remapping region  */
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint16_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+/* serialize IOMMU command processing */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t type:4;   /* command type   */
+uint64_t reserved:8;
+uint64_t store_addr:49;/* addr to write  */
+uint64_t completion_flush:1;   /* allow more executions  */
+uint64_t completion_int:1; /* set MMIOWAITINT*/
+uint64_t completion_store:1;   /* write data to address  */
+#else
+uint64_t completion_store:1;
+uint64_t completion_int:1;
+uint64_t completion_flush:1;
+uint64_t store_addr:49;
+uint64_t reserved:8;
+uint64_t type:4;
+#endif /* __BIG_ENDIAN_BITFIELD */
+uint64_t store_data;   /* data to write  */
+} CMDCompletionWait;

[Qemu-devel] [V15 2/4] hw/i386/trace-events: Add AMD IOMMU trace events

2016-08-02 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 hw/i386/trace-events | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index b4882c1..592de3a 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -13,3 +13,32 @@ mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
 
 # hw/i386/x86-iommu.c
 x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC 
invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
+
+# hw/i386/amd_iommu.c
+amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 
0x%"PRIx64 " +  offset 0x%"PRIx32
+amdvi_cache_update(uint16_t domid, uint32_t bus, uint32_t slot, uint32_t func, 
uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: 
%02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_completion_wait_fail(uint64_t addr) "error: fail to write at address 
0x%"PRIx64
+amdvi_mmio_write(const char *reg, uint64_t addr, unsigned size, uint64_t val, 
unsigned long offset) "%s write addr 0x%"PRIx64 ", size %d, val 0x%"PRIx64 ", 
offset 0x%"PRIx64
+amdvi_mmio_read(const char *reg, uint64_t addr, unsigned size, uint64_t 
offset) "%s read addr 0x%"PRIx64", size %d offset 0x%"PRIx64
+amdvi_command_error(uint64_t status) "error: Executing commands with command 
buffer disabled 0x%"PRIx64
+amdvi_command_read_fail(uint64_t addr, uint32_t head) "error: fail to access 
memory at 0x%"PRIx64" + 0x%"PRIu32
+amdvi_command_exec(uint32_t head, uint32_t tail, uint64_t buf) "command buffer 
head at 0x%"PRIx32 " command buffer tail at 0x%"PRIx32" command buffer base at 
0x%" PRIx64
+amdvi_unhandled_command(uint8_t type) "unhandled command %d"
+amdvi_intr_inval(void) "Interrupt table invalidated"
+amdvi_iotlb_inval(void) "IOTLB pages invalidated"
+amdvi_prefetch_pages(void) "Pre-fetch of AMD-Vi pages requested"
+amdvi_pages_inval(uint16_t domid) "AMD-Vi pages for domain 0x%"PRIx16 " 
invalidated"
+amdvi_all_inval(void) "Invalidation of all AMD-Vi cache requested "
+amdvi_ppr_exec(void) "Execution of PPR queue requested "
+amdvi_devtab_inval(uint16_t bus, uint16_t slot, uint16_t func) "device table 
entry for devid: %02x:%02x.%x invalidated"
+amdvi_completion_wait(uint64_t addr, uint64_t data) "completion wait requested 
with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_control_status(uint64_t val) "MMIO_STATUS state 0x%"PRIx64
+amdvi_iotlb_reset(void) "IOTLB exceed size limit - reset "
+amdvi_completion_wait_exec(uint64_t addr, uint64_t data) "completion wait 
requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_dte_get_fail(uint64_t addr, uint32_t offset) "error: failed to access 
Device Entry devtab 0x%"PRIx64" offset 0x%"PRIx32
+amdvi_invalid_dte(uint64_t addr) "PTE entry at 0x%"PRIx64" is invalid "
+amdvi_get_pte_hwerror(uint64_t addr) "hardware error accessing PTE at addr 
0x%"PRIx64
+amdvi_mode_invalid(unsigned level, uint64_t addr)"error: translation level 
0x%"PRIu8" translating addr 0x%"PRIx64
+amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical 
address 0x%"PRIx64
+amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, 
uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t 
addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
-- 
2.1.4




[Qemu-devel] [V15 1/4] hw/pci: Prepare for AMD IOMMU

2016-08-02 Thread David Kiarie
Introduce PCI macros for use by AMD IOMMU

Signed-off-by: David Kiarie 
---
 include/hw/pci/pci.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 929ec2f..d47e0e6 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -11,11 +11,14 @@
 #include "hw/pci/pcie.h"
 
 /* PCI bus */
-
+#define PCI_BDF(bus, devfn)     ((((uint16_t)(bus)) << 8) | (devfn))
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
+#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
+#define PCI_BUS_MAX 256
+#define PCI_DEVFN_MAX   256
 #define PCI_SLOT_MAX32
 #define PCI_FUNC_MAX8
 
-- 
2.1.4




[Qemu-devel] [V15 4/4] hw/i386: AMD IOMMU IVRS table

2016-08-02 Thread David Kiarie
Add IVRS table for AMD IOMMU. Generate IVRS or DMAR
depending on emulated IOMMU.

Signed-off-by: David Kiarie 
---
 hw/acpi/aml-build.c |  2 +-
 hw/i386/acpi-build.c| 76 -
 hw/i386/x86-iommu.c | 19 
 include/hw/acpi/aml-build.h |  1 +
 include/hw/i386/x86-iommu.h | 11 +++
 5 files changed, 101 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index db3e914..b2a1e40 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -226,7 +226,7 @@ static void build_extop_package(GArray *package, uint8_t op)
 build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
 }
 
-static void build_append_int_noprefix(GArray *table, uint64_t value, int size)
+void build_append_int_noprefix(GArray *table, uint64_t value, int size)
 {
 int i;
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a26a4bb..efed318 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -59,7 +59,8 @@
 
 #include "qapi/qmp/qint.h"
 #include "qom/qom-qobject.h"
-#include "hw/i386/x86-iommu.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/i386/intel_iommu.h"
 
 #include "hw/acpi/ipmi.h"
 
@@ -2562,6 +2563,68 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
 build_header(linker, table_data, (void *)(table_data->data + dmar_start),
  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
 }
+/*
+ *   IVRS table as specified in AMD IOMMU Specification v2.62, Section 5.2
+ *   accessible here http://support.amd.com/TechDocs/48882_IOMMU.pdf
+ */
+static void
+build_amd_iommu(GArray *table_data, BIOSLinker *linker)
+{
+int iommu_start = table_data->len;
+AMDVIState *s = AMD_IOMMU_DEVICE(x86_iommu_get_default());
+assert(s);
+
+/* IVRS header */
+acpi_data_push(table_data, sizeof(AcpiTableHeader));
+/* IVinfo - IO virtualization information common to all IOMMU
+ * units in a system
+ */
+build_append_int_noprefix(table_data, 40UL << 8/* PASize */, 4);
+/* reserved */
+build_append_int_noprefix(table_data, 0, 8);
+
+/* IVHD definition - type 10h */
+build_append_int_noprefix(table_data, 0x10, 1);
+/* virtualization flags */
+build_append_int_noprefix(table_data,
+ (1UL << 0) | /* HtTunEn  */
+ (1UL << 4) | /* iotblSup */
+ (1UL << 6) | /* PrefSup  */
+ (1UL << 7),  /* PPRSup   */
+ 1);
+/* IVHD length */
+build_append_int_noprefix(table_data, 0x28, 2);
+/* DeviceID */
+build_append_int_noprefix(table_data, s->devid, 2);
+/* Capability offset */
+build_append_int_noprefix(table_data, s->capab_offset, 2);
+/* IOMMU base address */
+build_append_int_noprefix(table_data, s->mmio.addr, 8);
+/* PCI Segment Group */
+build_append_int_noprefix(table_data, 0, 2);
+/* IOMMU info */
+build_append_int_noprefix(table_data, 0, 2);
+/* IOMMU Feature Reporting */
+build_append_int_noprefix(table_data,
+ (48UL << 30) | /* HATS   */
+ (48UL << 28) | /* GATS   */
+ (1UL << 2),/* GTSup  */
+ 4);
+/* Add device flags here
+ *   These are 4-byte device entries currently reporting the range of
+ *   devices 00h - ffffh; all devices
+ *   Device setting affecting all devices should be made here
+ *
+ *   Refer to Spec - Table 95:IVHD Device Entry Type Codes(4-byte)
+ */
+/* start of device range, 4-byte entries */
+build_append_int_noprefix(table_data, 0x0003, 4);
+/* end of device range */
+build_append_int_noprefix(table_data, 0x0004, 4);
+
+build_header(linker, table_data, (void *)(table_data->data + iommu_start),
+ "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
+}
 
 static GArray *
 build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned rsdt_tbl_offset)
@@ -2622,11 +2685,6 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 return true;
 }
 
-static bool acpi_has_iommu(void)
-{
-return !!x86_iommu_get_default();
-}
-
 static
 void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
@@ -2639,6 +2697,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 AcpiMcfgInfo mcfg;
 Range pci_hole, pci_hole64;
 uint8_t *u;
+IommuType IOMMUType = x86_iommu_get_type();
 size_t aml_len = 0;
 GArray *tables_blob = tables->table_data;
 AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
@@ -2706,7 +2765,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 acpi_add_table(table_offsets, tables_blob);
 buil
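
For context on the table-building calls above: as far as I can tell from the
QEMU tree, build_append_int_noprefix() appends the low 'size' bytes of a
value in little-endian order, which is why the IVRS fields can be emitted one
call at a time. Below is a minimal standalone sketch of that idea. ByteBuf,
append_int_le() and append_ivhd_prefix() are hypothetical stand-ins (the real
code uses a GArray), and only the first few IVHD fields from the patch are
reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the GArray-backed table blob */
typedef struct {
    uint8_t data[64];
    size_t len;
} ByteBuf;

/* Append the low 'size' bytes of 'value', least significant byte first */
static void append_int_le(ByteBuf *buf, uint64_t value, int size)
{
    for (int i = 0; i < size; i++) {
        assert(buf->len < sizeof(buf->data));
        buf->data[buf->len++] = (uint8_t)(value & 0xff);
        value >>= 8;
    }
}

/* Emit the leading IVHD fields in the same order as the patch */
static void append_ivhd_prefix(ByteBuf *buf, uint16_t devid,
                               uint16_t capab_offset)
{
    append_int_le(buf, 0x10, 1);            /* IVHD type 10h */
    append_int_le(buf,
                  (1UL << 0) |              /* HtTunEn  */
                  (1UL << 4) |              /* iotblSup */
                  (1UL << 6) |              /* PrefSup  */
                  (1UL << 7),               /* PPRSup   */
                  1);
    append_int_le(buf, 0x28, 2);            /* IVHD length */
    append_int_le(buf, devid, 2);           /* DeviceID */
    append_int_le(buf, capab_offset, 2);    /* Capability offset */
}
```

Emitting fields byte by byte this way keeps the table layout independent of
host endianness, at the cost of having to keep the declared IVHD length in
sync with the fields by hand.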

[Qemu-devel] [V15 0/4] AMD IOMMU

2016-08-02 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu. This version
was delayed because I had hoped to send it together with the interrupt
remapping (IR) code, but that may take even longer, so I'm sending this first.

Changes since v13
   -Added an error to make AMD IOMMU incompatible with device assignment.[Alex]
   -Converted AMD IOMMU into a composite PCI and System Bus device. This helps
with:
  -We can now inherit from the X86 IOMMU base class (which is implemented as
a System Bus device).
  -We can now reserve an MMIO region for the IOMMU without a BAR register and
without a hack.

Changes since v12

   -Coding style fixes [Jan, Michael]
   -Error logging fix to avoid using a macro[Jan]
   -moved some PCI macros to PCI header[Jan]
   -Use a lookup table for MMIO register names when tracing[Jan]

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on
Marcel's patches).
   -IOMMU commands are represented using bitfields which is less error prone 
and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages, including an obscure AMD IOMMU feature that allows
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has
BusMaster- and is therefore not able to generate interrupts like any other PCI
device. We have resorted to writing directly to the system address space, but
this could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the
implementation file since almost everything there is basically a macro and I
reckoned renaming them should suffice.
   -Taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is
configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been
fixed[Marcel]
   
You can test[1] these patches by starting Qemu with parameters such as
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
emulating whatever devices you want.

Not passing any command line parameters to Linux should be enough to test these
patches, since the devices are basically passed through, but to the 'host'
(L1 guest). You can still go ahead and pass the kernel command line parameters
'iommu=pt iommu=1' and try to pass a device to an L2 guest. This can also be
done without passing any IOMMU-related parameters to the kernel.

David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  hw/i386/trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c |2 +-
 hw/i386/Makefile.objs   |1 +
 hw/i386/acpi-build.c|   76 ++-
 hw/i386/amd_iommu.c | 1397 +++
 hw/i386/amd_iommu.h |  390 
 hw/i386/trace-events|   36 ++
 hw/i386/x86-iommu.c |   19 +
 include/hw/acpi/aml-build.h |1 +
 include/hw/i386/x86-iommu.h |   11 +
 include/hw/pci/pci.h|5 +-
 10 files changed, 1929 insertions(+), 9 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




[Qemu-devel] Question on Qemu IOAPIC ID

2016-07-23 Thread David Kiarie
Hello all,

I'm working on AMD IOMMU interrupt remapping and would like to ask some
questions about the Qemu IOAPIC ID.

I currently have a problem: the Linux AMD IOMMU driver expects at least
one IOAPIC on the chipset to have ID 0xa0, while the Qemu IOAPIC ID is always 0.
I am faced with two options:
   -Change the IOAPIC ID so that it always matches what the Linux AMD IOMMU
driver expects. I am not sure which problems I could encounter with
this approach.
   -Have a way to configure the IOAPIC ID based on the emulated IOMMU.

I prefer the first approach since it's quite simple, but I would like to hear
other opinions regarding this issue.

David.


Re: [Qemu-devel] [PATCH v11 14/28] intel_iommu: Add support for PCI MSI remap

2016-07-13 Thread David Kiarie
On Tue, Jul 5, 2016 at 11:19 AM, Peter Xu  wrote:
> This patch enables interrupt remapping for PCI devices.
>
> To play the trick, one memory region "iommu_ir" is added as child region
> of the original iommu memory region, covering range 0xfeeXXXXX (which is
> the address range for APIC). All the writes to this range will be taken
> as MSI, and translation is carried out only when IR is enabled.
>
> Idea suggested by Paolo Bonzini.
>
> Signed-off-by: Peter Xu 
> ---
>  hw/i386/intel_iommu.c  | 251 
> +
>  hw/i386/intel_iommu_internal.h |   2 +
>  include/hw/i386/intel_iommu.h  |  66 +++
>  3 files changed, 319 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index a12091e..90bf9e9 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -1982,6 +1982,252 @@ static Property vtd_properties[] = {
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> +/* Read IRTE entry with specific index */
> +static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
> +VTD_IRTE *entry)
> +{
> +dma_addr_t addr = 0x00;
> +
> +addr = iommu->intr_root + index * sizeof(*entry);
> +if (dma_memory_read(&address_space_memory, addr, entry,
> +sizeof(*entry))) {
> +VTD_DPRINTF(GENERAL, "error: fail to access IR root at 0x%"PRIx64
> +" + %"PRIu16, iommu->intr_root, index);
> +return -VTD_FR_IR_ROOT_INVAL;
> +}
> +
> +if (!entry->present) {
> +VTD_DPRINTF(GENERAL, "error: present flag not set in IRTE"
> +" entry index %u value 0x%"PRIx64 " 0x%"PRIx64,
> +index, le64_to_cpu(entry->data[1]),
> +le64_to_cpu(entry->data[0]));
> +return -VTD_FR_IR_ENTRY_P;
> +}
> +
> +if (entry->__reserved_0 || entry->__reserved_1 || \
> +entry->__reserved_2) {
> +VTD_DPRINTF(GENERAL, "error: IRTE entry index %"PRIu16
> +" reserved fields non-zero: 0x%"PRIx64 " 0x%"PRIx64,
> +index, le64_to_cpu(entry->data[1]),
> +le64_to_cpu(entry->data[0]));
> +return -VTD_FR_IR_IRTE_RSVD;
> +}
> +
> +/*
> + * TODO: Check Source-ID corresponds to SVT (Source Validation
> + * Type) bits
> + */
> +
> +return 0;
> +}
> +
> +/* Fetch IRQ information of specific IR index */
> +static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq)
> +{
> +VTD_IRTE irte;
> +int ret = 0;
> +
> +bzero(&irte, sizeof(irte));
> +
> +ret = vtd_irte_get(iommu, index, &irte);
> +if (ret) {
> +return ret;
> +}
> +
> +irq->trigger_mode = irte.trigger_mode;
> +irq->vector = irte.vector;
> +irq->delivery_mode = irte.delivery_mode;
> +/* Not support EIM yet: please refer to vt-d 9.10 DST bits */
> +#define  VTD_IR_APIC_DEST_MASK (0xff00ULL)
> +#define  VTD_IR_APIC_DEST_SHIFT(8)
> +irq->dest = (le32_to_cpu(irte.dest_id) & VTD_IR_APIC_DEST_MASK) >> \
> +VTD_IR_APIC_DEST_SHIFT;
> +irq->dest_mode = irte.dest_mode;
> +irq->redir_hint = irte.redir_hint;
> +
> +VTD_DPRINTF(IR, "remapping interrupt index %d: trig:%u,vec:%u,"
> +"deliver:%u,dest:%u,dest_mode:%u", index,
> +irq->trigger_mode, irq->vector, irq->delivery_mode,
> +irq->dest, irq->dest_mode);
> +
> +return 0;
> +}
> +
> +/* Generate one MSI message from VTDIrq info */
> +static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
> +{
> +VTD_MSIMessage msg = {};
> +
> +/* Generate address bits */
> +msg.dest_mode = irq->dest_mode;
> +msg.redir_hint = irq->redir_hint;
> +msg.dest = irq->dest;
> +msg.__addr_head = cpu_to_le32(0xfee);
> +/* Keep this from original MSI address bits */
> +msg.__not_used = irq->msi_addr_last_bits;
> +
> +/* Generate data bits */
> +msg.vector = irq->vector;
> +msg.delivery_mode = irq->delivery_mode;
> +msg.level = 1;
> +msg.trigger_mode = irq->trigger_mode;
> +
> +msg_out->address = msg.msi_addr;
> +msg_out->data = msg.msi_data;
> +}
> +
> +/* Interrupt remapping for MSI/MSI-X entry */
> +static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
> +   MSIMessage *origin,
> +   MSIMessage *translated)
> +{
> +int ret = 0;
> +VTD_IR_MSIAddress addr;
> +uint16_t index;
> +VTDIrq irq = {0};
> +
> +assert(origin && translated);
> +
> +if (!iommu || !iommu->intr_enabled) {
> +goto do_not_translate;
> +}
> +
> +if (origin->address & VTD_MSI_ADDR_HI_MASK) {
> +VTD_DPRINTF(GENERAL, "error: MSI addr high 32 bits nonzero"
> +" during interrupt remapping: 0x%"PRIx32,
> +(uint32_t)((origin->address & VTD_MSI_ADDR_HI_MASK) >> \
> +VTD_MSI_ADDR_HI_S

Re: [Qemu-devel] [PATCH v10 27/26] intel_iommu: disallow kernel-irqchip=on with IR

2016-07-11 Thread David Kiarie
On Fri, Jun 24, 2016 at 10:10 AM, Peter Xu  wrote:
> When user specify "kernel-irqchip=on", throw error and then quit.
>
> Signed-off-by: Peter Xu 
> ---
>
> One more patch for this series. Without this one, guest kernel will
> possibly hang. This is not user friendly.
>
>  hw/i386/intel_iommu.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 4ff9a24..618b0f9 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -20,6 +20,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  #include "hw/sysbus.h"
>  #include "exec/address-spaces.h"
>  #include "intel_iommu_internal.h"
> @@ -29,6 +30,7 @@
>  #include "hw/boards.h"
>  #include "hw/i386/x86-iommu.h"
>  #include "hw/pci-host/q35.h"
> +#include "sysemu/kvm.h"
>
>  /*#define DEBUG_INTEL_IOMMU*/
>  #ifdef DEBUG_INTEL_IOMMU
> @@ -2458,6 +2460,13 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>  bus->iommu_opaque = dev;
>  /* Pseudo address space under root PCI bus. */
>  pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
> +
> +/* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
> +if (kvm_irqchip_in_kernel() && !kvm_irqchip_is_split()) {
> +error_report("Intel Interrupt Remapping cannot work with "
> + "kernel-irqchip=on, please use 'split|off'.");
> +exit(1);
> +}
>  }

Shouldn't you check whether VT-d interrupt remapping is enabled (I'm
assuming it's off by default) before you require
kernel-irqchip=off|split? Doesn't the above imply that one can't use
VT-d with kernel-irqchip=on (regardless of whether IR is enabled)?

>
>  static void vtd_class_init(ObjectClass *klass, void *data)
> --
> 2.4.11
>
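
To make the question above concrete, here is a minimal standalone sketch of
the ordering being suggested: only treat kernel-irqchip=on as an error when
interrupt remapping is actually enabled. ir_irqchip_conflict() and its three
boolean parameters are hypothetical; in the patch the last two would come
from kvm_irqchip_in_kernel() and kvm_irqchip_is_split(), and the first from
whatever flag ends up tracking whether IR is turned on.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: returns true only when the configuration is one
 * the device should refuse, i.e. IR is enabled and the irqchip is fully
 * in-kernel (kernel-irqchip=on rather than split or off).
 */
static bool ir_irqchip_conflict(bool ir_enabled,
                                bool irqchip_in_kernel,
                                bool irqchip_split)
{
    if (!ir_enabled) {
        /* Plain DMA remapping: any irqchip mode is acceptable */
        return false;
    }
    return irqchip_in_kernel && !irqchip_split;
}
```

With this shape, VT-d without IR keeps working under kernel-irqchip=on, and
the error path is reached only for the combination that can hang the guest.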



Re: [Qemu-devel] [PATCH v11 04/28] x86-iommu: q35: generalize find_add_as()

2016-07-11 Thread David Kiarie
On Mon, Jul 11, 2016 at 10:41 AM, Peter Xu  wrote:
> On Mon, Jul 11, 2016 at 10:16:11AM +0300, David Kiarie wrote:
>> On Mon, Jul 11, 2016 at 9:49 AM, Peter Xu  wrote:
>> > On Mon, Jul 11, 2016 at 08:46:12AM +0300, David Kiarie wrote:
>> >> On Mon, Jul 11, 2016 at 8:32 AM, Peter Xu  wrote:
>> >> > On Sat, Jul 09, 2016 at 10:14:48AM +0200, Jan Kiszka wrote:
>> >> >> On 2016-07-05 10:19, Peter Xu wrote:
>> >> >> > Remove VT-d calls in common q35 codes. Instead, we provide a general
>> >> >> > find_add_as() for x86-iommu type.
>> >> >> >
>> >> >> > Signed-off-by: Peter Xu 
>> >> >> > ---
>> >> >> >  hw/i386/intel_iommu.c | 15 ---
>> >> >> >  include/hw/i386/intel_iommu.h |  5 -
>> >> >> >  include/hw/i386/x86-iommu.h   |  3 +++
>> >> >> >  3 files changed, 11 insertions(+), 12 deletions(-)
>> >> >>
>> >> >> You claim to remove something from "common q35 code", but I don't see
>> >> >> changes to it. Instead, the patch introduces a method that seems to
>> >> >> remain unused outside the implementing class (I just grep'ed your 
>> >> >> tree).
>> >> >> Anything missing?
>> >> >
>> >> > Right. The commit message lost its point after I did the rebase to
>> >> > Marcel's "-device intel_iommu" patches... Thanks for pointing it out.
>> >>
>> >> I think Jan is mainly asking about where the method 'find_add_as()' is
>> >> being used. Unless I'm too missing something It doesn't seem to be
>> >> used anywhere outside the implementing class.
>>
>> Hi
>> >
>> > This patch can be dropped. I was just not sure whether it's the
>> > correct time to do that. Anyway, we may still need one more patch to
>> > cleanup this in the future, as I have mentioned in the previous email.
>>
>> I think there is a misunderstanding here.
>>
>> We (me and Jan) are basically asking did you plan to use "find_add_as"
>> somewhere and may be missed it ? Why does x86-iommu class need
>> "find_add_as" ?
>> The reason is I'm not able to receive IOAPIC
>> interrupts with AMD IOMMU basing my work on your code. We thought
>> you'd clarify on where "find_add_as" is used or how you plan to use
>> it.
>
> As mentioned in previous email, before Marcel's patches,
> vtd_host_dma_iommu() was named q35_host_dma_iommu().

Okay, that solves it - _before_ the adoption of '-device iommu' so
you're right, this is not needed anymore.

> At that time, I
> need "find_add_as" to let Q35 codes get rid of direct calls to VT-d
> (so that pc_q35.c will not need to include "intel_iommu.h" any more,
> instead, it should include "x86-iommu.h"). Also, that interface is
> prepared for future AMD as well. However, now AMD (you patches) are
> directly calling pci_setup_iommu(). I am not sure whether you were
> using it from the beginning, but IIUC as long as you are using
> pci_setup_iommu() interface, we should be able to avoid providing
> find_add_as any more. So I think this patch is indeed okay to be
> dropped... Please kindly correct me if I missed anything. :)
>
> The only reason that we keep this patch (as far as I can think of..)
> is that mst has done some testing on v11 and I'm not sure whether we'd
> better keep it untouched if we are going to merge it (fixing commit
> message does not count, right?). But I'd say I'm not familiar with how
> maintainers manage codes to be merged... Maybe different maintainers
> have their own flavor on this matter? I don't know. Anyway, these are
> only my wild guess.
>
> For the problem you have encountered with IOAPIC, do you think it's
> related to this patch? Have you tried to add some logs in e.g.
> ioapic_service() to see what's wrong in there?
>
> (Maybe we need some general trace logs in IOAPIC codes?...)
>
> -- peterx



Re: [Qemu-devel] [PATCH v11 04/28] x86-iommu: q35: generalize find_add_as()

2016-07-11 Thread David Kiarie
On Mon, Jul 11, 2016 at 9:49 AM, Peter Xu  wrote:
> On Mon, Jul 11, 2016 at 08:46:12AM +0300, David Kiarie wrote:
>> On Mon, Jul 11, 2016 at 8:32 AM, Peter Xu  wrote:
>> > On Sat, Jul 09, 2016 at 10:14:48AM +0200, Jan Kiszka wrote:
>> >> On 2016-07-05 10:19, Peter Xu wrote:
>> >> > Remove VT-d calls in common q35 codes. Instead, we provide a general
>> >> > find_add_as() for x86-iommu type.
>> >> >
>> >> > Signed-off-by: Peter Xu 
>> >> > ---
>> >> >  hw/i386/intel_iommu.c | 15 ---
>> >> >  include/hw/i386/intel_iommu.h |  5 -
>> >> >  include/hw/i386/x86-iommu.h   |  3 +++
>> >> >  3 files changed, 11 insertions(+), 12 deletions(-)
>> >>
>> >> You claim to remove something from "common q35 code", but I don't see
>> >> changes to it. Instead, the patch introduces a method that seems to
>> >> remain unused outside the implementing class (I just grep'ed your tree).
>> >> Anything missing?
>> >
>> > Right. The commit message lost its point after I did the rebase to
>> > Marcel's "-device intel_iommu" patches... Thanks for pointing it out.
>>
>> I think Jan is mainly asking about where the method 'find_add_as()' is
>> being used. Unless I'm too missing something It doesn't seem to be
>> used anywhere outside the implementing class.

Hi
>
> This patch can be dropped. I was just not sure whether it's the
> correct time to do that. Anyway, we may still need one more patch to
> cleanup this in the future, as I have mentioned in the previous email.

I think there is a misunderstanding here.

We (Jan and I) are basically asking: did you plan to use "find_add_as"
somewhere and maybe miss it? Why does the x86-iommu class need
"find_add_as"? The reason I ask is that I'm not able to receive IOAPIC
interrupts with AMD IOMMU when basing my work on your code. We thought
you'd clarify where "find_add_as" is used or how you plan to use
it.

>
> I see that mst is possibly not around these two days. Let me prepare a
> v12 before he comes back. Thank you.
>
> -- peterx



Re: [Qemu-devel] [V13 3/4] hw/i386: Introduce AMD IOMMU

2016-07-10 Thread David Kiarie
On Fri, Jul 8, 2016 at 7:30 PM, Alex Williamson
 wrote:
> On Fri,  8 Jul 2016 11:18:22 +0300
> David Kiarie  wrote:
>
>> Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
>> The IOMMU does basic translation, error checking and has a
>> minimal IOTLB implementation. This IOMMU bypasses the need
>> for target aborts by responding with IOMMU_NONE access rights
>> and exempts the region 0xfee00000-0xfeefffff from translation
>> as it is the q35 interrupt region.
>>
>> We advertise features that are not yet implemented to please
>> the Linux IOMMU driver.
>>
>> IOTLB aims at implementing commands on real IOMMUs which is
>> essential for debugging and may not offer any performance
>> benefits
>>
>> Signed-off-by: David Kiarie 
>> ---
>>  hw/i386/Makefile.objs |1 +
>>  hw/i386/amd_iommu.c   | 1384 
>> +
>>  hw/i386/amd_iommu.h   |  285 ++
>>  3 files changed, 1670 insertions(+)
>>  create mode 100644 hw/i386/amd_iommu.c
>>  create mode 100644 hw/i386/amd_iommu.h
>
> Hi,

Hello,

>
> Please consider an update or follow-on patch which adds something
> similar to:
>
> commit 3cb3b1549f5401dc3a5e1d073e34063dc274136f
> Author: Alex Williamson 
> Date:   Thu Jun 30 13:00:24 2016 -0600
>
> intel_iommu: Throw hw_error on notify_started
>
> http://git.qemu.org/?p=qemu.git;a=commit;h=3cb3b1549f5401dc3a5e1d073e34063dc274136f
>
> This would simply make amd_iommu incompatible with device assignment
> until someone tackles adding the proper code to support it.  Thanks,

Thanks, this will be incorporated in the next series.

>
> Alex



Re: [Qemu-devel] [PATCH v11 04/28] x86-iommu: q35: generalize find_add_as()

2016-07-10 Thread David Kiarie
On Mon, Jul 11, 2016 at 8:32 AM, Peter Xu  wrote:
> On Sat, Jul 09, 2016 at 10:14:48AM +0200, Jan Kiszka wrote:
>> On 2016-07-05 10:19, Peter Xu wrote:
>> > Remove VT-d calls in common q35 codes. Instead, we provide a general
>> > find_add_as() for x86-iommu type.
>> >
>> > Signed-off-by: Peter Xu 
>> > ---
>> >  hw/i386/intel_iommu.c | 15 ---
>> >  include/hw/i386/intel_iommu.h |  5 -
>> >  include/hw/i386/x86-iommu.h   |  3 +++
>> >  3 files changed, 11 insertions(+), 12 deletions(-)
>>
>> You claim to remove something from "common q35 code", but I don't see
>> changes to it. Instead, the patch introduces a method that seems to
>> remain unused outside the implementing class (I just grep'ed your tree).
>> Anything missing?
>
> Right. The commit message lost its point after I did the rebase to
> Marcel's "-device intel_iommu" patches... Thanks for pointing it out.

I think Jan is mainly asking where the method 'find_add_as()' is
being used. Unless I too am missing something, it doesn't seem to be
used anywhere outside the implementing class.

>
> Before the rebase, there is one q35_host_dma_iommu() in pc_q35.c, and
> originally this patch did remove something from q35. While in Marcel's
> commit (621d983a1f), q35_host_dma_iommu() is renamed to
> vtd_host_dma_iommu(), and it's put inside intel_iommu.c. After that,
> this commit message stopped making sense.
>
> So I think at least the commit message of this patch could be fixed
> into something like:
>
>"Introduce common find_add_as() interface for x86-iommu."
>
> And if I now see this... A better solution is to provide a more common
> interface directly in x86-iommu.c to find address spaces, and let
> Intel/AMD IOMMUs share this functionality. After all, we are doing
> merely the same thing to maintain namespaces in both Intel/AMD IOMMUs
> (vtd_find_add_as() and bridge_host_amdvi()). So, do you (and mst?)
> think I should respin to a v12, or we can first fix commit message of
> this patch, then I post another patch based on this series for a better
> cleanup?
>
> Thanks,
>
> -- peterx



[Qemu-devel] [V13 2/4] hw/i386/trace-events: Add AMD IOMMU trace events

2016-07-08 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 hw/i386/trace-events | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index ea77bc2..a2f529e 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -10,3 +10,32 @@ xen_pv_mmio_write(uint64_t addr) "WARNING: write to Xen PV Device MMIO space (ad
 # hw/i386/pc.c
 mhp_pc_dimm_assigned_slot(int slot) "0x%d"
 mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
+
+# hw/i386/amd_iommu.c
+amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64 " +  offset 0x%"PRIx32
+amdvi_cache_update(uint16_t domid, uint32_t bus, uint32_t slot, uint32_t func, uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_completion_wait_fail(uint64_t addr) "error: fail to write at address 0x%"PRIx64
+amdvi_mmio_write(const char *reg, uint64_t addr, unsigned size, uint64_t val, unsigned long offset) "%s write addr 0x%"PRIx64 ", size %d, val 0x%"PRIx64 ", offset 0x%"PRIx64
+amdvi_mmio_read(const char *reg, uint64_t addr, unsigned size, uint64_t offset) "%s read addr 0x%"PRIx64", size %d offset 0x%"PRIx64
+amdvi_command_error(uint64_t status) "error: Executing commands with command buffer disabled 0x%"PRIx64
+amdvi_command_read_fail(uint64_t addr, uint32_t head) "error: fail to access memory at 0x%"PRIx64" + 0x%"PRIu32
+amdvi_command_exec(uint32_t head, uint32_t tail, uint64_t buf) "command buffer head at 0x%"PRIx32 " command buffer tail at 0x%"PRIx32" command buffer base at 0x%" PRIx64
+amdvi_unhandled_command(uint8_t type) "unhandled command %d"
+amdvi_intr_inval(void) "Interrupt table invalidated"
+amdvi_iotlb_inval(void) "IOTLB pages invalidated"
+amdvi_prefetch_pages(void) "Pre-fetch of AMD-Vi pages requested"
+amdvi_pages_inval(uint16_t domid) "AMD-Vi pages for domain 0x%"PRIx16 " invalidated"
+amdvi_all_inval(void) "Invalidation of all AMD-Vi cache requested "
+amdvi_ppr_exec(void) "Execution of PPR queue requested "
+amdvi_devtab_inval(uint16_t bus, uint16_t slot, uint16_t func) "device table entry for devid: %02x:%02x.%x invalidated"
+amdvi_completion_wait(uint64_t addr, uint64_t data) "completion wait requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_control_status(uint64_t val) "MMIO_STATUS state 0x%"PRIx64
+amdvi_iotlb_reset(void) "IOTLB exceed size limit - reset "
+amdvi_completion_wait_exec(uint64_t addr, uint64_t data) "completion wait requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_dte_get_fail(uint64_t addr, uint32_t offset) "error: failed to access Device Entry devtab 0x%"PRIx64" offset 0x%"PRIx32
+amdvi_invalid_dte(uint64_t addr) "PTE entry at 0x%"PRIx64" is invalid "
+amdvi_get_pte_hwerror(uint64_t addr) "hardware error accessing PTE at addr 0x%"PRIx64
+amdvi_mode_invalid(unsigned level, uint64_t addr)"error: translation level 0x%"PRIu8" translating addr 0x%"PRIx64
+amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical address 0x%"PRIx64
+amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
-- 
2.1.4




[Qemu-devel] [V13 3/4] hw/i386: Introduce AMD IOMMU

2016-07-08 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation, error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee00000-0xfeefffff from translation
as it is the q35 interrupt region.
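
The exemption described above can be sketched roughly as follows, assuming
the region meant is the usual x86 interrupt window 0xfee00000-0xfeefffff.
This is a standalone illustration only: INT_REGION_BASE, INT_REGION_LAST,
addr_in_int_region() and translate_or_bypass() are hypothetical names, and
the page-table walk result is passed in as a plain parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds of the interrupt region (inclusive) */
#define INT_REGION_BASE 0xfee00000ULL
#define INT_REGION_LAST 0xfeefffffULL

/* True if 'addr' falls inside the interrupt region */
static bool addr_in_int_region(uint64_t addr)
{
    return addr >= INT_REGION_BASE && addr <= INT_REGION_LAST;
}

/*
 * Hypothetical translation step: addresses in the interrupt region are
 * returned unchanged (identity mapping); everything else uses the result
 * of the page-table walk, supplied here as 'walked'.
 */
static uint64_t translate_or_bypass(uint64_t addr, uint64_t walked)
{
    return addr_in_int_region(addr) ? addr : walked;
}
```

Keeping the check ahead of the table walk means interrupt writes are never
subject to DTE or page-table state.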

We advertise features that are not yet implemented to please
the Linux IOMMU driver.

The IOTLB aims at implementing commands found on real IOMMUs, which is
essential for debugging; it may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1384 +
 hw/i386/amd_iommu.h   |  285 ++
 3 files changed, 1670 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index b52d5b8..2f1a265 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 0000000..b480d8e
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1384 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "hw/pci/msi.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/amd_iommu.h"
+
+/* used AMD-Vi MMIO registers */
+const char *amdvi_mmio_low[] = {
+"AMDVI_MMIO_DEVTAB_BASE",
+"AMDVI_MMIO_CMDBUF_BASE",
+"AMDVI_MMIO_EVTLOG_BASE",
+"AMDVI_MMIO_CONTROL",
+"AMDVI_MMIO_EXCL_BASE",
+"AMDVI_MMIO_EXCL_LIMIT",
+"AMDVI_MMIO_EXT_FEATURES",
+"AMDVI_MMIO_PPR_BASE",
+"UNHANDLED"
+};
+const char *amdvi_mmio_high[] = {
+"AMDVI_MMIO_COMMAND_HEAD",
+"AMDVI_MMIO_COMMAND_TAIL",
+"AMDVI_MMIO_EVTLOG_HEAD",
+"AMDVI_MMIO_EVTLOG_TAIL",
+"AMDVI_MMIO_STATUS",
+"AMDVI_MMIO_PPR_HEAD",
+"AMDVI_MMIO_PPR_TAIL",
+"UNHANDLED"
+};
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's iommu region*/
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMD-Vi cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint16_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+/* serialize IOMMU command processing */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t type:4;   /* command type   */
+uint64_t reserved:8;
+uint64_t store_addr:49;/* addr to write  */
+uint64_t completion_flush:1;   /* allow more executions  */
+uint64_t completion_int:1; /* set MMIOWAITINT*/
+uint64_t completion_store:1;   /* write data to address  */
+#else
+uint64_t completion_store:1;
+uint64_t completion_int:1;
+uint64_t completion_flush:1;
+uint64_t store_addr:49;
+uint64_t reserved:8;
+uint64_t type:4;
+#endif /* HOST_WORDS_BIGENDIAN */
+uint64_t store_data;   /* data to write  */
+} CMDCompletionWait;
+
+/* invalidate internal caches for devid */
+typedef struct QEMU_PACKED {
+#ifdef HOST_WORDS_BIGENDIAN
+uint64_t devid;/* device to invalidate   
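
As an aside on the CMDCompletionWait layout above: C bitfield ordering is
implementation-defined, which is presumably why the struct needs the
HOST_WORDS_BIGENDIAN split. Below is a minimal standalone sketch of an
alternative technique (not what the patch does): decoding the first 64-bit
word with explicit shifts and masks after it has been converted to host byte
order. The field positions follow the little-endian branch of the struct;
CompletionWait and decode_completion_wait() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the first command word */
typedef struct {
    bool store;             /* completion_store: write data to address */
    bool intr;              /* completion_int: set MMIOWAITINT         */
    bool flush;             /* completion_flush                        */
    uint64_t store_addr;    /* raw 49-bit store_addr field             */
    uint8_t type;           /* command type, top 4 bits                */
} CompletionWait;

/* Decode word 0 of a completion-wait command (host byte order) */
static CompletionWait decode_completion_wait(uint64_t w0)
{
    CompletionWait c;

    c.store = (w0 >> 0) & 1;
    c.intr = (w0 >> 1) & 1;
    c.flush = (w0 >> 2) & 1;
    c.store_addr = (w0 >> 3) & ((1ULL << 49) - 1);  /* bits 3..51  */
    c.type = (uint8_t)(w0 >> 60);                   /* bits 60..63 */
    return c;
}
```

The shift-and-mask form behaves the same on any host once the word has gone
through le64_to_cpu(), at the cost of being a little less readable than the
bitfield struct.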

[Qemu-devel] [V13 1/4] hw/pci: Prepare for AMD IOMMU

2016-07-08 Thread David Kiarie
Introduce PCI macros for use by AMD IOMMU

Signed-off-by: David Kiarie 
---
 include/hw/pci/pci.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 9ed1624..959d05b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -11,11 +11,14 @@
 #include "hw/pci/pcie.h"
 
 /* PCI bus */
-
+#define PCI_BDF(bus, devfn)     ((((uint16_t)(bus)) << 8) | (devfn))
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
+#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
+#define PCI_BUS_MAX 256
+#define PCI_DEVFN_MAX   256
 #define PCI_SLOT_MAX    32
 #define PCI_FUNC_MAX    8
 
-- 
2.1.4
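
For illustration, the helpers in this patch pack an 8-bit bus number and an 8-bit devfn into a 16-bit identifier. A minimal sketch in plain C follows; the lowercase inline functions are hypothetical names (not part of the patch) that mirror the macro definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function equivalents of the PCI_* macros in the patch. */
static inline uint8_t pci_devfn(uint8_t slot, uint8_t func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

static inline uint16_t pci_bdf(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

static inline uint8_t pci_bus_num(uint16_t bdf) { return (bdf >> 8) & 0xff; }
static inline uint8_t pci_slot(uint8_t devfn)   { return (devfn >> 3) & 0x1f; }
static inline uint8_t pci_func(uint8_t devfn)   { return devfn & 0x07; }

/* Round trip: bus 0, slot 0x1f, function 3 packs to devfn 0xfb, BDF 0x00fb. */
static inline void pci_bdf_selfcheck(void)
{
    uint8_t devfn = pci_devfn(0x1f, 3);
    uint16_t bdf = pci_bdf(0, devfn);

    assert(devfn == 0xfb);
    assert(bdf == 0x00fb);
    assert(pci_slot(devfn) == 0x1f && pci_func(devfn) == 3);
}
```

Using functions instead of macros is only a choice for this sketch; it gives each argument a fixed width, so the masks mainly document intent.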




[Qemu-devel] [V13 0/4] AMD IOMMU

2016-07-08 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu.

Changes since v12

   -Coding style fixes [Jan, Michael]
   -Error logging fix to avoid using a macro[Jan]
   -moved some PCI macros to PCI header[Jan]
   -Use a lookup table for MMIO register names when tracing[Jan]

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on 
Marcel's patches).
   -IOMMU commands are represented using bitfields which is less error prone 
and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages including some obscure AMD IOMMU feature that allows 
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has 
BusMaster- and is therefore not able to generate interrupts like any other PCI 
device. We have resorted to writing directly to the system address space, but 
this could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the 
implementation file, since almost everything in the header is a macro and I 
reckoned renaming them should suffice.
   -taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is 
configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been 
fixed[Marcel]
   
You can test[1] these patches by starting QEMU with the parameters 
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
emulating whatever devices you want.

Booting Linux without any IOMMU-related command line parameters should be 
enough to test these patches, since the devices are basically passed through, 
but to the 'host' (L1 guest). You can also pass the kernel command line 
parameters 'iommu=pt iommu=1' and try to pass a device through to an L2 guest. 
This can also be done without passing any IOMMU-related parameters to the 
kernel.


David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  hw/i386/trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c         |    2 +-
 hw/i386/Makefile.objs       |    1 +
 hw/i386/acpi-build.c        |  102 +++-
 hw/i386/amd_iommu.c         | 1384 +++
 hw/i386/amd_iommu.h         |  285 +
 hw/i386/trace-events        |   29 +
 include/hw/acpi/aml-build.h |    1 +
 include/hw/pci/pci.h        |    5 +-
 8 files changed, 1796 insertions(+), 13 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4
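
The changelog above mentions representing IOMMU commands as bitfields. Since C bitfield layout depends on the compiler and on host byte order, one alternative is to decode a command quadword with explicit shifts and masks. A minimal sketch, assuming the field positions of the little-endian branch of the CMDCompletionWait layout in this series; `CompletionWait` and `decode_completion_wait` are hypothetical names, and both quadwords are assumed to be in host byte order already:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct CompletionWait {
    bool     store;       /* bit 0: write store_data to store_addr   */
    bool     intr;        /* bit 1: completion interrupt requested   */
    bool     flush;       /* bit 2: completion flush flag            */
    uint64_t store_addr;  /* bits 3..51, kept as a byte address      */
    uint8_t  type;        /* bits 60..63: command type               */
    uint64_t store_data;  /* second quadword: data to write          */
} CompletionWait;

static inline CompletionWait decode_completion_wait(uint64_t qw0, uint64_t qw1)
{
    CompletionWait c;

    c.store = qw0 & 1;
    c.intr = (qw0 >> 1) & 1;
    c.flush = (qw0 >> 2) & 1;
    /* The 49-bit field holds address bits 3..51; masking in place
     * yields the 8-byte-aligned byte address directly. */
    c.store_addr = qw0 & (((UINT64_C(1) << 49) - 1) << 3);
    c.type = (uint8_t)(qw0 >> 60);
    c.store_data = qw1;
    return c;
}
```

The trade-off is more verbose decoding in exchange for a layout that does not need a separate big-endian branch.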




Re: [Qemu-devel] [V12 4/4] hw/i386: AMD IOMMU IVRS table

2016-07-08 Thread David Kiarie
On Mon, Jul 4, 2016 at 11:33 PM, Michael S. Tsirkin  wrote:
> On Wed, Jun 15, 2016 at 03:21:52PM +0300, David Kiarie wrote:
>> Add IVRS table for AMD IOMMU. Generate IVRS or DMAR
>> depending on emulated IOMMU.
>>
>> Signed-off-by: David Kiarie 
>> ---
>>  hw/acpi/aml-build.c |  2 +-
>>  hw/i386/acpi-build.c| 95 
>> +++--
>>  include/hw/acpi/acpi-defs.h | 13 +++
>>  include/hw/acpi/aml-build.h |  1 +
>>  4 files changed, 99 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index 123160a..9ce10aa 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -226,7 +226,7 @@ static void build_extop_package(GArray *package, uint8_t 
>> op)
>>  build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
>>  }
>>
>> -static void build_append_int_noprefix(GArray *table, uint64_t value, int 
>> size)
>> +void build_append_int_noprefix(GArray *table, uint64_t value, int size)
>>  {
>>  int i;
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index 8ca2032..ecdb15d 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -51,6 +51,7 @@
>>  #include "hw/pci/pci_bus.h"
>>  #include "hw/pci-host/q35.h"
>>  #include "hw/i386/intel_iommu.h"
>> +#include "hw/i386/amd_iommu.h"
>>  #include "hw/timer/hpet.h"
>>
>>  #include "hw/acpi/aml-build.h"
>> @@ -116,6 +117,12 @@ typedef struct AcpiBuildPciBusHotplugState {
>>  bool pcihp_bridge_en;
>>  } AcpiBuildPciBusHotplugState;
>>
>> +typedef enum IommuType {
>> +TYPE_INTEL,
>> +TYPE_AMD,
>> +TYPE_NONE
>> +} IommuType;
>> +
>>  static void acpi_get_pm_info(AcpiPmInfo *pm)
>>  {
>>  Object *piix = piix4_pm_find();
>> @@ -2439,6 +2446,77 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
>>   "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
>>  }
>>
>> +static void
>> +build_amd_iommu(GArray *table_data, BIOSLinker *linker)
>> +{
>> +int iommu_start = table_data->len;
>> +bool iommu_ambig;
>> +
>> +/* IVRS definition  - table header has an extra 2-byte field */
>
> Pls document the spec where this comes from, version and relevant chapters.

Misleading comment here. I have fixed the rest.

>
>> +acpi_data_push(table_data, sizeof(AcpiTableHeader));
>> +/* common virtualization information */
>
> Please change comments to match the spec exactly, case and all. E.g. /* 
> IVinfo - I/O
> virtualization information common to all IOMMU units in a
> system */
>
>> +build_append_int_noprefix(table_data, AMD_IOMMU_HOST_ADDRESS_WIDTH << 
>> 8, 4);
>> +/* reserved */
>> +build_append_int_noprefix(table_data, 0, 8);
>> +
>> +AMDVIState *s = (AMDVIState *)object_resolve_path_type("",
>> +TYPE_AMD_IOMMU_DEVICE, &iommu_ambig);
>> +
>> +/* IVDB definition - type 10h */
>> +if (!iommu_ambig) {
>> +/* IVHD definition - type 10h */
>> +build_append_int_noprefix(table_data, 0x10, 1);
>> +/* virtualization flags */
>> +build_append_int_noprefix(table_data, (IVHD_HT_TUNEN |
>> + IVHD_PPRSUP | IVHD_IOTLBSUP | IVHD_PREFSUP), 1);
>
> Just open-code it, and add a comment matching spec.
> This is how we do it in ACPI since no constant is ever
> reused anywhere.
> E.g.
>
>  (1 << 0) /* HtTunEn */ | (1 << 7) /* PPRSup */ .
>
> if you do this, you will also see it's not ordered, so
> you can sort by bit #.
>
>
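
The open-coded style suggested above could look like the following sketch. It is plain C rather than QEMU code; the review only gives bits 0 and 7, so the positions used here for IotlbSup (bit 4) and PreFSup (bit 6) are assumptions about the IVHD type 10h flags field, and `ivhd_flags` is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* IVHD flags byte with each bit open-coded and sorted by bit number. */
static inline uint8_t ivhd_flags(void)
{
    return (uint8_t)((1u << 0) |  /* HtTunEn                      */
                     (1u << 4) |  /* IotlbSup (assumed position)  */
                     (1u << 6) |  /* PreFSup  (assumed position)  */
                     (1u << 7));  /* PPRSup                       */
}
```

Writing the bits in ascending order makes a misplaced or missing flag easier to spot against the spec table.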
>> +/* ivhd length */
>> +build_append_int_noprefix(table_data, 0x20, 2);
>> +/* iommu device id */
>> +build_append_int_noprefix(table_data, s->devid, 2);
>> +/* offset of capability registers */
>> +build_append_int_noprefix(table_data, s->capab_offset, 2);
>> +/* mmio base register */
>> +build_append_int_noprefix(table_data, s->mmio.addr, 8);
>> +/* pci segment */
>> +build_append_int_noprefix(table_data, 0, 2);
>> +/* interrupt numbers */
>> +build_append_int_noprefix(table_data, 0, 2);
>> +/* feature reporting */
>> +build_append_int_noprefix(table_data, (IVHD_EFR_GTSUP |
>> +IVHD_EFR_HATS |

Re: [Qemu-devel] [V12 3/4] hw/i386: Introduce AMD IOMMU

2016-07-08 Thread David Kiarie
On Mon, Jul 4, 2016 at 8:41 AM, Jan Kiszka  wrote:
> On 2016-07-04 07:06, David Kiarie wrote:
>> On Wed, Jun 22, 2016 at 11:24 PM, Jan Kiszka  wrote:
>>> On 2016-06-15 14:21, David Kiarie wrote:
>>>> +
>>>> +
>>>> +/* PCI SIG constants */
>>>> +#define PCI_BUS_MAX 256
>>>> +#define PCI_SLOT_MAX 32
>>>> +#define PCI_FUNC_MAX 8
>>>> +#define PCI_DEVFN_MAX 256
>>>
>>> Shouldn't those four go to the pci header?
>>
>> The macros/defines in PCI header are picked from linux while some of
>> these are not picked from linux. I've prefixed them with AMDVI_ though.
>
> They are not AMDVI-specific, rather PCI-generic.

Am I getting you right here: the above should go into the PCI header, right?

>
> Jan
>
>



Re: [Qemu-devel] [V12 3/4] hw/i386: Introduce AMD IOMMU

2016-07-03 Thread David Kiarie
On Mon, Jul 4, 2016 at 8:41 AM, Jan Kiszka  wrote:
> On 2016-07-04 07:06, David Kiarie wrote:
>> On Wed, Jun 22, 2016 at 11:24 PM, Jan Kiszka  wrote:
>>> On 2016-06-15 14:21, David Kiarie wrote:
>>>> +static uint64_t amdvi_mmio_read(void *opaque, hwaddr addr, unsigned size)
>>>> +{
>>>> +AMDVIState *s = opaque;
>>>> +
>>>> +uint64_t val = -1;
>>>> +if (addr + size > AMDVI_MMIO_SIZE) {
>>>> +trace_amdvi_mmio_read("error: addr outside region: max ",
>>>> +(uint64_t)AMDVI_MMIO_SIZE, addr, size);
>>>> +return (uint64_t)-1;
>>>> +}
>>>> +
>>>> +if (size == 2) {
>>>> +val = amdvi_readw(s, addr);
>>>> +} else if (size == 4) {
>>>> +val = amdvi_readl(s, addr);
>>>> +} else if (size == 8) {
>>>> +val = amdvi_readq(s, addr);
>>>> +}
>>>> +
>>>> +switch (addr & ~0x07) {
>>>> +case AMDVI_MMIO_DEVICE_TABLE:
>>>> +trace_amdvi_mmio_read("MMIO_DEVICE_TABLE", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_COMMAND_BASE:
>>>> +trace_amdvi_mmio_read("MMIO_COMMAND_BASE", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EVENT_BASE:
>>>> +trace_amdvi_mmio_read("MMIO_EVENT_BASE", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_CONTROL:
>>>> +trace_amdvi_mmio_read("MMIO_MMIO_CONTROL", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EXCL_BASE:
>>>> +trace_amdvi_mmio_read("MMIO_EXCL_BASE", addr, size, addr & ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EXCL_LIMIT:
>>>> +trace_amdvi_mmio_read("MMIO_EXCL_LIMIT", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_COMMAND_HEAD:
>>>> +trace_amdvi_mmio_read("MMIO_COMMAND_HEAD", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_COMMAND_TAIL:
>>>> +trace_amdvi_mmio_read("MMIO_COMMAND_TAIL", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EVENT_HEAD:
>>>> +trace_amdvi_mmio_read("MMIO_EVENT_HEAD", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EVENT_TAIL:
>>>> +trace_amdvi_mmio_read("MMIO_EVENT_TAIL", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_STATUS:
>>>> +trace_amdvi_mmio_read("MMIO_STATUS", addr, size, addr & ~0x07);
>>>> +break;
>>>> +
>>>> +case AMDVI_MMIO_EXT_FEATURES:
>>>> +trace_amdvi_mmio_read("MMIO_EXT_FEATURES", addr, size, addr & 
>>>> ~0x07);
>>>> +break;
>>>
>>> What about a lookup table for that name?
>>
>> I can't find an obvious way to index a table given the register address.
>
> Well, you would need a low ((addr & 0x2000) == 0) and a high table (addr
> & 0x2000), and then do the indexing based on (addr & ~0x2000) / 8.
>
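
The two-table indexing described above can be sketched as follows. This is a standalone illustration rather than QEMU code: the register offsets are assumptions based on the AMD IOMMU MMIO layout (a low bank at 0x0000 and a high bank at 0x2000), and `amdvi_mmio_name` is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Low bank, indexed by offset / 8 (assumed offsets in comments). */
static const char *const amdvi_mmio_low[] = {
    "MMIO_DEVICE_TABLE",  /* 0x0000 */
    "MMIO_COMMAND_BASE",  /* 0x0008 */
    "MMIO_EVENT_BASE",    /* 0x0010 */
    "MMIO_CONTROL",       /* 0x0018 */
    "MMIO_EXCL_BASE",     /* 0x0020 */
    "MMIO_EXCL_LIMIT",    /* 0x0028 */
    "MMIO_EXT_FEATURES",  /* 0x0030 */
};

/* High bank, indexed by (offset - 0x2000) / 8. */
static const char *const amdvi_mmio_high[] = {
    "MMIO_COMMAND_HEAD",  /* 0x2000 */
    "MMIO_COMMAND_TAIL",  /* 0x2008 */
    "MMIO_EVENT_HEAD",    /* 0x2010 */
    "MMIO_EVENT_TAIL",    /* 0x2018 */
    "MMIO_STATUS",        /* 0x2020 */
};

static inline const char *amdvi_mmio_name(uint64_t addr)
{
    /* Drop the bank bit, then divide by the 8-byte register stride. */
    size_t index = (size_t)((addr & ~UINT64_C(0x2000)) / 8);

    if (addr & 0x2000) {
        return index < ARRAY_LEN(amdvi_mmio_high) ? amdvi_mmio_high[index]
                                                  : "UNHANDLED";
    }
    return index < ARRAY_LEN(amdvi_mmio_low) ? amdvi_mmio_low[index]
                                             : "UNHANDLED";
}
```

A single trace call could then pass `amdvi_mmio_name(addr)` in place of the per-register switch cases.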
>>
>>>
>>>> +
>>>> +default:
>>>> +trace_amdvi_mmio_read("UNHANDLED READ", addr, size, addr & ~0x07);
>>>> +}
>>>> +return val;
>>>> +}
>>>> +
>>>> +static void amdvi_handle_control_write(AMDVIState *s)
>>>> +{
>>>> +/*
>>>> + * read whatever is already written in case
>>>> + * software is writing in chunks less than 8 bytes
>>>> + */
>>>> +unsigned long control = amdvi_readq(s, AMDVI_MMIO_CONTROL);
>>>> +s->enabled = !!(control & AMDVI_MMIO_CONTROL_AMDVIEN);
>>>> +
>>>> +s->ats_enabled = !!(control & AMDVI_MMIO_CONTROL_HTT

Re: [Qemu-devel] [V12 3/4] hw/i386: Introduce AMD IOMMU

2016-07-03 Thread David Kiarie
On Wed, Jun 22, 2016 at 11:24 PM, Jan Kiszka  wrote:
> On 2016-06-15 14:21, David Kiarie wrote:
>> +
>> +/* System Software might never read from some of this fields but anyways */
>
> No read-modify-write accesses observed in the field? And fields like
> AMDVI_MMIO_STATUS or AMDVI_MMIO_EXT_FEATURES sound a lot like they are
> rather about reading than writing. Misleading comment?

Yeah, misleading comment. AMDVI_MMIO_EXT_FEATURES is read-only, while
some bits of AMDVI_MMIO_STATUS are r/w1c, and yes, I'm enforcing that in
the code.
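
Write-1-to-clear behaviour can be modelled with a per-register mask, as in this minimal sketch. The structure, the helper name and the mask values are hypothetical and are not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

typedef struct Reg64 {
    uint64_t value;     /* current register contents        */
    uint64_t rw_mask;   /* bits software may write directly */
    uint64_t w1c_mask;  /* bits cleared by writing 1        */
} Reg64;

static inline void reg64_write(Reg64 *r, uint64_t val)
{
    /* Ordinary read/write bits take the written value. */
    r->value = (r->value & ~r->rw_mask) | (val & r->rw_mask);
    /* W1C bits clear where a 1 is written; writing 0 leaves them set. */
    r->value &= ~(val & r->w1c_mask);
    /* Bits in neither mask are read-only and stay untouched. */
}
```

A fully read-only register is then just one with both masks set to zero.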

>
>> +static uint64_t amdvi_mmio_read(void *opaque, hwaddr addr, unsigned size)
>> +{
>> +AMDVIState *s = opaque;
>> +
>> +uint64_t val = -1;
>> +if (addr + size > AMDVI_MMIO_SIZE) {
>> +trace_amdvi_mmio_read("error: addr outside region: max ",
>> +(uint64_t)AMDVI_MMIO_SIZE, addr, size);
>> +return (uint64_t)-1;
>> +}
>> +
>> +if (size == 2) {
>> +val = amdvi_readw(s, addr);
>> +} else if (size == 4) {
>> +val = amdvi_readl(s, addr);
>> +} else if (size == 8) {
>> +val = amdvi_readq(s, addr);
>> +}
>> +
>> +switch (addr & ~0x07) {
>> +case AMDVI_MMIO_DEVICE_TABLE:
>> +trace_amdvi_mmio_read("MMIO_DEVICE_TABLE", addr, size, addr & 
>> ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_COMMAND_BASE:
>> +trace_amdvi_mmio_read("MMIO_COMMAND_BASE", addr, size, addr & 
>> ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EVENT_BASE:
>> +trace_amdvi_mmio_read("MMIO_EVENT_BASE", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_CONTROL:
>> +trace_amdvi_mmio_read("MMIO_MMIO_CONTROL", addr, size, addr & 
>> ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EXCL_BASE:
>> +trace_amdvi_mmio_read("MMIO_EXCL_BASE", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EXCL_LIMIT:
>> +trace_amdvi_mmio_read("MMIO_EXCL_LIMIT", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_COMMAND_HEAD:
>> +trace_amdvi_mmio_read("MMIO_COMMAND_HEAD", addr, size, addr & 
>> ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_COMMAND_TAIL:
>> +trace_amdvi_mmio_read("MMIO_COMMAND_TAIL", addr, size, addr & 
>> ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EVENT_HEAD:
>> +trace_amdvi_mmio_read("MMIO_EVENT_HEAD", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EVENT_TAIL:
>> +trace_amdvi_mmio_read("MMIO_EVENT_TAIL", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_STATUS:
>> +trace_amdvi_mmio_read("MMIO_STATUS", addr, size, addr & ~0x07);
>> +break;
>> +
>> +case AMDVI_MMIO_EXT_FEATURES:
>> +trace_amdvi_mmio_read("MMIO_EXT_FEATURES", addr, size, addr & 
>> ~0x07);
>> +break;
>
> What about a lookup table for that name?

I can't find an obvious way to index a table given the register address.

>
>> +
>> +default:
>> +trace_amdvi_mmio_read("UNHANDLED READ", addr, size, addr & ~0x07);
>> +}
>> +return val;
>> +}
>> +
>> +static void amdvi_handle_control_write(AMDVIState *s)
>> +{
>> +/*
>> + * read whatever is already written in case
>> + * software is writing in chunks less than 8 bytes
>> + */
>> +unsigned long control = amdvi_readq(s, AMDVI_MMIO_CONTROL);
>> +s->enabled = !!(control & AMDVI_MMIO_CONTROL_AMDVIEN);
>> +
>> +s->ats_enabled = !!(control & AMDVI_MMIO_CONTROL_HTTUNEN);
>> +s->evtlog_enabled = s->enabled && !!(control &
>> +AMDVI_MMIO_CONTROL_EVENTLOGEN);
>> +
>> +s->evtlog_intr = !!(control & AMDVI_MMIO_CONTROL_EVENTINTEN);
>> +s->completion_wait_intr = !!(control & AMDVI_MMIO_CONTROL_COMWAITINTEN);
>> +s->cmdbuf_enabled = s->enabled && !!(control &
>> +AMDVI_MMIO_CONTROL_CMDBUFLEN);
>> +
>> +/* update the flags depending on the control register */
>> +if (s->cmdbuf_enabled) {
>> +amdvi_orq(s, AMDVI_MMIO_STATUS, AMDVI_MMIO_

Re: [Qemu-devel] [V11 2/4] hw/i386: ACPI IVRS table

2016-06-18 Thread David Kiarie
On Tue, May 24, 2016 at 10:06 AM, Valentine Sinitsyn
 wrote:
> Hi all,
>
>
> On 24.05.2016 11:54, Peter Xu wrote:
>>
>> On Sun, May 22, 2016 at 01:21:52PM +0300, David Kiarie wrote:
>> [...]
>>>
>>> +static void
>>> +build_amd_iommu(GArray *table_data, GArray *linker)
>>> +{
>>> +int iommu_start = table_data->len;
>>> +bool iommu_ambig;
>>> +
>>> +/* IVRS definition  - table header has an extra 2-byte field */
>>> +acpi_data_push(table_data, (sizeof(AcpiTableHeader)));
>>> +/* common virtualization information */
>>> +build_append_int_noprefix(table_data, AMD_IOMMU_HOST_ADDRESS_WIDTH
>>> << 8, 4);
>>> +/* reserved */
>>> +build_append_int_noprefix(table_data, 0, 8);
>>> +
>>> +AMDVIState *s = (AMDVIState *)object_resolve_path_type("",
>>> +TYPE_AMD_IOMMU_DEVICE, &iommu_ambig);
>>> +
>>> +/* IVDB definition - type 10h */
>>> +if (!iommu_ambig) {
>>> +/* IVHD definition - type 10h */
>>> +build_append_int_noprefix(table_data, 0x10, 1);
>>> +/* virtualization flags */
>>> +build_append_int_noprefix(table_data, (IVHD_HT_TUNEN |
>>> + IVHD_PPRSUP | IVHD_IOTLBSUP | IVHD_PREFSUP), 1);
>>> +/* ivhd length */
>>> +build_append_int_noprefix(table_data, 0x20, 2);
>>> +/* iommu device id */
>>> +build_append_int_noprefix(table_data, PCI_DEVICE_ID_RD890_IOMMU,
>>> 2);
>>> +/* offset of capability registers */
>>> +build_append_int_noprefix(table_data, s->capab_offset, 2);
>>> +/* mmio base register */
>>> +build_append_int_noprefix(table_data, s->mmio.addr, 8);
>>> +/* pci segment */
>>> +build_append_int_noprefix(table_data, 0, 2);
>>> +/* interrupt numbers */
>>> +build_append_int_noprefix(table_data, 0, 2);
>>> +/* feature reporting */
>>> +build_append_int_noprefix(table_data, (IVHD_EFR_GTSUP |
>>> +IVHD_EFR_HATS | IVHD_EFR_GATS), 4);
>>> +/* Add device flags here
>>> + *   These are 4-byte device entries currently reporting the
>>> range of
>>> + *   devices 00h - ffffh; all devices
>>> + *   Device setting affecting all devices should be made here
>>> + *
>>> + *   Refer to
>>> + *
>>> (http://developer.amd.com/wordpress/media/2012/10/488821.pdf)
>>> + *   Table 95
>>
>>
>> I failed to find Table 95 in the document. Is that typo?
>
> I guess it should be "Table 75". David, am I right?
> On a side note, 2.0 specification you mention is rather outdated.
> Please consider referencing something newer, like 2.6.
>
>
>>
>> [...]
>>
>>>   static
>>>   void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>>   {
>>> @@ -2657,6 +2721,7 @@ void acpi_build(AcpiBuildTables *tables,
>>> MachineState *machine)
>>>   AcpiMcfgInfo mcfg;
>>>   PcPciInfo pci;
>>>   uint8_t *u;
>>> +IommuType IOMMUType = has_iommu();
>>>   size_t aml_len = 0;
>>>   GArray *tables_blob = tables->table_data;
>>>   AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
>>> @@ -2722,7 +2787,13 @@ void acpi_build(AcpiBuildTables *tables,
>>> MachineState *machine)
>>>   acpi_add_table(table_offsets, tables_blob);
>>>   build_mcfg_q35(tables_blob, tables->linker, &mcfg);
>>>   }
>>> -if (acpi_has_iommu()) {
>>> +
>>> +if (IOMMUType == TYPE_AMD) {
>>> +acpi_add_table(table_offsets, tables_blob);
>>> +build_amd_iommu(tables_blob, tables->linker);
>>> +}
>>> +
>>> +if (IOMMUType == TYPE_INTEL) {
>>>   acpi_add_table(table_offsets, tables_blob);
>>>   build_dmar_q35(tables_blob, tables->linker);
>>>   }
>>
>>
>> Nit: I'd prefer:
>>
>>  if (type == Intel) {
>>  ...
>>  } else if (type == AMD) {
>>  ...
>>  }
>>

I missed this in the last version of the patch; I will fix it in the next
version.

On taking a closer look at this, there might be a larger problem: with
the advent of -device, users can possibly emulate two IOMMUs at the same
time. A proposed solution was to have pci_setup_iommu check that the DMA
hook has not been set up yet and fail if it has. I should send a fix for
that too.
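
The proposed check could be sketched as below. This is a self-contained model and not the QEMU API: `Bus`, `DmaHook` and `bus_setup_iommu` are hypothetical stand-ins for the PCI bus state and for pci_setup_iommu:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void *(*DmaHook)(void *opaque, int devfn);

typedef struct Bus {
    DmaHook iommu_fn;     /* NULL until an IOMMU registers itself */
    void *iommu_opaque;   /* state passed back to the hook        */
} Bus;

/* Register a DMA hook; refuse if another IOMMU already claimed the bus. */
static inline bool bus_setup_iommu(Bus *bus, DmaHook fn, void *opaque)
{
    if (bus->iommu_fn != NULL) {
        return false;
    }
    bus->iommu_fn = fn;
    bus->iommu_opaque = opaque;
    return true;
}

/* Hypothetical hook used only to exercise the check. */
static inline void *dummy_hook(void *opaque, int devfn)
{
    (void)devfn;
    return opaque;
}
```

Returning a failure from registration lets the second IOMMU's realize step report an error instead of silently replacing the first hook.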

>> for better readability.
>>
>> Thanks,
>>
>> -- peterx
>>
>
> Best,
> Valentine



Re: [Qemu-devel] [V12 0/4] AMD IOMMU

2016-06-15 Thread David Kiarie
On Wed, Jun 15, 2016 at 5:26 PM, Eduardo Habkost  wrote:
> On Wed, Jun 15, 2016 at 03:21:48PM +0300, David Kiarie wrote:
>> Hi all,
>>
>> This patchset adds basic AMD IOMMU emulation support to Qemu.
>>
>> Changes since V11
>>    -AMD IOMMU is now started with -device amd-iommu (with a dependency on 
>> Marcel's patches).
>>-IOMMU commands are represented using bitfields which is less error prone 
>> and more readable[Peter]
>>-Changed from debug fprintfs to tracing[Jan]
>
> What were the issues that required the sysbus+PCI code you sent
> previously? How did you address them in this series?

Short answer: Those issues are not present in this patch.

Long answer: The sysbus + PCI code is necessary for interrupt
remapping to be implemented (it could also be done without the sysbus +
PCI code, but that wouldn't be in line with the Intel IOMMU interrupt
remapping code, which is already on the list). The idea is that x86
IOMMUs should have a base class that implements common code. It was
decided that this class should be a SysBusDevice, which works perfectly
with Intel IOMMU but not with AMD IOMMU, which has PCI properties. I
had to find a way to provide the PCI properties once I inherit
from the base class introduced by the Intel IOMMU work.

In this patchset, I have not inherited from the base class (it's not
merged yet), and even if it were merged I'd prefer to only use the
SysBus + PCI code when necessary (when I work on interrupt remapping)
so as to avoid delaying this patchset further.


>
> --
> Eduardo



[Qemu-devel] [V12 3/4] hw/i386: Introduce AMD IOMMU

2016-06-15 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation, error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee00000-0xfeefffff from translation
as it is the q35 interrupt region. We also advertise features
that are not yet implemented to please the Linux IOMMU driver.

The IOTLB aims at implementing commands found on real IOMMUs, which
is essential for debugging, and may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |    1 +
 hw/i386/amd_iommu.c   | 1559 +
 hw/i386/amd_iommu.h   |  287 +
 3 files changed, 1847 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index b52d5b8..2f1a265 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 0000000..460a8f3
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1559 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "hw/pci/msi.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/amd_iommu.h"
+
+#define ENCODE_EVENT(devid, info, addr, rshift) do { \
+*(uint16_t *)&evt[0] = devid; \
+*(uint8_t *)&evt[3]  = info;  \
+*(uint64_t *)&evt[4] = rshift ? cpu_to_le64(addr) :\
+   cpu_to_le64(addr) >> rshift; \
+} while (0)
+
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's iommu region*/
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMD-Vi cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint16_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+#define __BIG_ENDIAN_BITFIELD
+#endif
+
+/* serialize IOMMU command processing */
+struct CMDCompletionWait {
+
+#ifdef __BIG_ENDIAN_BITFIELD
+uint64_t type:4;   /* command type   */
+uint64_t reserved:8;
+uint64_t store_addr:49;/* addr to write  */
+uint64_t completion_flush:1;   /* allow more executions  */
+uint64_t completion_int:1; /* set MMIOWAITINT*/
+uint64_t completion_store:1;   /* write data to address  */
+#else
+uint64_t completion_store:1;
+uint64_t completion_int:1;
+uint64_t completion_flush:1;
+uint64_t store_addr:49;
+uint64_t reserved:8;
+uint64_t type:4;
+#endif /* __BIG_ENDIAN_BITFIELD */
+
+uint64_t store_data;   /* data to write  */
+} QEMU_PACKED;
+
+/* invalidate internal caches for devid */
+struct CMDInvalDevEntry {
+
+#ifdef __BIG_ENDIAN_BITFIELD
+uint64_t devid;/* device to invalidate   */
+uint64_t reserved_1:44;
+uint64_t type:4;   /* command type   */
+#else
+uint64_t devid;
+uint64_t reserved_1:44;
+uint64_t type:4;
+#endif /* __BIG_ENDIAN_BITFIELD */
+
+uint64_t reserved_2;
+} QEMU_PACKED;
+
+/* invalidate a range of entries in IOMMU translation cache for devid */
+struct CMDInvalIommuPages {
+
+#ifdef __BIG_ENDIAN_BITFIELD
+uin

[Qemu-devel] [V12 1/4] hw/pci: Prepare for AMD IOMMU

2016-06-15 Thread David Kiarie
Introduce PCI macros from linux headers for use by AMD IOMMU

Signed-off-by: David Kiarie 
---
 include/hw/pci/pci.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 4420f47..ac376c5 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -12,6 +12,8 @@
 
 /* PCI bus */
 
+#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
+#define PCI_DEVID(bus, devfn)   ((((uint16_t)(bus)) << 8) | (devfn))
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
-- 
2.1.4




[Qemu-devel] [V12 0/4] AMD IOMMU

2016-06-15 Thread David Kiarie
Hi all,

This patchset adds basic AMD IOMMU emulation support to Qemu.

Changes since V11
   -AMD IOMMU is now started with -device amd-iommu (with a dependency on 
Marcel's patches).
   -IOMMU commands are represented using bitfields which is less error prone 
and more readable[Peter]
   -Changed from debug fprintfs to tracing[Jan]

Changes since V10
 
   -Support for huge pages including some obscure AMD IOMMU feature that allows 
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that AMD IOMMU has 
BusMaster- and is therefore not able to generate interrupts like any other PCI 
device. We have resorted to writing directly to the system address space, but 
this could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the 
implementation file, since almost everything in the header is a macro and I 
reckoned renaming them should suffice.
   -taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries to ensure no DMA unless a device is 
configured to allow it.
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been 
fixed[Marcel]
   
You can test[1] these patches by starting QEMU with the parameters 
qemu-system-x86_64 -M -device amd-iommu -m 2G -enable-kvm -smp 4 -cpu host 
-hda file.img -soundhw ac97 
emulating whatever devices you want.

Booting Linux without any IOMMU-related command line parameters should be 
enough to test these patches, since the devices are basically passed through, 
but to the 'host' (L1 guest). You can also pass the kernel command line 
parameters 'iommu=pt iommu=1' and try to pass a device through to an L2 guest. 
This can also be done without passing any IOMMU-related parameters to the 
kernel.

For convenience:
[1] https://github.com/aslaq/qemu/tree/v12

David Kiarie (4):
  hw/pci: Prepare for AMD IOMMU
  trace-events: Add AMD IOMMU trace events
  hw/i386: Introduce AMD IOMMU
  hw/i386: AMD IOMMU IVRS table

 hw/acpi/aml-build.c         |    2 +-
 hw/i386/Makefile.objs       |    1 +
 hw/i386/acpi-build.c        |   95 ++-
 hw/i386/amd_iommu.c         | 1559 +++
 hw/i386/amd_iommu.h         |  287 +
 include/hw/acpi/acpi-defs.h |   13 +
 include/hw/acpi/aml-build.h |    1 +
 include/hw/pci/pci.h        |    2 +
 trace-events                |   29 +
 9 files changed, 1977 insertions(+), 12 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




[Qemu-devel] [V12 2/4] trace-events: Add AMD IOMMU trace events

2016-06-15 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 trace-events | 29 +
 1 file changed, 29 insertions(+)

diff --git a/trace-events b/trace-events
index 2f14205..340d019 100644
--- a/trace-events
+++ b/trace-events
@@ -2164,3 +2164,32 @@ e1000e_cfg_support_virtio(bool support) "Virtio header 
supported: %d"
 
 e1000e_vm_state_running(void) "VM state is running"
 e1000e_vm_state_stopped(void) "VM state is stopped"
+
+# hw/i386/amd_iommu.c
+amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 
0x%"PRIx64 " +  offset 0x%"PRIx32
+amdvi_cache_update(uint16_t domid, uint32_t bus, uint32_t slot, uint32_t func, 
uint64_t gpa, uint64_t txaddr) " update iotlb domid 0x%"PRIx16" devid: 
%02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_completion_wait_fail(uint64_t addr) "error: fail to write at address 
0x%"PRIx64
+amdvi_mmio_write(const char *reg, uint64_t addr, unsigned size, uint64_t val, 
unsigned long offset) "%s write addr 0x%"PRIx64 ", size %d, val 0x%"PRIx64 ", 
offset 0x%"PRIx64
+amdvi_mmio_read(const char *reg, uint64_t addr, unsigned size, uint64_t 
offset) "%s read addr 0x%"PRIx64", size %d offset 0x%"PRIx64
+amdvi_command_error(uint64_t status) "error: Executing commands with command 
buffer disabled 0x%"PRIx64
+amdvi_command_read_fail(uint64_t addr, uint32_t head) "error: fail to access 
memory at 0x%"PRIx64" + 0x%"PRIu32
+amdvi_command_exec(uint32_t head, uint32_t tail, uint64_t buf) "command buffer 
head at 0x%"PRIx32 " command buffer tail at 0x%"PRIx32" command buffer base at 
0x%" PRIx64
+amdvi_unhandled_command(uint8_t type) "unhandled command %d"
+amdvi_intr_inval(void) "Interrupt table invalidated"
+amdvi_iotlb_inval(void) "IOTLB pages invalidated"
+amdvi_prefetch_pages(void) "Pre-fetch of AMD-Vi pages requested"
+amdvi_pages_inval(uint16_t domid) "AMD-Vi pages for domain 0x%"PRIx16 " 
invalidated"
+amdvi_all_inval(void) "Invalidation of all AMD-Vi cache requested "
+amdvi_ppr_exec(void) "Execution of PPR queue requested "
+amdvi_devtab_inval(uint16_t bus, uint16_t slot, uint16_t func) "device table 
entry for devid: %02x:%02x.%x invalidated"
+amdvi_completion_wait(uint64_t addr, uint64_t data) "completion wait requested 
with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_control_status(uint64_t val) "MMIO_STATUS state 0x%"PRIx64
+amdvi_iotlb_reset(void) "IOTLB exceed size limit - reset "
+amdvi_completion_wait_exec(uint64_t addr, uint64_t data) "completion wait 
requested with store address 0x%"PRIx64" and store data 0x%"PRIx64
+amdvi_dte_get_fail(uint64_t addr, uint32_t offset) "error: failed to access 
Device Entry devtab 0x%"PRIx64" offset 0x%"PRIx32
+amdvi_invalid_dte(uint64_t addr) "PTE entry at 0x%"PRIx64" is invalid "
+amdvi_get_pte_hwerror(uint64_t addr) "hardware error accessing PTE at addr 
0x%"PRIx64
+amdvi_mode_invalid(unsigned level, uint64_t addr)"error: translation level 
0x%"PRIu8" translating addr 0x%"PRIx64
+amdvi_page_fault(uint64_t addr) "error: page fault accessing guest physical 
address 0x%"PRIx64
+amdvi_iotlb_hit(uint16_t bus, uint16_t slot, uint16_t func, uint64_t addr, 
uint64_t txaddr) "hit iotlb devid %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
+amdvi_translation_result(uint16_t bus, uint16_t slot, uint16_t func, uint64_t 
addr, uint64_t txaddr) "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64
-- 
2.1.4




Re: [Qemu-devel] [RFC] hw/i386: Composite Bus and PCI device

2016-06-11 Thread David Kiarie
On Fri, Jun 10, 2016 at 8:30 AM, Jan Kiszka  wrote:
> On 2016-06-08 17:25, Eduardo Habkost wrote:
>> On Wed, Jun 08, 2016 at 01:00:32PM +0300, David Kiarie wrote:
>>> Sample composite SysBus and PCI device similar to AMD IOMMU setup
>>>
>>> Signed-off-by: David Kiarie 
>>> ---
>>>  hw/i386/compositedevice.c | 113 
>>> ++
>>>  1 file changed, 113 insertions(+)
>>>  create mode 100644 hw/i386/compositedevice.c
>>
>> The filename is very generic (hw/i386/compositedevice.c), but it
>> has lots of AMD-specific names.
>>
>> Is your plan to provide generic helpers for implementing
>> SysBus+PCI devices, or is it going to be inside a source file
>> specific for AMD IOMMU?
>
> As far as I understood - David, correct me - this patch is more of a
> simplified demonstrator of the architecture to be applied on the actual
> AMD IOMMU code. It is not for merge.

Right.

>
> Jan
>
>



Re: [Qemu-devel] [RFC] Allow AMD IOMMU to have both SysBusDevice and PCIDevice properties.

2016-06-11 Thread David Kiarie
On Thu, Jun 9, 2016 at 4:23 PM, Marcel Apfelbaum  wrote:
> On 06/07/2016 10:12 PM, Eduardo Habkost wrote:
>>
>> Hi,
>>
>
> [...]
>
>> [...]
>>>
>>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>>> index 04aae89..431eaed 100644
>>> --- a/hw/i386/pc_q35.c
>>> +++ b/hw/i386/pc_q35.c
>>> @@ -281,6 +281,7 @@ static void pc_q35_machine_options(MachineClass *m)
>>>   m->default_machine_opts = "firmware=bios-256k.bin";
>>>   m->default_display = "std";
>>>   m->no_floppy = 1;
>>> +m->has_dynamic_sysbus = true;
>>
>>
>> Why is this needed? Is it possible to do this change before
>> adding the iommu code?  Can this be done in a separate patch that
>> documents why it should be changed and why it is safe to set it
>> to true?
>>
>>>
>
> Hi Eduardo,
>
> I also have this change as part of '[PATCH v2 0/3] enable iommu with
> -device'.
>
> Please see:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg374644.html

This change is actually yours. My code depends on your code.

>
>
> Thanks,
> Marcel



[Qemu-devel] [RFC] AMD IOMMU: emulate multiple devices

2016-06-08 Thread David Kiarie
Hello all,

This patch tries to solve a problem whereby real AMD IOMMUs exhibit both PCI
and platform device properties. AMD IOMMU properties that conflict with
conventional PCI devices' features include, among others, the fact that it is
not a BusMaster device and that it reserves an MMIO region without a BAR
register.

There is already some ongoing work on Intel IOMMU interrupt remapping which
implements an IOMMU base class[1] as a platform device (which means that the
moment I inherit from this class my device loses its PCI properties). I am
therefore forced to find a way to combine both PCI and platform features
(which I had previously avoided) into AMD IOMMU.

        X86-IOMMU (common code)
                  |
                  |
                 / \
                /   \
     Intel IOMMU     AMD IOMMU

This patch implements a stripped-down sample of how I plan to solve this issue.
It basically implements a PCI device which serves to 'steal' the PCI config
space while the main device remains a platform device. The platform device
maintains a reference to the PCI device and hence to the relevant PCI config
space. This device will also require [2] to work.
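
The arrangement described above can be sketched in plain C, outside of QEMU.
This is only an illustration under stated assumptions: every name below
(FakePCIDevice, FakeIOMMUPCI, FakeIOMMU, fake_container_of and the two helper
functions) is hypothetical and stands in for the QEMU types; none of it is
QEMU API. The point is simply that the platform half keeps a pointer to the
PCI half and reaches the PCI config space through it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI device: only the config space matters. */
typedef struct FakePCIDevice {
    uint8_t config[256];
} FakePCIDevice;

/* PCI half: embeds the PCI device plus PCI-specific state. */
typedef struct FakeIOMMUPCI {
    FakePCIDevice dev;
    uint32_t capab_offset;
} FakeIOMMUPCI;

/* Platform half: the main device, holding a reference to the PCI half. */
typedef struct FakeIOMMU {
    FakeIOMMUPCI *pci;
} FakeIOMMU;

/* Recover the enclosing struct from a pointer to one of its members. */
#define fake_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Attach an already created PCI device to the platform device. */
static void fake_iommu_attach(FakeIOMMU *s, FakePCIDevice *created)
{
    s->pci = fake_container_of(created, FakeIOMMUPCI, dev);
}

/* Store a 16-bit little-endian vendor ID at config space offset 0. */
static void fake_iommu_set_vendor(FakeIOMMU *s, uint16_t vendor)
{
    s->pci->dev.config[0] = vendor & 0xff;
    s->pci->dev.config[1] = vendor >> 8;
}
```

With this layout the platform device never owns a config space itself; it
only writes through the reference, which is the role the separate PCI device
plays in the sample patch.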

Looking forward to your comments!

[1] http://thread.gmane.org/gmane.comp.emulators.qemu/414510
[2] http://thread.gmane.org/gmane.comp.emulators.qemu/413018

David Kiarie (1):
  hw/i386: Composite Bus and PCI device

 hw/i386/compositedevice.c | 113 ++
 1 file changed, 113 insertions(+)
 create mode 100644 hw/i386/compositedevice.c

-- 
2.1.4




[Qemu-devel] [RFC] hw/i386: Composite Bus and PCI device

2016-06-08 Thread David Kiarie
Sample composite SysBus and PCI device similar to AMD IOMMU setup

Signed-off-by: David Kiarie 
---
 hw/i386/compositedevice.c | 113 ++
 1 file changed, 113 insertions(+)
 create mode 100644 hw/i386/compositedevice.c

diff --git a/hw/i386/compositedevice.c b/hw/i386/compositedevice.c
new file mode 100644
index 000..349e98d
--- /dev/null
+++ b/hw/i386/compositedevice.c
@@ -0,0 +1,113 @@
+#include "qemu/osdep.h"
+#include "hw/pci/pci.h"
+#include "hw/i386/x86-iommu.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+#include "hw/i386/pc.h"
+
+#define AMDVI_MMIO_SIZE        0x4000
+#define AMDVI_CAPAB_SIZE       0x18
+#define AMDVI_CAPAB_REG_SIZE   0x04
+#define AMDVI_CAPAB_ID_SEC     0xff
+
+#define TYPE_AMD_IOMMU_DEVICE "amd-iommu"
+#define AMD_IOMMU_DEVICE(obj)\
+    OBJECT_CHECK(AMDVIState, (obj), TYPE_AMD_IOMMU_DEVICE)
+
+#define TYPE_AMD_IOMMU_PCI "AMDVI-PCI"
+#define AMD_IOMMU_PCI(obj)\
+    OBJECT_CHECK(AMDVIPCIState, (obj), TYPE_AMD_IOMMU_PCI)
+
+typedef struct AMDVIPCIState {
+    PCIDevice dev;
+    /* PCI specific properties */
+    uint8_t *capab;          /* capabilities registers   */
+    uint32_t capab_offset;   /* capability offset pointer*/
+} AMDVIPCIState;
+
+typedef struct AMDVIState {
+    X86IOMMUState iommu;    /* IOMMU bus device */
+    AMDVIPCIState *dev;     /* IOMMU PCI device */
+
+    uint8_t mmior[AMDVI_MMIO_SIZE];    /* read/write MMIO  */
+    uint8_t w1cmask[AMDVI_MMIO_SIZE];  /* read/write 1 clear mask  */
+    uint8_t romask[AMDVI_MMIO_SIZE];   /* MMIO read/only mask  */
+} AMDVIState;
+
+static void amdvi_init(AMDVIState *s)
+{
+    /* reset PCI device */
+    pci_config_set_vendor_id(s->dev->dev.config, PCI_VENDOR_ID_AMD);
+    pci_config_set_prog_interface(s->dev->dev.config, 00);
+    pci_config_set_class(s->dev->dev.config, 0x0806);
+}
+
+static void amdvi_reset(DeviceState *dev)
+{
+    AMDVIState *s = AMD_IOMMU_DEVICE(dev);
+
+    amdvi_init(s);
+}
+
+static void amdvi_realize(DeviceState *dev, Error **errp)
+{
+    AMDVIState *s = AMD_IOMMU_DEVICE(dev);
+    PCIBus *bus = PC_MACHINE(qdev_get_machine())->bus;
+
+    /* This device should take care of IOMMU PCI properties */
+    PCIDevice *createddev = pci_create_simple(bus, -1, TYPE_AMD_IOMMU_PCI);
+    AMDVIPCIState *amdpcidevice = container_of(createddev, AMDVIPCIState, dev);
+    s->dev = amdpcidevice;
+}
+
+static void amdvi_class_init(ObjectClass *klass, void* data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    X86IOMMUClass *dc_class = X86_IOMMU_CLASS(klass);
+
+    dc->reset = amdvi_reset;
+
+    dc_class->realize = amdvi_realize;
+}
+
+static const TypeInfo amdvi = {
+    .name = TYPE_AMD_IOMMU_DEVICE,
+    .parent = TYPE_X86_IOMMU_DEVICE,
+    .instance_size = sizeof(AMDVIState),
+    .class_init = amdvi_class_init
+};
+
+static void amdviPCI_realize(PCIDevice *dev, Error **errp)
+{
+    AMDVIPCIState *s = container_of(dev, AMDVIPCIState, dev);
+
+    /* we need to report certain PCI capabilities */
+    s->capab_offset = pci_add_capability(&s->dev, AMDVI_CAPAB_ID_SEC, 0,
+                                         AMDVI_CAPAB_SIZE);
+    pci_add_capability(&s->dev, PCI_CAP_ID_MSI, 0, AMDVI_CAPAB_REG_SIZE);
+    pci_add_capability(&s->dev, PCI_CAP_ID_HT, 0, AMDVI_CAPAB_REG_SIZE);
+}
+
+static void amdviPCI_class_init(ObjectClass *klass, void* data)
+{
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    k->realize = amdviPCI_realize;
+}
+
+static const TypeInfo amdviPCI = {
+    .name = TYPE_AMD_IOMMU_PCI,
+    .parent = TYPE_PCI_DEVICE,
+    .instance_size = sizeof(AMDVIPCIState),
+    .class_init = amdviPCI_class_init
+};
+
+static void amdviPCI_register_types(void)
+{
+    type_register_static(&amdviPCI);
+    type_register_static(&amdvi);
+}
+
+type_init(amdviPCI_register_types);
-- 
2.1.4




Re: [Qemu-devel] [RFC] Allow AMD IOMMU to have both SysBusDevice and PCIDevice properties.

2016-06-07 Thread David Kiarie
On Tue, Jun 7, 2016 at 10:12 PM, Eduardo Habkost  wrote:
> Hi,

Hello,

>
> I didn't review the amd_iommu.c code, but there seems to be some
> unrelated changes in the patch:

Thanks for looking at this, but I actually wanted someone to look at
amd_iommu.c. I mentioned in the annotation that there are some
unrelated changes because this work is based on code that has not been
merged yet. I specifically sent this to get a review of the design of
amd_iommu.c, not of the details. I have a patchset that implements AMD
IOMMU (translation only) as a PCI device. It is, however, not possible
to work on interrupt remapping without converting AMD IOMMU from a PCI
device to a SysBusDevice. This device (AMD IOMMU), the one in this
patch, unlike in previous patches, creates two devices: a PCI device
and a SysBusDevice, which I am not sure is acceptable.

>
> On Sun, Jun 05, 2016 at 07:54:33PM +0300, David Kiarie wrote:
>> Signed-off-by: David Kiarie 
>> ---
>>  hw/acpi/aml-build.c |2 +-
>>  hw/i386/amd_iommu.c | 1471 
>> +++
>>  hw/i386/amd_iommu.h |  348 ++
>>  hw/i386/kvm/pci-assign.c|2 +-
>>  hw/i386/pc_q35.c|1 +
>>  include/hw/acpi/acpi-defs.h |   13 +
>>  include/hw/acpi/aml-build.h |1 +
>>  include/hw/pci/pci.h|   10 +-
>>  qemu-options.hx |7 +-
>>  util/qemu-config.c  |8 +-
>>  10 files changed, 1853 insertions(+), 10 deletions(-)
>>  create mode 100644 hw/i386/amd_iommu.c
>>  create mode 100644 hw/i386/amd_iommu.h
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index cedb74e..8d4bd01 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -227,7 +227,7 @@ static void build_extop_package(GArray *package, uint8_t op)
>>  build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
>>  }
>>
>> -static void build_append_int_noprefix(GArray *table, uint64_t value, int size)
>> +void build_append_int_noprefix(GArray *table, uint64_t value, int size)
>
> Why this change?
>
>>  {
>>  int i;
>>
> [...]
>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>> index 04aae89..431eaed 100644
>> --- a/hw/i386/pc_q35.c
>> +++ b/hw/i386/pc_q35.c
>> @@ -281,6 +281,7 @@ static void pc_q35_machine_options(MachineClass *m)
>>  m->default_machine_opts = "firmware=bios-256k.bin";
>>  m->default_display = "std";
>>  m->no_floppy = 1;
>> +m->has_dynamic_sysbus = true;
>
> Why is this needed? Is it possible to do this change before
> adding the iommu code?  Can this be done in a separate patch that
> documents why it should be changed and why it is safe to set it
> to true?
>
>>  }
>>
> [...]
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index a30808b..ef0e8a6 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -11,10 +11,11 @@
>>  #include "hw/pci/pcie.h"
>>
>>  /* PCI bus */
>> -
>> +#define PCI_DEVID(bus, devfn)   ((((uint16_t)(bus)) << 8) | (devfn))
>>  #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
>>  #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
>>  #define PCI_FUNC(devfn) ((devfn) & 0x07)
>> +#define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
>
> Missing parenthesis around (bus).
>
>>  #define PCI_SLOT_MAX            32
>>  #define PCI_FUNC_MAX            8
>>
>> @@ -328,7 +329,6 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
>>  int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id,
>> uint8_t offset, uint8_t size,
>> Error **errp);
>> -
>
> Unrelated whitespace change.
>
>>  void pci_del_capability(PCIDevice *pci_dev, uint8_t cap_id, uint8_t 
>> cap_size);
>>
>>  uint8_t pci_find_capability(PCIDevice *pci_dev, uint8_t cap_id);
>> @@ -692,11 +692,13 @@ static inline uint32_t pci_config_size(const PCIDevice 
>> *d)
>>  return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : 
>> PCI_CONFIG_SPACE_SIZE;
>>  }
>>
>> -static inline uint16_t pci_requester_id(PCIDevice *dev)
>> +static inline uint16_t pci_get_bdf(PCIDevice *dev)
>>  {
>> -return (pci_bus_num(dev->bus) << 8) | dev->devfn;
>> +return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>>  }
>>
>> +uint16_t pci_requester_id(PCIDevice *dev);
>> +
>
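
The remark about the missing parentheses around (bus) can be illustrated with
a small standalone sketch. The macro and function names below (BDF_UNSAFE,
BDF_SAFE, bdf_unsafe, bdf_safe) are hypothetical and exist only for this
example; BDF_UNSAFE mirrors the shape of the macro as posted, and BDF_SAFE
applies the suggested fix.

```c
#include <stdint.h>

/* Same shape as the posted macro: `bus` is not parenthesized. */
#define BDF_UNSAFE(bus, devfn) ((bus << 8) | (devfn))

/* With the parentheses added around `bus`. */
#define BDF_SAFE(bus, devfn)   (((bus) << 8) | (devfn))

/*
 * The bus number is passed as an expression (hi | lo). Because `<<` binds
 * tighter than `|`, the unsafe macro expands to ((hi | lo << 8) | devfn),
 * so only `lo` gets shifted.
 */
static uint16_t bdf_unsafe(uint8_t hi, uint8_t lo, uint8_t devfn)
{
    return (uint16_t)BDF_UNSAFE(hi | lo, devfn);
}

/* Here the whole expression (hi | lo) is shifted, as intended. */
static uint16_t bdf_safe(uint8_t hi, uint8_t lo, uint8_t devfn)
{
    return (uint16_t)BDF_SAFE(hi | lo, devfn);
}
```

The macro behaves correctly for a plain variable or a function call such as
pci_bus_num(dev->bus); the problem only shows up once the argument contains
an operator of lower precedence than the shift.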

[Qemu-devel] [RFC] Allow AMD IOMMU to have both SysBusDevice and PCIDevice properties.

2016-06-05 Thread David Kiarie
Signed-off-by: David Kiarie 
---
 hw/acpi/aml-build.c |2 +-
 hw/i386/amd_iommu.c | 1471 +++
 hw/i386/amd_iommu.h |  348 ++
 hw/i386/kvm/pci-assign.c|2 +-
 hw/i386/pc_q35.c|1 +
 include/hw/acpi/acpi-defs.h |   13 +
 include/hw/acpi/aml-build.h |1 +
 include/hw/pci/pci.h|   10 +-
 qemu-options.hx |7 +-
 util/qemu-config.c  |8 +-
 10 files changed, 1853 insertions(+), 10 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index cedb74e..8d4bd01 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -227,7 +227,7 @@ static void build_extop_package(GArray *package, uint8_t op)
 build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
 }
 
-static void build_append_int_noprefix(GArray *table, uint64_t value, int size)
+void build_append_int_noprefix(GArray *table, uint64_t value, int size)
 {
 int i;
 
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 000..24ca2f3
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1471 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "qom/object.h"
+
+//#define DEBUG_AMD_AMDVI
+#ifdef DEBUG_AMD_AMDVI
+enum {
+DEBUG_GENERAL, DEBUG_CAPAB, DEBUG_MMIO, DEBUG_ELOG,
+DEBUG_CACHE, DEBUG_COMMAND, DEBUG_MMU, DEBUG_CUSTOM
+};
+
+#define AMDVI_DBGBIT(x)   (1 << DEBUG_##x)
+static int iommu_dbgflags = AMDVI_DBGBIT(MMU);
+
+#define AMDVI_DPRINTF(what, fmt, ...) do { \
+if (iommu_dbgflags & AMDVI_DBGBIT(what)) { \
+fprintf(stderr, "(amd-iommu)%s: " fmt "\n", __func__, \
+## __VA_ARGS__); } \
+} while (0)
+#else
+#define AMDVI_DPRINTF(what, fmt, ...) do {} while (0)
+#endif
+
+#define ENCODE_EVENT(devid, info, addr, rshift) do { \
+*(uint16_t *)&evt[0] = devid; \
+*(uint8_t *)&evt[3]  = info;  \
+*(uint64_t *)&evt[4] = rshift ? cpu_to_le64(addr) :\
+   cpu_to_le64(addr) >> rshift; \
+} while (0)
+
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's iommu region*/
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint64_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+} AMDVIIOTLBEntry;
+
+/* configure MMIO registers at startup/reset */
+static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
+   uint64_t romask, uint64_t w1cmask)
+{
+stq_le_p(&s->mmior[addr], val);
+stq_le_p(&s->romask[addr], romask);
+stq_le_p(&s->w1cmask[addr], w1cmask);
+}
+
+static uint16_t amdvi_readw(AMDVIState *s, hwaddr addr)
+{
+return lduw_le_p(&s->mmior[addr]);
+}
+
+static uint32_t amdvi_readl(AMDVIState *s, hwaddr addr)
+{
+return ldl_le_p(&s->mmior[addr]);
+}
+
+static uint64_t amdvi_readq(AMDVIState *s, hwaddr addr)
+{
+return ldq_le_p(&s->mmior[addr]);
+}
+
+/* internal write */
+static void amdvi_writeq_raw(AMDVIState *s, uint64_t val, hwaddr addr)
+{
+stq_le_p(&s->mmior[addr], val);
+}
+
+/* external write */
+static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
+{
+uint16_t romask = lduw_le_p(&s->romask[addr]);
+uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
+uint16_t 

[Qemu-devel] [RFC] AMD IOMMU: emulate multiple devices

2016-06-05 Thread David Kiarie
Hello all,

This patch tries to solve a problem whereby real AMD IOMMUs exhibit both PCI
and platform device properties. AMD IOMMU properties that conflict with
conventional PCI devices' features include the fact that it is not a BusMaster
device and that it reserves an MMIO region without a BAR register.

There is already some ongoing work on Intel IOMMU interrupt remapping which
implements an IOMMU base class as a platform device (which means that the
moment I inherit from this class my device loses its PCI properties). I am
therefore forced to find a way to combine both PCI and platform features
(which I had previously avoided) into AMD IOMMU.

This patch implements a dummy PCI device which serves to 'steal' the PCI config
space while the rest of the device remains a platform device. The platform
device maintains a reference to the PCI device and hence to the relevant PCI
config space. Please ignore the details in this patch and review the design.
Also, some of the changes here are not related to the above issue.

Looking forward to your comments!

David Kiarie (1):
  Allow AMD IOMMU to have both SysBusDevice and PCIDevice properties.

 hw/acpi/aml-build.c |2 +-
 hw/i386/amd_iommu.c | 1471 +++
 hw/i386/amd_iommu.h |  348 ++
 hw/i386/kvm/pci-assign.c|2 +-
 hw/i386/pc_q35.c|1 +
 include/hw/acpi/acpi-defs.h |   13 +
 include/hw/acpi/aml-build.h |1 +
 include/hw/pci/pci.h|   10 +-
 qemu-options.hx |7 +-
 util/qemu-config.c  |8 +-
 10 files changed, 1853 insertions(+), 10 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




Re: [Qemu-devel] [PATCH v7 07/25] intel_iommu: define several structs for IOMMU IR

2016-05-30 Thread David Kiarie
On Mon, May 30, 2016 at 12:16 PM, Peter Xu  wrote:
> On Mon, May 30, 2016 at 11:54:52AM +0300, David Kiarie wrote:
>> On Mon, May 30, 2016 at 11:14 AM, Peter Xu  wrote:
>> > On Mon, May 30, 2016 at 07:56:16AM +0200, Jan Kiszka wrote:
>> >> On 2016-05-30 07:45, Peter Xu wrote:
> [...]
>> >> >
>> >> > I assume you mean when host cpu is big endian. x86 was little endian,
>> >> > and I was testing on x86.
>> >> >
>> >> > I think you are right. I should do conditional byte swap for all
>> >> > uint{16/32/64} cases within the fields. For example, index_l field in
>> >> > above VTD_IR_MSIAddress. And there are several other cases that need
>> >> > special treatment in the patchset. Will go over and fix corresponding
>> >> > issues in next version.
>> >>
>> >> You actually need bit-swap with bit fields, see e.g. hw/net/vmxnet3.h.
>> >
>> > Not noticed about bit-field ordering before... So maybe I need both?
>>
>> Yes, I think we will need both though, I think, byte swapping the
>> whole struct will break the code but swapping individual fields is
>> what we need.
>>
>> Myself, I'm defining bitfields as below:
>>
>>   struct CMDCompletionWait {
>>
>> #ifdef __BIG_ENDIAN_BITFIELD
>> uint32_t type:4;   /* command type   */
>> uint32_t reserved:8;
>> uint64_t store_addr:49;/* addr to write  */
>> uint32_t completion_flush:1;   /* allow more executions  */
>> uint32_t completion_int:1; /* set MMIOWAITINT*/
>> uint32_t completion_store:1;   /* write data to address  */
>
> I guess what we need might be this one:
>
>   uint64_t type:4;   /* command type   */
>   uint64_t reserved:8;
>   uint64_t store_addr:49;/* addr to write  */
>   uint64_t completion_flush:1;   /* allow more executions  */
>   uint64_t completion_int:1; /* set MMIOWAITINT*/
>   uint64_t completion_store:1;   /* write data to address  */
>
> IIUC, if we define type:4 as uint32_t rather than uint64_t, it should
> be bits [29:32] of the struct on big endian machines, not bits
> [61:64].

Yes, you're right.

>
> Thanks,
>
> -- peterx



Re: [Qemu-devel] [PATCH v7 07/25] intel_iommu: define several structs for IOMMU IR

2016-05-30 Thread David Kiarie
On Mon, May 30, 2016 at 11:14 AM, Peter Xu  wrote:
> On Mon, May 30, 2016 at 07:56:16AM +0200, Jan Kiszka wrote:
>> On 2016-05-30 07:45, Peter Xu wrote:
>> > On Sun, May 29, 2016 at 11:21:35AM +0300, David Kiarie wrote:
>> > [...]
>> >>>> +
>> >>>> +/* Programming format for MSI/MSI-X addresses */
>> >>>> +union VTD_IR_MSIAddress {
>> >>>> +struct {
>> >>>> +uint8_t __not_care:2;
>> >>>> +uint8_t index_h:1;  /* Interrupt index bit 15 */
>> >>>> +uint8_t sub_valid:1;/* SHV: Sub-Handle Valid bit */
>> >>>> +uint8_t int_mode:1; /* Interrupt format */
>> >>>> +uint16_t index_l:15;/* Interrupt index bit 14-0 */
>> >>>> +uint16_t __head:12; /* Should always be: 0x0fee */
>> >>>> +} QEMU_PACKED;
>> >>>> +uint32_t data;
>> >>>> +};
>> >>>
>> >>> In a recent discussion, it was brought to my attention that you might
>> >>> have a problem with bitfields when the host cpu is not x86. Have you
>> >>> considered this ?
>> >>
>> >> In a case when say the host cpu is little endian.
>> >
>> > I assume you mean when host cpu is big endian. x86 was little endian,
>> > and I was testing on x86.
>> >
>> > I think you are right. I should do conditional byte swap for all
>> > uint{16/32/64} cases within the fields. For example, index_l field in
>> > above VTD_IR_MSIAddress. And there are several other cases that need
>> > special treatment in the patchset. Will go over and fix corresponding
>> > issues in next version.
>>
>> You actually need bit-swap with bit fields, see e.g. hw/net/vmxnet3.h.
>
> Not noticed about bit-field ordering before... So maybe I need both?

Yes, I think we will need both. Byte swapping the whole struct will
break the code, though; swapping individual fields is what we need.

Myself, I'm defining bitfields as below:

  struct CMDCompletionWait {

#ifdef __BIG_ENDIAN_BITFIELD
      uint32_t type:4;               /* command type           */
      uint32_t reserved:8;
      uint64_t store_addr:49;        /* addr to write          */
      uint32_t completion_flush:1;   /* allow more executions  */
      uint32_t completion_int:1;     /* set MMIOWAITINT        */
      uint32_t completion_store:1;   /* write data to address  */
#else
      uint32_t completion_store:1;
      uint32_t completion_int:1;
      uint32_t completion_flush:1;
      uint64_t store_addr:49;
      uint32_t reserved:8;
      uint32_t type:4;
#endif /* __BIG_ENDIAN_BITFIELD */

      uint64_t store_data;           /* data to write          */
  } QEMU_PACKED;

So, the bitfields are basically aligned to a {1,2,4,8}-byte boundary.
I will have to swap store_addr, type, store_data, etc.
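
For comparison, here is a sketch of decoding the same first quadword with
shifts and masks instead of bitfields, which sidesteps both byte order and
bit-field ordering on the host. It is an illustration, not the code in the
patchset. The bit positions are an assumption read off the little-endian
branch of the struct above (completion_store in bit 0, store_addr in bits
3..51, type in bits 60..63), and the names load_le64, field, CompWaitFields
and comp_wait_decode are hypothetical.

```c
#include <stdint.h>

/* Load 8 bytes as a little-endian 64-bit value, whatever the host order. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Extract `len` bits starting at bit `shift` (len must be 1..63). */
static uint64_t field(uint64_t v, unsigned shift, unsigned len)
{
    return (v >> shift) & ((UINT64_C(1) << len) - 1);
}

/* Decoded fields of the first quadword (assumed layout, see above). */
typedef struct CompWaitFields {
    unsigned completion_store;
    unsigned completion_int;
    unsigned completion_flush;
    uint64_t store_addr;
    unsigned type;
} CompWaitFields;

static CompWaitFields comp_wait_decode(const uint8_t *cmd)
{
    uint64_t w = load_le64(cmd);
    CompWaitFields f;

    f.completion_store = (unsigned)field(w, 0, 1);
    f.completion_int   = (unsigned)field(w, 1, 1);
    f.completion_flush = (unsigned)field(w, 2, 1);
    f.store_addr       = field(w, 3, 49);
    f.type             = (unsigned)field(w, 60, 4);
    return f;
}
```

The trade-off is that each field needs an explicit shift and width, but the
result does not depend on how the compiler lays out bit-fields.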

>
> Thanks,
>
> -- peterx



Re: [Qemu-devel] [PATCH v7 07/25] intel_iommu: define several structs for IOMMU IR

2016-05-29 Thread David Kiarie
On Sun, May 29, 2016 at 11:20 AM, David Kiarie  wrote:
> On Tue, May 17, 2016 at 10:15 AM, Peter Xu  wrote:
>> Several data structs are defined to better support the rest of the
>> patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
>> structure bits to parse interrupt entries to be filled in by guest
>> kernel.
>>
>> Signed-off-by: Peter Xu 
>> ---
>>  include/hw/i386/intel_iommu.h | 60 
>> +++
>>  1 file changed, 60 insertions(+)
>>
>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>> index cc49839..4914fe6 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -52,6 +52,9 @@ typedef struct IntelIOMMUState IntelIOMMUState;
>>  typedef struct VTDAddressSpace VTDAddressSpace;
>>  typedef struct VTDIOTLBEntry VTDIOTLBEntry;
>>  typedef struct VTDBus VTDBus;
>> +typedef union VTD_IRTE VTD_IRTE;
>> +typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
>> +typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
>>
>>  /* Context-Entry */
>>  struct VTDContextEntry {
>> @@ -90,6 +93,63 @@ struct VTDIOTLBEntry {
>>  bool write_flags;
>>  };
>>
>> +/* Interrupt Remapping Table Entry Definition */
>> +union VTD_IRTE {
>> +struct {
>> +uint8_t present:1;  /* Whether entry present/available */
>> +uint8_t fault_disable:1;/* Fault Processing Disable */
>> +uint8_t dest_mode:1;/* Destination Mode */
>> +uint8_t redir_hint:1;   /* Redirection Hint */
>> +uint8_t trigger_mode:1; /* Trigger Mode */
>> +uint8_t delivery_mode:3;/* Delivery Mode */
>> +uint8_t __avail:4;  /* Available spaces for software */
>> +uint8_t __reserved_0:3; /* Reserved 0 */
>> +uint8_t irte_mode:1;/* IRTE Mode */
>> +uint8_t vector:8;   /* Interrupt Vector */
>> +uint8_t __reserved_1:8; /* Reserved 1 */
>> +uint32_t dest_id:32;/* Destination ID */
>> +uint16_t source_id:16;  /* Source-ID */
>> +uint8_t sid_q:2;/* Source-ID Qualifier */
>> +uint8_t sid_vtype:2;/* Source-ID Validation Type */
>> +uint64_t __reserved_2:44;   /* Reserved 2 */
>> +} QEMU_PACKED;
>> +uint64_t data[2];
>> +};
>> +
>> +/* Programming format for IOAPIC table entries */
>> +union VTD_IR_IOAPICEntry {
>> +struct {
>> +uint8_t vector:8;   /* Vector */
>> +uint8_t __zeros:3;  /* Reserved (all zero) */
>> +uint8_t index_h:1;  /* Interrupt Index bit 15 */
>> +uint8_t status:1;   /* Deliver Status */
>> +uint8_t polarity:1; /* Interrupt Polarity */
>> +uint8_t remote_irr:1;   /* Remote IRR */
>> +uint8_t trigger_mode:1; /* Trigger Mode */
>> +uint8_t mask:1; /* Mask */
>> +uint32_t __reserved:31; /* Reserved (should all zero) */
>> +uint8_t int_mode:1; /* Interrupt Format */
>> +uint16_t index_l:15;/* Interrupt Index bits 14-0 */
>> +} QEMU_PACKED;
>> +uint64_t data;
>> +};
>> +
>> +/* Programming format for MSI/MSI-X addresses */
>> +union VTD_IR_MSIAddress {
>> +struct {
>> +uint8_t __not_care:2;
>> +uint8_t index_h:1;  /* Interrupt index bit 15 */
>> +uint8_t sub_valid:1;/* SHV: Sub-Handle Valid bit */
>> +uint8_t int_mode:1; /* Interrupt format */
>> +uint16_t index_l:15;/* Interrupt index bit 14-0 */
>> +uint16_t __head:12; /* Should always be: 0x0fee */
>> +} QEMU_PACKED;
>> +uint32_t data;
>> +};
>
> In a recent discussion, it was brought to my attention that you might
> have a problem with bitfields when the host cpu is not x86. Have you
> considered this ?

In a case when say the host cpu is little endian.

>
>> +
>> +/* When IR is enabled, all MSI/MSI-X data bits should be zero */
>> +#define VTD_IR_MSI_DATA  (0)
>> +
>>  /* The iommu (DMAR) device state struct */
>>  struct IntelIOMMUState {
>>  SysBusDevice busdev;
>> --
>> 2.4.11
>>



Re: [Qemu-devel] [PATCH v7 07/25] intel_iommu: define several structs for IOMMU IR

2016-05-29 Thread David Kiarie
On Tue, May 17, 2016 at 10:15 AM, Peter Xu  wrote:
> Several data structs are defined to better support the rest of the
> patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
> structure bits to parse interrupt entries to be filled in by guest
> kernel.
>
> Signed-off-by: Peter Xu 
> ---
>  include/hw/i386/intel_iommu.h | 60 
> +++
>  1 file changed, 60 insertions(+)
>
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index cc49839..4914fe6 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -52,6 +52,9 @@ typedef struct IntelIOMMUState IntelIOMMUState;
>  typedef struct VTDAddressSpace VTDAddressSpace;
>  typedef struct VTDIOTLBEntry VTDIOTLBEntry;
>  typedef struct VTDBus VTDBus;
> +typedef union VTD_IRTE VTD_IRTE;
> +typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
> +typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
>
>  /* Context-Entry */
>  struct VTDContextEntry {
> @@ -90,6 +93,63 @@ struct VTDIOTLBEntry {
>  bool write_flags;
>  };
>
> +/* Interrupt Remapping Table Entry Definition */
> +union VTD_IRTE {
> +struct {
> +uint8_t present:1;  /* Whether entry present/available */
> +uint8_t fault_disable:1;/* Fault Processing Disable */
> +uint8_t dest_mode:1;/* Destination Mode */
> +uint8_t redir_hint:1;   /* Redirection Hint */
> +uint8_t trigger_mode:1; /* Trigger Mode */
> +uint8_t delivery_mode:3;/* Delivery Mode */
> +uint8_t __avail:4;  /* Available spaces for software */
> +uint8_t __reserved_0:3; /* Reserved 0 */
> +uint8_t irte_mode:1;/* IRTE Mode */
> +uint8_t vector:8;   /* Interrupt Vector */
> +uint8_t __reserved_1:8; /* Reserved 1 */
> +uint32_t dest_id:32;/* Destination ID */
> +uint16_t source_id:16;  /* Source-ID */
> +uint8_t sid_q:2;/* Source-ID Qualifier */
> +uint8_t sid_vtype:2;/* Source-ID Validation Type */
> +uint64_t __reserved_2:44;   /* Reserved 2 */
> +} QEMU_PACKED;
> +uint64_t data[2];
> +};
> +
> +/* Programming format for IOAPIC table entries */
> +union VTD_IR_IOAPICEntry {
> +struct {
> +uint8_t vector:8;   /* Vector */
> +uint8_t __zeros:3;  /* Reserved (all zero) */
> +uint8_t index_h:1;  /* Interrupt Index bit 15 */
> +uint8_t status:1;   /* Deliver Status */
> +uint8_t polarity:1; /* Interrupt Polarity */
> +uint8_t remote_irr:1;   /* Remote IRR */
> +uint8_t trigger_mode:1; /* Trigger Mode */
> +uint8_t mask:1; /* Mask */
> +uint32_t __reserved:31; /* Reserved (should all zero) */
> +uint8_t int_mode:1; /* Interrupt Format */
> +uint16_t index_l:15;/* Interrupt Index bits 14-0 */
> +} QEMU_PACKED;
> +uint64_t data;
> +};
> +
> +/* Programming format for MSI/MSI-X addresses */
> +union VTD_IR_MSIAddress {
> +struct {
> +uint8_t __not_care:2;
> +uint8_t index_h:1;  /* Interrupt index bit 15 */
> +uint8_t sub_valid:1;/* SHV: Sub-Handle Valid bit */
> +uint8_t int_mode:1; /* Interrupt format */
> +uint16_t index_l:15;/* Interrupt index bit 14-0 */
> +uint16_t __head:12; /* Should always be: 0x0fee */
> +} QEMU_PACKED;
> +uint32_t data;
> +};

In a recent discussion, it was brought to my attention that you might
have a problem with bitfields when the host cpu is not x86. Have you
considered this?

> +
> +/* When IR is enabled, all MSI/MSI-X data bits should be zero */
> +#define VTD_IR_MSI_DATA  (0)
> +
>  /* The iommu (DMAR) device state struct */
>  struct IntelIOMMUState {
>  SysBusDevice busdev;
> --
> 2.4.11
>



Re: [Qemu-devel] [V11 1/4] hw/i386: Introduce AMD IOMMU

2016-05-24 Thread David Kiarie
On Tue, May 24, 2016 at 3:35 PM, Peter Xu  wrote:
> On Sun, May 22, 2016 at 01:21:51PM +0300, David Kiarie wrote:
>
> [...]
>
>> +#define DEBUG_AMD_AMDVI
>> +#ifdef DEBUG_AMD_AMDVI
>> +enum {
>> +DEBUG_GENERAL, DEBUG_CAPAB, DEBUG_MMIO, DEBUG_ELOG,
>> +DEBUG_CACHE, DEBUG_COMMAND, DEBUG_MMU, DEBUG_CUSTOM
>> +};
>> +
>> +#define AMDVI_DBGBIT(x)   (1 << DEBUG_##x)
>> +static int iommu_dbgflags = AMDVI_DBGBIT(CUSTOM) | AMDVI_DBGBIT(MMU);
>> +
>> +#define AMDVI_DPRINTF(what, fmt, ...) do { \
>> +if (iommu_dbgflags & AMDVI_DBGBIT(what)) { \
>> +fprintf(stderr, "(amd-iommu)%s: " fmt "\n", __func__, \
>> +## __VA_ARGS__); } \
>> +} while (0)
>> +#else
>> +#define AMDVI_DPRINTF(what, fmt, ...) do {} while (0)
>> +#endif
>
> (actually I was considering whether it would be cool that both Intel
>  and AMD IOMMU codes start to leverage trace utilities. Re-compiling
>  for debugging every time is not convenient, and also not aligned
>  with other part of QEMU. However I guess this is in-all-cases too
>  late for a v11 patchset... So just raise this question up in the
>  brackets)
>
> [...]
>
>> +static void amdvi_log_event(AMDVIState *s, uint16_t *evt)
>> +{
>> +/* event logging not enabled */
>> +if (!s->evtlog_enabled || *(uint64_t *)&s->mmior[AMDVI_MMIO_STATUS] |
>> +AMDVI_MMIO_STATUS_EVT_OVF) {
>
> I see that there are lots of places in this patch that used
> something like:
>
>  *(uint64_t *)s->mmior[X] | Y
>
> Or:
>
>  *(uint64_t *)s->mmior[X] |= Y
>
> So... would it be a good idea that we provide several more helpers,
> like amdvi_orq() and amdvi_readq()?

Yes, that could be a good idea.
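
A rough sketch of what such helpers could look like, written as standalone C
so it can be compiled outside of QEMU. Everything here is hypothetical: the
DemoState type, the demo_* names and the 64-byte register array are stand-ins
chosen for the example, and the byte-wise load/store functions take the place
of QEMU's own little-endian accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state: a small register file. */
typedef struct DemoState {
    uint8_t mmior[64];
} DemoState;

/* Read a 64-bit little-endian value from a byte buffer. */
static uint64_t demo_ldq_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Write a 64-bit value to a byte buffer in little-endian order. */
static void demo_stq_le(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Read the 64-bit register at byte offset `addr`. */
static uint64_t demo_readq(DemoState *s, size_t addr)
{
    return demo_ldq_le(&s->mmior[addr]);
}

/* OR `mask` into the 64-bit register at byte offset `addr`. */
static void demo_orq(DemoState *s, size_t addr, uint64_t mask)
{
    demo_stq_le(&s->mmior[addr], demo_readq(s, addr) | mask);
}

/* Return nonzero if any bit of `mask` is set in the register. */
static int demo_testq(DemoState *s, size_t addr, uint64_t mask)
{
    return (demo_readq(s, addr) & mask) != 0;
}
```

Going through helpers like these keeps the pointer casts in one place and
makes the register accesses independent of host byte order; note that testing
a flag uses `&`, since OR-ing a nonzero mask always yields a nonzero value.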

>
>> +return;
>> +}
>> +
>> +/* event log buffer full */
>> +if (s->evtlog_tail >= s->evtlog_len) {
>> +*(uint64_t *)&s->mmior[AMDVI_MMIO_STATUS] |= 
>> AMDVI_MMIO_STATUS_EVT_OVF;
>
> Yet another example...
>
>> +/* generate interrupt */
>> +amdvi_generate_msi_interrupt(s);
>> +return;
>> +}
>> +
>> +if (dma_memory_write(&address_space_memory, s->evtlog_len + 
>> s->evtlog_tail,
>> +   &evt, AMDVI_EVENT_LEN)) {
>> +AMDVI_DPRINTF(ELOG, "error: fail to write at address 0x%"PRIx64
>> +  " + offset 0x%"PRIx32, s->evtlog, s->evtlog_tail);
>> +}
>> +
>> + s->evtlog_tail += AMDVI_EVENT_LEN;
>> + *(uint64_t *)&s->mmior[AMDVI_MMIO_STATUS] |= 
>> AMDVI_MMIO_STATUS_COMP_INT;
>
> Another one. (will stop finding examples)
>
>> + amdvi_generate_msi_interrupt(s);
>> +}
>
> [...]
>
>> +/* extract device id */
>> +static inline uint16_t devid_extract(uint8_t *cmd)
>> +{
>> +return (uint16_t)cmd[2] & AMDVI_INVAL_DEV_ID_MASK;
>
> Here the mask is defined as:
>
> #define AMDVI_INVAL_DEV_ID_MASK   (~((1UL << AMDVI_INVAL_DEV_ID_SHIFT) - 1))
>
> I think it should be 0x for 64 bit systems. However
> here cmd[2] is type uint8_t. Is there anything wrong?...
>
> Also, I see many places that we manipulate arbitary elements in
> cmd[] directly with some masks. Not sure whether we can make it more
> readable (which is optional though)

Well, the code checking for reserved bits is admittedly not very
legible. Most of the commands (if not all) differ in the position of
their reserved bits, which means that if I decided, for instance, to have
structs representing the commands, I'd probably have to define one for
each command. If there is support for this idea, I don't have a problem
implementing it.
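
As a rough sketch of one alternative to per-command structs, the reserved bits could be described by a small table of masks, and multi-byte fields could be loaded explicitly from the byte buffer. Everything here is an assumption made for illustration: load_le16(), load_le64(), CmdLayout and cmd_has_reserved_bits() are hypothetical names, the command is assumed to be 16 little-endian bytes, and the masks in the usage note are invented. It also shows why casting a single uint8_t element to uint16_t cannot recover a 16-bit device id: the cast only widens one byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Load a little-endian 16-bit field that starts at byte offset `off`. */
static uint16_t load_le16(const uint8_t *buf, size_t off)
{
    return (uint16_t)(buf[off] | ((unsigned)buf[off + 1] << 8));
}

/* Load a little-endian 64-bit word that starts at byte offset `off`. */
static uint64_t load_le64(const uint8_t *buf, size_t off)
{
    uint64_t val = 0;
    for (int i = 7; i >= 0; i--) {
        val = (val << 8) | buf[off + i];
    }
    return val;
}

/* One reserved-bit mask per 64-bit half of a 16-byte command (hypothetical). */
typedef struct {
    uint64_t reserved[2];
} CmdLayout;

/* Return non-zero if the command sets any bit its layout marks as reserved. */
static int cmd_has_reserved_bits(const uint8_t cmd[16], const CmdLayout *layout)
{
    return (load_le64(cmd, 0) & layout->reserved[0]) != 0 ||
           (load_le64(cmd, 8) & layout->reserved[1]) != 0;
}
```

Each command type would then need only one CmdLayout entry, for example { { 0x00000000ffff0000ULL, 0 } } for a command whose bits 16-31 are reserved, instead of a dedicated struct.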

>
> [...]
>
>> +static void amdvi_update_iotlb(AMDVIState *s, uint16_t devid,
>> +   uint64_t gpa, IOMMUTLBEntry to_cache,
>> +   uint16_t domid)
>> +{
>> +AMDVIIOTLBEntry *entry = g_malloc(sizeof(*entry));
>> +uint64_t *key = g_malloc(sizeof(key));
>> +uint64_t gfn = gpa >> AMDVI_PAGE_SHIFT_4K;
>> +
>> +/* don't cache erroneous translations */
>> +if (to_cache.perm != IOMMU_NONE) {
>> +AMDVI_DPRINTF(CACHE, " update iotlb domid 0x%"PRIx16" devid: "
>> +  "%02x:%02x.%xgpa 0x%"PRIx64 " hpa 0x%"PRIx64, domid,
>> +  PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
>> +  gpa, to_cache.translated_addr);
>> +
>> +if (g_hash_table_

Re: [Qemu-devel] [PATCH v7 08/25] x86-iommu: introduce parent class

2016-05-24 Thread David Kiarie
On Tue, May 24, 2016 at 2:02 PM, David Kiarie  wrote:
> On Tue, May 24, 2016 at 1:40 PM, Jan Kiszka  wrote:
>> On 2016-05-23 23:48, Marcel Apfelbaum wrote:
>>> On 05/23/2016 08:06 PM, David Kiarie wrote:
>>>> On Tue, May 17, 2016 at 10:15 AM, Peter Xu  wrote:
>>>>> Introducing parent class for intel-iommu devices named "x86-iommu". This
>>>>> is preparation work to abstract shared functionalities out from Intel
>>>>> and AMD IOMMUs. Currently, only the parent class is introduced. It does
>>>>> nothing yet.
>>>>>
>>>>> Signed-off-by: Peter Xu 
>>>>> ---
>>>>>   hw/i386/Makefile.objs |  2 +-
>>>
>>> [...]
>>>
>>>>> +
>>>>> +static const TypeInfo x86_iommu_info = {
>>>>> +.name  = TYPE_X86_IOMMU_DEVICE,
>>>>> +.parent= TYPE_SYS_BUS_DEVICE,
>>>>> +.instance_size = sizeof(X86IOMMUState),
>>>>> +.class_init= x86_iommu_class_init,
>>>>> +.class_size= sizeof(X86IOMMUClass),
>>>>> +.abstract  = true,
>>>>> +};
>>>>
>>>> As I suspected am having some trouble parenting a PCI device from a
>>>> Bus device but I will investigate further to see if I can manage
>>>> something.
>>>>
>>>
>>> You cannot derive from both SYS_BUS_DEVICE and PCI_DEVICE.
>>> You would need a composition; your device would be a SYS_BUS_DEVICE
>>> and its state would include a PCI_DEVICE (or the other way around).
>>> Then you can divide the responsibilities between them.
>>
>> Given that the AMD IOMMU is more a platform than a PCI device, I would
>> also go for deriving from SYS_BUS_DEVICE (and later on a common x86
>> IOMMU class) and embedding a PCI_DEVICE. And the Intel IOMMU has no PCI
>> device feature at all.

Huh, is it possible to embed the whole PCI device state?
I haven't tried that yet.

>
> Yes, I managed to do that by getting rid of PCI device specific
> callbacks(replaced them with DeviceState callbacks) so I get a compile
> and no runtime fatality but device(AMD IOMMU) never appears in the
> device tree.
>
>>
>> Jan
>>
>>



Re: [Qemu-devel] [PATCH v7 08/25] x86-iommu: introduce parent class

2016-05-24 Thread David Kiarie
On Tue, May 24, 2016 at 1:40 PM, Jan Kiszka  wrote:
> On 2016-05-23 23:48, Marcel Apfelbaum wrote:
>> On 05/23/2016 08:06 PM, David Kiarie wrote:
>>> On Tue, May 17, 2016 at 10:15 AM, Peter Xu  wrote:
>>>> Introducing parent class for intel-iommu devices named "x86-iommu". This
>>>> is preparation work to abstract shared functionalities out from Intel
>>>> and AMD IOMMUs. Currently, only the parent class is introduced. It does
>>>> nothing yet.
>>>>
>>>> Signed-off-by: Peter Xu 
>>>> ---
>>>>   hw/i386/Makefile.objs |  2 +-
>>
>> [...]
>>
>>>> +
>>>> +static const TypeInfo x86_iommu_info = {
>>>> +.name  = TYPE_X86_IOMMU_DEVICE,
>>>> +.parent= TYPE_SYS_BUS_DEVICE,
>>>> +.instance_size = sizeof(X86IOMMUState),
>>>> +.class_init= x86_iommu_class_init,
>>>> +.class_size= sizeof(X86IOMMUClass),
>>>> +.abstract  = true,
>>>> +};
>>>
>>> As I suspected am having some trouble parenting a PCI device from a
>>> Bus device but I will investigate further to see if I can manage
>>> something.
>>>
>>
>> You cannot derive from both SYS_BUS_DEVICE and PCI_DEVICE.
>> You would need a composition; your device would be a SYS_BUS_DEVICE
>> and its state would include a PCI_DEVICE (or the other way around).
>> Then you can divide the responsibilities between them.
>
> Given that the AMD IOMMU is more a platform than a PCI device, I would
> also go for deriving from SYS_BUS_DEVICE (and later on a common x86
> IOMMU class) and embedding a PCI_DEVICE. And the Intel IOMMU has no PCI
> device feature at all.

Yes, I managed to do that by getting rid of the PCI-device-specific
callbacks (I replaced them with DeviceState callbacks). It now compiles
and there is no fatal error at runtime, but the device (AMD IOMMU) never
appears in the device tree.
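
To make the composition idea concrete, here is a minimal sketch in plain C. It does not use QOM or any Qemu API: SysBusPart, PciPart, IommuState and the two lookup functions are hypothetical stand-ins that only show the layout, with the system bus part as the parent (first member) and the PCI part embedded in the state.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the two device "personalities" (not the Qemu types). */
typedef struct {
    uint64_t mmio_base;
} SysBusPart;

typedef struct {
    uint8_t devfn;
} PciPart;

/* The IOMMU "is a" system bus device and "has a" PCI device. */
typedef struct {
    SysBusPart parent;   /* must stay the first member */
    PciPart pci;         /* embedded, not inherited */
    int enabled;
} IommuState;

/* Recover the enclosing object from a pointer to one of its members. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Callbacks that receive the PCI part can get back to the full state. */
static IommuState *iommu_from_pci(PciPart *pci)
{
    return CONTAINER_OF(pci, IommuState, pci);
}

/* The parent is the first member, so the two addresses coincide. */
static IommuState *iommu_from_sysbus(SysBusPart *sb)
{
    return (IommuState *)sb;
}
```

The responsibilities can then be divided as suggested above: MMIO setup lives with the system bus part, while config space and capabilities live with the embedded PCI part.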

>
> Jan
>
>



Re: [Qemu-devel] [PATCH v7 08/25] x86-iommu: introduce parent class

2016-05-23 Thread David Kiarie
On Tue, May 17, 2016 at 10:15 AM, Peter Xu  wrote:
> Introducing parent class for intel-iommu devices named "x86-iommu". This
> is preparation work to abstract shared functionalities out from Intel
> and AMD IOMMUs. Currently, only the parent class is introduced. It does
> nothing yet.
>
> Signed-off-by: Peter Xu 
> ---
>  hw/i386/Makefile.objs |  2 +-
>  hw/i386/intel_iommu.c |  5 ++--
>  hw/i386/x86-iommu.c   | 53 
> +++
>  include/hw/i386/intel_iommu.h |  3 ++-
>  include/hw/i386/x86-iommu.h   | 46 +
>  5 files changed, 105 insertions(+), 4 deletions(-)
>  create mode 100644 hw/i386/x86-iommu.c
>  create mode 100644 include/hw/i386/x86-iommu.h
>
> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
> index b52d5b8..90e94ff 100644
> --- a/hw/i386/Makefile.objs
> +++ b/hw/i386/Makefile.objs
> @@ -2,7 +2,7 @@ obj-$(CONFIG_KVM) += kvm/
>  obj-y += multiboot.o
>  obj-y += pc.o pc_piix.o pc_q35.o
>  obj-y += pc_sysfw.o
> -obj-y += intel_iommu.o
> +obj-y += x86-iommu.o intel_iommu.o
>  obj-$(CONFIG_XEN) += ../xenpv/ xen/
>
>  obj-y += kvmvapic.o
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 4d14124..0a70577 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2120,16 +2120,17 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>  static void vtd_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +X86IOMMUClass *x86_class = X86_IOMMU_CLASS(klass);
>
>  dc->reset = vtd_reset;
> -dc->realize = vtd_realize;
>  dc->vmsd = &vtd_vmstate;
>  dc->props = vtd_properties;
> +x86_class->realize = vtd_realize;
>  }
>
>  static const TypeInfo vtd_info = {
>  .name  = TYPE_INTEL_IOMMU_DEVICE,
> -.parent= TYPE_SYS_BUS_DEVICE,
> +.parent= TYPE_X86_IOMMU_DEVICE,
>  .instance_size = sizeof(IntelIOMMUState),
>  .class_init= vtd_class_init,
>  };
> diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
> new file mode 100644
> index 0000000..d739afb
> --- /dev/null
> +++ b/hw/i386/x86-iommu.c
> @@ -0,0 +1,53 @@
> +/*
> + * QEMU emulation of common X86 IOMMU
> + *
> + * Copyright (C) 2016 Peter Xu, Red Hat 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/sysbus.h"
> +#include "hw/boards.h"
> +#include "hw/i386/x86-iommu.h"
> +
> +static void x86_iommu_realize(DeviceState *dev, Error **errp)
> +{
> +X86IOMMUClass *x86_class = X86_IOMMU_GET_CLASS(dev);
> +if (x86_class->realize) {
> +x86_class->realize(dev, errp);
> +}
> +}
> +
> +static void x86_iommu_class_init(ObjectClass *klass, void *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +dc->realize = x86_iommu_realize;
> +}
> +
> +static const TypeInfo x86_iommu_info = {
> +.name  = TYPE_X86_IOMMU_DEVICE,
> +.parent= TYPE_SYS_BUS_DEVICE,
> +.instance_size = sizeof(X86IOMMUState),
> +.class_init= x86_iommu_class_init,
> +.class_size= sizeof(X86IOMMUClass),
> +.abstract  = true,
> +};

As I suspected, I am having some trouble parenting a PCI device from a
Bus device, but I will investigate further to see if I can manage
something.

> +
> +static void x86_iommu_register_types(void)
> +{
> +type_register_static(&x86_iommu_info);
> +}
> +
> +type_init(x86_iommu_register_types)
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 4914fe6..c88a931 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -23,6 +23,7 @@
>  #define INTEL_IOMMU_H
>  #include "hw/qdev.h"
>  #include "sysemu/dma.h"
> +#include "hw/i386/x86-iommu.h"
>
>  #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
>  #define INTEL_IOMMU_DEVICE(obj) \
> @@ -152,7 +153,7 @@ union VTD_IR_MSIAddress {
>
>  /* The iommu (DMAR) device state struct */
>  struct IntelIOMMUState {
> -SysBusDevice busdev;
> +X86IOMMUState x86_iommu;
>  MemoryRegion csrmem;
>  uint8_t csr[DMAR_REG_SIZE]; /* register values */
>  uint8_t wmask[DMAR_REG_SIZE];   /* R/W bytes */
> diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
> new file mode 100644
> index 0000000..924f39a
> --- /dev/null
> +++ b/include/hw/i386/x86-

[Qemu-devel] GSoC 2016: Student Introduction

2016-05-22 Thread David Kiarie
Hello,

My name is David Kiarie. I am a student who has been selected to work
with Qemu as part of GSoC. My project entails completing the current
AMD IOMMU implementation work (getting the current patches merged) and
adding interrupt remapping. I will also try to add more features if
time allows.

I just sent the latest version of the AMD IOMMU patches
(http://thread.gmane.org/gmane.comp.emulators.qemu/412864).

I hope to make a good contribution to Qemu.

Cheers,
David.



[Qemu-devel] [V11 3/4] hw/core: provision for overriding emulated IOMMU

2016-05-22 Thread David Kiarie
Added an enum, subject to review, to the machine properties; it is
used to override the emulated IOMMU from Intel to AMD.
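
The validation done by the string property setter can be sketched on its own as follows. This is a standalone illustration, not the patch code: IommuKind and parse_iommu_kind() are hypothetical names, and the accepted strings "amd" and "intel" are taken from the x-iommu-type=amd|intel option text.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    IOMMU_KIND_NONE,
    IOMMU_KIND_INTEL,
    IOMMU_KIND_AMD
} IommuKind;

/*
 * Map an option string to an IOMMU kind.
 * Returns 0 on success; returns -1 and leaves *out untouched if the
 * string is NULL or names no known IOMMU.
 */
static int parse_iommu_kind(const char *value, IommuKind *out)
{
    if (value == NULL) {
        return -1;
    }
    if (strcmp(value, "amd") == 0) {
        *out = IOMMU_KIND_AMD;
        return 0;
    }
    if (strcmp(value, "intel") == 0) {
        *out = IOMMU_KIND_INTEL;
        return 0;
    }
    return -1;
}
```

Leaving the previous value untouched on error means an invalid override cannot silently switch the emulated IOMMU.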

Signed-off-by: David Kiarie 
---
 hw/core/machine.c | 29 ++---
 include/hw/boards.h   |  1 +
 include/hw/i386/intel_iommu.h |  1 +
 qemu-options.hx   |  7 +--
 util/qemu-config.c|  8 ++--
 5 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6dbbc85..fe44e25 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -15,6 +15,8 @@
 #include "qapi/error.h"
 #include "qapi-visit.h"
 #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/i386/intel_iommu.h"
 #include "hw/sysbus.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
@@ -297,9 +299,26 @@ static void machine_set_iommu(Object *obj, bool value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
+ms->iommu_type = TYPE_INTEL;
 ms->iommu = value;
 }
 
+static void machine_set_iommu_override(Object *obj, const char *value,
+   Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+/* ensure a valid iommu type */
+if (g_strcmp0(value, AMD_IOMMU_STR) == 0) {
+ms->iommu_type = TYPE_AMD;
+} else if (g_strcmp0(value, INTEL_IOMMU_STR) == 0) {
+ms->iommu_type = TYPE_INTEL;
+} else {
+error_setg(errp, "Invalid IOMMU type %s", value);
+return;
+}
+}
+
 static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
@@ -473,10 +492,14 @@ static void machine_initfn(Object *obj)
 "Firmware image",
 NULL);
 object_property_add_bool(obj, "iommu",
- machine_get_iommu,
- machine_set_iommu, NULL);
+ machine_get_iommu, machine_set_iommu, NULL);
 object_property_set_description(obj, "iommu",
-"Set on/off to enable/disable Intel IOMMU 
(VT-d)",
+"Set on to enable IOMMU emulation",
+NULL);
+object_property_add_str(obj, "x-iommu-type",
+NULL, machine_set_iommu_override, NULL);
+object_property_set_description(obj, "x-iommu-type",
+"Set on to override emulated IOMMU to AMD 
IOMMU",
 NULL);
 object_property_add_bool(obj, "suppress-vmdesc",
  machine_get_suppress_vmdesc,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index dbe6745..5b7eeda 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -158,6 +158,7 @@ struct MachineState {
 bool igd_gfx_passthru;
 char *firmware;
 bool iommu;
+IommuType iommu_type;
 bool suppress_vmdesc;
 bool enforce_config_section;
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index b024ffa..539530c 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -27,6 +27,7 @@
 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
 #define INTEL_IOMMU_DEVICE(obj) \
  OBJECT_CHECK(IntelIOMMUState, (obj), TYPE_INTEL_IOMMU_DEVICE)
+#define INTEL_IOMMU_STR "intel"
 
 /* DMAR Hardware Unit Definition address (IOMMU unit) */
 #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
diff --git a/qemu-options.hx b/qemu-options.hx
index 6106520..81217d3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,7 +38,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "kvm_shadow_mem=size of KVM shadow MMU\n"
 "dump-guest-core=on|off include guest memory in a core 
dump (default=on)\n"
 "mem-merge=on|off controls memory merge support (default: 
on)\n"
-"iommu=on|off controls emulated Intel IOMMU (VT-d) support 
(default=off)\n"
+"iommu=on|off controls emulated IOMMU support(default: 
off)\n"
+"x-iommu-type=amd|intel overrides emulated IOMMU to AMD 
IOMMU (default: intel)\n"
 "igd-passthru=on|off controls IGD GFX passthrough support 
(default=off)\n"
 "aes-key-wrap=on|off controls support for AES key wrapping 
(default=on)\n"
 "dea-key-wrap=on|off controls support for DEA key wrapping 
(default=on)\n"
@@ -74,7 +75,9 @@ Enables or disables memory merge support. This feature, when 
supported by
 the host, de-duplicates identical memory pages among VMs instances
 (enable

[Qemu-devel] [V11 4/4] hw/pci-host: Emulate AMD IOMMU

2016-05-22 Thread David Kiarie
Add AMD IOMMU emulation support to q35 chipset

Signed-off-by: David Kiarie 
---
 hw/pci-host/q35.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 70f897e..26fea0e 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -32,6 +32,7 @@
 #include "hw/pci-host/q35.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"
 
 /
  * Q35 host
@@ -448,6 +449,19 @@ static void mch_init_dmar(MCHPCIState *mch)
 pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
 }
 
+static void mch_init_amdvi(MCHPCIState *mch)
+{
+AMDVIState *iommu_state;
+PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
+PCIDevice *iommu;
+
+iommu = pci_create_simple(bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+
+pci_setup_iommu(bus, bridge_host_amdvi, iommu_state);
+}
+
 static void mch_realize(PCIDevice *d, Error **errp)
 {
 int i;
@@ -506,9 +520,14 @@ static void mch_realize(PCIDevice *d, Error **errp)
  mch->pci_address_space, &mch->pam_regions[i+1],
  PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
 }
-/* Intel IOMMU (VT-d) */
-if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
-mch_init_dmar(mch);
+
+MachineState *machine = MACHINE(qdev_get_machine());
+if (machine->iommu) {
+if (machine->iommu_type == TYPE_AMD) {
+mch_init_amdvi(mch);
+} else {
+mch_init_dmar(mch);
+}
 }
 }
 
-- 
2.1.4




[Qemu-devel] [V11 2/4] hw/i386: ACPI IVRS table

2016-05-22 Thread David Kiarie
Add IVRS table for AMD IOMMU. Generate IVRS or DMAR
depending on emulated IOMMU.

Signed-off-by: David Kiarie 
---
 hw/acpi/aml-build.c |  2 +-
 hw/i386/acpi-build.c| 93 +++--
 include/hw/acpi/acpi-defs.h | 13 +++
 include/hw/acpi/aml-build.h |  1 +
 include/hw/boards.h |  6 +++
 5 files changed, 103 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index cedb74e..8d4bd01 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -227,7 +227,7 @@ static void build_extop_package(GArray *package, uint8_t op)
 build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
 }
 
-static void build_append_int_noprefix(GArray *table, uint64_t value, int size)
+void build_append_int_noprefix(GArray *table, uint64_t value, int size)
 {
 int i;
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 279f0d7..b0ee01b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -52,6 +52,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci-host/q35.h"
 #include "hw/i386/intel_iommu.h"
+#include "hw/i386/amd_iommu.h"
 #include "hw/timer/hpet.h"
 
 #include "hw/acpi/aml-build.h"
@@ -59,6 +60,8 @@
 #include "qapi/qmp/qint.h"
 #include "qom/qom-qobject.h"
 
+#include "hw/boards.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -2577,6 +2580,77 @@ build_dmar_q35(GArray *table_data, GArray *linker)
  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
 }
 
+static void
+build_amd_iommu(GArray *table_data, GArray *linker)
+{
+int iommu_start = table_data->len;
+bool iommu_ambig;
+
+/* IVRS definition  - table header has an extra 2-byte field */
+acpi_data_push(table_data, (sizeof(AcpiTableHeader)));
+/* common virtualization information */
+build_append_int_noprefix(table_data, AMD_IOMMU_HOST_ADDRESS_WIDTH << 8, 4);
+/* reserved */
+build_append_int_noprefix(table_data, 0, 8);
+
+AMDVIState *s = (AMDVIState *)object_resolve_path_type("",
+TYPE_AMD_IOMMU_DEVICE, &iommu_ambig);
+
+/* IVDB definition - type 10h */
+if (!iommu_ambig) {
+/* IVHD definition - type 10h */
+build_append_int_noprefix(table_data, 0x10, 1);
+/* virtualization flags */
+build_append_int_noprefix(table_data, (IVHD_HT_TUNEN |
+ IVHD_PPRSUP | IVHD_IOTLBSUP | IVHD_PREFSUP), 1);
+/* ivhd length */
+build_append_int_noprefix(table_data, 0x20, 2);
+/* iommu device id */
+build_append_int_noprefix(table_data, PCI_DEVICE_ID_RD890_IOMMU, 2);
+/* offset of capability registers */
+build_append_int_noprefix(table_data, s->capab_offset, 2);
+/* mmio base register */
+build_append_int_noprefix(table_data, s->mmio.addr, 8);
+/* pci segment */
+build_append_int_noprefix(table_data, 0, 2);
+/* interrupt numbers */
+build_append_int_noprefix(table_data, 0, 2);
+/* feature reporting */
+build_append_int_noprefix(table_data, (IVHD_EFR_GTSUP |
+IVHD_EFR_HATS | IVHD_EFR_GATS), 4);
+/* Add device flags here
+ *   These are 4-byte device entries currently reporting the range of
+ *   devices 00h - h; all devices
+ *   Device setting affecting all devices should be made here
+ *
+ *   Refer to
+ *   (http://developer.amd.com/wordpress/media/2012/10/488821.pdf)
+ *   Table 95
+ */
+/* start of device range, 4-byte entries */
+build_append_int_noprefix(table_data, 0x0003, 4);
+/* end of device range */
+build_append_int_noprefix(table_data, 0x0004, 4);
+}
+
+build_header(linker, table_data, (void *)(table_data->data + iommu_start),
+ "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
+}
+
+static IommuType has_iommu(void)
+{
+bool ambiguous;
+
+if (object_resolve_path_type("", TYPE_AMD_IOMMU_DEVICE, &ambiguous)
+&& !ambiguous)
+return TYPE_AMD;
+else if (object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE, &ambiguous)
+&& !ambiguous)
+return TYPE_INTEL;
+else
+return TYPE_NONE;
+}
+
 static GArray *
 build_rsdp(GArray *rsdp_table, GArray *linker, unsigned rsdt)
 {
@@ -2635,16 +2709,6 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 return true;
 }
 
-static bool acpi_has_iommu(void)
-{
-bool ambiguous;
-Object *intel_iommu;
-
-intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
-   

[Qemu-devel] [V11 1/4] hw/i386: Introduce AMD IOMMU

2016-05-22 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation and error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee00000-0xfeefffff from translation
as it is the q35 interrupt region. We also advertise features
that are not yet implemented to please the Linux IOMMU driver.

The IOTLB aims at implementing the commands found on real IOMMUs,
which is essential for debugging; it may not offer any performance
benefits.
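
The two behaviours described above, exempting the interrupt region and answering with no access rights instead of a target abort, can be sketched in isolation as follows. This is a simplified illustration and not the code in this patch: TlbEntry, Perm and translate_sketch() are hypothetical names, the page walk is omitted entirely, and the window is assumed to be the usual x86 interrupt address range 0xfee00000-0xfeefffff.

```c
#include <stdint.h>

#define INT_ADDR_FIRST 0xfee00000ULL   /* assumed start of interrupt window */
#define INT_ADDR_LAST  0xfeefffffULL   /* assumed end of interrupt window */
#define PAGE_MASK_4K   (~0xfffULL)

typedef enum { PERM_NONE = 0, PERM_RO = 1, PERM_WO = 2, PERM_RW = 3 } Perm;

typedef struct {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    Perm perm;
} TlbEntry;

static int is_interrupt_addr(uint64_t addr)
{
    return addr >= INT_ADDR_FIRST && addr <= INT_ADDR_LAST;
}

static TlbEntry translate_sketch(uint64_t addr, int iommu_enabled)
{
    /* Default answer: no access rights, which stands in for a target abort. */
    TlbEntry ret = { addr & PAGE_MASK_4K, 0, ~PAGE_MASK_4K, PERM_NONE };

    if (!iommu_enabled || is_interrupt_addr(addr)) {
        /* Bypass: identity-map the page with full access. */
        ret.translated_addr = addr & PAGE_MASK_4K;
        ret.perm = PERM_RW;
        return ret;
    }

    /* A real implementation would walk the page tables here. */
    return ret;
}
```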

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1401 +
 hw/i386/amd_iommu.h   |  340 
 include/hw/pci/pci.h  |2 +
 4 files changed, 1744 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index b52d5b8..2f1a265 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 0000000..e70390d
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1401 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include "hw/pci/msi.h"
+#include "hw/i386/amd_iommu.h"
+
+#define DEBUG_AMD_AMDVI
+#ifdef DEBUG_AMD_AMDVI
+enum {
+DEBUG_GENERAL, DEBUG_CAPAB, DEBUG_MMIO, DEBUG_ELOG,
+DEBUG_CACHE, DEBUG_COMMAND, DEBUG_MMU, DEBUG_CUSTOM
+};
+
+#define AMDVI_DBGBIT(x)   (1 << DEBUG_##x)
+static int iommu_dbgflags = AMDVI_DBGBIT(CUSTOM) | AMDVI_DBGBIT(MMU);
+
+#define AMDVI_DPRINTF(what, fmt, ...) do { \
+if (iommu_dbgflags & AMDVI_DBGBIT(what)) { \
+fprintf(stderr, "(amd-iommu)%s: " fmt "\n", __func__, \
+## __VA_ARGS__); } \
+} while (0)
+#else
+#define AMDVI_DPRINTF(what, fmt, ...) do {} while (0)
+#endif
+
+#define ENCODE_EVENT(devid, info, addr, rshift) do { \
+*(uint16_t *)&evt[0] = devid; \
+*(uint8_t *)&evt[3]  = info;  \
+*(uint64_t *)&evt[4] = rshift ? cpu_to_le64(addr) :\
+   cpu_to_le64(addr) >> rshift; \
+} while (0)
+
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state;/* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's iommu region*/
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;   /* guest frame number  */
+uint16_t domid; /* assigned domain id  */
+uint64_t devid; /* device owning entry */
+uint64_t perms; /* access permissions  */
+uint64_t translated_addr;   /* translated address  */
+uint64_t page_mask; /* physical page size  */
+} AMDVIIOTLBEntry;
+
+/* configure MMIO registers at startup/reset */
+static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
+   uint64_t romask, uint64_t w1cmask)
+{
+stq_le_p(&s->mmior[addr], val);
+stq_le_p(&s->romask[addr], romask);
+stq_le_p(&s->w1cmask[addr], w1cmask);
+}
+
+static uint16_t amdvi_readw(AMDVIState *s, hwaddr addr)
+{
+return lduw_le_p(&s->mmior[addr]);
+}
+
+static uint32_t amdvi_readl(AMDVIState *s, hwaddr addr)
+{
+return ldl_le_p(&s->mmior[addr]);
+}
+
+static uint64_t amdvi_readq(AMDVIState *s, hwaddr addr)
+{
+return ldq_le_p(&s->mmior[addr]);
+}
+
+/* internal write */
+static void amdvi_writeq_raw(AMDVIState *s, uint64_t val, hwaddr addr)
+{
+stq_le_p(&s->mmior[addr], val);
+}

[Qemu-devel] [V11 0/4] AMD IOMMU

2016-05-22 Thread David Kiarie
Hi all,

This patch series adds basic AMD IOMMU emulation support to Qemu. It's
currently in its 11th version.

Michael (or any other person who can merge this patchset), can you please look at 
the possibility of merging these patches? Are there more issues with these 
patches, since the last version barely attracted any comments?

Changes since V10 include
 
   -Support for huge pages including some obscure AMD IOMMU feature that allows 
default page size override[Jan].
   -Fixed an issue with generation of interrupts. We noted that the AMD IOMMU has 
BusMaster- and is therefore not able to generate interrupts like any other PCI 
device. We have resorted to writing directly to the system address space, but 
this could be fixed by some patches which have not been merged yet.

Changes since v9

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the 
implementation file, since almost everything in the header is basically macros 
and I reckoned renaming them should suffice.
   -Taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries, which is still subject to discussion. I will 
make any necessary changes based on the outcome of that discussion.[Jan]
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been fixed[Marcel]
   
You can test these patches by starting Qemu with the parameters 
qemu-system-x86_64 -M q35,iommu=on,x-iommu-type=amd -m 2G -enable-kvm -smp 4 -cpu host -hda file.img -soundhw ac97 
emulating whatever devices you want.

Not passing any command line parameters to Linux should be enough to test these 
patches, since the devices are basically passed through, but to the 'host' 
(L1 guest). You can still go ahead and pass the kernel command line parameters 
'iommu=pt iommu=1' and try to pass a device to an L2 guest. This can also be 
done without passing any IOMMU-related parameters to the kernel.

David Kiarie (4):
  hw/i386: Introduce AMD IOMMU
  hw/i386: ACPI IVRS table
  hw/core: provision for overriding emulated IOMMU
  hw/pci-host: Emulate AMD IOMMU

 hw/acpi/aml-build.c   |2 +-
 hw/core/machine.c |   29 +-
 hw/i386/Makefile.objs |1 +
 hw/i386/acpi-build.c  |   93 ++-
 hw/i386/amd_iommu.c   | 1401 +
 hw/i386/amd_iommu.h   |  340 ++
 hw/pci-host/q35.c |   25 +-
 include/hw/acpi/acpi-defs.h   |   13 +
 include/hw/acpi/aml-build.h   |1 +
 include/hw/boards.h   |7 +
 include/hw/i386/intel_iommu.h |1 +
 include/hw/pci/pci.h  |2 +
 qemu-options.hx   |7 +-
 util/qemu-config.c|8 +-
 14 files changed, 1908 insertions(+), 22 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




Re: [Qemu-devel] [V10 1/4] hw/i386: Introduce AMD IOMMU

2016-05-15 Thread David Kiarie
On Sun, May 15, 2016 at 10:29 PM, Jan Kiszka  wrote:
> On 2016-05-09 14:15, David Kiarie wrote:
>> +

Thanks for review and testing!

>> +/* go to the next lower level */
>> +pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
>> +/* add offset and load pte */
>> +pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
>> +pte = ldq_phys(&address_space_memory, pte_addr);
>> +level = get_pte_translation_mode(pte);
>> +}
>> +/* get access permissions from pte */
>
> That comment is only addressing the last assignment of the followings.
>
>> +ret->iova = addr & AMDVI_PAGE_MASK_4K;
>> +ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) &
>> +AMDVI_PAGE_MASK_4K;
>> +ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
>
> This does not take huge pages (2M, 1G, ...) into account. Jailhouse
> creates them, and its Linux guest goes mad. You need to use the correct
> page size here, analogously to intel_iommu.c.

Yes, this was meant to work with normal pages only. Until recently the
Intel IOMMU emulation supported 4k pages only, so I figured I could just as
well work with 4k pages. Anyway, I will fix this.
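
A rough sketch of what taking the page size into account could look like is given below. It assumes an x86-style layout with a 4 KiB base page and 9 address bits per page-table level, so that a leaf found at level N maps a page of 2^(12 + 9 * (N - 1)) bytes; this is the same shift as the '3 + 9 * level' expression in the quoted page walk. LeafMapping, level_shift(), level_page_mask(), level_index() and make_leaf() are hypothetical names, and the default-page-size override feature is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT 12   /* 4 KiB base page */
#define BITS_PER_LEVEL  9    /* 512 entries per table */

typedef struct {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
} LeafMapping;

/* Number of address bits covered by a page mapped at `level` (1 = 4 KiB). */
static unsigned level_shift(unsigned level)
{
    assert(level >= 1 && level <= 6);
    return BASE_PAGE_SHIFT + BITS_PER_LEVEL * (level - 1);
}

/* Mask that keeps the page-frame bits for a page mapped at `level`. */
static uint64_t level_page_mask(unsigned level)
{
    return ~((1ULL << level_shift(level)) - 1);
}

/* Index into the page table at `level` for address `addr`. */
static unsigned level_index(uint64_t addr, unsigned level)
{
    return (unsigned)((addr >> level_shift(level)) & 0x1ff);
}

/* Build the mapping for a leaf entry found at `level`. */
static LeafMapping make_leaf(uint64_t addr, uint64_t pte_addr, unsigned level)
{
    uint64_t mask = level_page_mask(level);
    LeafMapping m = { addr & mask, pte_addr & mask, ~mask };
    return m;
}
```

With this, a leaf at level 2 yields an addr_mask of 0x1fffff (2 MiB) and a leaf at level 3 one of 0x3fffffff (1 GiB), instead of the fixed 4 KiB mask.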

>
>> +ret->perm = amdvi_get_perms(pte);
>> +return;
>> +}
>> +
>> +no_remap:
>> +ret->iova = addr & AMDVI_PAGE_MASK_4K;
>> +ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
>> +ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
>> +ret->perm = amdvi_get_perms(pte);
>> +
>> +}
>> +
>> +/* TODO : Mark addresses as Accessed and Dirty */
>> +static void amdvi_do_translate(AMDVIAddressSpace *as, hwaddr addr,
>> +   bool is_write, IOMMUTLBEntry *ret)
>> +{
>> +AMDVIState *s = as->iommu_state;
>> +uint16_t devid = PCI_DEVID(as->bus_num, as->devfn);
>> +AMDVIIOTLBEntry *iotlb_entry = amdvi_iotlb_lookup(s, addr, as->devfn);
>> +uint64_t entry[4];
>> +
>> +if (iotlb_entry) {
>> +AMDVI_DPRINTF(CACHE, "hit  iotlb devid: %02x:%02x.%x gpa 0x%"PRIx64
>> +  " hpa 0x%"PRIx64, PCI_BUS_NUM(devid), PCI_SLOT(devid),
>> +  PCI_FUNC(devid), addr, iotlb_entry->translated_addr);
>> +ret->iova = addr & AMDVI_PAGE_MASK_4K;
>> +ret->translated_addr = iotlb_entry->translated_addr;
>> +ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
>> +ret->perm = iotlb_entry->perms;
>> +return;
>> +}
>> +
>> +/* devices with V = 0 are not translated */
>> +if (!amdvi_get_dte(s, devid, entry)) {
>> +goto out;
>> +}
>> +
>> +amdvi_page_walk(as, entry, ret,
>> +is_write ? AMDVI_PERM_WRITE : AMDVI_PERM_READ, addr);
>> +
>> +amdvi_update_iotlb(s, as->devfn, addr, ret->translated_addr,
>> +   ret->perm, entry[1] & AMDVI_DEV_DOMID_ID_MASK);
>> +return;
>> +
>> +out:
>> +ret->iova = addr & AMDVI_PAGE_MASK_4K;
>> +ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
>> +ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
>> +ret->perm = IOMMU_RW;
>> +}
>> +
>> +static inline bool amdvi_is_interrupt_addr(hwaddr addr)
>> +{
>> +return addr >= AMDVI_INT_ADDR_FIRST && addr <= AMDVI_INT_ADDR_LAST;
>> +}
>> +
>> +static IOMMUTLBEntry amdvi_translate(MemoryRegion *iommu, hwaddr addr,
>> + bool is_write)
>> +{
>> +AMDVI_DPRINTF(GENERAL, "");
>
> Not a very helpful instrumentation, I would say.

It was helpful in the initial stages of development, but it is not very
helpful now. I can get rid of such traces.

>
>> +
>> +AMDVIAddressSpace *as = container_of(iommu, AMDVIAddressSpace, iommu);
>> +AMDVIState *s = as->iommu_state;
>> +IOMMUTLBEntry ret = {
>> +.target_as = &address_space_memory,
>> +.iova = addr,
>> +.translated_addr = 0,
>> +.addr_mask = ~(hwaddr)0,
>> +.perm = IOMMU_NONE
>> +};
>> +
>> +if (!s->enabled || amdvi_is_interrupt_addr(addr)) {
>> +/* AMDVI disabled - corresponds to iommu=off not
>> + * failure to provide any parameter
>> + */
>> +ret.iova = addr & AMDVI_PAGE_MASK_4K;
>> +ret.translated_addr = addr & AMDVI_PAGE_MASK_4K;
>> +ret.addr_mask = ~AMDVI_PAGE_MASK_4K;
>> +ret.perm = IOMMU_RW;
>> +return ret;
>> +}
>> +
>> +amdvi_do_translate(as, addr, is_write, &ret);
>> +AMDVI_DPRINTF(MMU, "devid: %02x:%02x.%x gpa 0x%"PRIx64 " hpa 0x%"PRIx64,
>> +  as->bus_num, PCI_SLOT(as->devfn), PCI_FUNC(as->devfn), 
>> addr,
>> +  ret.translated_addr);
>
> Tracing permission here in addition would be good.
>
> Jan
>



[Qemu-devel] [V10 4/4] hw/pci-host: Emulate AMD IOMMU

2016-05-09 Thread David Kiarie
Add AMD IOMMU emulation support to q35 chipset

Signed-off-by: David Kiarie 
---
 hw/pci-host/q35.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 70f897e..26fea0e 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -32,6 +32,7 @@
 #include "hw/pci-host/q35.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"
 
 /
  * Q35 host
@@ -448,6 +449,19 @@ static void mch_init_dmar(MCHPCIState *mch)
 pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
 }
 
+static void mch_init_amdvi(MCHPCIState *mch)
+{
+AMDVIState *iommu_state;
+PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
+PCIDevice *iommu;
+
+iommu = pci_create_simple(bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+
+pci_setup_iommu(bus, bridge_host_amdvi, iommu_state);
+}
+
 static void mch_realize(PCIDevice *d, Error **errp)
 {
 int i;
@@ -506,9 +520,14 @@ static void mch_realize(PCIDevice *d, Error **errp)
  mch->pci_address_space, &mch->pam_regions[i+1],
  PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
 }
-/* Intel IOMMU (VT-d) */
-if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
-mch_init_dmar(mch);
+
+MachineState *machine = MACHINE(qdev_get_machine());
+if (machine->iommu) {
+if (machine->iommu_type == TYPE_AMD) {
+mch_init_amdvi(mch);
+} else {
+mch_init_dmar(mch);
+}
 }
 }
 
-- 
2.1.4




[Qemu-devel] [V10 1/4] hw/i386: Introduce AMD IOMMU

2016-05-09 Thread David Kiarie
Add AMD IOMMU emulation to Qemu in addition to Intel IOMMU.
The IOMMU does basic translation, error checking and has a
minimal IOTLB implementation. This IOMMU bypasses the need
for target aborts by responding with IOMMU_NONE access rights
and exempts the region 0xfee0-0xfeef from translation
as it is the q35 interrupt region. We also advertise features
that are not yet implemented to please the Linux IOMMU driver.

The IOTLB aims at implementing the commands found on real IOMMUs, which
is essential for debugging; it may not offer any performance
benefits.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1405 +
 hw/i386/amd_iommu.h   |  340 
 include/hw/pci/pci.h  |2 +
 4 files changed, 1748 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index b52d5b8..2f1a265 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 000..cd136ee
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1405 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "qemu/osdep.h"
+#include "hw/i386/amd_iommu.h"
+
+//#define DEBUG_AMD_AMDVI
+#ifdef DEBUG_AMD_AMDVI
+enum {
+DEBUG_GENERAL, DEBUG_CAPAB, DEBUG_MMIO, DEBUG_ELOG,
+DEBUG_CACHE, DEBUG_COMMAND, DEBUG_MMU, DEBUG_CUSTOM
+};
+
+#define AMDVI_DBGBIT(x)   (1 << DEBUG_##x)
+static int iommu_dbgflags = AMDVI_DBGBIT(MMU);
+
+#define AMDVI_DPRINTF(what, fmt, ...) do { \
+if (iommu_dbgflags & AMDVI_DBGBIT(what)) { \
+fprintf(stderr, "(amd-iommu)%s: " fmt "\n", __func__, \
+## __VA_ARGS__); } \
+} while (0)
+#else
+#define AMDVI_DPRINTF(what, fmt, ...) do {} while (0)
+#endif
+
+#define ENCODE_EVENT(devid, info, addr, rshift) do { \
+*(uint16_t *)&evt[0] = devid; \
+*(uint8_t *)&evt[3]  = info;  \
+*(uint64_t *)&evt[4] = rshift ? cpu_to_le64(addr) :\
+   cpu_to_le64(addr) >> rshift; \
+} while (0)
+
+typedef struct AMDVIAddressSpace {
+uint8_t bus_num;/* bus number   */
+uint8_t devfn;  /* device function  */
+AMDVIState *iommu_state; /* AMDVI - one per machine  */
+MemoryRegion iommu; /* Device's iommu region*/
+AddressSpace as;/* device's corresponding address space */
+} AMDVIAddressSpace;
+
+/* AMDVI cache entry */
+typedef struct AMDVIIOTLBEntry {
+uint64_t gfn;
+uint16_t domid;
+uint64_t devid;
+uint64_t perms;
+uint64_t translated_addr;
+} AMDVIIOTLBEntry;
+
+/* configure MMIO registers at startup/reset */
+static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
+   uint64_t romask, uint64_t w1cmask)
+{
+stq_le_p(&s->mmior[addr], val);
+stq_le_p(&s->romask[addr], romask);
+stq_le_p(&s->w1cmask[addr], w1cmask);
+}
+
+static uint16_t amdvi_readw(AMDVIState *s, hwaddr addr)
+{
+return lduw_le_p(&s->mmior[addr]);
+}
+
+static uint32_t amdvi_readl(AMDVIState *s, hwaddr addr)
+{
+return ldl_le_p(&s->mmior[addr]);
+}
+
+static uint64_t amdvi_readq(AMDVIState *s, hwaddr addr)
+{
+return ldq_le_p(&s->mmior[addr]);
+}
+
+/* internal write */
+static void amdvi_writeq_raw(AMDVIState *s, uint64_t val, hwaddr addr)
+{
+stq_le_p(&s->mmior[addr], val);
+}
+
+/* external write */
+static void amdvi_writew(AMDVIState *s, hwaddr addr, uint16_t val)
+{
+uint16_t romask = lduw_le_p(&s->romask[addr]);
+uint16_t w1cmask = lduw_le_p(&s->w1cmask[addr]);
+uint16_t oldval = lduw_le_p(&s->mmior[addr]);
+stw_le_p(&s->mmior[add

[Qemu-devel] [V10 2/4] hw/i386: ACPI IVRS table

2016-05-09 Thread David Kiarie
Add IVRS table for AMD IOMMU. Generate IVRS or DMAR
depending on emulated IOMMU.

Signed-off-by: David Kiarie 
---
 hw/acpi/aml-build.c |  2 +-
 hw/i386/acpi-build.c| 93 +++--
 include/hw/acpi/acpi-defs.h | 13 +++
 include/hw/acpi/aml-build.h |  1 +
 include/hw/boards.h |  6 +++
 5 files changed, 103 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ab89ca6..da11bf8 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -227,7 +227,7 @@ static void build_extop_package(GArray *package, uint8_t op)
 build_prepend_byte(package, 0x5B); /* ExtOpPrefix */
 }
 
-static void build_append_int_noprefix(GArray *table, uint64_t value, int size)
+void build_append_int_noprefix(GArray *table, uint64_t value, int size)
 {
 int i;
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6477003..59849e4 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -52,6 +52,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci-host/q35.h"
 #include "hw/i386/intel_iommu.h"
+#include "hw/i386/amd_iommu.h"
 #include "hw/timer/hpet.h"
 
 #include "hw/acpi/aml-build.h"
@@ -59,6 +60,8 @@
 #include "qapi/qmp/qint.h"
 #include "qom/qom-qobject.h"
 
+#include "hw/boards.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -2598,6 +2601,77 @@ build_dmar_q35(GArray *table_data, GArray *linker)
  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
 }
 
+static void
+build_amd_iommu(GArray *table_data, GArray *linker)
+{
+int iommu_start = table_data->len;
+bool iommu_ambig;
+
+/* IVRS definition  - table header has an extra 2-byte field */
+acpi_data_push(table_data, (sizeof(AcpiTableHeader)));
+/* common virtualization information */
+build_append_int_noprefix(table_data, AMD_IOMMU_HOST_ADDRESS_WIDTH << 8, 4);
+/* reserved */
+build_append_int_noprefix(table_data, 0, 8);
+
+AMDVIState *s = (AMDVIState *)object_resolve_path_type("",
+TYPE_AMD_IOMMU_DEVICE, &iommu_ambig);
+
+/* IVDB definition - type 10h */
+if (!iommu_ambig) {
+/* IVHD definition - type 10h */
+build_append_int_noprefix(table_data, 0x10, 1);
+/* virtualization flags */
+build_append_int_noprefix(table_data, (IVHD_HT_TUNEN |
+ IVHD_PPRSUP | IVHD_IOTLBSUP | IVHD_PREFSUP), 1);
+/* ivhd length */
+build_append_int_noprefix(table_data, 0x20, 2);
+/* iommu device id */
+build_append_int_noprefix(table_data, PCI_DEVICE_ID_RD890_IOMMU, 2);
+/* offset of capability registers */
+build_append_int_noprefix(table_data, s->capab_offset, 2);
+/* mmio base register */
+build_append_int_noprefix(table_data, s->mmio.addr, 8);
+/* pci segment */
+build_append_int_noprefix(table_data, 0, 2);
+/* interrupt numbers */
+build_append_int_noprefix(table_data, 0, 2);
+/* feature reporting */
+build_append_int_noprefix(table_data, (IVHD_EFR_GTSUP |
+IVHD_EFR_HATS | IVHD_EFR_GATS), 4);
+/* Add device flags here
+ *   These are 4-byte device entries currently reporting the range of
+ *   devices 00h - h; all devices
+ *   Device setting affecting all devices should be made here
+ *
+ *   Refer to
+ *   (http://developer.amd.com/wordpress/media/2012/10/488821.pdf)
+ *   Table 95
+ */
+/* start of device range, 4-byte entries */
+build_append_int_noprefix(table_data, 0x0003, 4);
+/* end of device range */
+build_append_int_noprefix(table_data, 0x0004, 4);
+}
+
+build_header(linker, table_data, (void *)(table_data->data + iommu_start),
+ "IVRS", table_data->len - iommu_start, 1, NULL, NULL);
+}
+
+static IommuType has_iommu(void)
+{
+bool ambiguous;
+
+if (object_resolve_path_type("", TYPE_AMD_IOMMU_DEVICE, &ambiguous)
+&& !ambiguous)
+return TYPE_AMD;
+else if (object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE, &ambiguous)
+&& !ambiguous)
+return TYPE_INTEL;
+else
+return TYPE_NONE;
+}
+
 static GArray *
 build_rsdp(GArray *rsdp_table, GArray *linker, unsigned rsdt)
 {
@@ -2656,16 +2730,6 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 return true;
 }
 
-static bool acpi_has_iommu(void)
-{
-bool ambiguous;
-Object *intel_iommu;
-
-intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
-   

[Qemu-devel] [V10 3/4] hw/core: provision for overriding emulated IOMMU

2016-05-09 Thread David Kiarie
Add an enum, subject to review, to the machine properties, which
is used to override the emulated IOMMU from Intel to AMD.

Signed-off-by: David Kiarie 
---
 hw/core/machine.c | 29 ++---
 include/hw/boards.h   |  1 +
 include/hw/i386/intel_iommu.h |  1 +
 qemu-options.hx   |  7 +--
 util/qemu-config.c|  8 ++--
 5 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6dbbc85..fe44e25 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -15,6 +15,8 @@
 #include "qapi/error.h"
 #include "qapi-visit.h"
 #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"
+#include "hw/i386/intel_iommu.h"
 #include "hw/sysbus.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
@@ -297,9 +299,26 @@ static void machine_set_iommu(Object *obj, bool value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
+ms->iommu_type = TYPE_INTEL;
 ms->iommu = value;
 }
 
+static void machine_set_iommu_override(Object *obj, const char *value,
+   Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+/* ensure a valid iommu type */
+if (g_strcmp0(value, AMD_IOMMU_STR) == 0) {
+ms->iommu_type = TYPE_AMD;
+} else if (g_strcmp0(value, INTEL_IOMMU_STR) == 0) {
+ms->iommu_type = TYPE_INTEL;
+} else {
+error_setg(errp, "Invalid IOMMU type %s", value);
+return;
+}
+}
+
 static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
@@ -473,10 +492,14 @@ static void machine_initfn(Object *obj)
 "Firmware image",
 NULL);
 object_property_add_bool(obj, "iommu",
- machine_get_iommu,
- machine_set_iommu, NULL);
+ machine_get_iommu, machine_set_iommu, NULL);
 object_property_set_description(obj, "iommu",
-"Set on/off to enable/disable Intel IOMMU (VT-d)",
+"Set on to enable IOMMU emulation",
+NULL);
+object_property_add_str(obj, "x-iommu-type",
+NULL, machine_set_iommu_override, NULL);
+object_property_set_description(obj, "x-iommu-type",
+"Set on to override emulated IOMMU to AMD IOMMU",
 NULL);
 object_property_add_bool(obj, "suppress-vmdesc",
  machine_get_suppress_vmdesc,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index dbe6745..5b7eeda 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -158,6 +158,7 @@ struct MachineState {
 bool igd_gfx_passthru;
 char *firmware;
 bool iommu;
+IommuType iommu_type;
 bool suppress_vmdesc;
 bool enforce_config_section;
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index b024ffa..539530c 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -27,6 +27,7 @@
 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
 #define INTEL_IOMMU_DEVICE(obj) \
  OBJECT_CHECK(IntelIOMMUState, (obj), TYPE_INTEL_IOMMU_DEVICE)
+#define INTEL_IOMMU_STR "intel"
 
 /* DMAR Hardware Unit Definition address (IOMMU unit) */
 #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed9ULL
diff --git a/qemu-options.hx b/qemu-options.hx
index 6106520..81217d3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,7 +38,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "kvm_shadow_mem=size of KVM shadow MMU\n"
 "dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
 "mem-merge=on|off controls memory merge support (default: on)\n"
-"iommu=on|off controls emulated Intel IOMMU (VT-d) support (default=off)\n"
+"iommu=on|off controls emulated IOMMU support(default: off)\n"
+"x-iommu-type=amd|intel overrides emulated IOMMU to AMD IOMMU (default: intel)\n"
 "igd-passthru=on|off controls IGD GFX passthrough support (default=off)\n"
 "aes-key-wrap=on|off controls support for AES key wrapping (default=on)\n"
 "dea-key-wrap=on|off controls support for DEA key wrapping (default=on)\n"
@@ -74,7 +75,9 @@ Enables or disables memory merge support. This feature, when supported by
 the host, de-duplicates identical memory pages among VMs instances
 (enable

[Qemu-devel] [V10 0/4] AMD IOMMU

2016-05-09 Thread David Kiarie
Hi all,

This patch series adds basic AMD IOMMU emulation support to Qemu. It's
currently in its 10th version. Changes since V9 include

   -amd_iommu prefixes have been renamed to a shorter 'amdvi' both in the macros
and in the functions/code. The register macros have not been moved to the
implementation file since almost all of the macros there are basically macros
and I reckoned renaming them should suffice.
   -Taken care of byte order in the use of 'dma_memory_read'[Michael]
   -Taken care of invalid DTE entries, which is still subject to discussion. I
will make any necessary changes based on the discussion outcome.[Jan]
   -An issue with the emulated IOMMU defaulting to AMD_IOMMU has been fixed[Marcel]

You can test these patches by starting with parameters
qemu-system-x86_64 -M q35,iommu=on,x-iommu-type=amd -m 2G -enable-kvm -smp 4 -cpu host -hda file.img -soundhw ac97
emulating whatever devices you want.

Not passing any command line parameters to linux should be enough to test these
patches since the devices are basically passed through, but to the 'host'
(L1 guest). You can still go ahead and pass the command line parameters
'iommu=pt iommu=1' and try to pass a device to an L2 guest. This can also be
done without passing any iommu related parameters to the kernel.

David Kiarie (4):
  hw/i386: Introduce AMD IOMMU
  hw/i386: ACPI IVRS table
  hw/core: provision for overriding emulated IOMMU
  hw/pci-host: Emulate AMD IOMMU

 hw/acpi/aml-build.c   |2 +-
 hw/core/machine.c |   29 +-
 hw/i386/Makefile.objs |1 +
 hw/i386/acpi-build.c  |   93 ++-
 hw/i386/amd_iommu.c   | 1405 +
 hw/i386/amd_iommu.h   |  340 ++
 hw/pci-host/q35.c |   25 +-
 include/hw/acpi/acpi-defs.h   |   13 +
 include/hw/acpi/aml-build.h   |1 +
 include/hw/boards.h   |7 +
 include/hw/i386/intel_iommu.h |1 +
 include/hw/pci/pci.h  |2 +
 qemu-options.hx   |7 +-
 util/qemu-config.c|8 +-
 14 files changed, 1912 insertions(+), 22 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

-- 
2.1.4




Re: [Qemu-devel] [V9 0/4] AMD IOMMU

2016-05-05 Thread David Kiarie
On Wed, May 4, 2016 at 2:05 PM, Valentine Sinitsyn
 wrote:
> On 04.05.2016 16:02, David Kiarie wrote:
>>
>>
>>
>> On 04/05/16 13:58, Valentine Sinitsyn wrote:
>>>
>>> On 04.05.2016 15:51, David Kiarie wrote:
>>>>
>>>> On Wed, May 4, 2016 at 10:39 AM, Valentine Sinitsyn
>>>>  wrote:
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> On 04.05.2016 12:05, David Kiarie wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, May 4, 2016 at 9:12 AM, Jan Kiszka  wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2016-04-30 00:42, David Kiarie wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> These series adds AMD IOMMU support to Qemu. It's currently in
>>>>>>>> the 9th
>>>>>>>> version.
>>>>>>>>
>>>>>>>> In this series I have (hopefully) addressed all the comments made
>>>>>>>> in the
>>>>>>>> previous version.
>>>>>>>> I have also tested and successfully passed-through PCI device 'ac97'
>>>>>>>> with more devices to be tested.
>>>>>>>>
>>>>>>>
>>>>>>> I've done some basic testing with a Jailhouse setup and found it
>>>>>>> working. The ACPI table is now properly parsed and the DMA
>>>>>>> remapping was
>>>>>>> not disturbing the system after Jailhouse was activated.
>>>>>>>
>>>>>>> However, it was also still not intervening after I started to corrupt
>>>>>>> the configuration, removed DMA target properties from most of the
>>>>>>> RAM or
>>>>>>> dropped PCI devices.
>>>>>
>>>>>
>>>>> Please also remember that unlisted devices go without translation.
>>>>> To "mute"
>>>>> the device, set V, TV, the DomainId, and zero everything else in the
>>>>> DTE.
>>>>>
>>>>>>
>>>>>> This means you're invalidating DTEs ?
>>>>>>
>>>>>>>
>>>>>>> You are not dropping invalid remapping requests, are you?
>>>>>>> According to
>>>>>>> the logs, you are detecting them at least:
>>>>>>>
>>>>>>> (amd-iommu)amd_iommu_get_dte: Device Table at 0x3b0d4000
>>>>>>> (amd-iommu)amd_iommu_get_dte: Pte entry at 0x0 is invalid
>>>>>>> (amd-iommu)amd_iommu_translate: devid: 00:02.0 gpa 0x32f39480 hpa
>>>>>>> 0x32f39000
>>>>>>>
>>>>>>> It's a bit hard to test right now if remapping is actually properly
>>>>>>> working in all important cases if you do not reject invalid ones.
>>>>>
>>>>>
>>>>> My understanding is that you should generate an IO_PAGE_FAULT event
>>>>> and drop
>>>>> the request. This doesn't apply to ATS, which is a bit trickier, but we
>>>>> don't address ATS in this patch series anyway, do we?
>>>>
>>>>
>>>> My next question is what you mean by 'reject' and 'drop'. If I
>>>> encounter an invalid PTE/DTE I don't translate the gpa, it just becomes
>>>> the hpa which is what is happening above.
>>>
>>> What happens if you just ignore the request? I mean, what if you don't
>>> forward it to anywhere else in QEMU, just log this event and return?
>>
>>
>> Am guessing this should have something to do with pci abort which, last
>> time I tried, wasn't aborting at request. I will look at it again.
>
> My initial answer was also "do target abort". But then I did a quick look
> over the spec, and found no such requirement. Please read relevant parts
> thoroughly yourself, and maybe experiment with "just ignore"/"explicitly
> abort" options.

Qemu doesn't seem to be honouring target aborts. The PCI device
represented by IOMMU seems like just a link to allow communication
with the OS/System Software.

In place of target aborts I am going to instead populate the struct
(IOMMUTLBEntry) like below

  IOMMUTLBEntry ret = {
      .target_as = &address_space_memory,
      .iova = addr,
      .translated_addr = 0,
      .addr_mask = ~(hwaddr)0,
      .perm = IOMMU_NONE,
  };

Functionally speaking, this doesn't seem very different from a target
abort because with a target abort, the bus is expected to complete the
translation, which should come down to the same thing (with the latter
being more complicated).

On the other hand, I have yet to confirm from the spec that the
particular reported case warrants a target abort, but cases such as
page faults should be target aborted (which is not currently the case,
so they should be fixed).
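
To make the idea concrete, here is a rough standalone sketch of the
"deny by default" approach. It is not the patch code: the Sketch* types,
the SK_* constants and sketch_lookup() are simplified stand-ins that I am
assuming purely for illustration (QEMU's real IOMMUTLBEntry also carries
a target address space, and the real lookup walks the device table and
page tables). The entry starts out with no permissions and is only
upgraded when the lookup succeeds:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's access flags and IOMMUTLBEntry. */
typedef enum {
    SK_IOMMU_NONE = 0,
    SK_IOMMU_RO   = 1,
    SK_IOMMU_WO   = 2,
    SK_IOMMU_RW   = 3
} SketchPerm;

typedef struct {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    SketchPerm perm;
} SketchTLBEntry;

#define SK_PAGE_MASK_4K (~0xfffULL)

/* Hypothetical lookup: pretend only the 4K page at IOVA 0x1000 is mapped. */
static bool sketch_lookup(uint64_t iova, uint64_t *hpa)
{
    if ((iova & SK_PAGE_MASK_4K) == 0x1000) {
        *hpa = 0x80000000ULL;
        return true;
    }
    return false;
}

static SketchTLBEntry sketch_translate(uint64_t addr)
{
    /* Default result: no access at all, covering the whole address range. */
    SketchTLBEntry ret = {
        .iova = addr,
        .translated_addr = 0,
        .addr_mask = ~(uint64_t)0,
        .perm = SK_IOMMU_NONE,
    };
    uint64_t hpa;

    /* Only a successful lookup turns the entry into a usable 4K mapping. */
    if (sketch_lookup(addr, &hpa)) {
        ret.iova = addr & SK_PAGE_MASK_4K;
        ret.translated_addr = hpa;
        ret.addr_mask = ~SK_PAGE_MASK_4K;
        ret.perm = SK_IOMMU_RW;
    }

    /* The translated base must not overlap the offset bits in the mask. */
    assert((ret.translated_addr & ret.addr_mask) == 0);
    return ret;
}
```

With this shape, an address that fails the lookup comes back with
SK_IOMMU_NONE, so the caller has nothing it is allowed to access, which is
the effect I am after in place of an explicit abort.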


>
> Valentine



Re: [Qemu-devel] [PATCH v6 08/26] intel_iommu: provide helper function vtd_get_iommu

2016-05-05 Thread David Kiarie
On Thu, May 5, 2016 at 6:25 AM, Peter Xu  wrote:
> Moves acpi_get_iommu() under VT-d to make it a public function.
>
> Signed-off-by: Peter Xu 
> ---
>  hw/i386/acpi-build.c  |  7 +--
>  hw/i386/intel_iommu.c | 13 +
>  include/hw/i386/intel_iommu.h |  2 ++
>  3 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 5d2d87b..b064bc2 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -2677,12 +2677,7 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
>
>  static bool acpi_has_iommu(void)
>  {
> -bool ambiguous;
> -Object *intel_iommu;
> -
> -intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
> -   &ambiguous);
> -return intel_iommu && !ambiguous;
> +return !!vtd_iommu_get();
>  }

This is not consistent with what we have in the AMD IOMMU patches but
I guess this could be easily fixed.
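
One possible way to reconcile the two, sketched standalone below. Every
name with an Sk/sk_ prefix is hypothetical: the two instance pointers
stand in for what object_resolve_path_type() would return, and
sk_amdvi_iommu_get() is an assumed AMD counterpart of vtd_iommu_get(),
not something that exists in either series. The type check from the AMD
patches would then be built on top of the two getters:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IntelIOMMUState and AMDVIState. */
typedef struct { int id; } SkIntelIOMMU;
typedef struct { int id; } SkAmdIOMMU;

typedef enum { SK_TYPE_NONE, SK_TYPE_INTEL, SK_TYPE_AMD } SkIommuType;

/* Stand-ins for the QOM lookup; NULL means "no such device". */
static SkIntelIOMMU *sk_intel_instance;
static SkAmdIOMMU *sk_amd_instance;

/* Mirrors the shape of vtd_iommu_get(): return the device or NULL. */
static SkIntelIOMMU *sk_vtd_iommu_get(void)
{
    return sk_intel_instance;
}

/* Assumed AMD counterpart with the same contract. */
static SkAmdIOMMU *sk_amdvi_iommu_get(void)
{
    return sk_amd_instance;
}

/* Report which emulated IOMMU, if any, is present. */
static SkIommuType sk_has_iommu(void)
{
    SkAmdIOMMU *amd = sk_amdvi_iommu_get();
    SkIntelIOMMU *intel = sk_vtd_iommu_get();

    /* This sketch assumes at most one emulated IOMMU per machine. */
    assert(!(amd && intel));

    if (amd) {
        return SK_TYPE_AMD;
    }
    if (intel) {
        return SK_TYPE_INTEL;
    }
    return SK_TYPE_NONE;
}
```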

>
>  static
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 4d14124..a44289f 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2001,6 +2001,19 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, 
> PCIBus *bus, int devfn)
>  return vtd_dev_as;
>  }
>
> +IntelIOMMUState *vtd_iommu_get(void)
> +{
> +bool ambiguous = false;
> +Object *intel_iommu = NULL;
> +
> +intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
> + &ambiguous);
> +if (ambiguous)
> +intel_iommu = NULL;
> +
> +return (IntelIOMMUState *)intel_iommu;
> +}
> +
>  /* Do the initialization. It will also be called when reset, so pay
>   * attention when adding new initialization stuff.
>   */
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 4914fe6..9ee84f7 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -196,5 +196,7 @@ struct IntelIOMMUState {
>   * create a new one if none exists
>   */
>  VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
> +/* Get default IOMMU object */
> +IntelIOMMUState *vtd_iommu_get(void);
>
>  #endif
> --
> 2.4.11
>


