Re: [Qemu-devel] [PATCH RFC v3 07/14] memory: add section range info for IOMMU notifier

2017-01-12 Thread Jason Wang



On 2017年01月13日 11:06, Peter Xu wrote:

In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova is in the range of specific IOMMU notifier, and skip those
notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

Suggested-by: David Gibson 
Reviewed-by: David Gibson 
Acked-by: Paolo Bonzini 
Signed-off-by: Peter Xu 
---
  hw/vfio/common.c  | 7 ++-
  include/exec/memory.h | 3 +++
  memory.c  | 4 +++-
  3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 801578b..6f648da 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
  giommu->container = container;
  giommu->n.notify = vfio_iommu_map_notify;
  giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
+giommu->n.start = section->offset_within_region;
+llend = int128_add(int128_make64(giommu->n.start), section->size);
+llend = int128_sub(llend, int128_one());
+giommu->n.end = int128_get64(llend);
  QLIST_INSERT_HEAD(>giommu_list, giommu, giommu_next);
  
  memory_region_register_iommu_notifier(giommu->iommu, >n);

@@ -525,7 +529,8 @@ static void vfio_listener_region_del(MemoryListener 
*listener,
  VFIOGuestIOMMU *giommu;
  
  QLIST_FOREACH(giommu, >giommu_list, giommu_next) {

-if (giommu->iommu == section->mr) {
+if (giommu->iommu == section->mr &&
+giommu->n.start == section->offset_within_region) {
  memory_region_unregister_iommu_notifier(giommu->iommu,
  >n);
  QLIST_REMOVE(giommu, giommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index bec9756..7649e74 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -84,6 +84,9 @@ typedef enum {
  struct IOMMUNotifier {
  void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
  IOMMUNotifierFlag notifier_flags;
+/* Notify for address space range start <= addr <= end */
+hwaddr start;
+hwaddr end;
  QLIST_ENTRY(IOMMUNotifier) node;
  };
  typedef struct IOMMUNotifier IOMMUNotifier;
diff --git a/memory.c b/memory.c
index 2bfc37f..e88bb54 100644
--- a/memory.c
+++ b/memory.c
@@ -1671,7 +1671,9 @@ void memory_region_notify_iommu(MemoryRegion *mr,
  }
  
  QLIST_FOREACH(iommu_notifier, >iommu_notify, node) {

-if (iommu_notifier->notifier_flags & request_flags) {
+if (iommu_notifier->notifier_flags & request_flags &&
+iommu_notifier->start <= entry.iova &&
+iommu_notifier->end >= entry.iova) {
  iommu_notifier->notify(iommu_notifier, );
  }
  }


This seems breaks vhost device IOTLB. How about keep the the behavior 
somehow?


Thanks



Re: [Qemu-devel] [PATCH RFC v3 04/14] intel_iommu: fix trace for inv desc handling

2017-01-12 Thread Jason Wang



On 2017年01月13日 11:06, Peter Xu wrote:

VT-d codes are still using static DEBUG_INTEL_IOMMU macro. That's not
good, and we should end the day when we need to recompile the code
before getting useful debugging information for vt-d. Time to switch to
the trace system.

This is the first patch to do it.

Generally, the rule of mine is:

- for the old GENERAL typed message, I use error_report() directly if
   apply. Those are something shouldn't happen, and we should print those
   errors in all cases, even without enabling debug and tracing.


Looks like some were guest trigger-able. If yes, let's try not use 
error_report() for not being flooded.


Thanks



- for the non-GENERAL typed messages, remove those VTD_PRINTF()s that
   looks hardly used, and convert the rest lines into trace_*().

- for useless DPRINTFs, I removed them.





[Qemu-devel] [PATCH v8 07/10] vmxnet3: fix reference leak issue

2017-01-12 Thread Cao jin
On migration target, msix_vector_use() will be called in vmxnet3_post_load()
in second time, without a matching second call to msi_vector_unuse(),
which results in vector reference leak.

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Reviewed-by: Dmitry Fleytman 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/net/vmxnet3.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index a433cc017cb1..45e125e92c8a 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2552,21 +2552,11 @@ static void vmxnet3_put_rxq_descr(QEMUFile *f, void 
*pv, size_t size)
 static int vmxnet3_post_load(void *opaque, int version_id)
 {
 VMXNET3State *s = opaque;
-PCIDevice *d = PCI_DEVICE(s);
 
 net_tx_pkt_init(>tx_pkt, PCI_DEVICE(s),
 s->max_tx_frags, s->peer_has_vhdr);
 net_rx_pkt_init(>rx_pkt, s->peer_has_vhdr);
 
-if (s->msix_used) {
-if  (!vmxnet3_use_msix_vectors(s, VMXNET3_MAX_INTRS)) {
-VMW_WRPRN("Failed to re-use MSI-X vectors");
-msix_uninit(d, >msix_bar, >msix_bar);
-s->msix_used = false;
-return -1;
-}
-}
-
 vmxnet3_validate_queues(s);
 vmxnet3_validate_interrupts(s);
 
-- 
2.1.0






[Qemu-devel] [PATCH v8 08/10] vmxnet3: remove unnecessary internal msix flag

2017-01-12 Thread Cao jin
Internal flag msix_used is unnecessary, it has the same effect as
msix_enabled().

The corresponding msi flag is already dropped in commit 1070048e.

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Reviewed-by: Dmitry Fleytman 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/net/vmxnet3.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 45e125e92c8a..af39965c8cc2 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -281,8 +281,6 @@ typedef struct {
 Vmxnet3RxqDescr rxq_descr[VMXNET3_DEVICE_MAX_RX_QUEUES];
 Vmxnet3TxqDescr txq_descr[VMXNET3_DEVICE_MAX_TX_QUEUES];
 
-/* Whether MSI-X support was installed successfully */
-bool msix_used;
 hwaddr drv_shmem;
 hwaddr temp_shared_guest_driver_memory;
 
@@ -359,7 +357,7 @@ static bool _vmxnet3_assert_interrupt_line(VMXNET3State *s, 
uint32_t int_idx)
 {
 PCIDevice *d = PCI_DEVICE(s);
 
-if (s->msix_used && msix_enabled(d)) {
+if (msix_enabled(d)) {
 VMW_IRPRN("Sending MSI-X notification for vector %u", int_idx);
 msix_notify(d, int_idx);
 return false;
@@ -383,7 +381,7 @@ static void _vmxnet3_deassert_interrupt_line(VMXNET3State 
*s, int lidx)
  * This function should never be called for MSI(X) interrupts
  * because deassertion never required for message interrupts
  */
-assert(!s->msix_used || !msix_enabled(d));
+assert(!msix_enabled(d));
 /*
  * This function should never be called for MSI(X) interrupts
  * because deassertion never required for message interrupts
@@ -421,7 +419,7 @@ static void vmxnet3_trigger_interrupt(VMXNET3State *s, int 
lidx)
 s->interrupt_states[lidx].is_pending = true;
 vmxnet3_update_interrupt_line_state(s, lidx);
 
-if (s->msix_used && msix_enabled(d) && s->auto_int_masking) {
+if (msix_enabled(d) && s->auto_int_masking) {
 goto do_automask;
 }
 
@@ -1428,7 +1426,7 @@ static void vmxnet3_update_features(VMXNET3State *s)
 
 static bool vmxnet3_verify_intx(VMXNET3State *s, int intx)
 {
-return s->msix_used || msi_enabled(PCI_DEVICE(s))
+return msix_enabled(PCI_DEVICE(s)) || msi_enabled(PCI_DEVICE(s))
 || intx == pci_get_byte(s->parent_obj.config + PCI_INTERRUPT_PIN) - 1;
 }
 
@@ -1445,18 +1443,18 @@ static void vmxnet3_validate_interrupts(VMXNET3State *s)
 int i;
 
 VMW_CFPRN("Verifying event interrupt index (%d)", s->event_int_idx);
-vmxnet3_validate_interrupt_idx(s->msix_used, s->event_int_idx);
+vmxnet3_validate_interrupt_idx(msix_enabled(PCI_DEVICE(s)), 
s->event_int_idx);
 
 for (i = 0; i < s->txq_num; i++) {
 int idx = s->txq_descr[i].intr_idx;
 VMW_CFPRN("Verifying TX queue %d interrupt index (%d)", i, idx);
-vmxnet3_validate_interrupt_idx(s->msix_used, idx);
+vmxnet3_validate_interrupt_idx(msix_enabled(PCI_DEVICE(s)), idx);
 }
 
 for (i = 0; i < s->rxq_num; i++) {
 int idx = s->rxq_descr[i].intr_idx;
 VMW_CFPRN("Verifying RX queue %d interrupt index (%d)", i, idx);
-vmxnet3_validate_interrupt_idx(s->msix_used, idx);
+vmxnet3_validate_interrupt_idx(msix_enabled(PCI_DEVICE(s)), idx);
 }
 }
 
@@ -2185,6 +2183,7 @@ vmxnet3_use_msix_vectors(VMXNET3State *s, int num_vectors)
 static bool
 vmxnet3_init_msix(VMXNET3State *s)
 {
+bool msix;
 PCIDevice *d = PCI_DEVICE(s);
 int res = msix_init(d, VMXNET3_MAX_INTRS,
 >msix_bar,
@@ -2199,17 +2198,18 @@ vmxnet3_init_msix(VMXNET3State *s)
 
 if (0 > res) {
 VMW_WRPRN("Failed to initialize MSI-X, board's MSI support is broken");
-s->msix_used = false;
+msix = false;
 } else {
 if (!vmxnet3_use_msix_vectors(s, VMXNET3_MAX_INTRS)) {
 VMW_WRPRN("Failed to use MSI-X vectors, error %d", res);
 msix_uninit(d, >msix_bar, >msix_bar);
-s->msix_used = false;
+msix = false;
 } else {
-s->msix_used = true;
+msix = true;
 }
 }
-return s->msix_used;
+
+return msix;
 }
 
 static void
@@ -2217,7 +2217,7 @@ vmxnet3_cleanup_msix(VMXNET3State *s)
 {
 PCIDevice *d = PCI_DEVICE(s);
 
-if (s->msix_used) {
+if (msix_enabled(d)) {
 vmxnet3_unuse_msix_vectors(s, VMXNET3_MAX_INTRS);
 msix_uninit(d, >msix_bar, >msix_bar);
 }
-- 
2.1.0






[Qemu-devel] [PATCH v8 03/10] pci: Convert msix_init() to Error and fix callers to check it

2017-01-12 Thread Cao jin
msix_init() reports errors with error_report(), which is wrong when
it's used in realize().  The same issue was fixed for msi_init() in
commit 1108b2f.

For some devices(like e1000e, vmxnet3) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error object.

Bonus: add comment for msix_init.

CC: Jiri Pirko 
CC: Gerd Hoffmann 
CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Michael S. Tsirkin 
CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Alex Williamson 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 

Reviewed-by: Markus Armbruster 
Reviewed-by: Hannes Reinecke 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/block/nvme.c|  5 -
 hw/misc/ivshmem.c  |  8 
 hw/net/e1000e.c|  6 +-
 hw/net/rocker/rocker.c |  7 ++-
 hw/net/vmxnet3.c   |  8 ++--
 hw/pci/msix.c  | 34 +-
 hw/scsi/megasas.c  |  5 -
 hw/usb/hcd-xhci.c  | 13 -
 hw/vfio/pci.c  |  8 ++--
 hw/virtio/virtio-pci.c | 11 +--
 include/hw/pci/msix.h  |  5 +++--
 11 files changed, 80 insertions(+), 30 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d479fd22f573..2d703c8a712a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -831,6 +831,7 @@ static int nvme_init(PCIDevice *pci_dev)
 {
 NvmeCtrl *n = NVME(pci_dev);
 NvmeIdCtrl *id = >id_ctrl;
+Error *err = NULL;
 
 int i;
 int64_t bs_size;
@@ -872,7 +873,9 @@ static int nvme_init(PCIDevice *pci_dev)
 pci_register_bar(>parent_obj, 0,
 PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
 >iomem);
-msix_init_exclusive_bar(>parent_obj, n->num_queues, 4);
+if (msix_init_exclusive_bar(>parent_obj, n->num_queues, 4, )) {
+error_report_err(err);
+}
 
 id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
 id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index abeaf3da0800..70e71a597b9c 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -749,13 +749,13 @@ static void ivshmem_reset(DeviceState *d)
 }
 }
 
-static int ivshmem_setup_interrupts(IVShmemState *s)
+static int ivshmem_setup_interrupts(IVShmemState *s, Error **errp)
 {
 /* allocate QEMU callback data for receiving interrupts */
 s->msi_vectors = g_malloc0(s->vectors * sizeof(MSIVector));
 
 if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
-if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1)) {
+if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1, errp)) {
 return -1;
 }
 
@@ -897,8 +897,8 @@ static void ivshmem_common_realize(PCIDevice *dev, Error 
**errp)
 qemu_chr_fe_set_handlers(>server_chr, ivshmem_can_receive,
  ivshmem_read, NULL, s, NULL, true);
 
-if (ivshmem_setup_interrupts(s) < 0) {
-error_setg(errp, "failed to initialize interrupts");
+if (ivshmem_setup_interrupts(s, errp) < 0) {
+error_prepend(errp, "Failed to initialize interrupts: ");
 return;
 }
 }
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 4994e1ca0062..74cbbef30366 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -292,7 +292,11 @@ e1000e_init_msix(E1000EState *s)
 E1000E_MSIX_IDX, E1000E_MSIX_TABLE,
 >msix,
 E1000E_MSIX_IDX, E1000E_MSIX_PBA,
-0xA0);
+0xA0, NULL);
+
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error. Fall back to INTx silently on -ENOTSUP */
+assert(!res || res == -ENOTSUP);
 
 if (res < 0) {
 trace_e1000e_msix_init_fail(res);
diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c
index e9d215aa4df1..8f829f2946d8 100644
--- a/hw/net/rocker/rocker.c
+++ b/hw/net/rocker/rocker.c
@@ -1256,14 +1256,19 @@ static int rocker_msix_init(Rocker *r)
 {
 PCIDevice *dev = PCI_DEVICE(r);
 int err;
+Error *local_err = NULL;
 
 err = msix_init(dev, ROCKER_MSIX_VEC_COUNT(r->fp_ports),
 >msix_bar,
 ROCKER_PCI_MSIX_BAR_IDX, ROCKER_PCI_MSIX_TABLE_OFFSET,
 >msix_bar,
 ROCKER_PCI_MSIX_BAR_IDX, ROCKER_PCI_MSIX_PBA_OFFSET,
-0);
+0, _err);
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error. */
+assert(!err || err == -ENOTSUP);
 if (err) {
+error_report_err(local_err);
 return err;
 }
 
diff --git 

[Qemu-devel] [PATCH v8 10/10] megasas: remove unnecessary megasas_use_msix()

2017-01-12 Thread Cao jin
Also move certain hunk above, to place msix init related code together.

CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 

Signed-off-by: Cao jin 
---
 hw/scsi/megasas.c | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

msix_init() doesn't set the MSI-X enable bit, so use msix_enabled()
is not right here, restore the old check without the
megasas_use_msix() wrapper.

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index c208d520c4df..73eab7844ee3 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -155,11 +155,6 @@ static bool megasas_use_queue64(MegasasState *s)
 return s->flags & MEGASAS_MASK_USE_QUEUE64;
 }
 
-static bool megasas_use_msix(MegasasState *s)
-{
-return s->msix != ON_OFF_AUTO_OFF;
-}
-
 static bool megasas_is_jbod(MegasasState *s)
 {
 return s->flags & MEGASAS_MASK_USE_JBOD;
@@ -2305,9 +2300,7 @@ static void megasas_scsi_uninit(PCIDevice *d)
 {
 MegasasState *s = MEGASAS(d);
 
-if (megasas_use_msix(s)) {
-msix_uninit(d, >mmio_io, >mmio_io);
-}
+msix_uninit(d, >mmio_io, >mmio_io);
 msi_uninit(d);
 }
 
@@ -2358,7 +2351,7 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 
 memory_region_init_io(>mmio_io, OBJECT(s), _mmio_ops, s,
   "megasas-mmio", 0x4000);
-if (megasas_use_msix(s)) {
+if (s->msix != ON_OFF_AUTO_OFF) {
 ret = msix_init(dev, 15, >mmio_io, b->mmio_bar, 0x2000,
 >mmio_io, b->mmio_bar, 0x3800, 0x68, );
 /* Any error other than -ENOTSUP(board's MSI support is broken)
@@ -2378,6 +2371,10 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 error_free(err);
 }
 
+if (s->msix != ON_OFF_AUTO_OFF) {
+msix_vector_use(dev, 0);
+}
+
 memory_region_init_io(>port_io, OBJECT(s), _port_ops, s,
   "megasas-io", 256);
 memory_region_init_io(>queue_io, OBJECT(s), _queue_ops, s,
@@ -2393,10 +2390,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 pci_register_bar(dev, b->mmio_bar, bar_type, >mmio_io);
 pci_register_bar(dev, 3, bar_type, >queue_io);
 
-if (megasas_use_msix(s)) {
-msix_vector_use(dev, 0);
-}
-
 s->fw_state = MFI_FWSTATE_READY;
 if (!s->sas_addr) {
 s->sas_addr = ((NAA_LOCALLY_ASSIGNED_ID << 24) |
-- 
2.1.0






[Qemu-devel] [PATCH v8 04/10] megasas: change behaviour of msix switch

2017-01-12 Thread Cao jin
Resolve the TODO, msix=auto means msix on; if user specify msix=on,
then device creation fail on msix_init failure.
Also undo the overwrites of user configuration of msix.

CC: Michael S. Tsirkin 
CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 

Reviewed-by: Markus Armbruster 
Reviewed-by: Hannes Reinecke 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/scsi/megasas.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index a946424ab234..14d6e0c6d565 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2359,19 +2359,31 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 
 memory_region_init_io(>mmio_io, OBJECT(s), _mmio_ops, s,
   "megasas-mmio", 0x4000);
+if (megasas_use_msix(s)) {
+ret = msix_init(dev, 15, >mmio_io, b->mmio_bar, 0x2000,
+>mmio_io, b->mmio_bar, 0x3800, 0x68, );
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error */
+assert(!ret || ret == -ENOTSUP);
+if (ret && s->msix == ON_OFF_AUTO_ON) {
+/* Can't satisfy user's explicit msix=on request, fail */
+error_append_hint(, "You have to use msix=auto (default) or "
+"msix=off with this machine type.\n");
+/* No instance_finalize method, need to free the resource here */
+object_unref(OBJECT(>mmio_io));
+error_propagate(errp, err);
+return;
+}
+assert(!err || s->msix == ON_OFF_AUTO_AUTO);
+/* With msix=auto, we fall back to MSI off silently */
+error_free(err);
+}
+
 memory_region_init_io(>port_io, OBJECT(s), _port_ops, s,
   "megasas-io", 256);
 memory_region_init_io(>queue_io, OBJECT(s), _queue_ops, s,
   "megasas-queue", 0x4);
 
-if (megasas_use_msix(s) &&
-msix_init(dev, 15, >mmio_io, b->mmio_bar, 0x2000,
-  >mmio_io, b->mmio_bar, 0x3800, 0x68, )) {
-/*TODO: check msix_init's error, and should fail on msix=on */
-error_report_err(err);
-s->msix = ON_OFF_AUTO_OFF;
-}
-
 if (pci_is_express(dev)) {
 pcie_endpoint_cap_init(dev, 0xa0);
 }
-- 
2.1.0






[Qemu-devel] [PATCH v8 05/10] hcd-xhci: change behaviour of msix switch

2017-01-12 Thread Cao jin
Resolve the TODO, msix=auto means msix on; if user specify msix=on,
then device creation fail on msix_init failure.

CC: Gerd Hoffmann 
CC: Michael S. Tsirkin 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 

Reviewed-by: Gerd Hoffmann 
Reviewed-by: Markus Armbruster 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/usb/hcd-xhci.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 153b006ca01b..aaca57cb5f1f 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -3636,12 +3636,14 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 if (xhci->numintrs < 1) {
 xhci->numintrs = 1;
 }
+
 if (xhci->numslots > MAXSLOTS) {
 xhci->numslots = MAXSLOTS;
 }
 if (xhci->numslots < 1) {
 xhci->numslots = 1;
 }
+
 if (xhci_get_flag(xhci, XHCI_FLAG_ENABLE_STREAMS)) {
 xhci->max_pstreams_mask = 7; /* == 256 primary streams */
 } else {
@@ -3669,6 +3671,28 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 xhci->mfwrap_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, xhci_mfwrap_timer, 
xhci);
 
 memory_region_init(>mem, OBJECT(xhci), "xhci", LEN_REGS);
+if (xhci->msix != ON_OFF_AUTO_OFF) {
+ret = msix_init(dev, xhci->numintrs,
+>mem, 0, OFF_MSIX_TABLE,
+>mem, 0, OFF_MSIX_PBA,
+0x90, );
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error */
+assert(!ret || ret == -ENOTSUP);
+if (ret && xhci->msix == ON_OFF_AUTO_ON) {
+/* Can't satisfy user's explicit msix=on request, fail */
+error_append_hint(, "You have to use msix=auto (default) or "
+"msix=off with this machine type.\n");
+/* No instance_finalize method, need to free the resource here */
+object_unref(OBJECT(>mem));
+error_propagate(errp, err);
+return;
+}
+assert(!err || xhci->msix == ON_OFF_AUTO_AUTO);
+/* With msix=auto, we fall back to MSI off silently */
+error_free(err);
+}
+
 memory_region_init_io(>mem_cap, OBJECT(xhci), _cap_ops, xhci,
   "capabilities", LEN_CAP);
 memory_region_init_io(>mem_oper, OBJECT(xhci), _oper_ops, xhci,
@@ -3701,17 +3725,6 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 ret = pcie_endpoint_cap_init(dev, 0xa0);
 assert(ret >= 0);
 }
-
-if (xhci->msix != ON_OFF_AUTO_OFF) {
-/* TODO check for errors, and should fail when msix=on */
-ret = msix_init(dev, xhci->numintrs,
->mem, 0, OFF_MSIX_TABLE,
->mem, 0, OFF_MSIX_PBA,
-0x90, );
-if (ret) {
-error_report_err(err);
-}
-}
 }
 
 static void usb_xhci_exit(PCIDevice *dev)
-- 
2.1.0






[Qemu-devel] [PATCH v8 06/10] megasas: undo the overwrites of msi user configuration

2017-01-12 Thread Cao jin
Commit afea4e14 seems forgetting to undo the overwrites, which is
unsuitable.

CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Reviewed-by: Hannes Reinecke 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/scsi/megasas.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 14d6e0c6d565..c208d520c4df 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2350,11 +2350,10 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 "msi=off with this machine type.\n");
 error_propagate(errp, err);
 return;
-} else if (ret) {
-/* With msi=auto, we fall back to MSI off silently */
-s->msi = ON_OFF_AUTO_OFF;
-error_free(err);
 }
+assert(!err || s->msix == ON_OFF_AUTO_AUTO);
+/* With msi=auto, we fall back to MSI off silently */
+error_free(err);
 }
 
 memory_region_init_io(>mmio_io, OBJECT(s), _mmio_ops, s,
-- 
2.1.0






[Qemu-devel] [PATCH v8 09/10] msi_init: convert assert to return -errno

2017-01-12 Thread Cao jin
According to the disscussion:
http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg08215.html

Let leaf function returns reasonable -errno, let caller decide how to
handle the return value.

Suggested-by: Markus Armbruster 
CC: Markus Armbruster 
CC: Michael S. Tsirkin 
CC: Marcel Apfelbaum 

Reviewed-by: Markus Armbruster 
Reviewed-by: Marcel Apfelbaum 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/pci/msi.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index a87b2278a373..af3efbe525ce 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -201,9 +201,12 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
" 64bit %d mask %d\n",
offset, nr_vectors, msi64bit, msi_per_vector_mask);
 
-assert(!(nr_vectors & (nr_vectors - 1)));   /* power of 2 */
-assert(nr_vectors > 0);
-assert(nr_vectors <= PCI_MSI_VECTORS_MAX);
+/* vector sanity test: should in range 1 - 32, should be power of 2 */
+if (!is_power_of_2(nr_vectors) || nr_vectors > PCI_MSI_VECTORS_MAX) {
+error_setg(errp, "Invalid vector number: %d", nr_vectors);
+return -EINVAL;
+}
+
 /* the nr of MSI vectors is up to 32 */
 vectors_order = ctz32(nr_vectors);
 
-- 
2.1.0






[Qemu-devel] [PATCH v8 00/10] Convert msix_init() to error

2017-01-12 Thread Cao jin
Only a tiny modification in patch "megasas: remove unnecessary
megasas_use_msix()" to fix a megasas issue.

v8 changelog:
1. reorder: place the "megasas: remove unnecessary megasas_use_msix()"
   as the last one. and fix the bug in it, detailed description in it,
   also removed the R-b of it.
2. Add the Acked-by from Marcel for first 9 patches; add the R-b from Markus
   to "hcd-xhci: check & correct param before using it".

Test:
1. make check ok
2. command line test for all affected device, make sure their realization
   is ok.
3. detailed test for megasas, hcd-xhci, vmxnet3.
   megasas: install a linux distro is ok
   ./qemu-system-x86_64 --enable-kvm -m 1024
   -device megasas,id=scsi0,bus=pci.0
   -drive file=/xx/scsi-disk.img,if=none,id=drive-scsi0
   -device 
scsi-disk,bus=scsi0.0,channel=0,scsi-id=4,lun=0,drive=drive-scsi0,id=scsi0-4
   -cdrom /xx/Fedora-Server-DVD-x86_64-23.iso --monitor stdio

   hcd-xhci: partition the usbstick.img, mkfs, write to file, is ok
   ./qemu-system-x86_64 -M q35 -m 1024 --enable-kvm
   -drive if=none,id=usbstick,file=/xx/usbstick.img
   -device nec-usb-xhci,id=usb,p2=8,p3=8,bus=pcie.0
   -device usb-storage,bus=usb.0,drive=usbstick --monitor stdio
   /xx/FedoraWorkStatsion23-x86_64.img

   vmxnet3: ping another destination belongs to host's network is ok.
   But no migration test, because I don't have a spare machine for now.
   ./qemu-system-x86_64 -M q35 -m 1024 --enable-kvm
   -netdev tap,id=mynet0 -device vmxnet3,netdev=mynet0
   --monitor stdio /home/pino/vm/FedoraWorkStatsion23-x86_64.img

CC: Jiri Pirko 
CC: Gerd Hoffmann 
CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Michael S. Tsirkin 
CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Alex Williamson 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 

Cao jin (10):
  msix: Follow CODING_STYLE
  hcd-xhci: check & correct param before using it
  pci: Convert msix_init() to Error and fix callers to check it
  megasas: change behaviour of msix switch
  hcd-xhci: change behaviour of msix switch
  megasas: undo the overwrites of msi user configuration
  vmxnet3: fix reference leak issue
  vmxnet3: remove unnecessary internal msix flag
  msi_init: convert assert to return -errno
  megasas: remove unnecessary megasas_use_msix()

 hw/block/nvme.c|  5 +++-
 hw/misc/ivshmem.c  |  8 +++---
 hw/net/e1000e.c|  6 -
 hw/net/rocker/rocker.c |  7 -
 hw/net/vmxnet3.c   | 46 +++--
 hw/pci/msi.c   |  9 ---
 hw/pci/msix.c  | 42 +-
 hw/scsi/megasas.c  | 49 ---
 hw/usb/hcd-xhci.c  | 69 ++
 hw/vfio/pci.c  |  8 --
 hw/virtio/virtio-pci.c | 11 
 include/hw/pci/msix.h  |  5 ++--
 12 files changed, 164 insertions(+), 101 deletions(-)

-- 
2.1.0






[Qemu-devel] [PATCH v8 02/10] hcd-xhci: check & correct param before using it

2017-01-12 Thread Cao jin
usb_xhci_realize() corrects invalid values of property "intrs"
automatically, but the uncorrected value is passed to msi_init(),
which chokes on invalid values.  Delay that until after the
correction.

Resources allocated by usb_xhci_init() are leaked when msi_init()
fails.  Fix by calling it after msi_init().

CC: Gerd Hoffmann 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 
Reviewed-by: Markus Armbruster 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/usb/hcd-xhci.c | 37 ++---
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 4acf0c6dd8c0..0ace273da472 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -3627,25 +3627,6 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 dev->config[PCI_CACHE_LINE_SIZE] = 0x10;
 dev->config[0x60] = 0x30; /* release number */
 
-usb_xhci_init(xhci);
-
-if (xhci->msi != ON_OFF_AUTO_OFF) {
-ret = msi_init(dev, 0x70, xhci->numintrs, true, false, );
-/* Any error other than -ENOTSUP(board's MSI support is broken)
- * is a programming error */
-assert(!ret || ret == -ENOTSUP);
-if (ret && xhci->msi == ON_OFF_AUTO_ON) {
-/* Can't satisfy user's explicit msi=on request, fail */
-error_append_hint(, "You have to use msi=auto (default) or "
-"msi=off with this machine type.\n");
-error_propagate(errp, err);
-return;
-}
-assert(!err || xhci->msi == ON_OFF_AUTO_AUTO);
-/* With msi=auto, we fall back to MSI off silently */
-error_free(err);
-}
-
 if (xhci->numintrs > MAXINTRS) {
 xhci->numintrs = MAXINTRS;
 }
@@ -3667,6 +3648,24 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 xhci->max_pstreams_mask = 0;
 }
 
+if (xhci->msi != ON_OFF_AUTO_OFF) {
+ret = msi_init(dev, 0x70, xhci->numintrs, true, false, );
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error */
+assert(!ret || ret == -ENOTSUP);
+if (ret && xhci->msi == ON_OFF_AUTO_ON) {
+/* Can't satisfy user's explicit msi=on request, fail */
+error_append_hint(, "You have to use msi=auto (default) or "
+"msi=off with this machine type.\n");
+error_propagate(errp, err);
+return;
+}
+assert(!err || xhci->msi == ON_OFF_AUTO_AUTO);
+/* With msi=auto, we fall back to MSI off silently */
+error_free(err);
+}
+
+usb_xhci_init(xhci);
 xhci->mfwrap_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, xhci_mfwrap_timer, 
xhci);
 
 memory_region_init(>mem, OBJECT(xhci), "xhci", LEN_REGS);
-- 
2.1.0






[Qemu-devel] [PATCH v8 01/10] msix: Follow CODING_STYLE

2017-01-12 Thread Cao jin
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/pci/msix.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 0ec1cb14fc60..0cee631ecc55 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -447,8 +447,10 @@ void msix_notify(PCIDevice *dev, unsigned vector)
 {
 MSIMessage msg;
 
-if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
+if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector]) {
 return;
+}
+
 if (msix_is_masked(dev, vector)) {
 msix_set_pending(dev, vector);
 return;
@@ -483,8 +485,10 @@ void msix_reset(PCIDevice *dev)
 /* Mark vector as used. */
 int msix_vector_use(PCIDevice *dev, unsigned vector)
 {
-if (vector >= dev->msix_entries_nr)
+if (vector >= dev->msix_entries_nr) {
 return -EINVAL;
+}
+
 dev->msix_entry_used[vector]++;
 return 0;
 }
-- 
2.1.0






Re: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries tcg legacy kernel support

2017-01-12 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 1484288903-18807-1-git-send-email-sjitindarsi...@gmail.com
Subject: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries 
tcg legacy kernel support
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/1484288903-18807-1-git-send-email-sjitindarsi...@gmail.com -> 
patchew/1484288903-18807-1-git-send-email-sjitindarsi...@gmail.com
Switched to a new branch 'test'
fc9a657 target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation
3a6e0b1 target/ppc/mmu_hash64: Fix printing unsigned as signed int
0df1183 tcg/POWER9: NOOP the cp_abort instruction
8b5bf42 target/ppc/debug: Print LPCR register value if register exists
341c7e5 target/ppc/POWER9: Add cpu_has_work function for POWER9
398b7d5 target/ppc/POWER9: Add POWER9 pa-features definition
96a44df target/ppc/POWER9: Update to new pte format for POWER9 accesses
9190833 target/ppc/POWER9: Add POWER9 mmu fault handler
bb7fa61 target/ppc/POWER9: Remove SDR1 register
1191b9a target/ppc/POWER9: Add external partition table pointer to cpu state
8d19919 target/ppc/POWER9: Add partition table pointer to sPAPRMachineState
ffd1d31 target/ppc/POWER9: Direct all instr and data storage interrupts to the 
hypv
4c92646 target/ppc/POWER9: Adapt LPCR handling for POWER9
e1a9f12 target/ppc/POWER9: Add ISAv3.00 MMU definition
87144e8 target/ppc: Add pcr_supported to POWER9 cpu class definition
b9a19a8 hw/ppc/spapr: Add POWER9 to pseries cpu models
eb1b279 powerpc/cpu-models: rename ISAv3.00 logical PVR definition

=== OUTPUT BEGIN ===
Checking PATCH 1/17: powerpc/cpu-models: rename ISAv3.00 logical PVR 
definition...
Checking PATCH 2/17: hw/ppc/spapr: Add POWER9 to pseries cpu models...
Checking PATCH 3/17: target/ppc: Add pcr_supported to POWER9 cpu class 
definition...
ERROR: spaces required around that '-' (ctx:VxV)
#21: FILE: target/ppc/cpu.h:2253:
+PCR_COMPAT_3_00 = 1ull << (63-59),
  ^

total: 1 errors, 0 warnings, 15 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 4/17: target/ppc/POWER9: Add ISAv3.00 MMU definition...
Checking PATCH 5/17: target/ppc/POWER9: Adapt LPCR handling for POWER9...
Checking PATCH 6/17: target/ppc/POWER9: Direct all instr and data storage 
interrupts to the hypv...
Checking PATCH 7/17: target/ppc/POWER9: Add partition table pointer to 
sPAPRMachineState...
Checking PATCH 8/17: target/ppc/POWER9: Add external partition table pointer to 
cpu state...
Checking PATCH 9/17: target/ppc/POWER9: Remove SDR1 register...
ERROR: braces {} are necessary for all arms of this statement
#29: FILE: target/ppc/kvm.c:935:
+if (env->external_patbe)
[...]

ERROR: braces {} are necessary for all arms of this statement
#54: FILE: target/ppc/mmu-hash64.c:300:
+if (env->external_patbe)
[...]

ERROR: braces {} are necessary for all arms of this statement
#77: FILE: target/ppc/translate.c:6927:
+if (env->spr_cb[SPR_SDR1].name)
[...]

ERROR: space prohibited between function name and open parenthesis '('
#102: FILE: target/ppc/translate_init.c:727:
+static void gen_spr_ne_power9 (CPUPPCState *env)

ERROR: space prohibited between function name and open parenthesis '('
#113: FILE: target/ppc/translate_init.c:746:
+static void gen_spr_ne_601 (CPUPPCState *env)

ERROR: braces {} are necessary for all arms of this statement
#131: FILE: target/ppc/translate_init.c:8289:
+if (version >= BOOK3S_CPU_POWER9)
[...]
+else
[...]

total: 6 errors, 0 warnings, 96 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 10/17: target/ppc/POWER9: Add POWER9 mmu fault handler...
ERROR: Error messages should not contain newlines
#135: FILE: target/ppc/mmu_helper.c:2963:
+error_report("Guest Radix Support Unimplemented\n");

total: 1 errors, 0 warnings, 110 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 11/17: target/ppc/POWER9: Update to new pte format for POWER9 
accesses...
ERROR: spaces 

[Qemu-devel] [RFC PATCH 17/17] target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation

2017-01-12 Thread Suraj Jitindar Singh
We are calculating the authority mask register key value wrong.

The pte entry contains the key value with the two upper bits and the three
lower bits stored separately. We should use these two portions to get a 5
bit value, not or them together which will only give us a 3 bit value.

Fix this.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/mmu-hash64.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
index 73d7ce4..e8160c3 100644
--- a/target/ppc/mmu-hash64.h
+++ b/target/ppc/mmu-hash64.h
@@ -89,7 +89,7 @@ void ppc_hash64_update_rmls(CPUPPCState *env);
 #define HPTE64_R_C  0x0080ULL
 #define HPTE64_R_R  0x0100ULL
 #define HPTE64_R_KEY_LO 0x0e00ULL
-#define HPTE64_R_KEY(x) x) & HPTE64_R_KEY_HI) >> 60) | \
+#define HPTE64_R_KEY(x) x) & HPTE64_R_KEY_HI) >> 57) | \
  (((x) & HPTE64_R_KEY_LO) >> 9))
 
 #define HPTE64_V_1TB_SEG0x4000ULL
-- 
2.5.5




[Qemu-devel] [RFC PATCH 16/17] target/ppc/mmu_hash64: Fix printing unsigned as signed int

2017-01-12 Thread Suraj Jitindar Singh
We were printing an unsigned value as a signed value, fix this.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/mmu-hash64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 03607d5..08d2393 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -189,8 +189,8 @@ int ppc_store_slb(PowerPCCPU *cpu, target_ulong slot,
 slb->vsid = vsid;
 slb->sps = sps;
 
-LOG_SLB("%s: %d " TARGET_FMT_lx " - " TARGET_FMT_lx " => %016" PRIx64
-" %016" PRIx64 "\n", __func__, slot, esid, vsid,
+LOG_SLB("%s: " TARGET_FMT_lu " " TARGET_FMT_lx " - " TARGET_FMT_lx
+" => %016" PRIx64 " %016" PRIx64 "\n", __func__, slot, esid, vsid,
 slb->esid, slb->vsid);
 
 return 0;
-- 
2.5.5




[Qemu-devel] [RFC PATCH 15/17] tcg/POWER9: NOOP the cp_abort instruction

2017-01-12 Thread Suraj Jitindar Singh
The cp_abort instruction is used to remove the state of an in progress
copy paste sequence. POWER9 compilers add this in various places, such
as context switches which causes illegal instruction signals since we
don't yet implement this instruction.

Given there is no implementation of the copy paste facility and that we
don't claim to support it, we can just noop this instruction.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/translate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 5d63254..7527bd2 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6166,6 +6166,10 @@ GEN_TM_NOOP(tabortwci);
 GEN_TM_NOOP(tabortdc);
 GEN_TM_NOOP(tabortdci);
 GEN_TM_NOOP(tsr);
+static inline void gen_cp_abort(DisasContext *ctx)
+{
+// Do Nothing
+}
 
 static void gen_tcheck(DisasContext *ctx)
 {
@@ -6255,6 +6259,7 @@ GEN_HANDLER2(andi_, "andi.", 0x1C, 0xFF, 0xFF, 
0x, PPC_INTEGER),
 GEN_HANDLER2(andis_, "andis.", 0x1D, 0xFF, 0xFF, 0x, PPC_INTEGER),
 GEN_HANDLER(cntlzw, 0x1F, 0x1A, 0x00, 0x, PPC_INTEGER),
 GEN_HANDLER_E(cnttzw, 0x1F, 0x1A, 0x10, 0x, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(cp_abort, 0x1F, 0x06, 0x1A, 0x03FFF801, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER(or, 0x1F, 0x1C, 0x0D, 0x, PPC_INTEGER),
 GEN_HANDLER(xor, 0x1F, 0x1C, 0x09, 0x, PPC_INTEGER),
 GEN_HANDLER(ori, 0x18, 0xFF, 0xFF, 0x, PPC_INTEGER),
-- 
2.5.5




[Qemu-devel] [RFC PATCH 11/17] target/ppc/POWER9: Update to new pte format for POWER9 accesses

2017-01-12 Thread Suraj Jitindar Singh
The page table entry format was updated for the POWER9 processor.

It was decided that kernels would used the old format irrespective
with the translation occuring at the hypervisor level. Thus we convert
between the old and new format when accessing the ptes. Since we need the
whole pte to perform this conversion, we remove the old functions which
accessed either the first or second doubleword and introduce a new
functions which access the entire pte, returning the entry converted
back to the old format (if required). Update call sites accordingly.

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr_hcall.c| 51 ++-
 target/ppc/mmu-hash64.c | 13 +
 target/ppc/mmu-hash64.h | 71 -
 3 files changed, 86 insertions(+), 49 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 9a9bedf..9f0c20d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -125,7 +125,8 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 pte_index &= ~7ULL;
 token = ppc_hash64_start_access(cpu, pte_index);
 for (; index < 8; index++) {
-if (!(ppc_hash64_load_hpte0(cpu, token, index) & HPTE64_V_VALID)) {
+ppc_hash_pte64_t pte = ppc_hash64_load_hpte(cpu, token, index);
+if (!(pte.pte0 & HPTE64_V_VALID)) {
 break;
 }
 }
@@ -134,8 +135,10 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 return H_PTEG_FULL;
 }
 } else {
+ppc_hash_pte64_t pte;
 token = ppc_hash64_start_access(cpu, pte_index);
-if (ppc_hash64_load_hpte0(cpu, token, 0) & HPTE64_V_VALID) {
+pte = ppc_hash64_load_hpte(cpu, token, 0);
+if (pte.pte0 & HPTE64_V_VALID) {
 ppc_hash64_stop_access(cpu, token);
 return H_PTEG_FULL;
 }
@@ -163,26 +166,25 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 {
 CPUPPCState *env = >env;
 uint64_t token;
-target_ulong v, r;
+ppc_hash_pte64_t pte;
 
 if (!valid_pte_index(env, ptex)) {
 return REMOVE_PARM;
 }
 
 token = ppc_hash64_start_access(cpu, ptex);
-v = ppc_hash64_load_hpte0(cpu, token, 0);
-r = ppc_hash64_load_hpte1(cpu, token, 0);
+pte = ppc_hash64_load_hpte(cpu, token, 0);
 ppc_hash64_stop_access(cpu, token);
 
-if ((v & HPTE64_V_VALID) == 0 ||
-((flags & H_AVPN) && (v & ~0x7fULL) != avpn) ||
-((flags & H_ANDCOND) && (v & avpn) != 0)) {
+if ((pte.pte0 & HPTE64_V_VALID) == 0 ||
+((flags & H_AVPN) && (pte.pte0 & ~0x7fULL) != avpn) ||
+((flags & H_ANDCOND) && (pte.pte0 & avpn) != 0)) {
 return REMOVE_NOT_FOUND;
 }
-*vp = v;
-*rp = r;
+*vp = pte.pte0;
+*rp = pte.pte1;
 ppc_hash64_store_hpte(cpu, ptex, HPTE64_V_HPTE_DIRTY, 0);
-ppc_hash64_tlb_flush_hpte(cpu, ptex, v, r);
+ppc_hash64_tlb_flush_hpte(cpu, ptex, pte.pte0, pte.pte1);
 return REMOVE_SUCCESS;
 }
 
@@ -293,35 +295,36 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 target_ulong flags = args[0];
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
+ppc_hash_pte64_t pte;
 uint64_t token;
-target_ulong v, r;
 
 if (!valid_pte_index(env, pte_index)) {
 return H_PARAMETER;
 }
 
 token = ppc_hash64_start_access(cpu, pte_index);
-v = ppc_hash64_load_hpte0(cpu, token, 0);
-r = ppc_hash64_load_hpte1(cpu, token, 0);
+pte = ppc_hash64_load_hpte(cpu, token, 0);
 ppc_hash64_stop_access(cpu, token);
 
-if ((v & HPTE64_V_VALID) == 0 ||
-((flags & H_AVPN) && (v & ~0x7fULL) != avpn)) {
+if ((pte.pte0 & HPTE64_V_VALID) == 0 ||
+((flags & H_AVPN) && (pte.pte0 & ~0x7fULL) != avpn)) {
 return H_NOT_FOUND;
 }
 
-r &= ~(HPTE64_R_PP0 | HPTE64_R_PP | HPTE64_R_N |
-   HPTE64_R_KEY_HI | HPTE64_R_KEY_LO);
-r |= (flags << 55) & HPTE64_R_PP0;
-r |= (flags << 48) & HPTE64_R_KEY_HI;
-r |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
+pte.pte1 &= ~(HPTE64_R_PP0 | HPTE64_R_PP | HPTE64_R_N |
+  HPTE64_R_KEY_HI | HPTE64_R_KEY_LO);
+pte.pte1 |= (flags << 55) & HPTE64_R_PP0;
+pte.pte1 |= (flags << 48) & HPTE64_R_KEY_HI;
+pte.pte1 |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
 ppc_hash64_store_hpte(cpu, pte_index,
-  (v & ~HPTE64_V_VALID) | HPTE64_V_HPTE_DIRTY, 0);
-ppc_hash64_tlb_flush_hpte(cpu, pte_index, v, r);
+  (pte.pte0 & ~HPTE64_V_VALID) | HPTE64_V_HPTE_DIRTY,
+  0);
+ppc_hash64_tlb_flush_hpte(cpu, pte_index, pte.pte0, pte.pte1);
 /* Flush the tlb */
 check_tlb_flush(env, true);
 /* Don't need a memory barrier, due to qemu's global lock */
-

[Qemu-devel] [RFC PATCH 14/17] target/ppc/debug: Print LPCR register value if register exists

2017-01-12 Thread Suraj Jitindar Singh
It can be useful when debugging to print the LPCR value.

Thus we add the LPCR to the "info registers" output if the register had
been defined.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 521aed3..5d63254 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6910,6 +6910,9 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 }
 #endif
 
+if (env->spr_cb[SPR_LPCR].name)
+cpu_fprintf(f, " LPCR " TARGET_FMT_lx "\n", env->spr[SPR_LPCR]);
+
 switch (env->mmu_model) {
 case POWERPC_MMU_32B:
 case POWERPC_MMU_601:
-- 
2.5.5




[Qemu-devel] [RFC PATCH 10/17] target/ppc/POWER9: Add POWER9 mmu fault handler

2017-01-12 Thread Suraj Jitindar Singh
Add a new mmu fault handler for the POWER9 cpu and add it as the handler
for the POWER9 cpu definition.

This handler checks if the guest is radix or hash based on the value in the
partition table entry and calls the correct fault handler accordingly.

The hash fault handling code has also been updated to check if the
partition is using segment tables.

Currently only legacy hash (no segment tables) is supported.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/mmu-hash64.c |  8 
 target/ppc/mmu.h|  8 
 target/ppc/mmu_helper.c | 47 +
 target/ppc/translate_init.c |  2 +-
 4 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index b9d4f4e..b476b3f 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -73,6 +73,14 @@ static ppc_slb_t *slb_lookup(PowerPCCPU *cpu, target_ulong 
eaddr)
 }
 }
 
+/* Check if in-memory segment tables are in use */
+if (ppc64_use_proc_tbl(cpu)) {
+/* TODO - Unsupported */
+qemu_log_mask(LOG_UNIMP, "%s: unimplemented - segment table support\n",
+  __func__);
+/* Not much we can do here, caller will generate a segment interrupt */
+}
+
 return NULL;
 }
 
diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
index c7967c3..e07b128 100644
--- a/target/ppc/mmu.h
+++ b/target/ppc/mmu.h
@@ -3,6 +3,10 @@
 
 #ifndef CONFIG_USER_ONLY
 
+/* Common Partition Table Entry Fields */
+#define PATBE0_HR0x8000
+#define PATBE1_GR0x8000
+
 /* Partition Table Entry */
 struct patb_entry {
 uint64_t patbe0, patbe1;
@@ -11,6 +15,10 @@ struct patb_entry {
 #ifdef TARGET_PPC64
 
 void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp);
+bool ppc64_use_proc_tbl(PowerPCCPU *cpu);
+bool ppc64_radix_guest(PowerPCCPU *cpu);
+int ppc64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr, int rwx,
+   int mmu_idx);
 
 #endif /* TARGET_PPC64 */
 
diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index bc6c117..612f407 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -28,6 +28,7 @@
 #include "exec/cpu_ldst.h"
 #include "exec/log.h"
 #include "helper_regs.h"
+#include "qemu/error-report.h"
 #include "mmu.h"
 
 //#define DEBUG_MMU
@@ -1281,6 +1282,17 @@ void dump_mmu(FILE *f, fprintf_function cpu_fprintf, 
CPUPPCState *env)
 case POWERPC_MMU_2_07a:
 dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
 break;
+case POWERPC_MMU_3_00:
+if (ppc64_radix_guest(ppc_env_get_cpu(env))) {
+/* TODO - Unsupported */
+} else {
+if (ppc64_use_proc_tbl(ppc_env_get_cpu(env))) {
+/* TODO - Unsupported */
+} else {
+dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
+break;
+}
+}
 #endif
 default:
 qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
@@ -1422,6 +1434,17 @@ hwaddr ppc_cpu_get_phys_page_debug(CPUState *cs, vaddr 
addr)
 case POWERPC_MMU_2_07:
 case POWERPC_MMU_2_07a:
 return ppc_hash64_get_phys_page_debug(cpu, addr);
+case POWERPC_MMU_3_00:
+if (ppc64_radix_guest(ppc_env_get_cpu(env))) {
+/* TODO - Unsupported */
+} else {
+if (ppc64_use_proc_tbl(ppc_env_get_cpu(env))) {
+/* TODO - Unsupported */
+} else {
+return ppc_hash64_get_phys_page_debug(cpu, addr);
+}
+}
+break;
 #endif
 
 case POWERPC_MMU_32B:
@@ -2919,3 +2942,27 @@ void ppc64_set_external_patb(PowerPCCPU *cpu, void 
*patb, Error **errp)
 
 env->external_patbe = (struct patb_entry *) patb;
 }
+
+inline bool ppc64_use_proc_tbl(PowerPCCPU *cpu)
+{
+return !!(cpu->env.spr[SPR_LPCR] & LPCR_UPRT);
+}
+
+inline bool ppc64_radix_guest(PowerPCCPU *cpu)
+{
+struct patb_entry *patbe = cpu->env.external_patbe;
+
+return patbe && (patbe->patbe1 & PATBE1_GR);
+}
+
+int ppc64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr, int rwx,
+   int mmu_idx)
+{
+if (ppc64_radix_guest(cpu)) { /* Guest uses radix */
+/* TODO - Unsupported */
+error_report("Guest Radix Support Unimplemented\n");
+abort();
+} else { /* Guest uses hash */
+return ppc_hash64_handle_mmu_fault(cpu, eaddr, rwx, mmu_idx);
+}
+}
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index c771ba3..87297a7 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8850,7 +8850,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 (1ull << MSR_LE);
 pcc->mmu_model = POWERPC_MMU_3_00;
 #if defined(CONFIG_SOFTMMU)
-pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
+pcc->handle_mmu_fault = 

[Qemu-devel] [RFC PATCH 08/17] target/ppc/POWER9: Add external partition table pointer to cpu state

2017-01-12 Thread Suraj Jitindar Singh
Similarly to how we have an external hpt pointer in the cpu state, add
an external partition table pointer and update it to point to the
partition table entry in the machine state struct on cpu reset.

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr_cpu_core.c | 12 ++--
 target/ppc/cpu.h|  3 +++
 target/ppc/mmu.h|  6 ++
 target/ppc/mmu_helper.c | 12 
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 8cc7058..72a7f90 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -17,6 +17,7 @@
 #include "hw/ppc/ppc.h"
 #include "target/ppc/mmu-hash64.h"
 #include "sysemu/numa.h"
+#include "mmu.h"
 
 static void spapr_cpu_reset(void *opaque)
 {
@@ -34,8 +35,15 @@ static void spapr_cpu_reset(void *opaque)
 
 env->spr[SPR_HIOR] = 0;
 
-ppc_hash64_set_external_hpt(cpu, spapr->htab, spapr->htab_shift,
-_fatal);
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+ppc64_set_external_patb(cpu, spapr->patb, _fatal);
+default:
+/* We assume legacy until told otherwise, thus set HPT irrespective */
+ppc_hash64_set_external_hpt(cpu, spapr->htab, spapr->htab_shift,
+_fatal);
+break;
+}
 }
 
 static void spapr_cpu_destroy(PowerPCCPU *cpu)
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 0ab49b3..e8b7c06 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -77,6 +77,7 @@
 #include "exec/cpu-defs.h"
 #include "cpu-qom.h"
 #include "fpu/softfloat.h"
+#include "mmu.h"
 
 #if defined (TARGET_PPC64)
 #define PPC_ELF_MACHINE EM_PPC64
@@ -1009,6 +1010,8 @@ struct CPUPPCState {
 target_ulong sr[32];
 /* externally stored hash table */
 uint8_t *external_htab;
+/* externally stored partition table entry */
+struct patb_entry *external_patbe;
 /* BATs */
 uint32_t nb_BATs;
 target_ulong DBAT[2][8];
diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
index 67b9707..c7967c3 100644
--- a/target/ppc/mmu.h
+++ b/target/ppc/mmu.h
@@ -8,6 +8,12 @@ struct patb_entry {
 uint64_t patbe0, patbe1;
 };
 
+#ifdef TARGET_PPC64
+
+void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp);
+
+#endif /* TARGET_PPC64 */
+
 #endif /* CONFIG_USER_ONLY */
 
 #endif /* MMU_H */
diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index 2ab4562..bc6c117 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -28,6 +28,7 @@
 #include "exec/cpu_ldst.h"
 #include "exec/log.h"
 #include "helper_regs.h"
+#include "mmu.h"
 
 //#define DEBUG_MMU
 //#define DEBUG_BATS
@@ -2907,3 +2908,14 @@ void tlb_fill(CPUState *cs, target_ulong addr, 
MMUAccessType access_type,
retaddr);
 }
 }
+
+/**/
+
+/* ISA v3.00 (POWER9) Generic MMU Helpers */
+
+void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp)
+{
+CPUPPCState *env = >env;
+
+env->external_patbe = (struct patb_entry *) patb;
+}
-- 
2.5.5




[Qemu-devel] [RFC PATCH 12/17] target/ppc/POWER9: Add POWER9 pa-features definition

2017-01-12 Thread Suraj Jitindar Singh
Add a pa-features definition which includes all of the new fields which
have been added, note we don't claim support for any of these new features
at this stage.

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 45bd2de..35799da 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -357,6 +357,20 @@ static void spapr_populate_pa_features(CPUPPCState *env, 
void *fdt, int offset)
 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
 0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
+/* Currently we don't advertise any of the "new" ISAv3.00 functionality */
+uint8_t pa_features_300[] = { 64, 0,
+0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /*  0 -  5 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /*  6 - 11 */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 24 - 29 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 30 - 35 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 36 - 41 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 42 - 47 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 48 - 53 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 54 - 59 */
+0x00, 0x00, 0x00, 0x00   }; /* 60 - 63 */
+
 uint8_t *pa_features;
 size_t pa_size;
 
@@ -371,6 +385,10 @@ static void spapr_populate_pa_features(CPUPPCState *env, 
void *fdt, int offset)
 pa_features = pa_features_207;
 pa_size = sizeof(pa_features_207);
 break;
+case POWERPC_MMU_3_00:
+pa_features = pa_features_300;
+pa_size = sizeof(pa_features_300);
+break;
 default:
 return;
 }
-- 
2.5.5




[Qemu-devel] [RFC PATCH 05/17] target/ppc/POWER9: Adapt LPCR handling for POWER9

2017-01-12 Thread Suraj Jitindar Singh
The logical partitioning control register controls a threads operation
based on the partition it is currently executing. Add new definitions and
update the mask used when writing to the LPCR based on the POWER9 spec.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/cpu.h| 20 +++-
 target/ppc/mmu-hash64.c |  8 
 target/ppc/translate_init.c | 24 ++--
 3 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index afb7ddb..0ab49b3 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -379,15 +379,22 @@ struct ppc_slb_t {
 #define LPCR_ISL  (1ull << (63 - 2))
 #define LPCR_KBV  (1ull << (63 - 3))
 #define LPCR_DPFD_SHIFT   (63 - 11)
-#define LPCR_DPFD (0x3ull << LPCR_DPFD_SHIFT)
+#define LPCR_DPFD (0x7ull << LPCR_DPFD_SHIFT)
 #define LPCR_VRMASD_SHIFT (63 - 16)
 #define LPCR_VRMASD   (0x1full << LPCR_VRMASD_SHIFT)
+/* P9: Power-saving mode Exit Cause Enable (Upper Section) Mask */
+#define LPCR_PECE_U_SHIFT (63 - 19)
+#define LPCR_PECE_U_MASK  (0x7ull << LPCR_PECE_U_SHIFT)
+#define LPCR_HVEE (1ull << (63 - 17)) /* Hypervisor Virt Exit Enable */
 #define LPCR_RMLS_SHIFT   (63 - 37)
 #define LPCR_RMLS (0xfull << LPCR_RMLS_SHIFT)
 #define LPCR_ILE  (1ull << (63 - 38))
 #define LPCR_AIL_SHIFT(63 - 40)  /* Alternate interrupt location */
 #define LPCR_AIL  (3ull << LPCR_AIL_SHIFT)
+#define LPCR_UPRT (1ull << (63 - 41)) /* Use Process Table */
+#define LPCR_EVIRT(1ull << (63 - 42)) /* Enhanced Virtualisation */
 #define LPCR_ONL  (1ull << (63 - 45))
+#define LPCR_LD   (1ull << (63 - 46)) /* Large Decrementer */
 #define LPCR_P7_PECE0 (1ull << (63 - 49))
 #define LPCR_P7_PECE1 (1ull << (63 - 50))
 #define LPCR_P7_PECE2 (1ull << (63 - 51))
@@ -396,11 +403,22 @@ struct ppc_slb_t {
 #define LPCR_P8_PECE2 (1ull << (63 - 49))
 #define LPCR_P8_PECE3 (1ull << (63 - 50))
 #define LPCR_P8_PECE4 (1ull << (63 - 51))
+/* P9: Power-saving mode Exit Cause Enable (Lower Section) Mask */
+#define LPCR_PECE_L_SHIFT (63 - 51)
+#define LPCR_PECE_L_MASK  (0x1full << LPCR_PECE_L_SHIFT)
+#define LPCR_PDEE (1ull << (63 - 47)) /* Privileged Doorbell Exit EN */
+#define LPCR_HDEE (1ull << (63 - 48)) /* Hyperv Doorbell Exit Enable */
+#define LPCR_EEE  (1ull << (63 - 49)) /* External Exit Enable*/
+#define LPCR_DEE  (1ull << (63 - 50)) /* Decrementer Exit Enable */
+#define LPCR_OEE  (1ull << (63 - 51)) /* Other Exit Enable   */
 #define LPCR_MER  (1ull << (63 - 52))
+#define LPCR_GTSE (1ull << (63 - 53)) /* Guest Translation Shootdown */
 #define LPCR_TC   (1ull << (63 - 54))
+#define LPCR_HEIC (1ull << (63 - 59)) /* HV Extern Interrupt Control */
 #define LPCR_LPES0(1ull << (63 - 60))
 #define LPCR_LPES1(1ull << (63 - 61))
 #define LPCR_RMI  (1ull << (63 - 62))
+#define LPCR_HVICE(1ull << (63 - 62)) /* HV Virtualisation Int Enable 
*/
 #define LPCR_HDICE(1ull << (63 - 63))
 
 #define msr_sf   ((env->msr >> MSR_SF)   & 1)
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index fdb7a78..3a2acb8 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1050,6 +1050,14 @@ void helper_store_lpcr(CPUPPCState *env, target_ulong 
val)
   LPCR_P8_PECE2 | LPCR_P8_PECE3 | LPCR_P8_PECE4 |
   LPCR_MER | LPCR_TC | LPCR_LPES0 | LPCR_HDICE);
 break;
+case POWERPC_MMU_3_00: /* P9 */
+lpcr = val & (LPCR_VPM1 | LPCR_ISL | LPCR_KBV | LPCR_DPFD |
+  (LPCR_PECE_U_MASK & LPCR_HVEE) | LPCR_ILE | LPCR_AIL |
+  LPCR_UPRT | LPCR_EVIRT | LPCR_ONL |
+  (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
+  LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
+  LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
+break;
 default:
 ;
 }
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 2402eef..a1994d3 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8887,12 +8887,24 @@ void cpu_ppc_set_papr(PowerPCCPU *cpu)
 lpcr->default_value &= ~LPCR_RMLS;
 lpcr->default_value |= 1ull << LPCR_RMLS_SHIFT;
 
-/* P7 and P8 has slightly different PECE bits, mostly because P8 adds
- * bit 47 and 48 which are reserved on P7. Here we set them all, which
- * will work as expected for both implementations
- */
-lpcr->default_value |= LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 |
-   LPCR_P8_PECE3 | LPCR_P8_PECE4;
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/* By default we choose legacy mode and switch to new hash or radix
+ * when a register 

[Qemu-devel] [RFC PATCH 13/17] target/ppc/POWER9: Add cpu_has_work function for POWER9

2017-01-12 Thread Suraj Jitindar Singh
The cpu has work function is used to mask interrupts used to determine
if there is work for the cpu based on the LPCR. Add a function to do this
for POWER9 and add it to the POWER9 cpu definition. This is similar to that
for POWER8 except using the LPCR bits as defined for POWER9.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/translate_init.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 87297a7..9db004d 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8797,10 +8797,54 @@ static bool ppc_pvr_match_power9(PowerPCCPUClass *pcc, 
uint32_t pvr)
 return false;
 }
 
+static bool cpu_has_work_POWER9(CPUState *cs)
+{
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+CPUPPCState *env = >env;
+
+if (cs->halted) {
+if (!(cs->interrupt_request & CPU_INTERRUPT_HARD)) {
+return false;
+}
+/* External Exception */
+if ((env->pending_interrupts & (1u << PPC_INTERRUPT_EXT)) &&
+(env->spr[SPR_LPCR] & LPCR_EEE)) {
+return true;
+}
+/* Decrementer Exception */
+if ((env->pending_interrupts & (1u << PPC_INTERRUPT_DECR)) &&
+(env->spr[SPR_LPCR] & LPCR_DEE)) {
+return true;
+}
+/* Machine Check or Hypervisor Maintenance Exception */
+if ((env->pending_interrupts & (1u << PPC_INTERRUPT_MCK |
+1u << PPC_INTERRUPT_HMI)) && (env->spr[SPR_LPCR] & LPCR_OEE)) {
+return true;
+}
+/* Privileged Doorbell Exception */
+if ((env->pending_interrupts & (1u << PPC_INTERRUPT_DOORBELL)) &&
+(env->spr[SPR_LPCR] & LPCR_PDEE)) {
+return true;
+}
+/* Hypervisor Doorbell Exception */
+if ((env->pending_interrupts & (1u << PPC_INTERRUPT_HDOORBELL)) &&
+(env->spr[SPR_LPCR] & LPCR_HDEE)) {
+return true;
+}
+if (env->pending_interrupts & (1u << PPC_INTERRUPT_RESET)) {
+return true;
+}
+return false;
+} else {
+return msr_ee && (cs->interrupt_request & CPU_INTERRUPT_HARD);
+}
+}
+
 POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
+CPUClass *cc = CPU_CLASS(oc);
 
 dc->fw_name = "PowerPC,POWER9";
 dc->desc = "POWER9";
@@ -8811,6 +8855,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
  PCR_COMPAT_2_05;
 pcc->init_proc = init_proc_POWER9;
 pcc->check_pow = check_pow_nocheck;
+cc->has_work = cpu_has_work_POWER9;
 pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
-- 
2.5.5




[Qemu-devel] [RFC PATCH 09/17] target/ppc/POWER9: Remove SDR1 register

2017-01-12 Thread Suraj Jitindar Singh
The SDR1 registers was used to store the location of the hash page table.

This register no longer exists on POWER9 processors, so don't create it.

We now store the hash page table location in the process table entry.

We now check if the SDR1 register exists before printing its value when
displaying register debug info.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/kvm.c| 10 +-
 target/ppc/mmu-hash64.c | 15 ++-
 target/ppc/translate.c  |  6 --
 target/ppc/translate_init.c | 16 +---
 4 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 9c4834c..6016930 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -930,7 +930,15 @@ int kvmppc_put_books_sregs(PowerPCCPU *cpu)
 
 sregs.pvr = env->spr[SPR_PVR];
 
-sregs.u.s.sdr1 = env->spr[SPR_SDR1];
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+if (env->external_patbe)
+sregs.u.s.sdr1 = env->external_patbe->patbe0;
+break;
+default:
+sregs.u.s.sdr1 = env->spr[SPR_SDR1];
+break;
+}
 
 /* Sync SLB */
 #ifdef TARGET_PPC64
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index fe7da18..b9d4f4e 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -291,7 +291,20 @@ void ppc_hash64_set_sdr1(PowerPCCPU *cpu, target_ulong 
value,
 CPUPPCState *env = >env;
 target_ulong htabsize = value & SDR_64_HTABSIZE;
 
-env->spr[SPR_SDR1] = value;
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/*
+ * Technically P9 doesn't have a SDR1, the hash table address should be
+ * stored in the partition table entry instead.
+ */
+if (env->external_patbe)
+env->external_patbe->patbe0 = value;
+break;
+default:
+env->spr[SPR_SDR1] = value;
+break;
+}
+
 if (htabsize > 28) {
 error_setg(errp,
"Invalid HTABSIZE 0x" TARGET_FMT_lx" stored in SDR1",
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 1212180..521aed3 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6922,9 +6922,11 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 case POWERPC_MMU_2_06a:
 case POWERPC_MMU_2_07:
 case POWERPC_MMU_2_07a:
+case POWERPC_MMU_3_00:
 #endif
-cpu_fprintf(f, " SDR1 " TARGET_FMT_lx "   DAR " TARGET_FMT_lx
-   "  DSISR " TARGET_FMT_lx "\n", env->spr[SPR_SDR1],
+if (env->spr_cb[SPR_SDR1].name)
+cpu_fprintf(f, " SDR1 " TARGET_FMT_lx " ", env->spr[SPR_SDR1]);
+cpu_fprintf(f, "  DAR " TARGET_FMT_lx "  DSISR " TARGET_FMT_lx "\n",
 env->spr[SPR_DAR], env->spr[SPR_DSISR]);
 break;
 case POWERPC_MMU_BOOKE206:
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index a1994d3..c771ba3 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -32,6 +32,7 @@
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
 #include "hw/ppc/ppc.h"
+#include "mmu.h"
 
 //#define PPC_DUMP_CPU
 //#define PPC_DEBUG_SPR
@@ -722,8 +723,8 @@ static void gen_spr_generic (CPUPPCState *env)
  0x);
 }
 
-/* SPR common to all non-embedded PowerPC, including 601 */
-static void gen_spr_ne_601 (CPUPPCState *env)
+/* SPR common to all non-embedded PowerPC, including POWER9 */
+static void gen_spr_ne_power9 (CPUPPCState *env)
 {
 /* Exception processing */
 spr_register_kvm(env, SPR_DSISR, "DSISR",
@@ -739,6 +740,12 @@ static void gen_spr_ne_601 (CPUPPCState *env)
  SPR_NOACCESS, SPR_NOACCESS,
  _read_decr, _write_decr,
  0x);
+}
+
+/* SPR common to all non-embedded PowerPC, including 601 */
+static void gen_spr_ne_601 (CPUPPCState *env)
+{
+gen_spr_ne_power9(env);
 /* Memory management */
 spr_register(env, SPR_SDR1, "SDR1",
  SPR_NOACCESS, SPR_NOACCESS,
@@ -8222,7 +8229,6 @@ static void gen_spr_power8_rpr(CPUPPCState *env)
 
 static void init_proc_book3s_64(CPUPPCState *env, int version)
 {
-gen_spr_ne_601(env);
 gen_tbl(env);
 gen_spr_book3s_altivec(env);
 gen_spr_book3s_pmu_sup(env);
@@ -8280,6 +8286,10 @@ static void init_proc_book3s_64(CPUPPCState *env, int 
version)
 gen_spr_power8_book4(env);
 gen_spr_power8_rpr(env);
 }
+if (version >= BOOK3S_CPU_POWER9)
+gen_spr_ne_power9(env);
+else
+gen_spr_ne_601(env);
 if (version < BOOK3S_CPU_POWER8) {
 gen_spr_book3s_dbg(env);
 } else {
-- 
2.5.5




[Qemu-devel] [RFC PATCH 06/17] target/ppc/POWER9: Direct all instr and data storage interrupts to the hypv

2017-01-12 Thread Suraj Jitindar Singh
The vpm0 bit was removed from the LPCR in POWER9, this bit controlled
whether ISI and DSI interrupts were directed to the hypervisor or the
partition. These interrupts now go to the hypervisor irrespective, thus
it is no longer necessary to check the vmp0 bit in the LPCR.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/mmu-hash64.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 3a2acb8..fe7da18 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -640,7 +640,15 @@ static void ppc_hash64_set_isi(CPUState *cs, CPUPPCState 
*env,
 if (msr_ir) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
+vpm = true;
+break;
+default:
+vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+break;
+}
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HISI;
@@ -658,7 +666,15 @@ static void ppc_hash64_set_dsi(CPUState *cs, CPUPPCState 
*env, uint64_t dar,
 if (msr_dr) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
+vpm = true;
+break;
+default:
+vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+break;
+}
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HDSI;
-- 
2.5.5




[Qemu-devel] [RFC PATCH 02/17] hw/ppc/spapr: Add POWER9 to pseries cpu models

2017-01-12 Thread Suraj Jitindar Singh
Add POWER9 cpu to list of spapr core models which allows it to be specified
as the cpu model for a pseries guest (e.g. -machine pseries -cpu POWER9).

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr_cpu_core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index c18632b..8cc7058 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -358,6 +358,9 @@ static const char *spapr_core_models[] = {
 
 /* POWER8NVL */
 "POWER8NVL_v1.0",
+
+/* POWER9 */
+"POWER9_v1.0",
 };
 
 void spapr_cpu_core_class_init(ObjectClass *oc, void *data)
-- 
2.5.5




[Qemu-devel] [RFC PATCH 07/17] target/ppc/POWER9: Add partition table pointer to sPAPRMachineState

2017-01-12 Thread Suraj Jitindar Singh
POWER9 uses a partition table to store information relating to how
address translation is performed on a per partition basis.

Add a data area for this to the sPAPRMachineState struct and (re)allocate
it on machine reset.

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr.c | 38 --
 include/hw/ppc/spapr.h |  1 +
 target/ppc/mmu.h   | 13 +
 3 files changed, 46 insertions(+), 6 deletions(-)
 create mode 100644 target/ppc/mmu.h

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 208ef7b..45bd2de 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -41,6 +41,7 @@
 #include "migration/migration.h"
 #include "mmu-hash64.h"
 #include "qom/cpu.h"
+#include "mmu.h"
 
 #include "hw/boards.h"
 #include "hw/ppc/ppc.h"
@@ -1115,6 +1116,26 @@ static void spapr_reallocate_hpt(sPAPRMachineState 
*spapr, int shift,
 }
 }
 
+static void spapr_reallocate_patb(sPAPRMachineState *spapr, Error **errp)
+{
+g_free(spapr->patb);
+spapr->patb = NULL;
+
+if (!kvm_enabled()) {
+/* We need to allocate a partition table entry */
+size_t size = sizeof(struct patb_entry);
+
+spapr->patb = qemu_memalign(size, size);
+if (!spapr->patb) {
+error_setg_errno(errp, errno, "Could not allocate memory for "
+  "partition table entry");
+return;
+}
+
+memset(spapr->patb, 0, size);
+}
+}
+
 static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
 {
 bool matched = false;
@@ -1134,7 +1155,7 @@ static void ppc_spapr_reset(void)
 {
 MachineState *machine = MACHINE(qdev_get_machine());
 sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
-PowerPCCPU *first_ppc_cpu;
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
 uint32_t rtas_limit;
 hwaddr rtas_addr, fdt_addr;
 void *fdt;
@@ -1143,10 +1164,16 @@ static void ppc_spapr_reset(void)
 /* Check for unknown sysbus devices */
 foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
 
-/* Allocate and/or reset the hash page table */
-spapr_reallocate_hpt(spapr,
- spapr_hpt_shift_for_ramsize(machine->maxram_size),
- _fatal);
+switch (first_ppc_cpu->env.mmu_model) {
+case POWERPC_MMU_3_00:
+/* Allocate the partition table */
+spapr_reallocate_patb(spapr, _fatal);
+default:
+/* Allocate and/or reset the hash page table */
+spapr_reallocate_hpt(spapr,
+ spapr_hpt_shift_for_ramsize(machine->maxram_size),
+ _fatal);
+}
 
 /* Update the RMA size if necessary */
 if (spapr->vrma_adjust) {
@@ -1193,7 +1220,6 @@ static void ppc_spapr_reset(void)
 g_free(fdt);
 
 /* Set up the entry state */
-first_ppc_cpu = POWERPC_CPU(first_cpu);
 first_ppc_cpu->env.gpr[3] = fdt_addr;
 first_ppc_cpu->env.gpr[5] = 0;
 first_cpu->halted = 0;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index bd5bcf7..b654773 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -63,6 +63,7 @@ struct sPAPRMachineState {
 
 void *htab;
 uint32_t htab_shift;
+void *patb;
 hwaddr rma_size;
 int vrma_adjust;
 ssize_t rtas_size;
diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
new file mode 100644
index 000..67b9707
--- /dev/null
+++ b/target/ppc/mmu.h
@@ -0,0 +1,13 @@
+#ifndef MMU_H
+#define MMU_H
+
+#ifndef CONFIG_USER_ONLY
+
+/* Partition Table Entry */
+struct patb_entry {
+uint64_t patbe0, patbe1;
+};
+
+#endif /* CONFIG_USER_ONLY */
+
+#endif /* MMU_H */
-- 
2.5.5




[Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries tcg legacy kernel support

2017-01-12 Thread Suraj Jitindar Singh
This patch set provides the initial implementation of support for the 
POWER9 processor running in tcg mode under the pseries machine type.

To use a POWER9 cpu provide the command line option "-cpu POWER9".

This is the initial work to make the mmu emulation model look like a POWER9

Currently only legacy kernels are supported (hash without segment tables).

This patch set will be followed by two more in the near future to support
the new process table capabilities added in ISAv3.00; hash via segment
tables and radix. Kernels with support for this will currently fail when
they try to register a process table as this isn't yet implemented.

The assumption is that we're running a legacy kernel, in the event the
guest registers a process table (when support exists) we will handle it
accordingly.

The main changes are:
 - Define a new mmu model and fault handler
 - Add new LPCR fields and check them accordingly
 - Add a partition table entry to the machine state
 - Point to the partition table entry in the cpu state
 - Remove SDR1
 - Adapt to new pte format
 - NOOP the cp_abort instruction
 - Small bug fixes

This was intially one huge patch so I've tried to break it up into what I
think are logical chunks, how exactly this should be split up is up for 
debate.

A current upstream kernel with POWER9 support added to the architecture
vector should correctly report a POWER9 cpu under /proc/cpuinfo.

Suraj Jitindar Singh (17):
  powerpc/cpu-models: rename ISAv3.00 logical PVR definition
  hw/ppc/spapr: Add POWER9 to pseries cpu models
  target/ppc: Add pcr_supported to POWER9 cpu class definition
  target/ppc/POWER9: Add ISAv3.00 MMU definition
  target/ppc/POWER9: Adapt LPCR handling for POWER9
  target/ppc/POWER9: Direct all instr and data storage interrupts to the
hypv
  target/ppc/POWER9: Add partition table pointer to sPAPRMachineState
  target/ppc/POWER9: Add external partition table pointer to cpu state
  target/ppc/POWER9: Remove SDR1 register
  target/ppc/POWER9: Add POWER9 mmu fault handler
  target/ppc/POWER9: Update to new pte format for POWER9 accesses
  target/ppc/POWER9: Add POWER9 pa-features definition
  target/ppc/POWER9: Add cpu_has_work function for POWER9
  target/ppc/debug: Print LPCR register value if register exists
  tcg/POWER9: NOOP the cp_abort instruction
  target/ppc/mmu_hash64: Fix printing unsigned as signed int
  target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation

 hw/ppc/spapr.c  | 56 ---
 hw/ppc/spapr_cpu_core.c | 15 +++-
 hw/ppc/spapr_hcall.c| 51 +
 include/hw/ppc/spapr.h  |  1 +
 target/ppc/cpu-models.h |  2 +-
 target/ppc/cpu-qom.h|  5 ++-
 target/ppc/cpu.h| 24 +++-
 target/ppc/kvm.c| 10 -
 target/ppc/mmu-hash64.c | 68 +++--
 target/ppc/mmu-hash64.h | 73 +--
 target/ppc/mmu.h| 27 +
 target/ppc/mmu_helper.c | 61 ++
 target/ppc/translate.c  | 14 ++-
 target/ppc/translate_init.c | 92 +++--
 14 files changed, 418 insertions(+), 81 deletions(-)
 create mode 100644 target/ppc/mmu.h

-- 
2.5.5




[Qemu-devel] [RFC PATCH 03/17] target/ppc: Add pcr_supported to POWER9 cpu class definition

2017-01-12 Thread Suraj Jitindar Singh
pcr_supported is used to define the supported PCR values for a given
processor. A POWER9 processor can support 3.00, 2.07, 2.06 and 2.05
compatibility modes, thus we set this accordingly.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/cpu.h| 1 +
 target/ppc/translate_init.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 2a50c43..afb7ddb 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2250,6 +2250,7 @@ enum {
 PCR_COMPAT_2_05 = 1ull << (63-62),
 PCR_COMPAT_2_06 = 1ull << (63-61),
 PCR_COMPAT_2_07 = 1ull << (63-60),
+PCR_COMPAT_3_00 = 1ull << (63-59),
 PCR_VEC_DIS = 1ull << (63-0), /* Vec. disable (bit NA since 
POWER8) */
 PCR_VSX_DIS = 1ull << (63-1), /* VSX disable (bit NA since POWER8) 
*/
 PCR_TM_DIS  = 1ull << (63-2), /* Trans. memory disable (POWER8) */
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 626e031..bfc1f24 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8797,6 +8797,8 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 dc->props = powerpc_servercpu_properties;
 pcc->pvr_match = ppc_pvr_match_power9;
 pcc->pcr_mask = PCR_COMPAT_2_05 | PCR_COMPAT_2_06 | PCR_COMPAT_2_07;
+pcc->pcr_supported = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 |
+ PCR_COMPAT_2_05;
 pcc->init_proc = init_proc_POWER9;
 pcc->check_pow = check_pow_nocheck;
 pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
-- 
2.5.5




[Qemu-devel] [RFC PATCH 04/17] target/ppc/POWER9: Add ISAv3.00 MMU definition

2017-01-12 Thread Suraj Jitindar Singh
POWER9 processors implement the mmu as defined in version 3.00 of the ISA.

Add a definition for this mmu model and set the POWER9 cpu model to use
this mmu model.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/cpu-qom.h| 5 -
 target/ppc/mmu_helper.c | 2 ++
 target/ppc/translate_init.c | 3 +--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index d46c31a..1577cc8 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -86,10 +86,13 @@ enum powerpc_mmu_t {
 POWERPC_MMU_2_07   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
  | POWERPC_MMU_64K
  | POWERPC_MMU_AMR | 0x0004,
-/* FIXME Add POWERPC_MMU_3_OO defines */
 /* Architecture 2.07 "degraded" (no 1T segments)   */
 POWERPC_MMU_2_07a  = POWERPC_MMU_64 | POWERPC_MMU_AMR
  | 0x0004,
+/* Architecture 3.00 variant   */
+POWERPC_MMU_3_00   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
+ | POWERPC_MMU_64K
+ | POWERPC_MMU_AMR | 0x0005,
 };
 
 /*/
diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index d09fc0a..2ab4562 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1935,6 +1935,7 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
 case POWERPC_MMU_2_06a:
 case POWERPC_MMU_2_07:
 case POWERPC_MMU_2_07a:
+case POWERPC_MMU_3_00:
 #endif /* defined(TARGET_PPC64) */
 env->tlb_need_flush = 0;
 tlb_flush(CPU(cpu), 1);
@@ -1974,6 +1975,7 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 case POWERPC_MMU_2_06a:
 case POWERPC_MMU_2_07:
 case POWERPC_MMU_2_07a:
+case POWERPC_MMU_3_00:
 /* tlbie invalidate TLBs for all segments */
 /* XXX: given the fact that there are too many segments to invalidate,
  *  and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index bfc1f24..2402eef 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8838,8 +8838,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 (1ull << MSR_PMM) |
 (1ull << MSR_RI) |
 (1ull << MSR_LE);
-/* Using 2.07 defines until new radix model is added. */
-pcc->mmu_model = POWERPC_MMU_2_07;
+pcc->mmu_model = POWERPC_MMU_3_00;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
 /* segment page size remain the same */
-- 
2.5.5




[Qemu-devel] [RFC PATCH 01/17] powerpc/cpu-models: rename ISAv3.00 logical PVR definition

2017-01-12 Thread Suraj Jitindar Singh
This logical PVR value now corresponds to ISA version 3.00 so rename it
accordingly.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/cpu-models.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/cpu-models.h b/target/ppc/cpu-models.h
index aafbbd7..d587e69 100644
--- a/target/ppc/cpu-models.h
+++ b/target/ppc/cpu-models.h
@@ -601,7 +601,7 @@ enum {
 CPU_POWERPC_LOGICAL_2_06   = 0x0F03,
 CPU_POWERPC_LOGICAL_2_06_PLUS  = 0x0F13,
 CPU_POWERPC_LOGICAL_2_07   = 0x0F04,
-CPU_POWERPC_LOGICAL_2_08   = 0x0F05,
+CPU_POWERPC_LOGICAL_3_00   = 0x0F05,
 };
 
 /* System version register (used on MPC 8xxx)*/
-- 
2.5.5




Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory

2017-01-12 Thread Lai Jiangshan
On Fri, Jan 13, 2017 at 3:19 AM, Jianjun Duan  wrote:
> I have a question related to interplay of bypassing the shared memory in
> migration and memory hotplugging. If on the source guest a big chunk of
> memory is plugged in, will the shared memory still be mapped the same
> way on the guest? i.e, the mapping from guest physical address to the
> host virtual address be the same?

I don't understand the question, the patch doesn't change
the memory hotplugging nor the way how the pages are mapped
in the guest physical.

>
> Thanks,
> Jianjun
>
>
> On 08/29/2016 09:11 PM, Lai Jiangshan wrote:
>> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela  wrote:
>>> Lai Jiangshan  wrote:
>>>
>>> Hi
>>>
>>> First of all, I like a lot the patchset, but I would preffer to split it
>>> to find "possible" bugs along the lines, especially in postcopy, but not 
>>> only.
>>
>> Hello, thanks for review and comments
>>
>> I tried to make the patch be sane and tight.
>> I don't see any strong reason to split it without complicating the patch.
>>
>>>
>>> [very nice description of the patch]
>>>
>>> Nothing to say about the QMP and shared memory detection, looks correct
>>> to me.
>>>
 diff --git a/migration/ram.c b/migration/ram.c
 index 815bc0e..880972d 100644
 --- a/migration/ram.c
 +++ b/migration/ram.c
 @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
  num_dirty_pages_period = 0;
  xbzrle_cache_miss_prev = 0;
  iterations_prev = 0;
 +migration_dirty_pages = 0;
 +}
 +
 +static void migration_bitmap_init(unsigned long *bitmap)
 +{
 +RAMBlock *block;
 +
 +bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
 +rcu_read_lock();
 +QLIST_FOREACH_RCU(block, _list.blocks, next) {
 +if (!migrate_bypass_shared_memory() || 
 !qemu_ram_is_shared(block)) {
 +bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
 +   block->used_length >> TARGET_PAGE_BITS);
 +
 +/*
 + * Count the total number of pages used by ram blocks not 
 including
 + * any gaps due to alignment or unplugs.
 + */
 + migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
 + }
 +}
 +rcu_read_unlock();
  }
>>>
>>> We can split this function in a different patch.
>>
>> it calls the new function migrate_bypass_shared_memory().
>> it is no a good idea to split it out.
>>
>>> I haven't fully search
>>> if we care about taking the rcu lock here.  The thing that I am more
>>> interested is in knowing what happens when we don't set
>>> migration_dirty_pages as the full "possible" memory pages.
>>
>> I hadn't tested it with postcopy, I don't know how to use postcopy.
>> From my review I can't find obvious bugs about it.
>>
>> I don't think there is any good reason to use migrate_bypass
>> and postcopy together,  I can disable the migrate_bypass
>> when postcopy==true if you want.
>>
>>>
>>> Once here, should we check for ROM regions?
>>>
>>> BTW, could'nt we use:
>>>
>>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>>> {
>>> RAMBlock *block;
>>> int ret = 0;
>>>
>>> rcu_read_lock();
>>> QLIST_FOREACH_RCU(block, _list.blocks, next) {
>>> ret = func(block->idstr, block->host, block->offset,
>>>block->used_length, opaque);
>>> if (ret) {
>>> break;
>>> }
>>> }
>>> rcu_read_unlock();
>>> return ret;
>>> }
>>>
>>
>> the patch only introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
>> but
>> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
>> #   16
>>
>> I don't want to introduce qemu_ram_foreach_block()
>> and touch another 15 places.
>> I hope someone do it after merged.
>>
>>
>>>
>>>

  static void migration_bitmap_sync(void)
 @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
  qemu_mutex_lock(_bitmap_mutex);
  rcu_read_lock();
  QLIST_FOREACH_RCU(block, _list.blocks, next) {
 -migration_bitmap_sync_range(block->offset, block->used_length);
 +if (!migrate_bypass_shared_memory() || 
 !qemu_ram_is_shared(block)) {
 +migration_bitmap_sync_range(block->offset, 
 block->used_length);
 +}
  }
  rcu_read_unlock();
  qemu_mutex_unlock(_bitmap_mutex);
>>>
>>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>>
>>>
 @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void 
 *opaque)
  ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
  migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
  migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
 -bitmap_set(migration_bitmap_rcu->bmap, 0, 

Re: [Qemu-devel] [PATCH v1 1/2] target-ppc: Add xvtstdc[sp, dp] instructions

2017-01-12 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 01/12/2017 07:54 PM, Nikunj A Dadhania wrote:
>> +uint32_t match = 0; \
>> +\
>> +getVSR(xbn, , env);  \
>> +memset(, 0, sizeof(xt)); \
>> +dcmx = DCMX_XV(opcode); \
>> +\
>> +for (i = 0; i < nels; i++) {\
>> +sign = tp##_is_neg(xb.fld); \
>> +if (tp##_is_any_nan(xb.fld)) {  \
>> +match = extract32(dcmx, 6, 1);  \
>
> Failing to clear match for each elt.

Oops, will send an update :(

Regards,
Nikunj




Re: [Qemu-devel] [PATCH v1 1/2] target-ppc: Add xvtstdc[sp, dp] instructions

2017-01-12 Thread Richard Henderson

On 01/12/2017 07:54 PM, Nikunj A Dadhania wrote:

+uint32_t match = 0; \
+\
+getVSR(xbn, , env);  \
+memset(, 0, sizeof(xt)); \
+dcmx = DCMX_XV(opcode); \
+\
+for (i = 0; i < nels; i++) {\
+sign = tp##_is_neg(xb.fld); \
+if (tp##_is_any_nan(xb.fld)) {  \
+match = extract32(dcmx, 6, 1);  \


Failing to clear match for each elt.


r~



[Qemu-devel] [PATCH v1 2/2] target-ppc: Add xststdc[sp, dp, qp] instructions

2017-01-12 Thread Nikunj A Dadhania
xststdcsp: VSX Scalar Test Data Class Single-Precision
xststdcdp: VSX Scalar Test Data Class Double-Precision
xststdcqp: VSX Scalar Test Data Class Quad-Precision

Signed-off-by: Nikunj A Dadhania 
---
 target/ppc/fpu_helper.c | 66 -
 target/ppc/helper.h |  3 ++
 target/ppc/internal.h   |  1 +
 target/ppc/translate/vsx-impl.inc.c |  3 ++
 target/ppc/translate/vsx-ops.inc.c  |  4 +++
 5 files changed, 69 insertions(+), 8 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 75c70e4..5d5a5f7 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3196,17 +3196,22 @@ void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode)
  *   fld   - vsr_t field (VsrD(*) or VsrW(*))
  *   tfld   - target vsr_t field (VsrD(*) or VsrW(*))
  *   fld_max - target field max
+ *   scrf - set result in CR and FPCC
  */
-#define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max)  \
+#define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max, scrf)  \
 void helper_##op(CPUPPCState *env, uint32_t opcode) \
 {   \
 ppc_vsr_t xt, xb;   \
 uint32_t i, sign, dcmx; \
-uint32_t match = 0; \
+uint32_t cc, match = 0; \
 \
 getVSR(xbn, , env);  \
-memset(, 0, sizeof(xt)); \
-dcmx = DCMX_XV(opcode); \
+if (!scrf) {\
+memset(, 0, sizeof(xt)); \
+dcmx = DCMX_XV(opcode); \
+} else {\
+dcmx = DCMX(opcode);\
+}   \
 \
 for (i = 0; i < nels; i++) {\
 sign = tp##_is_neg(xb.fld); \
@@ -3219,10 +3224,55 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) 
\
 } else if (tp##_is_zero_or_denormal(xb.fld)) {  \
 match = extract32(dcmx, 0 + !sign, 1);  \
 }   \
-xt.tfld = match ? fld_max : 0;  \
+\
+if (scrf) { \
+cc = sign << CRF_LT_BIT | match << CRF_EQ_BIT;  \
+env->fpscr &= ~(0x0F << FPSCR_FPRF);\
+env->fpscr |= cc << FPSCR_FPRF; \
+env->crf[BF(opcode)] = cc;  \
+} else { \
+xt.tfld = match ? fld_max : 0;   \
+}   \
+}   \
+if (!scrf) {\
+putVSR(xT(opcode), , env);   \
 }   \
-putVSR(xT(opcode), , env);   \
 }
 
-VSX_TEST_DC(xvtstdcdp, 2, xB(opcode), float64, VsrD(i), VsrD(i), UINT64_MAX)
-VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX)
+VSX_TEST_DC(xvtstdcdp, 2, xB(opcode), float64, VsrD(i), VsrD(i), UINT64_MAX, 0)
+VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX, 0)
+VSX_TEST_DC(xststdcdp, 1, xB(opcode), float64, VsrD(0), VsrD(0), 0, 1)
+VSX_TEST_DC(xststdcqp, 1, (rB(opcode) + 32), float128, f128, VsrD(0), 0, 1)
+
+void helper_xststdcsp(CPUPPCState *env, uint32_t opcode)
+{
+ppc_vsr_t xb;
+uint32_t dcmx, sign, exp;
+uint32_t cc, match = 0, not_sp = 0;
+
+getVSR(xB(opcode), , env);
+dcmx = DCMX(opcode);
+exp = (xb.VsrD(0) >> 52) & 0x7FF;
+
+sign = float64_is_neg(xb.VsrD(0));
+if (float64_is_any_nan(xb.VsrD(0))) {
+match = extract32(dcmx, 6, 1);
+} else if (float64_is_infinity(xb.VsrD(0))) {
+match = extract32(dcmx, 4 + !sign, 1);
+} else if (float64_is_zero(xb.VsrD(0))) {
+match = extract32(dcmx, 2 + !sign, 1);
+} else if (float64_is_zero_or_denormal(xb.VsrD(0)) ||
+   (exp > 0 && exp < 0x381)) {
+match = extract32(dcmx, 0 + !sign, 1);
+}
+
+not_sp = !float64_eq(xb.VsrD(0),
+ float32_to_float64(
+ float64_to_float32(xb.VsrD(0), >fp_status),
+ >fp_status), >fp_status);
+
+cc = sign << CRF_LT_BIT | match << CRF_EQ_BIT | not_sp << CRF_SO_BIT;
+

[Qemu-devel] [PATCH v1 1/2] target-ppc: Add xvtstdc[sp, dp] instructions

2017-01-12 Thread Nikunj A Dadhania
xvtstdcsp: VSX Vector Test Data Class Single-Precision
xvtstdcdp: VSX Vector Test Data Class Double-Precision

Signed-off-by: Nikunj A Dadhania 
---
 target/ppc/fpu_helper.c | 39 +
 target/ppc/helper.h |  2 ++
 target/ppc/internal.h   |  5 +++--
 target/ppc/translate/vsx-impl.inc.c |  2 ++
 target/ppc/translate/vsx-ops.inc.c  |  8 
 5 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index ffcf9ca..75c70e4 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3187,3 +3187,42 @@ void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode)
 }
 putVSR(xT(opcode), , env);
 }
+
+/* VSX_TEST_DC - VSX floating point test data class
+ *   op- instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   xbn   - VSR register number
+ *   tp- type (float32 or float64)
+ *   fld   - vsr_t field (VsrD(*) or VsrW(*))
+ *   tfld   - target vsr_t field (VsrD(*) or VsrW(*))
+ *   fld_max - target field max
+ */
+#define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max)  \
+void helper_##op(CPUPPCState *env, uint32_t opcode) \
+{   \
+ppc_vsr_t xt, xb;   \
+uint32_t i, sign, dcmx; \
+uint32_t match = 0; \
+\
+getVSR(xbn, , env);  \
+memset(, 0, sizeof(xt)); \
+dcmx = DCMX_XV(opcode); \
+\
+for (i = 0; i < nels; i++) {\
+sign = tp##_is_neg(xb.fld); \
+if (tp##_is_any_nan(xb.fld)) {  \
+match = extract32(dcmx, 6, 1);  \
+} else if (tp##_is_infinity(xb.fld)) {  \
+match = extract32(dcmx, 4 + !sign, 1);  \
+} else if (tp##_is_zero(xb.fld)) {  \
+match = extract32(dcmx, 2 + !sign, 1);  \
+} else if (tp##_is_zero_or_denormal(xb.fld)) {  \
+match = extract32(dcmx, 0 + !sign, 1);  \
+}   \
+xt.tfld = match ? fld_max : 0;  \
+}   \
+putVSR(xT(opcode), , env);   \
+}
+
+VSX_TEST_DC(xvtstdcdp, 2, xB(opcode), float64, VsrD(i), VsrD(i), UINT64_MAX)
+VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9d4ed08..165e4a5 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -546,6 +546,8 @@ DEF_HELPER_2(xvcvsxdsp, void, env, i32)
 DEF_HELPER_2(xvcvuxdsp, void, env, i32)
 DEF_HELPER_2(xvcvsxwsp, void, env, i32)
 DEF_HELPER_2(xvcvuxwsp, void, env, i32)
+DEF_HELPER_2(xvtstdcsp, void, env, i32)
+DEF_HELPER_2(xvtstdcdp, void, env, i32)
 DEF_HELPER_2(xvrspi, void, env, i32)
 DEF_HELPER_2(xvrspic, void, env, i32)
 DEF_HELPER_2(xvrspim, void, env, i32)
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index c22d74e..4c3811a 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -68,7 +68,7 @@ static inline uint32_t name(uint32_t opcode)  
\
 ((opcode >> (shift2)) & ((1 << (nb2)) - 1));  \
 }
 
-#define EXTRACT_HELPER_DXFORM(name,   \
+#define EXTRACT_HELPER_SPLIT_3(name,  \
   d0_bits, shift_op_d0, shift_d0, \
   d1_bits, shift_op_d1, shift_d1, \
   d2_bits, shift_op_d2, shift_d2) \
@@ -156,7 +156,7 @@ EXTRACT_HELPER(FPFLM, 17, 8);
 EXTRACT_HELPER(FPW, 16, 1);
 
 /* addpcis */
-EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
+EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
 #if defined(TARGET_PPC64)
 /* darn */
 EXTRACT_HELPER(L, 16, 2);
@@ -198,6 +198,7 @@ EXTRACT_HELPER(UIM, 16, 2);
 EXTRACT_HELPER(SHW, 8, 2);
 EXTRACT_HELPER(SP, 19, 2);
 EXTRACT_HELPER(IMM8, 11, 8);
+EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
 typedef union _ppc_vsr_t {
 uint8_t u8[16];
diff --git a/target/ppc/translate/vsx-impl.inc.c 
b/target/ppc/translate/vsx-impl.inc.c
index 9bcc5af..adb6fc7 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -928,6 +928,8 @@ GEN_VSX_HELPER_2(xvrspic, 0x16, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)

[Qemu-devel] [PATCH v1 0/2] POWER9 TCG enablements - part12

2017-01-12 Thread Nikunj A Dadhania
This series contains 5 new instructions for POWER9 ISA3.0
VSX Scalar Test Data Class
VSX Vector Test Data Class

Changelog:
v0:
* Concise logic for identifying data class in Scalar/Vector 
  test data class instructions

Nikunj A Dadhania (2):
  target-ppc: Add xvtstdc[sp,dp] instructions
  target-ppc: Add xststdc[sp, dp, qp] instructions

 target/ppc/fpu_helper.c | 89 +
 target/ppc/helper.h |  5 +++
 target/ppc/internal.h   |  6 ++-
 target/ppc/translate/vsx-impl.inc.c |  5 +++
 target/ppc/translate/vsx-ops.inc.c  | 12 +
 5 files changed, 115 insertions(+), 2 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PULL 0/4] migration: QTAILQ migration

2017-01-12 Thread Amit Shah
On (Fri) 06 Jan 2017 [12:10:22], Peter Maydell wrote:
> On 5 January 2017 at 16:32, Amit Shah  wrote:
> > The following changes since commit dbe2b65566e76d3c3a0c3358285c0336ac61e757:
> >
> >   Merge remote-tracking branch 
> > 'remotes/vivier/tags/m68k-for-2.9-pull-request' into staging (2016-12-28 
> > 17:11:11 +)
> >
> > are available in the git repository at:
> >
> >   https://git.kernel.org/pub/scm/virt/qemu/amit/migration.git 
> > tags/migration-for-2.9-1
> >
> > for you to fetch changes up to 8d0c57697abcc57166e918064d9a7f6c7316d01b:
> >
> >   migration: add error_report (2017-01-04 19:21:26 +0530)
> >
> > 
> > Migration: migrate QTAILQ infrastructure for vmstate
> >
> > 
> 
> Hi. I'm afraid this fails 'make check' for ppc64be:

Hi Peter, since this was due to something else, will you try to pull
this in again?  Or should I re-send the series?

Thanks,

Amit



Re: [Qemu-devel] [PATCH 6/7] target-ppc: Add xvtstdc[sp, dp] instructions

2017-01-12 Thread Nikunj A Dadhania
Richard Henderson  writes:
> On 01/12/2017 08:24 AM, Nikunj A Dadhania wrote:
>> +nan = tp##_is_any_nan(xb.fld);  \
>> +infinity = tp##_is_infinity(xb.fld);\
>> +sign = tp##_is_neg(xb.fld); \
>> +zero = denormal = 0;\
>> +if (tp##_is_zero_or_denormal(xb.fld)) { \
>> +if (tp##_is_zero(xb.fld)) { \
>> +zero = 1;   \
>> +} else {\
>> +denormal = 1;   \
>> +}   \
>> +}   \
>> +\
>> +if ((extract32(dcmx, 6, 1) && nan) ||   \
>> +(extract32(dcmx, 5, 1) && infinity && !sign) || \
>> +(extract32(dcmx, 4, 1) && infinity &&  sign) || \
>> +(extract32(dcmx, 3, 1) && zero && !sign) || \
>> +(extract32(dcmx, 3, 1) && zero &&  sign) || \
>> +(extract32(dcmx, 1, 1) && denormal && !sign) || \
>> +(extract32(dcmx, 0, 1) && denormal &&  sign)) { \
>> +match = 1;  \
>> +}   \
>
> I'll note that all of these are mutually exclusive, therefore you're doing 
> much
> more work than required.
>
>   sign = tp##_is_neg(x));
>   if (tp##is_any_nan(x)) {
>   match = extract32(dcmx, 6, 1);
>   } else if (tp##_is_infinity(x)) {
>   match = extract32(dcmx, 4 + !sign, 1);
>   } else if (tp##_is_zero(x)) {
>   match = extract32(dcmx, 2 + !sign, 1);
>   } else if (tp##_is_zero_or_denormal(x)) {
>   match = extract32(dcmx, 0 + !sign, 1);
>   }

Right, that is pretty concise. Will send updated series changing patch 6
and 7.

> (Also, an apparent typo for your zero && sign case.)

Yes, was by mistake.

Regards
Nikunj




[Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-12 Thread Peter Xu
This patch is based on Aviv Ben-David ()'s patch
upstream:

  "IOMMU: enable intel_iommu map and unmap notifiers"
  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html

However I removed/fixed some content, and added my own codes.

Instead of translate() every page for iotlb invalidations (which is
slower), we walk the pages when needed and notify in a hook function.

This patch enables vfio devices for VT-d emulation.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 68 +--
 include/hw/i386/intel_iommu.h |  8 +
 2 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2596f11..104200b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -839,7 +839,8 @@ next:
  * @private: private data for the hook function
  */
 static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end,
- vtd_page_walk_hook hook_fn, void *private)
+ vtd_page_walk_hook hook_fn, void *private,
+ bool notify_unmap)
 {
 dma_addr_t addr = vtd_get_slpt_base_from_context(ce);
 uint32_t level = vtd_get_level_from_context_entry(ce);
@@ -858,7 +859,7 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t 
start, uint64_t end,
 trace_vtd_page_walk(ce->hi, ce->lo, start, end);
 
 return vtd_page_walk_level(addr, start, end, hook_fn, private,
-   level, true, true, NULL, false);
+   level, true, true, NULL, notify_unmap);
 }
 
 /* Map a device to its corresponding domain (context-entry) */
@@ -1212,6 +1213,34 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState 
*s, uint16_t domain_id)
 _id);
 }
 
+static int vtd_page_invalidate_notify_hook(IOMMUTLBEntry *entry,
+   void *private)
+{
+memory_region_notify_iommu((MemoryRegion *)private, *entry);
+return 0;
+}
+
+static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
+   uint16_t domain_id, hwaddr addr,
+   uint8_t am)
+{
+IntelIOMMUNotifierNode *node;
+VTDContextEntry ce;
+int ret;
+
+QLIST_FOREACH(node, &(s->notifiers_list), next) {
+VTDAddressSpace *vtd_as = node->vtd_as;
+ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
+   vtd_as->devfn, );
+if (!ret && domain_id == VTD_CONTEXT_ENTRY_DID(ce.hi)) {
+vtd_page_walk(, addr, addr + (1 << am) * VTD_PAGE_SIZE,
+  vtd_page_invalidate_notify_hook,
+  (void *)_as->iommu, true);
+}
+}
+}
+
+
 static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
   hwaddr addr, uint8_t am)
 {
@@ -1222,6 +1251,7 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, 
uint16_t domain_id,
 info.addr = addr;
 info.mask = ~((1 << am) - 1);
 g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, );
+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
 }
 
 /* Flush IOTLB
@@ -2244,15 +2274,34 @@ static void vtd_iommu_notify_flag_changed(MemoryRegion 
*iommu,
   IOMMUNotifierFlag new)
 {
 VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+IntelIOMMUState *s = vtd_as->iommu_state;
+IntelIOMMUNotifierNode *node = NULL;
+IntelIOMMUNotifierNode *next_node = NULL;
 
-if (new & IOMMU_NOTIFIER_MAP) {
-error_report("Device at bus %s addr %02x.%d requires iommu "
- "notifier which is currently not supported by "
- "intel-iommu emulation",
- vtd_as->bus->qbus.name, PCI_SLOT(vtd_as->devfn),
- PCI_FUNC(vtd_as->devfn));
+if (!s->cache_mode_enabled && new & IOMMU_NOTIFIER_MAP) {
+error_report("We need to set cache_mode=1 for intel-iommu to enable "
+ "device assignment with IOMMU protection.");
 exit(1);
 }
+
+/* Add new ndoe if no mapping was exising before this call */
+if (old == IOMMU_NOTIFIER_NONE) {
+node = g_malloc0(sizeof(*node));
+node->vtd_as = vtd_as;
+QLIST_INSERT_HEAD(>notifiers_list, node, next);
+return;
+}
+
+/* update notifier node with new flags */
+QLIST_FOREACH_SAFE(node, >notifiers_list, next, next_node) {
+if (node->vtd_as == vtd_as) {
+if (new == IOMMU_NOTIFIER_NONE) {
+QLIST_REMOVE(node, next);
+g_free(node);
+}
+return;
+}
+}
 }
 
 static const VMStateDescription vtd_vmstate = {
@@ -2689,7 +2738,7 @@ static void vtd_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier 

[Qemu-devel] [PATCH RFC v3 11/14] intel_iommu: provide its own replay() callback

2017-01-12 Thread Peter Xu
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally bring a dead loop when guest starts.

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 212 --
 hw/i386/trace-events  |   8 ++
 include/exec/memory.h |   2 +
 3 files changed, 217 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index b4019d0..59bf683 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -600,6 +600,22 @@ static inline uint32_t 
vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
 return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
 }
 
+static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
+{
+uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
+return 1ULL << MIN(ce_agaw, VTD_MGAW);
+}
+
+/* Return true if IOVA passes range check, otherwise false. */
+static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
+{
+/*
+ * Check if @iova is above 2^X-1, where X is the minimum of MGAW
+ * in CAP_REG and AW in context-entry.
+ */
+return !(iova & ~(vtd_iova_limit(ce) - 1));
+}
+
 static const uint64_t vtd_paging_entry_rsvd_field[] = {
 [0] = ~0ULL,
 /* For not large page */
@@ -635,13 +651,9 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
 uint32_t level = vtd_get_level_from_context_entry(ce);
 uint32_t offset;
 uint64_t slpte;
-uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
 uint64_t access_right_check;
 
-/* Check if @iova is above 2^X-1, where X is the minimum of MGAW
- * in CAP_REG and AW in context-entry.
- */
-if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
+if (!vtd_iova_range_check(iova, ce)) {
 error_report("IOVA 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
@@ -689,6 +701,166 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
 }
 }
 
+typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+
+/**
+ * vtd_page_walk_level - walk over specific level for IOVA range
+ *
+ * @addr: base GPA addr to start the walk
+ * @start: IOVA range start address
+ * @end: IOVA range end address (start <= addr < end)
+ * @hook_fn: hook func to be called when detected page
+ * @private: private data to be passed into hook func
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @skipped: accumulated skipped ranges
+ * @notify_unmap: whether we should notify invalid entries
+ */
+static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
+   uint64_t end, vtd_page_walk_hook hook_fn,
+   void *private, uint32_t level,
+   bool read, bool write, uint64_t *skipped,
+   bool notify_unmap)
+{
+bool read_cur, write_cur, entry_valid;
+uint32_t offset;
+uint64_t slpte;
+uint64_t subpage_size, subpage_mask;
+IOMMUTLBEntry entry;
+uint64_t iova = start;
+uint64_t iova_next;
+uint64_t skipped_local = 0;
+int ret = 0;
+
+trace_vtd_page_walk_level(addr, level, start, end);
+
+subpage_size = 1ULL << vtd_slpt_level_shift(level);
+subpage_mask = vtd_slpt_level_page_mask(level);
+
+while (iova < end) {
+iova_next = (iova & subpage_mask) + subpage_size;
+
+offset = vtd_iova_level_offset(iova, level);
+slpte = vtd_get_slpte(addr, offset);
+
+/*
+ * When one of the following case happens, we assume the whole
+ * range is invalid:
+ *
+ * 1. read block failed
+ * 3. reserved area non-zero
+ * 2. both read & write flag are not set
+ */
+
+if (slpte == (uint64_t)-1) {
+trace_vtd_page_walk_skip_read(iova, iova_next);
+skipped_local++;
+goto next;
+}
+
+if (vtd_slpte_nonzero_rsvd(slpte, level)) {
+trace_vtd_page_walk_skip_reserve(iova, iova_next);
+skipped_local++;
+goto next;
+}
+
+/* Permissions are stacked with parents' */
+read_cur = read && (slpte & VTD_SL_R);
+write_cur = write && (slpte & VTD_SL_W);
+
+/*
+ * As long as we have either read/write permission, 

[Qemu-devel] [PATCH RFC v3 09/14] memory: introduce memory_region_notify_one()

2017-01-12 Thread Peter Xu
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 15 +++
 memory.c  | 29 ++---
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2233f99..f367e54 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -669,6 +669,21 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 IOMMUTLBEntry entry);
 
 /**
+ * memory_region_notify_one: notify a change in an IOMMU translation
+ *   entry to a single notifier
+ *
+ * This works just like memory_region_notify_iommu(), but it only
+ * notifies a specific notifier, not all of them.
+ *
+ * @notifier: the notifier to be notified
+ * @entry: the new entry in the IOMMU translation table.  The entry
+ * replaces all old entries for the same virtual I/O address range.
+ * Deleted entries have .@perm == 0.
+ */
+void memory_region_notify_one(IOMMUNotifier *notifier,
+  IOMMUTLBEntry *entry);
+
+/**
  * memory_region_register_iommu_notifier: register a notifier for changes to
  * IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index df62bd1..6e4c872 100644
--- a/memory.c
+++ b/memory.c
@@ -1665,26 +1665,33 @@ void 
memory_region_unregister_iommu_notifier(MemoryRegion *mr,
 memory_region_update_iommu_notify_flags(mr);
 }
 
-void memory_region_notify_iommu(MemoryRegion *mr,
-IOMMUTLBEntry entry)
+void memory_region_notify_one(IOMMUNotifier *notifier,
+  IOMMUTLBEntry *entry)
 {
-IOMMUNotifier *iommu_notifier;
 IOMMUNotifierFlag request_flags;
 
-assert(memory_region_is_iommu(mr));
-
-if (entry.perm & IOMMU_RW) {
+if (entry->perm & IOMMU_RW) {
 request_flags = IOMMU_NOTIFIER_MAP;
 } else {
 request_flags = IOMMU_NOTIFIER_UNMAP;
 }
 
+if (notifier->notifier_flags & request_flags &&
+notifier->start <= entry->iova &&
+notifier->end >= entry->iova) {
+notifier->notify(notifier, entry);
+}
+}
+
+void memory_region_notify_iommu(MemoryRegion *mr,
+IOMMUTLBEntry entry)
+{
+IOMMUNotifier *iommu_notifier;
+
+assert(memory_region_is_iommu(mr));
+
 QLIST_FOREACH(iommu_notifier, >iommu_notify, node) {
-if (iommu_notifier->notifier_flags & request_flags &&
-iommu_notifier->start <= entry.iova &&
-iommu_notifier->end >= entry.iova) {
-iommu_notifier->notify(iommu_notifier, );
-}
+memory_region_notify_one(iommu_notifier, );
 }
 }
 
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 13/14] intel_iommu: allow dynamic switch of IOMMU region

2017-01-12 Thread Peter Xu
This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Signed-off-by: Peter Xu 
---
v3:
- fix another trivial style issue patchew reported but I missed in v2

v2:
- fix issues reported by patchew
- switch domain by enable/disable memory regions [David]
- provide vtd_switch_address_space{_all}()
- provide a better comment on the memory regions

test done: with intel_iommu device, boot vm with/without
"intel_iommu=on" parameter.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 78 ---
 hw/i386/trace-events  |  2 +-
 include/hw/i386/intel_iommu.h |  2 ++
 3 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index fd75112..2596f11 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1343,9 +1343,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
 vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
 }
 
+static void vtd_switch_address_space(VTDAddressSpace *as, bool iommu_enabled)
+{
+assert(as);
+
+trace_vtd_switch_address_space(pci_bus_num(as->bus),
+   VTD_PCI_SLOT(as->devfn),
+   VTD_PCI_FUNC(as->devfn),
+   iommu_enabled);
+
+/* Turn off first then on the other */
+if (iommu_enabled) {
+memory_region_set_enabled(>sys_alias, false);
+memory_region_set_enabled(>iommu, true);
+} else {
+memory_region_set_enabled(>iommu, false);
+memory_region_set_enabled(>sys_alias, true);
+}
+}
+
+static void vtd_switch_address_space_all(IntelIOMMUState *s, bool enabled)
+{
+GHashTableIter iter;
+VTDBus *vtd_bus;
+int i;
+
+g_hash_table_iter_init(, s->vtd_as_by_busptr);
+while (g_hash_table_iter_next(, NULL, (void **)_bus)) {
+for (i = 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) {
+if (!vtd_bus->dev_as[i]) {
+continue;
+}
+vtd_switch_address_space(vtd_bus->dev_as[i], enabled);
+}
+}
+}
+
 /* Handle Translation Enable/Disable */
 static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
 {
+if (s->dmar_enabled == en) {
+return;
+}
+
 VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
 
 if (en) {
@@ -1360,6 +1400,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool 
en)
 /* Ok - report back to driver */
 vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
 }
+
+vtd_switch_address_space_all(s, en);
 }
 
 /* Handle Interrupt Remap Enable/Disable */
@@ -2586,15 +2628,43 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, 
PCIBus *bus, int devfn)
 vtd_dev_as->devfn = (uint8_t)devfn;
 vtd_dev_as->iommu_state = s;
 vtd_dev_as->context_cache_entry.context_cache_gen = 0;
+
+/*
+ * Memory region relationships looks like (Address range shows
+ * only lower 32 bits to make it short in length...):
+ *
+ * |-+---+--|
+ * | Name| Address range | Priority |
+ * |-+---+--+
+ * | vtd_root| - |0 |
+ * |  intel_iommu| - |1 |
+ * |  vtd_sys_alias  | - |1 |
+ * |  intel_iommu_ir | fee0-feef |   64 |
+ * |-+---+--|
+ *
+ * We enable/disable DMAR by switching enablement for
+ * vtd_sys_alias and intel_iommu regions. 

[Qemu-devel] [PATCH RFC v3 07/14] memory: add section range info for IOMMU notifier

2017-01-12 Thread Peter Xu
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova is in the range of specific IOMMU notifier, and skip those
notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

Suggested-by: David Gibson 
Reviewed-by: David Gibson 
Acked-by: Paolo Bonzini 
Signed-off-by: Peter Xu 
---
 hw/vfio/common.c  | 7 ++-
 include/exec/memory.h | 3 +++
 memory.c  | 4 +++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 801578b..6f648da 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 giommu->container = container;
 giommu->n.notify = vfio_iommu_map_notify;
 giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
+giommu->n.start = section->offset_within_region;
+llend = int128_add(int128_make64(giommu->n.start), section->size);
+llend = int128_sub(llend, int128_one());
+giommu->n.end = int128_get64(llend);
 QLIST_INSERT_HEAD(>giommu_list, giommu, giommu_next);
 
 memory_region_register_iommu_notifier(giommu->iommu, >n);
@@ -525,7 +529,8 @@ static void vfio_listener_region_del(MemoryListener 
*listener,
 VFIOGuestIOMMU *giommu;
 
 QLIST_FOREACH(giommu, >giommu_list, giommu_next) {
-if (giommu->iommu == section->mr) {
+if (giommu->iommu == section->mr &&
+giommu->n.start == section->offset_within_region) {
 memory_region_unregister_iommu_notifier(giommu->iommu,
 >n);
 QLIST_REMOVE(giommu, giommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index bec9756..7649e74 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -84,6 +84,9 @@ typedef enum {
 struct IOMMUNotifier {
 void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
 IOMMUNotifierFlag notifier_flags;
+/* Notify for address space range start <= addr <= end */
+hwaddr start;
+hwaddr end;
 QLIST_ENTRY(IOMMUNotifier) node;
 };
 typedef struct IOMMUNotifier IOMMUNotifier;
diff --git a/memory.c b/memory.c
index 2bfc37f..e88bb54 100644
--- a/memory.c
+++ b/memory.c
@@ -1671,7 +1671,9 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 }
 
 QLIST_FOREACH(iommu_notifier, >iommu_notify, node) {
-if (iommu_notifier->notifier_flags & request_flags) {
+if (iommu_notifier->notifier_flags & request_flags &&
+iommu_notifier->start <= entry.iova &&
+iommu_notifier->end >= entry.iova) {
 iommu_notifier->notify(iommu_notifier, );
 }
 }
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 12/14] intel_iommu: do replay when context invalidate

2017-01-12 Thread Peter Xu
Before this one we only invalidate context cache when we receive context
entry invalidations. However it's possible that the invalidation also
contains a domain switch (only if cache-mode is enabled for vIOMMU). In
that case we need to notify all the registered components about the new
mapping.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 59bf683..fd75112 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1162,6 +1162,7 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
 trace_vtd_inv_desc_cc_device(bus_n, (devfn_it >> 3) & 0x1f,
  devfn_it & 3);
 vtd_as->context_cache_entry.context_cache_gen = 0;
+memory_region_iommu_replay_all(_as->iommu);
 }
 }
 }
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 05/14] intel_iommu: fix trace for addr translation

2017-01-12 Thread Peter Xu
Another patch to convert the DPRINTF() stuffs. This patch focuses on the
address translation path and caching.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 84 +--
 hw/i386/trace-events  |  7 +
 2 files changed, 48 insertions(+), 43 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 459e575..b4166e0 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -260,11 +260,9 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t 
source_id,
 uint64_t *key = g_malloc(sizeof(*key));
 uint64_t gfn = vtd_get_iotlb_gfn(addr, level);
 
-VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " iova 0x%"PRIx64
-" slpte 0x%"PRIx64 " did 0x%"PRIx16, source_id, addr, slpte,
-domain_id);
+trace_vtd_iotlb_page_update(source_id, addr, slpte, domain_id);
 if (g_hash_table_size(s->iotlb) >= VTD_IOTLB_MAX_SIZE) {
-VTD_DPRINTF(CACHE, "iotlb exceeds size limit, forced to reset");
+trace_vtd_iotlb_reset("iotlb exceeds size limit");
 vtd_reset_iotlb(s);
 }
 
@@ -505,8 +503,8 @@ static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t 
index,
 
 addr = s->root + index * sizeof(*re);
 if (dma_memory_read(_space_memory, addr, re, sizeof(*re))) {
-VTD_DPRINTF(GENERAL, "error: fail to access root-entry at 0x%"PRIx64
-" + %"PRIu8, s->root, index);
+error_report("Fail to access root-entry at 0x%"PRIx64
+ " index %"PRIu8, s->root, index);
 re->val = 0;
 return -VTD_FR_ROOT_TABLE_INV;
 }
@@ -525,13 +523,12 @@ static int vtd_get_context_entry_from_root(VTDRootEntry 
*root, uint8_t index,
 dma_addr_t addr;
 
 if (!vtd_root_entry_present(root)) {
-VTD_DPRINTF(GENERAL, "error: root-entry is not present");
+error_report("Root-entry is not present");
 return -VTD_FR_ROOT_ENTRY_P;
 }
 addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
 if (dma_memory_read(_space_memory, addr, ce, sizeof(*ce))) {
-VTD_DPRINTF(GENERAL, "error: fail to access context-entry at 0x%"PRIx64
-" + %"PRIu8,
+error_report("Fail to access context-entry at 0x%"PRIx64" ind %"PRIu8,
 (uint64_t)(root->val & VTD_ROOT_ENTRY_CTP), index);
 return -VTD_FR_CONTEXT_TABLE_INV;
 }
@@ -644,7 +641,7 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
  * in CAP_REG and AW in context-entry.
  */
 if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
-VTD_DPRINTF(GENERAL, "error: iova 0x%"PRIx64 " exceeds limits", iova);
+error_report("IOVA 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
 
@@ -656,7 +653,7 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
 slpte = vtd_get_slpte(addr, offset);
 
 if (slpte == (uint64_t)-1) {
-VTD_DPRINTF(GENERAL, "error: fail to access second-level paging "
+error_report("Fail to access second-level paging "
 "entry at level %"PRIu32 " for iova 0x%"PRIx64,
 level, iova);
 if (level == vtd_get_level_from_context_entry(ce)) {
@@ -669,13 +666,13 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
 *reads = (*reads) && (slpte & VTD_SL_R);
 *writes = (*writes) && (slpte & VTD_SL_W);
 if (!(slpte & access_right_check)) {
-VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
-"iova 0x%"PRIx64 " slpte 0x%"PRIx64,
-(is_write ? "write" : "read"), iova, slpte);
+error_report("Lack of %s permission for iova 0x%"PRIx64
+ " slpte 0x%"PRIx64,
+ (is_write ? "write" : "read"), iova, slpte);
 return is_write ? -VTD_FR_WRITE : -VTD_FR_READ;
 }
 if (vtd_slpte_nonzero_rsvd(slpte, level)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in second "
+error_report("Non-zero reserved field in second "
 "level paging entry level %"PRIu32 " slpte 0x%"PRIx64,
 level, slpte);
 return -VTD_FR_PAGING_ENTRY_RSVD;
@@ -704,12 +701,13 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, 
uint8_t bus_num,
 }
 
 if (!vtd_root_entry_present()) {
-VTD_DPRINTF(GENERAL, "error: root-entry #%"PRIu8 " is not present",
-bus_num);
+/* Not error - it's okay we don't have root entry. */
+trace_vtd_re_not_present(bus_num);
 return -VTD_FR_ROOT_ENTRY_P;
 } else if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in root-entry "
-"hi 0x%"PRIx64 " 

[Qemu-devel] [PATCH RFC v3 10/14] memory: add MemoryRegionIOMMUOps.replay() callback

2017-01-12 Thread Peter Xu
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch add one more hook for IOMMUOps for
the callback, and it'll override the default if set.

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 2 ++
 memory.c  | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f367e54..cff6958 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -181,6 +181,8 @@ struct MemoryRegionIOMMUOps {
 void (*notify_flag_changed)(MemoryRegion *iommu,
 IOMMUNotifierFlag old_flags,
 IOMMUNotifierFlag new_flags);
+/* Set this up to provide customized IOMMU replay function */
+void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/memory.c b/memory.c
index 6e4c872..609ac67 100644
--- a/memory.c
+++ b/memory.c
@@ -1629,6 +1629,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 hwaddr addr, granularity;
 IOMMUTLBEntry iotlb;
 
+/* If the IOMMU has its own replay callback, override */
+if (mr->iommu_ops->replay) {
+mr->iommu_ops->replay(mr, n);
+return;
+}
+
 granularity = memory_region_iommu_get_min_page_size(mr);
 
 for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 02/14] intel_iommu: simplify irq region translation

2017-01-12 Thread Peter Xu
Before we have int-remap, we need to bypass interrupt write requests.
That's not necessary now - we have supported int-remap, and all the irq
region requests should be redirected there. Cleaning up the block with
an assertion instead.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 28 ++--
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2868e37..77d467a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -818,28 +818,12 @@ static void vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 bool writes = true;
 VTDIOTLBEntry *iotlb_entry;
 
-/* Check if the request is in interrupt address range */
-if (vtd_is_interrupt_addr(addr)) {
-if (is_write) {
-/* FIXME: since we don't know the length of the access here, we
- * treat Non-DWORD length write requests without PASID as
- * interrupt requests, too. Withoud interrupt remapping support,
- * we just use 1:1 mapping.
- */
-VTD_DPRINTF(MMU, "write request to interrupt address "
-"gpa 0x%"PRIx64, addr);
-entry->iova = addr & VTD_PAGE_MASK_4K;
-entry->translated_addr = addr & VTD_PAGE_MASK_4K;
-entry->addr_mask = ~VTD_PAGE_MASK_4K;
-entry->perm = IOMMU_WO;
-return;
-} else {
-VTD_DPRINTF(GENERAL, "error: read request from interrupt address "
-"gpa 0x%"PRIx64, addr);
-vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ, is_write);
-return;
-}
-}
+/*
+ * We have standalone memory region for interrupt addresses, we
+ * should never receive translation requests in this region.
+ */
+assert(!vtd_is_interrupt_addr(addr));
+
 /* Try to fetch slpte form IOTLB */
 iotlb_entry = vtd_lookup_iotlb(s, source_id, addr);
 if (iotlb_entry) {
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 08/14] memory: provide iommu_replay_all()

2017-01-12 Thread Peter Xu
This is an "global" version of exising memory_region_iommu_replay() - we
announce the translations to all the registered notifiers, instead of a
specific one.

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 8 
 memory.c  | 9 +
 2 files changed, 17 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 7649e74..2233f99 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -694,6 +694,14 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 bool is_write);
 
 /**
+ * memory_region_iommu_replay_all: replay existing IOMMU translations
+ * to all the notifiers registered.
+ *
+ * @mr: the memory region to observe
+ */
+void memory_region_iommu_replay_all(MemoryRegion *mr);
+
+/**
  * memory_region_unregister_iommu_notifier: unregister a notifier for
  * changes to IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index e88bb54..df62bd1 100644
--- a/memory.c
+++ b/memory.c
@@ -1645,6 +1645,15 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 }
 }
 
+void memory_region_iommu_replay_all(MemoryRegion *mr)
+{
+IOMMUNotifier *notifier;
+
+QLIST_FOREACH(notifier, >iommu_notify, node) {
+memory_region_iommu_replay(mr, notifier, false);
+}
+}
+
 void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
  IOMMUNotifier *n)
 {
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 01/14] IOMMU: add option to enable VTD_CAP_CM to vIOMMU capility exposoed to guest

2017-01-12 Thread Peter Xu
From: Aviv Ben-David 

This capability asks the guest to invalidate cache before each map operation.
We can use this invalidation to trap map operations in the hypervisor.

Signed-off-by: Aviv Ben-David 
Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 5 +
 hw/i386/intel_iommu_internal.h | 1 +
 include/hw/i386/intel_iommu.h  | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ec62239..2868e37 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2107,6 +2107,7 @@ static Property vtd_properties[] = {
 DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
 ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
+DEFINE_PROP_BOOL("cache-mode", IntelIOMMUState, cache_mode_enabled, FALSE),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2488,6 +2489,10 @@ static void vtd_init(IntelIOMMUState *s)
 s->ecap |= VTD_ECAP_DT;
 }
 
+if (s->cache_mode_enabled) {
+s->cap |= VTD_CAP_CM;
+}
+
 vtd_reset_context_cache(s);
 vtd_reset_iotlb(s);
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 356f188..4104121 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -202,6 +202,7 @@
 #define VTD_CAP_MAMV(VTD_MAMV << 48)
 #define VTD_CAP_PSI (1ULL << 39)
 #define VTD_CAP_SLLPS   ((1ULL << 34) | (1ULL << 35))
+#define VTD_CAP_CM  (1ULL << 7)
 
 /* Supported Adjusted Guest Address Widths */
 #define VTD_CAP_SAGAW_SHIFT 8
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 405c9d1..749eef9 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -257,6 +257,8 @@ struct IntelIOMMUState {
 uint8_t womask[DMAR_REG_SIZE];  /* WO (write only - read returns 0) */
 uint32_t version;
 
+bool cache_mode_enabled;/* RO - is cap CM enabled? */
+
 dma_addr_t root;/* Current root table pointer */
 bool root_extended; /* Type of root table (extended or not) */
 bool dmar_enabled;  /* Set if DMA remapping is enabled */
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 03/14] intel_iommu: renaming gpa to iova where proper

2017-01-12 Thread Peter Xu
There are lots of places in current intel_iommu.c codes that named
"iova" as "gpa". It is really confusing to use a name "gpa" in these
places (which is very easily to be understood as "Guest Physical
Address", while it's not). To make the codes (much) easier to be read, I
decided to do this once and for all.

No functional change is made. Only literal ones.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 77d467a..275c3db 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -259,7 +259,7 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t 
source_id,
 uint64_t *key = g_malloc(sizeof(*key));
 uint64_t gfn = vtd_get_iotlb_gfn(addr, level);
 
-VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " gpa 0x%"PRIx64
+VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " iova 0x%"PRIx64
 " slpte 0x%"PRIx64 " did 0x%"PRIx16, source_id, addr, slpte,
 domain_id);
 if (g_hash_table_size(s->iotlb) >= VTD_IOTLB_MAX_SIZE) {
@@ -575,12 +575,12 @@ static uint64_t vtd_get_slpte(dma_addr_t base_addr, 
uint32_t index)
 return slpte;
 }
 
-/* Given a gpa and the level of paging structure, return the offset of current
- * level.
+/* Given an iova and the level of paging structure, return the offset
+ * of current level.
  */
-static inline uint32_t vtd_gpa_level_offset(uint64_t gpa, uint32_t level)
+static inline uint32_t vtd_iova_level_offset(uint64_t iova, uint32_t level)
 {
-return (gpa >> vtd_slpt_level_shift(level)) &
+return (iova >> vtd_slpt_level_shift(level)) &
 ((1ULL << VTD_SL_LEVEL_BITS) - 1);
 }
 
@@ -628,10 +628,10 @@ static bool vtd_slpte_nonzero_rsvd(uint64_t slpte, 
uint32_t level)
 }
 }
 
-/* Given the @gpa, get relevant @slptep. @slpte_level will be the last level
+/* Given the @iova, get relevant @slptep. @slpte_level will be the last level
  * of the translation, can be used for deciding the size of large page.
  */
-static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa, bool is_write,
+static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
 uint64_t *slptep, uint32_t *slpte_level,
 bool *reads, bool *writes)
 {
@@ -642,11 +642,11 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
gpa, bool is_write,
 uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
 uint64_t access_right_check;
 
-/* Check if @gpa is above 2^X-1, where X is the minimum of MGAW in CAP_REG
- * and AW in context-entry.
+/* Check if @iova is above 2^X-1, where X is the minimum of MGAW
+ * in CAP_REG and AW in context-entry.
  */
-if (gpa & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
-VTD_DPRINTF(GENERAL, "error: gpa 0x%"PRIx64 " exceeds limits", gpa);
+if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
+VTD_DPRINTF(GENERAL, "error: iova 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
 
@@ -654,13 +654,13 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
gpa, bool is_write,
 access_right_check = is_write ? VTD_SL_W : VTD_SL_R;
 
 while (true) {
-offset = vtd_gpa_level_offset(gpa, level);
+offset = vtd_iova_level_offset(iova, level);
 slpte = vtd_get_slpte(addr, offset);
 
 if (slpte == (uint64_t)-1) {
 VTD_DPRINTF(GENERAL, "error: fail to access second-level paging "
-"entry at level %"PRIu32 " for gpa 0x%"PRIx64,
-level, gpa);
+"entry at level %"PRIu32 " for iova 0x%"PRIx64,
+level, iova);
 if (level == vtd_get_level_from_context_entry(ce)) {
 /* Invalid programming of context-entry */
 return -VTD_FR_CONTEXT_ENTRY_INV;
@@ -672,8 +672,8 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
gpa, bool is_write,
 *writes = (*writes) && (slpte & VTD_SL_W);
 if (!(slpte & access_right_check)) {
 VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
-"gpa 0x%"PRIx64 " slpte 0x%"PRIx64,
-(is_write ? "write" : "read"), gpa, slpte);
+"iova 0x%"PRIx64 " slpte 0x%"PRIx64,
+(is_write ? "write" : "read"), iova, slpte);
 return is_write ? -VTD_FR_WRITE : -VTD_FR_READ;
 }
 if (vtd_slpte_nonzero_rsvd(slpte, level)) {
@@ -827,7 +827,7 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, 
PCIBus *bus,
 /* Try to fetch slpte form IOTLB */
 iotlb_entry = vtd_lookup_iotlb(s, source_id, addr);
 if (iotlb_entry) {
-VTD_DPRINTF(CACHE, "hit iotlb sid 0x%"PRIx16 " gpa 0x%"PRIx64
+VTD_DPRINTF(CACHE, "hit 

[Qemu-devel] [PATCH RFC v3 04/14] intel_iommu: fix trace for inv desc handling

2017-01-12 Thread Peter Xu
VT-d codes are still using static DEBUG_INTEL_IOMMU macro. That's not
good, and we should end the day when we need to recompile the code
before getting useful debugging information for vt-d. Time to switch to
the trace system.

This is the first patch to do it.

Generally, the rule of mine is:

- for the old GENERAL typed message, I use error_report() directly if
  apply. Those are something shouldn't happen, and we should print those
  errors in all cases, even without enabling debug and tracing.

- for the non-GENERAL typed messages, remove those VTD_PRINTF()s that
  looks hardly used, and convert the rest lines into trace_*().

- for useless DPRINTFs, I removed them.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 98 ---
 hw/i386/trace-events  | 13 +++
 2 files changed, 59 insertions(+), 52 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 275c3db..459e575 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
 #include "kvm_i386.h"
+#include "trace.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -474,22 +475,19 @@ static void vtd_handle_inv_queue_error(IntelIOMMUState *s)
 /* Set the IWC field and try to generate an invalidation completion interrupt 
*/
 static void vtd_generate_completion_event(IntelIOMMUState *s)
 {
-VTD_DPRINTF(INV, "completes an invalidation wait command with "
-"Interrupt Flag");
 if (vtd_get_long_raw(s, DMAR_ICS_REG) & VTD_ICS_IWC) {
-VTD_DPRINTF(INV, "there is a previous interrupt condition to be "
-"serviced by software, "
-"new invalidation event is not generated");
+trace_vtd_inv_desc_wait_irq("One pending, skip current");
 return;
 }
 vtd_set_clear_mask_long(s, DMAR_ICS_REG, 0, VTD_ICS_IWC);
 vtd_set_clear_mask_long(s, DMAR_IECTL_REG, 0, VTD_IECTL_IP);
 if (vtd_get_long_raw(s, DMAR_IECTL_REG) & VTD_IECTL_IM) {
-VTD_DPRINTF(INV, "IM filed in IECTL_REG is set, new invalidation "
-"event is not generated");
+trace_vtd_inv_desc_wait_irq("IM in IECTL_REG is set, "
+"new event not generated");
 return;
 } else {
 /* Generate the interrupt event */
+trace_vtd_inv_desc_wait_irq("Generating complete event");
 vtd_generate_interrupt(s, DMAR_IEADDR_REG, DMAR_IEDATA_REG);
 vtd_set_clear_mask_long(s, DMAR_IECTL_REG, VTD_IECTL_IP, 0);
 }
@@ -923,6 +921,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState 
*s)
 
 static void vtd_context_global_invalidate(IntelIOMMUState *s)
 {
+trace_vtd_inv_desc_cc_global();
 s->context_cache_gen++;
 if (s->context_cache_gen == VTD_CONTEXT_CACHE_GEN_MAX) {
 vtd_reset_context_cache(s);
@@ -962,9 +961,11 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
 uint16_t mask;
 VTDBus *vtd_bus;
 VTDAddressSpace *vtd_as;
-uint16_t devfn;
+uint8_t bus_n, devfn;
 uint16_t devfn_it;
 
+trace_vtd_inv_desc_cc_devices(source_id, func_mask);
+
 switch (func_mask & 3) {
 case 0:
 mask = 0;   /* No bits in the SID field masked */
@@ -980,16 +981,16 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
 break;
 }
 mask = ~mask;
-VTD_DPRINTF(INV, "device-selective invalidation source 0x%"PRIx16
-" mask %"PRIu16, source_id, mask);
-vtd_bus = vtd_find_as_from_bus_num(s, VTD_SID_TO_BUS(source_id));
+
+bus_n = VTD_SID_TO_BUS(source_id);
+vtd_bus = vtd_find_as_from_bus_num(s, bus_n);
 if (vtd_bus) {
 devfn = VTD_SID_TO_DEVFN(source_id);
 for (devfn_it = 0; devfn_it < X86_IOMMU_PCI_DEVFN_MAX; ++devfn_it) {
 vtd_as = vtd_bus->dev_as[devfn_it];
 if (vtd_as && ((devfn_it & mask) == (devfn & mask))) {
-VTD_DPRINTF(INV, "invalidate context-cahce of devfn 0x%"PRIx16,
-devfn_it);
+trace_vtd_inv_desc_cc_device(bus_n, (devfn_it >> 3) & 0x1f,
+ devfn_it & 3);
 vtd_as->context_cache_entry.context_cache_gen = 0;
 }
 }
@@ -1302,7 +1303,7 @@ static bool vtd_process_wait_desc(IntelIOMMUState *s, 
VTDInvDesc *inv_desc)
 {
 if ((inv_desc->hi & VTD_INV_DESC_WAIT_RSVD_HI) ||
 (inv_desc->lo & VTD_INV_DESC_WAIT_RSVD_LO)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in Invalidation "
+error_report("Non-zero reserved field in Invalidation "
 "Wait Descriptor hi 0x%"PRIx64 " lo 0x%"PRIx64,
 inv_desc->hi, inv_desc->lo);
 return false;
@@ -1316,21 +1317,20 @@ static bool vtd_process_wait_desc(IntelIOMMUState *s, 
VTDInvDesc *inv_desc)
 
 /* FIXME: need 

[Qemu-devel] [PATCH RFC v3 06/14] intel_iommu: vtd_slpt_level_shift check level

2017-01-12 Thread Peter Xu
This helps in debugging incorrect level passed in.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index b4166e0..b4019d0 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -168,6 +168,7 @@ static gboolean vtd_hash_remove_by_domain(gpointer key, 
gpointer value,
 /* The shift of an addr for a certain level of paging structure */
 static inline uint32_t vtd_slpt_level_shift(uint32_t level)
 {
+assert(level != 0);
 return VTD_PAGE_SHIFT_4K + (level - 1) * VTD_SL_LEVEL_BITS;
 }
 
-- 
2.7.4




[Qemu-devel] [PATCH RFC v3 00/14] VT-d: vfio enablement and misc enhances

2017-01-12 Thread Peter Xu
v3:
- fix style error reported by patchew
- fix comment in domain switch patch: use "IOMMU address space" rather
  than "IOMMU region" [Kevin]
- add ack-by for Paolo in patch:
  "memory: add section range info for IOMMU notifier"
  (this is seperately collected besides this thread)
- remove 3 patches which are merged already (from Jason)
- rebase to master b6c0897

v2:
- change comment for "end" parameter in vtd_page_walk() [Tianyu]
- change comment for "a iova" to "an iova" [Yi]
- fix fault printed val for GPA address in vtd_page_walk_level (debug
  only)
- rebased to master (rather than Aviv's v6 series) and merged Aviv's
  series v6: picked patch 1 (as patch 1 in this series), dropped patch
  2, re-wrote patch 3 (as patch 17 of this series).
- picked up two more bugfix patches from Jason's DMAR series
- picked up the following patch as well:
  "[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region"

This RFC series is a re-work for Aviv B.D.'s vfio enablement series
with vt-d:

  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html

Aviv has done a great job there, and what we still lack there are
mostly the following:

(1) VFIO got duplicated IOTLB notifications due to splitted VT-d IOMMU
memory region.

(2) VT-d still haven't provide a correct replay() mechanism (e.g.,
when IOMMU domain switches, things will broke).

This series should have solved the above two issues.

Online repo:

  https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v2

I would be glad to hear about any review comments for above patches.

=
Test Done
=

Build test passed for x86_64/arm/ppc64.

Simply tested with x86_64, assigning two PCI devices to a single VM,
boot the VM using:

bin=x86_64-softmmu/qemu-system-x86_64
$bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
 -device intel-iommu,intremap=on,eim=off,cache-mode=on \
 -netdev user,id=net0,hostfwd=tcp::-:22 \
 -device virtio-net-pci,netdev=net0 \
 -device vfio-pci,host=03:00.0 \
 -device vfio-pci,host=02:00.0 \
 -trace events=".trace.vfio" \
 /var/lib/libvirt/images/vm1.qcow2

pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
vtd_page_walk*
vtd_replay*
vtd_inv_desc*

Then, in the guest, run the following tool:

  
https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c

With parameter:

  ./vfio-bind-group 00:03.0 00:04.0

Check host side trace log, I can see pages are replayed and mapped in
00:04.0 device address space, like:

...
vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001
vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x80
vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 
0x80
vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 
0x4000
vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 
0x20
vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 
mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 
0x22e25000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 
0x22e12000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 
0x22e2d000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 
0x12a49000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 
0x129bb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 
0x128db000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 
0x12a8 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 
0x12a7e000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 
0x12b22000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 
0x12b41000 mask 0xfff perm 3
...

=
Todo List
=

- error reporting for the assigned devices (as Tianyu has mentioned)

- per-domain address-space: A better solution in the future may be -
  we maintain one address space per IOMMU domain in the guest (so
  multiple devices can share a same address space if they are sharing
  the same IOMMU domains in the guest), rather than one address space
  per device (which is current implementation of vt-d). However that's
  a step further than this series, and let's see whether we can first
  provide a workable version of device assignment with vt-d
  protection.

- more to come...

Thanks,

Aviv Ben-David (1):
  IOMMU: add option to enable VTD_CAP_CM to vIOMMU capility exposoed to
guest

Peter Xu (13):
  intel_iommu: simplify irq region translation
  intel_iommu: renaming gpa to iova where proper
  intel_iommu: fix trace for inv desc handling
  intel_iommu: fix trace for addr translation
  intel_iommu: 

[Qemu-devel] [PATCH] nvdimm: allow read/write zero-size namespace label

2017-01-12 Thread Li Qiang
From: Li Qiang 

The spec doesn't say the namespace label can't be zero
when read/write it. As this is no harmful, just allow
it.

Signed-off-by: Li Qiang 
---
 hw/mem/nvdimm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index db896b0..4042097 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -114,7 +114,7 @@ static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
 static void nvdimm_validate_rw_label_data(NVDIMMDevice *nvdimm, uint64_t size,
 uint64_t offset)
 {
-assert((nvdimm->label_size >= size + offset) && (offset + size > offset));
+assert((nvdimm->label_size >= size + offset) && (offset + size >= offset));
 }
 
 static void nvdimm_read_label_data(NVDIMMDevice *nvdimm, void *buf,
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - part12

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 09:54:04PM +0530, Nikunj A Dadhania wrote:
> This series contains 11 new instructions for POWER9 ISA3.0
> VSX Scalar Test Data Class
> VSX Vector Test Data Class
> VSX Vector Convert HP/SP
> VSX Scalar Multiply/Divide
> VSX Scalar Convert Unsigned/Signed Doubleword
> 
> Bharata B Rao (4):
>   target-ppc: Use ppc_vsr_t.f128 in xscmp[o,u,exp]qp
>   target-ppc: Add xscvsdqp and xscvudqp instructions
>   target-ppc: Add xsdivqp instruction
>   target-ppc: Add xsmulqp instruction
> 
> Nikunj A Dadhania (3):
>   target-ppc: Add xvcv[hpsp, sphp] instructions
>   target-ppc: Add xvtstdc[sp,dp] instructions
>   target-ppc: Add xststdc[sp, dp, qp] instructions
> 
>  target/ppc/fpu_helper.c | 253 
> +---
>  target/ppc/helper.h |  11 ++
>  target/ppc/internal.h   |   6 +-
>  target/ppc/translate/vsx-impl.inc.c |  11 ++
>  target/ppc/translate/vsx-ops.inc.c  |  18 +++
>  5 files changed, 276 insertions(+), 23 deletions(-)

I've merged 1..5.  6 & 7 I've left, pending a reply to rth's comment.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH RFC v2 00/17] VT-d: vfio enablement and misc enhances

2017-01-12 Thread Peter Xu
On Fri, Jan 13, 2017 at 10:19:27AM +0800, Peter Xu wrote:

[...]

> > This all looks good to me. The series needs to be
> > rebased on top of latest bits.
> > In particular, Jason made changes which conflict
> > with this.
> 
> Michael,
> 
> Thanks for your positive feedback.
> 
> Could you provide me a branch so that I can rebase this work on? I was
> trying to find a good base point in your repo here:
> 
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
> 
> but failed to find a suitable branch. Thanks,

Please don't bother. Just noticed that Jason's patches related to VT-d
have been merged. So will rebase to master and re-post.

Thanks,

-- peterx



Re: [Qemu-devel] [virtio-dev] Re: [PATCH v15 0/2] virtio-crypto: virtio crypto device specification

2017-01-12 Thread Gonglei (Arei)

> 
> On Thu, Jan 12, 2017 at 12:26:24PM +, Gonglei (Arei) wrote:
> > Hi,
> >
> >
> > >
> > > On 01/04/2017 11:10 AM, Gonglei (Arei) wrote:
> > > > Hi all,
> > > >
> > > > I attach the diff files between v14 and v15 for better review.
> > > >
> > > Hi,
> > >
> > > only had a quick look. Will try to come back to this later.
> > >
> > That's cool.
> >
> > > > diff --git a/virtio-crypto.tex b/virtio-crypto.tex
> > > > index 9f7faf0..884ee95 100644
> > > > --- a/virtio-crypto.tex
> > > > +++ b/virtio-crypto.tex
> > > > @@ -2,8 +2,8 @@
> > > >
> > > >  The virtio crypto device is a virtual cryptography device as well as a 
> > > > kind
> of
> > > >  virtual hardware accelerator for virtual machines. The encryption and
> > > > -decryption requests are placed in the data queue and are ultimately
> handled
> > > by the
> > > > -backend crypto accelerators. The second queue is the control queue used
> to
> > > create
> > > > +decryption requests are placed in any of the data active queues and are
> > > ultimately handled by the
> > > s/data active/active data/
> > > > +backend crypto accelerators. The second kind of queue is the control
> queue
> > > used to create
> > > >  or destroy sessions for symmetric algorithms and will control some
> > > advanced
> > > >  features in the future. The virtio crypto device provides the following
> crypto
> > > >  services: CIPHER, MAC, HASH, and AEAD.
> > >
> > > [..]
> > >
> > > > ===The below diff shows the changes of add non-session
> mode
> > > support:
> > > >
> > > > diff --git a/virtio-crypto.tex b/virtio-crypto.tex
> > > > index 884ee95..44819f9 100644
> > > > --- a/virtio-crypto.tex
> > > > +++ b/virtio-crypto.tex
> > > > @@ -26,7 +26,10 @@ N is set by \field{max_dataqueues}.
> > > >
> > > >  \subsection{Feature bits}\label{sec:Device Types / Crypto Device /
> Feature
> > > bits}
> > > >
> > > > -None currently defined.
> > > > +VIRTIO_CRYPTO_F_CIPHER_SESSION_MODE (1) Session mode is
> available
> > > for CIPHER service.
> > > > +VIRTIO_CRYPTO_F_HASH_SESSION_MODE (2) Session mode is available
> for
> > > HASH service.
> > > > +VIRTIO_CRYPTO_F_MAC_SESSION_MODE (3) Session mode is available
> for
> > > MAC service.
> > > > +VIRTIO_CRYPTO_F_AEAD_SESSION_MODE (4) Session mode is available
> for
> > > AEAD service.
> > > >
> > > >  \subsection{Device configuration layout}\label{sec:Device Types /
> Crypto
> > > Device / Device configuration layout}
> > > >
> > > > @@ -208,6 +211,9 @@ Operation parameters are algorithm-specific
> > > parameters, output data is the
> > > >  data that should be utilized in operations, and input data is equal to
> > > >  "operation result + result data".
> > > >
> > > > +The device can support both session mode (See \ref{sec:Device Types /
> > > Crypto Device / Device Operation / Control Virtqueue / Session operation})
> and
> > > non-session mode, for example,
> > > > +As VIRTIO_CRYPTO_F_CIPHER_SESSION feature bit is negotiated, the
> driver
> > > can use session mode for CIPHER service, otherwise it can only use
> non-session
> > > mode.
> > > > +
> > >
> > > As far as I understand you are adding non-session mode to the mix but
> > > providing feature bits for session mode. Would this render the the current
> > > implementation non-compliant?
> > >
> > You are right, shall we use feature bits for non-session mode for 
> > compliancy?
> > Or because the spec is on the fly, and some structures in the 
> > virtio_crypto.h
> need to
> > be modified, can we keep the compliancy completely?
> >
> > Thanks,
> > -Gonglei
> 
> Since there's a linux driver upstream you must at least
> keep compatibility with that.
> 
Sure. We use feature bits for non-session mode then.
For structures modification, do we need to do some specific
actions for compatibility?  Take CIPHER service as an example:

The present structure:

struct virtio_crypto_cipher_para {
/*
 * Byte Length of valid IV/Counter data pointed to by the below iv data.
 *
 * For block ciphers in CBC or F8 mode, or for Kasumi in F8 mode, or for
 *   SNOW3G in UEA2 mode, this is the length of the IV (which
 *   must be the same as the block length of the cipher).
 * For block ciphers in CTR mode, this is the length of the counter
 *   (which must be the same as the block length of the cipher).
 */
le32 iv_len;
/* length of source data */
le32 src_data_len;
/* length of destination data */
le32 dst_data_len;
};

The future structure if supporting non-session based operations:

struct virtio_crypto_cipher_para {
struct {
/* See VIRTIO_CRYPTO_CIPHER* above */
le32 algo;
/* length of key */
le32 keylen;

/* See VIRTIO_CRYPTO_OP_* above */
le32 op;
} sess_para;

/*
 * Byte Length of valid IV/Counter data pointed to by the below iv data.
 *
 * For block ciphers in CBC or F8 mode, or for Kasumi in F8 mode, or for
 *   SNOW3G in UEA2 mode, this is the length of 

Re: [Qemu-devel] [PATCH v5 00/10] aio_context_acquire/release pushdown, part 1

2017-01-12 Thread Fam Zheng
On Thu, 01/12 19:07, Paolo Bonzini wrote:
> This is the first step of pushing down the AioContext lock.  Bottom halves
> are already protected by their own lock, use it also for walking_bh
> and for the handlers list (including walking_handlers).  The (lock,
> walking_foo) pair is wrapped into the QemuLockCnt primitive.
> 
> The only difference from v3 is a smattering of tiny nice improvements
> to QemuLockCnt.
> 
> Paolo
> 
> v4->v5:
> remove stray tabs [patchew]
> 
> v3->v4:
> Avoid useless atomic_mb_read in non-futex lockcnt [Stefan]
> Use atomic_read in qemu_lockcnt_count [Stefan]
> Tweak comment for qemu_lockcnt_cmpxchg_or_wait [Fam]
> Use if/else in qemu_lockcnt_dec_and_lock [Fam]
> Comment QEMU_LOCKCNT_STATE_* definitions [Fam]

Perfect!

Reviewed-by: Fam Zheng 



Re: [Qemu-devel] [PATCH V4 net-next] vhost_net: device IOTLB support

2017-01-12 Thread Jason Wang



On 2017年01月12日 22:17, Michael S. Tsirkin wrote:

On Wed, Jan 11, 2017 at 12:32:12PM +0800, Jason Wang wrote:

This patches implements Device IOTLB support for vhost kernel. This is
done through:

1) switch to use dma helpers when map/unmap vrings from vhost codes
2) introduce a set of VhostOps to:
- setting up device IOTLB request callback
- processing device IOTLB request
- processing device IOTLB invalidation
2) kernel support for Device IOTLB API:

- allow vhost-net to query the IOMMU IOTLB entry through eventfd
- enable the ability for qemu to update a specified mapping of vhost
- through ioctl.
- enable the ability to invalidate a specified range of iova for the
   device IOTLB of vhost through ioctl. In x86/intel_iommu case this is
   triggered through iommu memory region notifier from device IOTLB
   invalidation descriptor processing routine.

With all the above, kernel vhost_net can co-operate with userspace
IOMMU. For vhost-user, the support could be easily done on top by
implementing the VhostOps.

Cc: Michael S. Tsirkin
Signed-off-by: Jason Wang

Applied, thanks!


---
Changes from V4:
- set iotlb callback only when IOMMU_PLATFORM is negotiated (fix
   vhost-user qtest failure)

In fact this only checks virtio_host_has_feature - which is
the right thing to do, we can't trust the guest.


- whitelist VIRTIO_F_IOMMU_PLATFORM instead of manually add it
- keep cpu_physical_memory_map() in vhost_memory_map()

One further enhancement might be to detect that guest disabled
iommu (e.g. globally, or using iommu=pt) and disable
the iotlb to avoid overhead for guests which use DPDK
for assigned devices but not for vhost.




Yes, it's in my todo list.

Thanks



Re: [Qemu-devel] [PATCH RFC v2 00/17] VT-d: vfio enablement and misc enhances

2017-01-12 Thread Peter Xu
On Thu, Jan 12, 2017 at 04:27:30PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 03, 2017 at 03:29:37PM +0800, Peter Xu wrote:
> > (I renamed the title for this RFC v2, since starting from this version
> >  the series will be based on master, also I picked up some more fixes
> >  for vt-d into this series)
> > 
> > v2:
> > - change comment for "end" parameter in vtd_page_walk() [Tianyu]
> > - change comment for "a iova" to "an iova" [Yi]
> > - fix fault printed val for GPA address in vtd_page_walk_level (debug
> >   only)
> > - rebased to master (rather than Aviv's v6 series) and merged Aviv's
> >   series v6: picked patch 1 (as patch 1 in this series), dropped patch
> >   2, re-wrote patch 3 (as patch 17 of this series).
> > - picked up two more bugfix patches from Jason's DMAR series
> > - picked up the following patch as well:
> >   "[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region"
> > 
> > This RFC series is a re-work for Aviv B.D.'s vfio enablement series
> > with vt-d:
> > 
> >   https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html
> > 
> > Aviv has done a great job there, and what we still lack there are
> > mostly the following:
> > 
> > (1) VFIO got duplicated IOTLB notifications due to splitted VT-d IOMMU
> > memory region.
> > 
> > (2) VT-d still haven't provide a correct replay() mechanism (e.g.,
> > when IOMMU domain switches, things will broke).
> > 
> > This series should have solved the above two issues.
> > 
> > Online repo:
> > 
> >   https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v2
> > 
> > I would be glad to hear about any review comments for above patches.
> 
> This all looks good to me. The series needs to be
> rebased on top of latest bits.
> In particular, Jason made changes which conflict
> with this.

Michael,

Thanks for your positive feedback.

Could you provide me a branch so that I can rebase this work on? I was
trying to find a good base point in your repo here:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git

but failed to find a suitable branch. Thanks,

-- peterx



Re: [Qemu-devel] [PATCH v8 1/1] crypto: add virtio-crypto driver

2017-01-12 Thread Gonglei (Arei)
> 
> On Thu, Jan 12, 2017 at 03:10:25PM +0100, Christian Borntraeger wrote:
> > On 01/10/2017 01:56 PM, Christian Borntraeger wrote:
> > > On 01/10/2017 01:36 PM, Gonglei (Arei) wrote:
> > >> Hi,
> > >>
> > >>>
> > >>> On 12/15/2016 03:03 AM, Gonglei wrote:
> > >>> [...]
> >  +
> >  +static struct crypto_alg virtio_crypto_algs[] = { {
> >  +  .cra_name = "cbc(aes)",
> >  +  .cra_driver_name = "virtio_crypto_aes_cbc",
> >  +  .cra_priority = 501,
> > >>>
> > >>>
> > >>> This is still higher than the hardware-accelerators (like intel aesni 
> > >>> or the
> > >>> s390 cpacf functions or the arm hw). aesni and s390/cpacf are supported
> by the
> > >>> hardware virtualization and available to the guests. I do not see a way
> how
> > >>> virtio
> > >>> crypto can be faster than that (in the end it might be cpacf/aesni +
> overhead)
> > >>> instead it will very likely be slower.
> > >>> So we should use a number that is higher than software implementations
> but
> > >>> lower than the hw ones.
> > >>>
> > >>> Just grepping around, the software ones seem be be around 100 and the
> > >>> hardware
> > >>> ones around 200-400. So why was 150 not enough?
> > >>>
> > >> I didn't find a documentation about how we use the priority, and I 
> > >> assumed
> > >> people use virtio-crypto will configure hardware accelerators in the
> > >> host. So I choosed the number which bigger than aesni's priority.
> > >
> > > Yes, but the aesni driver will only bind if there is HW support in the 
> > > guest.
> > > And if aesni is available in the guest (or the s390 aes function from 
> > > cpacf)
> > > it will always be faster than the same in the host via virtio.So your 
> > > priority
> > > should be smaller.
> >
> >
> > any opinion on this?
> 
> Going forward, we might add an emulated aesni device and that might
> become slower than virtio. OTOH if or when this happens, we can solve it
> by adding a priority or a feature flag to virtio to raise its priority.
> 
> So I think I agree with Christian here, let's lower the priority.
> Gonglei, could you send a patch like this?
> 
OK, will do.

Thanks,
-Gonglei



[Qemu-devel] [PATCH v2] x86: add AVX512_VPOPCNTDQ features

2017-01-12 Thread He Chen
AVX512_VPOPCNTDQ: Vector POPCNT instructions for word and qwords.
variable precision.

Signed-off-by: He Chen 
---
Changes from v1:
* Rename vpopcntdq to avx512-vpopcntdq.
---
 target/i386/cpu.c | 2 +-
 target/i386/cpu.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b0640f1..1a4b164 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -435,7 +435,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 NULL, "avx512vbmi", "umip", "pku",
 "ospke", NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
-NULL, NULL, NULL, NULL,
+NULL, NULL, "avx512-vpopcntdq", NULL,
 "la57", NULL, NULL, NULL,
 NULL, NULL, "rdpid", NULL,
 NULL, NULL, NULL, NULL,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a04e46b..37b209e 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -630,6 +630,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_ECX_UMIP (1U << 2)
 #define CPUID_7_0_ECX_PKU  (1U << 3)
 #define CPUID_7_0_ECX_OSPKE(1U << 4)
+#define CPUID_7_0_ECX_AVX512_VPOPCNTDQ (1U << 14) /* POPCNT for vectors of 
DW/QW */
 #define CPUID_7_0_ECX_LA57 (1U << 16)
 #define CPUID_7_0_ECX_RDPID(1U << 22)
 
-- 
2.7.4




[Qemu-devel] [PATCH] libvhost-user: Start VQs on SET_VRING_CALL

2017-01-12 Thread Felipe Franciosi
Currently, VQs are started as soon as a SET_VRING_KICK is received. That
is too early in the VQ setup process, as the backend might not yet have
a callfd to notify in case it received a kick and fully processed the
request/command. This patch only starts a VQ when a SET_VRING_CALL is
received.

Signed-off-by: Felipe Franciosi 
---
 contrib/libvhost-user/libvhost-user.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index af4faad..a46ef90 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -607,19 +607,6 @@ vu_set_vring_kick_exec(VuDev *dev, VhostUserMsg *vmsg)
 DPRINT("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
 }
 
-dev->vq[index].started = true;
-if (dev->iface->queue_set_started) {
-dev->iface->queue_set_started(dev, index, true);
-}
-
-if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
-dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
-   vu_kick_cb, (void *)(long)index);
-
-DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
-   dev->vq[index].kick_fd, index);
-}
-
 return false;
 }
 
@@ -661,6 +648,19 @@ vu_set_vring_call_exec(VuDev *dev, VhostUserMsg *vmsg)
 
 DPRINT("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
 
+dev->vq[index].started = true;
+if (dev->iface->queue_set_started) {
+dev->iface->queue_set_started(dev, index, true);
+}
+
+if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
+dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
+   vu_kick_cb, (void *)(long)index);
+
+DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
+   dev->vq[index].kick_fd, index);
+}
+
 return false;
 }
 
-- 
1.9.4




Re: [Qemu-devel] [PATCH v3 2/3] vus: Introduce vhost-user-scsi host device

2017-01-12 Thread Felipe Franciosi
Hi Paolo,

Thanks again for the review.
Just to clarify:

> On 2 Jan 2017, at 02:25, Paolo Bonzini  wrote:
> 
> 
> 
> On 21/12/2016 23:17, Felipe Franciosi wrote:
>> To use it, one must configure Qemu with --enable-vhost-user-scsi and
>> start Qemu with a command line equivalent to:
>> 
>> qemu-system-x86_64 \
>>   -chardev socket,id=vus0,path=/tmp/vus.sock \
>>   -device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=...
>> 
>> A separate commit presents a sample application linked with libiscsi to
>> provide a backend for vhost-user-scsi.
> 
> Please place CONFIG_VHOST_USER_SCSI=$(CONFIG_POSIX) symbol in
> default-configs/ (so that it is enabled by default on non-Windows hosts)
> instead of having the configure option.  Otherwise, the patches look good!

Which architectures would you like me to add this to? How about putting it in 
"pci.mak"?

And when you say "instead", do you mean on top of? I imagine having a --disable 
switch is at least desirable, so I'd rather not remove the commands from 
'configure'.

Cheers,
Felipe

> 
> Paolo





Re: [Qemu-devel] [PATCH] nvdimm acpi: fix g_array_free() with NULL pointer

2017-01-12 Thread Xiao Guangrong


CCed Haozhong

On 01/12/2017 07:09 PM, Stefan Hajnoczi wrote:

On Thu, Jan 12, 2017 at 11:18:25AM +0800, Xiao Guangrong wrote:



On 01/11/2017 05:36 PM, Stefan Hajnoczi wrote:

Unlike g_free(), g_array_free() does not accept a NULL pointer argument.
The following error is logged when an nvdimm device is realized:

  GLib-CRITICAL **: g_array_free: assertion 'array' failed

Cc: Xiao Guangrong 
Signed-off-by: Stefan Hajnoczi 
---
 hw/acpi/nvdimm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


NACK



diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 8e7d6ec..8f0a484 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -375,7 +375,9 @@ static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)

 static void nvdimm_build_fit_buffer(NvdimmFitBuffer *fit_buf)
 {
-g_array_free(fit_buf->fit, true);
+if (fit_buf->fit) {
+g_array_free(fit_buf->fit, true);
+}


Er, i do not know why it is NULL as we have init-ed it in 
nvdimm_init_fit_buffer:

static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
{
fit_buf->fit = g_array_new(false, true /* clear */, 1);
}

And i can not reproduce it on my box, could you share your command line and the
based commit id?


Good point, it happens when nvdimm_plug() is called but -M pc,nvdimm is
missing from the command-line.  This means nvdimm_init_acpi_state() was
not called by pc_init1():

  $ x86_64-softmmu/qemu-system-x86_64 \
  -enable-kvm \
  -m 1G,slots=2,maxmem=16G \
  -drive if=virtio,file=test.img,format=raw \
  -object memory-backend-file,id=hostmem0,mem-path=mydimm,share=on,size=8G \
  -device nvdimm,id=nvdimm0,memdev=hostmem0

Do you want to audit the code to check if anything else misbehaves when
-device nvdimm is used without -M pc,nvdimm?


Yes. Haozhong will help me to audit the code and fix this crash.

Thanks for your report, Stefan!




Re: [Qemu-devel] [PATCH v3 3/3] cputlb: drop flush_global flag from tlb_flush

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 03:47:31PM +, Alex Bennée wrote:
> We have never has the concept of global TLB entries which would avoid
> the flush so we never actually use this flag. Drop it and make clear
> that tlb_flush is the sledge-hammer it has always been.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 

For ppc:

Acked-by: David Gibson 

> ---
>  cputlb.c   | 21 ++---
>  exec.c |  4 ++--
>  hw/sh4/sh7750.c|  2 +-
>  include/exec/exec-all.h| 14 ++
>  target/alpha/cpu.c |  2 +-
>  target/alpha/sys_helper.c  |  2 +-
>  target/arm/helper.c| 26 +-
>  target/i386/fpu_helper.c   |  2 +-
>  target/i386/helper.c   |  8 
>  target/i386/machine.c  |  2 +-
>  target/i386/misc_helper.c  |  2 +-
>  target/i386/svm_helper.c   |  2 +-
>  target/microblaze/mmu.c|  2 +-
>  target/mips/cpu.h  |  2 +-
>  target/mips/helper.c   |  6 +++---
>  target/mips/op_helper.c|  8 
>  target/openrisc/interrupt.c|  2 +-
>  target/openrisc/interrupt_helper.c |  2 +-
>  target/openrisc/sys_helper.c   |  2 +-
>  target/ppc/helper_regs.h   |  4 ++--
>  target/ppc/misc_helper.c   |  4 ++--
>  target/ppc/mmu_helper.c| 32 
>  target/s390x/gdbstub.c |  2 +-
>  target/s390x/mem_helper.c  |  8 
>  target/sh4/helper.c|  2 +-
>  target/sparc/ldst_helper.c | 12 ++--
>  target/unicore32/cpu.c |  2 +-
>  target/unicore32/helper.c  |  2 +-
>  target/xtensa/op_helper.c  |  2 +-
>  29 files changed, 85 insertions(+), 96 deletions(-)
> 
> diff --git a/cputlb.c b/cputlb.c
> index 813279f3bc..6c39927455 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -60,24 +60,15 @@
>  /* statistics */
>  int tlb_flush_count;
>  
> -/* NOTE:
> - * If flush_global is true (the usual case), flush all tlb entries.
> - * If flush_global is false, flush (at least) all tlb entries not
> - * marked global.
> - *
> - * Since QEMU doesn't currently implement a global/not-global flag
> - * for tlb entries, at the moment tlb_flush() will also flush all
> - * tlb entries in the flush_global == false case. This is OK because
> - * CPU architectures generally permit an implementation to drop
> - * entries from the TLB at any time, so flushing more entries than
> - * required is only an efficiency issue, not a correctness issue.
> +/* This is OK because CPU architectures generally permit an
> + * implementation to drop entries from the TLB at any time, so
> + * flushing more entries than required is only an efficiency issue,
> + * not a correctness issue.
>   */
> -void tlb_flush(CPUState *cpu, int flush_global)
> +void tlb_flush(CPUState *cpu)
>  {
>  CPUArchState *env = cpu->env_ptr;
>  
> -tlb_debug("(%d)\n", flush_global);
> -
>  memset(env->tlb_table, -1, sizeof(env->tlb_table));
>  memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
>  memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
> @@ -144,7 +135,7 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
>TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
>env->tlb_flush_addr, env->tlb_flush_mask);
>  
> -tlb_flush(cpu, 1);
> +tlb_flush(cpu);
>  return;
>  }
>  
> diff --git a/exec.c b/exec.c
> index 47835c1dc1..401a9127c2 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -544,7 +544,7 @@ static int cpu_common_post_load(void *opaque, int 
> version_id)
>  /* 0x01 was CPU_INTERRUPT_EXIT. This line can be removed when the
> version_id is increased. */
>  cpu->interrupt_request &= ~0x01;
> -tlb_flush(cpu, 1);
> +tlb_flush(cpu);
>  
>  return 0;
>  }
> @@ -2426,7 +2426,7 @@ static void tcg_commit(MemoryListener *listener)
>   */
>  d = atomic_rcu_read(>as->dispatch);
>  atomic_rcu_set(>memory_dispatch, d);
> -tlb_flush(cpuas->cpu, 1);
> +tlb_flush(cpuas->cpu);
>  }
>  
>  void address_space_init_dispatch(AddressSpace *as)
> diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
> index 3132d559d7..166e4bd947 100644
> --- a/hw/sh4/sh7750.c
> +++ b/hw/sh4/sh7750.c
> @@ -417,7 +417,7 @@ static void sh7750_mem_writel(void *opaque, hwaddr addr,
>  case SH7750_PTEH_A7:
>  /* If asid changes, clear all registered tlb entries. */
>  if ((s->cpu->env.pteh & 0xff) != (mem_value & 0xff)) {
> -tlb_flush(CPU(s->cpu), 1);
> +tlb_flush(CPU(s->cpu));
>  }
>  s->cpu->env.pteh = mem_value;
>  return;
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index a8c13cee66..bbc9478a50 100644
> --- 

Re: [Qemu-devel] [PATCH v3 1/3] qom/cpu: move tlb_flush to cpu_common_reset

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 03:47:29PM +, Alex Bennée wrote:
> It is a common thing amongst the various cpu reset functions want to
> flush the SoftMMU's TLB entries. This is done either by calling
> tlb_flush directly or by way of a general memset of the CPU
> structure (sometimes both).
> 
> This moves the tlb_flush call to the common reset function and
> additionally ensures it is only done for the CONFIG_SOFTMMU case and
> when tcg is enabled.
> 
> In some target cases we add an empty end_of_reset_fields structure to the
> target vCPU structure so have a clear end point for any memset which
> is resetting value in the structure before CPU_COMMON (where the TLB
> structures are).
> 
> While this is a nice clean-up in general it is also a precursor for
> changes coming to cputlb for MTTCG where the clearing of entries
> can't be done arbitrarily across vCPUs. Currently the cpu_reset
> function is usually called from the context of another vCPU as the
> architectural power up sequence is run. By using the cputlb API
> functions we can ensure the right behaviour in the future.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 

For ppc:

Reviewed-by: David Gibson 

> 
> ---
> v3:
>   - split tcg_enabled() into a separate patch
> ---
>  qom/cpu.c   | 4 
>  target/arm/cpu.c| 5 ++---
>  target/arm/cpu.h| 5 -
>  target/cris/cpu.c   | 3 +--
>  target/cris/cpu.h   | 9 ++---
>  target/i386/cpu.c   | 2 --
>  target/i386/cpu.h   | 6 --
>  target/lm32/cpu.c   | 3 +--
>  target/lm32/cpu.h   | 3 +++
>  target/m68k/cpu.c   | 3 +--
>  target/m68k/cpu.h   | 3 +++
>  target/microblaze/cpu.c | 3 +--
>  target/microblaze/cpu.h | 3 +++
>  target/mips/cpu.c   | 3 +--
>  target/mips/cpu.h   | 3 +++
>  target/moxie/cpu.c  | 4 +---
>  target/moxie/cpu.h  | 3 +++
>  target/openrisc/cpu.c   | 9 +
>  target/openrisc/cpu.h   | 3 +++
>  target/ppc/translate_init.c | 3 ---
>  target/s390x/cpu.c  | 7 ++-
>  target/s390x/cpu.h  | 5 +++--
>  target/sh4/cpu.c| 3 +--
>  target/sh4/cpu.h| 3 +++
>  target/sparc/cpu.c  | 3 +--
>  target/sparc/cpu.h  | 3 +++
>  target/tilegx/cpu.c | 3 +--
>  target/tilegx/cpu.h | 3 +++
>  target/tricore/cpu.c| 2 --
>  29 files changed, 62 insertions(+), 50 deletions(-)
> 
> diff --git a/qom/cpu.c b/qom/cpu.c
> index 03d9190f8c..cc51de2a8c 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -273,6 +273,10 @@ static void cpu_common_reset(CPUState *cpu)
>  for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) {
>  atomic_set(>tb_jmp_cache[i], NULL);
>  }
> +
> +#ifdef CONFIG_SOFTMMU
> +tlb_flush(cpu, 0);
> +#endif
>  }
>  
>  static bool cpu_common_has_work(CPUState *cs)
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index f5cb30af6c..91046111d9 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -122,7 +122,8 @@ static void arm_cpu_reset(CPUState *s)
>  
>  acc->parent_reset(s);
>  
> -memset(env, 0, offsetof(CPUARMState, features));
> +memset(env, 0, offsetof(CPUARMState, end_reset_fields));
> +
>  g_hash_table_foreach(cpu->cp_regs, cp_reg_reset, cpu);
>  g_hash_table_foreach(cpu->cp_regs, cp_reg_check_reset, cpu);
>  
> @@ -226,8 +227,6 @@ static void arm_cpu_reset(CPUState *s)
>>vfp.fp_status);
>  set_float_detect_tininess(float_tininess_before_rounding,
>>vfp.standard_fp_status);
> -tlb_flush(s, 1);
> -
>  #ifndef CONFIG_USER_ONLY
>  if (kvm_enabled()) {
>  kvm_arm_reset_vcpu(cpu);
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index ab119e62ab..7bd16eec18 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -491,9 +491,12 @@ typedef struct CPUARMState {
>  struct CPUBreakpoint *cpu_breakpoint[16];
>  struct CPUWatchpoint *cpu_watchpoint[16];
>  
> +/* Fields up to this point are cleared by a CPU reset */
> +struct {} end_reset_fields;
> +
>  CPU_COMMON
>  
> -/* These fields after the common ones so they are preserved on reset.  */
> +/* Fields after CPU_COMMON are preserved across CPU reset. */
>  
>  /* Internal CPU feature flags.  */
>  uint64_t features;
> diff --git a/target/cris/cpu.c b/target/cris/cpu.c
> index 2e9ab9700e..5f766f09d6 100644
> --- a/target/cris/cpu.c
> +++ b/target/cris/cpu.c
> @@ -52,9 +52,8 @@ static void cris_cpu_reset(CPUState *s)
>  ccc->parent_reset(s);
>  
>  vr = env->pregs[PR_VR];
> -memset(env, 0, offsetof(CPUCRISState, load_info));
> +memset(env, 0, offsetof(CPUCRISState, end_reset_fields));
>  env->pregs[PR_VR] = vr;
> -tlb_flush(s, 1);
>  
>  #if defined(CONFIG_USER_ONLY)
>  /* start in user mode with interrupts enabled.  */
> diff 

Re: [Qemu-devel] [PATCH] ppc/prep: update MAINTAINERS file

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 09:47:29AM +0100, Hervé Poussineau wrote:
> Signed-off-by: Hervé Poussineau 

Merged to ppc-for-2.9.

> ---
>  MAINTAINERS | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1444b26..f4b02ab 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -657,10 +657,13 @@ F: hw/misc/macio/
>  F: hw/intc/heathrow_pic.c
>  
>  PReP
> +M: Hervé Poussineau 
>  L: qemu-devel@nongnu.org
>  L: qemu-...@nongnu.org
> -S: Odd Fixes
> +S: Maintained
>  F: hw/ppc/prep.c
> +F: hw/ppc/prep_systemio.c
> +F: hw/ppc/rs6000_mc.c
>  F: hw/pci-host/prep.[hc]
>  F: hw/isa/pc87312.[hc]
>  F: pc-bios/ppc_rom.bin

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PULL 00/67] ppc-for-2.9 queue 20170112

2017-01-12 Thread David Gibson
On Wed, Jan 11, 2017 at 07:42:04PM -0800, no-re...@patchew.org wrote:
> Hi,
> 
> Your series seems to have some coding style problems. See output below for
> more information:

I don't think this is a real problem:

> Checking PATCH 46/67: target-ppc: Add xxextractuw instruction...
> ERROR: Macros with complex values should be enclosed in parenthesis
> #110: FILE: target/ppc/translate/vsx-ops.inc.c:52:
> +#define GEN_XX2FORM_EXT(name, opc2, opc3, fl2)  \
> +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0x0010, PPC_NONE, 
> fl2), \
> +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0x0010, PPC_NONE, fl2)
> 
> total: 1 errors, 0 warnings, 92 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.

This is the standard problem with checkpatch getting confused by the
ugly, ugly macros in the instruction decode stuff.  Maybe we'll fix
that one day, but it's not a priority, and it's definitely out of
scope for this patch.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v6 0/2] POWER9 TCG enablements - BCD functions - final part

2017-01-12 Thread David Gibson

Applied to ppc-for-2.9.

On Thu, Jan 12, 2017 at 06:08:31PM -0200, Jose Ricardo Ziviani wrote:
> v6:
>  - improves bcdtrunc/bcdutrunc overflow comparison
>  - removes bcds/bcdus/bcdsr applied patches
> 
> v5:
>  - removes 'unlikely' gcc branch pred. hints from not unlikely places
>  - adds comments in host-utils functions
>  - adds more test cases for shift functions
>  - handles "shift backwards" with signed shifts
>  - rebases branch
> 
> v4:
>  - improves functions to behave exactly like the target
> 
> v3:
>  - moves shift functions to host-utils.c and added config_int128 guard
>  - changes Makefile to always compile host-utils.c
>  - redesigns bcd[u]trunc to use bitwise operations
>  - removes "target-ppc: Implement bcd_is_valid function" (merged)
> 
> v2:
>  - bcd[s,sr,us] uses 1 byte for shifting instead of 4 bytes
>  - left/right functions in host-utils are out of CONFIG_INT128
>  - fixes overflowing issue in left shift and added a testcase
> 
> This serie contains 5 new instructions for POWER9 ISA3.0, left/right shifts 
> for 
> unsigned quadwords and a small improvement to check whether a bcd value is 
> valid or not.
> 
> bcdtrunc.: Decimal signed trucate
> bcdutrunc.: Decimal unsigned truncate
> 
> Jose Ricardo Ziviani (2):
>   ppc: Implement bcdtrunc. instruction
>   ppc: Implement bcdutrunc. instruction
> 
>  target/ppc/helper.h |  2 +
>  target/ppc/int_helper.c | 88 
> +
>  target/ppc/translate/vmx-impl.inc.c |  9 
>  target/ppc/translate/vmx-ops.inc.c  |  6 +--
>  4 files changed, 102 insertions(+), 3 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] Proposal PCI/PCIe device placement on PAPR guests

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 11:31:35AM +0100, Andrea Bolognani wrote:
> On Mon, 2017-01-09 at 10:46 +1100, David Gibson wrote:
> > > >* To allow for hotplugged devices, libvirt should also add a number
> > > >  of additional, empty vPHBs (the PAPR spec allows for hotplug of
> > > >  PHBs, but this is not yet implemented in qemu).
> > > 
> > > "A number" here will have to mean "one", same number of
> > > empty PCIe Root Ports libvirt will add to a newly-defined
> > > q35 guest.
> > 
> > Umm.. why?
> 
> Because some applications using libvirt would inevitably
> start relying on the fact that such spare PHBs are
> available, locking us into providing at least the same
> number forever. In other words, increasing the amount at
> a later time is always possible, but decreasing it isn't.
> We did the same when we started automatically adding PCIe
> Root Ports to q35 machines.
> 
> The rationale is that having a single spare hotpluggable
> slot is extremely convenient for basic usage, eg. a simple
> guest created by someone who's not necessarily very
> familiar with virtualization; on the other hand, if you
> are actually deploying in production you ought to conduct
> proper capacity planning and figure out in advance how
> many devices you're likely to need to hotplug throughout
> the guest's life.

Hm, ok.  Well I guess the limitation is the same as on x86, so it
shouldn't surprise people.

> Of course this all will be moot once we can hotplug PHBs :)

Yes.  Unfortunately, nobody's actually working on that at present.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] Proposal PCI/PCIe device placement on PAPR guests

2017-01-12 Thread David Gibson
On Thu, Jan 12, 2017 at 12:53:28PM -0500, Laine Stump wrote:
> On 01/12/2017 11:35 AM, Michael Roth wrote:
> > Quoting Laine Stump (2017-01-12 08:52:10)
> > > On 01/12/2017 05:31 AM, Andrea Bolognani wrote:
> > > > On Mon, 2017-01-09 at 10:46 +1100, David Gibson wrote:
> > > > > > > * To allow for hotplugged devices, libvirt should also add a 
> > > > > > > number
> > > > > > >   of additional, empty vPHBs (the PAPR spec allows for 
> > > > > > > hotplug of
> > > > > > >   PHBs, but this is not yet implemented in qemu).
> > > > > > 
> > > > > > "A number" here will have to mean "one", same number of
> > > > > > empty PCIe Root Ports libvirt will add to a newly-defined
> > > > > > q35 guest.
> > > > > 
> > > > > Umm.. why?
> > > > 
> > > > Because some applications using libvirt would inevitably
> > > > start relying on the fact that such spare PHBs are
> > > > available, locking us into providing at least the same
> > > > number forever. In other words, increasing the amount at
> > > > a later time is always possible, but decreasing it isn't.
> > > > We did the same when we started automatically adding PCIe
> > > > Root Ports to q35 machines.
> > > > 
> > > > The rationale is that having a single spare hotpluggable
> > > > slot is extremely convenient for basic usage, eg. a simple
> > > > guest created by someone who's not necessarily very
> > > > familiar with virtualization; on the other hand, if you
> > > > are actually deploying in production you ought to conduct
> > > > proper capacity planning and figure out in advance how
> > > > many devices you're likely to need to hotplug throughout
> > > > the guest's life.
> > > 
> > > And of course the reason we don't want to add "too many" extra
> > > controllers by default is so that we don't end up with *all* guests
> > > burdened with extra hardware they don't need or want. The libguestfs
> > > appliance is one example of a libvirt consumer that definitely doesn't
> > > want extra baggage in its guests - guest startup time is very important
> > > to libguestfs, so any addition to the hardware list is looked upon with
> > > disappointment.
> > > 
> > > > 
> > > > Of course this all will be moot once we can hotplug PHBs :)
> > > 
> > > Will the guest OSes handle that properly? I remember being told that
> > 
> > I believe on pseries we *do* scan for devices on the PHB as part of
> > bringing the PHB online in the hotplug path. But I'm not sure that
> > matters (see below).
> > 
> > > Linux, for example, doesn't scan the new bus for devices when a new
> > > controller is added, making it pointless to hotplug a PCI controller (as
> > > usual, it could be that I'm remembering incorrectly...)
> > > 
> > 
> > Wouldn't that only be an issue if we hotplugged a PHB that already had
> > PCI devices on the bus?
> 
> 
> Yeah you're right, I'm probably remembering the wrong problem and wrong
> reason for the problem. I just remember there was *some* issue about
> hotplugging new PCI controllers. Possibly the internal representation of the
> bus hierarchy wasn't updated unless you forced a rescan of all the devices
> or something? My memory of it is vague, I just remember being told it wasn't
> just a case of the controller itself being initialized.
> 
> Alex or Marcel - since whatever it was I likely heard it from one of you (or
> imagined it in a dream), can you straighten me out?

Regardless, I'm pretty sure it won't be relevant for Power guests.
PHB hotplug has its own protocol in PAPR, and is used routinely for
Linux guests under PowerVM.

> 
> > That only seems possible if we had a way to
> > signal phb hotplug *after* we've hotplugged some PCI devices on the bus,
> > which means we'd need some interface to trigger hotplug  beyond the
> > standard 'device_add' calls, e.g.:
> > 
> >   device_add spapr-pci-host-bridge,hotplug-deferred=true,id=phb2,index=2
> >   device_add virtio-net-pci,bus=phb2.0,...,hotplug-deferred=true
> >   device_signal_hotplug phb2
> > 
> > That's actually akin to how it's normally done on pHyp (not only for PHB
> > hotplug, but for PCI hotplug in general, which is why this could be
> > reasonably expected to work on pseries guests), but it seems quite a bit
> > different from how we'd normally handle this on kvm, which I think would
> > be something more like:
> > 
> >   device_add spapr-pci-host-bridge,id=phb2,index=2
> >   
> >   device_add virtio-net-pci,bus=phb2.0,...
> > 
> > In which case it doesn't really matter if the guest scans the bus at
> > hotplug time or not. Is there some other scenario where this might
> > arise?
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 27/40] char: move QIOChannel-related in char-io.h

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:

Grammar in subject is a bit terse; maybe:

char: move QIOChannel-related stuff to char-io.h

> Signed-off-by: Marc-André Lureau 
> ---
>  chardev/char-io.h |  24 +++
>  chardev/char-io.c | 168 
>  chardev/char.c| 174 
> +-
>  chardev/Makefile.objs |   1 +
>  4 files changed, 194 insertions(+), 173 deletions(-)
>  create mode 100644 chardev/char-io.h
>  create mode 100644 chardev/char-io.c
> 
> diff --git a/chardev/char-io.h b/chardev/char-io.h
> new file mode 100644
> index 00..ea559fd124
> --- /dev/null
> +++ b/chardev/char-io.h
> @@ -0,0 +1,24 @@
> +#ifndef CHAR_IO_H
> +#define CHAR_IO_H

Must... resist... the temptation to repeat myself!

> +
> +#include "qemu/osdep.h"

.h files should NOT include osdep.h; since all .c files included it
before any other .h file, then all things in osdep.h are already in
scope at the point the .h file starts.


> +static gboolean io_watch_poll_prepare(GSource *source,
> +  gint *timeout_)
> +{

Why the weird spelling of timeout_ ?  Maybe timeout_unused is better?


> +++ b/chardev/char.c

> -static gboolean io_watch_poll_prepare(GSource *source,
> -  gint *timeout_)

Then again, it's code motion.  Again, up to you if you want to tweak
style things during code motion, or split them into a general cleanup
patch separately.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 26/40] char: remove unused READ_RETRIES

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> Curiously unused since its introduction in commit 7b0bfdf52d69.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  chardev/char.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 24/40] char: move ringbuf/memory to its own file

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> Signed-off-by: Marc-André Lureau 
> ---
>  chardev/char-ringbuf.c | 226 
> +
>  chardev/char.c | 218 ---
>  chardev/Makefile.objs  |   1 +
>  3 files changed, 227 insertions(+), 218 deletions(-)
>  create mode 100644 chardev/char-ringbuf.c
> 

Looks mechanical, same story about copyright blurb.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 25/40] char: rename and move to header CHR_READ_BUF_LEN

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> This define is used by several character devices, place it in char
> common header.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  include/sysemu/char.h |  1 +
>  chardev/char.c| 13 ++---
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 23/40] char: move mux to its own file

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> A mechanical move, except that qemu_chr_write_all() needs to be declared
> in char.h header to be used from chardev unit files.
> 
> Signed-off-by: Marc-André Lureau 
> ---

> +++ b/chardev/char-mux.h
> @@ -0,0 +1,40 @@
> +#ifndef CHAR_MUX_H
> +#define CHAR_MUX_H

Again, missing copyright blurbs. You'll need to respin to get that
right, but I'll quit pointing it out.

> +
> +#include "sysemu/char.h"
> +
> +extern bool muxes_realized;
> +
> +#define MAX_MUX 4
> +#define MUX_BUFFER_SIZE 32 /* Must be a power of 2.  */
> +#define MUX_BUFFER_MASK (MUX_BUFFER_SIZE - 1)
> +typedef struct MuxChardev {
> +Chardev parent;
> +CharBackend *backends[MAX_MUX];
> +CharBackend chr;
> +int focus;
> +int mux_cnt;
> +int term_got_escape;
> +int max_size;
> +/* Intermediate input buffer allows to catch escape sequences even if the

s/allows to catch/catches/

Dunno if cleanups like that should be mixed with code motion, or done
separately.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 22/40] char: move null chardev to its own file

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> Signed-off-by: Marc-André Lureau 
> ---
>  chardev/char-null.c   | 31 +++
>  chardev/char.c| 23 ---
>  chardev/Makefile.objs |  1 +
>  3 files changed, 32 insertions(+), 23 deletions(-)
>  create mode 100644 chardev/char-null.c
> 
> diff --git a/chardev/char-null.c b/chardev/char-null.c
> new file mode 100644
> index 00..18a49f5802
> --- /dev/null
> +++ b/chardev/char-null.c
> @@ -0,0 +1,31 @@
> +#include "qemu/osdep.h"
> +#include "sysemu/char.h"

No copyright blurb? You should copy the header of chardev/char.c (since
that's what you're splitting off of).

Otherwise looks okay.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 21/40] char: make null_chr_write() the default method

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> All chardev must implement chr_write(), but parallel and null chardev
> both use null_chr_write(). Move it to the base class, so we don't need
> to export the function when splitting the chardev in respective files.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  chardev/char.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH 2/2] virtio-mmio: switch to linux headers

2017-01-12 Thread Michael S. Tsirkin
Switch to virtio_mmio.h from Linux - will make it
easier to implement virtio 1.

Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-mmio.c | 95 +++--
 1 file changed, 37 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 60654dc..5807aa8 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -20,6 +20,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "standard-headers/linux/virtio_mmio.h"
 #include "hw/sysbus.h"
 #include "hw/virtio/virtio.h"
 #include "qemu/host-utils.h"
@@ -52,28 +53,6 @@ do { printf("virtio_mmio: " fmt , ## __VA_ARGS__); } while 
(0)
 #define VIRTIO_MMIO(obj) \
 OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
 
-/* Memory mapped register offsets */
-#define VIRTIO_MMIO_MAGIC 0x0
-#define VIRTIO_MMIO_VERSION 0x4
-#define VIRTIO_MMIO_DEVICEID 0x8
-#define VIRTIO_MMIO_VENDORID 0xc
-#define VIRTIO_MMIO_HOSTFEATURES 0x10
-#define VIRTIO_MMIO_HOSTFEATURESSEL 0x14
-#define VIRTIO_MMIO_GUESTFEATURES 0x20
-#define VIRTIO_MMIO_GUESTFEATURESSEL 0x24
-#define VIRTIO_MMIO_GUESTPAGESIZE 0x28
-#define VIRTIO_MMIO_QUEUESEL 0x30
-#define VIRTIO_MMIO_QUEUENUMMAX 0x34
-#define VIRTIO_MMIO_QUEUENUM 0x38
-#define VIRTIO_MMIO_QUEUEALIGN 0x3c
-#define VIRTIO_MMIO_QUEUEPFN 0x40
-#define VIRTIO_MMIO_QUEUENOTIFY 0x50
-#define VIRTIO_MMIO_INTERRUPTSTATUS 0x60
-#define VIRTIO_MMIO_INTERRUPTACK 0x64
-#define VIRTIO_MMIO_STATUS 0x70
-/* Device specific config space starts here */
-#define VIRTIO_MMIO_CONFIG 0x100
-
 #define VIRT_MAGIC 0x74726976 /* 'virt' */
 #define VIRT_VERSION 1
 #define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
@@ -104,10 +83,10 @@ static int virtio_mmio_ioeventfd_assign(DeviceState *d,
 VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
 if (assign) {
-memory_region_add_eventfd(>iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
+memory_region_add_eventfd(>iomem, VIRTIO_MMIO_QUEUE_NOTIFY, 4,
   true, n, notifier);
 } else {
-memory_region_del_eventfd(>iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
+memory_region_del_eventfd(>iomem, VIRTIO_MMIO_QUEUE_NOTIFY, 4,
   true, n, notifier);
 }
 return 0;
@@ -140,11 +119,11 @@ static uint64_t virtio_mmio_read(void *opaque, hwaddr 
offset, unsigned size)
  * device ID of zero means no backend will claim it.
  */
 switch (offset) {
-case VIRTIO_MMIO_MAGIC:
+case VIRTIO_MMIO_MAGIC_VALUE:
 return VIRT_MAGIC;
 case VIRTIO_MMIO_VERSION:
 return VIRT_VERSION;
-case VIRTIO_MMIO_VENDORID:
+case VIRTIO_MMIO_VENDOR_ID:
 return VIRT_VENDOR;
 default:
 return 0;
@@ -169,40 +148,40 @@ static uint64_t virtio_mmio_read(void *opaque, hwaddr 
offset, unsigned size)
 return 0;
 }
 switch (offset) {
-case VIRTIO_MMIO_MAGIC:
+case VIRTIO_MMIO_MAGIC_VALUE:
 return VIRT_MAGIC;
 case VIRTIO_MMIO_VERSION:
 return VIRT_VERSION;
-case VIRTIO_MMIO_DEVICEID:
+case VIRTIO_MMIO_DEVICE_ID:
 return vdev->device_id;
-case VIRTIO_MMIO_VENDORID:
+case VIRTIO_MMIO_VENDOR_ID:
 return VIRT_VENDOR;
-case VIRTIO_MMIO_HOSTFEATURES:
+case VIRTIO_MMIO_DEVICE_FEATURES:
 if (proxy->host_features_sel) {
 return 0;
 }
 return vdev->host_features;
-case VIRTIO_MMIO_QUEUENUMMAX:
+case VIRTIO_MMIO_QUEUE_NUM_MAX:
 if (!virtio_queue_get_num(vdev, vdev->queue_sel)) {
 return 0;
 }
 return VIRTQUEUE_MAX_SIZE;
-case VIRTIO_MMIO_QUEUEPFN:
+case VIRTIO_MMIO_QUEUE_PFN:
 return virtio_queue_get_addr(vdev, vdev->queue_sel)
 >> proxy->guest_page_shift;
-case VIRTIO_MMIO_INTERRUPTSTATUS:
+case VIRTIO_MMIO_INTERRUPT_STATUS:
 return atomic_read(>isr);
 case VIRTIO_MMIO_STATUS:
 return vdev->status;
-case VIRTIO_MMIO_HOSTFEATURESSEL:
-case VIRTIO_MMIO_GUESTFEATURES:
-case VIRTIO_MMIO_GUESTFEATURESSEL:
-case VIRTIO_MMIO_GUESTPAGESIZE:
-case VIRTIO_MMIO_QUEUESEL:
-case VIRTIO_MMIO_QUEUENUM:
-case VIRTIO_MMIO_QUEUEALIGN:
-case VIRTIO_MMIO_QUEUENOTIFY:
-case VIRTIO_MMIO_INTERRUPTACK:
+case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+case VIRTIO_MMIO_DRIVER_FEATURES:
+case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+case VIRTIO_MMIO_GUEST_PAGE_SIZE:
+case VIRTIO_MMIO_QUEUE_SEL:
+case VIRTIO_MMIO_QUEUE_NUM:
+case VIRTIO_MMIO_QUEUE_ALIGN:
+case VIRTIO_MMIO_QUEUE_NOTIFY:
+case VIRTIO_MMIO_INTERRUPT_ACK:
 DPRINTF("read of write-only register\n");
 return 0;
 default:
@@ -251,18 +230,18 @@ static void virtio_mmio_write(void *opaque, hwaddr 
offset, uint64_t value,
 return;
 }
 switch (offset) {
-case VIRTIO_MMIO_HOSTFEATURESSEL:
+case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
 

[Qemu-devel] [PATCH 1/2] virtio_mmio: add standard header file

2017-01-12 Thread Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin 
---
 include/standard-headers/linux/virtio_mmio.h | 141 +++
 1 file changed, 141 insertions(+)
 create mode 100644 include/standard-headers/linux/virtio_mmio.h

diff --git a/include/standard-headers/linux/virtio_mmio.h 
b/include/standard-headers/linux/virtio_mmio.h
new file mode 100644
index 000..c4b0968
--- /dev/null
+++ b/include/standard-headers/linux/virtio_mmio.h
@@ -0,0 +1,141 @@
+/*
+ * Virtio platform device driver
+ *
+ * Copyright 2011, ARM Ltd.
+ *
+ * Based on Virtio PCI driver by Anthony Liguori, copyright IBM Corp. 2007
+ *
+ * This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS 
IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifndef _LINUX_VIRTIO_MMIO_H
+#define _LINUX_VIRTIO_MMIO_H
+
+/*
+ * Control registers
+ */
+
+/* Magic value ("virt" string) - Read Only */
+#define VIRTIO_MMIO_MAGIC_VALUE0x000
+
+/* Virtio device version - Read Only */
+#define VIRTIO_MMIO_VERSION0x004
+
+/* Virtio device ID - Read Only */
+#define VIRTIO_MMIO_DEVICE_ID  0x008
+
+/* Virtio vendor ID - Read Only */
+#define VIRTIO_MMIO_VENDOR_ID  0x00c
+
+/* Bitmask of the features supported by the device (host)
+ * (32 bits per set) - Read Only */
+#define VIRTIO_MMIO_DEVICE_FEATURES0x010
+
+/* Device (host) features set selector - Write Only */
+#define VIRTIO_MMIO_DEVICE_FEATURES_SEL0x014
+
+/* Bitmask of features activated by the driver (guest)
+ * (32 bits per set) - Write Only */
+#define VIRTIO_MMIO_DRIVER_FEATURES0x020
+
+/* Activated features set selector - Write Only */
+#define VIRTIO_MMIO_DRIVER_FEATURES_SEL0x024
+
+
+#ifndef VIRTIO_MMIO_NO_LEGACY /* LEGACY DEVICES ONLY! */
+
+/* Guest's memory page size in bytes - Write Only */
+#define VIRTIO_MMIO_GUEST_PAGE_SIZE0x028
+
+#endif
+
+
+/* Queue selector - Write Only */
+#define VIRTIO_MMIO_QUEUE_SEL  0x030
+
+/* Maximum size of the currently selected queue - Read Only */
+#define VIRTIO_MMIO_QUEUE_NUM_MAX  0x034
+
+/* Queue size for the currently selected queue - Write Only */
+#define VIRTIO_MMIO_QUEUE_NUM  0x038
+
+
+#ifndef VIRTIO_MMIO_NO_LEGACY /* LEGACY DEVICES ONLY! */
+
+/* Used Ring alignment for the currently selected queue - Write Only */
+#define VIRTIO_MMIO_QUEUE_ALIGN0x03c
+
+/* Guest's PFN for the currently selected queue - Read Write */
+#define VIRTIO_MMIO_QUEUE_PFN  0x040
+
+#endif
+
+
+/* Ready bit for the currently selected queue - Read Write */
+#define VIRTIO_MMIO_QUEUE_READY0x044
+
+/* Queue notifier - Write Only */
+#define VIRTIO_MMIO_QUEUE_NOTIFY   0x050
+
+/* Interrupt status - Read Only */
+#define VIRTIO_MMIO_INTERRUPT_STATUS   0x060
+
+/* Interrupt acknowledge - Write Only */
+#define VIRTIO_MMIO_INTERRUPT_ACK  0x064
+
+/* Device status register - Read Write */
+#define VIRTIO_MMIO_STATUS 0x070
+
+/* Selected queue's Descriptor Table address, 64 bits in two halves */
+#define VIRTIO_MMIO_QUEUE_DESC_LOW 0x080
+#define VIRTIO_MMIO_QUEUE_DESC_HIGH0x084
+
+/* Selected queue's Available Ring address, 64 bits in two halves */
+#define VIRTIO_MMIO_QUEUE_AVAIL_LOW0x090
+#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH   0x094
+
+/* Selected queue's Used Ring address, 64 bits in two halves */
+#define VIRTIO_MMIO_QUEUE_USED_LOW 0x0a0
+#define VIRTIO_MMIO_QUEUE_USED_HIGH0x0a4
+
+/* Configuration atomicity value */

Re: [Qemu-devel] [PATCH 1/3] target/arm: A32, T32: Create Instruction Syndromes for Data Aborts

2017-01-12 Thread Peter Maydell
On 12 January 2017 at 20:41, Edgar E. Iglesias  wrote:
> On Tue, Jan 10, 2017 at 06:44:07PM +, Peter Maydell wrote:
>> Add support for generating the ISS (Instruction Specific Syndrome)
>> for Data Abort exceptions taken from AArch32. These syndromes are
>> used by hypervisors for example to trap and emulate memory accesses.
>>
>> This is the equivalent for AArch32 guests of the work done for AArch64
>> guests in commit aaa1f954d4cab243.
>
> Hi,
>
> I haven't checked the the details but I think the structure looks good.
> The patch is a large and has a few things that I think could be broken
> out into separate patches (or dropped).
>
> For example these kind of refactoring could be a separate patch:
>> +bool pbit = insn & (1 << 24);
> ...
>> -if (insn & (1 << 24))
>> +if (pbit) {

That one should be the only refactoring, but yeah I can split it out.

>
> There's also a few whitespace changes that could be broken out or dropped..

Oops; I'll get rid of those.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 20/40] char: create chardev-obj-y

2017-01-12 Thread Eric Blake
On 01/11/2017 11:29 AM, Marc-André Lureau wrote:
> This will help to split char.c in several units without having to
> reference them all everywhere. This is useful in particular for tests.
> 
> Signed-off-by: Marc-André Lureau 
> ---

> @@ -221,7 +222,7 @@ subdir-dtc:dtc/libfdt dtc/tests
>  dtc/%:
>   mkdir -p $@
>  
> -$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(qom-obj-y) 
> $(crypto-aes-obj-$(CONFIG_USER_ONLY))
> +$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) 
> $(chardev-obj-y) $(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))

Again, splitting long lines with \ would be nice, but not a necessity.

>  
>  ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
>  # Only keep -O and -g cflags
> diff --git a/Makefile.objs b/Makefile.objs
> index cad4c54740..7985d21841 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -51,7 +51,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
>  common-obj-$(CONFIG_LINUX) += fsdev/
>  
>  common-obj-y += migration/
> -common-obj-y += chardev/ #aio.o
> +common-obj-y += #aio.o

The fact that aio.o is commented is unrelated to your patch, but now
that this line is a comment only, can we delete it?

>  common-obj-y += page_cache.o

Or maybe merge it with this line, so it doesn't look quite as odd?

Nothing major, so
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 2/3] target/arm: Handle VIRQ and VFIQ in arm_cpu_do_interrupt_aarch32()

2017-01-12 Thread Edgar E. Iglesias
On Tue, Jan 10, 2017 at 06:44:08PM +, Peter Maydell wrote:
> To run a VM in 32-bit EL1 our AArch32 interrupt handling code
> needs to be able to cope with VIRQ and VFIQ exceptions.
> These behave like IRQ and FIQ except that we don't need to try
> to route them to Monitor mode.
> 
> Signed-off-by: Peter Maydell 

We could possibly avoid some duplication with EXCP_IRQ and _FIQ
but either way works:

Reviewed-by: Edgar E. Iglesias 


> ---
>  target/arm/helper.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 8dcabbf..dc90986 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -6403,6 +6403,20 @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
>  }
>  offset = 4;
>  break;
> +case EXCP_VIRQ:
> +new_mode = ARM_CPU_MODE_IRQ;
> +addr = 0x18;
> +/* Disable IRQ and imprecise data aborts.  */
> +mask = CPSR_A | CPSR_I;
> +offset = 4;
> +break;
> +case EXCP_VFIQ:
> +new_mode = ARM_CPU_MODE_FIQ;
> +addr = 0x1c;
> +/* Disable FIQ, IRQ and imprecise data aborts.  */
> +mask = CPSR_A | CPSR_I | CPSR_F;
> +offset = 4;
> +break;
>  case EXCP_SMC:
>  new_mode = ARM_CPU_MODE_MON;
>  addr = 0x08;
> -- 
> 2.7.4
> 



Re: [Qemu-devel] [PATCH 5/5] target-m68k: increment/decrement with SP

2017-01-12 Thread Laurent Vivier
Le 12/01/2017 à 22:14, Thomas Huth a écrit :
> On 12.01.2017 21:18, Laurent Vivier wrote:
>> Address Register indirect With postincrement:
>>
>> When using the stack pointer (A7) with byte size data, the register
>> is incremented by two.
>>
>> Address Register indirect With predecrement:
>>
>> When using the stack pointer (A7) with byte size data, the register
>> is decremented by two.
> 
> I think this is only valid for the full 680x0 CPUs. According to
> http://www.nxp.com/assets/documents/data/en/reference-manuals/CFPRM.pdf
> the stack pointer behaves differently on ColdFire:
> 
> "2.2.5 Address Register Indirect with Predecrement Mode [...]
> Note that the stack pointer (A7) is treated just like the other address
> registers."

Yes, you're right. This is true only for 680x0:

"MOTOROLA M68000 FAMILY Programmer’s Reference Manual"

"2.2.5 Address Register Indirect with Predecrement Mode
...
If the address register is thestack pointer and the operand size is
byte, the address is decremented by two to keep the stack pointer
aligned to a word boundary."

Thank you, I will update this patch accordingly.

Laurent



[Qemu-devel] [PATCH] virtio: drop an obsolete comment

2017-01-12 Thread Michael S. Tsirkin
virtio core has code to revert queue number
to maximum on reset. Drop TODO to add that.

Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-pci.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 8baaf2b..09230c0 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1316,7 +1316,6 @@ static void virtio_pci_common_write(void *opaque, hwaddr 
addr,
 virtio_queue_set_vector(vdev, vdev->queue_sel, val);
 break;
 case VIRTIO_PCI_COMMON_Q_ENABLE:
-/* TODO: need a way to put num back on reset. */
 virtio_queue_set_num(vdev, vdev->queue_sel,
  proxy->vqs[vdev->queue_sel].num);
 virtio_queue_set_rings(vdev, vdev->queue_sel,
-- 
MST



[Qemu-devel] [Bug 1653384] Re: Assertion failed with USB pass through with XHCI controller

2017-01-12 Thread Fabian Lesniak
This patch fixes passing through a keyboard for me. I tried a Logitech
K120 (046d:c31c).

After that, I tried my real-world use case being a standard USB sound
card (046d:0a4d). This does not crash the machine anymore, but linux
reports:

xhci_hcd :00:03.0: ERROR Transfer event TRB DMA ptr not part of
current TD ep_index 1 comp_code 1

multiple times when trying to use the sound card. I get no sound and my
media player freezes.

Before commit 94b037f2a451b3dc855f9f2c346e5049a361bd55 this sound card
worked without any errors.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1653384

Title:
  Assertion failed with USB pass through with XHCI controller

Status in QEMU:
  New

Bug description:
  Starting qemu 2.8.0 with XHCI controller and host device passed
  through results in an assertion failure:

  qemu-system-x86_64: hw/usb/core.c:623: usb_packet_cleanup: Assertion
  `!usb_packet_is_inflight(p)' failed.

  Can be reproduced with the following command (passing through a Lenovo
  keyboard):

  qemu-system-x86_64 -usb  -device nec-usb-xhci,id=usb -device usb-
  host,vendorid=0x04b3,productid=0x3025,id=hostdev0,bus=usb.0,port=1

  If nec-usb-xhci is changed to usb-ehci, qemu tries to boot without
  assertion failures.

  
  Can be reproduced with the latest master (commit dbe2b65) and v2.8.0.

  Bisected the issue to following commit:
  first bad commit: [94b037f2a451b3dc855f9f2c346e5049a361bd55] xhci: use linked 
list for transfers

  
  Backtrace from commit dbe2b65:

  #0  0x7f2eb4657227 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
  resultvar = 0
  pid = 3453
  selftid = 3453
  #1  0x7f2eb465867a in __GI_abort () at abort.c:89
  save_stage = 2
  act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4}, 
sa_mask = {__val = {140734740550528, 93876690035339, 
140734740550624, 48833659808, 0, 0, 0, 21474836480, 
140734740550792, 139838573009553, 140734740550560, 139838573043008, 
139838573024160, 9387665872, 139838702616576, 
139838573024160}}, sa_flags = 1528954938, 
sa_restorer = 0x55615b2202c0 <__PRETTY_FUNCTION__.38612>}
  sigs = {__val = {32, 0 }}
  #2  0x7f2eb46502cd in __assert_fail_base (fmt=0x7f2eb47893a0 "%s%s%s:%u: 
%s%sAssertion `%s' failed.\n%n", 
  assertion=assertion@entry=0x55615b22003a "!usb_packet_is_inflight(p)", 
file=file@entry=0x55615b21fdf0 "hw/usb/core.c", line=line@entry=619, 
  function=function@entry=0x55615b2202c0 <__PRETTY_FUNCTION__.38612> 
"usb_packet_cleanup") at assert.c:92
  str = 0x55615cfdf510 ""
  total = 4096
  #3  0x7f2eb4650382 in __GI___assert_fail (assertion=0x55615b22003a 
"!usb_packet_is_inflight(p)", file=0x55615b21fdf0 "hw/usb/core.c", 
  line=619, function=0x55615b2202c0 <__PRETTY_FUNCTION__.38612> 
"usb_packet_cleanup") at assert.c:101
  No locals.
  #4  0x55615afc385e in usb_packet_cleanup ()
  No symbol table info available.
  #5  0x55615afda555 in xhci_ep_free_xfer ()
  No symbol table info available.
  #6  0x55615afdc156 in xhci_kick_epctx ()
  No symbol table info available.
  #7  0x55615afda099 in xhci_ep_kick_timer ()
  No symbol table info available.
  #8  0x55615b08ceee in timerlist_run_timers ()
  No symbol table info available.
  #9  0x55615b08cf36 in qemu_clock_run_timers ()
  No symbol table info available.
  #10 0x55615b08d2df in qemu_clock_run_all_timers ()
  No symbol table info available.
  #11 0x55615b08be40 in main_loop_wait ()
  No symbol table info available.
  #12 0x55615ae3870f in main_loop ()
  No symbol table info available.
  #13 0x55615ae4027b in main ()

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1653384/+subscriptions



[Qemu-devel] [PATCH] virtio-ccw: fix ring sizing

2017-01-12 Thread Michael S. Tsirkin
Current code seems to assume ring size is
always decreased but this is not required by spec:
what spec says is just that size can not exceed
the maximum. Fix it up.

Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h | 1 +
 hw/s390x/virtio-ccw.c  | 2 +-
 hw/virtio/virtio.c | 5 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index e5541c6..6523bac 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -228,6 +228,7 @@ void virtio_queue_set_addr(VirtIODevice *vdev, int n, 
hwaddr addr);
 hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n);
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
+int virtio_queue_get_max_num(VirtIODevice *vdev, int n);
 int virtio_get_num_queues(VirtIODevice *vdev);
 void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc,
 hwaddr avail, hwaddr used);
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 0765068..63c4637 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -149,7 +149,7 @@ static int virtio_ccw_set_vqs(SubchDev *sch, VqInfoBlock 
*info,
 } else {
 if (info) {
 /* virtio-1 allows changing the ring size. */
-if (virtio_queue_get_num(vdev, index) < num) {
+if (virtio_queue_get_max_num(vdev, index) < num) {
 /* Fail if we exceed the maximum number. */
 return -EINVAL;
 }
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 34065c7..e52fd8e 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1256,6 +1256,11 @@ int virtio_queue_get_num(VirtIODevice *vdev, int n)
 return vdev->vq[n].vring.num;
 }
 
+int virtio_queue_get_max_num(VirtIODevice *vdev, int n)
+{
+return vdev->vq[n].vring.num_default;
+}
+
 int virtio_get_num_queues(VirtIODevice *vdev)
 {
 int i;
-- 
MST



Re: [Qemu-devel] [PATCH v6 3/7] trace: [tcg] Delay changes to dynamic state when translating

2017-01-12 Thread Paolo Bonzini


On 12/01/2017 20:37, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
>> On Tue, Jan 10, 2017 at 05:31:37PM +0100, Paolo Bonzini wrote:
>>> On 09/01/2017 18:01, Stefan Hajnoczi wrote:
 Or use a simpler scheme:

 struct CPUState {
 ...
 uint32_t dstate_update_count;
 };

 In trace_event_set_vcpu_state_dynamic():

 if (state) {
 trace_events_enabled_count++;
 set_bit(vcpu_id, vcpu->trace_dstate_delayed);
atomic_inc(>dstate_update_count, 1);
 (*ev->dstate)++;
 } ...

 In cpu_exec() and friends:

 last_dstate_update_count = atomic_read(>dstate_update_count);

 tb = tb_find(cpu, last_tb, tb_exit);
 cpu_loop_exec_tb(cpu, tb, _tb, _exit, );

 /* apply and disable delayed dstate changes */
 if (unlikely(atomic_read(>dstate_update_count) != 
 last_dstate_update_count)) {
 bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed,
 trace_get_vcpu_event_count());
 }

 (You'll need to adjust the details but the update counter approach
 should be workable.)
>>>
>>> Would it work to use async_run_on_cpu?
> 
>> I think so.
> 
> AFAIU we cannot use async_run_on_cpu(), since we need to reset the local
> variable "last_tb" to avoid chaining TBs with different dstates, and we cannot
> use cpu_loop_exit() inside the callback.

async_run_on_cpu would run as soon as the currently executing TB
finishes, and would leave cpu_exec completely, so there would be no
chaining.

Paolo

> To make it work, we'd need to add some new boolean flag on the vCPU to control
> when to reset "last_tb", and then we're just as good as implementing the async
> work "protocol" manually for this specific case.
> 
> What I'll do is fix the race condition by simplifying that code (haven't 
> looked
> at the problem yet).
> 
> 
> Thanks,
>   Lluis
> 



[Qemu-devel] [Bug 1617114] Re: Qemu 2.6.0 freezes with windows guests

2017-01-12 Thread Javier
Qemu 2.8.0 is no better.  Actually now win-10 can even boot, getting the
light blue window with sad face saying:  "Your PC ran into a problem and
needs to restart...".  Moreover, the qemu monitor mode (alt-2) pops up a
frozen useless window, so no way to try reseting...

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1617114

Title:
  Qemu 2.6.0 freezes with windows guests

Status in QEMU:
  New

Bug description:
  When launching qemu with the same command line as before 2.6.0, with
  SDL display, with virtio, for a win-10 guest:

  qemu-system-x86_64 -enable-kvm -name win-10 -machine type=pc,accel=kvm
  -cpu host -smp cores=1,threads=2,sockets=1 -m 2.7G -balloon virtio
  -drive
  file=/home//.qemu/imgs/win10-coe.qcow2,index=0,media=disk,if=virtio
  -drive file=/usr/share/virtio/virtio-win.iso,index=1,media=cdrom
  -drive file=/usr/share/spice-guest-tools/spice-guest-
  tools.iso,index=2,media=cdrom -net nic,model=virtio -net
  tap,ifname=tap0,script=no,downscript=no,vhost=on -usbdevice tablet
  -usb -display sdl -vga qxl -soundhw ac97 -rtc base=localtime
  -usbdevice host:0b0e:0032 -usbdevice host:0b0e:0348 -usbdevice
  host:0b0e:0410

  Qemu at some point just freezes with no error message at all with
  newer version 2.6.0-1.

  Reverting to prior version 2.5.1-1, things go back to normal.

  A simple way to accelerate the freeze is to have qemu launch in a
  workspace/desktop, and then move to a different workspace/desktop, and
  then move back to the qemu workspace/desktop, and you'll find out it's
  frozen.

  BTW, there's no way to get into qemu monitor mode terminal at all once
  frozen. The monitor terminal shows up, but does nothing...

  Perhaps it's useful to notice that I have up to date win-10 virtio
  drivers for ethernet, scsi/storage, qxl-dod, balloon, and serial
  interface drivers. The ISO version used is 0.1.118.1 (virtio-win AUR
  package).

  Using the standard (std) qemu video driver, rather than the qxl-dod
  one makes no difference BTW.

  Just in case, running on up to date x86-64 Arch, plain qemu command
  line.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1617114/+subscriptions



Re: [Qemu-devel] [PATCH 5/5] target-m68k: increment/decrement with SP

2017-01-12 Thread Thomas Huth
On 12.01.2017 21:18, Laurent Vivier wrote:
> Address Register indirect With postincrement:
> 
> When using the stack pointer (A7) with byte size data, the register
> is incremented by two.
> 
> Address Register indirect With predecrement:
> 
> When using the stack pointer (A7) with byte size data, the register
> is decremented by two.

I think this is only valid for the full 680x0 CPUs. According to
http://www.nxp.com/assets/documents/data/en/reference-manuals/CFPRM.pdf
the stack pointer behaves differently on ColdFire:

"2.2.5 Address Register Indirect with Predecrement Mode [...]
Note that the stack pointer (A7) is treated just like the other address
registers."

 Thomas




Re: [Qemu-devel] [PATCH] qemu-io: Return non-zero exit code on failure

2017-01-12 Thread Nir Soffer
On Wed, Jan 11, 2017 at 11:51 PM, Eric Blake  wrote:
> On 01/11/2017 12:24 PM, Nir Soffer wrote:
>> From: Nir Soffer 
>>
>> The result of openfile was not checked, leading to failure deep in the
>> actual command with confusing error message, and exiting with exit code 0.
>>
>> Here is one example - trying to read a pattern from an invalid chain:
>>
>> $ qemu-io -c 'read -P 1 0 1024' top.qcow2; echo $?
>
> As written, you have to guess some context about how top.qcow2 was
> created.  The example can be made a bit more reproducible with:
>
> $ : > file
> $ qemu-io -f qcow2 -c ... file

Right, I'll refine it in the next version.

>
>> can't open device top.qcow2: Could not open backing file: Image is not 
>> in qcow2 format
>> no file open, try 'help open'
>> 0
>>
>> With this patch, we fail earlier with exit code 1:
>>
>> $ ./qemu-io -c 'read -P 1 0 1024' top.qcow2; echo $?
>> can't open device top.qcow2: Could not open backing file: Image is not
>> in qcow2 format
>> 1
>>
>> Signed-off-by: Nir Soffer 
>> ---
>>  qemu-io.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> Whether or not the commit message is improved,
> Reviewed-by: Eric Blake 
>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>



Re: [Qemu-devel] [PATCH 3/3] target/arm: Implement DBGVCR32_EL2 system register

2017-01-12 Thread Edgar E. Iglesias
On Tue, Jan 10, 2017 at 06:44:09PM +, Peter Maydell wrote:
> The DBGVCR_EL2 system register is needed to run a 32-bit
> EL1 guest under a Linux EL2 64-bit hypervisor. Its only
> purpose is to provide AArch64 with access to the state of
> the DBGVCR AArch32 register. Since we only have a dummy
> DBGVCR, implement a corresponding dummy DBGVCR32_EL2.
> 
> Signed-off-by: Peter Maydell 

Reviewed-by: Edgar E. Iglesias 


> ---
>  target/arm/helper.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index dc90986..bda562d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -4066,6 +4066,13 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
>.cp = 14, .opc1 = 0, .crn = 0, .crm = 7, .opc2 = 0,
>.access = PL1_RW, .accessfn = access_tda,
>.type = ARM_CP_NOP },
> +/* Dummy DBGVCR32_EL2 (which is only for a 64-bit hypervisor
> + * to save and restore a 32-bit guest's DBGVCR)
> + */
> +{ .name = "DBGVCR32_EL2", .state = ARM_CP_STATE_AA64,
> +  .opc0 = 2, .opc1 = 4, .crn = 0, .crm = 7, .opc2 = 0,
> +  .access = PL2_RW, .accessfn = access_tda,
> +  .type = ARM_CP_NOP },
>  /* Dummy MDCCINT_EL1, since we don't implement the Debug Communications
>   * Channel but Linux may try to access this register. The 32-bit
>   * alias is DBGDCCINT.
> -- 
> 2.7.4
> 



Re: [Qemu-devel] [PATCH 0/3] target/arm: Support EL1 AArch32 guest under AArch64 EL2

2017-01-12 Thread Edgar E. Iglesias
On Tue, Jan 10, 2017 at 06:44:06PM +, Peter Maydell wrote:
> The GICv3 virt patchset is sufficient to run a 64-bit guest under
> a 64-bit host kernel. To run 32-bit guests under the 64-bit host
> you need a few more things:
>  * data aborts from AArch32 need to provide instruction syndrome info
>to the hypervisor
>  * the AArch32 interrupt code needs to handle VIRQ and VFIQ
>  * we need a DBGVCR32_EL2 register, because Linux's EL2 code uses
>it to context-switch AArch32 DBGVCR between guests
> 
> This patchset sits on top of the gicv3-virt patchset and is
> sufficient to run a Linux 32-bit guest under 64-bit Linux host.

Cool stuff Peter, I didn't think we were this close for this to work!

32bit hypervisors would be cool too but I'm guessing there's quite a bit
of work before that works...

Cheers,
Edgar

> 
> Git branch with the whole lot:
>  https://git.linaro.org/people/peter.maydell/qemu-arm.git aarch32-guest
> 
> 
> Peter Maydell (3):
>   target/arm: A32, T32: Create Instruction Syndromes for Data Aborts
>   target/arm: Handle VIRQ and VFIQ in arm_cpu_do_interrupt_aarch32()
>   target/arm: Implement DBGVCR32_EL2 system register
> 
>  target/arm/helper.c|  21 +
>  target/arm/translate.c | 213 
> -
>  2 files changed, 178 insertions(+), 56 deletions(-)
> 
> -- 
> 2.7.4
> 



Re: [Qemu-devel] [PATCH 1/3] target/arm: A32, T32: Create Instruction Syndromes for Data Aborts

2017-01-12 Thread Edgar E. Iglesias
On Tue, Jan 10, 2017 at 06:44:07PM +, Peter Maydell wrote:
> Add support for generating the ISS (Instruction Specific Syndrome)
> for Data Abort exceptions taken from AArch32. These syndromes are
> used by hypervisors for example to trap and emulate memory accesses.
> 
> This is the equivalent for AArch32 guests of the work done for AArch64
> guests in commit aaa1f954d4cab243.

Hi,

I haven't checked the the details but I think the structure looks good.
The patch is a large and has a few things that I think could be broken
out into separate patches (or dropped).

For example these kind of refactoring could be a separate patch:
> +bool pbit = insn & (1 << 24);
...
> -if (insn & (1 << 24))
> +if (pbit) {


There's also a few whitespace changes that could be broken out or dropped..

Cheers,
Edgar


> 
> Signed-off-by: Peter Maydell 
> ---
>  target/arm/translate.c | 213 
> -
>  1 file changed, 157 insertions(+), 56 deletions(-)
> 
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 0ad9070..1e1d78e 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -102,6 +102,63 @@ void arm_translate_init(void)
>  a64_translate_init();
>  }
>  
> +static void disas_set_insn_syndrome(DisasContext *s, uint32_t syn)
> +{
> +/* We don't need to save all of the syndrome so we mask and shift
> + * out uneeded bits to help the sleb128 encoder do a better job.
> + */
> +syn &= ARM_INSN_START_WORD2_MASK;
> +syn >>= ARM_INSN_START_WORD2_SHIFT;
> +
> +/* We check and clear insn_start_idx to catch multiple updates.  */
> +assert(s->insn_start_idx != 0);
> +tcg_set_insn_param(s->insn_start_idx, 2, syn);
> +s->insn_start_idx = 0;
> +}
> +
> +/* Flags for the disas_set_da_iss info argument:
> + * lower bits hold the Rt register number, higher bits are flags.
> + */
> +typedef enum ISSInfo {
> +ISSNone = 0,
> +ISSRegMask = 0x1f,
> +ISSInvalid = (1 << 5),
> +ISSIsAcqRel = (1 << 6),
> +ISSIsWrite = (1 << 7),
> +ISSIs16Bit = (1 << 8),
> +} ISSInfo;
> +
> +/* Save the syndrome information for a Data Abort */
> +static void disas_set_da_iss(DisasContext *s, TCGMemOp memop, ISSInfo 
> issinfo)
> +{
> +uint32_t syn;
> +int sas = memop & MO_SIZE;
> +bool sse = memop & MO_SIGN;
> +bool is_acqrel = issinfo & ISSIsAcqRel;
> +bool is_write = issinfo & ISSIsWrite;
> +bool is_16bit = issinfo & ISSIs16Bit;
> +int srt = issinfo & ISSRegMask;
> +
> +if (issinfo & ISSInvalid) {
> +/* Some callsites want to conditionally provide ISS info,
> + * eg "only if this was not a writeback"
> + */
> +return;
> +}
> +
> +if (srt == 15) {
> +/* For AArch32, insns where the src/dest is R15 never generate
> + * ISS information. Catching that here saves checking at all
> + * the call sites.
> + */
> +return;
> +}
> +
> +syn = syn_data_abort_with_iss(0, sas, sse, srt, 0, is_acqrel,
> +  0, 0, 0, is_write, 0, is_16bit);
> +disas_set_insn_syndrome(s, syn);
> +}
> +
>  static inline ARMMMUIdx get_a32_user_mem_index(DisasContext *s)
>  {
>  /* Return the mmu_idx to use for A32/T32 "unprivileged load/store"
> @@ -956,13 +1013,30 @@ static inline void gen_aa32_ld##SUFF(DisasContext *s, 
> TCGv_i32 val,  \
>   TCGv_i32 a32, int index)\
>  {\
>  gen_aa32_ld_i32(s, val, a32, index, OPC | s->be_data);   \
> +}\
> +static inline void gen_aa32_ld##SUFF##_iss(DisasContext *s,  \
> +   TCGv_i32 val, \
> +   TCGv_i32 a32, int index,  \
> +   ISSInfo issinfo)  \
> +{\
> +gen_aa32_ld_i32(s, val, a32, index, OPC | s->be_data);   \
> +disas_set_da_iss(s, OPC, issinfo);   \
>  }
>  
> +
>  #define DO_GEN_ST(SUFF, OPC) \
>  static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
>   TCGv_i32 a32, int index)\
>  {\
>  gen_aa32_st_i32(s, val, a32, index, OPC | s->be_data);   \
> +}\
> +static inline void gen_aa32_st##SUFF##_iss(DisasContext *s,  \
> +   TCGv_i32 val, \
> + 

Re: [Qemu-devel] [PATCH] qemu-io: Return non-zero exit code on failure

2017-01-12 Thread Nir Soffer
On Thu, Jan 12, 2017 at 5:01 AM, Fam Zheng  wrote:
> On Wed, 01/11 15:51, Eric Blake wrote:
>> On 01/11/2017 12:24 PM, Nir Soffer wrote:
>> > From: Nir Soffer 
>> >
>> > The result of openfile was not checked, leading to failure deep in the
>> > actual command with confusing error message, and exiting with exit code 0.
>> >
>> > Here is one example - trying to read a pattern from an invalid chain:
>> >
>> > $ qemu-io -c 'read -P 1 0 1024' top.qcow2; echo $?
>>
>> As written, you have to guess some context about how top.qcow2 was
>> created.  The example can be made a bit more reproducible with:
>>
>> $ : > file
>> $ qemu-io -f qcow2 -c ... file
>
> Nir, thank you for the fix. Could you also add a regression test in
> tests/qemu-iotests?

Sure, will look into it.

>
> Reviewed-by: Fam Zheng 
>
>>
>> > can't open device top.qcow2: Could not open backing file: Image is not 
>> > in qcow2 format
>> > no file open, try 'help open'
>> > 0
>> >
>> > With this patch, we fail earlier with exit code 1:
>> >
>> > $ ./qemu-io -c 'read -P 1 0 1024' top.qcow2; echo $?
>> > can't open device top.qcow2: Could not open backing file: Image is not
>> > in qcow2 format
>> > 1
>> >
>> > Signed-off-by: Nir Soffer 
>> > ---
>> >  qemu-io.c | 8 ++--
>> >  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> Whether or not the commit message is improved,
>> Reviewed-by: Eric Blake 



[Qemu-devel] [PATCH 4/5] target-m68k: CAS doesn't need aligned access

2017-01-12 Thread Laurent Vivier
Signed-off-by: Laurent Vivier 
---
 target/m68k/translate.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 23e2b06..cf5d8dd 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -1934,7 +1934,6 @@ DISAS_INSN(cas)
 default:
 g_assert_not_reached();
 }
-opc |= MO_ALIGN;
 
 ext = read_im16(env, s);
 
-- 
2.7.4




[Qemu-devel] [PATCH 2/5] target-m68k: fix gen_flush_flags()

2017-01-12 Thread Laurent Vivier
gen_flush_flags() is setting unconditionally cc_op_synced to 1
and s->cc_op to CC_OP_FLAGS, whereas env->cc_op can be set
to something else by a previous tcg fragment.

We fix that by not setting cc_op_synced to 1
(except for gen_helper_flush_flags() that updates env->cc_op)

FIX: https://github.com/vivier/qemu-m68k/issues/19

Signed-off-by: Laurent Vivier 
---
 target/m68k/translate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 410f56a..0e97900 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -595,18 +595,19 @@ static void gen_flush_flags(DisasContext *s)
 
 case CC_OP_DYNAMIC:
 gen_helper_flush_flags(cpu_env, QREG_CC_OP);
+s->cc_op_synced = 1;
 break;
 
 default:
 t0 = tcg_const_i32(s->cc_op);
 gen_helper_flush_flags(cpu_env, t0);
 tcg_temp_free(t0);
+s->cc_op_synced = 1;
 break;
 }
 
 /* Note that flush_flags also assigned to env->cc_op.  */
 s->cc_op = CC_OP_FLAGS;
-s->cc_op_synced = 1;
 }
 
 static inline TCGv gen_extend(TCGv val, int opsize, int sign)
-- 
2.7.4




  1   2   3   4   >