date:20191105

RE: [RFC v2 13/22] intel_iommu: add PASID cache management infrastructure

2019-11-05 Thread Liu, Yi L

> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Tuesday, November 5, 2019 4:07 AM
> To: Liu, Yi L 
> Subject: Re: [RFC v2 13/22] intel_iommu: add PASID cache management
> infrastructure
> 
> On Thu, Oct 24, 2019 at 08:34:34AM -0400, Liu Yi L wrote:
> > This patch adds a PASID cache management infrastructure based on
> > the newly added structure VTDPASIDAddressSpace, which is used to
> > track PASID usage and to support future PASID tagged DMA address
> > translation in vIOMMU.
> >
> > struct VTDPASIDAddressSpace {
> > VTDBus *vtd_bus;
> > uint8_t devfn;
> > AddressSpace as;
> > uint32_t pasid;
> > IntelIOMMUState *iommu_state;
> > VTDContextCacheEntry context_cache_entry;
> > QLIST_ENTRY(VTDPASIDAddressSpace) next;
> > VTDPASIDCacheEntry pasid_cache_entry;
> > };
> >
> > Ideally, a VTDPASIDAddressSpace instance is created when a PASID
> > is bound to a DMA AddressSpace. The Intel VT-d spec requires guest
> > software to issue a pasid cache invalidation when binding or
> > unbinding a pasid with an address space under caching-mode.
> > However, as VTDPASIDAddressSpace instances also act as the pasid
> > cache in this implementation, they are also created during vIOMMU
> > PASID tagged DMA translation. The creation in that path is not
> > added in this patch, since there are no PASID-capable emulated
> > devices for now.
> >
> > The implementation in this patch manages VTDPASIDAddressSpace
> > instances per PASID+BDF (lookup and insert use PASID and BDF),
> > since the Intel VT-d spec allows a per-BDF PASID Table. When a
> > guest binds a PASID with an AddressSpace, QEMU captures the
> > guest's pasid-selective pasid cache invalidation, and allocates
> > or removes a VTDPASIDAddressSpace instance according to the
> > invalidation reason:
> >
> > *) a present pasid entry moved to non-present
> > *) a present pasid entry to be a present entry
> > *) a non-present pasid entry moved to present
> >
> > The vIOMMU emulator can figure out the reason by fetching the
> > latest guest pasid entry.
> >
> > Cc: Kevin Tian 
> > Cc: Jacob Pan 
> > Cc: Peter Xu 
> > Cc: Yi Sun 
> > Signed-off-by: Liu Yi L 
> 
> Ok feel free to ignore my previous reply... I didn't notice it's
> actually the pasid entry cache layer rather than the whole pasid
> layer (including piotlb).  Comments below.

Yep. That is in another patch, and this patch set won't implement the
piotlb cache infrastructure, since there is no emulated SVA-capable
device so far.

> > ---
> >  hw/i386/intel_iommu.c  | 356
> +
> >  hw/i386/intel_iommu_internal.h |  10 ++
> >  hw/i386/trace-events   |   1 +
> >  include/hw/i386/intel_iommu.h  |  36 -
> >  4 files changed, 402 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 90b8f6c..d8827c9 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -40,6 +40,7 @@
> >  #include "kvm_i386.h"
> >  #include "migration/vmstate.h"
> >  #include "trace.h"
> > +#include "qemu/jhash.h"
> >
> >  /* context entry operations */
> >  #define VTD_CE_GET_RID2PASID(ce) \
> > @@ -65,6 +66,8 @@
> >  static void vtd_address_space_refresh_all(IntelIOMMUState *s);
> >  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
> >
> > +static void vtd_pasid_cache_reset(IntelIOMMUState *s);
> > +
> >  static void vtd_panic_require_caching_mode(void)
> >  {
> >  error_report("We need to set caching-mode=on for intel-iommu to enable 
> > "
> > @@ -276,6 +279,7 @@ static void vtd_reset_caches(IntelIOMMUState *s)
> >  vtd_iommu_lock(s);
> >  vtd_reset_iotlb_locked(s);
> >  vtd_reset_context_cache_locked(s);
> > +vtd_pasid_cache_reset(s);
> >  vtd_iommu_unlock(s);
> >  }
> >
> > @@ -686,6 +690,11 @@ static inline bool vtd_pe_type_check(X86IOMMUState
> *x86_iommu,
> >  return true;
> >  }
> >
> > +static inline uint16_t vtd_pe_get_domain_id(VTDPASIDEntry *pe)
> > +{
> > +return VTD_SM_PASID_ENTRY_DID((pe)->val[1]);
> > +}
> > +
> >  static inline bool vtd_pdire_present(VTDPASIDDirEntry *pdire)
> >  {
> >  return pdire->val & 1;
> > @@ -2389,19 +2398,361 @@ static bool
> vtd_process_iotlb_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
> >  return true;
> >  }
> >
> > +static inline struct pasid_key *vtd_get_pasid_key(uint32_t pasid,
> > +  uint16_t sid)
> > +{
> > +struct pasid_key *key = g_malloc0(sizeof(*key));
> 
> I think you can simply return the pasid_key directly maybe otherwise
> should be careful on mem leak.  Actually I think it's leaked below...

Sure, I can do it. The leak is a known issue, as the comment below
indicates. I'm not sure why it was left as it is. Perhaps the key
pointer is used by the hash table. Per my understanding, a hash table
should have its own field to store the key content. Do you have any idea?
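
A minimal sketch of the return-by-value variant suggested above. The
field layout of struct pasid_key is an assumption (the struct is not
shown in this excerpt), and make_pasid_key() and pasid_key_equal() are
hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; the real struct pasid_key is not shown here. */
struct pasid_key {
    uint32_t pasid;
    uint16_t sid;
};

/*
 * Build the key by value. The key lives in the caller's frame, so a
 * pure lookup has nothing to free and therefore nothing to leak.
 */
static inline struct pasid_key make_pasid_key(uint32_t pasid, uint16_t sid)
{
    struct pasid_key key = { .pasid = pasid, .sid = sid };

    return key;
}

/* Field-wise comparison, the kind of callback a hash table needs. */
static inline int pasid_key_equal(const struct pasid_key *a,
                                  const struct pasid_key *b)
{
    assert(a && b);
    return a->pasid == b->pasid && a->sid == b->sid;
}
```

A by-value key like this only covers lookups. If the table stores the
key pointer rather than copying the key content, an inserted key still
needs a heap copy that is freed when the entry is removed.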

if (!vtd_bus) {
uintptr_t *new_key = g_mal

Re: [PATCH v1 1/4] virtio: protect non-modern devices from too big virtqueue size setting

2019-11-05 Thread Denis Plotnikov


On 05.11.2019 23:56, Michael S. Tsirkin wrote:
> On Tue, Nov 05, 2019 at 07:11:02PM +0300, Denis Plotnikov wrote:
>> The patch protects from creating illegal virtio device configuration
>> via direct virtqueue size property setting.
>>
>> Signed-off-by: Denis Plotnikov 
>> ---
>>   hw/virtio/virtio-blk-pci.c  |  9 +
>>   hw/virtio/virtio-scsi-pci.c | 10 ++
>>   2 files changed, 19 insertions(+)
>>
>> diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c
>> index 60c9185c39..6177ff1df8 100644
>> --- a/hw/virtio/virtio-blk-pci.c
>> +++ b/hw/virtio/virtio-blk-pci.c
>> @@ -48,6 +48,15 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy 
>> *vpci_dev, Error **errp)
>>   {
>>   VirtIOBlkPCI *dev = VIRTIO_BLK_PCI(vpci_dev);
>>   DeviceState *vdev = DEVICE(&dev->vdev);
>> +bool modern = virtio_pci_modern(vpci_dev);
>> +uint32_t queue_size = dev->vdev.conf.queue_size;
>> +
>> +if (!modern && queue_size > 128) {
>> +error_setg(errp,
>> +   "too big queue size (%u, max: 128) "
>> +   "for non-modern virtio device", queue_size);
>> +return;
>> +}
>
> this enables for transitional so still visible to legacy
> interface. I am guessing you want to check whether
> device is accessed through the modern interface instead.

My goal is to avoid breaking anything when I set the queue size > 128
(taking into account the current seabios queue size restriction of 128).
I'm not quite sure what to check. Could I ask why one would want to check
whether the device is accessed through the modern interface, and how that
could be checked?

Thanks!

Denis

>>   if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
>>   vpci_dev->nvectors = dev->vdev.conf.num_queues + 1;
>> diff --git a/hw/virtio/virtio-scsi-pci.c b/hw/virtio/virtio-scsi-pci.c
>> index 2830849729..6e6790fda5 100644
>> --- a/hw/virtio/virtio-scsi-pci.c
>> +++ b/hw/virtio/virtio-scsi-pci.c
>> @@ -17,6 +17,7 @@
>>   
>>   #include "hw/virtio/virtio-scsi.h"
>>   #include "virtio-pci.h"
>> +#include "qapi/error.h"
>>   
>>   typedef struct VirtIOSCSIPCI VirtIOSCSIPCI;
>>   
>> @@ -47,6 +48,15 @@ static void virtio_scsi_pci_realize(VirtIOPCIProxy 
>> *vpci_dev, Error **errp)
>>   VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
>>   DeviceState *proxy = DEVICE(vpci_dev);
>>   char *bus_name;
>> +bool modern = virtio_pci_modern(vpci_dev);
>> +uint32_t virtqueue_size = vs->conf.virtqueue_size;
>> +
>> +if (!modern && virtqueue_size > 128) {
>> +error_setg(errp,
>> +   "too big virtqueue size (%u, max: 128) "
>> +   "for non-modern virtio device", virtqueue_size);
>> +return;
>> +}
> why? what is illegal about 256 for legacy?
>
>>   
>>   if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
>>   vpci_dev->nvectors = vs->conf.num_queues + 3;
>> -- 
>> 2.17.0

[PATCH] vhost-user: Refactor vhost_user_set_mem_table Functions

2019-11-05 Thread Raphael Norwitz

vhost_user_set_mem_table() and vhost_user_set_mem_table_postcopy()
have gotten convoluted, and have some identical code.

This change moves the logic populating the VhostUserMemory struct
and fds array from vhost_user_set_mem_table() and
vhost_user_set_mem_table_postcopy() to a new function,
vhost_user_fill_set_mem_table_msg().

No functionality is impacted.

Signed-off-by: Raphael Norwitz 
---
 hw/virtio/vhost-user.c | 140 +++--
 1 file changed, 65 insertions(+), 75 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 02a9b25..183587e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -405,31 +405,16 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table_postcopy(struct vhost_dev *dev,
- struct vhost_memory *mem)
+static int vhost_user_fill_set_mem_table_msg(struct vhost_user *u,
+ struct vhost_dev *dev,
+ struct VhostUserMsg *msg,
+ int *fds,
+ size_t *fd_num,
+ bool postcopy)
 {
-struct vhost_user *u = dev->opaque;
-int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
-size_t fd_num = 0;
-VhostUserMsg msg_reply;
-int region_i, msg_i;
 
-VhostUserMsg msg = {
-.hdr.request = VHOST_USER_SET_MEM_TABLE,
-.hdr.flags = VHOST_USER_VERSION,
-};
-
-if (u->region_rb_len < dev->mem->nregions) {
-u->region_rb = g_renew(RAMBlock*, u->region_rb, dev->mem->nregions);
-u->region_rb_offset = g_renew(ram_addr_t, u->region_rb_offset,
-  dev->mem->nregions);
-memset(&(u->region_rb[u->region_rb_len]), '\0',
-   sizeof(RAMBlock *) * (dev->mem->nregions - u->region_rb_len));
-memset(&(u->region_rb_offset[u->region_rb_len]), '\0',
-   sizeof(ram_addr_t) * (dev->mem->nregions - u->region_rb_len));
-u->region_rb_len = dev->mem->nregions;
-}
+msg->hdr.request = VHOST_USER_SET_MEM_TABLE;
 
 for (i = 0; i < dev->mem->nregions; ++i) {
 struct vhost_memory_region *reg = dev->mem->regions + i;
@@ -441,37 +426,75 @@ static int vhost_user_set_mem_table_postcopy(struct 
vhost_dev *dev,
  &offset);
 fd = memory_region_get_fd(mr);
 if (fd > 0) {
-trace_vhost_user_set_mem_table_withfd(fd_num, mr->name,
-  reg->memory_size,
-  reg->guest_phys_addr,
-  reg->userspace_addr, offset);
-u->region_rb_offset[i] = offset;
-u->region_rb[i] = mr->ram_block;
-msg.payload.memory.regions[fd_num].userspace_addr =
+if (postcopy) {
+trace_vhost_user_set_mem_table_withfd(*fd_num, mr->name,
+  reg->memory_size,
+  reg->guest_phys_addr,
+  reg->userspace_addr, 
offset);
+u->region_rb_offset[i] = offset;
+u->region_rb[i] = mr->ram_block;
+} else if (*fd_num == VHOST_MEMORY_MAX_NREGIONS) {
+error_report("Failed preparing vhost-user memory table msg");
+return -1;
+}
+msg->payload.memory.regions[*fd_num].userspace_addr =
 reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr =
+msg->payload.memory.regions[*fd_num].memory_size  = 
reg->memory_size;
+msg->payload.memory.regions[*fd_num].guest_phys_addr =
 reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-} else {
+msg->payload.memory.regions[*fd_num].mmap_offset = offset;
+assert(*fd_num < VHOST_MEMORY_MAX_NREGIONS);
+fds[(*fd_num)++] = fd;
+} else if (postcopy) {
 u->region_rb_offset[i] = 0;
 u->region_rb[i] = NULL;
 }
 }
 
-msg.payload.memory.nregions = fd_num;
+msg->payload.memory.nregions = *fd_num;
 
-if (!fd_num) {
+if (!*fd_num && postcopy) {
 error_report("Failed initializing vhost-user memory map, "
  "consider using -object memory-backend-file share=on");
 return -1;
 }
 
-msg.hdr.size = sizeof(msg.payload.memory.nregions);
-msg.hdr.size += sizeof(msg.paylo

[PATCH] [RFC] vhost-user: clean up set_mem_table functions

2019-11-05 Thread Raphael Norwitz

The functions sending vhost-user set memory table messages are getting
convoluted. The amount of nested logic is getting in the way of my
development, and it looks like some identical logic should be refactored
out anyway. Here's an RFC which cleans these functions up a bit.

Raphael

Raphael Norwitz (1):
  vhost-user: Refactor vhost_user_set_mem_table Functions

 hw/virtio/vhost-user.c | 140 +++--
 1 file changed, 65 insertions(+), 75 deletions(-)

-- 
1.8.3.1

[PULL 1/1] qemu-options: Rework the help text of the '-display' option

2019-11-05 Thread Gerd Hoffmann

From: Thomas Huth 

Improve the help text of the "-display" option:

- Only print the options that we have enabled in the binary
  (similar to what we do for other options like -netdev already)

- The "frame=on|off" from "-display sdl" has been removed in commit
  09bd7ba9f5f7 ("Remove deprecated -no-frame option"), so we should
  not show this in the help text anymore

- The "-display egl-headless" line was missing a "\n" at the end

- Indent the default display text in a nicer way

Signed-off-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20191023120129.13721-1-h...@tuxfamily.org
Signed-off-by: Gerd Hoffmann 
---
 qemu-options.hx | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 1fc2470e2fd4..637597d0d95e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1546,26 +1546,38 @@ STEXI
 ETEXI
 
 DEF("display", HAS_ARG, QEMU_OPTION_display,
+#if defined(CONFIG_SPICE)
 "-display spice-app[,gl=on|off]\n"
-"-display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n"
+#endif
+#if defined(CONFIG_SDL)
+"-display sdl[,alt_grab=on|off][,ctrl_grab=on|off]\n"
 "[,window_close=on|off][,gl=on|core|es|off]\n"
+#endif
+#if defined(CONFIG_GTK)
 "-display gtk[,grab_on_hover=on|off][,gl=on|off]|\n"
+#endif
+#if defined(CONFIG_VNC)
 "-display vnc=[,]\n"
+#endif
+#if defined(CONFIG_CURSES)
 "-display curses[,charset=]\n"
+#endif
+#if defined(CONFIG_OPENGL)
+"-display egl-headless[,rendernode=]\n"
+#endif
 "-display none\n"
-"-display egl-headless[,rendernode=]"
-"select display type\n"
-"The default display is equivalent to\n"
+"select display backend type\n"
+"The default display is equivalent to\n"
 #if defined(CONFIG_GTK)
-"\t\"-display gtk\"\n"
+"\"-display gtk\"\n"
 #elif defined(CONFIG_SDL)
-"\t\"-display sdl\"\n"
+"\"-display sdl\"\n"
 #elif defined(CONFIG_COCOA)
-"\t\"-display cocoa\"\n"
+"\"-display cocoa\"\n"
 #elif defined(CONFIG_VNC)
-"\t\"-vnc localhost:0,to=99,id=default\"\n"
+"\"-vnc localhost:0,to=99,id=default\"\n"
 #else
-"\t\"-display none\"\n"
+"\"-display none\"\n"
 #endif
 , QEMU_ARCH_ALL)
 STEXI
-- 
2.18.1

[PULL 0/1] Ui 20191106 patches

2019-11-05 Thread Gerd Hoffmann

The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:

  Merge remote-tracking branch 'remotes/palmer/tags/palmer-for-master-4.2-sf1' 
into staging (2019-11-02 17:59:03 +)

are available in the Git repository at:

  git://git.kraxel.org/qemu tags/ui-20191106-pull-request

for you to fetch changes up to 88b40c683fda6fa00639de01d4274e94bd4f1cdd:

  qemu-options: Rework the help text of the '-display' option (2019-11-05 
12:10:42 +0100)


ui: rework -display help text



Thomas Huth (1):
  qemu-options: Rework the help text of the '-display' option

 qemu-options.hx | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

-- 
2.18.1

Re: [PATCH v6 1/3] hw: rtc: Add Goldfish RTC device

2019-11-05 Thread Anup Patel

On Wed, Nov 6, 2019 at 4:54 AM Philippe Mathieu-Daudé  wrote:
>
> Hi Anup,
>
> On 11/3/19 8:55 AM, Anup Patel wrote:
> > This patch adds model for Google Goldfish virtual platform RTC device.
> >
> > We will be adding Goldfish RTC device to the QEMU RISC-V virt machine
> > for providing real date-time to Guest Linux. The corresponding Linux
> > driver for Goldfish RTC device is already available in upstream Linux.
> >
> > For now, VM migration support is available but untested for Goldfish RTC
> > device. It will be hardened in-future when we implement VM migration for
> > KVM RISC-V.
> >
> > Signed-off-by: Anup Patel 
> > Reviewed-by: Alistair Francis 
> > ---
> >   hw/rtc/Kconfig|   3 +
> >   hw/rtc/Makefile.objs  |   1 +
> >   hw/rtc/goldfish_rtc.c | 288 ++
> >   hw/rtc/trace-events   |   4 +
> >   include/hw/rtc/goldfish_rtc.h |  46 ++
>
> Correct path, thanks :)
>
> >   5 files changed, 342 insertions(+)
> >   create mode 100644 hw/rtc/goldfish_rtc.c
> >   create mode 100644 include/hw/rtc/goldfish_rtc.h
> >
> > diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig
> > index 45daa8d655..bafe6ac2c9 100644
> > --- a/hw/rtc/Kconfig
> > +++ b/hw/rtc/Kconfig
> > @@ -21,3 +21,6 @@ config MC146818RTC
> >
> >   config SUN4V_RTC
> >   bool
> > +
> > +config GOLDFISH_RTC
> > +bool
> > diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs
> > index 8dc9fcd3a9..aa208d0d10 100644
> > --- a/hw/rtc/Makefile.objs
> > +++ b/hw/rtc/Makefile.objs
> > @@ -11,3 +11,4 @@ common-obj-$(CONFIG_EXYNOS4) += exynos4210_rtc.o
> >   obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
> >   common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o
> >   common-obj-$(CONFIG_ASPEED_SOC) += aspeed_rtc.o
> > +common-obj-$(CONFIG_GOLDFISH_RTC) += goldfish_rtc.o
> > diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c
> > new file mode 100644
> > index 00..f71f6eaab0
> > --- /dev/null
> > +++ b/hw/rtc/goldfish_rtc.c
> > @@ -0,0 +1,288 @@
> > +/*
> > + * Goldfish virtual platform RTC
> > + *
> > + * Copyright (C) 2019 Western Digital Corporation or its affiliates.
> > + *
> > + * For more details on Google Goldfish virtual platform refer:
> > + * 
> > https://android.googlesource.com/platform/external/qemu/+/master/docs/GOLDFISH-VIRTUAL-HARDWARE.TXT
>
> I'd use a (fixed) release tag, and not the (unstable) master branch:
>
> https://android.googlesource.com/platform/external/qemu/+/refs/heads/emu-2.0-release/docs/GOLDFISH-VIRTUAL-HARDWARE.TXT

Sure, I will update the URL.

>
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2 or later, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
> > for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along 
> > with
> > + * this program.  If not, see .
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu-common.h"
> > +#include "hw/rtc/goldfish_rtc.h"
> > +#include "migration/vmstate.h"
> > +#include "hw/irq.h"
> > +#include "hw/qdev-properties.h"
> > +#include "hw/sysbus.h"
> > +#include "qemu/timer.h"
> > +#include "sysemu/sysemu.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/log.h"
> > +
> > +#include "trace.h"
> > +
> > +#define RTC_TIME_LOW        0x00
> > +#define RTC_TIME_HIGH       0x04
> > +#define RTC_ALARM_LOW       0x08
> > +#define RTC_ALARM_HIGH      0x0c
> > +#define RTC_IRQ_ENABLED     0x10
> > +#define RTC_CLEAR_ALARM     0x14
> > +#define RTC_ALARM_STATUS    0x18
> > +#define RTC_CLEAR_INTERRUPT 0x1c
> > +
> > +static void goldfish_rtc_update(GoldfishRTCState *s)
> > +{
> > +qemu_set_irq(s->irq, (s->irq_pending & s->irq_enabled) ? 1 : 0);
> > +}
> > +
> > +static void goldfish_rtc_interrupt(void *opaque)
> > +{
> > +GoldfishRTCState *s = (GoldfishRTCState *)opaque;
> > +
> > +s->alarm_running = 0;
> > +s->irq_pending = 1;
> > +goldfish_rtc_update(s);
> > +}
> > +
> > +static uint64_t goldfish_rtc_get_count(GoldfishRTCState *s)
> > +{
> > +return s->tick_offset + (uint64_t)qemu_clock_get_ns(rtc_clock);
> > +}
> > +
> > +static void goldfish_rtc_clear_alarm(GoldfishRTCState *s)
> > +{
> > +timer_del(s->timer);
> > +s->alarm_running = 0;
> > +}
> > +
> > +static void goldfish_rtc_set_alarm(GoldfishRTCState *s)
> > +{
> > +uint64_t ticks = goldfish_rtc_get_count(s);
> > +uint64_t event = s->alarm_next;
> > +
> > +if (event <= ticks) {
> > +goldfish_rtc_clear_alarm(s);
> > +goldfish_rtc_interrupt(s);
> > +} else {
> > +/*
> > + * We should be se

Re: guest / host buffer sharing ...

2019-11-05 Thread Gerd Hoffmann

> > (1) The virtio device
> > =
> > 
> > Has a single virtio queue, so the guest can send commands to register
> > and unregister buffers.  Buffers are allocated in guest ram.  Each
> > buffer
> > has a list of memory ranges for the data.  Each buffer also has some
> > properties to carry metadata, some fixed (id, size, application), but
> > also allow free form (name = value, framebuffers would have
> > width/height/stride/format for example).
> > 
> 
> Perfect, however since it's to be a generic device there also needs to be a
> method in the guest to identify which device is the one the application is
> interested in without opening the device.

This is what the application buffer property is supposed to handle, i.e.
you'll have a single device, all applications share it and the property
tells which buffer belongs to which application.

> The device should also support a reset feature allowing the guest to
> notify the host application that all buffers have become invalid such as
> on abnormal termination of the guest application that is using the device.

The guest driver should of course clean up properly (i.e. unregister all
buffers) when an application terminates, no matter what the reason is
(crash, exit without unregistering buffers, ...).  That is doable without
a full device reset.

Independently of that, a full reset will be supported of course; it is a
standard virtio feature.

> Conversely, qemu on unix socket disconnect should notify the guest of this
> event also, allowing each end to properly synchronize.

I was thinking more about a simple guest-side publishing of buffers,
without a backchannel.  If more coordination is needed you can use
vsocks for that for example.

> > (3) The qemu host implementation
> > 
> > 
> > qemu (likewise other vmms) can use the udmabuf driver to create
> > host-side dma-bufs for the buffers.  The dma-bufs can be passed to
> > anyone interested, inside and outside qemu.  We'll need some protocol
> > for communication between qemu and external users interested in those
> > buffers, to receive dma-bufs (via unix file descriptor passing) and
> > update notifications.

Using vhost for the host-side implementation should be possible too.

> > Dispatching updates could be done based on the
> > application property, which could be "virtio-vdec" or "wayland-proxy"
> > for example.
> 
> I don't know enough about udmabuf to really comment on this except to ask
> a question. Would this make guest to guest transfers without an
> intermediate buffer possible?

Yes.

cheers,
  Gerd
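
To make the buffer description in (1) concrete, here is a rough sketch
of what a registered buffer could carry: a list of guest memory ranges,
the fixed properties (id, size, application) and free-form name = value
pairs. All struct and function names are hypothetical and this is not a
proposed wire format, just an illustration of dispatching on the
application property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not a proposed wire format. */
struct buf_range {
    uint64_t guest_addr;   /* start of the range in guest ram */
    uint64_t len;          /* length in bytes */
};

struct buf_prop {
    const char *name;      /* free-form key, e.g. "width" */
    const char *value;     /* free-form value, e.g. "1920" */
};

struct shared_buf {
    uint32_t id;                     /* fixed property */
    uint64_t size;                   /* fixed property */
    const char *application;         /* fixed property, used for dispatch */
    const struct buf_range *ranges;  /* memory ranges holding the data */
    size_t nranges;
    const struct buf_prop *props;    /* free-form metadata */
    size_t nprops;
};

/* Sum of all range lengths, to compare against the 'size' property. */
static uint64_t shared_buf_range_bytes(const struct shared_buf *b)
{
    uint64_t total = 0;

    assert(b);
    for (size_t i = 0; i < b->nranges; i++) {
        total += b->ranges[i].len;
    }
    return total;
}

/* Look up a free-form property; returns NULL if it is not set. */
static const char *shared_buf_prop(const struct shared_buf *b,
                                   const char *name)
{
    for (size_t i = 0; i < b->nprops; i++) {
        if (strcmp(b->props[i].name, name) == 0) {
            return b->props[i].value;
        }
    }
    return NULL;
}

/* Dispatch test: does this buffer belong to the given application? */
static bool shared_buf_is_for(const struct shared_buf *b, const char *app)
{
    return strcmp(b->application, app) == 0;
}
```

A host-side consumer interested only in "wayland-proxy" buffers would
call shared_buf_is_for() on each update and ignore everything else.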

Re: git-publish, --pull-request and --signoff (was: Re: [PULL 0/9] Ide patches)

2019-11-05 Thread Stefan Hajnoczi

On Tue, Nov 5, 2019 at 9:22 PM Eduardo Habkost  wrote:
> On Tue, Nov 05, 2019 at 09:17:42PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Oct 31, 2019 at 5:07 PM John Snow  wrote:
> > > On 10/31/19 11:02 AM, Peter Maydell wrote:
> > > > On Thu, 31 Oct 2019 at 10:59, John Snow  wrote:
> > > >>
> > > >> The following changes since commit 
> > > >> 68d8ef4ec540682c3538d4963e836e43a211dd17:
> > > >>
> > > >>   Merge remote-tracking branch 
> > > >> 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging 
> > > >> (2019-10-30 14:10:32 +)
> > > >>
> > > >> are available in the Git repository at:
> > > >>
> > > >>   https://github.com/jnsnow/qemu.git tags/ide-pull-request
> > > >>
> > > >> for you to fetch changes up to 
> > > >> c35564caf20e8d3431786dddf0fa513daa7d7f3c:
> > > >>
> > > >>   hd-geo-test: Add tests for lchs override (2019-10-31 06:11:34 -0400)
> > > >>
> > > >> 
> > > >> Pull request
> > > >>
> > > >
> > > > Hi -- this passed the merge tests but it looks like you forgot
> > > > to add your signed-off by line as the submaintainer to Sam's
> > > > patches. Could you fix that up and resend, please?
> > > >
> > > > thanks
> > > > -- PMM
> > > >
> > >
> > > I bit myself twice with this now: adding --signoff to a pull request
> > > signs the messages that get sent to list, but not the ones that get 
> > > staged.
> > >
> > > Could always be a bug in my local copy, but I'm documenting it on the
> > > list, in case I don't get time to look at this in the next 24h.
> >
> > Do you mean Signed-off-by is only added to emails that are sent and
> > not to the actual commits in your repo?
> >
> > This is how git-format-patch(1) --signoff works.  git-publish does not
> > modify local commits either.
> >
> > Some people would probably be surprised if git-publish modified their
> > commit history.
> >
> > I'm not sure what the best solution here is, aside from introducing a
> > separate signoff option called --apply-signoff or similar so there is
> > no confusion.
>
> I would make git-publish error out if --signoff and
> --pull-request are used simultaneously.  I can't think of a
> justification for having the email contents not match the git tag
> contents in a pull request.

Sounds good!

Stefan

Re: [PATCH 2/3] dp8393x: fix dp8393x_receive()

2019-11-05 Thread Hervé Poussineau


On 05/11/2019 at 22:53, Laurent Vivier wrote:

On 05/11/2019 at 22:06, Hervé Poussineau wrote:

On 02/11/2019 at 18:15, Laurent Vivier wrote:

address_space_rw() access size must be multiplied by the width.

This fixes DHCP for Q800 guest.

Signed-off-by: Laurent Vivier 
---
   hw/net/dp8393x.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index 85d3f3788e..b8c4473f99 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -833,7 +833,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
   } else {
   dp8393x_put(s, width, 0, 0); /* in_use */
   address_space_rw(&s->as, dp8393x_crda(s) + sizeof(uint16_t)
* 6 * width,
-    MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data,
sizeof(uint16_t), 1);
+    MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data, size, 1);
   s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
   s->regs[SONIC_ISR] |= SONIC_ISR_PKTRX;
   s->regs[SONIC_RSC] = (s->regs[SONIC_RSC] & 0xff00) |
(((s->regs[SONIC_RSC] & 0x00ff) + 1) & 0x00ff);



This patch is problematic.
The code was initially created with "size".
It was changed in 409b52bfe199d8106dadf7c5ff3d88d2228e89b5 to fix
networking in NetBSD 5.1.

To test with NetBSD 5.1
- boot the installer (arccd-5.1.iso)
- choose (S)hell option
- "ifconfig sn0 10.0.2.15 netmask 255.255.255.0"
- "route add default 10.0.2.2"
- networking should work (I test with "ftp 212.27.63.3")


I've the firmware from
http://hpoussineau.free.fr/qemu/firmware/magnum-4000/setup.zip
Which file to use? NTPROM.RAW?


Without this patch, I get the FTP banner.
With this patch, connection can't be established.

In datasheet page 17, you can see the "Receive Descriptor Format", which
contains the in_use field.
It is clearly said that RXpkt.in_use is 16 bit wide, and that the bits
16-31 are not used in 32-bit mode.

So, I don't see why you need to clear 32 bits in 32-bit mode. Maybe you
need to clear only the other 16 bits? Maybe it depends on endianness?


Thank you for the details. I think the problem most likely comes from
the endianness.

The offset must be adjusted according to the access mode (endianness and
size).

The following patch fixes the problem for me, and should not break other
targets:

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index 85d3f3788e..3d991af163 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -831,9 +831,15 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
  /* EOL detected */
  s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
  } else {
-dp8393x_put(s, width, 0, 0); /* in_use */
-address_space_rw(&s->as, dp8393x_crda(s) + sizeof(uint16_t) * 6
* width,
-MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data,
sizeof(uint16_t), 1);
+/* Clear in_use, but it is always 16bit wide */
+int offset = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
+if (s->big_endian && width == 2) {
+/* we need to adjust the offset of the 16bit field */
+offset += sizeof(uint16_t);
+}
+s->data[0] = 0;
+address_space_rw(&s->as, offset, MEMTXATTRS_UNSPECIFIED,
+ (uint8_t *)s->data, sizeof(uint16_t), 1);
  s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
  s->regs[SONIC_ISR] |= SONIC_ISR_PKTRX;
  s->regs[SONIC_RSC] = (s->regs[SONIC_RSC] & 0xff00) |
(((s->regs[SONIC_RSC] & 0x00ff) + 1) & 0x00ff);

What is your opinion?


This one works for NetBSD.
Tested-by: Hervé Poussineau
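
The offset adjustment from the last diff can be isolated into a small
helper, which makes the four mode combinations easy to check. This is
only a sketch: in_use_offset() is a hypothetical name, crda stands for
the value returned by dp8393x_crda(), and width is assumed to be 1 in
16-bit mode and 2 in 32-bit mode, as in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Address of the 16-bit in_use field of a receive descriptor.
 * in_use is field number 6; each field occupies 16 * width bits.
 */
static uint32_t in_use_offset(uint32_t crda, int width, bool big_endian)
{
    uint32_t offset;

    assert(width == 1 || width == 2);
    offset = crda + (uint32_t)sizeof(uint16_t) * 6 * width;
    if (big_endian && width == 2) {
        /* In a big-endian 32-bit slot the used 16 bits come second. */
        offset += sizeof(uint16_t);
    }
    return offset;
}
```

With crda = 0x1000 this gives 0x100c in 16-bit mode regardless of
endianness, 0x1018 for little-endian 32-bit mode and 0x101a for
big-endian 32-bit mode.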

RE: [RFC v2 11/22] intel_iommu: process pasid cache invalidation

2019-11-05 Thread Liu, Yi L

> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Sunday, November 3, 2019 12:06 AM
> To: Liu, Yi L 
> Subject: Re: [RFC v2 11/22] intel_iommu: process pasid cache invalidation
> 
> On Thu, Oct 24, 2019 at 08:34:32AM -0400, Liu Yi L wrote:
> > This patch adds PASID cache invalidation handling. When guest enabled
> > PASID usages (e.g. SVA), guest software should issue a proper PASID
> > cache invalidation when caching-mode is exposed. This patch only adds
> > the draft handling of pasid cache invalidation. Detailed handling will
> > be added in subsequent patches.
> >
> > Cc: Kevin Tian 
> > Cc: Jacob Pan 
> > Cc: Peter Xu 
> > Cc: Yi Sun 
> > Signed-off-by: Liu Yi L 
> > ---
> >  hw/i386/intel_iommu.c  | 66 
> > ++--
> --
> >  hw/i386/intel_iommu_internal.h | 12 
> >  hw/i386/trace-events   |  3 ++
> >  3 files changed, 76 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 88b843f..84ff6f0 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -2335,6 +2335,63 @@ static bool vtd_process_iotlb_desc(IntelIOMMUState
> *s, VTDInvDesc *inv_desc)
> >  return true;
> >  }
> >
> > +static int vtd_pasid_cache_dsi(IntelIOMMUState *s, uint16_t domain_id)
> > +{
> > +return 0;
> > +}
> > +
> > +static int vtd_pasid_cache_psi(IntelIOMMUState *s,
> > +   uint16_t domain_id, uint32_t pasid)
> > +{
> > +return 0;
> > +}
> > +
> > +static int vtd_pasid_cache_gsi(IntelIOMMUState *s)
> > +{
> > +return 0;
> > +}
> > +
> > +static bool vtd_process_pasid_desc(IntelIOMMUState *s,
> > +   VTDInvDesc *inv_desc)
> > +{
> > +uint16_t domain_id;
> > +uint32_t pasid;
> > +int ret = 0;
> > +
> > +if ((inv_desc->val[0] & VTD_INV_DESC_PASIDC_RSVD_VAL0) ||
> > +(inv_desc->val[1] & VTD_INV_DESC_PASIDC_RSVD_VAL1) ||
> > +(inv_desc->val[2] & VTD_INV_DESC_PASIDC_RSVD_VAL2) ||
> > +(inv_desc->val[3] & VTD_INV_DESC_PASIDC_RSVD_VAL3)) {
> > +error_report_once("non-zero-field-in-pc_inv_desc hi: 0x%" PRIx64
> > +  " lo: 0x%" PRIx64, inv_desc->val[1], inv_desc->val[0]);
> > +return false;
> > +}
> > +
> > +domain_id = VTD_INV_DESC_PASIDC_DID(inv_desc->val[0]);
> > +pasid = VTD_INV_DESC_PASIDC_PASID(inv_desc->val[0]);
> > +
> > +switch (inv_desc->val[0] & VTD_INV_DESC_PASIDC_G) {
> > +case VTD_INV_DESC_PASIDC_DSI:
> > +ret = vtd_pasid_cache_dsi(s, domain_id);
> > +break;
> > +
> > +case VTD_INV_DESC_PASIDC_PASID_SI:
> > +ret = vtd_pasid_cache_psi(s, domain_id, pasid);
> > +break;
> > +
> > +case VTD_INV_DESC_PASIDC_GLOBAL:
> > +ret = vtd_pasid_cache_gsi(s);
> > +break;
> > +
> > +default:
> > +error_report_once("invalid-inv-granu-in-pc_inv_desc hi: 0x%" PRIx64
> > +  " lo: 0x%" PRIx64, inv_desc->val[1], inv_desc->val[0]);
> > +return false;
> > +}
> > +
> > +return (ret == 0) ? true : false;
> > +}
> > +
> >  static bool vtd_process_inv_iec_desc(IntelIOMMUState *s,
> >   VTDInvDesc *inv_desc)
> >  {
> > @@ -2441,12 +2498,11 @@ static bool vtd_process_inv_desc(IntelIOMMUState
> *s)
> >  }
> >  break;
> >
> > -/*
> > - * TODO: the entity of below two cases will be implemented in future 
> > series.
> > - * To make guest (which integrates scalable mode support patch set in
> > - * iommu driver) work, just return true is enough so far.
> > - */
> >  case VTD_INV_DESC_PC:
> > +trace_vtd_inv_desc("pasid-cache", inv_desc.val[1], 
> > inv_desc.val[0]);
> 
> Could be helpful if you dump [2|3] together here...

Sure. Let me add it in the next version.

> > +if (!vtd_process_pasid_desc(s, &inv_desc)) {
> > +return false;
> > +}
> >  break;
> >
> >  case VTD_INV_DESC_PIOTLB:
> > diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> > index 8668771..c6cb28b 100644
> > --- a/hw/i386/intel_iommu_internal.h
> > +++ b/hw/i386/intel_iommu_internal.h
> > @@ -445,6 +445,18 @@ typedef union VTDInvDesc VTDInvDesc;
> >  #define VTD_SPTE_LPAGE_L4_RSVD_MASK(aw) \
> >  (0x880ULL | ~(VTD_HAW_MASK(aw) | VTD_SL_IGN_COM))
> >
> > +#define VTD_INV_DESC_PASIDC_G  (3ULL << 4)
> > +#define VTD_INV_DESC_PASIDC_PASID(val) (((val) >> 32) & 0xfULL)
> > +#define VTD_INV_DESC_PASIDC_DID(val)   (((val) >> 16) & VTD_DOMAIN_ID_MASK)
> > +#define VTD_INV_DESC_PASIDC_RSVD_VAL0  0xfff0ffc0ULL
> 
> Nit: Mind adding a comment here that bits 9-11 are marked as zero rather than
> reserved?  This seems to work, but if bits 9-11 can be non-zero in some
> other descriptors then it would be clearer to define it as
> 0xfff0f1c0ULL and then explicitly check bits 9-11.
> 
> Otherwise looks good to me.

You are right. This is not reserve

RE: [PATCH v1 Resend] target/i386: set the CPUID level to 0x14 on old machine-type

2019-11-05 Thread Kang, Luwei

> > The CPUID level needs to be set to 0x14 manually on old machine types if
> > Intel PT is enabled in the guest. E.g. in QEMU 3.1, -machine pc-i440fx-3.1
> > -cpu qemu64,+intel-pt will give CPUID[0].EAX(level)=7 and
> > CPUID[7].EBX[25](intel-pt)=1.
> >
> > Some Intel PT capabilities are exposed by leaf 0x14, and the missing
> > capabilities will cause some MSR accesses to fail.
> > This patch adds a warning message to inform the user to extend the
> > CPUID level.
> 
> Note that a warning is not an acceptable fix for a QEMU crash.
> We still need to fix the QEMU crash reported at:
> https://lore.kernel.org/qemu-devel/20191024141536.gu6...@habkost.net/
> 
> 
> >
> > Suggested-by: Eduardo Habkost 
> > Signed-off-by: Luwei Kang 
> 
> The subject line says "v1", but this patch is different from the
> v1 you sent earlier.
> 
> If you are sending a different patch, please indicate it is a new version.
> Please also indicate what changed between different patch versions, to help review.

Got it. I fixed a code style problem in the resent patch (removed the '\n').

ERROR: Error messages should not contain newlines
#36: FILE: target/i386/cpu.c:5448:
+"by \"-cpu ...,+intel-pt,level=0x14\"\n");
total: 1 errors, 0 warnings, 14 lines checked

> 
> > ---
> >  target/i386/cpu.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c index
> > a624163..f67c479 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5440,8 +5440,12 @@ static void x86_cpu_expand_features(X86CPU
> > *cpu, Error **errp)
> >
> >  /* Intel Processor Trace requires CPUID[0x14] */
> >  if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
> > - kvm_enabled() && cpu->intel_pt_auto_level) {
> 
> Not directly related to the warning: do you know why we have a
> kvm_enabled() check here?  It seems unnecessary.  We want CPUID level to be
> correct for all accelerators.

Enabling Intel PT virtualization in a KVM guest needs some hardware enhancements, and
EPT must be enabled in KVM.  I think it can't work with, e.g., the TCG pure-emulation
accelerator.

> 
> > -x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> > + kvm_enabled()) {
> > +if (cpu->intel_pt_auto_level)
> > +x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> > +else
> > +warn_report("Intel PT need CPUID leaf 0x14, please set "
> > +"by \"-cpu ...,+intel-pt,level=0x14\"");
> 
> The warning shouldn't be triggered if level is already >= 0x14.
> 
> It is probably a good idea to mention that this happens only on
> pc-*-3.1 and older, as updating the machine-type is a better solution to the
> problem than manually setting the "level" property.
> 
> This will print the warning multiple times if there are multiple VCPUs.  You
> can use warn_report_once() to avoid that.

Got it. Will fix.
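
A minimal sketch of the behaviour requested in the review above (no warning
when level is already >= 0x14, and at most one warning regardless of the
number of VCPUs). This is not QEMU code: ToyCPU, toy_check_intel_pt_level()
and the message text are hypothetical stand-ins, fprintf() replaces
warn_report_once(), and the kvm_enabled() condition is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_INTEL_PT_MIN_LEVEL 0x14u

/* Hypothetical stand-in for the few X86CPU fields that matter here. */
typedef struct {
    bool intel_pt;      /* CPUID[7].EBX[25] requested */
    bool auto_level;    /* false models pc-*-3.1 and older */
    uint32_t level;     /* CPUID[0].EAX */
} ToyCPU;

/* Emulates the "once" part of warn_report_once() across all VCPUs. */
static bool toy_warned_once;

/* Returns true only when this call actually printed the warning. */
bool toy_check_intel_pt_level(ToyCPU *cpu)
{
    if (!cpu->intel_pt || cpu->level >= TOY_INTEL_PT_MIN_LEVEL) {
        return false;                   /* nothing to adjust, nothing to say */
    }
    if (cpu->auto_level) {
        cpu->level = TOY_INTEL_PT_MIN_LEVEL;    /* newer machine types */
        return false;
    }
    if (toy_warned_once) {
        return false;                   /* another VCPU already warned */
    }
    toy_warned_once = true;
    fprintf(stderr, "warning: Intel PT needs CPUID leaf 0x14 "
            "(pc-*-3.1 and older): use a newer machine type or "
            "-cpu ...,+intel-pt,level=0x14\n");
    return true;
}
```

Calling it once per VCPU during feature expansion would print the message for
the first affected VCPU only.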

As you mentioned in this email, "a warning is not an acceptable fix for a QEMU
crash."
We can't change the configuration of the old machine type because that may break
ABI compatibility. May I add another check on Intel PT: if CPUID[7].EBX[25]
(intel-pt) = 1 and level is < 0x14, mask off this feature? Or do you have any
other suggestions?

Thanks,
Luwei Kang

> 
> >  }
> >
> >  /* CPU topology with multi-dies support requires CPUID[0x1F]
> > */
> > --
> > 1.8.3.1
> >
> 
> --
> Eduardo

Re: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Eduardo Habkost

On Wed, Nov 06, 2019 at 12:16:53AM +, Moger, Babu wrote:
[...]
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index 51b72439b4..a72fe1db31 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
> > >  /* Physical Address of PVH entry point read from kernel ELF NOTE */
> > >  static size_t pvh_start_addr;
> > >
> > > -GlobalProperty pc_compat_4_1[] = {};
> > > +GlobalProperty pc_compat_4_1[] = {
> > > +{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
> > > +{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
> > > +{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
> > > +{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
> > > +{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
> > > +};
> > 
> > > machine-type-based CPU compatibility has now been replaced by
> > > versioned CPU models.  Please use the X86CPUDefinition.versions
> > > field to add a new version of EPYC instead.
> 
> > Ok. Did you mean like this commit below?
> fd63c6d1a5f77d68 ("i386: Add Cascadelake-Server-v2 CPU model")

Correct.  Thanks!
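
For readers unfamiliar with versioned CPU models, here is a rough,
self-contained sketch of the idea: each version carries a list of property
overrides, and resolving version N applies the overrides of versions 1..N in
order. ToyPropValue, ToyCPUVersion and toy_resolve_prop() are hypothetical
stand-ins written for this illustration, not QEMU's types, and the cumulative
application order is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in target/i386/cpu.c. */
typedef struct {
    const char *prop;
    const char *value;
} ToyPropValue;

typedef struct {
    int version;                /* 0 terminates the list */
    const ToyPropValue *props;  /* NULL means "no overrides" */
} ToyCPUVersion;

/* The five bits from the patch, expressed as a v2 of the model. */
static const ToyPropValue toy_epyc_v2_props[] = {
    { "perfctr-core", "on" },
    { "clzero", "on" },
    { "xsaveerptr", "on" },
    { "ibpb", "on" },
    { "xsaves", "on" },
    { NULL, NULL }              /* end of list */
};

static const ToyCPUVersion toy_epyc_versions[] = {
    { 1, NULL },                /* v1: the model as it exists today */
    { 2, toy_epyc_v2_props },
    { 0, NULL }                 /* end of list */
};

/* Apply overrides of every version up to and including 'version'. */
const char *toy_resolve_prop(const ToyCPUVersion *versions, int version,
                             const char *prop, const char *fallback)
{
    const char *value = fallback;

    for (const ToyCPUVersion *v = versions; v->version != 0; v++) {
        for (const ToyPropValue *p = v->props; p && p->prop; p++) {
            if (strcmp(p->prop, prop) == 0) {
                value = p->value;
            }
        }
        if (v->version == version) {
            break;
        }
    }
    return value;
}
```

With this shape, existing guests that stay on v1 keep their current feature
set, while v2 enables the new bits without touching pc_compat_4_1.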

-- 
Eduardo

RE: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Moger, Babu




> -Original Message-
> From: Eduardo Habkost 
> Sent: Tuesday, November 5, 2019 3:43 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; qemu-devel@nongnu.org
> Subject: Re: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model
> 
> On Tue, Nov 05, 2019 at 09:17:30PM +, Moger, Babu wrote:
> > Adds the following missing CPUID bits:
> > perfctr-core : core performance counter extensions support. Enables the VM
> >                to use extended performance counter support. It enables six
> >                programmable counters instead of four.
> > clzero       : instruction zeroes out the 64-byte cache line specified in RAX.
> > xsaveerptr   : FXSAVE, XSAVE, XSAVEOPT, XSAVEC, XSAVES always save error
> >                pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
> >                pointers.
> > ibpb         : Indirect Branch Prediction Barrier.
> > xsaves       : XSAVES, XRSTORS and IA32_XSS supported.
> >
> > Depends on:
> > 40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
> > 504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
> > 52297436199d ("kvm: svm: Update svm_xsaves_supported")
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  hw/i386/pc.c  |8 +++-
> >  target/i386/cpu.c |   11 +--
> >  2 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 51b72439b4..a72fe1db31 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
> >  /* Physical Address of PVH entry point read from kernel ELF NOTE */
> >  static size_t pvh_start_addr;
> >
> > -GlobalProperty pc_compat_4_1[] = {};
> > +GlobalProperty pc_compat_4_1[] = {
> > +{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
> > +};
> 
> machine-type-based CPU compatibility has now been replaced by
> versioned CPU models.  Please use the X86CPUDefinition.versions
> field to add a new version of EPYC instead.

Ok. Did you mean like this commit below?
fd63c6d1a5f77d68 ("i386: Add Cascadelake-Server-v2 CPU model")

> 
> >  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
> >
> >  GlobalProperty pc_compat_4_0[] = {};
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 07cf562d89..71233e6310 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -3110,19 +3110,18 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
> >  CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> > -CPUID_EXT3_TOPOEXT,
> > +CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
> > +.features[FEAT_8000_0008_EBX] =
> > +CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
> > +CPUID_8000_0008_EBX_IBPB,
> >  .features[FEAT_7_0_EBX] =
> >  CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
> >  CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
> >  CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLFLUSHOPT |
> >  CPUID_7_0_EBX_SHA_NI,
> > -/* Missing: XSAVES (not supported by some Linux versions,
> > - * including v4.1 to v4.12).
> > - * KVM doesn't yet expose any XSAVES state save component.
> > - */
> >  .features[FEAT_XSAVE] =
> >  CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> > -CPUID_XSAVE_XGETBV1,
> > +CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
> >  .features[FEAT_6_EAX] =
> >  CPUID_6_EAX_ARAT,
> >  .features[FEAT_SVM] =
> >
> 
> --
> Eduardo

[PATCH v2 4/4] target/arm: Add support for DC CVAP & DC CVADP ins

2019-11-05 Thread Beata Michalska

ARMv8.2 introduced support for Data Cache Clean instructions
to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
- DC CVADP. Both specify conceptual points in a memory system where all writes
that are to reach them are considered persistent.
The support provided treats both as the same point, so there is no
distinction between the two. If neither is available (there is no backing store
for the given memory), both will result in a Data Cache Clean up to the point of
coherency. Otherwise a sync of the specified range is performed.
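
The range that gets synced is a single data cache line. A minimal sketch of
that arithmetic, mirroring what the dccvap_writefn helper in this patch does
with CTR_EL0.DminLine; the toy_* function names are hypothetical and exist
only for this illustration:

```c
#include <stdint.h>

/*
 * CTR_EL0.DminLine, bits [19:16]: log2 of the number of 4-byte words in
 * the smallest data cache line, so the size in bytes is 4 << DminLine.
 */
static inline uint64_t toy_dcache_line_bytes(uint64_t ctr_el0)
{
    return UINT64_C(4) << ((ctr_el0 >> 16) & 0xF);
}

/* Round a virtual address down to the start of its cache line. */
static inline uint64_t toy_line_base(uint64_t vaddr, uint64_t line_bytes)
{
    return vaddr & ~(line_bytes - 1);
}
```

For example, DminLine = 4 gives a 64-byte line, and an address of 0x1234 is
rounded down to 0x1200 before the read probe and writeback.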

Signed-off-by: Beata Michalska 
---
 linux-user/elfload.c |  2 ++
 target/arm/cpu.h | 10 ++
 target/arm/cpu64.c   |  1 +
 target/arm/helper.c  | 56 
 4 files changed, 69 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index f6693e5..07b16cc 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -656,6 +656,7 @@ static uint32_t get_elf_hwcap(void)
 GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
 GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
 GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
+GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
 
 return hwcaps;
 }
@@ -665,6 +666,7 @@ static uint32_t get_elf_hwcap2(void)
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
 
+GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
 GET_FEATURE_ID(aa64_condm_5, ARM_HWCAP2_A64_FLAGM2);
 GET_FEATURE_ID(aa64_frint, ARM_HWCAP2_A64_FRINT);
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e1a66a2..0dc22c6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3617,6 +3617,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
 }
 
+static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
+}
+
+static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >= 2;
+}
+
 static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
 {
 /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 68baf04..e6a033e 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -661,6 +661,7 @@ static void aarch64_max_initfn(Object *obj)
 cpu->isar.id_aa64isar0 = t;
 
 t = cpu->isar.id_aa64isar1;
+t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
 t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
 t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
 t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index be67e2c..00c72e4 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5924,6 +5924,52 @@ static const ARMCPRegInfo rndr_reginfo[] = {
   .access = PL0_R, .readfn = rndr_readfn },
 REGINFO_SENTINEL
 };
+
+#ifndef CONFIG_USER_ONLY
+static void dccvap_writefn(CPUARMState *env, const ARMCPRegInfo *opaque,
+  uint64_t value)
+{
+ARMCPU *cpu = env_archcpu(env);
+/* CTR_EL0 System register -> DminLine, bits [19:16] */
+uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
+uint64_t vaddr_in = (uint64_t) value;
+uint64_t vaddr = vaddr_in & ~(dline_size - 1);
+void *haddr;
+int mem_idx = cpu_mmu_index(env, false);
+
+/* This won't be crossing page boundaries */
+haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
+if (haddr) {
+
+ram_addr_t offset;
+MemoryRegion *mr;
+
+/* RCU lock is already being held */
+mr = memory_region_from_host(haddr, &offset);
+
+if (mr) {
+memory_region_do_writeback(mr, offset, dline_size);
+}
+}
+}
+
+static const ARMCPRegInfo dcpop_reg[] = {
+{ .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
+  .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
+  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+REGINFO_SENTINEL
+};
+
+static const ARMCPRegInfo dcpodp_reg[] = {
+{ .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
+  .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
+  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+REGINFO_SENTINEL
+};
+#endif /*CONFIG_USER_ONLY*/
+
 #endif
 
 static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -6884,6 +6930,16 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 if (cpu_isar_feature(aa64_rndr, cpu)) {
 define_arm_cp_regs(cpu, rndr_reginfo);
 }
+#ifndef CONFIG_USER_ONLY
+/* Data Cache clean instructions up to PoP */
+if (cpu_isar_feature(aa64_dcpop, cpu)) {

[PATCH v2 3/4] migration: ram: Switch to ram block writeback

2019-11-05 Thread Beata Michalska

Switch to ram block writeback for pmem migration.

Signed-off-by: Beata Michalska 
Reviewed-by: Richard Henderson 
Acked-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 5078f94..38070f1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -33,7 +33,6 @@
 #include "qemu/bitops.h"
 #include "qemu/bitmap.h"
 #include "qemu/main-loop.h"
-#include "qemu/pmem.h"
 #include "xbzrle.h"
 #include "ram.h"
 #include "migration.h"
@@ -3981,9 +3980,7 @@ static int ram_load_cleanup(void *opaque)
 RAMBlock *rb;
 
 RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
-if (ramblock_is_pmem(rb)) {
-pmem_persist(rb->host, rb->used_length);
-}
+qemu_ram_block_writeback(rb);
 }
 
 xbzrle_load_cleanup();
-- 
2.7.4

Re: [EXTERNAL] Re: Adding New, Unsupported ISA to Qemu

2019-11-05 Thread Philippe Mathieu-Daudé


On 11/5/19 10:39 PM, Peter Maydell wrote:

On Tue, 5 Nov 2019 at 21:23, Hanson, Seth  wrote:

I completely understand your concern. Rest assured, this project is entirely
internal and requires no code contribution, unit testing, etc. from QEMU devs.
We simply want to garner as much documentation as possible to ensure optimal
conversion/compatibility. My team and I have already completed a majority of
our instruction set mapping into TCG. Lately, however, we've encountered issues
with floating point operations.

Yeah, for internal forks you have none of the upstreaming
issues (you're merely more on-your-own for figuring out
bugs :-))


I noticed in the TCG Readme that floating point operations are no longer 
officially supported but were previously (per the last paragraph in 4.1).


Git blame will tell you that that claim about floating point
has been in there since the readme was first added to
the project in 2008. It would be more accurate to say
simply that TCG does not natively implement fp operations.

TCG's approach to fp is to just (at the TCG opcode level)
treat fp registers the same way as integer ones -- they're
32 bit or 64 bit binary values. Mostly fp operations are
implemented by having the TCG code call out to a helper
function, the same way you'd implement any moderately
complex operation that's not easy to do with inline TCG ops.
From the helper function, you can call the various emulation
functions provided by our generic fpu emulation layer
('softfloat') whose headers are in include/fpu. The FPU
emulation provides IEEE-compliant implementations of
various basic operations; you have to tell it how your target
handles things that IEEE 754 doesn't nail down (eg whether
you detect tininess before or after rounding, what your NaN
format is, that kind of thing), through
a mix of calling the functions that initialize the 'float_status'
and also adding to the target-specific ifdeffery in
fpu/softfloat-specialize.inc.c. When your target needs things
that aren't IEEE-specified you just have to implement
emulation of those in your per-target code (arm does this
for the 'recpe' reciprocal-estimate instruction, for instance).

IEEE cumulative exception flags (inexact, denormal, etc)
are tracked in the float_status and need to be made visible
to the guest in whatever fp status register it uses to show
those. The default assumption is that IEEE exceptions
don't generate guest CPU exceptions, but you can implement
the latter if you need it -- see ppc for an example of that.


Can you please provide documentation for implementing the latter?


As usual for QEMU internals there is no documentation.
You can look at the headers in include/fpu which have some
comments describing the APIs, and at the existing CPUs
which use them to implement their FPU support.
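
As a rough, QEMU-independent illustration of the pattern described above: fp
registers are plain 32-bit values, the operation lives in a helper function,
and IEEE exception flags accumulate in per-CPU state until the guest reads
its status register. This sketch does not use softfloat; every name in it
(ToyCPUState, toy_helper_fdiv, toy_read_fpsr, the flag bits and the FPSR
layout) is hypothetical, it uses host float division instead of a software
implementation, and it ignores signaling NaNs, overflow and underflow flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sticky flags, standing in for those kept in float_status. */
enum {
    TOY_FLAG_INVALID   = 1u << 0,
    TOY_FLAG_DIVBYZERO = 1u << 1,
    TOY_FLAG_INEXACT   = 1u << 2,
};

#define TOY_DEFAULT_NAN 0x7fc00000u
#define TOY_INFINITY    0x7f800000u
#define TOY_SIGN_BIT    0x80000000u
#define TOY_ABS_MASK    0x7fffffffu

typedef struct {
    uint32_t fp_flags;          /* cumulative: only ever ORed into */
} ToyCPUState;

static float toy_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

static uint32_t toy_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/* Helper as generated code would call it: raw bits in, raw bits out. */
uint32_t toy_helper_fdiv(ToyCPUState *env, uint32_t a_bits, uint32_t b_bits)
{
    uint32_t a_abs = a_bits & TOY_ABS_MASK;
    uint32_t b_abs = b_bits & TOY_ABS_MASK;
    uint32_t sign = (a_bits ^ b_bits) & TOY_SIGN_BIT;

    if (a_abs > TOY_INFINITY || b_abs > TOY_INFINITY) {
        return TOY_DEFAULT_NAN;             /* NaN operand, no flag here */
    }
    if ((a_abs == TOY_INFINITY && b_abs == TOY_INFINITY) ||
        (a_abs == 0 && b_abs == 0)) {
        env->fp_flags |= TOY_FLAG_INVALID;  /* inf/inf or 0/0 */
        return TOY_DEFAULT_NAN;
    }
    if (a_abs == TOY_INFINITY) {
        return sign | TOY_INFINITY;
    }
    if (b_abs == TOY_INFINITY) {
        return sign;                        /* signed zero */
    }
    if (b_abs == 0) {
        env->fp_flags |= TOY_FLAG_DIVBYZERO;
        return sign | TOY_INFINITY;
    }

    float a = toy_from_bits(a_bits);
    float b = toy_from_bits(b_bits);
    volatile float q = a / b;               /* force rounding to float */

    /* q * b is exact in double, so any mismatch means q was rounded. */
    if ((double)q * (double)b != (double)a) {
        env->fp_flags |= TOY_FLAG_INEXACT;
    }
    return toy_to_bits(q);
}

/* Hypothetical guest status register: sticky flags in bits [10:8]. */
uint32_t toy_read_fpsr(const ToyCPUState *env)
{
    return env->fp_flags << 8;
}
```

In a real target the helper would call the softfloat functions from
include/fpu and read the flags back from float_status; the sketch only shows
where the cumulative flags live and how they become guest-visible.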


You can also read the git history (of a particular file/directory), you 
will learn a lot about API changes and why the design is this way today.


Also, looking at other similar contributions in the mailing list 
archives might give you useful hints. In particular, when patches were 
not accepted, look at what the reasons were.


Regards,

Phil.

[PATCH v2 2/4] Memory: Enable writeback for given memory region

2019-11-05 Thread Beata Michalska

Add an option to trigger memory writeback to sync a given memory region
with the corresponding backing store, if one is available.
This extends the support for persistent memory, allowing on-demand syncing.

Signed-off-by: Beata Michalska 
---
 exec.c  | 43 +++
 include/exec/memory.h   |  6 ++
 include/exec/ram_addr.h |  8 
 include/qemu/cutils.h   |  1 +
 memory.c| 12 
 util/cutils.c   | 47 +++
 6 files changed, 117 insertions(+)

diff --git a/exec.c b/exec.c
index ffdb518..e1f06de 100644
--- a/exec.c
+++ b/exec.c
@@ -65,6 +65,8 @@
 #include "exec/ram_addr.h"
 #include "exec/log.h"
 
+#include "qemu/pmem.h"
+
 #include "migration/vmstate.h"
 
 #include "qemu/range.h"
@@ -2156,6 +2158,47 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 return 0;
 }
 
+/*
+ * Trigger sync on the given ram block for range [start, start + length]
+ * with the backing store if one is available.
+ * Otherwise no-op.
+ * @Note: this is supposed to be a synchronous op.
+ */
+void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length)
+{
+void *addr = ramblock_ptr(block, start);
+
+/*
+ * The requested range might spread up to the very end of the block
+ */
+if ((start + length) > block->used_length) {
+qemu_log("%s: sync range outside the block boundaries: "
+ "start: " RAM_ADDR_FMT " length: " RAM_ADDR_FMT
+ " block length: " RAM_ADDR_FMT " Narrowing down ..." ,
+ __func__, start, length, block->used_length);
+length = block->used_length - start;
+}
+
+#ifdef CONFIG_LIBPMEM
+/* The lack of support for pmem should not block the sync */
+if (ramblock_is_pmem(block)) {
+pmem_persist(addr, length);
+} else
+#endif
+if (block->fd >= 0) {
+/**
+ * If there is no support for PMEM, or the memory has not been
+ * specified as persistent (or is not persistent), use msync.
+ * Less optimal but still achieves the same goal
+ */
+if (qemu_msync(addr, length, qemu_host_page_size, block->fd)) {
+warn_report("%s: failed to sync memory range: start: "
+RAM_ADDR_FMT " length: " RAM_ADDR_FMT,
+__func__, start, length);
+}
+}
+}
+
 /* Called with ram_list.mutex held */
 static void dirty_memory_extend(ram_addr_t old_ram_size,
 ram_addr_t new_ram_size)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e499dc2..27a84e0 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1265,6 +1265,12 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
  */
 void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize,
   Error **errp);
+/**
+ * memory_region_do_writeback: Trigger writeback for selected address range
+ * [addr, addr + size]
+ *
+ */
+void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size);
 
 /**
  * memory_region_set_log: Turn dirty logging on or off for a region.
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index bed0554..5adebb0 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -174,6 +174,14 @@ void qemu_ram_free(RAMBlock *block);
 
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
 
+void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
+
+/* Write back the whole block of mem */
+static inline void qemu_ram_block_writeback(RAMBlock *block)
+{
+qemu_ram_writeback(block, 0, block->used_length);
+}
+
 #define DIRTY_CLIENTS_ALL ((1 << DIRTY_MEMORY_NUM) - 1)
 #define DIRTY_CLIENTS_NOCODE  (DIRTY_CLIENTS_ALL & ~(1 << DIRTY_MEMORY_CODE))
 
diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
index b54c847..41c5fa9 100644
--- a/include/qemu/cutils.h
+++ b/include/qemu/cutils.h
@@ -130,6 +130,7 @@ const char *qemu_strchrnul(const char *s, int c);
 #endif
 time_t mktimegm(struct tm *tm);
 int qemu_fdatasync(int fd);
+int qemu_msync(void *addr, size_t length, size_t alignment, int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 int qemu_strtoi(const char *nptr, const char **endptr, int base,
diff --git a/memory.c b/memory.c
index c952eab..15734a0 100644
--- a/memory.c
+++ b/memory.c
@@ -2214,6 +2214,18 @@ void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp
 qemu_ram_resize(mr->ram_block, newsize, errp);
 }
 
+
+void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size)
+{
+/*
+ * Might be extended if needed to cover
+ * different types of memory regions
+ */
+if (mr->ram_block && mr->dirty_log_mask) {
+qemu_ram_writeback(mr->ram_block, addr, size);
+}
+}
+
 /*
  * Call proper memory listeners about the change o

[PATCH v2 1/4] tcg: cputlb: Add probe_read

2019-11-05 Thread Beata Michalska

Add probe_read alongside the write probing equivalent.

Signed-off-by: Beata Michalska 
Reviewed-by: Alex Bennée 
---
 include/exec/exec-all.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index d85e610..350c4b4 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -339,6 +339,12 @@ static inline void *probe_write(CPUArchState *env, target_ulong addr, int size,
 return probe_access(env, addr, size, MMU_DATA_STORE, mmu_idx, retaddr);
 }
 
+static inline void *probe_read(CPUArchState *env, target_ulong addr, int size,
+   int mmu_idx, uintptr_t retaddr)
+{
+return probe_access(env, addr, size, MMU_DATA_LOAD, mmu_idx, retaddr);
+}
+
 #define CODE_GEN_ALIGN   16 /* must be >= of the size of a icache line */
 
 /* Estimated block size for TB allocation.  */
-- 
2.7.4

[PATCH v2 0/4] target/arm: Support for Data Cache Clean up to PoP

2019-11-05 Thread Beata Michalska

ARMv8.2 introduced support for Data Cache Clean instructions to PoP
(point-of-persistence) and PoDP (point-of-deep-persistence):
ARMv8.2-DCCVAP & ARMv8.2-DCCVADP respectively.
This patch set adds support for emulating both, though there is no
distinction between the two points: the PoDP is assumed to represent
the same point of persistence as PoP. If no such point is specified
for the memory system in question, both fall back to the DC CVAC instruction
(clean up to the point of coherency).
The changes include adding probe_read to validate read memory
access, since read access is mandatory for both cache
clean instructions, along with support for writeback of requested memory
regions through msync, if available, and otherwise through fdatasync.
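
A minimal sketch of the writeback decision as this series describes it: use
the pmem path when the block is persistent memory and libpmem is built in,
otherwise msync a file-backed block, otherwise do nothing. The types and
function names below are hypothetical and only model the choice, they do not
perform any I/O. Unlike a plain "start + length" comparison, the clamp here
also copes with a start offset beyond the end of the block.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    TOY_SYNC_NONE,      /* no backing store: behaves like a clean to PoC */
    TOY_SYNC_PMEM,      /* persistent memory, libpmem available */
    TOY_SYNC_MSYNC,     /* file-backed memory, fall back to msync */
} ToySyncKind;

/* Hypothetical stand-in for the RAMBlock fields the decision depends on. */
typedef struct {
    bool is_pmem;
    bool have_libpmem;      /* models CONFIG_LIBPMEM */
    int fd;                 /* -1 when the block is not file-backed */
    uint64_t used_length;
} ToyRamBlock;

ToySyncKind toy_pick_sync(const ToyRamBlock *rb)
{
    if (rb->is_pmem && rb->have_libpmem) {
        return TOY_SYNC_PMEM;
    }
    if (rb->fd >= 0) {
        return TOY_SYNC_MSYNC;
    }
    return TOY_SYNC_NONE;
}

/* Narrow [start, start + length) so it stays inside the block. */
uint64_t toy_clamp_length(const ToyRamBlock *rb, uint64_t start,
                          uint64_t length)
{
    if (start >= rb->used_length) {
        return 0;
    }
    uint64_t room = rb->used_length - start;
    return length > room ? room : length;
}
```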

As the virt platform currently lacks NVDIMM support,
the changes have been tested with [1] & [2].


[1] https://patchwork.kernel.org/cover/10830237/
[2] https://patchwork.kernel.org/project/qemu-devel/list/?series=159441

v2:
- Moved the msync into a qemu wrapper with
  CONFIG_POSIX switch + additional comments
- Fixed length alignment
- Dropped treating the DC CVAP/CVADP as special case
  and moved those to conditional registration
- Dropped needless locking for grabbing mem region


Beata Michalska (4):
  tcg: cputlb: Add probe_read
  Memory: Enable writeback for given memory region
  migration: ram: Switch to ram block writeback
  target/arm: Add support for DC CVAP & DC CVADP ins

 exec.c  | 43 +
 include/exec/exec-all.h |  6 ++
 include/exec/memory.h   |  6 ++
 include/exec/ram_addr.h |  8 +++
 include/qemu/cutils.h   |  1 +
 linux-user/elfload.c|  2 ++
 memory.c| 12 +++
 migration/ram.c |  5 +
 target/arm/cpu.h| 10 +
 target/arm/cpu64.c  |  1 +
 target/arm/helper.c | 56 +
 util/cutils.c   | 47 +
 12 files changed, 193 insertions(+), 4 deletions(-)

-- 
2.7.4

Re: [PATCH v6 1/3] hw: rtc: Add Goldfish RTC device

2019-11-05 Thread Philippe Mathieu-Daudé


Hi Anup,

On 11/3/19 8:55 AM, Anup Patel wrote:

This patch adds a model for the Google Goldfish virtual platform RTC device.

We will be adding Goldfish RTC device to the QEMU RISC-V virt machine
for providing real date-time to Guest Linux. The corresponding Linux
driver for Goldfish RTC device is already available in upstream Linux.

For now, VM migration support is available but untested for the Goldfish RTC
device. It will be hardened in the future when we implement VM migration for
KVM RISC-V.

Signed-off-by: Anup Patel 
Reviewed-by: Alistair Francis 
---
  hw/rtc/Kconfig|   3 +
  hw/rtc/Makefile.objs  |   1 +
  hw/rtc/goldfish_rtc.c | 288 ++
  hw/rtc/trace-events   |   4 +
  include/hw/rtc/goldfish_rtc.h |  46 ++


Correct path, thanks :)


  5 files changed, 342 insertions(+)
  create mode 100644 hw/rtc/goldfish_rtc.c
  create mode 100644 include/hw/rtc/goldfish_rtc.h

diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig
index 45daa8d655..bafe6ac2c9 100644
--- a/hw/rtc/Kconfig
+++ b/hw/rtc/Kconfig
@@ -21,3 +21,6 @@ config MC146818RTC
  
  config SUN4V_RTC

  bool
+
+config GOLDFISH_RTC
+bool
diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs
index 8dc9fcd3a9..aa208d0d10 100644
--- a/hw/rtc/Makefile.objs
+++ b/hw/rtc/Makefile.objs
@@ -11,3 +11,4 @@ common-obj-$(CONFIG_EXYNOS4) += exynos4210_rtc.o
  obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
  common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o
  common-obj-$(CONFIG_ASPEED_SOC) += aspeed_rtc.o
+common-obj-$(CONFIG_GOLDFISH_RTC) += goldfish_rtc.o
diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c
new file mode 100644
index 00..f71f6eaab0
--- /dev/null
+++ b/hw/rtc/goldfish_rtc.c
@@ -0,0 +1,288 @@
+/*
+ * Goldfish virtual platform RTC
+ *
+ * Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ *
+ * For more details on Google Goldfish virtual platform refer:
+ * https://android.googlesource.com/platform/external/qemu/+/master/docs/GOLDFISH-VIRTUAL-HARDWARE.TXT


I'd use a (fixed) release tag, and not the (unstable) master branch:

https://android.googlesource.com/platform/external/qemu/+/refs/heads/emu-2.0-release/docs/GOLDFISH-VIRTUAL-HARDWARE.TXT


+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "hw/rtc/goldfish_rtc.h"
+#include "migration/vmstate.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "qemu/timer.h"
+#include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
+#include "qemu/log.h"
+
+#include "trace.h"
+
+#define RTC_TIME_LOW0x00
+#define RTC_TIME_HIGH   0x04
+#define RTC_ALARM_LOW   0x08
+#define RTC_ALARM_HIGH  0x0c
+#define RTC_IRQ_ENABLED 0x10
+#define RTC_CLEAR_ALARM 0x14
+#define RTC_ALARM_STATUS0x18
+#define RTC_CLEAR_INTERRUPT 0x1c
+
+static void goldfish_rtc_update(GoldfishRTCState *s)
+{
+qemu_set_irq(s->irq, (s->irq_pending & s->irq_enabled) ? 1 : 0);
+}
+
+static void goldfish_rtc_interrupt(void *opaque)
+{
+GoldfishRTCState *s = (GoldfishRTCState *)opaque;
+
+s->alarm_running = 0;
+s->irq_pending = 1;
+goldfish_rtc_update(s);
+}
+
+static uint64_t goldfish_rtc_get_count(GoldfishRTCState *s)
+{
+return s->tick_offset + (uint64_t)qemu_clock_get_ns(rtc_clock);
+}
+
+static void goldfish_rtc_clear_alarm(GoldfishRTCState *s)
+{
+timer_del(s->timer);
+s->alarm_running = 0;
+}
+
+static void goldfish_rtc_set_alarm(GoldfishRTCState *s)
+{
+uint64_t ticks = goldfish_rtc_get_count(s);
+uint64_t event = s->alarm_next;
+
+if (event <= ticks) {
+goldfish_rtc_clear_alarm(s);
+goldfish_rtc_interrupt(s);
+} else {
+/*
+ * We should be setting timer expiry to:
+ * qemu_clock_get_ns(rtc_clock) + (event - ticks)
+ * but this is equivalent to:
+ * event - s->tick_offset
+ */
+timer_mod(s->timer, event - s->tick_offset);
+s->alarm_running = 1;
+}
+}
+
+static uint64_t goldfish_rtc_read(void *opaque, hwaddr offset,
+  unsigned size)
+{
+GoldfishRTCState *s = opaque;
+uint64_t r = 0;
+
+switch (offset) {
+case RTC_TIME_LOW:
+r = goldfish_rtc_get_count(s) & 0x;
+break;
+case RTC_TIME_HIGH:
+r = goldfish_rtc_get_cou

Re: [PULL 00/13] Linux user for 4.2 patches

2019-11-05 Thread Philippe Mathieu-Daudé


On 11/5/19 11:14 PM, Laurent Vivier wrote:

Richard,

could you update your series?

If you prefer to wait next release I can drop your series from the pull
request.

Thanks,
Laurent

Le 05/11/2019 à 23:06, no-re...@patchew.org a écrit :

Patchew URL: https://patchew.org/QEMU/20191105181119.26779-1-laur...@vivier.eu/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PULL 00/13] Linux user for 4.2 patches
Type: series
Message-id: 20191105181119.26779-1-laur...@vivier.eu

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
36609b4..412fbef  master -> master
Switched to a new branch 'test'
965f842 linux-user/alpha: Set r20 secondary return value
a59ca3b linux-user/sparc: Fix cpu_clone_regs_*
046ba0d linux-user: Introduce cpu_clone_regs_parent
1afe1bc linux-user: Rename cpu_clone_regs to cpu_clone_regs_child
748db1e linux-user/sparc64: Fix target_signal_frame
2e90cc8 linux-user/sparc: Fix WREG usage in setup_frame
608f997 linux-user/sparc: Use WREG_SP constant in sparc/signal.c
279530b linux-user/sparc: Begin using WREG constants in sparc/signal.c
3d27837 linux-user/sparc: Use WREG constants in sparc/target_cpu.h
b30437c target/sparc: Define an enumeration for accessing env->regwptr
128b52d tests/tcg/multiarch/linux-test: Fix error check for shmat
e78b5ec scripts/qemu-binfmt-conf: Update for sparc64
5a6b0f4 linux-user: Support for NETLINK socket options

=== OUTPUT BEGIN ===
1/13 Checking commit 5a6b0f46c670 (linux-user: Support for NETLINK socket 
options)
2/13 Checking commit e78b5ec2867e (scripts/qemu-binfmt-conf: Update for sparc64)
WARNING: line over 80 characters
#36: FILE: scripts/qemu-binfmt-conf.sh:41:
+sparc64_magic='\x7fELF\x02\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x2b'


False positive. Shouldn't we take this file out of checkpatch default list?



ERROR: line over 90 characters
#37: FILE: scripts/qemu-binfmt-conf.sh:42:
+sparc64_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'


Ditto.



total: 1 errors, 1 warnings, 20 lines checked

Patch 2/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/13 Checking commit 128b52d81645 (tests/tcg/multiarch/linux-test: Fix error 
check for shmat)
4/13 Checking commit b30437c1b51f (target/sparc: Define an enumeration for 
accessing env->regwptr)
5/13 Checking commit 3d27837139f0 (linux-user/sparc: Use WREG constants in 
sparc/target_cpu.h)
6/13 Checking commit 279530b9caeb (linux-user/sparc: Begin using WREG constants 
in sparc/signal.c)
ERROR: spaces required around that '+' (ctx:VxV)
#52: FILE: linux-user/sparc/signal.c:151:
+__put_user(env->regwptr[WREG_O0 + i], &si->si_regs.u_regs[i+8]);
 ^


True positive :/



ERROR: spaces required around that '+' (ctx:VxV)
#124: FILE: linux-user/sparc/signal.c:290:
+__get_user(env->regwptr[i + WREG_O0], &sf->info.si_regs.u_regs[i+8]);
  ^


Again.



ERROR: spaces required around that '+' (ctx:VxV)
#171: FILE: linux-user/sparc/signal.c:460:
+w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
^


Again.



ERROR: spaces required around that '+' (ctx:VxV)
#206: FILE: linux-user/sparc/signal.c:563:
+w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
^


Again.



total: 4 errors, 0 warnings, 175 lines checked

Patch 6/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

7/13 Checking commit 608f99725ea6 (linux-user/sparc: Use WREG_SP constant in 
sparc/signal.c)
8/13 Checking commit 2e90cc889f5a (linux-user/sparc: Fix WREG usage in 
setup_frame)
9/13 Checking commit 748db1e8856b (linux-user/sparc64: Fix target_signal_frame)
ERROR: space prohibited between function name and open parenthesis '('
#24: FILE: linux-user/sparc/signal.c:90:
+uint32_t insns[2] __attribute__ ((aligned (8)));


False positive likely?
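
For reference, a minimal sketch of the spelling without the spaces, assuming GCC or Clang attribute syntax. The struct and helper names here are hypothetical and not the real target_signal_frame; the point is only that the attribute behaves the same when written as __attribute__((aligned(8))), which is presumably the form checkpatch wants:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for illustration; not the real target_signal_frame. */
struct frame_sketch {
    uint8_t  pad;                                   /* forces padding before the aligned member */
    uint32_t insns[2] __attribute__((aligned(8)));  /* same attribute, no spaces */
};

/* Returns the byte offset of insns and checks that it is 8-byte aligned. */
static size_t insns_offset(void)
{
    size_t off = offsetof(struct frame_sketch, insns);
    assert(off % 8 == 0);
    return off;
}
```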



total: 1 errors, 0 warnings, 16 lines checked

Patch 9/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

10/13 Checking commit 1afe1bce0919 (linux-user: Rename cpu_clone_regs to 
cpu_clone_regs_child)
11/13 Checking commit 046ba0d62866 (linux-user: Introduce cpu_clone_regs_parent)
12/13 Checking commit a59ca3b85381 (linux-user/sparc: Fix cpu_clone_regs_*)
13/13 Checking comm

[PULL v2 00/21] hw/i386/pc: Split PIIX3 southbridge from i440FX northbridge

2019-11-05 Thread Philippe Mathieu-Daudé

Hi Peter,

This is an X86/MIPS pull; Paolo and Aleksandar are OK with me sending it:

  https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg04959.html

Since v1: Fixed the Kconfig bug you reported here:

  https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00125.html

This is not a new feature, and the series was already posted before
soft freeze.

Regards,

Phil.

The following changes since commit 412fbef3d076c43e56451bacb28c4544858c66a3:

  Merge remote-tracking branch 
'remotes/philmd-gitlab/tags/fw_cfg-next-pull-request' into staging (2019-11-05 
20:17:11 +)

are available in the Git repository at:

  https://gitlab.com/philmd/qemu.git tags/mips-next-20191105

for you to fetch changes up to 48bc99a09cb160a3a2612c4bd5a8a225ed7bf6fb:

  hw/pci-host/i440fx: Remove the last PIIX3 traces (2019-11-05 23:33:12 +0100)


The i440FX northbridge is only used by the PC machine, while the
PIIX southbridge is also used by the Malta MIPS machine.

Split the PIIX3 southbridge from i440FX northbridge.



Hervé Poussineau (5):
  piix4: Add the Reset Control Register
  piix4: Add an i8259 Interrupt Controller as specified in datasheet
  piix4: Rename PIIX4 object to piix4-isa
  piix4: Add an i8257 DMA Controller as specified in datasheet
  piix4: Add an i8254 PIT Controller as specified in datasheet

Philippe Mathieu-Daudé (16):
  Makefile: Fix config-devices.mak not regenerated when Kconfig updated
  MAINTAINERS: Keep PIIX4 South Bridge separate from PC Chipsets
  Revert "irq: introduce qemu_irq_proxy()"
  piix4: Add a MC146818 RTC Controller as specified in datasheet
  hw/mips/mips_malta: Create IDE hard drive array dynamically
  hw/mips/mips_malta: Extract the PIIX4 creation code as piix4_create()
  hw/isa/piix4: Move piix4_create() to hw/isa/piix4.c
  hw/i386: Remove obsolete LoadStateHandler::load_state_old handlers
  hw/pci-host/piix: Extract piix3_create()
  hw/pci-host/piix: Move RCR_IOPORT register definition
  hw/pci-host/piix: Define and use the PIIX IRQ Route Control Registers
  hw/pci-host/piix: Move i440FX declarations to hw/pci-host/i440fx.h
  hw/pci-host/piix: Fix code style issues
  hw/pci-host/piix: Extract PIIX3 functions to hw/isa/piix3.c
  hw/pci-host: Rename incorrectly named 'piix' as 'i440fx'
  hw/pci-host/i440fx: Remove the last PIIX3 traces

 MAINTAINERS  |  14 +-
 Makefile |   3 +-
 hw/acpi/pcihp.c  |   2 +-
 hw/acpi/piix4.c  |  42 +--
 hw/core/irq.c|  14 -
 hw/i386/Kconfig  |   3 +-
 hw/i386/acpi-build.c |   5 +-
 hw/i386/pc_piix.c|  10 +-
 hw/i386/xen/xen-hvm.c|   5 +-
 hw/intc/apic_common.c|  49 
 hw/isa/Kconfig   |   4 +
 hw/isa/Makefile.objs |   1 +
 hw/isa/piix3.c   | 399 +
 hw/isa/piix4.c   | 151 ++-
 hw/mips/gt64xxx_pci.c|   5 +-
 hw/mips/mips_malta.c |  46 +---
 hw/pci-host/Kconfig  |   3 +-
 hw/pci-host/Makefile.objs|   2 +-
 hw/pci-host/{piix.c => i440fx.c} | 424 +--
 hw/timer/i8254_common.c  |  40 ---
 include/hw/acpi/piix4.h  |   6 -
 include/hw/i386/pc.h |  37 ---
 include/hw/irq.h |   5 -
 include/hw/isa/isa.h |   2 +
 include/hw/pci-host/i440fx.h |  36 +++
 include/hw/southbridge/piix.h|  74 ++
 stubs/pci-host-piix.c|   3 +-
 27 files changed, 701 insertions(+), 684 deletions(-)
 create mode 100644 hw/isa/piix3.c
 rename hw/pci-host/{piix.c => i440fx.c} (58%)
 delete mode 100644 include/hw/acpi/piix4.h
 create mode 100644 include/hw/pci-host/i440fx.h
 create mode 100644 include/hw/southbridge/piix.h

-- 
2.21.0

[PULL v2 01/21] Makefile: Fix config-devices.mak not regenerated when Kconfig updated

2019-11-05 Thread Philippe Mathieu-Daudé

When hw/$DIR/Kconfig is changed, the corresponding generated
hw/$DIR/config-devices.mak is not being updated.
Fix this by adding all the hw/*/Kconfig files to the prerequisite
names of the rule generating the config-devices.mak files.

Fixes: e0e312f3525a (build: switch to Kconfig)
Reported-by: Peter Maydell 
Suggested-by: Daniel P. Berrangé 
Reviewed-by: Laurent Vivier 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index bd6376d295..aa9d1a42aa 100644
--- a/Makefile
+++ b/Makefile
@@ -390,7 +390,8 @@ MINIKCONF_ARGS = \
 CONFIG_LINUX=$(CONFIG_LINUX) \
 CONFIG_PVRDMA=$(CONFIG_PVRDMA)
 
-MINIKCONF_INPUTS = $(SRC_PATH)/Kconfig.host $(SRC_PATH)/hw/Kconfig
+MINIKCONF_INPUTS = $(SRC_PATH)/Kconfig.host $(SRC_PATH)/hw/Kconfig \
+   $(wildcard $(SRC_PATH)/hw/*/Kconfig)
 MINIKCONF = $(PYTHON) $(SRC_PATH)/scripts/minikconf.py \
 
 $(SUBDIR_DEVICES_MAK): %/config-devices.mak: default-configs/%.mak 
$(MINIKCONF_INPUTS) $(BUILD_DIR)/config-host.mak
-- 
2.21.0

Re: [PULL 00/13] Linux user for 4.2 patches

2019-11-05 Thread Laurent Vivier

Richard,

could you update your series?

If you prefer to wait next release I can drop your series from the pull
request.

Thanks,
Laurent

Le 05/11/2019 à 23:06, no-re...@patchew.org a écrit :
> Patchew URL: 
> https://patchew.org/QEMU/20191105181119.26779-1-laur...@vivier.eu/
> 
> 
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Subject: [PULL 00/13] Linux user for 4.2 patches
> Type: series
> Message-id: 20191105181119.26779-1-laur...@vivier.eu
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> git rev-parse base > /dev/null || exit 0
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> ./scripts/checkpatch.pl --mailback base..
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>36609b4..412fbef  master -> master
> Switched to a new branch 'test'
> 965f842 linux-user/alpha: Set r20 secondary return value
> a59ca3b linux-user/sparc: Fix cpu_clone_regs_*
> 046ba0d linux-user: Introduce cpu_clone_regs_parent
> 1afe1bc linux-user: Rename cpu_clone_regs to cpu_clone_regs_child
> 748db1e linux-user/sparc64: Fix target_signal_frame
> 2e90cc8 linux-user/sparc: Fix WREG usage in setup_frame
> 608f997 linux-user/sparc: Use WREG_SP constant in sparc/signal.c
> 279530b linux-user/sparc: Begin using WREG constants in sparc/signal.c
> 3d27837 linux-user/sparc: Use WREG constants in sparc/target_cpu.h
> b30437c target/sparc: Define an enumeration for accessing env->regwptr
> 128b52d tests/tcg/multiarch/linux-test: Fix error check for shmat
> e78b5ec scripts/qemu-binfmt-conf: Update for sparc64
> 5a6b0f4 linux-user: Support for NETLINK socket options
> 
> === OUTPUT BEGIN ===
> 1/13 Checking commit 5a6b0f46c670 (linux-user: Support for NETLINK socket 
> options)
> 2/13 Checking commit e78b5ec2867e (scripts/qemu-binfmt-conf: Update for 
> sparc64)
> WARNING: line over 80 characters
> #36: FILE: scripts/qemu-binfmt-conf.sh:41:
> +sparc64_magic='\x7fELF\x02\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x2b'
> 
> ERROR: line over 90 characters
> #37: FILE: scripts/qemu-binfmt-conf.sh:42:
> +sparc64_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'
> 
> total: 1 errors, 1 warnings, 20 lines checked
> 
> Patch 2/13 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> 3/13 Checking commit 128b52d81645 (tests/tcg/multiarch/linux-test: Fix error 
> check for shmat)
> 4/13 Checking commit b30437c1b51f (target/sparc: Define an enumeration for 
> accessing env->regwptr)
> 5/13 Checking commit 3d27837139f0 (linux-user/sparc: Use WREG constants in 
> sparc/target_cpu.h)
> 6/13 Checking commit 279530b9caeb (linux-user/sparc: Begin using WREG 
> constants in sparc/signal.c)
> ERROR: spaces required around that '+' (ctx:VxV)
> #52: FILE: linux-user/sparc/signal.c:151:
> +__put_user(env->regwptr[WREG_O0 + i], &si->si_regs.u_regs[i+8]);
> ^
> 
> ERROR: spaces required around that '+' (ctx:VxV)
> #124: FILE: linux-user/sparc/signal.c:290:
> +__get_user(env->regwptr[i + WREG_O0], &sf->info.si_regs.u_regs[i+8]);
>  ^
> 
> ERROR: spaces required around that '+' (ctx:VxV)
> #171: FILE: linux-user/sparc/signal.c:460:
> +w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
>^
> 
> ERROR: spaces required around that '+' (ctx:VxV)
> #206: FILE: linux-user/sparc/signal.c:563:
> +w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
>^
> 
> total: 4 errors, 0 warnings, 175 lines checked
> 
> Patch 6/13 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> 7/13 Checking commit 608f99725ea6 (linux-user/sparc: Use WREG_SP constant in 
> sparc/signal.c)
> 8/13 Checking commit 2e90cc889f5a (linux-user/sparc: Fix WREG usage in 
> setup_frame)
> 9/13 Checking commit 748db1e8856b (linux-user/sparc64: Fix 
> target_signal_frame)
> ERROR: space prohibited between function name and open parenthesis '('
> #24: FILE: linux-user/sparc/signal.c:90:
> +uint32_t insns[2] __attribute__ ((aligned (8)));
> 
> total: 1 errors, 0 warnings, 16 lines checked
> 
> Patch 9/13 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> 10/13 Checking commit 1afe1bce0919 (linux-user: Rename cpu_clone_regs to 
> cpu_clone_regs_child)
> 11/13 Checking commit 046ba0d62866 (linux-user: Introduce 
> cpu_clone_regs_parent)
> 12/13 Checking commit a59ca3b85381 (linux-user/sparc: Fix cpu_clone_regs_*)
> 13/13 C

Re: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Eduardo Habkost

On Tue, Nov 05, 2019 at 09:17:30PM +, Moger, Babu wrote:
> Adds the following missing CPUID bits:
> perfctr-core : core performance counter extensions support. Enables the VM
>to use extended performance counter support. It enables six
>programmable counters instead of 4 counters.
> clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
> xsaveerptr   : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
>pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
>pointers.
> ibpb : Indirect Branch Prediction Barrier.
> xsaves   : XSAVES, XRSTORS and IA32_XSS supported.
> 
> Depends on:
> 40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
> 504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
> 52297436199d ("kvm: svm: Update svm_xsaves_supported")
> 
> Signed-off-by: Babu Moger 
> ---
>  hw/i386/pc.c  |8 +++-
>  target/i386/cpu.c |   11 +--
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 51b72439b4..a72fe1db31 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
>  /* Physical Address of PVH entry point read from kernel ELF NOTE */
>  static size_t pvh_start_addr;
>  
> -GlobalProperty pc_compat_4_1[] = {};
> +GlobalProperty pc_compat_4_1[] = {
> +{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
> +{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
> +{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
> +{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
> +{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
> +};

Machine-type-based CPU compatibility has now been replaced by
versioned CPU models.  Please use the X86CPUDefinition.versions
field to add a new version of EPYC instead.
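
To illustrate the idea, here is a rough, self-contained sketch of how a versioned model could carry the five properties from the hunk above. The types and names (PropSketch, VersionSketch, resolve_prop) are simplified, hypothetical stand-ins and not QEMU's actual definitions; the sketch also assumes that version N applies the properties of every version up to and including N:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not QEMU's real structures. */
typedef struct {
    const char *prop;   /* NULL terminates the list */
    const char *value;
} PropSketch;

typedef struct {
    int version;              /* 0 terminates the list */
    const PropSketch *props;  /* may be NULL for "no changes" */
} VersionSketch;

/* Version 2 turns on the bits the patch wanted to add to EPYC. */
static const PropSketch epyc_v2_props[] = {
    { "perfctr-core", "on" },
    { "clzero", "on" },
    { "xsaveerptr", "on" },
    { "ibpb", "on" },
    { "xsaves", "on" },
    { NULL, NULL },
};

static const VersionSketch epyc_versions[] = {
    { 1, NULL },            /* version 1: the model as it exists today */
    { 2, epyc_v2_props },
    { 0, NULL },
};

/*
 * Returns the value of `name` for the requested version, starting from
 * base_value and applying every version <= requested in list order.
 */
static const char *resolve_prop(const VersionSketch *versions, int requested,
                                const char *name, const char *base_value)
{
    const char *result = base_value;

    assert(versions != NULL && name != NULL);
    for (const VersionSketch *v = versions; v->version != 0; v++) {
        if (v->version > requested) {
            continue;
        }
        for (const PropSketch *p = v->props; p && p->prop; p++) {
            if (strcmp(p->prop, name) == 0) {
                result = p->value;
            }
        }
    }
    return result;
}
```

With this shape, existing machine types can keep resolving to version 1 while new ones pick up version 2, so no pc_compat_4_1 entries are needed.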

>  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
>  
>  GlobalProperty pc_compat_4_0[] = {};
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 07cf562d89..71233e6310 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3110,19 +3110,18 @@ static X86CPUDefinition builtin_x86_defs[] = {
>  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>  CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> -CPUID_EXT3_TOPOEXT,
> +CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
> +.features[FEAT_8000_0008_EBX] =
> +CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
> +CPUID_8000_0008_EBX_IBPB,
>  .features[FEAT_7_0_EBX] =
>  CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 
> |
>  CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
>  CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP | 
> CPUID_7_0_EBX_CLFLUSHOPT |
>  CPUID_7_0_EBX_SHA_NI,
> -/* Missing: XSAVES (not supported by some Linux versions,
> - * including v4.1 to v4.12).
> - * KVM doesn't yet expose any XSAVES state save component.
> - */
>  .features[FEAT_XSAVE] =
>  CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> -CPUID_XSAVE_XGETBV1,
> +CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>  .features[FEAT_6_EAX] =
>  CPUID_6_EAX_ARAT,
>  .features[FEAT_SVM] =
> 

-- 
Eduardo

Re: [PULL 00/13] Linux user for 4.2 patches

2019-11-05 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20191105181119.26779-1-laur...@vivier.eu/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PULL 00/13] Linux user for 4.2 patches
Type: series
Message-id: 20191105181119.26779-1-laur...@vivier.eu

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   36609b4..412fbef  master -> master
Switched to a new branch 'test'
965f842 linux-user/alpha: Set r20 secondary return value
a59ca3b linux-user/sparc: Fix cpu_clone_regs_*
046ba0d linux-user: Introduce cpu_clone_regs_parent
1afe1bc linux-user: Rename cpu_clone_regs to cpu_clone_regs_child
748db1e linux-user/sparc64: Fix target_signal_frame
2e90cc8 linux-user/sparc: Fix WREG usage in setup_frame
608f997 linux-user/sparc: Use WREG_SP constant in sparc/signal.c
279530b linux-user/sparc: Begin using WREG constants in sparc/signal.c
3d27837 linux-user/sparc: Use WREG constants in sparc/target_cpu.h
b30437c target/sparc: Define an enumeration for accessing env->regwptr
128b52d tests/tcg/multiarch/linux-test: Fix error check for shmat
e78b5ec scripts/qemu-binfmt-conf: Update for sparc64
5a6b0f4 linux-user: Support for NETLINK socket options

=== OUTPUT BEGIN ===
1/13 Checking commit 5a6b0f46c670 (linux-user: Support for NETLINK socket 
options)
2/13 Checking commit e78b5ec2867e (scripts/qemu-binfmt-conf: Update for sparc64)
WARNING: line over 80 characters
#36: FILE: scripts/qemu-binfmt-conf.sh:41:
+sparc64_magic='\x7fELF\x02\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x2b'

ERROR: line over 90 characters
#37: FILE: scripts/qemu-binfmt-conf.sh:42:
+sparc64_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'

total: 1 errors, 1 warnings, 20 lines checked

Patch 2/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/13 Checking commit 128b52d81645 (tests/tcg/multiarch/linux-test: Fix error 
check for shmat)
4/13 Checking commit b30437c1b51f (target/sparc: Define an enumeration for 
accessing env->regwptr)
5/13 Checking commit 3d27837139f0 (linux-user/sparc: Use WREG constants in 
sparc/target_cpu.h)
6/13 Checking commit 279530b9caeb (linux-user/sparc: Begin using WREG constants 
in sparc/signal.c)
ERROR: spaces required around that '+' (ctx:VxV)
#52: FILE: linux-user/sparc/signal.c:151:
+__put_user(env->regwptr[WREG_O0 + i], &si->si_regs.u_regs[i+8]);
^

ERROR: spaces required around that '+' (ctx:VxV)
#124: FILE: linux-user/sparc/signal.c:290:
+__get_user(env->regwptr[i + WREG_O0], &sf->info.si_regs.u_regs[i+8]);
 ^

ERROR: spaces required around that '+' (ctx:VxV)
#171: FILE: linux-user/sparc/signal.c:460:
+w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
   ^

ERROR: spaces required around that '+' (ctx:VxV)
#206: FILE: linux-user/sparc/signal.c:563:
+w_addr = TARGET_STACK_BIAS+env->regwptr[WREG_O6];
   ^

total: 4 errors, 0 warnings, 175 lines checked

Patch 6/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

7/13 Checking commit 608f99725ea6 (linux-user/sparc: Use WREG_SP constant in 
sparc/signal.c)
8/13 Checking commit 2e90cc889f5a (linux-user/sparc: Fix WREG usage in 
setup_frame)
9/13 Checking commit 748db1e8856b (linux-user/sparc64: Fix target_signal_frame)
ERROR: space prohibited between function name and open parenthesis '('
#24: FILE: linux-user/sparc/signal.c:90:
+uint32_t insns[2] __attribute__ ((aligned (8)));

total: 1 errors, 0 warnings, 16 lines checked

Patch 9/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

10/13 Checking commit 1afe1bce0919 (linux-user: Rename cpu_clone_regs to 
cpu_clone_regs_child)
11/13 Checking commit 046ba0d62866 (linux-user: Introduce cpu_clone_regs_parent)
12/13 Checking commit a59ca3b85381 (linux-user/sparc: Fix cpu_clone_regs_*)
13/13 Checking commit 965f842f57f6 (linux-user/alpha: Set r20 secondary return 
value)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20191105181119.26779-1-laur...@vivier.eu/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH 2/2] i386: Add 2nd Generation AMD EPYC processors

2019-11-05 Thread Moger, Babu

Adds support for 2nd Gen AMD EPYC processors. The model display
name will be EPYC-Rome.

Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
   use extended performance counter support. It enables six
   programmable counters instead of four counters.
clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
   pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
   pointers.
wbnoinvd : Write back and do not invalidate cache
ibpb : Indirect Branch Prediction Barrier
amd-stibp: Single Thread Indirect Branch Predictor
clwb : Cache Line Write Back and Retain
xsaves   : XSAVES, XRSTORS and IA32_XSS support
rdpid: Read Processor ID instruction support
umip : User-Mode Instruction Prevention support

The reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |  102 -
 target/i386/cpu.h |2 +
 2 files changed, 103 insertions(+), 1 deletion(-)
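
As a sanity check on the epyc_rome_cache_info hunk below, the .sets values can be cross-checked against the other fields, assuming the usual relation sets = size / (line_size * associativity * partitions). The helper name cache_sets and the local KiB/MiB macros are only for this sketch:

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024u
#define MiB (1024u * KiB)

/*
 * Number of sets implied by a cache geometry, assuming
 * size = sets * line_size * associativity * partitions.
 */
static uint32_t cache_sets(uint32_t size, uint32_t line_size,
                           uint32_t associativity, uint32_t partitions)
{
    uint32_t bytes_per_set = line_size * associativity * partitions;

    assert(bytes_per_set != 0 && size % bytes_per_set == 0);
    return size / bytes_per_set;
}
```

Plugging in the values from the hunk (32 KiB 8-way L1, 512 KiB 8-way L2, 16 MiB 16-way L3, all with 64-byte lines and one partition) gives 64, 1024 and 16384 sets, which matches the .sets fields.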

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 71233e6310..846662c879 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1133,7 +1133,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 "clzero", NULL, "xsaveerptr", NULL,
 NULL, NULL, NULL, NULL,
 NULL, "wbnoinvd", NULL, NULL,
-"ibpb", NULL, NULL, NULL,
+"ibpb", NULL, NULL, "amd-stibp",
 NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
 "amd-ssbd", "virt-ssbd", "amd-no-ssb", NULL,
@@ -1796,6 +1796,56 @@ static CPUCaches epyc_cache_info = {
 },
 };
 
+static CPUCaches epyc_rome_cache_info = {
+.l1d_cache = &(CPUCacheInfo) {
+.type = DATA_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l1i_cache = &(CPUCacheInfo) {
+.type = INSTRUCTION_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l2_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 2,
+.size = 512 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 1024,
+.lines_per_tag = 1,
+},
+.l3_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 3,
+.size = 16 * MiB,
+.line_size = 64,
+.associativity = 16,
+.partitions = 1,
+.sets = 16384,
+.lines_per_tag = 1,
+.self_init = true,
+.inclusive = true,
+.complex_indexing = true,
+},
+};
+
 static X86CPUDefinition builtin_x86_defs[] = {
 {
 .name = "qemu64",
@@ -3194,6 +3244,56 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .model_id = "Hygon Dhyana Processor",
 .cache_info = &epyc_cache_info,
 },
+{
+.name = "EPYC-Rome",
+.level = 0xd,
+.vendor = CPUID_VENDOR_AMD,
+.family = 23,
+.model = 49,
+.stepping = 0,
+.features[FEAT_1_EDX] =
+CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
+CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
+CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
+CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
+CPUID_VME | CPUID_FP87,
+.features[FEAT_1_ECX] =
+CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
+CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
+CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
+CPUID_EXT_CX16 | CPUID_EXT_FMA | CPUID_EXT_SSSE3 |
+CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
+.features[FEAT_8000_0001_EDX] =
+CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
+CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
+CPUID_EXT2_SYSCALL,
+.features[FEAT_8000_0001_ECX] =
+

Re: Adding New, Unsupported ISA to Qemu

2019-11-05 Thread Palmer Dabbelt


On Tue, 05 Nov 2019 08:42:53 PST (-0800), stefa...@gmail.com wrote:

On Mon, Nov 04, 2019 at 11:50:11PM +, Hanson, Seth via wrote:

I'm looking for in-depth documentation pertaining to how an unsupported 16 bit 
RISC ISA can be emulated in Qemu.

I've referenced this:

https://wiki.qemu.org/Documentation/TCG

and have been hoping there's additional, related documentation that I've 
overlooked.


The general advice I've seen is:

1. Look at existing TCG targets to learn how to implement aspects of
   your ISA.


Michael wrote a pair of blog posts describing our port.  They're part of the "All 
Aboard" series, which details the RISC-V ports of the various core software 
components (binutils, GCC, glibc, Linux, and QEMU):


https://www.sifive.com/blog/risc-v-qemu-part-1-privileged-isa-hifive1-virtio

https://www.sifive.com/blog/risc-v-qemu-part-2-the-risc-v-qemu-port-is-upstream

It's a whole different thing than the documentation and is two years out of 
date, but it at least provides some perspective on why certain things in our 
port were done the way they were in case you end up looking at the code.



2. If you are unfamiliar with emulation, CPU ISA, or just-in-time
   compiler concepts, try to read up on them and then look back at the
   QEMU code.  Things will be clearer.

You're welcome to join #qemu IRC on irc.oftc.net to ask questions.

Good luck!

Stefan

Re: [PATCH 2/3] dp8393x: fix dp8393x_receive()

2019-11-05 Thread Laurent Vivier

Le 05/11/2019 à 22:06, Hervé Poussineau a écrit :
> Le 02/11/2019 à 18:15, Laurent Vivier a écrit :
>> address_space_rw() access size must be multiplied by the width.
>>
>> This fixes DHCP for Q800 guest.
>>
>> Signed-off-by: Laurent Vivier 
>> ---
>>   hw/net/dp8393x.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
>> index 85d3f3788e..b8c4473f99 100644
>> --- a/hw/net/dp8393x.c
>> +++ b/hw/net/dp8393x.c
>> @@ -833,7 +833,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
>> const uint8_t * buf,
>>   } else {
>>   dp8393x_put(s, width, 0, 0); /* in_use */
>>   address_space_rw(&s->as, dp8393x_crda(s) + sizeof(uint16_t)
>> * 6 * width,
>> -    MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data,
>> sizeof(uint16_t), 1);
>> +    MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data, size, 1);
>>   s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
>>   s->regs[SONIC_ISR] |= SONIC_ISR_PKTRX;
>>   s->regs[SONIC_RSC] = (s->regs[SONIC_RSC] & 0xff00) |
>> (((s->regs[SONIC_RSC] & 0x00ff) + 1) & 0x00ff);
>>
> 
> This patch is problematic.
> The code was initially created with "size".
> It was changed in 409b52bfe199d8106dadf7c5ff3d88d2228e89b5 to fix
> networking in NetBSD 5.1.
> 
> To test with NetBSD 5.1
> - boot the installer (arccd-5.1.iso)
> - choose (S)hell option
> - "ifconfig sn0 10.0.2.15 netmask 255.255.255.0"
> - "route add default 10.0.2.2"
> - networking should work (I test with "ftp 212.27.63.3")

I have the firmware from
http://hpoussineau.free.fr/qemu/firmware/magnum-4000/setup.zip
Which file should I use? NTPROM.RAW?

> Without this patch, I get the FTP banner.
> With this patch, connection can't be established.
> 
> In datasheet page 17, you can see the "Receive Descriptor Format", which
> contains the in_use field.
> It is clearly said that RXpkt.in_use is 16 bit wide, and that the bits
> 16-31 are not used in 32-bit mode.
> 
> So, I don't see why you need to clear 32 bits in 32-bit mode. Maybe you
> need to clear only the other
> 16 bits? Maybe it depends on endianness?

Thank you for the details. I think the problem likely comes from
the endianness.

The offset must be adjusted according to the access mode (endianness and
size).

The following patch fixes the problem for me, and should not break other
targets:

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index 85d3f3788e..3d991af163 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -831,9 +831,15 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
 /* EOL detected */
 s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
 } else {
-dp8393x_put(s, width, 0, 0); /* in_use */
-address_space_rw(&s->as, dp8393x_crda(s) + sizeof(uint16_t) * 6
* width,
-MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data,
sizeof(uint16_t), 1);
+/* Clear in_use, but it is always 16bit wide */
+int offset = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
+if (s->big_endian && width == 2) {
+/* we need to adjust the offset of the 16bit field */
+offset += sizeof(uint16_t);
+}
+s->data[0] = 0;
+address_space_rw(&s->as, offset, MEMTXATTRS_UNSPECIFIED,
+ (uint8_t *)s->data, sizeof(uint16_t), 1);
 s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
 s->regs[SONIC_ISR] |= SONIC_ISR_PKTRX;
 s->regs[SONIC_RSC] = (s->regs[SONIC_RSC] & 0xff00) |
(((s->regs[SONIC_RSC] & 0x00ff) + 1) & 0x00ff);
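
The offset adjustment in the patch above can be sketched in isolation as follows. The helper in_use_offset is hypothetical and not part of hw/net/dp8393x.c; it assumes in_use is the seventh field (index 6) of the receive descriptor, that width is 1 in 16-bit mode and 2 in 32-bit mode, and that in big-endian 32-bit mode the significant 16 bits sit in the second half of the 32-bit slot:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte offset of the 16-bit in_use field,
 * relative to the start of the current receive descriptor.
 */
static size_t in_use_offset(size_t desc_base, int width, int big_endian)
{
    size_t offset;

    assert(width == 1 || width == 2);
    offset = desc_base + sizeof(uint16_t) * 6 * width;
    if (big_endian && width == 2) {
        /* the 16-bit value lives in the upper-address half of the slot */
        offset += sizeof(uint16_t);
    }
    return offset;
}
```

In 16-bit mode and in little-endian 32-bit mode the offset is unchanged from the current code (12 and 24 bytes into the descriptor); only the big-endian 32-bit case moves, to 26.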

What is your opinion?

Thanks,
Laurent

Re: [EXTERNAL] Re: Adding New, Unsupported ISA to Qemu

2019-11-05 Thread Hanson, Seth



Gentlemen,

Thank you for your input. 


Peter,

I completely understand your concern. Rest assured, this project is entirely 
internal and requires no code contribution, unit testing, etc. from QEMU devs. 
We simply want to garner as much documentation as possible to ensure optimal 
conversion/compatibility. My team and I have already completed a majority of 
our instruction set mapping into TCG. Lately however, we've encountered issues 
with floating point operations. 

I noticed in the TCG Readme that floating point operations are no longer 
officially supported but were previously (per the last paragraph in 4.1).

Can you please provide documentation for implementing the latter?


Regards,
Seth



From: Peter Maydell 
Sent: Tuesday, November 5, 2019 1:55 PM
To: Stefan Hajnoczi
Cc: Hanson, Seth; qemu-devel@nongnu.org
Subject: [EXTERNAL] Re: Adding New, Unsupported ISA to Qemu

On Tue, 5 Nov 2019 at 16:44, Stefan Hajnoczi  wrote
> The general advice I've seen is:
>
> 1. Look at existing TCG targets to learn how to implement aspects of
>your ISA.

...and *don't* look at older/less maintained targets (including
x86), as they have a lot of bad habits you don't want to copy.
Using 'decodetree' is probably a good idea.

> 2. If you are unfamiliar with emulation, CPU ISA, or just-in-time
>compiler concepts, try to read up on them and then look back at the
>QEMU code.  Things will be clearer.

I would also add
3.  Don't expect getting this implemented and upstream to be easy.

(Apologies if the following sounds pessimistic and off-putting;
but I would prefer people to have a clear understanding of
what they're getting into and not assume the chances of
success are higher than they might actually be.)

"New TCG target" is an unlucky combination of:
 (1) it's quite a lot of work in pure amount-of-code terms
 (2) because it is a big feature it is not a good choice as a "first
   contribution to the project", but new targets often are proposed
   and written by people who don't have any previous history of
   writing QEMU code
 (3) we already have targets for the common CPU ISAs, so
   anything new is likely to be obscure and not have many people
   who care about it either in our userbase or in our dev community.
   (riscv is the obvious recent exception here, as it is clearly relevant
   as a new architecture and has attracted multiple people to work
   on it and contribute both code and reviews)

1 and 2 mean that code review of a new TCG target is a lot
of work, and 3 means it's not clear how much return the project
gets for that investment :-(

There is not a large community of upstream developers who are
interested in maintaining a lot of obscure guest architectures:
we essentially rely on the goodwill and not-entirely-work-time
of just a few people when it comes to reviewing new TCG targets.
That means that patchsets often hang around on list for a long
time without getting attention.

Our past historical experience has often been that when people
contribute TCG targets, we do a lot of work on our end with
code review and helping to get the code into upstream QEMU, and
then these people more or less disappear, leaving us with the
burden of something we have to support and no help doing it.
If in general people submitting new TCG targets were all
*helping each other*, passing on what they learned to the
next person along, contributing code review, updating older
code as QEMU APIs improve/churn, etc, then I think I'd feel
differently about this. But to be honest mostly I find myself
thinking "oh dear, not another one".

We already have two new TCG ports with patches on list
which are kind of stalled due to not having enough existing
upstream QEMU devs who can/will code review them (and
another which hasn't had patches posted but might do soon).
The odds for your new port having a happier future don't seem
too great to me :-(

thanks
-- PMM

[PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Moger, Babu

Adds the following missing CPUID bits:
perfctr-core : Core performance counter extensions support. Enables the VM
               to use extended performance counters, providing six
               programmable counters instead of four.
clzero       : The CLZERO instruction zeroes out the 64-byte cache line
               specified in RAX.
xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
               pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
               pointers.
ibpb         : Indirect Branch Prediction Barrier.
xsaves       : XSAVES, XRSTORS and IA32_XSS supported.

Depends on:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c  |8 +++-
 target/i386/cpu.c |   11 +--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 51b72439b4..a72fe1db31 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 /* Physical Address of PVH entry point read from kernel ELF NOTE */
 static size_t pvh_start_addr;
 
-GlobalProperty pc_compat_4_1[] = {};
+GlobalProperty pc_compat_4_1[] = {
+{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
+};
 const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
 
 GlobalProperty pc_compat_4_0[] = {};
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 07cf562d89..71233e6310 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3110,19 +3110,18 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
 CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
 CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
-CPUID_EXT3_TOPOEXT,
+CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
+.features[FEAT_8000_0008_EBX] =
+CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
+CPUID_8000_0008_EBX_IBPB,
 .features[FEAT_7_0_EBX] =
 CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
 CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
 CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLFLUSHOPT |
 CPUID_7_0_EBX_SHA_NI,
-/* Missing: XSAVES (not supported by some Linux versions,
- * including v4.1 to v4.12).
- * KVM doesn't yet expose any XSAVES state save component.
- */
 .features[FEAT_XSAVE] =
 CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
-CPUID_XSAVE_XGETBV1,
+CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
 .features[FEAT_6_EAX] =
 CPUID_6_EAX_ARAT,
 .features[FEAT_SVM] =

[PATCH 37/55] roms/Makefile.edk2: don't pull in submodules when building from tarball

2019-11-05 Thread Michael Roth

Currently the `make efi` target pulls submodules nested under the
roms/edk2 submodule as dependencies. However, when we attempt to build
from a tarball this fails since we are no longer in a git tree.

A preceding patch will pre-populate these submodules in the tarball,
so assume this build dependency is only needed when building from a
git tree.

Cc: Laszlo Ersek 
Cc: Bruce Rogers 
Cc: qemu-sta...@nongnu.org # v4.1.0
Reported-by: Bruce Rogers 
Reviewed-by: Laszlo Ersek 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Michael Roth 
Message-Id: <20190912231202.12327-3-mdr...@linux.vnet.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit f3e330e3c319160ac04954399b5a10afc965098c)
Signed-off-by: Michael Roth 
---
 roms/Makefile.edk2 | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/roms/Makefile.edk2 b/roms/Makefile.edk2
index c2f2ff59d5..33a074d3a4 100644
--- a/roms/Makefile.edk2
+++ b/roms/Makefile.edk2
@@ -46,8 +46,13 @@ all: $(foreach flashdev,$(flashdevs),../pc-bios/edk2-$(flashdev).fd.bz2) \
 # files.
 .INTERMEDIATE: $(foreach flashdev,$(flashdevs),../pc-bios/edk2-$(flashdev).fd)
 
+# Fetch edk2 submodule's submodules. If it is not in a git tree, assume
+# we're building from a tarball and that they've already been fetched by
+# make-release/tarball scripts.
 submodules:
-   cd edk2 && git submodule update --init --force
+   if test -d edk2/.git; then \
+   cd edk2 && git submodule update --init --force; \
+   fi
 
 # See notes on the ".NOTPARALLEL" target and the "+" indicator in
 # "tests/uefi-test-tools/Makefile".
-- 
2.17.1

[PATCH 0/2] Add support for 2nd generation AMD EPYC processors

2019-11-05 Thread Moger, Babu

The following series adds support for 2nd generation AMD EPYC processors
in QEMU guests. The model display name will be EPYC-Rome.

It also fixes a few missing CPU feature bits in the 1st generation EPYC models.

The reference documents are available at:
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

---

Babu Moger (2):
  i386: Add missing cpu feature bits in EPYC model
  i386: Add 2nd Generation AMD EPYC processors


 hw/i386/pc.c  |8 +++-
 target/i386/cpu.c |  113 ++---
 target/i386/cpu.h |2 +
 3 files changed, 115 insertions(+), 8 deletions(-)

--

[PATCH 06/55] xen-bus: Fix backend state transition on device reset

2019-11-05 Thread Michael Roth

From: Anthony PERARD 

When a frontend wants to reset its state and the backend one, it
starts with setting "Closing", then waits for the backend (QEMU) to do
the same.

But when QEMU sets its own state to "Closing", it triggers an event
(xenstore watch) that re-executes xen_device_backend_changed() and sets
the backend state to "Closed". QEMU should wait for the frontend to
set "Closed" before doing the same.

Before setting backend_state to "Closed", we are also going to
check if there is a frontend. If that is the case, when the backend state
is set to "Closing" the frontend should react and set its state to
"Closing" then "Closed". The backend should wait for that to happen.

Fixes: b6af8926fb858c4f1426e5acb2cfc1f0580ec98a
Signed-off-by: Anthony PERARD 
Reviewed-by: Paul Durrant 
Message-Id: <20190823101534.465-2-anthony.per...@citrix.com>
(cherry picked from commit cb3231460747552d70af9d546dc53d8195bcb796)
Signed-off-by: Michael Roth 
---
 hw/xen/xen-bus.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 7503eea9e9..5929aa4b2e 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -516,6 +516,23 @@ static void xen_device_backend_set_online(XenDevice *xendev, bool online)
 xen_device_backend_printf(xendev, "online", "%u", online);
 }
 
+/*
+ * Tell from the state whether the frontend is likely alive,
+ * i.e. it will react to a change of state of the backend.
+ */
+static bool xen_device_state_is_active(enum xenbus_state state)
+{
+switch (state) {
+case XenbusStateInitWait:
+case XenbusStateInitialised:
+case XenbusStateConnected:
+case XenbusStateClosing:
+return true;
+default:
+return false;
+}
+}
+
 static void xen_device_backend_changed(void *opaque)
 {
 XenDevice *xendev = opaque;
@@ -539,11 +556,11 @@ static void xen_device_backend_changed(void *opaque)
 
 /*
  * If the toolstack (or unplug request callback) has set the backend
- * state to Closing, but there is no active frontend (i.e. the
- * state is not Connected) then set the backend state to Closed.
+ * state to Closing, but there is no active frontend then set the
+ * backend state to Closed.
  */
 if (xendev->backend_state == XenbusStateClosing &&
-xendev->frontend_state != XenbusStateConnected) {
+!xen_device_state_is_active(state)) {
 xen_device_backend_set_state(xendev, XenbusStateClosed);
 }
 
-- 
2.17.1

Re: [EXTERNAL] Re: Adding New, Unsupported ISA to Qemu

2019-11-05 Thread Peter Maydell

On Tue, 5 Nov 2019 at 21:23, Hanson, Seth  wrote:
> I completely understand your concern. Rest assured, this project is entirely
> internal and requires no code contribution, unit testing, etc. from QEMU
> devs. We simply want to garner as much documentation as possible to ensure
> optimal conversion/compatibility. My team and I have already completed a
> majority of our instruction set mapping into TCG. Lately however, we've
> encountered issues with floating point operations.

Yeah, for internal forks you have none of the upstreaming
issues (you're merely more on-your-own for figuring out
bugs :-))

> I noticed in the TCG Readme that floating point operations are no longer 
> officially supported but were previously (per the last paragraph in 4.1).

Git blame will tell you that that claim about floating point
has been in there since the readme was first added to
the project in 2008. It would be more accurate to say
simply that TCG does not natively implement fp operations.

TCG's approach to fp is to just (at the TCG opcode level)
treat fp registers the same way as integer ones -- they're
32 bit or 64 bit binary values. Mostly fp operations are
implemented by having the TCG code call out to a helper
function, the same way you'd implement any moderately
complex operation that's not easy to do with inline TCG ops.
From the helper function, you can call the various emulation
functions provided by our generic fpu emulation layer
('softfloat') whose headers are in include/fpu. The FPU
emulation provides IEEE-compliant implementations of
various basic operations; you have to tell it how your target
handles things that IEEE 754 doesn't nail down (eg whether
you detect tininess before or after rounding, what your NaN
format is, that kind of thing), through
a mix of calling the functions that initialize the 'float_status'
and also adding to the target-specific ifdeffery in
fpu/softfloat-specialize.inc.c. When your target needs things
that aren't IEEE-specified you just have to implement
emulation of those in your per-target code (arm does this
for the 'recpe' reciprocal-estimate instruction, for instance).

IEEE cumulative exception flags (inexact, denormal, etc)
are tracked in the float_status and need to be made visible
to the guest in whatever fp status register it uses to show
those. The default assumption is that IEEE exceptions
don't generate guest CPU exceptions, but you can implement
the latter if you need it -- see ppc for an example of that.
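
As a rough illustration of the helper-function approach described above,
here is a standalone C sketch. It does not use QEMU's real softfloat API:
GuestCPU, FloatStatus, helper_fdiv and the flag names are all hypothetical,
and the arithmetic is delegated to the host's float type, which assumes an
IEEE 754 host. It only shows the shape of the idea: fp registers are raw
32-bit values, the operation lives in a helper, and sticky exception flags
accumulate in a status structure that the guest can later read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a float status: sticky exception flags only. */
enum {
    FLAG_INVALID   = 1u << 0,
    FLAG_DIVBYZERO = 1u << 1,
};

typedef struct {
    uint8_t exception_flags;   /* cumulative; never cleared by the helper */
} FloatStatus;

/* Hypothetical guest CPU: fp registers are just 32-bit binary values. */
typedef struct {
    uint32_t fpr[8];
    FloatStatus fp_status;
} GuestCPU;

static float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

static uint32_t f32_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/*
 * Helper that translated code would call for a guest "fdiv rd, ra, rb".
 * Only the zero-divisor cases are flagged in this sketch.
 */
static void helper_fdiv(GuestCPU *cpu, int rd, int ra, int rb)
{
    assert(rd >= 0 && rd < 8 && ra >= 0 && ra < 8 && rb >= 0 && rb < 8);

    uint32_t a = cpu->fpr[ra];
    uint32_t b = cpu->fpr[rb];
    uint32_t a_mag = a & 0x7fffffffu;   /* magnitude without the sign bit */
    uint32_t b_mag = b & 0x7fffffffu;

    if (b_mag == 0) {
        if (a_mag == 0) {
            cpu->fp_status.exception_flags |= FLAG_INVALID;    /* 0 / 0 */
        } else if (a_mag < 0x7f800000u) {
            cpu->fp_status.exception_flags |= FLAG_DIVBYZERO;  /* finite / 0 */
        }
    }
    cpu->fpr[rd] = f32_to_bits(f32_from_bits(a) / f32_from_bits(b));
}
```

A real target would replace the host division with the softfloat routines
from include/fpu and copy the accumulated flags into whatever fp status
register the guest architecture defines.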

> Can you please provide documentation for implementing the latter?

As usual for QEMU internals there is no documentation.
You can look at the headers in include/fpu which have some
comments describing the APIs, and at the existing CPUs
which use them to implement their FPU support.

thanks
-- PMM

[PATCH 05/55] pc: Don't make die-id mandatory unless necessary

2019-11-05 Thread Michael Roth

From: Eduardo Habkost 

We have this issue reported when using libvirt to hotplug CPUs:
https://bugzilla.redhat.com/show_bug.cgi?id=1741451

Basically, libvirt is not copying die-id from
query-hotpluggable-cpus, but die-id is now mandatory.

We could blame libvirt and say it is not following the documented
interface, because we have this buried in the QAPI schema
documentation:

> Note: currently there are 5 properties that could be present
> but management should be prepared to pass through other
> properties with device_add command to allow for future
> interface extension. This also requires the field names to be kept in
> sync with the properties passed to -device/device_add.

But I don't think this would be reasonable of us.  We can just
make QEMU more flexible and let die-id be omitted when there's
no ambiguity.  This will allow us to keep compatibility with
existing libvirt versions.

Test case included to ensure we don't break this again.

Fixes: commit 176d2cda0dee ("i386/cpu: Consolidate die-id validity in smp context")
Signed-off-by: Eduardo Habkost 
Message-Id: <20190816170750.23910-1-ehabk...@redhat.com>
Signed-off-by: Eduardo Habkost 
(cherry picked from commit fea374e7c8079563bca7c8fac895c6a880f76adc)
Signed-off-by: Michael Roth 
---
 hw/i386/pc.c |  8 ++
 tests/acceptance/pc_cpu_hotplug_props.py | 35 
 2 files changed, 43 insertions(+)
 create mode 100644 tests/acceptance/pc_cpu_hotplug_props.py

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..947f81070f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2403,6 +2403,14 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 int max_socket = (ms->smp.max_cpus - 1) /
 smp_threads / smp_cores / pcms->smp_dies;
 
+/*
+ * die-id was optional in QEMU 4.0 and older, so keep it optional
+ * if there's only one die per socket.
+ */
+if (cpu->die_id < 0 && pcms->smp_dies == 1) {
+cpu->die_id = 0;
+}
+
 if (cpu->socket_id < 0) {
 error_setg(errp, "CPU socket-id is not set");
 return;
diff --git a/tests/acceptance/pc_cpu_hotplug_props.py b/tests/acceptance/pc_cpu_hotplug_props.py
new file mode 100644
index 00..08b7e632c6
--- /dev/null
+++ b/tests/acceptance/pc_cpu_hotplug_props.py
@@ -0,0 +1,35 @@
+#
+# Ensure CPU die-id can be omitted on -device
+#
+#  Copyright (c) 2019 Red Hat Inc
+#
+# Author:
+#  Eduardo Habkost 
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+#
+
+from avocado_qemu import Test
+
+class OmittedCPUProps(Test):
+    """
+    :avocado: tags=arch:x86_64
+    """
+    def test_no_die_id(self):
+        self.vm.add_args('-nodefaults', '-S')
+        self.vm.add_args('-smp', '1,sockets=2,cores=2,threads=2,maxcpus=8')
+        self.vm.add_args('-cpu', 'qemu64')
+        self.vm.add_args('-device', 'qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0')
+        self.vm.launch()
+        self.assertEquals(len(self.vm.command('query-cpus')), 2)
-- 
2.17.1

Re: [PATCH 2/3] dp8393x: fix dp8393x_receive()

2019-11-05 Thread Hervé Poussineau


Le 02/11/2019 à 18:15, Laurent Vivier a écrit :

address_space_rw() access size must be multiplied by the width.

This fixes DHCP for Q800 guest.

Signed-off-by: Laurent Vivier 
---
  hw/net/dp8393x.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index 85d3f3788e..b8c4473f99 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -833,7 +833,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf,
  } else {
  dp8393x_put(s, width, 0, 0); /* in_use */
  address_space_rw(&s->as, dp8393x_crda(s) + sizeof(uint16_t) * 6 * width,
-MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data, sizeof(uint16_t), 1);
+MEMTXATTRS_UNSPECIFIED, (uint8_t *)s->data, size, 1);
  s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
  s->regs[SONIC_ISR] |= SONIC_ISR_PKTRX;
  s->regs[SONIC_RSC] = (s->regs[SONIC_RSC] & 0xff00) | (((s->regs[SONIC_RSC] & 0x00ff) + 1) & 0x00ff);



This patch is problematic.
The code was initially created with "size".
It was changed in 409b52bfe199d8106dadf7c5ff3d88d2228e89b5 to fix networking in 
NetBSD 5.1.

To test with NetBSD 5.1
- boot the installer (arccd-5.1.iso)
- choose (S)hell option
- "ifconfig sn0 10.0.2.15 netmask 255.255.255.0"
- "route add default 10.0.2.2"
- networking should work (I test with "ftp 212.27.63.3")

Without this patch, I get the FTP banner.
With this patch, connection can't be established.

On page 17 of the datasheet, you can see the "Receive Descriptor Format",
which contains the in_use field.
It clearly states that RXpkt.in_use is 16 bits wide, and that bits 16-31
are not used in 32-bit mode.

So I don't see why you need to clear 32 bits in 32-bit mode. Maybe you need to
clear only the other 16 bits? Maybe it depends on endianness?
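
To make the endianness question concrete, here is a small standalone C
sketch (not QEMU code). The helper names and the flat byte-array descriptor
are hypothetical simplifications; the only thing taken from the patch is the
offset expression sizeof(uint16_t) * 6 * width. It shows that zeroing two
bytes at the start of a 4-byte in_use slot clears the 16 data bits only if
they sit in the first two bytes of the slot, as they would in a
little-endian 32-bit word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of in_use (7th field); width is 1 (16-bit) or 2 (32-bit). */
static size_t in_use_offset(int width)
{
    assert(width == 1 || width == 2);
    return sizeof(uint16_t) * 6 * width;
}

/* Zero 'len' bytes starting at the in_use slot of a raw descriptor image. */
static void clear_in_use(uint8_t *desc, int width, size_t len)
{
    memset(desc + in_use_offset(width), 0, len);
}

/* Return 1 if any byte of the whole in_use slot is still non-zero. */
static int in_use_slot_nonzero(const uint8_t *desc, int width)
{
    const uint8_t *p = desc + in_use_offset(width);

    for (size_t i = 0; i < sizeof(uint16_t) * width; i++) {
        if (p[i] != 0) {
            return 1;
        }
    }
    return 0;
}
```

With width == 2, a slot holding the value 1 as bytes 01 00 00 00 is fully
cleared by a 2-byte write, while a slot holding it as 00 00 00 01 is not.
Whether the SONIC actually lays the field out that way on a given board is
exactly the open question here.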

Regards,

Hervé

[PATCH 55/55] virtio-blk: Cancel the pending BH when the dataplane is reset

2019-11-05 Thread Michael Roth

From: Philippe Mathieu-Daudé 

When 'system_reset' is called, the main loop clears the memory
region cache before the BH has a chance to execute. Later, when
the deferred function is called, some assumptions that were
made when scheduling it are no longer true.

This is what happens using a virtio-blk device (fresh RHEL7.8 install):

 $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \
   | qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \
 -device virtio-blk-pci,id=image1,drive=drive_image1 \
 -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \
 -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \
 -monitor stdio -serial null -nographic
  (qemu) system_reset
  (qemu) system_reset
  (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed.
  Aborted

  (gdb) bt
  Thread 1 (Thread 0x7f109c17b680 (LWP 10939)):
  #0  0x5604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227
  #1  0x56040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235
  #2  0x56040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648
  #3  0x56040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662
  #4  0x5604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75
  #5  0x56040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90
  #6  0x56040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118
  #7  0x560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460
  #8  0x56040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261
  #9  0x7f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
  #10 0x560408841445 in glib_pollfds_poll () at util/main-loop.c:215
  #11 0x5604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238
  #12 0x5604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514
  #13 0x560408416b1e in main_loop () at vl.c:1923
  #14 0x56040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578

Fix this by cancelling the BH when the virtio dataplane is stopped.

[This version of the patch was modified as discussed with Philippe on
the mailing list thread.
--Stefan]

Reported-by: Yihuang Yu 
Suggested-by: Stefan Hajnoczi 
Fixes: https://bugs.launchpad.net/qemu/+bug/1839428
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20190816171503.24761-1-phi...@redhat.com>
Signed-off-by: Stefan Hajnoczi 
(cherry picked from commit ebb6ff25cd888a52a64a9adc3692541c6d1d9a42)
Signed-off-by: Michael Roth 
---
 hw/block/dataplane/virtio-blk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 158c78f852..5fea76df85 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -297,6 +297,9 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
 virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i);
 }
 
+qemu_bh_cancel(s->bh);
+notify_guest_bh(s); /* final chance to notify guest */
+
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, nvqs, false);
 
-- 
2.17.1

[PATCH 10/55] pr-manager: Fix invalid g_free() crash bug

2019-11-05 Thread Michael Roth

From: Markus Armbruster 

pr_manager_worker() passes its @opaque argument to g_free().  Wrong;
it points to pr_manager_worker()'s automatic @data.  Broken when
commit 2f3a7ab39be converted @data from heap- to stack-allocated.  Fix
by deleting the g_free().

Fixes: 2f3a7ab39bec4ba8022dc4d42ea641165b004e3e
Cc: qemu-sta...@nongnu.org
Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 6b9d62c2a9e83bbad73fb61406f0ff69b46ff6f3)
Signed-off-by: Michael Roth 
---
 scsi/pr-manager.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/scsi/pr-manager.c b/scsi/pr-manager.c
index ee43663576..0c866e8698 100644
--- a/scsi/pr-manager.c
+++ b/scsi/pr-manager.c
@@ -39,7 +39,6 @@ static int pr_manager_worker(void *opaque)
 int fd = data->fd;
 int r;
 
-g_free(data);
 trace_pr_manager_run(fd, hdr->cmdp[0], hdr->cmdp[1]);
 
 /* The reference was taken in pr_manager_execute.  */
-- 
2.17.1

Re: [PATCH v1 Resend] target/i386: set the CPUID level to 0x14 on old machine-type

2019-11-05 Thread Eduardo Habkost

On Wed, Oct 30, 2019 at 02:28:02PM +0800, Luwei Kang wrote:
> The CPUID level need to be set to 0x14 manually on old
> machine-type if Intel PT is enabled in guest. e.g. in Qemu 3.1
> -machine pc-i440fx-3.1 -cpu qemu64,+intel-pt
> will be CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1.
> 
> Some Intel PT capabilities are exposed by leaf 0x14 and the
> missing capabilities will cause some MSRs access failed.
> This patch add a warning message to inform the user to extend
> the CPUID level.

Note that a warning is not an acceptable fix for a QEMU crash.
We still need to fix the QEMU crash reported at:
https://lore.kernel.org/qemu-devel/20191024141536.gu6...@habkost.net/

> 
> Suggested-by: Eduardo Habkost 
> Signed-off-by: Luwei Kang 

The subject line says "v1", but this patch is different from the
v1 you sent earlier.

If you are sending a different patch, please indicate it is a new
version.  Please also indicate what changed between different
patch versions, to help review.

> ---
>  target/i386/cpu.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index a624163..f67c479 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5440,8 +5440,12 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>  
>  /* Intel Processor Trace requires CPUID[0x14] */
>  if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
> - kvm_enabled() && cpu->intel_pt_auto_level) {

Not directly related to the warning: do you know why we have a
kvm_enabled() check here?  It seems unnecessary.  We want CPUID
level to be correct for all accelerators.

> -x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> + kvm_enabled()) {
> +if (cpu->intel_pt_auto_level)
> +x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> +else
> +warn_report("Intel PT need CPUID leaf 0x14, please set "
> +"by \"-cpu ...,+intel-pt,level=0x14\"");

The warning shouldn't be triggered if level is already >= 0x14.

It is probably a good idea to mention that this happens only on
pc-*-3.1 and older, as updating the machine-type is a better
solution to the problem than manually setting the "level"
property.

This will print the warning multiple times if there are multiple
VCPUs.  You can use warn_report_once() to avoid that.
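
Put together, the review comments above amount to roughly the following
logic. This is a standalone sketch with hypothetical names (VcpuConfig,
check_intel_pt_level, a plain static latch in place of warn_report_once()),
and it simplifies QEMU's level handling to a single field:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define INTEL_PT_MIN_LEVEL 0x14u

/* Hypothetical, simplified view of the per-vCPU state involved. */
typedef struct {
    bool intel_pt;
    bool intel_pt_auto_level;
    uint32_t cpuid_level;
} VcpuConfig;

static bool warned;   /* process-wide "already reported" latch */

/* Returns true if this call printed the warning. */
static bool check_intel_pt_level(VcpuConfig *cfg)
{
    assert(cfg != NULL);

    /* Nothing to do without Intel PT or when the level is already enough. */
    if (!cfg->intel_pt || cfg->cpuid_level >= INTEL_PT_MIN_LEVEL) {
        return false;
    }
    /* Newer machine types: raise the level automatically. */
    if (cfg->intel_pt_auto_level) {
        cfg->cpuid_level = INTEL_PT_MIN_LEVEL;
        return false;
    }
    /* Older machine types: warn, but only once for all vCPUs. */
    if (warned) {
        return false;
    }
    warned = true;
    fprintf(stderr, "warning: Intel PT needs CPUID level >= 0x14; "
                    "use a newer machine type or set level=0x14\n");
    return true;
}
```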

>  }
>  
>  /* CPU topology with multi-dies support requires CPUID[0x1F] */
> -- 
> 1.8.3.1
> 

-- 
Eduardo

[PATCH 51/55] hbitmap: handle set/reset with zero length

2019-11-05 Thread Michael Roth

From: Vladimir Sementsov-Ogievskiy 

Passing zero length to these functions leads to unpredictable results.
A zero-length set/reset may occur in active-mirror, on a zero-length write
(which is unlikely, but not guaranteed to never happen).

Let's just do nothing on a zero-length request.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20191011090711.19940-2-vsement...@virtuozzo.com
Reviewed-by: Max Reitz 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
(cherry picked from commit fed33bd175f663cc8c13f8a490a4f35a19756cfe)
Signed-off-by: Michael Roth 
---
 util/hbitmap.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/util/hbitmap.c b/util/hbitmap.c
index 71c6ba2c52..c059313b9e 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -387,6 +387,10 @@ void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count)
 uint64_t first, n;
 uint64_t last = start + count - 1;
 
+if (count == 0) {
+return;
+}
+
 trace_hbitmap_set(hb, start, count,
   start >> hb->granularity, last >> hb->granularity);
 
@@ -478,6 +482,10 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count)
 uint64_t last = start + count - 1;
 uint64_t gran = 1ULL << hb->granularity;
 
+if (count == 0) {
+return;
+}
+
 assert(QEMU_IS_ALIGNED(start, gran));
 assert(QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size));
 
-- 
2.17.1

[PATCH 54/55] scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)

2019-11-05 Thread Michael Roth

From: Paolo Bonzini 

When executing a script in lsi_execute_script(), the LSI SCSI adapter
emulator advances the 's->dsp' index to read the next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.
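
The idea of counting at the top of the loop, so that skipped opcodes are
counted too, can be sketched as follows. The tiny script format and all
names here are hypothetical and unrelated to the real LSI SCRIPTS encoding;
only the 10k limit comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSN 10000   /* the "10k iterations" limit */

enum { RUN_DONE = 0, RUN_FORCED_STOP = 1 };

/*
 * Hypothetical script format: 0 is an empty opcode (skipped), 1 jumps back
 * to index 0, any other value halts. The program counter wraps around to
 * model a guest memory region that contains nothing but the script.
 */
static int run_script(const uint32_t *script, size_t len)
{
    size_t pc = 0;
    int insn_processed = 0;

    assert(script != NULL && len > 0);

    for (;;) {
        /* Checked before decoding, so empty opcodes cannot bypass it. */
        if (++insn_processed > MAX_INSN) {
            return RUN_FORCED_STOP;
        }
        uint32_t insn = script[pc];
        pc = (pc + 1) % len;
        if (insn == 0) {
            continue;           /* empty opcode: skipped but counted */
        }
        if (insn == 1) {
            pc = 0;             /* jump back to the start */
            continue;
        }
        return RUN_DONE;
    }
}
```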

Reported-by: Bugs SysSec 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Prasad J Pandit 
Signed-off-by: Paolo Bonzini 
(cherry picked from commit de594e47659029316bbf9391efb79da0a1a08e08)
Signed-off-by: Michael Roth 
---
 hw/scsi/lsi53c895a.c | 41 +++--
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 10468c1ec1..72f7b59ab5 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -185,6 +185,9 @@ static const char *names[] = {
 /* Flag set if this is a tagged command.  */
 #define LSI_TAG_VALID (1 << 16)
 
+/* Maximum instructions to process. */
+#define LSI_MAX_INSN 10000
+
 typedef struct lsi_request {
 SCSIRequest *req;
 uint32_t tag;
@@ -1132,7 +1135,21 @@ static void lsi_execute_script(LSIState *s)
 
 s->istat1 |= LSI_ISTAT1_SRUN;
 again:
-insn_processed++;
+if (++insn_processed > LSI_MAX_INSN) {
+/* Some windows drivers make the device spin waiting for a memory
+   location to change.  If we have been executed a lot of code then
+   assume this is the case and force an unexpected device disconnect.
+   This is apparently sufficient to beat the drivers into submission.
+ */
+if (!(s->sien0 & LSI_SIST0_UDC)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "lsi_scsi: inf. loop with UDC masked");
+}
+lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
+lsi_disconnect(s);
+trace_lsi_execute_script_stop();
+return;
+}
 insn = read_dword(s, s->dsp);
 if (!insn) {
 /* If we receive an empty opcode increment the DSP by 4 bytes
@@ -1569,19 +1586,7 @@ again:
 }
 }
 }
-if (insn_processed > 10000 && s->waiting == LSI_NOWAIT) {
-/* Some windows drivers make the device spin waiting for a memory
-   location to change.  If we have been executed a lot of code then
-   assume this is the case and force an unexpected device disconnect.
-   This is apparently sufficient to beat the drivers into submission.
- */
-if (!(s->sien0 & LSI_SIST0_UDC)) {
-qemu_log_mask(LOG_GUEST_ERROR,
-  "lsi_scsi: inf. loop with UDC masked");
-}
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
-} else if (s->istat1 & LSI_ISTAT1_SRUN && s->waiting == LSI_NOWAIT) {
+if (s->istat1 & LSI_ISTAT1_SRUN && s->waiting == LSI_NOWAIT) {
 if (s->dcntl & LSI_DCNTL_SSM) {
 lsi_script_dma_interrupt(s, LSI_DSTAT_SSI);
 } else {
@@ -1969,6 +1974,10 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val)
 case 0x2f: /* DSP[24:31] */
 s->dsp &= 0x00ffffff;
 s->dsp |= val << 24;
+/*
+ * FIXME: if s->waiting != LSI_NOWAIT, this will only execute one
+ * instruction.  Is this correct?
+ */
 if ((s->dmode & LSI_DMODE_MAN) == 0
 && (s->istat1 & LSI_ISTAT1_SRUN) == 0)
 lsi_execute_script(s);
@@ -1987,6 +1996,10 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val)
 break;
 case 0x3b: /* DCNTL */
 s->dcntl = val & ~(LSI_DCNTL_PFF | LSI_DCNTL_STD);
+/*
+ * FIXME: if s->waiting != LSI_NOWAIT, this will only execute one
+ * instruction.  Is this correct?
+ */
 if ((val & LSI_DCNTL_STD) && (s->istat1 & LSI_ISTAT1_SRUN) == 0)
 lsi_execute_script(s);
 break;
-- 
2.17.1

Re: [PULL 0/2] fw_cfg for-4.2-soft-freeze patches

2019-11-05 Thread Peter Maydell

On Sun, 3 Nov 2019 at 22:26, Philippe Mathieu-Daudé  wrote:
>
> Hi Peter,
>
> One fw_cfg fix from David Gilbert.
>
> The following changes since commit f3cad9c6dbd4b9877232c44bf2dd877353a73209:
>
>   iotests: Remove 130 from the "auto" group (2019-10-31 11:04:10 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/philmd/qemu.git tags/fw_cfg-next-pull-request
>
> for you to fetch changes up to eda4e62cc2f5d12fcedcf799a5a3f9eba855ad77:
>
>   tests/fw_cfg: Test 'reboot-timeout=-1' special value (2019-11-01 19:19:24 +0100)
>
> 
> Fix the fw_cfg reboot-timeout=-1 special value, add a test for it.
>
> 
>
> Dr. David Alan Gilbert (1):
>   fw_cfg: Allow reboot-timeout=-1 again
>
> Philippe Mathieu-Daudé (1):
>   tests/fw_cfg: Test 'reboot-timeout=-1' special value
>
>  hw/nvram/fw_cfg.c   |  7 ---
>  tests/fw_cfg-test.c | 21 +
>  2 files changed, 25 insertions(+), 3 deletions(-)

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM

[PATCH 50/55] util/hbitmap: strict hbitmap_reset

2019-11-05 Thread Michael Roth

From: Vladimir Sementsov-Ogievskiy 

hbitmap_reset has a non-obvious property: it rounds the requested region up.
This may provoke bugs, as in the recently fixed write-blocking mode of
mirror: the user calls reset on an unaligned region, not keeping in mind that
there may be unrelated dirty bytes covered by the rounded-up region,
and the information about this unrelated "dirtiness" will be lost.

Make hbitmap_reset strict: assert that arguments are aligned, allowing
only one exception when @start + @count == hb->orig_size. It's needed
to accommodate users of hbitmap_next_dirty_area, which cares about
hb->orig_size.
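
The rule stated above can be written as a small predicate. This is a
standalone sketch with hypothetical names (is_aligned, reset_args_valid); it
mirrors the described check but is not the hbitmap implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'align' must be a power of two. */
static bool is_aligned(uint64_t value, uint64_t align)
{
    return (value & (align - 1)) == 0;
}

/*
 * A reset request is accepted when @start is aligned to the granule size
 * and @count is either aligned too or ends exactly at the bitmap's size.
 */
static bool reset_args_valid(uint64_t start, uint64_t count,
                             int granularity, uint64_t orig_size)
{
    assert(granularity >= 0 && granularity < 64);

    uint64_t gran = UINT64_C(1) << granularity;

    return is_aligned(start, gran) &&
           (is_aligned(count, gran) || start + count == orig_size);
}
```

For example, with a granularity of 1 (granules of two bits) and a bitmap of
seven bits, resetting (0, 2) and the unaligned tail (6, 1) is allowed, while
(0, 1) and (1, 2) are rejected.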

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20190806152611.280389-1-vsement...@virtuozzo.com>
[Maintainer edit: Max's suggestions from on-list. --js]
[Maintainer edit: Eric's suggestion for aligned macro. --js]
Signed-off-by: John Snow 
(cherry picked from commit 48557b138383aaf69c2617ca9a88bfb394fc50ec)
*prereq for fed33bd175f663cc8c13f8a490a4f35a19756cfe
Signed-off-by: Michael Roth 
---
 include/qemu/hbitmap.h | 5 +
 tests/test-hbitmap.c   | 2 +-
 util/hbitmap.c | 4 
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 4afbe6292e..1bf944ca3d 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -132,6 +132,11 @@ void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count);
  * @count: Number of bits to reset.
  *
  * Reset a consecutive range of bits in an HBitmap.
+ * @start and @count must be aligned to bitmap granularity. The only exception
+ * is resetting the tail of the bitmap: @count may be equal to hb->orig_size -
+ * @start, in this case @count may be not aligned. The sum of @start + @count is
+ * allowed to be greater than hb->orig_size, but only if @start < hb->orig_size
+ * and @start + @count = ALIGN_UP(hb->orig_size, granularity).
  */
 void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count);
 
diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 592d8219db..2be56d1597 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -423,7 +423,7 @@ static void test_hbitmap_granularity(TestHBitmapData *data,
 hbitmap_test_check(data, 0);
 hbitmap_test_set(data, 0, 3);
 g_assert_cmpint(hbitmap_count(data->hb), ==, 4);
-hbitmap_test_reset(data, 0, 1);
+hbitmap_test_reset(data, 0, 2);
 g_assert_cmpint(hbitmap_count(data->hb), ==, 2);
 }
 
diff --git a/util/hbitmap.c b/util/hbitmap.c
index bcc0acdc6a..71c6ba2c52 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -476,6 +476,10 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count)
 /* Compute range in the last layer.  */
 uint64_t first;
 uint64_t last = start + count - 1;
+uint64_t gran = 1ULL << hb->granularity;
+
+assert(QEMU_IS_ALIGNED(start, gran));
+assert(QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size));
 
 trace_hbitmap_reset(hb, start, count,
 start >> hb->granularity, last >> hb->granularity);
-- 
2.17.1

Re: Adding New, Unsupported ISA to Qemu

2019-11-05 Thread Peter Maydell

On Tue, 5 Nov 2019 at 16:44, Stefan Hajnoczi wrote:
> The general advice I've seen is:
>
> 1. Look at existing TCG targets to learn how to implement aspects of
>your ISA.

...and *don't* look at older/less maintained targets (including
x86), as they have a lot of bad habits you don't want to copy.
Using 'decodetree' is probably a good idea.

> 2. If you are unfamiliar with emulation, CPU ISA, or just-in-time
>compiler concepts, try to read up on them and then look back at the
>QEMU code.  Things will be clearer.

I would also add
3.  Don't expect getting this implemented and upstream to be easy.

(Apologies if the following sounds pessimistic and off-putting;
but I would prefer people to have a clear understanding of
what they're getting into and not assume the chances of
success are higher than they might actually be.)

"New TCG target" is an unlucky combination of:
 (1) it's quite a lot of work in pure amount-of-code terms
 (2) because it is a big feature it is not a good choice as a "first
   contribution to the project", but new targets often are proposed
   and written by people who don't have any previous history of
   writing QEMU code
 (3) we already have targets for the common CPU ISAs, so
   anything new is likely to be obscure and not have many people
   who care about it either in our userbase or in our dev community.
   (riscv is the obvious recent exception here, as it is clearly relevant
   as a new architecture and has attracted multiple people to work
   on it and contribute both code and reviews)

1 and 2 mean that code review of a new TCG target is a lot
of work, and 3 means it's not clear how much return the project
gets for that investment :-(

There is not a large community of upstream developers who are
interested in maintaining a lot of obscure guest architectures:
we essentially rely on the goodwill and not-entirely-work-time
of just a few people when it comes to reviewing new TCG targets.
That means that patchsets often hang around on list for a long
time without getting attention.

Our past experience has often been that when people
contribute TCG targets, we do a lot of work on our end with
code review and helping to get the code into upstream QEMU, and
then these people more or less disappear, leaving us with the
burden of something we have to support and no help doing it.
If in general people submitting new TCG targets were all
*helping each other*, passing on what they learned to the
next person along, contributing code review, updating older
code as QEMU APIs improve/churn, etc, then I think I'd feel
differently about this. But to be honest mostly I find myself
thinking "oh dear, not another one".

We already have two new TCG ports with patches on list
which are kind of stalled due to not having enough existing
upstream QEMU devs who can/will code review them (and
another which hasn't had patches posted but might do soon).
The odds for your new port having a happier future don't seem
too great to me :-(

thanks
-- PMM

[PATCH 41/55] vhost-user: save features if the char dev is closed

2019-11-05 Thread Michael Roth

From: Adrian Moreno 

That way the state can be correctly restored when the device is opened
again. This might happen if the backend is restarted.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1738768
Reported-by: Pei Zhang 
Fixes: 6ab79a20af3a ("do not call vhost_net_cleanup() on running net from char user event")
Cc: ddstr...@canonical.com
Cc: Michael S. Tsirkin 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Adrian Moreno 
Message-Id: <20190924162044.11414-1-amore...@redhat.com>
Acked-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit c6beefd674fff8d41b90365dfccad32e53a5abcb)
Signed-off-by: Michael Roth 
---
 net/vhost-user.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 51921de443..014199d600 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -235,6 +235,10 @@ static void chr_closed_bh(void *opaque)
 
 s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
 
+if (s->vhost_net) {
+s->acked_features = vhost_net_get_acked_features(s->vhost_net);
+}
+
 qmp_set_link(name, false, &err);
 
 qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, net_vhost_user_event,
-- 
2.17.1

Re: [PATCH v1 1/4] virtio: protect non-modern devices from too big virtqueue size setting

2019-11-05 Thread Michael S. Tsirkin

On Tue, Nov 05, 2019 at 07:11:02PM +0300, Denis Plotnikov wrote:
> The patch protects from creating illegal virtio device configuration
> via direct virtqueue size property setting.
> 
> Signed-off-by: Denis Plotnikov 
> ---
>  hw/virtio/virtio-blk-pci.c  |  9 +
>  hw/virtio/virtio-scsi-pci.c | 10 ++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c
> index 60c9185c39..6177ff1df8 100644
> --- a/hw/virtio/virtio-blk-pci.c
> +++ b/hw/virtio/virtio-blk-pci.c
> @@ -48,6 +48,15 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
>  {
>  VirtIOBlkPCI *dev = VIRTIO_BLK_PCI(vpci_dev);
>  DeviceState *vdev = DEVICE(&dev->vdev);
> +bool modern = virtio_pci_modern(vpci_dev);
> +uint32_t queue_size = dev->vdev.conf.queue_size;
> +
> +if (!modern && queue_size > 128) {
> +error_setg(errp,
> +   "too big queue size (%u, max: 128) "
> +   "for non-modern virtio device", queue_size);
> +return;
> +}


This allows the large queue size for transitional devices, so it is
still visible through the legacy interface. I am guessing you want to
check whether the device is accessed through the modern interface instead.

>  if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
>  vpci_dev->nvectors = dev->vdev.conf.num_queues + 1;

> diff --git a/hw/virtio/virtio-scsi-pci.c b/hw/virtio/virtio-scsi-pci.c
> index 2830849729..6e6790fda5 100644
> --- a/hw/virtio/virtio-scsi-pci.c
> +++ b/hw/virtio/virtio-scsi-pci.c
> @@ -17,6 +17,7 @@
>  
>  #include "hw/virtio/virtio-scsi.h"
>  #include "virtio-pci.h"
> +#include "qapi/error.h"
>  
>  typedef struct VirtIOSCSIPCI VirtIOSCSIPCI;
>  
> @@ -47,6 +48,15 @@ static void virtio_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
>  VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
>  DeviceState *proxy = DEVICE(vpci_dev);
>  char *bus_name;
> +bool modern = virtio_pci_modern(vpci_dev);
> +uint32_t virtqueue_size = vs->conf.virtqueue_size;
> +
> +if (!modern && virtqueue_size > 128) {
> +error_setg(errp,
> +   "too big virtqueue size (%u, max: 128) "
> +   "for non-modern virtio device", virtqueue_size);
> +return;
> +}

why? what is illegal about 256 for legacy?

>  
>  if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
>  vpci_dev->nvectors = vs->conf.num_queues + 3;
> -- 
> 2.17.0

[PATCH 40/55] iotests: Test internal snapshots with -blockdev

2019-11-05 Thread Michael Roth

From: Kevin Wolf 

Signed-off-by: Kevin Wolf 
Reviewed-by: Peter Krempa 
Tested-by: Peter Krempa 
(cherry picked from commit 92b22e7b1789b0e5f20d245706e72eae70dbddce)
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/267   | 168 
 tests/qemu-iotests/267.out   | 182 +++
 tests/qemu-iotests/common.filter |  11 +-
 tests/qemu-iotests/group |   1 +
 4 files changed, 358 insertions(+), 4 deletions(-)
 create mode 100755 tests/qemu-iotests/267
 create mode 100644 tests/qemu-iotests/267.out

diff --git a/tests/qemu-iotests/267 b/tests/qemu-iotests/267
new file mode 100755
index 00..d37a67c012
--- /dev/null
+++ b/tests/qemu-iotests/267
@@ -0,0 +1,168 @@
+#!/usr/bin/env bash
+#
+# Test which nodes are involved in internal snapshots
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+rm -f "$TEST_DIR/nbd"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# Internal snapshots are (currently) impossible with refcount_bits=1
+_unsupported_imgopts 'refcount_bits=1[^0-9]'
+
+do_run_qemu()
+{
+echo Testing: "$@"
+(
+if ! test -t 0; then
+while read cmd; do
+echo $cmd
+done
+fi
+echo quit
+) | $QEMU -nographic -monitor stdio -nodefaults "$@"
+echo
+}
+
+run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qemu | _filter_hmp |
+_filter_generated_node_ids | _filter_imgfmt | _filter_vmstate_size
+}
+
+size=128M
+
+run_test()
+{
+_make_test_img $size
+printf "savevm snap0\ninfo snapshots\nloadvm snap0\n" | run_qemu "$@" | _filter_date
+}
+
+
+echo
+echo "=== No block devices at all ==="
+echo
+
+run_test
+
+echo
+echo "=== -drive if=none ==="
+echo
+
+run_test -drive driver=file,file="$TEST_IMG",if=none
+run_test -drive driver=$IMGFMT,file="$TEST_IMG",if=none
+run_test -drive driver=$IMGFMT,file="$TEST_IMG",if=none -device virtio-blk,drive=none0
+
+echo
+echo "=== -drive if=virtio ==="
+echo
+
+run_test -drive driver=file,file="$TEST_IMG",if=virtio
+run_test -drive driver=$IMGFMT,file="$TEST_IMG",if=virtio
+
+echo
+echo "=== Simple -blockdev ==="
+echo
+
+run_test -blockdev driver=file,filename="$TEST_IMG",node-name=file
+run_test -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+ -blockdev driver=$IMGFMT,file=file,node-name=fmt
+run_test -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+ -blockdev driver=raw,file=file,node-name=raw \
+ -blockdev driver=$IMGFMT,file=raw,node-name=fmt
+
+echo
+echo "=== -blockdev with a filter on top ==="
+echo
+
+run_test -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+ -blockdev driver=$IMGFMT,file=file,node-name=fmt \
+ -blockdev driver=copy-on-read,file=fmt,node-name=filter
+
+echo
+echo "=== -blockdev with a backing file ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img $size
+
+IMGOPTS="backing_file=$TEST_IMG.base" \
+run_test -blockdev driver=file,filename="$TEST_IMG.base",node-name=backing-file \
+ -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+ -blockdev driver=$IMGFMT,file=file,backing=backing-file,node-name=fmt
+
+IMGOPTS="backing_file=$TEST_IMG.base" \
+run_test -blockdev driver=file,filename="$TEST_IMG.base",node-name=backing-file \
+ -blockdev driver=$IMGFMT,file=backing-file,node-name=backing-fmt \
+ -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+ -blockdev driver=$IMGFMT,file=file,backing=backing-fmt,node-name=fmt
+
+# A snapshot should be present on the overlay, but not the backing file
+echo Internal snapshots on overlay:
+$QEMU_IMG snapshot -l "$TEST_IMG" | _filter_date | _filter_vmstate_size
+
+echo Internal snapshots on backing file:
+$QEMU_IMG snapshot -l "$TEST_IMG.base" | _filter_date | _filter_vmstate_size
+
+echo
+echo "=== -blockdev with NBD server on the backing file ==="
+echo
+
+IMGOPTS="backing_file=$TEST_IMG.base" _make_test_img $size
+cat <

[PATCH 49/55] COLO-compare: Fix incorrect `if` logic

2019-11-05 Thread Michael Roth

From: Fan Yang 

'colo_mark_tcp_pkt' should return 'true' when packets are the same, and
'false' otherwise.  However, it returns 'true' when
'colo_compare_packet_payload' returns non-zero while
'colo_compare_packet_payload' is just a 'memcmp'.  The result is that
COLO-compare reports inconsistent TCP packets when they are actually
the same.
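
To make the inverted condition concrete, here is a minimal sketch of a
memcmp-style comparison and the two ways of turning it into a "packets
match" predicate. The function names are hypothetical stand-ins, not the
ones in net/colo-compare.c; the only assumption taken from the text above
is that the comparison returns 0 when the payloads are identical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a memcmp-style payload comparison:
 * returns 0 when the two buffers are identical. */
int compare_payload(const void *a, const void *b, size_t len)
{
    return memcmp(a, b, len);
}

/* Correct predicate: the payloads match when the comparison is zero. */
bool payloads_match(const void *a, const void *b, size_t len)
{
    return !compare_payload(a, b, len);
}

/* Inverted predicate: reports a match only when the payloads differ. */
bool payloads_match_buggy(const void *a, const void *b, size_t len)
{
    return compare_payload(a, b, len);
}
```

For two identical buffers, payloads_match() is true and
payloads_match_buggy() is false, which is the behaviour the commit
message describes for the unpatched code.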

Fixes: f449c9e549c ("colo: compare the packet based on the tcp sequence number")
Cc: qemu-sta...@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Fan Yang 
Signed-off-by: Jason Wang 
(cherry picked from commit 1e907a32b77e5d418538453df5945242e43224fa)
Signed-off-by: Michael Roth 
---
 net/colo-compare.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7489840bde..7ee17f2cf8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -319,7 +319,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 *mark = 0;
 
 if (ppkt->tcp_seq == spkt->tcp_seq && ppkt->seq_end == spkt->seq_end) {
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size, spkt->header_size,
 ppkt->payload_size)) {
 *mark = COLO_COMPARE_FREE_SECONDARY | COLO_COMPARE_FREE_PRIMARY;
@@ -329,7 +329,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 
 /* one part of secondary packet payload still need to be compared */
 if (!after(ppkt->seq_end, spkt->seq_end)) {
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size + ppkt->offset,
 spkt->header_size + spkt->offset,
 ppkt->payload_size - ppkt->offset)) {
@@ -348,7 +348,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 /* primary packet is longer than secondary packet, compare
  * the same part and mark the primary packet offset
  */
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size + ppkt->offset,
 spkt->header_size + spkt->offset,
 spkt->payload_size - spkt->offset)) {
-- 
2.17.1

[PATCH 04/55] target/alpha: fix tlb_fill trap_arg2 value for instruction fetch

2019-11-05 Thread Michael Roth

From: Aurelien Jarno 

Commit e41c94529740cc26 ("target/alpha: Convert to CPUClass::tlb_fill")
slightly changed the way the trap_arg2 value is computed in case of TLB
fill. The type of the variable used in the ternary operator has been
changed from an int to an enum. This causes the -1 value to not be
sign-extended to 64-bit in case of an instruction fetch. The trap_arg2
ends up with 0x00000000ffffffff instead of 0xffffffffffffffff. Fix that by
changing the -1 into -1LL.
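
The conversion at play can be reproduced in isolation. The sketch below
uses a plain unsigned int in place of the enum, on the assumption that
the compiler gives an enum without negative values an unsigned
underlying type (as GCC does), and it assumes a 32-bit int. The function
names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: access_type behaves like unsigned int. */
uint64_t trap_arg_narrow(int is_fetch, unsigned int access_type)
{
    /* The ternary has type unsigned int, so -1 becomes 0xffffffff
     * before it is widened to 64 bits. */
    return is_fetch ? -1 : access_type;
}

uint64_t trap_arg_wide(int is_fetch, unsigned int access_type)
{
    /* -1LL makes the ternary 64 bits wide, so all 64 bits end up set. */
    return is_fetch ? -1LL : access_type;
}
```

Under those assumptions trap_arg_narrow(1, 0) yields only the low 32 bits
set, while trap_arg_wide(1, 0) yields all 64 bits set.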

This fixes the execution of user space processes in qemu-system-alpha.

Fixes: e41c94529740cc26
Cc: qemu-sta...@nongnu.org
Signed-off-by: Aurelien Jarno 
[rth: Test MMU_DATA_LOAD and MMU_DATA_STORE instead of implying them.]
Signed-off-by: Richard Henderson 
(cherry picked from commit cb1de55a83eaca9ee32be9c959dca99e11f2fea8)
Signed-off-by: Michael Roth 
---
 target/alpha/helper.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/alpha/helper.c b/target/alpha/helper.c
index 93b8e788b1..d0cc623192 100644
--- a/target/alpha/helper.c
+++ b/target/alpha/helper.c
@@ -283,7 +283,9 @@ bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
 cs->exception_index = EXCP_MMFAULT;
 env->trap_arg0 = addr;
 env->trap_arg1 = fail;
-env->trap_arg2 = (access_type == MMU_INST_FETCH ? -1 : access_type);
+env->trap_arg2 = (access_type == MMU_DATA_LOAD ? 0ull :
+  access_type == MMU_DATA_STORE ? 1ull :
+  /* access_type == MMU_INST_FETCH */ -1ull);
 cpu_loop_exit_restore(cs, retaddr);
 }
 
-- 
2.17.1

[PATCH 46/55] ui: Fix hanging up Cocoa display on macOS 10.15 (Catalina)

2019-11-05 Thread Michael Roth

From: Hikaru Nishida 

The macOS API documentation says that no events are processed before
applicationDidFinishLaunching is called. However, on macOS Catalina some
events are fired before that call. This causes a deadlock on the
iothread lock in handleEvent, because the lock is only released after
app_started_sem is posted.
This patch avoids processing events before app_started_sem is posted,
to prevent this deadlock.

Buglink: https://bugs.launchpad.net/qemu/+bug/1847906
Signed-off-by: Hikaru Nishida 
Message-id: 20191015010734.85229-1-hikaru...@gmail.com
Signed-off-by: Gerd Hoffmann 
(cherry picked from commit dff742ad27efa474ec04accdbf422c9acfd3e30e)
Signed-off-by: Michael Roth 
---
 ui/cocoa.m | 12 
 1 file changed, 12 insertions(+)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index c2984028c5..3026ead621 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -132,6 +132,7 @@ NSArray * supportedImageFileTypes;
 
 static QemuSemaphore display_init_sem;
 static QemuSemaphore app_started_sem;
+static bool allow_events;
 
 // Utility functions to run specified code block with iothread lock held
 typedef void (^CodeBlock)(void);
@@ -727,6 +728,16 @@ QemuCocoaView *cocoaView;
 
 - (bool) handleEvent:(NSEvent *)event
 {
+if(!allow_events) {
+/*
+ * Just let OSX have all events that arrive before
+ * applicationDidFinishLaunching.
+ * This avoids a deadlock on the iothread lock, which cocoa_display_init()
+ * will not drop until after the app_started_sem is posted. (In theory
+ * there should not be any such events, but OSX Catalina now emits some.)
+ */
+return false;
+}
 return bool_with_iothread_lock(^{
 return [self handleEventLocked:event];
 });
@@ -1154,6 +1165,7 @@ QemuCocoaView *cocoaView;
 - (void)applicationDidFinishLaunching: (NSNotification *) note
 {
 COCOA_DEBUG("QemuCocoaAppController: applicationDidFinishLaunching\n");
+allow_events = true;
 /* Tell cocoa_display_init to proceed */
 qemu_sem_post(&app_started_sem);
 }
-- 
2.17.1

[PATCH 13/55] iotests: add testing shim for script-style python tests

2019-11-05 Thread Michael Roth

From: John Snow 

Because the new-style python tests don't use the iotests.main() test
launcher, we don't turn on the debugger logging for these scripts
when invoked via ./check -d.

Refactor the launcher shim into new and old style shims so that they
share environmental configuration.

Two cleanup notes: debug was not actually used as a global, and there
was no reason to create a class in an inner scope just to achieve
default variables; we can simply create an instance of the runner with
the values we want instead.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
Message-id: 20190709232550.10724-14-js...@redhat.com
Signed-off-by: John Snow 
(cherry picked from commit 456a2d5ac7641c7e75c76328a561b528a8607a8e)
*prereq for 88d2aa533a
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/iotests.py | 40 +++
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ce74177ab1..25c5a047b3 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -61,7 +61,6 @@ cachemode = os.environ.get('CACHEMODE')
 qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
-debug = False
 
 luks_default_secret_object = 'secret,id=keysec0,data=' + \
  os.environ.get('IMGKEYSECRET', '')
@@ -842,11 +841,22 @@ def skip_if_unsupported(required_formats=[], read_only=False):
 return func_wrapper
 return skip_test_decorator
 
-def main(supported_fmts=[], supported_oses=['linux'], supported_cache_modes=[],
- unsupported_fmts=[]):
-'''Run tests'''
+def execute_unittest(output, verbosity, debug):
+runner = unittest.TextTestRunner(stream=output, descriptions=True,
+ verbosity=verbosity)
+try:
+# unittest.main() will use sys.exit(); so expect a SystemExit
+# exception
+unittest.main(testRunner=runner)
+finally:
+if not debug:
+sys.stderr.write(re.sub(r'Ran (\d+) tests? in [\d.]+s',
+r'Ran \1 tests', output.getvalue()))
 
-global debug
+def execute_test(test_function=None,
+ supported_fmts=[], supported_oses=['linux'],
+ supported_cache_modes=[], unsupported_fmts=[]):
+"""Run either unittest or script-style tests."""
 
 # We are using TEST_DIR and QEMU_DEFAULT_MACHINE as proxies to
 # indicate that we're not being run via "check". There may be
@@ -878,13 +888,15 @@ def main(supported_fmts=[], supported_oses=['linux'], supported_cache_modes=[],
 
 logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
 
-class MyTestRunner(unittest.TextTestRunner):
-def __init__(self, stream=output, descriptions=True, verbosity=verbosity):
-unittest.TextTestRunner.__init__(self, stream, descriptions, verbosity)
+if not test_function:
+execute_unittest(output, verbosity, debug)
+else:
+test_function()
+
+def script_main(test_function, *args, **kwargs):
+"""Run script-style tests outside of the unittest framework"""
+execute_test(test_function, *args, **kwargs)
 
-# unittest.main() will use sys.exit() so expect a SystemExit exception
-try:
-unittest.main(testRunner=MyTestRunner)
-finally:
-if not debug:
-sys.stderr.write(re.sub(r'Ran (\d+) tests? in [\d.]+s', r'Ran \1 tests', output.getvalue()))
+def main(*args, **kwargs):
+"""Run tests using the unittest framework"""
+execute_test(None, *args, **kwargs)
-- 
2.17.1

[PATCH 44/55] iotests: Test large write request to qcow2 file

2019-11-05 Thread Michael Roth

From: Max Reitz 

Without HEAD^, the following happens when you attempt a large write
request to a qcow2 file such that the number of bytes covered by all
clusters involved in a single allocation will exceed INT_MAX:

(A) handle_alloc_space() decides to fill the whole area with zeroes and
fails because bdrv_co_pwrite_zeroes() fails (the request is too
large).

(B) If handle_alloc_space() does not do anything, but merge_cow()
decides that the requests can be merged, it will create a too long
IOV that later cannot be written.

(C) Otherwise, all parts will be written separately, so those requests
will work.

In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
overflow: We use an int (i) to iterate over nb_clusters, and then
calculate the L2 entry based on "i << s->cluster_bits" -- which will
overflow if the range covers more than INT_MAX bytes.  This then leads
to image corruption because the L2 entry will be wrong (it will be
recognized as a compressed cluster).
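
The overflow threshold is easy to check by hand. The sketch below is
illustrative only: the helper names are invented, a 32-bit int is
assumed, and the shift is done in 64 bits because shifting a signed int
past INT_MAX is undefined behaviour in C.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Byte offset of cluster i (i >= 0), computed in 64 bits so the shift
 * itself cannot overflow for realistic cluster counts. */
uint64_t cluster_offset(int i, int cluster_bits)
{
    return (uint64_t)i << cluster_bits;
}

/* True if the same shift done in a plain int would exceed INT_MAX. */
int offset_overflows_int(int i, int cluster_bits)
{
    return cluster_offset(i, cluster_bits) > (uint64_t)INT_MAX;
}
```

With 2 MB clusters (cluster_bits = 21), cluster index 1023 still fits in
an int, but index 1024 gives an offset of exactly 2^31, one past INT_MAX.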

Even if that were not the case, the .cow_end area would be empty
(because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
their difference (which is the .cow_end size) will be 0).

So this test checks that on such large requests, the image will not be
corrupted.  Unfortunately, we cannot check whether COW will be handled
correctly, because that data is discarded when it is written to null-co
(but we have to use null-co, because writing 2 GB of data in a test is
not quite reasonable).

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
(cherry picked from commit a1406a9262a087d9ec9627b88da13c4590b61dae)
 Conflicts:
tests/qemu-iotests/group
*drop context dep. on tests not in 4.1
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/270 | 83 ++
 tests/qemu-iotests/270.out |  9 +
 tests/qemu-iotests/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/qemu-iotests/270
 create mode 100644 tests/qemu-iotests/270.out

diff --git a/tests/qemu-iotests/270 b/tests/qemu-iotests/270
new file mode 100755
index 00..b9a12b908c
--- /dev/null
+++ b/tests/qemu-iotests/270
@@ -0,0 +1,83 @@
+#!/usr/bin/env bash
+#
+# Test large write to a qcow2 image
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+seq=$(basename "$0")
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This is a qcow2 regression test
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# We use our own external data file and our own cluster size, and we
+# require v3 images
+_unsupported_imgopts data_file cluster_size 'compat=0.10'
+
+
+# We need a backing file so that handle_alloc_space() will not do
+# anything.  (If it were to do anything, it would simply fail its
+# write-zeroes request because the request range is too large.)
+TEST_IMG="$TEST_IMG.base" _make_test_img 4G
+$QEMU_IO -c 'write 0 512' "$TEST_IMG.base" | _filter_qemu_io
+
+# (Use .orig because _cleanup_test_img will remove that file)
+# We need a large cluster size, see below for why (above the $QEMU_IO
+# invocation)
+_make_test_img -o cluster_size=2M,data_file="$TEST_IMG.orig" \
+-b "$TEST_IMG.base" 4G
+
+# We want a null-co as the data file, because it allows us to quickly
+# "write" 2G of data without using any space.
+# (qemu-img create does not like it, though, because null-co does not
+# support image creation.)
+$QEMU_IMG amend -o data_file="json:{'driver':'null-co',,'size':'4294967296'}" \
+"$TEST_IMG"
+
+# This gives us a range of:
+#   2^31 - 512 + 768 - 1 = 2^31 + 255 > 2^31
+# until the beginning of the end COW block.  (The total allocation
+# size depends on the cluster size, but all that is important is that
+# it exceeds INT_MAX.)
+#
+# 2^31 - 512 is the maximum request size.  We want this to result in a
+# single allocation, and because the qcow2 driver splits allocations
+# on L2 boundaries, we need large L2 tables; hence the cluster size of
+# 2 MB.  (Anything from 256 kB should work, though, because then one L2
+# table covers 8 GB.)
+$QEMU_IO -c "write 768 $((2 ** 31 - 512))" "$TEST_IMG" | _filter_qemu

[PATCH 25/55] curl: Check completion in curl_multi_do()

2019-11-05 Thread Michael Roth

From: Max Reitz 

While it is more likely that transfers complete after some file
descriptor has data ready to read, we probably should not rely on it.
Better be safe than sorry and call curl_multi_check_completion() in
curl_multi_do(), too, just like it is done in curl_multi_read().

With this change, curl_multi_do() and curl_multi_read() are actually the
same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
handler.

Signed-off-by: Max Reitz 
Message-id: 20190910124136.10565-4-mre...@redhat.com
Reviewed-by: Maxim Levitsky 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
(cherry picked from commit 948403bcb1c7e71dcbe8ab8479cf3934a0efcbb5)
Signed-off-by: Michael Roth 
---
 block/curl.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 95d7b77dc0..5838afef99 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -139,7 +139,6 @@ typedef struct BDRVCURLState {
 
 static void curl_clean_state(CURLState *s);
 static void curl_multi_do(void *arg);
-static void curl_multi_read(void *arg);
 
 #ifdef NEED_CURL_TIMER_CALLBACK
 /* Called from curl_multi_do_locked, with s->mutex held.  */
@@ -186,7 +185,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
 switch (action) {
 case CURL_POLL_IN:
 aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_read, NULL, NULL, state);
+   curl_multi_do, NULL, NULL, state);
 break;
 case CURL_POLL_OUT:
 aio_set_fd_handler(s->aio_context, fd, false,
@@ -194,7 +193,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
 break;
 case CURL_POLL_INOUT:
 aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_read, curl_multi_do, NULL, state);
+   curl_multi_do, curl_multi_do, NULL, state);
 break;
 case CURL_POLL_REMOVE:
 aio_set_fd_handler(s->aio_context, fd, false,
@@ -416,15 +415,6 @@ static void curl_multi_do(void *arg)
 {
 CURLState *s = (CURLState *)arg;
 
-qemu_mutex_lock(&s->s->mutex);
-curl_multi_do_locked(s);
-qemu_mutex_unlock(&s->s->mutex);
-}
-
-static void curl_multi_read(void *arg)
-{
-CURLState *s = (CURLState *)arg;
-
 qemu_mutex_lock(&s->s->mutex);
 curl_multi_do_locked(s);
 curl_multi_check_completion(s->s);
-- 
2.17.1

[PATCH 47/55] virtio: new post_load hook

2019-11-05 Thread Michael Roth

From: "Michael S. Tsirkin" 

The post_load hook in the virtio vmsd is called early, while the device
is still being processed and before the VirtIODevice core is fully
initialized.  Most device-specific code isn't ready to deal with a
device in such a state, and behaves unpredictably.

Add a new post_load hook in a device class instead.  Devices should use
this unless they specifically want to verify the migration stream as
it's processed, e.g. for bounds checking.
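
The pattern is an optional per-class callback that runs only after the
core state has been loaded. A minimal sketch follows, using deliberately
simplified, hypothetical types (Device, DeviceClass, device_load) rather
than the real VirtIODevice and VirtioDeviceClass definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real QEMU structures differ. */
typedef struct Device Device;

typedef struct DeviceClass {
    int (*post_load)(Device *dev);  /* optional, may be NULL */
} DeviceClass;

struct Device {
    const DeviceClass *klass;
    int core_ready;                 /* set once core state is loaded */
};

/* Load the core state first, then let the device class react to it. */
int device_load(Device *dev)
{
    dev->core_ready = 1;            /* stands in for the core load steps */

    if (dev->klass->post_load) {
        int ret = dev->klass->post_load(dev);
        if (ret) {
            return ret;             /* propagate the hook's error */
        }
    }
    return 0;
}

/* Example hook: fails unless the core was initialized before the call. */
int check_core_ready(Device *dev)
{
    return dev->core_ready ? 0 : -1;
}
```

A class that leaves post_load as NULL is simply skipped, so existing
devices need no changes in this sketch.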

Cc: qemu-sta...@nongnu.org
Suggested-by: "Dr. David Alan Gilbert" 
Cc: Mikhail Sennikovsky 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
(cherry picked from commit 1dd713837cac8ec5a97d3b8492d72ce5ac94803c)
Signed-off-by: Michael Roth 
---
 hw/virtio/virtio.c | 7 +++
 include/hw/virtio/virtio.h | 6 ++
 2 files changed, 13 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a94ea18a9c..7c3822c3a0 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2287,6 +2287,13 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
 }
 rcu_read_unlock();
 
+if (vdc->post_load) {
+ret = vdc->post_load(vdev);
+if (ret) {
+return ret;
+}
+}
+
 return 0;
 }
 
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b189788cb2..f9f62370e9 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -158,6 +158,12 @@ typedef struct VirtioDeviceClass {
  */
 void (*save)(VirtIODevice *vdev, QEMUFile *f);
 int (*load)(VirtIODevice *vdev, QEMUFile *f, int version_id);
+/* Post load hook in vmsd is called early while device is processed, and
+ * when VirtIODevice isn't fully initialized.  Devices should use this instead,
+ * unless they specifically want to verify the migration stream as it's
+ * processed, e.g. for bounds checking.
+ */
+int (*post_load)(VirtIODevice *vdev);
 const VMStateDescription *vmsd;
 } VirtioDeviceClass;
 
-- 
2.17.1

[PATCH 21/55] qcow2: Fix the calculation of the maximum L2 cache size

2019-11-05 Thread Michael Roth

From: Alberto Garcia 

The size of the qcow2 L2 cache defaults to 32 MB, which can easily be
larger than the maximum amount of L2 metadata that the image can have.
For example: with 64 KB clusters the user would need a qcow2 image
with a virtual size of 256 GB in order to have 32 MB of L2 metadata.

Because of that, since commit b749562d9822d14ef69c9eaa5f85903010b86c30
we forbid the L2 cache to become larger than the maximum amount of L2
metadata for the image, calculated using this formula:

uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);

The problem with this formula is that the result should be rounded up
to the cluster size because an L2 table on disk always takes one full
cluster.

For example, a 1280 MB qcow2 image with 64 KB clusters needs exactly
160 KB of L2 metadata, but we need 192 KB on disk (3 clusters) even if
the last 32 KB of those are not going to be used.

However, QEMU rounds the numbers down and only creates 2 cache tables
(128 KB), which is not enough for the image.
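
The arithmetic above can be checked with a small standalone sketch. The
two helper macros are local definitions for this example (QEMU has its
own), and the function names are made up for illustration; only the two
formulas come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local rounding helpers for this sketch. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define ROUND_UP(n, d)     (DIV_ROUND_UP(n, d) * (d))

/* Old formula: 8 bytes of L2 metadata per cluster, no rounding. */
static uint64_t max_l2_cache_old(uint64_t virtual_disk_size,
                                 uint64_t cluster_size)
{
    assert(cluster_size >= 8);
    return virtual_disk_size / (cluster_size / 8);
}

/* New formula: count the entries, then round up to whole clusters,
 * because an L2 table always occupies one full cluster. */
static uint64_t max_l2_cache_new(uint64_t virtual_disk_size,
                                 uint64_t cluster_size)
{
    assert(cluster_size >= 8);
    uint64_t max_l2_entries = DIV_ROUND_UP(virtual_disk_size, cluster_size);
    return ROUND_UP(max_l2_entries * sizeof(uint64_t), cluster_size);
}
```

For the 1280 MB image with 64 KB clusters, the old formula gives 160 KB
and the new one gives 192 KB (3 clusters); for a 256 GB image both give
32 MB.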

A quick test doing 4KB random writes on a 1280 MB image gives me
around 500 IOPS, while with the correct cache size I get 16K IOPS.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
(cherry picked from commit b70d08205b2e4044c529eefc21df2c8ab61b473b)
Signed-off-by: Michael Roth 
---
 block/qcow2.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..865839682c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -826,7 +826,11 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
 bool l2_cache_entry_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
 uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+uint64_t max_l2_entries = DIV_ROUND_UP(virtual_disk_size, s->cluster_size);
+/* An L2 table is always one cluster in size so the max cache size
+ * should be a multiple of the cluster size. */
+uint64_t max_l2_cache = ROUND_UP(max_l2_entries * sizeof(uint64_t),
+ s->cluster_size);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
-- 
2.17.1

[PATCH 52/55] target/arm: Allow reading flags from FPSCR for M-profile

2019-11-05 Thread Michael Roth

From: Christophe Lyon 

rt==15 is a special case when reading the flags: it means the
destination is APSR. This patch avoids rejecting
vmrs apsr_nzcv, fpscr
as illegal instruction.
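
To make the new condition easier to read, here is a standalone sketch of
the predicate as it appears in the hunk below. The struct, the function
name and the VFP_FPSCR value are placeholders for this example, and it
assumes (from the commit message) that the "l" field is set for VMRS,
i.e. a read into a core register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VFP_FPSCR 1   /* placeholder id for the FPSCR system register */

/* Hypothetical subset of the decoded VMSR/VMRS fields. */
typedef struct {
    int rt;    /* core register number */
    int reg;   /* VFP system register id */
    bool l;    /* assumed: true for VMRS (read), false for VMSR (write) */
} VmsrVmrsArgs;

/* Returns true when the M-profile check would reject the instruction. */
static bool m_profile_rejects(const VmsrVmrsArgs *a)
{
    assert(a != NULL);
    return a->rt == 15 && (!a->l || a->reg != VFP_FPSCR);
}
```

With this predicate, a read of FPSCR with rt == 15 (the
"vmrs apsr_nzcv, fpscr" form) is let through, while a write with
rt == 15 is still rejected.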

Cc: qemu-sta...@nongnu.org
Signed-off-by: Christophe Lyon 
Message-id: 20191025095711.10853-1-christophe.l...@linaro.org
[PMM: updated the comment]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
(cherry picked from commit 2529ab43b8a05534494704e803e0332d111d8b91)
Signed-off-by: Michael Roth 
---
 target/arm/translate-vfp.inc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index ef45cecbea..75406fd9db 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -704,9 +704,10 @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
 if (arm_dc_feature(s, ARM_FEATURE_M)) {
 /*
  * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
- * Writes to R15 are UNPREDICTABLE; we choose to undef.
+ * Accesses to R15 are UNPREDICTABLE; we choose to undef.
+ * (FPSCR -> r15 is a special case which writes to the PSR flags.)
  */
-if (a->rt == 15 || a->reg != ARM_VFP_FPSCR) {
+if (a->rt == 15 && (!a->l || a->reg != ARM_VFP_FPSCR)) {
 return false;
 }
 }
-- 
2.17.1

[PATCH 38/55] s390: PCI: fix IOMMU region init

2019-11-05 Thread Michael Roth

From: Matthew Rosato 

The fix in dbe9cf606c shrinks the IOMMU memory region to a size
that seems reasonable on the surface, but is actually too small
because the region is based against a 0-mapped address space.  This
causes breakage with small guests as they can overrun the IOMMU window.

Let's go back to the prior method of initializing iommu for now.
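
A small sketch of why the window-only size is too small, assuming pba and
pal are the first and last PCI addresses of the DMA window and that the
region is mapped at offset 0. The function names are made up for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Size covering only the window [pba, pal]. */
static uint64_t region_size_window_only(uint64_t pba, uint64_t pal)
{
    assert(pal >= pba);
    return pal - pba + 1;
}

/* Size covering everything from 0 up to and including pal. */
static uint64_t region_size_from_zero(uint64_t pal)
{
    return pal + 1;
}

/* True if addr falls inside a region of 'size' bytes mapped at offset 0. */
static int region_covers(uint64_t size, uint64_t addr)
{
    return addr < size;
}
```

With pba = 0x100000000 and pal = 0x1ffffffff, the window-only size is
0x100000000, so a region starting at 0 ends before the window even
begins; only the size computed from zero covers pal.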

Fixes: dbe9cf606c ("s390x/pci: Set the iommu region size mpcifc request")
Cc: qemu-sta...@nongnu.org
Reviewed-by: Pierre Morel 
Reported-by: Boris Fiuczynski 
Tested-by: Boris Fiuczynski 
Reported-by: Stefan Zimmerman 
Signed-off-by: Matthew Rosato 
Message-Id: <1569507036-15314-1-git-send-email-mjros...@linux.ibm.com>
Signed-off-by: Christian Borntraeger 
(cherry picked from commit 7df1dac5f1c85312474df9cb3a8fcae72303da62)
Signed-off-by: Michael Roth 
---
 hw/s390x/s390-pci-bus.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 2c6e084e2c..9a935f22b5 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -694,10 +694,15 @@ static const MemoryRegionOps s390_msi_ctrl_ops = {
 
 void s390_pci_iommu_enable(S390PCIIOMMU *iommu)
 {
+/*
+ * The iommu region is initialized against a 0-mapped address space,
+ * so the smallest IOMMU region we can define runs from 0 to the end
+ * of the PCI address space.
+ */
 char *name = g_strdup_printf("iommu-s390-%04x", iommu->pbdev->uid);
 memory_region_init_iommu(&iommu->iommu_mr, sizeof(iommu->iommu_mr),
  TYPE_S390_IOMMU_MEMORY_REGION, OBJECT(&iommu->mr),
- name, iommu->pal - iommu->pba + 1);
+ name, iommu->pal + 1);
 iommu->enabled = true;
 memory_region_add_subregion(&iommu->mr, 0, MEMORY_REGION(&iommu->iommu_mr));
 g_free(name);
-- 
2.17.1

[PATCH 42/55] hw/core/loader: Fix possible crash in rom_copy()

2019-11-05 Thread Michael Roth

From: Thomas Huth 

Both "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" parameter. The code in
rom_copy() then calculates:

d = dest + (rom->addr - addr);

and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
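
The problem can be reproduced in isolation. The sketch below assumes
64-bit unsigned addresses, so the "negative" difference actually wraps
around to a huge offset; rom_dest_offset() is a hypothetical helper for
this example, not a function in loader.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the byte offset of a ROM inside the destination buffer.
 * Returns false when the ROM starts before the requested window, in
 * which case rom_addr - addr would wrap around.
 */
static bool rom_dest_offset(uint64_t rom_addr, uint64_t addr,
                            uint64_t *offset)
{
    assert(offset != NULL);

    if (rom_addr < addr) {
        return false;            /* the extra sanity check */
    }
    *offset = rom_addr - addr;
    return true;
}
```

Without the check, rom_addr = 0x1000 and addr = 0x2000 would yield an
offset close to 2^64, and adding that to the destination pointer lands
outside the buffer.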

Cc: qemu-sta...@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Thomas Huth 
(cherry picked from commit e423455c4f23a1a828901c78fe6d03b7dde79319)
Signed-off-by: Michael Roth 
---
 hw/core/loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 425bf69a99..838a34174a 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -1242,7 +1242,7 @@ int rom_copy(uint8_t *dest, hwaddr addr, size_t size)
 if (rom->addr + rom->romsize < addr) {
 continue;
 }
-if (rom->addr > end) {
+if (rom->addr > end || rom->addr < addr) {
 break;
 }
 
-- 
2.17.1

[PATCH 39/55] block/snapshot: Restrict set of snapshot nodes

2019-11-05 Thread Michael Roth

From: Kevin Wolf 

Nodes involved in internal snapshots were those that were returned by
bdrv_next(), inserted and not read-only. bdrv_next() in turn returns all
nodes that are either the root node of a BlockBackend or monitor-owned
nodes.

With the typical -drive use, this worked well enough. However, in the
typical -blockdev case, the user defines one node per option, making all
nodes monitor-owned nodes. This includes protocol nodes etc. which often
are not snapshottable, so "savevm" only returns an error.

Change the conditions so that internal snapshots still include all nodes
that have a BlockBackend attached (we definitely want to snapshot
anything attached to a guest device and probably also the built-in NBD
server; snapshotting block job BlockBackends is more of an accident, but
a preexisting one), but other monitor-owned nodes are only included if
they have no parents.

This makes internal snapshots usable again with typical -blockdev
configurations.
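
The selection rule can be sketched on a mock node type. Node and
snapshot_includes_node() are hypothetical names for this example; the
real check in the diff below uses bdrv_is_inserted(),
bdrv_is_read_only(), bdrv_has_blk() and the node's parent list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock of the properties the rule looks at. */
typedef struct Node {
    bool inserted;
    bool read_only;
    bool has_blk;       /* a BlockBackend is attached */
    int  num_parents;   /* 0 means nothing uses this node */
} Node;

/* Include nodes used by a BlockBackend, or parentless (monitor-owned). */
static bool snapshot_includes_node(const Node *n)
{
    assert(n != NULL);

    if (!n->inserted || n->read_only) {
        return false;
    }
    return n->has_blk || n->num_parents == 0;
}
```

A protocol node that only serves as the child of a format node has a
parent and no BlockBackend, so it is skipped, while the format node on
top of it is still snapshotted.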

Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Peter Krempa 
Tested-by: Peter Krempa 
(cherry picked from commit 05f4aced658a02b02d3e89a6c7a2281008fcf26c)
Signed-off-by: Michael Roth 
---
 block/snapshot.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index f2f48f926a..8081616ae9 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -31,6 +31,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/qmp/qstring.h"
 #include "qemu/option.h"
+#include "sysemu/block-backend.h"
 
 QemuOptsList internal_snapshot_opts = {
 .name = "snapshot",
@@ -384,6 +385,16 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
 return ret;
 }
 
+static bool bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
+{
+if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
+return false;
+}
+
+/* Include all nodes that are either in use by a BlockBackend, or that
+ * aren't attached to any node, but owned by the monitor. */
+return bdrv_has_blk(bs) || QLIST_EMPTY(&bs->parents);
+}
 
 /* Group operations. All block drivers are involved.
  * These functions will properly handle dataplane (take aio_context_acquire
@@ -399,7 +410,7 @@ bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
 AioContext *ctx = bdrv_get_aio_context(bs);
 
 aio_context_acquire(ctx);
-if (bdrv_is_inserted(bs) && !bdrv_is_read_only(bs)) {
+if (bdrv_all_snapshots_includes_bs(bs)) {
 ok = bdrv_can_snapshot(bs);
 }
 aio_context_release(ctx);
@@ -426,8 +437,9 @@ int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
 AioContext *ctx = bdrv_get_aio_context(bs);
 
 aio_context_acquire(ctx);
-if (bdrv_can_snapshot(bs) &&
-bdrv_snapshot_find(bs, snapshot, name) >= 0) {
+if (bdrv_all_snapshots_includes_bs(bs) &&
+bdrv_snapshot_find(bs, snapshot, name) >= 0)
+{
 ret = bdrv_snapshot_delete(bs, snapshot->id_str,
snapshot->name, err);
 }
@@ -455,7 +467,7 @@ int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
 AioContext *ctx = bdrv_get_aio_context(bs);
 
 aio_context_acquire(ctx);
-if (bdrv_can_snapshot(bs)) {
+if (bdrv_all_snapshots_includes_bs(bs)) {
 ret = bdrv_snapshot_goto(bs, name, errp);
 }
 aio_context_release(ctx);
@@ -481,7 +493,7 @@ int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
 AioContext *ctx = bdrv_get_aio_context(bs);
 
 aio_context_acquire(ctx);
-if (bdrv_can_snapshot(bs)) {
+if (bdrv_all_snapshots_includes_bs(bs)) {
 err = bdrv_snapshot_find(bs, &sn, name);
 }
 aio_context_release(ctx);
@@ -512,7 +524,7 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
 if (bs == vm_state_bs) {
 sn->vm_state_size = vm_state_size;
 err = bdrv_snapshot_create(bs, sn);
-} else if (bdrv_can_snapshot(bs)) {
+} else if (bdrv_all_snapshots_includes_bs(bs)) {
 sn->vm_state_size = 0;
 err = bdrv_snapshot_create(bs, sn);
 }
@@ -538,7 +550,7 @@ BlockDriverState *bdrv_all_find_vmstate_bs(void)
 bool found;
 
 aio_context_acquire(ctx);
-found = bdrv_can_snapshot(bs);
+found = bdrv_all_snapshots_includes_bs(bs) && bdrv_can_snapshot(bs);
 aio_context_release(ctx);
 
 if (found) {
-- 
2.17.1

[PATCH 34/55] block/backup: fix backup_cow_with_offload for last cluster

2019-11-05 Thread Michael Roth

From: Vladimir Sementsov-Ogievskiy 

We shouldn't try to copy bytes beyond EOF. Fix it.
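
A worked example of the clamp, as a sketch: copy_range_nbytes() is a
made-up helper that mirrors the one-line change in the diff below, with
"len" standing for the job length (the end of the source).

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Number of bytes to copy from 'start', never reaching past 'len'. */
static int64_t copy_range_nbytes(int64_t copy_range_size, int64_t start,
                                 int64_t end, int64_t len)
{
    assert(start < len && start < end);
    return MIN(copy_range_size, MIN(end, len) - start);
}
```

With 64 KB clusters and a 100 KB source, the last cluster spans 64 KB to
128 KB. The unclamped expression would ask for 64 KB starting at offset
64 KB, which is 28 KB past EOF; with the clamp only the remaining 36 KB
are copied.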

Fixes: 9ded4a0114968e
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
Message-id: 20190920142056.12778-3-vsement...@virtuozzo.com
Signed-off-by: Max Reitz 
(cherry picked from commit 1048ddf0a32dcdaa952e581bd503d49adad527cc)
Signed-off-by: Michael Roth 
---
 block/backup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 7067d1d1ad..8761f1f9a7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -167,7 +167,7 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
 
 assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
 assert(QEMU_IS_ALIGNED(start, job->cluster_size));
-nbytes = MIN(job->copy_range_size, end - start);
+nbytes = MIN(job->copy_range_size, MIN(end, job->len) - start);
 nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
 hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
 ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
-- 
2.17.1

[PATCH 07/55] xen-bus: check whether the frontend is active during device reset...

2019-11-05 Thread Michael Roth

From: Paul Durrant 

...not the backend

Commit cb323146 "xen-bus: Fix backend state transition on device reset"
contained a subtle mistake. The hunk

@@ -539,11 +556,11 @@ static void xen_device_backend_changed(void *opaque)

 /*
  * If the toolstack (or unplug request callback) has set the backend
- * state to Closing, but there is no active frontend (i.e. the
- * state is not Connected) then set the backend state to Closed.
+ * state to Closing, but there is no active frontend then set the
+ * backend state to Closed.
  */
 if (xendev->backend_state == XenbusStateClosing &&
-xendev->frontend_state != XenbusStateConnected) {
+!xen_device_state_is_active(state)) {
 xen_device_backend_set_state(xendev, XenbusStateClosed);
 }

mistakenly replaced the check of 'xendev->frontend_state' with a check
(now in a helper function) of 'state', which actually equates to
'xendev->backend_state'.

This patch fixes the mistake.

Fixes: cb3231460747552d70af9d546dc53d8195bcb796
Signed-off-by: Paul Durrant 
Reviewed-by: Anthony PERARD 
Message-Id: <20190910171753.3775-1-paul.durr...@citrix.com>
Signed-off-by: Anthony PERARD 
(cherry picked from commit df6180bb56cd03949c2c64083da58755fed81a61)
Signed-off-by: Michael Roth 
---
 hw/xen/xen-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 5929aa4b2e..10b7e02b5c 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -560,7 +560,7 @@ static void xen_device_backend_changed(void *opaque)
  * backend state to Closed.
  */
 if (xendev->backend_state == XenbusStateClosing &&
-!xen_device_state_is_active(state)) {
+!xen_device_state_is_active(xendev->frontend_state)) {
 xen_device_backend_set_state(xendev, XenbusStateClosed);
 }
 
-- 
2.17.1

[PATCH 45/55] mirror: Do not dereference invalid pointers

2019-11-05 Thread Michael Roth

From: Max Reitz 

mirror_exit_common() may be called twice (if it is called from
mirror_prepare() and fails, it will be called from mirror_abort()
again).

In such a case, many of the pointers in the MirrorBlockJob object will
already be freed.  This can be seen most reliably for s->target, which
is set to NULL (and then dereferenced by blk_bs()).
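
The shape of the fix, reduced to a toy example: pointers that the first
run clears must only be read after the "already prepared" guard. Job,
Target and job_exit_common() are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Target {
    int id;
} Target;

typedef struct Job {
    int prepared;       /* set by the first run */
    Target *target;     /* cleared by the first run */
} Job;

/*
 * Returns 1 if cleanup was performed, 0 if it had already been done.
 * job->target is dereferenced only after the guard, so a second call
 * never touches the NULL pointer.
 */
static int job_exit_common(Job *job, int *released_id)
{
    assert(job != NULL && released_id != NULL);

    if (job->prepared) {
        return 0;
    }
    job->prepared = 1;

    *released_id = job->target->id;   /* safe: first run only */
    job->target = NULL;
    return 1;
}
```

Reading job->target->id in an initializer at the top of the function,
before the guard, would crash on the second call, which is the failure
mode described above.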

Cc: qemu-sta...@nongnu.org
Fixes: 737efc1eda23b904fbe0e66b37715fb0e5c3e58b
Signed-off-by: Max Reitz 
Reviewed-by: John Snow 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20191014153931.20699-2-mre...@redhat.com
Signed-off-by: Max Reitz 
(cherry picked from commit f93c3add3a773e0e3f6277e5517583c4ad3a43c2)
Signed-off-by: Michael Roth 
---
 block/mirror.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 9f5c59ece1..0e3f7923cf 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -618,11 +618,11 @@ static int mirror_exit_common(Job *job)
 {
 MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
 BlockJob *bjob = &s->common;
-MirrorBDSOpaque *bs_opaque = s->mirror_top_bs->opaque;
+MirrorBDSOpaque *bs_opaque;
 AioContext *replace_aio_context = NULL;
-BlockDriverState *src = s->mirror_top_bs->backing->bs;
-BlockDriverState *target_bs = blk_bs(s->target);
-BlockDriverState *mirror_top_bs = s->mirror_top_bs;
+BlockDriverState *src;
+BlockDriverState *target_bs;
+BlockDriverState *mirror_top_bs;
 Error *local_err = NULL;
 bool abort = job->ret < 0;
 int ret = 0;
@@ -632,6 +632,11 @@ static int mirror_exit_common(Job *job)
 }
 s->prepared = true;
 
+mirror_top_bs = s->mirror_top_bs;
+bs_opaque = mirror_top_bs->opaque;
+src = mirror_top_bs->backing->bs;
+target_bs = blk_bs(s->target);
+
 if (bdrv_chain_contains(src, target_bs)) {
 bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
 }
-- 
2.17.1

[PATCH 08/55] block/file-posix: Reduce xfsctl() use

2019-11-05 Thread Michael Roth

From: Max Reitz 

This patch removes xfs_write_zeroes() and xfs_discard().  Both functions
were added shortly before the same feature became available through
fallocate():

- fallocate() has supported PUNCH_HOLE for XFS since Linux 2.6.38 (March
  2011); xfs_discard() was added in December 2010.

- fallocate() has supported ZERO_RANGE for XFS since Linux 3.15 (June
  2014); xfs_write_zeroes() was added in November 2013.

Nowadays, all systems that qemu runs on should support both fallocate()
features (RHEL 7's kernel does).

xfsctl() is still useful for getting the request alignment for O_DIRECT,
so this patch does not remove our dependency on it completely.

Note that xfs_write_zeroes() had a bug: It calls ftruncate() when the
file is shorter than the specified range (because ZERO_RANGE does not
increase the file length).  ftruncate() may yield and then discard data
that parallel write requests have written past the EOF in the meantime.
Dropping the function altogether fixes the bug.

Suggested-by: Paolo Bonzini 
Fixes: 50ba5b2d994853b38fed10e0841b119da0f8b8e5
Reported-by: Lukáš Doktor 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Reviewed-by: Stefano Garzarella 
Reviewed-by: John Snow 
Tested-by: Stefano Garzarella 
Tested-by: John Snow 
Signed-off-by: Kevin Wolf 
(cherry picked from commit b2c6f23f4a9f6d8f1b648705cd46d3713b78d6a2)
Signed-off-by: Michael Roth 
---
 block/file-posix.c | 77 +-
 1 file changed, 1 insertion(+), 76 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4479cc7ab4..992eb4a798 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1445,59 +1445,6 @@ out:
 }
 }
 
-#ifdef CONFIG_XFS
-static int xfs_write_zeroes(BDRVRawState *s, int64_t offset, uint64_t bytes)
-{
-int64_t len;
-struct xfs_flock64 fl;
-int err;
-
-len = lseek(s->fd, 0, SEEK_END);
-if (len < 0) {
-return -errno;
-}
-
-if (offset + bytes > len) {
-/* XFS_IOC_ZERO_RANGE does not increase the file length */
-if (ftruncate(s->fd, offset + bytes) < 0) {
-return -errno;
-}
-}
-
-memset(&fl, 0, sizeof(fl));
-fl.l_whence = SEEK_SET;
-fl.l_start = offset;
-fl.l_len = bytes;
-
-if (xfsctl(NULL, s->fd, XFS_IOC_ZERO_RANGE, &fl) < 0) {
-err = errno;
-trace_file_xfs_write_zeroes(strerror(errno));
-return -err;
-}
-
-return 0;
-}
-
-static int xfs_discard(BDRVRawState *s, int64_t offset, uint64_t bytes)
-{
-struct xfs_flock64 fl;
-int err;
-
-memset(&fl, 0, sizeof(fl));
-fl.l_whence = SEEK_SET;
-fl.l_start = offset;
-fl.l_len = bytes;
-
-if (xfsctl(NULL, s->fd, XFS_IOC_UNRESVSP64, &fl) < 0) {
-err = errno;
-trace_file_xfs_discard(strerror(errno));
-return -err;
-}
-
-return 0;
-}
-#endif
-
 static int translate_err(int err)
 {
 if (err == -ENODEV || err == -ENOSYS || err == -EOPNOTSUPP ||
@@ -1553,10 +1500,8 @@ static ssize_t handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
 static int handle_aiocb_write_zeroes(void *opaque)
 {
 RawPosixAIOData *aiocb = opaque;
-#if defined(CONFIG_FALLOCATE) || defined(CONFIG_XFS)
-BDRVRawState *s = aiocb->bs->opaque;
-#endif
 #ifdef CONFIG_FALLOCATE
+BDRVRawState *s = aiocb->bs->opaque;
 int64_t len;
 #endif
 
@@ -1564,12 +1509,6 @@ static int handle_aiocb_write_zeroes(void *opaque)
 return handle_aiocb_write_zeroes_block(aiocb);
 }
 
-#ifdef CONFIG_XFS
-if (s->is_xfs) {
-return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
-}
-#endif
-
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
 if (s->has_write_zeroes) {
 int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
@@ -1632,14 +1571,6 @@ static int handle_aiocb_write_zeroes_unmap(void *opaque)
 }
 #endif
 
-#ifdef CONFIG_XFS
-if (s->is_xfs) {
-/* xfs_discard() guarantees that the discarded area reads as all-zero
- * afterwards, so we can use it here. */
-return xfs_discard(s, aiocb->aio_offset, aiocb->aio_nbytes);
-}
-#endif
-
 /* If we couldn't manage to unmap while guaranteed that the area reads as
  * all-zero afterwards, just write zeroes without unmapping */
 ret = handle_aiocb_write_zeroes(aiocb);
@@ -1716,12 +1647,6 @@ static int handle_aiocb_discard(void *opaque)
 ret = -errno;
 #endif
 } else {
-#ifdef CONFIG_XFS
-if (s->is_xfs) {
-return xfs_discard(s, aiocb->aio_offset, aiocb->aio_nbytes);
-}
-#endif
-
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
aiocb->aio_offset, aiocb->aio_nbytes);
-- 
2.17.1

[PATCH 03/55] s390x/tcg: Fix VERIM with 32/64 bit elements

2019-11-05 Thread Michael Roth

From: David Hildenbrand 

Wrong order of operands. The constant always comes last. Makes QEMU crash
reliably on specific git fetch invocations.

Reported-by: Stefano Brivio 
Signed-off-by: David Hildenbrand 
Message-Id: <20190814151242.27199-1-da...@redhat.com>
Reviewed-by: Cornelia Huck 
Fixes: 5c4b0ab460ef ("s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT UNDER MASK")
Cc: qemu-sta...@nongnu.org
Signed-off-by: Cornelia Huck 
(cherry picked from commit 25bcb45d1b81d22634daa2b1a2d8bee746ac129b)
Signed-off-by: Michael Roth 
---
 target/s390x/translate_vx.inc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 41d5cf869f..0caddb3958 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -213,7 +213,7 @@ static void get_vec_element_ptr_i64(TCGv_ptr ptr, uint8_t reg, TCGv_i64 enr,
vec_full_reg_offset(v3), ptr, 16, 16, data, fn)
 #define gen_gvec_3i(v1, v2, v3, c, gen) \
 tcg_gen_gvec_3i(vec_full_reg_offset(v1), vec_full_reg_offset(v2), \
-vec_full_reg_offset(v3), c, 16, 16, gen)
+vec_full_reg_offset(v3), 16, 16, c, gen)
 #define gen_gvec_4(v1, v2, v3, v4, gen) \
 tcg_gen_gvec_4(vec_full_reg_offset(v1), vec_full_reg_offset(v2), \
vec_full_reg_offset(v3), vec_full_reg_offset(v4), \
-- 
2.17.1

[PATCH 32/55] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()

2019-11-05 Thread Michael Roth

From: Kevin Wolf 

qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
requires s->lock to be taken to protect its accesses to the refcount
table and refcount blocks. However, nothing in this code path actually
took the lock. This could cause the same cache entry to be used by two
requests at the same time, for different tables at different offsets,
resulting in image corruption.

As it would be preferable to base the detection on consistent data (even
though it's just heuristics), let's take the lock not only around the
qcow2_get_refcount() calls, but around the whole function.

This patch takes the lock in qcow2_co_block_status() earlier and asserts
in qcow2_detect_metadata_preallocation() that we hold the lock.

Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Cc: qemu-sta...@nongnu.org
Reported-by: Michael Weiser 
Signed-off-by: Kevin Wolf 
Tested-by: Michael Weiser 
Reviewed-by: Michael Weiser 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
(cherry picked from commit 5e9785505210e2477e590e61b1ab100d0ec22b01)
Signed-off-by: Michael Roth 
---
 block/qcow2-refcount.c | 2 ++
 block/qcow2.c  | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ef965d7895..0d64bf5a5e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
 int64_t i, end_cluster, cluster_count = 0, threshold;
 int64_t file_length, real_allocation, real_clusters;
 
+qemu_co_mutex_assert_locked(&s->lock);
+
 file_length = bdrv_getlength(bs->file->bs);
 if (file_length < 0) {
 return file_length;
diff --git a/block/qcow2.c b/block/qcow2.c
index 865839682c..c0f5439dc8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1899,6 +1899,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
 unsigned int bytes;
 int status = 0;
 
+qemu_co_mutex_lock(&s->lock);
+
 if (!s->metadata_preallocation_checked) {
 ret = qcow2_detect_metadata_preallocation(bs);
 s->metadata_preallocation = (ret == 1);
@@ -1906,7 +1908,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
 }
 
 bytes = MIN(INT_MAX, count);
-qemu_co_mutex_lock(&s->lock);
 ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
 qemu_co_mutex_unlock(&s->lock);
 if (ret < 0) {
-- 
2.17.1

[PATCH 36/55] make-release: pull in edk2 submodules so we can build it from tarballs

2019-11-05 Thread Michael Roth

The `make efi` target added by 536d2173 is built from the roms/edk2
submodule, which in turn relies on additional submodules nested under
roms/edk2.

The make-release script currently only pulls in top-level submodules,
so these nested submodules are missing in the resulting tarball.

We could try to address this situation more generally by recursively
pulling in all submodules, but this doesn't necessarily ensure the
end-result will build properly (this case also required other changes).

Additionally, due to the nature of submodules, we may not always have
control over how these sorts of things are dealt with, so for now we
continue to handle it on a case-by-case basis in the make-release script.

Cc: Laszlo Ersek 
Cc: Bruce Rogers 
Cc: qemu-sta...@nongnu.org # v4.1.0
Reported-by: Bruce Rogers 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Michael Roth 
Message-Id: <20190912231202.12327-2-mdr...@linux.vnet.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit 45c61c6c23918e3b05ed9ecac5b2328ebae5f774)
Signed-off-by: Michael Roth 
---
 scripts/make-release | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/make-release b/scripts/make-release
index b4af9c9e52..a2a8cda33c 100755
--- a/scripts/make-release
+++ b/scripts/make-release
@@ -20,6 +20,14 @@ git checkout "v${version}"
 git submodule update --init
 (cd roms/seabios && git describe --tags --long --dirty > .version)
 (cd roms/skiboot && ./make_version.sh > .version)
+# Fetch edk2 submodule's submodules, since it won't have access to them via
+# the tarball later.
+#
+# A more uniform way to handle this sort of situation would be nice, but we
+# don't necessarily have much control over how a submodule handles its
+# submodule dependencies, so we continue to handle these on a case-by-case
+# basis for now.
+(cd roms/edk2 && git submodule update --init)
 popd
 tar --exclude=.git -cjf ${destination}.tar.bz2 ${destination}
 rm -rf ${destination}
-- 
2.17.1

[PATCH 48/55] virtio-net: prevent offloads reset on migration

2019-11-05 Thread Michael Roth

From: Mikhail Sennikovsky 

Currently, offloads disabled by the guest via the
VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration.
Instead, all offloads reported by guest features (via
VIRTIO_PCI_GUEST_FEATURES) get enabled.
What happens is: first the VirtIONet::curr_guest_offloads gets restored and
the offloads are set correctly:

 #0  qemu_set_offload (nc=0x56a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474
 #1  virtio_net_apply_guest_offloads (n=0x57701ca0) at hw/net/virtio-net.c:720
 #2  virtio_net_post_load_device (opaque=0x57701ca0, version_id=11) at hw/net/virtio-net.c:2334
 #3  vmstate_load_state (f=0x569dc010, vmsd=0x56577c80, opaque=0x57701ca0, version_id=11) at migration/vmstate.c:168
 #4  virtio_load (vdev=0x57701ca0, f=0x569dc010, version_id=11) at hw/virtio/virtio.c:2197
 #5  virtio_device_get (f=0x569dc010, opaque=0x57701ca0, size=0, field=0x5668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
 #6  vmstate_load_state (f=0x569dc010, vmsd=0x56577ce0, opaque=0x57701ca0, version_id=11) at migration/vmstate.c:143
 #7  vmstate_load (f=0x569dc010, se=0x578189e0) at migration/savevm.c:829
 #8  qemu_loadvm_section_start_full (f=0x569dc010, mis=0x569eee20) at migration/savevm.c:2211
 #9  qemu_loadvm_state_main (f=0x569dc010, mis=0x569eee20) at migration/savevm.c:2395
 #10 qemu_loadvm_state (f=0x569dc010) at migration/savevm.c:2467
 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

However, later on the features get restored, and the offloads get reset to
everything supported by the features:

 #0  qemu_set_offload (nc=0x56a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474
 #1  virtio_net_apply_guest_offloads (n=0x57701ca0) at hw/net/virtio-net.c:720
 #2  virtio_net_set_features (vdev=0x57701ca0, features=5104441767) at hw/net/virtio-net.c:773
 #3  virtio_set_features_nocheck (vdev=0x57701ca0, val=5104441767) at hw/virtio/virtio.c:2052
 #4  virtio_load (vdev=0x57701ca0, f=0x569dc010, version_id=11) at hw/virtio/virtio.c:2220
 #5  virtio_device_get (f=0x569dc010, opaque=0x57701ca0, size=0, field=0x5668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
 #6  vmstate_load_state (f=0x569dc010, vmsd=0x56577ce0, opaque=0x57701ca0, version_id=11) at migration/vmstate.c:143
 #7  vmstate_load (f=0x569dc010, se=0x578189e0) at migration/savevm.c:829
 #8  qemu_loadvm_section_start_full (f=0x569dc010, mis=0x569eee20) at migration/savevm.c:2211
 #9  qemu_loadvm_state_main (f=0x569dc010, mis=0x569eee20) at migration/savevm.c:2395
 #10 qemu_loadvm_state (f=0x569dc010) at migration/savevm.c:2467
 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

Fix this by preserving the state in the saved_guest_offloads field and
deferring offload initialization to the new post load hook.
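
The ordering problem and the fix can be simulated with three small
functions. This is only a sketch: Net and the net_*() helpers are
hypothetical, and applied_offloads stands in for whatever
qemu_set_offload() last told the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Net {
    uint64_t curr_guest_offloads;
    uint64_t saved_guest_offloads;
    uint64_t applied_offloads;     /* last value pushed to the backend */
} Net;

/* Step 1: device section is loaded; remember the migrated offloads. */
static void net_post_load_device(Net *n, uint64_t migrated)
{
    assert(n != NULL);
    n->curr_guest_offloads = migrated;
    n->saved_guest_offloads = n->curr_guest_offloads;
}

/* Step 2: features are restored; this overwrites the current offloads. */
static void net_set_features(Net *n, uint64_t supported)
{
    assert(n != NULL);
    n->curr_guest_offloads = supported;
    n->applied_offloads = supported;
}

/* Step 3: new post-load hook; restore and apply the saved offloads. */
static void net_post_load_virtio(Net *n)
{
    assert(n != NULL);
    n->curr_guest_offloads = n->saved_guest_offloads;
    n->applied_offloads = n->curr_guest_offloads;
}
```

Running the three steps with migrated = 0x1 and supported = 0x7 ends with
0x1 applied, whereas stopping after step 2 leaves 0x7 applied, which is
the behaviour shown in the second backtrace.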

Cc: qemu-sta...@nongnu.org
Signed-off-by: Mikhail Sennikovsky 
Signed-off-by: Jason Wang 
(cherry picked from commit 7788c3f2e21e35902d45809b236791383bbb613e)
Signed-off-by: Michael Roth 
---
 hw/net/virtio-net.c| 27 ---
 include/hw/virtio/virtio-net.h |  2 ++
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b9e1cd71cf..6adb0fe252 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2330,9 +2330,13 @@ static int virtio_net_post_load_device(void *opaque, int version_id)
 n->curr_guest_offloads = virtio_net_supported_guest_offloads(n);
 }
 
-if (peer_has_vnet_hdr(n)) {
-virtio_net_apply_guest_offloads(n);
-}
+/*
+ * curr_guest_offloads will be later overwritten by the
+ * virtio_set_features_nocheck call done from the virtio_load.
+ * Here we make sure it is preserved and restored accordingly
+ * in the virtio_net_post_load_virtio callback.
+ */
+n->saved_guest_offloads = n->curr_guest_offloads;
 
 virtio_net_set_queues(n);
 
@@ -2367,6 +2371,22 @@ static int virtio_net_post_load_device(void *opaque, int version_id)
 return 0;
 }
 
+static int virtio_net_post_load_virtio(VirtIODevice *vdev)
+{
+VirtIONet *n = VIRTIO_NET(vdev);
+/*
+ * The actual needed state is now in saved_guest_offloads,
+ * see virtio_net_post_load_device for detail.
+ * Restore it back and apply the desired offloads.
+ */
+n->curr_guest_offloads = n->saved_guest_offloads;
+if (peer_has_vnet_hdr(n)) {
+virtio_net_apply_guest_offloads(n);
+}
+
+return 0;
+}
+
 /* tx_waiting field of a VirtIONetQueue */
 static const VMStateDescription vmstate_virtio_net_queue_tx_waiting = {
 .name = "virtio-net-queue-tx_waiting",
@@ -2909,6 +2929,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
 vdc->guest_notifier_mask = virtio_net_guest_notifier_mask

[PATCH 02/55] Revert "ide/ahci: Check for -ECANCELED in aio callbacks"

2019-11-05 Thread Michael Roth

From: John Snow 

This reverts commit 0d910cfeaf2076b116b4517166d5deb0fea76394.

It's not correct to just ignore an error code in a callback; we need to
handle that error and possibly report failure to the guest so that they
don't wait indefinitely for an operation that will now never finish.

This ought to help cases reported by Nutanix where iSCSI returns a
legitimate -ECANCELED for certain operations which should be propagated
normally.

Reported-by: Shaju Abraham 
Signed-off-by: John Snow 
Message-id: 20190729223605.7163-1-js...@redhat.com
Signed-off-by: John Snow 
(cherry picked from commit 8ec41c4265714255d5a138f8b538faf3583dcff6)
Signed-off-by: Michael Roth 
---
 hw/ide/ahci.c |  3 ---
 hw/ide/core.c | 14 --
 2 files changed, 17 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 00ba422a48..6aaf66534a 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1023,9 +1023,6 @@ static void ncq_cb(void *opaque, int ret)
 IDEState *ide_state = &ncq_tfs->drive->port.ifs[0];
 
 ncq_tfs->aiocb = NULL;
-if (ret == -ECANCELED) {
-return;
-}
 
 if (ret < 0) {
 bool is_read = ncq_tfs->cmd == READ_FPDMA_QUEUED;
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6afadf894f..8e1624f7ce 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -722,9 +722,6 @@ static void ide_sector_read_cb(void *opaque, int ret)
 s->pio_aiocb = NULL;
 s->status &= ~BUSY_STAT;
 
-if (ret == -ECANCELED) {
-return;
-}
 if (ret != 0) {
 if (ide_handle_rw_error(s, -ret, IDE_RETRY_PIO |
 IDE_RETRY_READ)) {
@@ -840,10 +837,6 @@ static void ide_dma_cb(void *opaque, int ret)
 uint64_t offset;
 bool stay_active = false;
 
-if (ret == -ECANCELED) {
-return;
-}
-
 if (ret == -EINVAL) {
 ide_dma_error(s);
 return;
@@ -975,10 +968,6 @@ static void ide_sector_write_cb(void *opaque, int ret)
 IDEState *s = opaque;
 int n;
 
-if (ret == -ECANCELED) {
-return;
-}
-
 s->pio_aiocb = NULL;
 s->status &= ~BUSY_STAT;
 
@@ -1058,9 +1047,6 @@ static void ide_flush_cb(void *opaque, int ret)
 
 s->pio_aiocb = NULL;
 
-if (ret == -ECANCELED) {
-return;
-}
 if (ret < 0) {
 /* XXX: What sector number to set here? */
 if (ide_handle_rw_error(s, -ret, IDE_RETRY_FLUSH)) {
-- 
2.17.1
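
The reasoning behind the revert above can be illustrated with a toy completion callback. This is a sketch under stated assumptions, not the IDE code: ToyRequest and both callbacks are hypothetical, and "busy" merely stands in for whatever state the guest is waiting on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request state, loosely modelled on a busy flag plus an
 * error indication visible to the guest. */
typedef struct ToyRequest {
    bool busy;
    bool error_reported;
} ToyRequest;

/* Shape of the reverted code: a cancelled request returns early and
 * nothing is ever completed or reported. */
static void toy_cb_ignoring_cancel(ToyRequest *r, int ret)
{
    if (ret == -ECANCELED) {
        return;
    }
    r->busy = false;
    if (ret < 0) {
        r->error_reported = true;
    }
}

/* Shape after the revert: -ECANCELED is handled like any other error. */
static void toy_cb(ToyRequest *r, int ret)
{
    r->busy = false;
    if (ret < 0) {
        r->error_reported = true;
    }
}

/* Run one callback on a fresh in-flight request and report whether the
 * request is still marked busy afterwards. */
static bool toy_still_busy_after(void (*cb)(ToyRequest *, int), int ret)
{
    ToyRequest r = { true, false };
    cb(&r, ret);
    return r.busy;
}
```

With the early return, a request completed with -ECANCELED stays busy forever in this model; without it, the request finishes with an error.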

[PATCH 33/55] block/backup: fix max_transfer handling for copy_range

2019-11-05 Thread Michael Roth

From: Vladimir Sementsov-Ogievskiy 

Of course, QEMU_ALIGN_UP is a typo, it should be QEMU_ALIGN_DOWN, as we
are trying to find an aligned size which satisfies both source and target.
Also, don't ignore a max_transfer that is too small. In this case it seems
safer to disable copy_range.

Fixes: 9ded4a0114968e
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20190920142056.12778-2-vsement...@virtuozzo.com
Signed-off-by: Max Reitz 
(cherry picked from commit 981fb5810aa3f68797ee6e261db338bd78857614)
Signed-off-by: Michael Roth 
---
 block/backup.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index b26c22c4b8..7067d1d1ad 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -657,12 +657,19 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 job->cluster_size = cluster_size;
 job->copy_bitmap = copy_bitmap;
 copy_bitmap = NULL;
-job->use_copy_range = !compress; /* compression isn't supported for it */
 job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk),
 blk_get_max_transfer(job->target));
-job->copy_range_size = MAX(job->cluster_size,
-   QEMU_ALIGN_UP(job->copy_range_size,
- job->cluster_size));
+job->copy_range_size = QEMU_ALIGN_DOWN(job->copy_range_size,
+   job->cluster_size);
+/*
+ * Set use_copy_range, consider the following:
+ * 1. Compression is not supported for copy_range.
+ * 2. copy_range does not respect max_transfer (it's a TODO), so we factor
+ *that in here. If max_transfer is smaller than the job->cluster_size,
+ *we do not use copy_range (in that case it's zero after aligning down
+ *above).
+ */
+job->use_copy_range = !compress && job->copy_range_size > 0;
 
 /* Required permissions are already taken with target's blk_new() */
 block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
-- 
2.17.1
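
The difference between aligning up and aligning down in the patch above can be checked with a small standalone sketch. The helpers below are local stand-ins written for this example (they are not the QEMU macros themselves), and the byte limits are assumed to be positive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the alignment helpers; m must be positive. */
static int64_t align_down(int64_t n, int64_t m) { return n / m * m; }
static int64_t align_up(int64_t n, int64_t m) { return align_down(n + m - 1, m); }

/* Minimum of two limits where 0 means "no limit". */
static int64_t min_non_zero(int64_t a, int64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Old computation: rounds up and forces at least one cluster, so the
 * result can exceed the smaller max_transfer. */
static int64_t old_copy_range_size(int64_t src_max, int64_t tgt_max,
                                   int64_t cluster_size)
{
    int64_t size = align_up(min_non_zero(src_max, tgt_max), cluster_size);
    return size > cluster_size ? size : cluster_size;
}

/* New computation: rounds down, which yields 0 when the limit is
 * smaller than one cluster. */
static int64_t new_copy_range_size(int64_t src_max, int64_t tgt_max,
                                   int64_t cluster_size)
{
    return align_down(min_non_zero(src_max, tgt_max), cluster_size);
}

/* copy_range is used only without compression and with a usable size. */
static bool use_copy_range(bool compress, int64_t copy_range_size)
{
    return !compress && copy_range_size > 0;
}
```

For limits of 100000 and 200000 bytes and 64 KiB clusters, the old formula gives 131072 (above the 100000 limit) while the new one gives 65536. With a 4096-byte limit the new size is 0 and copy_range is disabled.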

[PATCH 22/55] block/nfs: tear down aio before nfs_close

2019-11-05 Thread Michael Roth

From: Peter Lieven 

nfs_close is a sync call from libnfs and has its own event
handler polling on the nfs FD. Avoid QEMU and libnfs
interfering with each other here.

CC: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 601dc6559725f7a614b6f893611e17ff0908e914)
Signed-off-by: Michael Roth 
---
 block/nfs.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/nfs.c b/block/nfs.c
index d93241b3bb..2b7a078241 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -390,12 +390,14 @@ static void nfs_attach_aio_context(BlockDriverState *bs,
 static void nfs_client_close(NFSClient *client)
 {
 if (client->context) {
+qemu_mutex_lock(&client->mutex);
+aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+   false, NULL, NULL, NULL, NULL);
+qemu_mutex_unlock(&client->mutex);
 if (client->fh) {
 nfs_close(client->context, client->fh);
 client->fh = NULL;
 }
-aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
-   false, NULL, NULL, NULL, NULL);
 nfs_destroy_context(client->context);
 client->context = NULL;
 }
-- 
2.17.1

[PATCH 43/55] qcow2: Limit total allocation range to INT_MAX

2019-11-05 Thread Michael Roth

From: Max Reitz 

When the COW areas are included, the size of an allocation can exceed
INT_MAX.  This is kind of limited by handle_alloc() in that it already
caps avail_bytes at INT_MAX, but the number of clusters still reflects
the original length.

This can have all sorts of effects, ranging from the storage layer write
call failing to image corruption.  (If there were no image corruption,
then I suppose there would be data loss because the .cow_end area is
forced to be empty, even though there might be something we need to
COW.)

Fix all of it by limiting nb_clusters so the equivalent number of bytes
will not exceed INT_MAX.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Kevin Wolf 
(cherry picked from commit d1b9d19f99586b33795e20a79f645186ccbc070f)
Signed-off-by: Michael Roth 
---
 block/qcow2-cluster.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 760564c8fb..f8576031b6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1341,6 +1341,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t 
guest_offset,
 nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
 assert(nb_clusters <= INT_MAX);
 
+/* Limit total allocation byte count to INT_MAX */
+nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
+
 /* Find L2 entry for the first involved cluster */
 ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
 if (ret < 0) {
@@ -1429,7 +1432,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t 
guest_offset,
  * request actually writes to (excluding COW at the end)
  */
 uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
-int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
+int avail_bytes = nb_clusters << s->cluster_bits;
 int nb_bytes = MIN(requested_bytes, avail_bytes);
 QCowL2Meta *old_m = *m;
 
-- 
2.17.1
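
The effect of the new clamp in the patch above can be worked through with concrete numbers. This is a standalone sketch, assuming a 32-bit int; the function names are made up for the example and are not part of qcow2.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Clamp a cluster count so that the matching byte count fits in an int. */
static uint64_t limit_nb_clusters(uint64_t nb_clusters, int cluster_bits)
{
    uint64_t max_clusters = (uint64_t)(INT_MAX >> cluster_bits);
    return nb_clusters < max_clusters ? nb_clusters : max_clusters;
}

/* Byte count for the clamped cluster count; always representable as int. */
static int avail_bytes_for(uint64_t nb_clusters, int cluster_bits)
{
    uint64_t bytes = limit_nb_clusters(nb_clusters, cluster_bits)
                     << cluster_bits;
    assert(bytes <= INT_MAX);
    return (int)bytes;
}
```

With 64 KiB clusters (cluster_bits = 16) the clamp is 32767 clusters, i.e. 2147418112 bytes, which is the largest cluster-aligned value below INT_MAX. Because the cluster count itself is limited, the byte count and the cluster count stay consistent.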

[PATCH 18/55] target/arm: Free TCG temps in trans_VMOV_64_sp()

2019-11-05 Thread Michael Roth

From: Peter Maydell 

The function neon_store_reg32() doesn't free the TCG temp that it
is passed, so the caller must do that. We got this right in most
places but forgot to free the TCG temps in trans_VMOV_64_sp().

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20190827121931.26836-1-peter.mayd...@linaro.org
(cherry picked from commit 342d27581bd3ecdb995e4fc55fcd383cf3242888)
Signed-off-by: Michael Roth 
---
 target/arm/translate-vfp.inc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index 092eb5ec53..ef45cecbea 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -881,8 +881,10 @@ static bool trans_VMOV_64_sp(DisasContext *s, 
arg_VMOV_64_sp *a)
 /* gpreg to fpreg */
 tmp = load_reg(s, a->rt);
 neon_store_reg32(tmp, a->vm);
+tcg_temp_free_i32(tmp);
 tmp = load_reg(s, a->rt2);
 neon_store_reg32(tmp, a->vm + 1);
+tcg_temp_free_i32(tmp);
 }
 
 return true;
-- 
2.17.1

[PATCH 31/55] coroutine: Add qemu_co_mutex_assert_locked()

2019-11-05 Thread Michael Roth

From: Kevin Wolf 

Some functions require that the caller holds a certain CoMutex for them
to operate correctly. Add a function so that they can assert the lock is
really held.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf 
Tested-by: Michael Weiser 
Reviewed-by: Michael Weiser 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Denis V. Lunev 
Reviewed-by: Max Reitz 
(cherry picked from commit 944f3d5dd216fcd8cb007eddd4f82dced0a15b3d)
Signed-off-by: Michael Roth 
---
 include/qemu/coroutine.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9801e7f5a4..f4843b5f59 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -167,6 +167,21 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
  */
 void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 
+/**
+ * Assert that the current coroutine holds @mutex.
+ */
+static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
+{
+/*
+ * mutex->holder doesn't need any synchronisation if the assertion holds
+ * true because the mutex protects it. If it doesn't hold true, we still
+ * don't mind if another thread takes or releases mutex behind our back,
+ * because the condition will be false no matter whether we read NULL or
+ * the pointer for any other coroutine.
+ */
+assert(atomic_read(&mutex->locked) &&
+   mutex->holder == qemu_coroutine_self());
+}
 
 /**
  * CoQueues are a mechanism to queue coroutines in order to continue executing
-- 
2.17.1

[PATCH 19/55] target/arm: Don't abort on M-profile exception return in linux-user mode

2019-11-05 Thread Michael Roth

From: Peter Maydell 

An attempt to do an exception-return (branch to one of the magic
addresses) in linux-user mode for M-profile should behave like
a normal branch, because linux-user mode is always going to be
in 'handler' mode. This used to work, but we broke it when we added
support for the M-profile security extension in commit d02a8698d7ae2bfed.

In that commit we allowed even handler-mode calls to magic return
values to be checked for and dealt with by causing an
EXCP_EXCEPTION_EXIT exception to be taken, because this is
needed for the FNC_RETURN return-from-non-secure-function-call
handling. For system mode we added a check in do_v7m_exception_exit()
to make any spurious calls from Handler mode behave correctly, but
forgot that linux-user mode would also be affected.

How an attempted return-from-non-secure-function-call in linux-user
mode should be handled is not clear -- on real hardware it would
result in return to secure code (not to the Linux kernel) which
could then handle the error in any way it chose. For QEMU we take
the simple approach of treating this erroneous return the same way
it would be handled on a CPU without the security extensions --
treat it as a normal branch.

The upshot of all this is that for linux-user mode we should never
do any of the bx_excret magic, so the code change is simple.

This ought to be a weird corner case that only affects broken guest
code (because Linux user processes should never be attempting to do
exception returns or NS function returns), except that the code that
assigns addresses in RAM for the process and stack in our linux-user
code does not attempt to avoid this magic address range, so
legitimate code attempting to return to a trampoline routine on the
stack can fall into this case. This change fixes those programs,
but we should also look at restricting the range of memory we
use for M-profile linux-user guests to the area that would be
real RAM in hardware.

Cc: qemu-sta...@nongnu.org
Reported-by: Christophe Lyon 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
Message-id: 20190822131534.16602-1-peter.mayd...@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1840922
Signed-off-by: Peter Maydell 
(cherry picked from commit 5e5584c89f36b302c666bc6db535fd3f7ff35ad2)
Signed-off-by: Michael Roth 
---
 target/arm/translate.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7853462b21..24cb4ba075 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -952,10 +952,27 @@ static inline void gen_bx(DisasContext *s, TCGv_i32 var)
 store_cpu_field(var, thumb);
 }
 
-/* Set PC and Thumb state from var. var is marked as dead.
+/*
+ * Set PC and Thumb state from var. var is marked as dead.
  * For M-profile CPUs, include logic to detect exception-return
  * branches and handle them. This is needed for Thumb POP/LDM to PC, LDR to PC,
  * and BX reg, and no others, and happens only for code in Handler mode.
+ * The Security Extension also requires us to check for the FNC_RETURN
+ * which signals a function return from non-secure state; this can happen
+ * in both Handler and Thread mode.
+ * To avoid having to do multiple comparisons in inline generated code,
+ * we make the check we do here loose, so it will match for EXC_RETURN
+ * in Thread mode. For system emulation do_v7m_exception_exit() checks
+ * for these spurious cases and returns without doing anything (giving
+ * the same behaviour as for a branch to a non-magic address).
+ *
+ * In linux-user mode it is unclear what the right behaviour for an
+ * attempted FNC_RETURN should be, because in real hardware this will go
+ * directly to Secure code (ie not the Linux kernel) which will then treat
+ * the error in any way it chooses. For QEMU we opt to make the FNC_RETURN
+ * attempt behave the way it would on a CPU without the security extension,
+ * which is to say "like a normal branch". That means we can simply treat
+ * all branches as normal with no magic address behaviour.
  */
 static inline void gen_bx_excret(DisasContext *s, TCGv_i32 var)
 {
@@ -963,10 +980,12 @@ static inline void gen_bx_excret(DisasContext *s, 
TCGv_i32 var)
  * s->base.is_jmp that we need to do the rest of the work later.
  */
 gen_bx(s, var);
+#ifndef CONFIG_USER_ONLY
 if (arm_dc_feature(s, ARM_FEATURE_M_SECURITY) ||
 (s->v7m_handler_mode && arm_dc_feature(s, ARM_FEATURE_M))) {
 s->base.is_jmp = DISAS_BX_EXCRET;
 }
+#endif
 }
 
 static inline void gen_bx_excret_final_code(DisasContext *s)
-- 
2.17.1

[PATCH 35/55] hw/arm/boot.c: Set NSACR.{CP11, CP10} for NS kernel boots

2019-11-05 Thread Michael Roth

From: Peter Maydell 

If we're booting a Linux kernel directly into Non-Secure
state on a CPU which has Secure state, then make sure we
set the NSACR CP11 and CP10 bits, so that Non-Secure is allowed
to access the FPU. Otherwise an AArch32 kernel will UNDEF as
soon as it tries to use the FPU.

It used to not matter that we didn't do this until commit
fc1120a7f5f2d4b6, where we implemented actually honouring
these NSACR bits.

The problem only exists for CPUs where EL3 is AArch32; the
equivalent AArch64 trap bits are in CPTR_EL3 and are "0 to
not trap, 1 to trap", so the reset value of the register
permits NS access, unlike NSACR.

Fixes: fc1120a7f5
Fixes: https://bugs.launchpad.net/qemu/+bug/1844597
Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20190920174039.3916-1-peter.mayd...@linaro.org
(cherry picked from commit ece628fcf69cbbd4b3efb6fbd203af07609467a2)
Signed-off-by: Michael Roth 
---
 hw/arm/boot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index c2b89b3bb9..fc4e021a38 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -754,6 +754,8 @@ static void do_cpu_reset(void *opaque)
 (cs != first_cpu || !info->secure_board_setup)) {
 /* Linux expects non-secure state */
 env->cp15.scr_el3 |= SCR_NS;
+/* Set NSACR.{CP11,CP10} so NS can access the FPU */
+env->cp15.nsacr |= 3 << 10;
 }
 }
 
-- 
2.17.1
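
The constant used in the patch above, 3 << 10, sets bits 10 and 11 in one go. A minimal sketch of that bit arithmetic follows; the macro and function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two NSACR bits the patch sets. */
#define TOY_NSACR_CP10 (UINT32_C(1) << 10)
#define TOY_NSACR_CP11 (UINT32_C(1) << 11)

/* Return nsacr with both coprocessor-access bits set, other bits kept. */
static uint32_t toy_nsacr_allow_ns_fpu(uint32_t nsacr)
{
    return nsacr | (UINT32_C(3) << 10);
}
```

Since the operation is a bitwise OR, applying it twice gives the same result and any other bits already present in the register value are preserved.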

[PATCH 26/55] curl: Pass CURLSocket to curl_multi_do()

2019-11-05 Thread Michael Roth

From: Max Reitz 

curl_multi_do_locked() currently marks all sockets as ready.  That is
not only inefficient, but in fact unsafe (the loop is).  A follow-up
patch will change that, but to do so, curl_multi_do_locked() needs to
know exactly which socket is ready; and that is accomplished by this
patch here.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Message-id: 20190910124136.10565-5-mre...@redhat.com
Reviewed-by: Maxim Levitsky 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
(cherry picked from commit 9dbad87d25587ff640ef878f7b6159fc368ff541)
Signed-off-by: Michael Roth 
---
 block/curl.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 5838afef99..cf2686218d 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -185,15 +185,15 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int 
action,
 switch (action) {
 case CURL_POLL_IN:
 aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_do, NULL, NULL, state);
+   curl_multi_do, NULL, NULL, socket);
 break;
 case CURL_POLL_OUT:
 aio_set_fd_handler(s->aio_context, fd, false,
-   NULL, curl_multi_do, NULL, state);
+   NULL, curl_multi_do, NULL, socket);
 break;
 case CURL_POLL_INOUT:
 aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_do, curl_multi_do, NULL, state);
+   curl_multi_do, curl_multi_do, NULL, socket);
 break;
 case CURL_POLL_REMOVE:
 aio_set_fd_handler(s->aio_context, fd, false,
@@ -392,9 +392,10 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 }
 
 /* Called with s->mutex held.  */
-static void curl_multi_do_locked(CURLState *s)
+static void curl_multi_do_locked(CURLSocket *ready_socket)
 {
 CURLSocket *socket, *next_socket;
+CURLState *s = ready_socket->state;
 int running;
 int r;
 
@@ -413,12 +414,13 @@ static void curl_multi_do_locked(CURLState *s)
 
 static void curl_multi_do(void *arg)
 {
-CURLState *s = (CURLState *)arg;
+CURLSocket *socket = arg;
+BDRVCURLState *s = socket->state->s;
 
-qemu_mutex_lock(&s->s->mutex);
-curl_multi_do_locked(s);
-curl_multi_check_completion(s->s);
-qemu_mutex_unlock(&s->s->mutex);
+qemu_mutex_lock(&s->mutex);
+curl_multi_do_locked(socket);
+curl_multi_check_completion(s);
+qemu_mutex_unlock(&s->mutex);
 }
 
 static void curl_multi_timeout_do(void *arg)
-- 
2.17.1

[PATCH 27/55] curl: Report only ready sockets

2019-11-05 Thread Michael Roth

From: Max Reitz 

Instead of reporting all sockets to cURL, only report the one that has
caused curl_multi_do_locked() to be called.  This lets us get rid of the
QLIST_FOREACH_SAFE() list, which was actually wrong: SAFE foreaches are
only safe when the current element is removed in each iteration.  If it is
possible for the list to be concurrently modified, we cannot guarantee
that only the current element will be removed.  Therefore, we must not
use QLIST_FOREACH_SAFE() here.

Fixes: ff5ca1664af85b24a4180d595ea6873fd3deac57
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Message-id: 20190910124136.10565-6-mre...@redhat.com
Reviewed-by: Maxim Levitsky 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
(cherry picked from commit 9abaf9fc474c3dd53e8e119326abc774c977c331)
Signed-off-by: Michael Roth 
---
 block/curl.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index cf2686218d..fd70f1ebc4 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -392,24 +392,19 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 }
 
 /* Called with s->mutex held.  */
-static void curl_multi_do_locked(CURLSocket *ready_socket)
+static void curl_multi_do_locked(CURLSocket *socket)
 {
-CURLSocket *socket, *next_socket;
-CURLState *s = ready_socket->state;
+BDRVCURLState *s = socket->state->s;
 int running;
 int r;
 
-if (!s->s->multi) {
+if (!s->multi) {
 return;
 }
 
-/* Need to use _SAFE because curl_multi_socket_action() may trigger
- * curl_sock_cb() which might modify this list */
-QLIST_FOREACH_SAFE(socket, &s->sockets, next, next_socket) {
-do {
-r = curl_multi_socket_action(s->s->multi, socket->fd, 0, &running);
-} while (r == CURLM_CALL_MULTI_PERFORM);
-}
+do {
+r = curl_multi_socket_action(s->multi, socket->fd, 0, &running);
+} while (r == CURLM_CALL_MULTI_PERFORM);
 }
 
 static void curl_multi_do(void *arg)
-- 
2.17.1

[PATCH 28/55] curl: Handle success in multi_check_completion

2019-11-05 Thread Michael Roth

From: Max Reitz 

Background: As of cURL 7.59.0, it verifies that several functions are
not called from within a callback.  Among these functions is
curl_multi_add_handle().

curl_read_cb() is a callback from cURL and not a coroutine.  Waking up
acb->co will lead to entering it then and there, which means the current
request will settle and the caller (if it runs in the same coroutine)
may then issue the next request.  In such a case, we will enter
curl_setup_preadv() effectively from within curl_read_cb().

Calling curl_multi_add_handle() will then fail and the new request will
not be processed.

Fix this by not letting curl_read_cb() wake up acb->co.  Instead, leave
the whole business of settling the AIOCB objects to
curl_multi_check_completion() (which is called from our timer callback
and our FD handler, so not from any cURL callbacks).

Reported-by: Natalie Gavrielov 
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Message-id: 20190910124136.10565-7-mre...@redhat.com
Reviewed-by: John Snow 
Reviewed-by: Maxim Levitsky 
Signed-off-by: Max Reitz 
(cherry picked from commit bfb23b480a49114315877aacf700b49453e0f9d9)
Signed-off-by: Michael Roth 
---
 block/curl.c | 69 ++--
 1 file changed, 29 insertions(+), 40 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index fd70f1ebc4..c343c7ed3d 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -229,7 +229,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t 
nmemb, void *opaque)
 {
 CURLState *s = ((CURLState*)opaque);
 size_t realsize = size * nmemb;
-int i;
 
 trace_curl_read_cb(realsize);
 
@@ -245,32 +244,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t 
nmemb, void *opaque)
 memcpy(s->orig_buf + s->buf_off, ptr, realsize);
 s->buf_off += realsize;
 
-for(i=0; i<CURL_NUM_ACB; i++) {
-CURLAIOCB *acb = s->acb[i];
-
-if (!acb)
-continue;
-
-if ((s->buf_off >= acb->end)) {
-size_t request_length = acb->bytes;
-
-qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
-acb->end - acb->start);
-
-if (acb->end - acb->start < request_length) {
-size_t offset = acb->end - acb->start;
-qemu_iovec_memset(acb->qiov, offset, 0,
-  request_length - offset);
-}
-
-acb->ret = 0;
-s->acb[i] = NULL;
-qemu_mutex_unlock(&s->s->mutex);
-aio_co_wake(acb->co);
-qemu_mutex_lock(&s->s->mutex);
-}
-}
-
 read_end:
 /* curl will error out if we do not return this value */
 return size * nmemb;
@@ -351,13 +324,14 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 break;
 
 if (msg->msg == CURLMSG_DONE) {
+int i;
 CURLState *state = NULL;
+bool error = msg->data.result != CURLE_OK;
+
 curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
   (char **)&state);
 
-/* ACBs for successful messages get completed in curl_read_cb */
-if (msg->data.result != CURLE_OK) {
-int i;
+if (error) {
 static int errcount = 100;
 
 /* Don't lose the original error message from curl, since
@@ -369,20 +343,35 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 error_report("curl: further errors suppressed");
 }
 }
+}
 
-for (i = 0; i < CURL_NUM_ACB; i++) {
-CURLAIOCB *acb = state->acb[i];
+for (i = 0; i < CURL_NUM_ACB; i++) {
+CURLAIOCB *acb = state->acb[i];
 
-if (acb == NULL) {
-continue;
-}
+if (acb == NULL) {
+continue;
+}
+
+if (!error) {
+/* Assert that we have read all data */
+assert(state->buf_off >= acb->end);
+
+qemu_iovec_from_buf(acb->qiov, 0,
+state->orig_buf + acb->start,
+acb->end - acb->start);
 
-acb->ret = -EIO;
-state->acb[i] = NULL;
-qemu_mutex_unlock(&s->mutex);
-aio_co_wake(acb->co);
-qemu_mutex_lock(&s->mutex);
+if (acb->end - acb->start < acb->bytes) {
+size_t offset = acb->end - acb->start;
+qemu_iovec_memset(acb->qiov, offset, 0,
+  acb->bytes - offset);
+}
 }
+
+acb->ret = error ? -EIO : 0;
+state->acb[i] = NULL;
+qemu_mute

[PATCH 20/55] libvhost-user: fix SLAVE_SEND_FD handling

2019-11-05 Thread Michael Roth

From: Johannes Berg 

It doesn't look like this could possibly work properly since
VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD is defined to 10, but the
dev->protocol_features has a bitmap. I suppose the peer this
was tested with also supported VHOST_USER_PROTOCOL_F_LOG_SHMFD,
in which case the test would always be false, but nevertheless
the code seems wrong.

Use has_feature() to fix this.

Fixes: d84599f56c82 ("libvhost-user: support host notifier")
Signed-off-by: Johannes Berg 
Message-Id: <20190903200422.11693-1-johan...@sipsolutions.net>
Reviewed-by: Tiwei Bie 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 8726b70b449896f1211f869ec4f608904f027207)
Signed-off-by: Michael Roth 
---
 contrib/libvhost-user/libvhost-user.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 4b36e35a82..cb5f5770e4 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -1097,7 +1097,8 @@ bool vu_set_queue_host_notifier(VuDev *dev, VuVirtq *vq, 
int fd,
 
 vmsg.fd_num = fd_num;
 
-if ((dev->protocol_features & VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) == 0) {
+if (!has_feature(dev->protocol_features,
+ VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD)) {
 return false;
 }
 
-- 
2.17.1
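
The bug fixed above comes from using a bit number as if it were a bit mask. The sketch below reproduces both checks in isolation. The value 10 is the one quoted in the commit message; the value 1 for LOG_SHMFD is an assumption taken from the vhost-user protocol numbering, and the function names are local to this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers (not masks). TOY_F_LOG_SHMFD = 1 is an assumption. */
enum {
    TOY_F_LOG_SHMFD = 1,
    TOY_F_SLAVE_SEND_FD = 10
};

/* Correct test: shift 1 by the bit number to build the mask. */
static bool toy_has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64);
    return (features & (UINT64_C(1) << bit)) != 0;
}

/* Buggy test: ANDs with the bit number itself. 10 is binary 1010, so
 * this really checks bits 1 and 3. */
static bool toy_buggy_check(uint64_t features)
{
    return (features & TOY_F_SLAVE_SEND_FD) != 0;
}
```

A peer that advertises only bit 1 passes the buggy check although it lacks bit 10, and a peer that advertises only bit 10 fails it.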

Re: [PATCH v1 3/4] virtio: increase virtuqueue sizes in new machine types

2019-11-05 Thread Michael S. Tsirkin

On Tue, Nov 05, 2019 at 07:11:04PM +0300, Denis Plotnikov wrote:
> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg
> field reported by SCSI controller. Thus typical sequential read with
> 1 MB size results in the following pattern of the IO from the guest:
>   8,16   115754 2.766095122  2071  D   R 2095104 + 1008 [dd]
>   8,16   115755 2.766108785  2071  D   R 2096112 + 1008 [dd]
>   8,16   115756 2.766113486  2071  D   R 2097120 + 32 [dd]
>   8,16   115757 2.767668961 0  C   R 2095104 + 1008 [0]
>   8,16   115758 2.768534315 0  C   R 2096112 + 1008 [0]
>   8,16   115759 2.768539782 0  C   R 2097120 + 32 [0]
> The IO was generated by
>   dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
> 
> This effectively means that on rotational disks we will observe 3 IOPS
> for each 2 MBs processed. This definitely negatively affects both
> guest and host IO performance.
> 
> The cure is relatively simple - we should report lengthy scatter-gather
> ability of the SCSI controller. Fortunately the situation here is very
> good. VirtIO transport layer can accommodate 1024 items in one request
> while we are using only 128. This situation is present since almost
> the very beginning. 2 items are dedicated for request metadata thus we
> should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
> 
> The following pattern is observed after the patch:
>   8,16   1 9921 2.662721340  2063  D   R 2095104 + 1024 [dd]
>   8,16   1 9922 2.662737585  2063  D   R 2096128 + 1024 [dd]
>   8,16   1 9923 2.665188167 0  C   R 2095104 + 1024 [0]
>   8,16   1 9924 2.665198777 0  C   R 2096128 + 1024 [0]
> which is much better.
> 
> To fix this particular case, the patch adds new machine types with
> extended virtqueue sizes to 256 which also increases max_seg to 254
> implicitly.
> 
> Suggested-by: Denis V. Lunev 
> Signed-off-by: Denis Plotnikov 
> ---



the way we normally do this is change the defaults to 256.
what is wrong with doing that?


>  hw/core/machine.c   | 14 ++
>  hw/i386/pc_piix.c   | 16 +---
>  hw/i386/pc_q35.c| 14 --
>  include/hw/boards.h |  6 ++
>  4 files changed, 45 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 55b08f1466..28013a0e3f 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -24,6 +24,13 @@
>  #include "hw/pci/pci.h"
>  #include "hw/mem/nvdimm.h"
>  
> +GlobalProperty hw_compat_4_0_1[] = {
> +{ "virtio-blk-device", "queue-size", "128" },
> +{ "virtio-scsi-device", "virtqueue_size", "128" },
> +{ "vhost-scsi-device", "virtqueue_size", "128" },
> +};
> +const size_t hw_compat_4_0_1_len = G_N_ELEMENTS(hw_compat_4_0_1);
> +
>  GlobalProperty hw_compat_4_0[] = {
>  { "virtio-balloon-device", "qemu-4-0-config-size", "true" },
>  };
> @@ -157,6 +164,13 @@ GlobalProperty hw_compat_2_1[] = {
>  };
>  const size_t hw_compat_2_1_len = G_N_ELEMENTS(hw_compat_2_1);
>  
> +GlobalProperty hw_compat[] = {
> +{ "virtio-blk-device", "queue-size", "256" },
> +{ "virtio-scsi-device", "virtqueue_size", "256" },
> +{ "vhost-scsi-device", "virtqueue_size", "256" },
> +};
> +const size_t hw_compat_len = G_N_ELEMENTS(hw_compat);
> +
>  static char *machine_get_accel(Object *obj, Error **errp)
>  {
>  MachineState *ms = MACHINE(obj);
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 8ad8e885c6..2260a61b1b 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -426,15 +426,27 @@ static void pc_i440fx_machine_options(MachineClass *m)
>  m->default_machine_opts = "firmware=bios-256k.bin";
>  m->default_display = "std";
>  machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
> +compat_props_add(m->compat_props, hw_compat, hw_compat_len);
>  }
>  
> -static void pc_i440fx_4_0_machine_options(MachineClass *m)
> +static void pc_i440fx_4_0_2_machine_options(MachineClass *m)
>  {
>  pc_i440fx_machine_options(m);
>  m->alias = "pc";
>  m->is_default = 1;
>  }
>  
> +DEFINE_I440FX_MACHINE(v4_0_2, "pc-i440fx-4.0.2", NULL,
> +  pc_i440fx_4_0_2_machine_options);
> +
> +static void pc_i440fx_4_0_machine_options(MachineClass *m)
> +{
> +pc_i440fx_4_0_2_machine_options(m);
> +m->alias = NULL;
> +m->is_default = 0;
> +compat_props_add(m->compat_props, hw_compat_4_0_1, hw_compat_4_0_1_len);
> +}
> +
>  DEFINE_I440FX_MACHINE(v4_0, "pc-i440fx-4.0", NULL,
>pc_i440fx_4_0_machine_options);
>  
> @@ -443,9 +455,7 @@ static void pc_i440fx_3_1_machine_options(MachineClass *m)
>  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>  
>  pc_i440fx_4_0_machine_options(m);
> -m->is_default = 0;
>  m->smbus_no_migration_support = true;
> -m->alias = NULL;
>  pcmc->pvh_enabled = false;
>  compat_props_add(m->compat_props, hw_compat_3_1, hw_compat_3_1_len);
>  compat_props_add(m->compat_prop

[PATCH 24/55] curl: Keep *socket until the end of curl_sock_cb()

2019-11-05 Thread Michael Roth

From: Max Reitz 

This does not really change anything, but it makes the code a bit easier
to follow once we use @socket as the opaque pointer for
aio_set_fd_handler().

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Message-id: 20190910124136.10565-3-mre...@redhat.com
Reviewed-by: Maxim Levitsky 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
(cherry picked from commit 007f339b1099af46a008dac438ca0943e31dba72)
Signed-off-by: Michael Roth 
---
 block/curl.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 92dc2f630e..95d7b77dc0 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -172,10 +172,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
 
     QLIST_FOREACH(socket, &state->sockets, next) {
         if (socket->fd == fd) {
-            if (action == CURL_POLL_REMOVE) {
-                QLIST_REMOVE(socket, next);
-                g_free(socket);
-            }
             break;
         }
     }
@@ -185,7 +181,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
         socket->state = state;
         QLIST_INSERT_HEAD(&state->sockets, socket, next);
     }
-    socket = NULL;
 
     trace_curl_sock_cb(action, (int)fd);
     switch (action) {
@@ -207,6 +202,11 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
             break;
     }
 
+    if (action == CURL_POLL_REMOVE) {
+        QLIST_REMOVE(socket, next);
+        g_free(socket);
+    }
+
     return 0;
 }
 
-- 
2.17.1

[PATCH 29/55] blockjob: update nodes head while removing all bdrv

2019-11-05 Thread Michael Roth

From: Sergio Lopez 

block_job_remove_all_bdrv() iterates through job->nodes, calling
bdrv_root_unref_child() for each entry. The call to the latter may
reach child_job_[can_]set_aio_ctx(), which will also attempt to
traverse job->nodes, potentially finding entries that were freed
on previous iterations.

To avoid this situation, update job->nodes head on each iteration to
ensure that already freed entries are no longer linked to the list.
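
The hazard and the fix can be sketched outside of QEMU. The following is a minimal model in Python, not the C code from the patch: `Node`, `Job`, `traverse()` and `unref()` are hypothetical stand-ins for the GSList entries, BlockJob, child_job_[can_]set_aio_ctx() and bdrv_root_unref_child(). It only illustrates why unlinking the head before the unref keeps a nested traversal away from freed entries.

```python
class Node:
    """Hypothetical stand-in for one job->nodes entry and its BdrvChild."""
    def __init__(self, name, next_node=None):
        self.name = name
        self.next = next_node
        self.freed = False


class Job:
    """Hypothetical stand-in for BlockJob; `nodes` is the list head."""
    def __init__(self, names):
        self.nodes = None
        for name in reversed(names):
            self.nodes = Node(name, self.nodes)
        self.seen_freed = []   # freed entries a nested traversal ran into

    def traverse(self):
        # Models child_job_[can_]set_aio_ctx() walking job->nodes.
        n = self.nodes
        while n:
            if n.freed:
                self.seen_freed.append(n.name)
            n = n.next

    def unref(self, node):
        # Models bdrv_root_unref_child(): it may re-enter traverse(),
        # and afterwards the child is gone.
        self.traverse()
        node.freed = True


def remove_all_old(job):
    # Old loop: the head keeps pointing at entries that are already freed.
    n = job.nodes
    while n:
        job.unref(n)
        n = n.next
    job.nodes = None


def remove_all_new(job):
    # Fixed loop: unlink the head first, so the entry is unreachable
    # by the time unref() runs.
    while job.nodes:
        n = job.nodes
        job.nodes = n.next
        job.unref(n)


old, new = Job(["a", "b", "c"]), Job(["a", "b", "c"])
remove_all_old(old)
remove_all_new(new)
print(old.seen_freed)
print(new.seen_freed)
```

In this model the old loop lets the nested traversal observe freed entries, while the fixed loop never does.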

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1746631
Signed-off-by: Sergio Lopez 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Message-id: 20190911100316.32282-1-mre...@redhat.com
Reviewed-by: Sergio Lopez 
Signed-off-by: Max Reitz 
(cherry picked from commit d876bf676f5e7c6aa9ac64555e48cba8734ecb2f)
Signed-off-by: Michael Roth 
---
 blockjob.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 20b7f557da..74abb97bfd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -186,14 +186,23 @@ static const BdrvChildRole child_job = {
 
 void block_job_remove_all_bdrv(BlockJob *job)
 {
-    GSList *l;
-    for (l = job->nodes; l; l = l->next) {
+    /*
+     * bdrv_root_unref_child() may reach child_job_[can_]set_aio_ctx(),
+     * which will also traverse job->nodes, so consume the list one by
+     * one to make sure that such a concurrent access does not attempt
+     * to process an already freed BdrvChild.
+     */
+    while (job->nodes) {
+        GSList *l = job->nodes;
         BdrvChild *c = l->data;
+
+        job->nodes = l->next;
+
         bdrv_op_unblock_all(c->bs, job->blocker);
         bdrv_root_unref_child(c);
+
+        g_slist_free_1(l);
     }
-    g_slist_free(job->nodes);
-    job->nodes = NULL;
 }
 
 bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
-- 
2.17.1

[PATCH 00/55] Patch Round-up for stable 4.1.1, freeze on 2019-11-12

2019-11-05 Thread Michael Roth

Hi everyone,

The following new patches are queued for QEMU stable v4.1.1:

  https://github.com/mdroth/qemu/commits/stable-4.1-staging

The release is tentatively planned for 2019-11-14:

  https://wiki.qemu.org/Planning/4.1

Please note that the original release date was planned for 2019-11-21,
but was moved up to address a number of qcow2 corruption issues:

  https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg07144.html

Fixes for the XFS issues noted in the thread are still pending, but will
hopefully be in qemu.git master in time for the 4.1.1 freeze and the
currently-scheduled release date for 4.2.0-rc1.

The still-pending patchsets being tracked for inclusion are:

  qcow2: Fix data corruption on XFS
https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00073.html
(PULL pending)
  qcow2: Fix QCOW2_COMPRESSED_SECTOR_MASK
https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg07718.html
  qcow2-bitmap: Fix uint64_t left-shift overflow
https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg07989.html

Please respond here or CC qemu-sta...@nongnu.org on any additional patches
you think should be included in the release.

Thanks!


Adrian Moreno (1):
  vhost-user: save features if the char dev is closed

Alberto Garcia (1):
  qcow2: Fix the calculation of the maximum L2 cache size

Anthony PERARD (1):
  xen-bus: Fix backend state transition on device reset

Aurelien Jarno (1):
  target/alpha: fix tlb_fill trap_arg2 value for instruction fetch

Christophe Lyon (1):
  target/arm: Allow reading flags from FPSCR for M-profile

David Hildenbrand (1):
  s390x/tcg: Fix VERIM with 32/64 bit elements

Eduardo Habkost (1):
  pc: Don't make die-id mandatory unless necessary

Fan Yang (1):
  COLO-compare: Fix incorrect `if` logic

Hikaru Nishida (1):
  ui: Fix hanging up Cocoa display on macOS 10.15 (Catalina)

Igor Mammedov (1):
  x86: do not advertise die-id in query-hotpluggbale-cpus if '-smp dies' is not set

Johannes Berg (1):
  libvhost-user: fix SLAVE_SEND_FD handling

John Snow (2):
  Revert "ide/ahci: Check for -ECANCELED in aio callbacks"
  iotests: add testing shim for script-style python tests

Kevin Wolf (4):
  coroutine: Add qemu_co_mutex_assert_locked()
  qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  block/snapshot: Restrict set of snapshot nodes
  iotests: Test internal snapshots with -blockdev

Markus Armbruster (1):
  pr-manager: Fix invalid g_free() crash bug

Matthew Rosato (1):
  s390: PCI: fix IOMMU region init

Max Filippov (1):
  target/xtensa: regenerate and re-import test_mmuhifi_c3 core

Max Reitz (16):
  block/file-posix: Reduce xfsctl() use
  iotests: Test reverse sub-cluster qcow2 writes
  vpc: Return 0 from vpc_co_create() on success
  iotests: Add supported protocols to execute_test()
  iotests: Restrict file Python tests to file
  iotests: Restrict nbd Python tests to nbd
  iotests: Test blockdev-create for vpc
  curl: Keep pointer to the CURLState in CURLSocket
  curl: Keep *socket until the end of curl_sock_cb()
  curl: Check completion in curl_multi_do()
  curl: Pass CURLSocket to curl_multi_do()
  curl: Report only ready sockets
  curl: Handle success in multi_check_completion
  qcow2: Limit total allocation range to INT_MAX
  iotests: Test large write request to qcow2 file
  mirror: Do not dereference invalid pointers

Maxim Levitsky (1):
  block/qcow2: Fix corruption introduced by commit 8ac0f15f335

Michael Roth (2):
  make-release: pull in edk2 submodules so we can build it from tarballs
  roms/Makefile.edk2: don't pull in submodules when building from tarball

Michael S. Tsirkin (1):
  virtio: new post_load hook

Mikhail Sennikovsky (1):
  virtio-net: prevent offloads reset on migration

Paolo Bonzini (2):
  dma-helpers: ensure AIO callback is invoked after cancellation
  scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)

Paul Durrant (1):
  xen-bus: check whether the frontend is active during device reset...

Peter Lieven (1):
  block/nfs: tear down aio before nfs_close

Peter Maydell (3):
  target/arm: Free TCG temps in trans_VMOV_64_sp()
  target/arm: Don't abort on M-profile exception return in linux-user mode
  hw/arm/boot.c: Set NSACR.{CP11,CP10} for NS kernel boots

Philippe Mathieu-Daudé (1):
  virtio-blk: Cancel the pending BH when the dataplane is reset

Sergio Lopez (1):
  blockjob: update nodes head while removing all bdrv

Thomas Huth (1):
  hw/core/loader: Fix possible crash in rom_copy()

Vladimir Sementsov-Ogievskiy (4):
  block/backup: fix max_transfer handling for copy_range
  block/backup: fix backup_cow_with_offload for last cluster
  util/hbitmap: strict hbitmap_reset
  hbitmap: handle set/reset with z

[PATCH 14/55] iotests: Add supported protocols to execute_test()

2019-11-05 Thread Michael Roth

From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 88d2aa533a4a1aad44a27c2e6cd5bc5fbcbce7ed)
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/iotests.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 25c5a047b3..2f7edc2f33 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -855,7 +855,8 @@ def execute_unittest(output, verbosity, debug):
 
 def execute_test(test_function=None,
                  supported_fmts=[], supported_oses=['linux'],
-                 supported_cache_modes=[], unsupported_fmts=[]):
+                 supported_cache_modes=[], unsupported_fmts=[],
+                 supported_protocols=[], unsupported_protocols=[]):
     """Run either unittest or script-style tests."""
 
     # We are using TEST_DIR and QEMU_DEFAULT_MACHINE as proxies to
@@ -869,6 +870,7 @@ def execute_test(test_function=None,
     debug = '-d' in sys.argv
     verbosity = 1
     verify_image_format(supported_fmts, unsupported_fmts)
+    verify_protocol(supported_protocols, unsupported_protocols)
     verify_platform(supported_oses)
     verify_cache_mode(supported_cache_modes)
 
-- 
2.17.1

[PATCH 23/55] curl: Keep pointer to the CURLState in CURLSocket

2019-11-05 Thread Michael Roth

From: Max Reitz 

A follow-up patch will make curl_multi_do() and curl_multi_read() take a
CURLSocket instead of the CURLState.  They still need the latter,
though, so add a pointer to it to the former.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Reviewed-by: John Snow 
Message-id: 20190910124136.10565-2-mre...@redhat.com
Reviewed-by: Maxim Levitsky 
Signed-off-by: Max Reitz 
(cherry picked from commit 0487861685294660b23bc146e1ebd5304aa8bbe0)
Signed-off-by: Michael Roth 
---
 block/curl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/curl.c b/block/curl.c
index d4c8e94f3e..92dc2f630e 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -80,6 +80,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
 
 struct BDRVCURLState;
+struct CURLState;
 
 static bool libcurl_initialized;
 
@@ -97,6 +98,7 @@ typedef struct CURLAIOCB {
 
 typedef struct CURLSocket {
     int fd;
+    struct CURLState *state;
     QLIST_ENTRY(CURLSocket) next;
 } CURLSocket;
 
@@ -180,6 +182,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     if (!socket) {
         socket = g_new0(CURLSocket, 1);
         socket->fd = fd;
+        socket->state = state;
         QLIST_INSERT_HEAD(&state->sockets, socket, next);
     }
     socket = NULL;
-- 
2.17.1

[PATCH 30/55] block/qcow2: Fix corruption introduced by commit 8ac0f15f335

2019-11-05 Thread Michael Roth

From: Maxim Levitsky 

This fixes a subtle corruption introduced by luks threaded encryption
in commit 8ac0f15f335.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922

The corruption happens when we do a write that
   * writes to two or more unallocated clusters at once
   * doesn't fully cover the first sector
   * doesn't fully cover the last sector
   * uses luks encryption

In this case, when allocating the new clusters we COW both areas
prior to the write and after the write, and we encrypt them.

The above mentioned commit accidentally made it so we encrypt the
second COW area using the physical cluster offset of the first area.

The problem is that offset_in_cluster in do_perform_cow_encrypt
can be larger than the cluster size, thus cluster_offset
will no longer point to the start of the cluster at which encrypted
area starts.
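
The arithmetic can be checked with a small worked example. This is a sketch in Python under stated assumptions, not the QEMU code: the 64 KiB cluster size and the host offsets are made-up illustration values, and start_of_cluster() is modelled as rounding down to a cluster boundary.

```python
CLUSTER_SIZE = 64 * 1024  # assumed cluster size (a power of two)


def start_of_cluster(offset, cluster_size=CLUSTER_SIZE):
    # Round down to the nearest cluster boundary.
    return offset & ~(cluster_size - 1)


# Assumed layout: two contiguous clusters are allocated, the first at
# host offset 0x50000. The second COW area starts 0x200 bytes into the
# second cluster, so its offset relative to the first cluster exceeds
# CLUSTER_SIZE.
cluster_offset = 0x50000
offset_in_cluster = CLUSTER_SIZE + 0x200

buggy = cluster_offset  # what was passed before the fix
fixed = start_of_cluster(cluster_offset + offset_in_cluster)

print(hex(buggy), hex(fixed))
```

With these values the old argument still names the first cluster, while the rounded-down sum names the cluster that actually contains the second COW area.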

Next patch in this series will refactor the code to avoid all these
assumptions.

In the bug report, the corruption was triggered by rebasing a luks image
onto a new, zero-filled base, which performs a lot of such writes and
causes some files with zero areas to contain garbage there instead.
But as described above, it can happen elsewhere as well.

Signed-off-by: Maxim Levitsky 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20190915203655.21638-2-mlevi...@redhat.com
Reviewed-by: Max Reitz 
Signed-off-by: Max Reitz 
(cherry picked from commit 38e7d54bdc518b5a05a922467304bcace2396945)
Signed-off-by: Michael Roth 
---
 block/qcow2-cluster.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index cc5609e27a..760564c8fb 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -473,9 +473,10 @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
         assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
         assert((bytes & ~BDRV_SECTOR_MASK) == 0);
         assert(s->crypto);
-        if (qcow2_co_encrypt(bs, cluster_offset,
-                             src_cluster_offset + offset_in_cluster,
-                             buffer, bytes) < 0) {
+        if (qcow2_co_encrypt(bs,
+                start_of_cluster(s, cluster_offset + offset_in_cluster),
+                src_cluster_offset + offset_in_cluster,
+                buffer, bytes) < 0) {
             return false;
         }
     }
-- 
2.17.1

[PATCH 15/55] iotests: Restrict file Python tests to file

2019-11-05 Thread Michael Roth

From: Max Reitz 

Most of our Python unittest-style tests only support the file protocol.
You can run them with any other protocol, but the test will simply
ignore your choice and use file anyway.

We should let them signal that they require the file protocol so they
are skipped when you want to test some other protocol.
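
As a rough illustration of the intended behaviour, here is a minimal sketch of such a protocol check. It is not the iotests.py implementation: the `NotRun` exception, the explicit `current` parameter and the `would_run()` helper are assumptions made to keep the example self-contained.

```python
class NotRun(Exception):
    """Raised in this sketch where a real test would be reported as skipped."""


def verify_protocol(current, supported=(), unsupported=()):
    # Hypothetical check: an empty `supported` means "no restriction".
    if supported and current not in supported:
        raise NotRun('not suitable for this protocol: %s' % current)
    if current in unsupported:
        raise NotRun('not suitable for this protocol: %s' % current)


def would_run(current, **kwargs):
    # Helper for the demonstration only.
    try:
        verify_protocol(current, **kwargs)
        return True
    except NotRun:
        return False


print(would_run('file', supported=('file',)))
print(would_run('nbd', supported=('file',)))
print(would_run('nbd'))
```

A test that declares only the file protocol is skipped under nbd, while a test that declares nothing keeps running everywhere, which is the behaviour the tests had before.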

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 103cbc771e5660d1f5bb458be80aa9e363547ae0)
 Conflicts:
tests/qemu-iotests/257
*drop changes for tests not in 4.1.0
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/030 | 3 ++-
 tests/qemu-iotests/040 | 3 ++-
 tests/qemu-iotests/041 | 3 ++-
 tests/qemu-iotests/044 | 3 ++-
 tests/qemu-iotests/045 | 3 ++-
 tests/qemu-iotests/055 | 3 ++-
 tests/qemu-iotests/056 | 3 ++-
 tests/qemu-iotests/057 | 3 ++-
 tests/qemu-iotests/065 | 3 ++-
 tests/qemu-iotests/096 | 3 ++-
 tests/qemu-iotests/118 | 3 ++-
 tests/qemu-iotests/124 | 3 ++-
 tests/qemu-iotests/129 | 3 ++-
 tests/qemu-iotests/132 | 3 ++-
 tests/qemu-iotests/139 | 3 ++-
 tests/qemu-iotests/148 | 3 ++-
 tests/qemu-iotests/151 | 3 ++-
 tests/qemu-iotests/152 | 3 ++-
 tests/qemu-iotests/155 | 3 ++-
 tests/qemu-iotests/163 | 3 ++-
 tests/qemu-iotests/165 | 3 ++-
 tests/qemu-iotests/169 | 3 ++-
 tests/qemu-iotests/196 | 3 ++-
 tests/qemu-iotests/199 | 3 ++-
 tests/qemu-iotests/245 | 3 ++-
 25 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 1b69f318c6..f3766f2a81 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -957,4 +957,5 @@ class TestSetSpeed(iotests.QMPTestCase):
 self.cancel_and_wait(resume=True)
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2', 'qed'])
+iotests.main(supported_fmts=['qcow2', 'qed'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index aa0b1847e3..f9e603e715 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -433,4 +433,5 @@ class TestReopenOverlay(ImageCommitTestCase):
 self.run_commit_test(self.img1, self.img0)
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2', 'qed'])
+iotests.main(supported_fmts=['qcow2', 'qed'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 26bf1701eb..ae6ed952c6 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -1068,4 +1068,5 @@ class TestOrphanedSource(iotests.QMPTestCase):
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2', 'qed'])
+iotests.main(supported_fmts=['qcow2', 'qed'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/044 b/tests/qemu-iotests/044
index 9ec3dba734..05ea1f49c5 100755
--- a/tests/qemu-iotests/044
+++ b/tests/qemu-iotests/044
@@ -118,4 +118,5 @@ class TestRefcountTableGrowth(iotests.QMPTestCase):
 pass
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'])
+iotests.main(supported_fmts=['qcow2'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/045 b/tests/qemu-iotests/045
index d5484a0ee1..01cc038884 100755
--- a/tests/qemu-iotests/045
+++ b/tests/qemu-iotests/045
@@ -175,4 +175,5 @@ class TestSCMFd(iotests.QMPTestCase):
 "File descriptor named '%s' not found" % fdname)
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['raw'])
+iotests.main(supported_fmts=['raw'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 3437c11507..c732a112d6 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -563,4 +563,5 @@ class TestDriveCompression(iotests.QMPTestCase):
 target='drive1')
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['raw', 'qcow2'])
+iotests.main(supported_fmts=['raw', 'qcow2'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index e761e465ae..98c55d8e5a 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -335,4 +335,5 @@ class BackupTest(iotests.QMPTestCase):
 self.dismissal_failure(True)
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2', 'qed'])
+iotests.main(supported_fmts=['qcow2', 'qed'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/057 b/tests/qemu-iotests/057
index 9f0a5a3057..9fbba759b6 100755
--- a/tests/qemu-iotests/057
+++ b/tests/qemu-iotests/057
@@ -256,4 +256,5 @@ class TestSnapshotDelete(ImageSnapshotTestCase):
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'])
+iotests.main(supported_fmts=['qcow2'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/06

[PATCH 01/55] dma-helpers: ensure AIO callback is invoked after cancellation

2019-11-05 Thread Michael Roth

From: Paolo Bonzini 

dma_aio_cancel unschedules the BH if there is one, which corresponds
to the reschedule_dma case of dma_blk_cb.  This can stall the DMA
permanently, because dma_complete will never get invoked and therefore
nobody will ever invoke the original AIO callback in dbs->common.cb.

Fix this by invoking the callback (which is ensured to happen after
a bdrv_aio_cancel_async, or done manually in the dbs->bh case), and
add assertions to check that the DMA state machine is indeed waiting
for dma_complete or reschedule_dma, but never both.
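
The cancellation logic can be modelled as a small state machine. This is a hedged sketch in Python, not the dma-helpers.c code: `DmaRequest` and its fields are hypothetical stand-ins for DMAAIOCB, dbs->acb, dbs->bh and dbs->common.cb.

```python
import errno


class DmaRequest:
    """Hypothetical model of DMAAIOCB: at most one of acb/bh is pending."""

    def __init__(self, cb):
        self.cb = cb      # completion callback, like dbs->common.cb
        self.acb = None   # in-flight block request, if any
        self.bh = None    # scheduled retry (bottom half), if any

    def cancel(self):
        # The request waits for I/O completion or for a retry, never both.
        assert not (self.acb and self.bh)
        if self.acb:
            # Cancelling the I/O completes it, which reaches the callback.
            self.acb.cancel()
            return
        if self.bh:
            self.bh = None  # unschedule the retry
        # Nothing else will complete the request, so report it here.
        if self.cb:
            self.cb(-errno.ECANCELED)


results = []
req = DmaRequest(results.append)
req.bh = object()  # waiting for a retry, no I/O in flight
req.cancel()
print(results == [-errno.ECANCELED], req.bh is None)
```

The point of the sketch is the last branch: without it, a request cancelled while only a retry was scheduled would never see its callback.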

Reported-by: John Snow 
Signed-off-by: Paolo Bonzini 
Message-id: 20190729213416.1972-1-pbonz...@redhat.com
Signed-off-by: John Snow 
(cherry picked from commit 539343c0a47e19d5dd64d846d64d084d9793681f)
Signed-off-by: Michael Roth 
---
 dma-helpers.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index 2d7e02d35e..d3871dc61e 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -90,6 +90,7 @@ static void reschedule_dma(void *opaque)
 {
     DMAAIOCB *dbs = (DMAAIOCB *)opaque;
 
+    assert(!dbs->acb && dbs->bh);
     qemu_bh_delete(dbs->bh);
     dbs->bh = NULL;
     dma_blk_cb(dbs, 0);
@@ -111,15 +112,12 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
 {
     trace_dma_complete(dbs, ret, dbs->common.cb);
 
+    assert(!dbs->acb && !dbs->bh);
     dma_blk_unmap(dbs);
     if (dbs->common.cb) {
         dbs->common.cb(dbs->common.opaque, ret);
     }
     qemu_iovec_destroy(&dbs->iov);
-    if (dbs->bh) {
-        qemu_bh_delete(dbs->bh);
-        dbs->bh = NULL;
-    }
     qemu_aio_unref(dbs);
 }
 
@@ -179,14 +177,21 @@ static void dma_aio_cancel(BlockAIOCB *acb)
 
     trace_dma_aio_cancel(dbs);
 
+    assert(!(dbs->acb && dbs->bh));
     if (dbs->acb) {
+        /* This will invoke dma_blk_cb.  */
         blk_aio_cancel_async(dbs->acb);
+        return;
     }
+
     if (dbs->bh) {
         cpu_unregister_map_client(dbs->bh);
         qemu_bh_delete(dbs->bh);
         dbs->bh = NULL;
     }
+    if (dbs->common.cb) {
+        dbs->common.cb(dbs->common.opaque, -ECANCELED);
+    }
 }
 
 static AioContext *dma_get_aio_context(BlockAIOCB *acb)
-- 
2.17.1

[PATCH 11/55] x86: do not advertise die-id in query-hotpluggbale-cpus if '-smp dies' is not set

2019-11-05 Thread Michael Roth

From: Igor Mammedov 

Commit 176d2cda0 (i386/cpu: Consolidate die-id validity in smp context) added
a new 'die-id' topology property to CPUs and exposed it via the QMP command
query-hotpluggable-cpus, which broke -device/device_add cpu-foo for existing
users that do not support die-id/dies yet. That would be fine if it happened
to new machine types only, but it also happened to old machine types,
which breaks migration from old QEMU to the new one. For example, the following CLI:

  OLD-QEMU -M pc-i440fx-4.0 -smp 1,max_cpus=2 \
   -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id
is not able to start with new QEMU, complaining about invalid die-id.

After the regression was discovered, the patch
   "pc: Don't make die-id mandatory unless necessary"
made die-id optional so that the old CLI would work.

However, that is not enough, as new QEMU still exposes die-id via the
query-hotpluggable-cpus QMP command, so users that started an old machine
type on new QEMU, using all properties (including die-id) received from
the QMP command (as required), won't be able to start old QEMU using the
same properties since it doesn't support die-id.

Fix it by hiding die-id in query-hotpluggable-cpus for all machine types
when '-smp dies' is not provided on the CLI or '-smp dies=1' is used, in
which case smp_dies == 1 and the APIC ID is calculated in the default way
(as it was before DIE support), so we won't need compat code, as in both
cases the topology provided to the guest via CPUID is the same.
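
The resulting behaviour can be sketched as follows. This is a simplified Python model, not the pc.c code: `possible_cpu_props()` and its flat enumeration of the topology are assumptions for illustration only.

```python
def possible_cpu_props(sockets, dies, cores, threads):
    """Hypothetical model of the properties reported for each CPU slot."""
    cpus = []
    for s in range(sockets):
        for d in range(dies):
            for c in range(cores):
                for t in range(threads):
                    props = {'socket-id': s, 'core-id': c, 'thread-id': t}
                    if dies > 1:
                        # Only advertise die-id when dies were requested.
                        props['die-id'] = d
                    cpus.append(props)
    return cpus


print(possible_cpu_props(2, 1, 1, 1))
print(possible_cpu_props(1, 2, 1, 1))
```

With dies == 1 the reported property set matches what an older QEMU understands, so the same properties can be replayed on either side of a migration.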

Signed-off-by: Igor Mammedov 
Message-Id: <20190902120222.6179-1-imamm...@redhat.com>
Reviewed-by: Eduardo Habkost 
Signed-off-by: Eduardo Habkost 
(cherry picked from commit c6c1bb89fb46f3b88f832e654cf5a6f7941aac51)
Signed-off-by: Michael Roth 
---
 hw/i386/pc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 947f81070f..d011733ff7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2887,8 +2887,10 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
                                  ms->smp.threads, &topo);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
         ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
-        ms->possible_cpus->cpus[i].props.has_die_id = true;
-        ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
+        if (pcms->smp_dies > 1) {
+            ms->possible_cpus->cpus[i].props.has_die_id = true;
+            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
+        }
         ms->possible_cpus->cpus[i].props.has_core_id = true;
         ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
-- 
2.17.1

[PATCH 17/55] iotests: Test blockdev-create for vpc

2019-11-05 Thread Michael Roth

From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
(cherry picked from commit cb73747e1a47b93d3dfdc3f769c576b053916938)
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/266 | 153 +
 tests/qemu-iotests/266.out | 137 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 291 insertions(+)
 create mode 100755 tests/qemu-iotests/266
 create mode 100644 tests/qemu-iotests/266.out

diff --git a/tests/qemu-iotests/266 b/tests/qemu-iotests/266
new file mode 100755
index 00..5b35cd67e4
--- /dev/null
+++ b/tests/qemu-iotests/266
@@ -0,0 +1,153 @@
+#!/usr/bin/env python
+#
+# Test VPC and file image creation
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+from iotests import imgfmt
+
+
+def blockdev_create(vm, options):
+    result = vm.qmp_log('blockdev-create', job_id='job0', options=options,
+                        filters=[iotests.filter_qmp_testfiles])
+
+    if 'return' in result:
+        assert result['return'] == {}
+        vm.run_job('job0')
+
+
+# Successful image creation (defaults)
+def implicit_defaults(vm, file_path):
+    iotests.log("=== Successful image creation (defaults) ===")
+    iotests.log("")
+
+    # 8 heads, 964 cyls/head, 17 secs/cyl
+    # (Close to 64 MB)
+    size = 8 * 964 * 17 * 512
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': size })
+
+
+# Successful image creation (explicit defaults)
+def explicit_defaults(vm, file_path):
+    iotests.log("=== Successful image creation (explicit defaults) ===")
+    iotests.log("")
+
+    # 16 heads, 964 cyls/head, 17 secs/cyl
+    # (Close to 128 MB)
+    size = 16 * 964 * 17 * 512
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': size,
+                          'subformat': 'dynamic',
+                          'force-size': False })
+
+
+# Successful image creation (non-default options)
+def non_defaults(vm, file_path):
+    iotests.log("=== Successful image creation (non-default options) ===")
+    iotests.log("")
+
+    # Not representable in CHS (fine with force-size=True)
+    size = 1048576
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': size,
+                          'subformat': 'fixed',
+                          'force-size': True })
+
+
+# Size not representable in CHS with force-size=False
+def non_chs_size_without_force(vm, file_path):
+    iotests.log("=== Size not representable in CHS ===")
+    iotests.log("")
+
+    # Not representable in CHS (will not work with force-size=False)
+    size = 1048576
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': size,
+                          'force-size': False })
+
+
+# Zero size
+def zero_size(vm, file_path):
+    iotests.log("=== Zero size===")
+    iotests.log("")
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': 0 })
+
+
+# Maximum CHS size
+def maximum_chs_size(vm, file_path):
+    iotests.log("=== Maximum CHS size===")
+    iotests.log("")
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': 16 * 65535 * 255 * 512 })
+
+
+# Actual maximum size
+def maximum_size(vm, file_path):
+    iotests.log("=== Actual maximum size===")
+    iotests.log("")
+
+    blockdev_create(vm, { 'driver': imgfmt,
+                          'file': 'protocol-node',
+                          'size': 0xff00 * 512,
+                          'force-size': True })
+
+
+def main():
+    for test_func in [implicit_defaults, explicit_defaults, non_defaults,
+                      non_chs_size_without_force, zero_size, maximum_chs_size,
+                      maximum_size]:
+
+        with iotests.FilePath('t.vpc') as file_path, \
+             iotests.VM() as vm:
+
+            vm.launch()
+
+            iotests.log('--- Creating empty file ---')
+            blockdev_create(vm, { 'driver': 'file',
+                                  'filename': file_path,

[PATCH 16/55] iotests: Restrict nbd Python tests to nbd

2019-11-05 Thread Michael Roth

From: Max Reitz 

We have two Python unittest-style tests that test NBD.  As such, they
should specify supported_protocols=['nbd'] so they are skipped when the
user wants to test some other protocol.

Furthermore, we should restrict their choice of formats to 'raw'.  The
idea of a protocol/format combination is to use some format over some
protocol; but we always use the raw format over NBD.  It does not really
matter what the NBD server uses on its end, and it is not a useful test
of the respective format driver anyway.

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 7c932a1d69a6d6ac5c0b615c11d191da3bbe9aa8)
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/147 | 5 ++---
 tests/qemu-iotests/205 | 3 ++-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
index 2d84fddb01..ab8480b9a4 100755
--- a/tests/qemu-iotests/147
+++ b/tests/qemu-iotests/147
@@ -287,6 +287,5 @@ class BuiltinNBD(NBDBlockdevAddBase):
 
 
 if __name__ == '__main__':
-# Need to support image creation
-iotests.main(supported_fmts=['vpc', 'parallels', 'qcow', 'vdi', 'qcow2',
- 'vmdk', 'raw', 'vhdx', 'qed'])
+iotests.main(supported_fmts=['raw'],
+ supported_protocols=['nbd'])
diff --git a/tests/qemu-iotests/205 b/tests/qemu-iotests/205
index b8a86c446e..76f6c5fa2b 100755
--- a/tests/qemu-iotests/205
+++ b/tests/qemu-iotests/205
@@ -153,4 +153,5 @@ class TestNbdServerRemove(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['generic'])
+iotests.main(supported_fmts=['raw'],
+ supported_protocols=['nbd'])
-- 
2.17.1

[PATCH 09/55] iotests: Test reverse sub-cluster qcow2 writes

2019-11-05 Thread Michael Roth

From: Max Reitz 

This exercises the regression introduced in commit
50ba5b2d994853b38fed10e0841b119da0f8b8e5.  On my machine, it has close
to a 50 % false-negative rate, but that should still be sufficient to
test the fix.
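
The write pattern and the content the test expects afterwards can be modelled in a few lines. This is only a sketch in Python of the I/O pattern (the test itself is a bash script driving qemu-io); the in-memory `image` buffer is an assumption standing in for the qcow2 image, so it cannot reproduce the regression itself.

```python
CLUSTER_KB = 4
IMAGE_KB = 1024

# Same order as the test: last cluster first, 2k written at +1k in each.
writes = [(kb + 1, 2) for kb in range(IMAGE_KB - CLUSTER_KB, -1, -CLUSTER_KB)]

image = bytearray(IMAGE_KB * 1024)  # a fresh image reads back as zeroes
for start_kb, len_kb in writes:
    start = start_kb * 1024
    image[start:start + len_kb * 1024] = bytes([42]) * (len_kb * 1024)

# Expected layout per 4k cluster: 1k of zeroes, 2k of 42, 1k of zeroes.
expected_cluster = bytes(1024) + bytes([42]) * 2048 + bytes(1024)
ok = all(image[off:off + 4096] == expected_cluster
         for off in range(0, len(image), 4096))
print(len(writes), writes[0], writes[-1], ok)
```

Each write covers only the middle of a cluster, so every cluster needs copy-on-write at both its head and its tail, and the requests arrive in descending offset order.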

Signed-off-by: Max Reitz 
Reviewed-by: Stefano Garzarella 
Reviewed-by: John Snow 
Tested-by: Stefano Garzarella 
Tested-by: John Snow 
Signed-off-by: Kevin Wolf 
(cherry picked from commit ae6ef0190981a21f2d4bc8dcee7253688f14fae7)
 Conflicts:
tests/qemu-iotests/group
*fix context deps on tests not in 4.1.0
Signed-off-by: Michael Roth 
---
 tests/qemu-iotests/265 | 67 ++
 tests/qemu-iotests/265.out |  6 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 74 insertions(+)
 create mode 100755 tests/qemu-iotests/265
 create mode 100644 tests/qemu-iotests/265.out

diff --git a/tests/qemu-iotests/265 b/tests/qemu-iotests/265
new file mode 100755
index 00..dce6f77be3
--- /dev/null
+++ b/tests/qemu-iotests/265
@@ -0,0 +1,67 @@
+#!/usr/bin/env bash
+#
+# Test reverse-ordered qcow2 writes on a sub-cluster level
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq=$(basename $0)
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# qcow2-specific test
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+echo '--- Writing to the image ---'
+
+# Reduce cluster size so we get more and quicker I/O
+IMGOPTS='cluster_size=4096' _make_test_img 1M
+(for ((kb = 1024 - 4; kb >= 0; kb -= 4)); do \
+ echo "aio_write -P 42 $((kb + 1))k 2k"; \
+ done) \
+ | $QEMU_IO "$TEST_IMG" > /dev/null
+
+echo '--- Verifying its content ---'
+
+(for ((kb = 0; kb < 1024; kb += 4)); do \
+echo "read -P 0 ${kb}k 1k"; \
+echo "read -P 42 $((kb + 1))k 2k"; \
+echo "read -P 0 $((kb + 3))k 1k"; \
+ done) \
+ | $QEMU_IO "$TEST_IMG" | _filter_qemu_io | grep 'verification'
+
+# Status of qemu-io
+if [ ${PIPESTATUS[1]} = 0 ]; then
+echo 'Content verified.'
+fi
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/265.out b/tests/qemu-iotests/265.out
new file mode 100644
index 00..6eac620f25
--- /dev/null
+++ b/tests/qemu-iotests/265.out
@@ -0,0 +1,6 @@
+QA output created by 265
+--- Writing to the image ---
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+--- Verifying its content ---
+Content verified.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index f13e5f2e23..468458efb1 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -271,3 +271,4 @@
 254 rw backing quick
 255 rw quick
 256 rw quick
+265 rw auto quick
-- 
2.17.1

[PATCH 12/55] vpc: Return 0 from vpc_co_create() on success

2019-11-05 Thread Michael Roth

From: Max Reitz 

blockdev_create_run() directly uses .bdrv_co_create()'s return value as
the job's return value.  Jobs must return 0 on success, not just any
nonnegative value.  Therefore, using blockdev-create for VPC images may
currently fail as the vpc driver may return a positive integer.

Because there is no point in returning a positive integer anywhere in
the block layer (all non-negative integers are generally treated as
complete success), we probably do not want to add more such cases.
Therefore, fix this problem by making the vpc driver always return 0 in
case of success.
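
The rule can be illustrated with a tiny model. This is a sketch in Python, not the block layer code: `job_succeeded()` and the two `create_image_*()` functions are hypothetical, and the positive value 512 is an arbitrary example of a helper's non-negative result leaking out.

```python
def job_succeeded(ret):
    # Model of the job layer's rule described above: only 0 means success.
    return ret == 0


def create_image_old(helper_result):
    # Hypothetical driver that leaks the last helper's non-negative result.
    return helper_result


def create_image_fixed(helper_result):
    ret = helper_result
    if ret < 0:
        return ret  # propagate errors unchanged
    return 0        # normalize every success to 0


print(job_succeeded(create_image_old(512)),
      job_succeeded(create_image_fixed(512)))
```

Normalizing at the driver keeps the job layer's check simple and avoids teaching every caller that positive values also mean success.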

Suggested-by: Kevin Wolf 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
(cherry picked from commit 1a37e3124407b5a145d44478d3ecbdb89c63789f)
Signed-off-by: Michael Roth 
---
 block/vpc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/vpc.c b/block/vpc.c
index d4776ee8a5..3a88e28e2b 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -885,6 +885,7 @@ static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
 goto fail;
 }
 
+ret = 0;
  fail:
 return ret;
 }
@@ -908,7 +909,7 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
-return ret;
+return 0;
 }
 
 static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
-- 
2.17.1
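
For readers skimming the archive, the return-value convention this patch enforces can be sketched outside QEMU. This is a minimal illustration, not code from block/vpc.c: demo_write, demo_create_leaky, demo_create_fixed and demo_job_succeeded are made-up names, and the assumption that a helper reports a positive byte count on success is only there to reproduce the symptom described in the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a helper that reports a positive value
 * (here: a byte count) on success and a negative errno on failure. */
int demo_write(int nbytes, int inject_error)
{
    assert(nbytes > 0);
    return inject_error ? -EIO : nbytes;
}

/* Shape of the code before the fix: the helper's positive success
 * value falls through the label and leaks out to the caller. */
int demo_create_leaky(int inject_error)
{
    int ret = demo_write(512, inject_error);
    if (ret < 0) {
        goto fail;
    }
 fail:
    return ret;
}

/* Shape of the code after the fix: success is normalized to 0,
 * errors are passed through unchanged. */
int demo_create_fixed(int inject_error)
{
    int ret = demo_write(512, inject_error);
    if (ret < 0) {
        goto fail;
    }
    ret = 0;
 fail:
    return ret;
}

/* A job-style caller that, as the commit message describes,
 * accepts only 0 as success. */
int demo_job_succeeded(int ret)
{
    return ret == 0;
}
```

With these stand-ins, demo_job_succeeded(demo_create_leaky(0)) is false even though nothing failed, while demo_job_succeeded(demo_create_fixed(0)) is true. Error codes come back unchanged from both variants.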

Re: [PATCH v1 2/4] virtio: make seg_max virtqueue size dependent

2019-11-05 Thread Michael S. Tsirkin

On Tue, Nov 05, 2019 at 07:11:03PM +0300, Denis Plotnikov wrote:
> seg_max is restricted to be less than or equal to the virtqueue size
> according to the Virtio 1.0 specification.
> 
> Although seg_max can't be set directly, it's worth expressing this
> dependency directly in the code for sanity purposes.
> 
> Signed-off-by: Denis Plotnikov 

This is guest visible so needs to be machine type dependent, right?

> ---
>  hw/block/virtio-blk.c | 2 +-
>  hw/scsi/virtio-scsi.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 06e57a4d39..21530304cf 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -903,7 +903,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
>  blk_get_geometry(s->blk, &capacity);
>  memset(&blkcfg, 0, sizeof(blkcfg));
>  virtio_stq_p(vdev, &blkcfg.capacity, capacity);
> -virtio_stl_p(vdev, &blkcfg.seg_max, 128 - 2);
> +virtio_stl_p(vdev, &blkcfg.seg_max, s->conf.queue_size - 2);
>  virtio_stw_p(vdev, &blkcfg.geometry.cylinders, conf->cyls);
>  virtio_stl_p(vdev, &blkcfg.blk_size, blk_size);
>  virtio_stw_p(vdev, &blkcfg.min_io_size, conf->min_io_size / blk_size);
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 839f120256..f7e5533cd5 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -650,7 +650,7 @@ static void virtio_scsi_get_config(VirtIODevice *vdev,
>  VirtIOSCSICommon *s = VIRTIO_SCSI_COMMON(vdev);
>  
>  virtio_stl_p(vdev, &scsiconf->num_queues, s->conf.num_queues);
> -virtio_stl_p(vdev, &scsiconf->seg_max, 128 - 2);
> +virtio_stl_p(vdev, &scsiconf->seg_max, s->conf.virtqueue_size - 2);
>  virtio_stl_p(vdev, &scsiconf->max_sectors, s->conf.max_sectors);
>  virtio_stl_p(vdev, &scsiconf->cmd_per_lun, s->conf.cmd_per_lun);
>  virtio_stl_p(vdev, &scsiconf->event_info_size, sizeof(VirtIOSCSIEvent));
> -- 
> 2.17.0
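
The arithmetic in the quoted patch, and the machine-type concern raised above, can be sketched as follows. This is only an illustration under stated assumptions: demo_seg_max and demo_seg_max_compat are hypothetical names, and the seg_max_adjust flag is an invented stand-in for whatever compat mechanism would keep older machine types on the old guest-visible value. The thread itself does not say how that would be done.

```c
#include <assert.h>
#include <stdint.h>

/* seg_max derived from the queue size, subtracting 2 exactly as the
 * quoted patch does (queue_size - 2 instead of the fixed 128 - 2). */
uint32_t demo_seg_max(uint32_t queue_size)
{
    assert(queue_size > 2);   /* avoid unsigned wrap-around */
    return queue_size - 2;
}

/* Hypothetical compat switch: when seg_max_adjust is 0, keep the
 * previous hard-coded value so the guest-visible config does not
 * change for existing machine types. */
uint32_t demo_seg_max_compat(uint32_t queue_size, int seg_max_adjust)
{
    return seg_max_adjust ? demo_seg_max(queue_size) : 128 - 2;
}
```

For a queue size of 256, the adjusted path gives 254 while the compat path stays at 126, which is the guest-visible difference the question above is about.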

Re: [PATCH 3/3] dp8393x: fix receiving buffer exhaustion

2019-11-05 Thread Laurent Vivier

On 05/11/2019 at 21:45, Hervé Poussineau wrote:
> On 02/11/2019 at 18:15, Laurent Vivier wrote:
>> The card is not able to exit from exhaustion state, because
>> while the driver consumes the buffers, the RRP is incremented
>> (when the driver clears the ISR RBE bit), so it stays equal
>> to RWP, and while RRP == RWP, the card thinks it is always
>> in exhaustion state. So the driver consumes all the buffers,
>> but the card cannot receive new ones.
>>
>> This patch fixes the problem by not incrementing RRP when
>> the driver clears the ISR RBE bit.
>>
>> Signed-off-by: Laurent Vivier 
>> ---
>>   hw/net/dp8393x.c | 31 ---
>>   1 file changed, 16 insertions(+), 15 deletions(-)
> 
> I checked the DP83932C specification, available at
> https://www.eit.lth.se/fileadmin/eit/courses/datablad/Periphery/Communication/DP83932C.pdf
> 
> 
> In the Buffer Resources Exhausted section (page 20):
> "To continue reception after the last RBA is used, the system must supply
> additional RRA descriptor(s), update the RWP register, and clear the RBE
> bit in the ISR. The SONIC rereads the RRA after this bit is cleared."
> 
> If I understand correctly, if the OS first updates the RWP and then
> clears the RBE bit, then RRP should be different from RWP in
> dp8393x_do_read_rra()? Or did I miss something?

No, I found that by debugging the problem, I didn't have the
documentation. I'll rework this patch, relying on the doc to better
understand the problem.

Thanks,
Laurent
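
The ordering question discussed above can be played through with a toy model. This is a sketch under assumptions, not the QEMU device: struct demo_rra, demo_read_rra and demo_driver_refill are hypothetical names, the pointers are plain offsets, and only the advance, wrap and exhaustion-check steps visible in the diff are modelled.

```c
#include <stdint.h>

/* Toy model of the receive resource area pointers. */
struct demo_rra {
    uint16_t rsa;   /* start of the resource area */
    uint16_t rea;   /* end of the resource area */
    uint16_t rrp;   /* read pointer, advanced by the device */
    uint16_t rwp;   /* write pointer, advanced by the driver */
    int rbe;        /* "receive buffers exhausted" flag */
};

/* Device side: advance past one descriptor of `size`, wrap at the
 * end of the area, then flag exhaustion when RRP catches up with RWP.
 * Same three steps, in the same order, as in the diff. */
void demo_read_rra(struct demo_rra *r, uint16_t size)
{
    r->rrp += size;
    if (r->rrp == r->rea) {
        r->rrp = r->rsa;
    }
    if (r->rrp == r->rwp) {
        r->rbe = 1;
    }
}

/* Driver side, in the order the quoted datasheet text lists:
 * supply descriptors by moving RWP first, then clear RBE. */
void demo_driver_refill(struct demo_rra *r, uint16_t new_rwp)
{
    r->rwp = new_rwp;
    r->rbe = 0;
}
```

In this model, once the driver has moved RWP before clearing RBE, RRP and RWP already differ at the moment the device rereads the RRA, so the next read does not set RBE again. Whether the emulated driver actually follows that order is the open point in the thread.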

Re: [PATCH v1 1/1] opensbi: Upgrade from v0.4 to v0.5

2019-11-05 Thread Palmer Dabbelt

On Tue, 05 Nov 2019 11:23:39 PST (-0800), alistai...@gmail.com wrote:
> On Tue, Oct 29, 2019 at 3:33 AM Alistair Francis  wrote:
>>
>> On Mon, Oct 28, 2019 at 5:56 PM Palmer Dabbelt  wrote:
>> >
>> > On Sat, 26 Oct 2019 01:46:45 PDT (-0700), phi...@redhat.com wrote:
>> > > On Sat, Oct 26, 2019 at 10:45 AM Philippe Mathieu-Daudé
>> > >  wrote:
>> > >>
>> > >> Hi Alistair,
>> > >>
>> > >> On 10/26/19 1:15 AM, Alistair Francis wrote:
>> > >> > This release has:
>> > >> >  Lot of critical fixes
>> > >> >  Hypervisor extension support
>> > >> >  SBI v0.2 base extension support
>> > >> >  Debug prints support
>> > >> >  Handle traps when doing unpriv load/store
>> > >> >  Allow compiling without FP support
>> > >> >  Use git describe to generate boot-time banner
>> > >> >  Andes AE350 platform support
>> > >>
>> > >> Do you mind amending the output of 'git shortlog v0.4..v0.5'?
>> > >
>> > > Err, this comment is for Palmer, if Alistair agrees (no need to repost).
>> >
>> > Works for me.  I've included the shortlog as part of the patch on my 
>> > for-master
>> > branch, unless there's any opposition I'll include this in my next PR.
>>
>> Sounds good!
>
> Ping! Just want to make sure this makes it into 4.2.

Thanks, I must have somehow dropped this after fixing up the commit message -- 
I'm up to three active laptops during the transition, so everything's a bit of 
a mess on my end.  The commit (with the updated message) is back:.

>
> Alistair
>
>>
>> Alistair
>>
>> >
>> > >
>> > >> >
>> > >> > Signed-off-by: Alistair Francis 
>> > >> > ---
>> > >> > You can get the branch from here if the binaries are causing issues:
>> > >> > https://github.com/alistair23/qemu/tree/mainline/alistair/opensbi.next
>> > >>
>> > >> You can use 'git format-patch --no-binary'.
>> > >>
>> > >> >
>> > >> >   pc-bios/opensbi-riscv32-virt-fw_jump.bin | Bin 36888 -> 40984 
>> > >> > bytes
>> > >> >   pc-bios/opensbi-riscv64-sifive_u-fw_jump.bin | Bin 45064 -> 49160 
>> > >> > bytes
>> > >> >   pc-bios/opensbi-riscv64-virt-fw_jump.bin | Bin 40968 -> 45064 
>> > >> > bytes
>> > >> >   roms/opensbi |   2 +-
>> > >> >   4 files changed, 1 insertion(+), 1 deletion(-)
>> > >> [...]
>> > >> > diff --git a/roms/opensbi b/roms/opensbi
>> > >> > index ce228ee091..be92da280d 160000
>> > >> > --- a/roms/opensbi
>> > >> > +++ b/roms/opensbi
>> > >> > @@ -1 +1 @@
>> > >> > -Subproject commit ce228ee0919deb9957192d723eecc8aaae2697c6
>> > >> > +Subproject commit be92da280d87c38a2e0adc5d3f43bab7b5468f09
>> > >> >

Re: [PATCH 3/3] dp8393x: fix receiving buffer exhaustion

2019-11-05 Thread Hervé Poussineau


On 02/11/2019 at 18:15, Laurent Vivier wrote:

The card is not able to exit from exhaustion state, because
while the driver consumes the buffers, the RRP is incremented
(when the driver clears the ISR RBE bit), so it stays equal
to RWP, and while RRP == RWP, the card thinks it is always
in exhaustion state. So the driver consumes all the buffers,
but the card cannot receive new ones.

This patch fixes the problem by not incrementing RRP when
the driver clears the ISR RBE bit.

Signed-off-by: Laurent Vivier 
---
  hw/net/dp8393x.c | 31 ---
  1 file changed, 16 insertions(+), 15 deletions(-)


I checked the DP83932C specification, available at
https://www.eit.lth.se/fileadmin/eit/courses/datablad/Periphery/Communication/DP83932C.pdf

In the Buffer Resources Exhausted section (page 20):
"To continue reception after the last RBA is used, the system must supply
additional RRA descriptor(s), update the RWP register, and clear the RBE
bit in the ISR. The SONIC rereads the RRA after this bit is cleared."

If I understand correctly, if the OS first updates the RWP and then clears
the RBE bit, then RRP should be different from RWP in
dp8393x_do_read_rra()? Or did I miss something?



diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index b8c4473f99..21deb32456 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -304,7 +304,7 @@ static void dp8393x_do_load_cam(dp8393xState *s)
  dp8393x_update_irq(s);
  }
  
-static void dp8393x_do_read_rra(dp8393xState *s)
+static void dp8393x_do_read_rra(dp8393xState *s, int next)
  {
  int width, size;
  
@@ -323,19 +323,20 @@ static void dp8393x_do_read_rra(dp8393xState *s)
  s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1],
  s->regs[SONIC_RBWC0], s->regs[SONIC_RBWC1]);
  
-/* Go to next entry */
-s->regs[SONIC_RRP] += size;
+if (next) {
+/* Go to next entry */
+s->regs[SONIC_RRP] += size;
  
-/* Handle wrap */
-if (s->regs[SONIC_RRP] == s->regs[SONIC_REA]) {
-s->regs[SONIC_RRP] = s->regs[SONIC_RSA];
-}
+/* Handle wrap */
+if (s->regs[SONIC_RRP] == s->regs[SONIC_REA]) {
+s->regs[SONIC_RRP] = s->regs[SONIC_RSA];
+}
  
-/* Check resource exhaustion */
-if (s->regs[SONIC_RRP] == s->regs[SONIC_RWP])
-{
-s->regs[SONIC_ISR] |= SONIC_ISR_RBE;
-dp8393x_update_irq(s);
+/* Check resource exhaustion */
+if (s->regs[SONIC_RRP] == s->regs[SONIC_RWP]) {
+s->regs[SONIC_ISR] |= SONIC_ISR_RBE;
+dp8393x_update_irq(s);
+}
  }
  
  /* Done */
@@ -549,7 +550,7 @@ static void dp8393x_do_command(dp8393xState *s, uint16_t command)
  if (command & SONIC_CR_RST)
  dp8393x_do_software_reset(s);
  if (command & SONIC_CR_RRRA)
-dp8393x_do_read_rra(s);
+dp8393x_do_read_rra(s, 1);
  if (command & SONIC_CR_LCAM)
  dp8393x_do_load_cam(s);
  }
@@ -640,7 +641,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
  data &= s->regs[reg];
  s->regs[reg] &= ~data;
  if (data & SONIC_ISR_RBE) {
-dp8393x_do_read_rra(s);
+dp8393x_do_read_rra(s, 0);
  }
  dp8393x_update_irq(s);
  if (dp8393x_can_receive(s->nic->ncs)) {
@@ -840,7 +841,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf,
  
  if (s->regs[SONIC_RCR] & SONIC_RCR_LPKT) {
  /* Read next RRA */
-dp8393x_do_read_rra(s);
+dp8393x_do_read_rra(s, 1);
  }
  }

Re: [PULL 0/1] Require Python >= 3.5 to build QEMU

2019-11-05 Thread Eduardo Habkost

On Tue, Nov 05, 2019 at 08:25:03PM +0000, Alex Bennée wrote:
> 
> Eduardo Habkost  writes:
> 
> > On Thu, Oct 31, 2019 at 08:12:01AM +0000, Peter Maydell wrote:
> >> On Fri, 25 Oct 2019 at 21:34, Eduardo Habkost  wrote:
> >> >
> >> > The following changes since commit 
> >> > 03bf012e523ecdf047ac56b2057950247256064d:
> >> >
> >> >   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into 
> >> > staging (2019-10-25 14:59:53 +0100)
> >> >
> >> > are available in the Git repository at:
> >> >
> >> >   git://github.com/ehabkost/qemu.git tags/python-next-pull-request
> >> >
> >> > for you to fetch changes up to d24e417866f85229de1b75bc5c0a1d942451a842:
> >> >
> >> >   configure: Require Python >= 3.5 (2019-10-25 16:34:57 -0300)
> >> >
> >> > 
> >> > Require Python >= 3.5 to build QEMU
> >> >
> >> > 
> >>
> >> I can't apply this until we've fixed the tests/vm netbsd setup to
> >> not use Python 2.
> >
> > Fixing tests/vm/netbsd is proving tricky.  It looks like the
> > configure patch will have to wait until after QEMU 4.2.0.  :(
> 
> I've posted fixes for the netbsd serial install but there are still
> problems with the tests including what looks like a fairly serious
> failure in the async code.

This sounds like a known "feature": QEMU expects clients to be
constantly reading from chardev sockets until the socket is
closed.  Otherwise, VCPU threads may block waiting for the socket
to be writeable.


> 
> >
> >>
> >> Have you tried a test run with Travis/etc/etc to check that none of
> >> those CI configs need updating to have python3 available ?
> >
> > I have tested this pull request on Shippable, and I will take a
> > look at Travis.  I'd appreciate help from the CI system
> > maintainers (CCed) for the rest, as I don't have accounts in all
> > our CI systems.
> 
> Setting up accounts on the others doesn't take long. I use the
> CustomCIStatus template to instantiate all the buttons for my various
> maintainer branches on the wiki, e.g.:
> 
>   
> {{CustomCIStatus|user=stsquad|repo=qemu|branch=testing/next|ship_proj=5885eac43b653a0f00fa97f5}}
> 
> which means I just have to glance at the button state rather than going
> through each individual CI's status pages.

This is awesome.  Thanks for the tip!

> 
> > Do we expect maintainers to test their pull requests in all CI
> > systems listed at the QEMU wiki[1]?  Do we have an official list
> > of CI systems that you consider to be pull request blockers?
> 
> Well they all catch various things but none of them catch all the things
> Peter's PR processing does. Historically Travis has been allowed to
> slide because of test instability and timeouts. Having said that last I
> checked everything was green so breaking any of the main CIs
> (Travis/Shippable/Cirrus/Gitlab) indicates there is a problem that needs
> to be fixed.

Manually checking if 5 different CI systems are green wouldn't be
reasonable to me, but the CustomCIStatus template will be useful.

-- 
Eduardo
