date:20191210

Re: [PATCH] vhost-user-test: fix a memory leak

2019-12-10 Thread Thomas Huth

 Hi!

On 11/12/2019 01.55, pannengy...@huawei.com wrote:
[...]
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 91ea373..54be931 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -717,6 +717,8 @@ static void test_migrate(void *obj, void *arg, 
> QGuestAllocator *alloc)
>  guint64 size;
>  
>  if (!wait_for_fds(s)) {
> +g_free(uri);
> +test_server_free(dest);
>  return;
>  }

Well spotted. But I'd prefer to rather move the allocation of these
resources after the if-statement instead of doing the allocation at the
declaration of the variables already. Or maybe use a "goto out" and jump
to the end of the function instead? ... whatever you prefer, but
duplicating the "free" functions sounds like a cumbersome solution to me.

 Thanks,
  Thomas

Re: [PATCH 0/5] ARM virt: Add NVDIMM support

2019-12-10 Thread Igor Mammedov

On Mon, 9 Dec 2019 17:39:15 +
Shameerali Kolothum Thodi  wrote:

> Hi Igor/ xiaoguangrong,
> 
> > -Original Message-
> > From: Shameerali Kolothum Thodi
> > Sent: 28 November 2019 12:36
> > To: 'Igor Mammedov' ;
> > xiaoguangrong.e...@gmail.com
> > Cc: peter.mayd...@linaro.org; drjo...@redhat.com;
> > shannon.zha...@gmail.com; qemu-devel@nongnu.org; Linuxarm
> > ; Auger Eric ;
> > qemu-...@nongnu.org; xuwei (O) ;
> > ler...@redhat.com
> > Subject: RE: [PATCH 0/5] ARM virt: Add NVDIMM support
> > 
> > 
> >   
> > > -Original Message-
> > > From: Qemu-devel
> > >  
> > [mailto:qemu-devel-bounces+shameerali.kolothum.thodi=huawei.com@nongn  
> > > u.org] On Behalf Of Igor Mammedov
> > > Sent: 26 November 2019 08:57
> > > To: Shameerali Kolothum Thodi 
> > > Cc: peter.mayd...@linaro.org; drjo...@redhat.com;
> > > xiaoguangrong.e...@gmail.com; shannon.zha...@gmail.com;
> > > qemu-devel@nongnu.org; Linuxarm ; Auger Eric
> > > ; qemu-...@nongnu.org; xuwei (O)
> > > ; ler...@redhat.com
> > > Subject: Re: [PATCH 0/5] ARM virt: Add NVDIMM support  
> > 
> > [..]
> >   
> > > > > 0xb8 Dirty No.  -->Another read is attempted  
> > > > > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: read_fit_out buf size 0x8  
> > > > > func_ret_status 3  --> Error status returned
> > > > >
> > > > > status 3 means that QEMU didn't like content of NRAM, and there is 
> > > > > only
> > > > > 1 place like this in nvdimm_dsm_func_read_fit()
> > > > > if (read_fit->offset > fit->len) {
> > > > > func_ret_status = NVDIMM_DSM_RET_STATUS_INVALID;
> > > > > goto exit;
> > > > > }
> > > > >
> > > > > so I'd start looking from here and check that QEMU gets expected data
> > > > > in nvdimm_dsm_write(). In other words I'd try to trace/compare
> > > > > content of DSM buffer (from qemu side).  
> > > >
> > > > I had printed the DSM buffer previously and it looked same, I will 
> > > > double  
> > check  
> > > > that.  
> > 
> > Tried printing the buffer in both Qemu/AML code.
> > 
> > On Amr64,  
> 
> [...]
>  
> > Attached the SSDT.dsl used for debugging. I am still not clear why on ARM64,
> > 2nd iteration case, the created buffer size in NCAL and RFIT methods have
> > additional 4 bytes!.
> > 
> > CreateField (ODAT, Zero, Local1, OBUF)
> > Concatenate (Buffer (Zero){}, OBUF, Local7)
> > 
> > Please let me know if you have any clue.
> >   
> 
> I couldn't figure out yet, why this extra 4 bytes are added by aml code on 
> ARM64
> when the nvdimm_dsm_func_read_fit() returns NvdimmFuncReadFITOut without
> any FIT data. ie, when the FIT buffer len (read_len) is zero.
> 
> But the below will fix this issue,
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index f91eea3802..cddf95f4c1 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -588,7 +588,7 @@ static void nvdimm_dsm_func_read_fit(NVDIMMState *state, 
> NvdimmDsmIn *in,
>  nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
>   read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");
> 
> -if (read_fit->offset > fit->len) {
> +if (read_fit->offset >= fit->len) {
>  func_ret_status = NVDIMM_DSM_RET_STATUS_INVALID;
>  goto exit;
>  }
> 
> 
> This will return error code to aml in the second iteration when there is no 
> further
> FIT data to report. But, I am not sure why this check was omitted in the 
> first place.
> 
> Please let me know if this is acceptable and then probably I can look into a 
> v2 of this
> series.
Sorry, I don't have capacity to debug this right now,
but I'd prefer if 'why' question was answered first.

Anyways, if something is unclear in how concrete AML code is build/works,
feel free to ask and I'll try to explain and guide you.

> Thanks,
> Shameer
> 
> 
>

Re: [PATCH] vhost-user-test: fix a memory leak

2019-12-10 Thread Laurent Vivier

On 11/12/2019 01:55, pannengy...@huawei.com wrote:
> From: Pan Nengyuan 
> 
> Spotted by ASAN.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Pan Nengyuan 
> ---
>  tests/vhost-user-test.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 91ea373..54be931 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -717,6 +717,8 @@ static void test_migrate(void *obj, void *arg, 
> QGuestAllocator *alloc)
>  guint64 size;
>  
>  if (!wait_for_fds(s)) {
> +g_free(uri);
> +test_server_free(dest);
>  return;
>  }
>  

Don't we need also a g_string_free(dest_cmdline)?

I think it is also missing at the end of the function.

Thanks,
Laurent

Re: [PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays

2019-12-10 Thread Max Reitz

On 10.12.19 18:46, Kevin Wolf wrote:
> Am 22.11.2019 um 17:05 hat Kevin Wolf geschrieben:
>> See patch 4 for the description of the bug fixed.
> 
> I'm applying patches 3 and 5-7 to the block branch because they make
> sense on their own.
> 
> The real fix will need another approach because the error handling is
> broken in this one: If zeroing out fails (either because of NO_FALLBACK
> or because of some other I/O error), bdrv_co_truncate() will return
> failure, but the image size has already been increased, with potentially
> incorrect data in the new area.
> 
> To fix this, we need to make sure that zeros will be read before we
> commit the new image size to the image file (e.g. qcow2 header) and to
> bs->total_sectors. In other words, it must become the responsibility of
> the block driver.
> 
> To this effect, I'm planning to introduce a PREALLOC_MODE_ZERO_INIT flag
> that can be or'ed to the preallocation mode. This will fail by default
> because it looks like just another unimplemented preallocation mode to
> block drivers. It will be requested explicitly by commit jobs and
> automatically added by bdrv_co_truncate() if the backing file would
> become visible (like in this series, but now for all preallocation
> modes). I'm planning to implement it for qcow2 and file-posix for now,
> which should cover most interesting cases.
> 
> Does this make sense to you?

Sounds good to me.

Max



signature.asc
Description: OpenPGP digital signature

Re: [RFC PATCH 4/9] vfio-pci: register default dynamic-trap-bar-info region

2019-12-10 Thread Yan Zhao

On Wed, Dec 11, 2019 at 12:38:05AM +0800, Alex Williamson wrote:
> On Tue, 10 Dec 2019 02:44:44 -0500
> Yan Zhao  wrote:
> 
> > On Tue, Dec 10, 2019 at 05:16:08AM +0800, Alex Williamson wrote:
> > > On Mon, 9 Dec 2019 01:22:12 -0500
> > > Yan Zhao  wrote:
> > >   
> > > > On Fri, Dec 06, 2019 at 11:20:38PM +0800, Alex Williamson wrote:  
> > > > > On Fri, 6 Dec 2019 01:04:07 -0500
> > > > > Yan Zhao  wrote:
> > > > > 
> > > > > > On Fri, Dec 06, 2019 at 07:55:30AM +0800, Alex Williamson wrote:
> > > > > > > On Wed,  4 Dec 2019 22:26:50 -0500
> > > > > > > Yan Zhao  wrote:
> > > > > > >   
> > > > > > > > Dynamic trap bar info region is a channel for QEMU and vendor 
> > > > > > > > driver to
> > > > > > > > communicate dynamic trap info. It is of type
> > > > > > > > VFIO_REGION_TYPE_DYNAMIC_TRAP_BAR_INFO and subtype
> > > > > > > > VFIO_REGION_SUBTYPE_DYNAMIC_TRAP_BAR_INFO.
> > > > > > > > 
> > > > > > > > This region has two fields: dt_fd and trap.
> > > > > > > > When QEMU detects a device regions of this type, it will create 
> > > > > > > > an
> > > > > > > > eventfd and write its eventfd id to dt_fd field.
> > > > > > > > When vendor drivre signals this eventfd, QEMU reads trap field 
> > > > > > > > of this
> > > > > > > > info region.
> > > > > > > > - If trap is true, QEMU would search the device's PCI BAR
> > > > > > > > regions and disable all the sparse mmaped subregions (if the 
> > > > > > > > sparse
> > > > > > > > mmaped subregion is disablable).
> > > > > > > > - If trap is false, QEMU would re-enable those subregions.
> > > > > > > > 
> > > > > > > > A typical usage is
> > > > > > > > 1. vendor driver first cuts its bar 0 into several sections, 
> > > > > > > > all in a
> > > > > > > > sparse mmap array. So initally, all its bar 0 are passthroughed.
> > > > > > > > 2. vendor driver specifys part of bar 0 sections to be 
> > > > > > > > disablable.
> > > > > > > > 3. on migration starts, vendor driver signals dt_fd and set 
> > > > > > > > trap to true
> > > > > > > > to notify QEMU disabling the bar 0 sections of disablable flags 
> > > > > > > > on.
> > > > > > > > 4. QEMU disables those bar 0 section and hence let vendor 
> > > > > > > > driver be able
> > > > > > > > to trap access of bar 0 registers and make dirty page tracking 
> > > > > > > > possible.
> > > > > > > > 5. on migration failure, vendor driver signals dt_fd to QEMU 
> > > > > > > > again.
> > > > > > > > QEMU reads trap field of this info region which is false and 
> > > > > > > > QEMU
> > > > > > > > re-passthrough the whole bar 0 region.
> > > > > > > > 
> > > > > > > > Vendor driver specifies whether it supports 
> > > > > > > > dynamic-trap-bar-info region
> > > > > > > > through cap VFIO_PCI_DEVICE_CAP_DYNAMIC_TRAP_BAR in
> > > > > > > > vfio_pci_mediate_ops->open().
> > > > > > > > 
> > > > > > > > If vfio-pci detects this cap, it will create a default
> > > > > > > > dynamic_trap_bar_info region on behalf of vendor driver with 
> > > > > > > > region len=0
> > > > > > > > and region->ops=null.
> > > > > > > > Vvendor driver should override this region's len, flags, rw, 
> > > > > > > > mmap in its
> > > > > > > > vfio_pci_mediate_ops.  
> > > > > > > 
> > > > > > > TBH, I don't like this interface at all.  Userspace doesn't pass 
> > > > > > > data
> > > > > > > to the kernel via INFO ioctls.  We have a SET_IRQS ioctl for
> > > > > > > configuring user signaling with eventfds.  I think we only need to
> > > > > > > define an IRQ type that tells the user to re-evaluate the sparse 
> > > > > > > mmap
> > > > > > > information for a region.  The user would enumerate the device 
> > > > > > > IRQs via
> > > > > > > GET_IRQ_INFO, find one of this type where the IRQ info would also
> > > > > > > indicate which region(s) should be re-evaluated on signaling.  
> > > > > > > The user
> > > > > > > would enable that signaling via SET_IRQS and simply re-evaluate 
> > > > > > > the  
> > > > > > ok. I'll try to switch to this way. Thanks for this suggestion.
> > > > > > 
> > > > > > > sparse mmap capability for the associated regions when signaled.  
> > > > > > > 
> > > > > > 
> > > > > > Do you like the "disablable" flag of sparse mmap ?
> > > > > > I think it's a lightweight way for user to switch mmap state of a 
> > > > > > whole region,
> > > > > > otherwise going through a complete flow of GET_REGION_INFO and 
> > > > > > re-setup
> > > > > > region might be too heavy.
> > > > > 
> > > > > No, I don't like the disable-able flag.  At what frequency do we 
> > > > > expect
> > > > > regions to change?  It seems like we'd only change when switching into
> > > > > and out of the _SAVING state, which is rare.  It seems easy for
> > > > > userspace, at least QEMU, to drop the entire mmap configuration and   
> > > > >  
> > > > ok. I'll try this way.
> > > >   
> > > > > re-read it.  Another concern here is how do we synchronize the event?
> > > > > Are we assuming that this event would

Re: [PATCH v1 ppc-for-5.0 2/2] ppc/spapr: Support reboot of secure pseries guest

2019-12-10 Thread David Gibson

On Wed, Dec 11, 2019 at 10:38:24AM +0530, Bharata B Rao wrote:
> On Wed, Dec 11, 2019 at 10:41:32AM +1100, David Gibson wrote:
> > On Tue, Dec 10, 2019 at 12:20:07PM +0530, Bharata B Rao wrote:
> > > On Tue, Dec 10, 2019 at 04:05:36PM +1100, David Gibson wrote:
> > > > On Tue, Dec 10, 2019 at 03:03:01PM +1100, Alexey Kardashevskiy wrote:
> > > > > 
> > > > > 
> > > > > On 10/12/2019 14:50, Bharata B Rao wrote:
> > > > > > On Tue, Dec 10, 2019 at 02:28:51PM +1100, David Gibson wrote:
> > > > > >> On Mon, Dec 09, 2019 at 12:30:12PM +0530, Bharata B Rao wrote:
> > > > > >>> A pseries guest can be run as a secure guest on Ultravisor-enabled
> > > > > >>> POWER platforms. When such a secure guest is reset, we need to
> > > > > >>> release/reset a few resources both on ultravisor and hypervisor 
> > > > > >>> side.
> > > > > >>> This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from 
> > > > > >>> the
> > > > > >>> machine reset path.
> > > > > >>>
> > > > > >>> As part of this ioctl, the secure guest is essentially 
> > > > > >>> transitioned
> > > > > >>> back to normal mode so that it can reboot like a regular guest and
> > > > > >>> become secure again.
> > > > > >>>
> > > > > >>> This ioctl has no effect when invoked for a normal guest.
> > > > > >>>
> > > > > >>> Signed-off-by: Bharata B Rao 
> > > > > >>> ---
> > > > > >>>  hw/ppc/spapr.c   | 1 +
> > > > > >>>  target/ppc/kvm.c | 7 +++
> > > > > >>>  target/ppc/kvm_ppc.h | 6 ++
> > > > > >>>  3 files changed, 14 insertions(+)
> > > > > >>>
> > > > > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > >>> index f11422fc41..4c7ad3400d 100644
> > > > > >>> --- a/hw/ppc/spapr.c
> > > > > >>> +++ b/hw/ppc/spapr.c
> > > > > >>> @@ -1597,6 +1597,7 @@ static void 
> > > > > >>> spapr_machine_reset(MachineState *machine)
> > > > > >>>  void *fdt;
> > > > > >>>  int rc;
> > > > > >>>  
> > > > > >>> +kvmppc_svm_off();
> > > > > >>
> > > > > >> If you're going to have this return an error value, you should 
> > > > > >> really
> > > > > >> check it here.
> > > > > > 
> > > > > > I could, by spapr_machine_reset() and the callers don't propagate 
> > > > > > the
> > > > > > errors up. So may be I could print a warning instead when ioctl 
> > > > > > fails?
> > > > > 
> > > > > An error here means you cannot restart the machine and should probably
> > > > > suspend, or try until it is not EBUSY (==all threads have stopped?).
> > > > 
> > > > Right, if this fails, something has gone badly wrong.  You should
> > > > absolutely print a message, and in fact it might be appropriate to
> > > > quit outright.  IIUC the way PEF resets work, a failure here means you
> > > > won't be able to boot after the reset, since the guest memory will
> > > > still be inaccessible to the host.
> > > 
> > > Correct. I will send next version with a message and abort() added in
> > > the ioctl failure path.
> > 
> > abort() or assert() isn't right either - that's reserved for things
> > that are definitely caused by a qemu code bug.  This should be an
> > exit(EXIT_FAILURE).
> 
> Ok, but I see a problem with checking the return value of this
> ioctl from userspace. If this ioctl is run on older kernels that don't
> support this ioctl, we get -ENOTTY as return value. We shouldn't be
> exiting in that case.

Ah, right.  We'll need to check for -ENOTTY specifically and ignore
it, then.  We don't want this spewing warnings on every non-secure
guest.

> It looks like we may need a new KVM capability to advertise the presence
> of KVM_PPC_SVM_OFF ioctl (or more generally, to advertise host kernel's
> capability to support secure guests).

Actually, that's probably a better idea still.

> Paul - Do you think we should add such a KVM capability? Here is the
> summary of the problem:
> 
> 1. QEMU invokes KVM_PPC_SVM_OFF ioctl from machine reset path and currently
>we don't check for its return value.
> 2. On host kernels that support secure guests,
>2a. this ioctl returns 0 for regular guests and has no effect.
>2b. the ioctl can fail for secure guests and here we could check the
>return value and exit the guest right away.
> 3. On host kernels that don't support secure guests, ioctl returns -ENOTTY
>but we ignore the return value and continue the reset as this is
>for a non-secure guest.
> 
> If we have such a KVM capability, we could invoke the ioctl only if it
> is supported and handle the return value appropriately.
> 
> Regards,
> Bharata.
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH v1 ppc-for-5.0 2/2] ppc/spapr: Support reboot of secure pseries guest

2019-12-10 Thread Bharata B Rao

On Wed, Dec 11, 2019 at 10:41:32AM +1100, David Gibson wrote:
> On Tue, Dec 10, 2019 at 12:20:07PM +0530, Bharata B Rao wrote:
> > On Tue, Dec 10, 2019 at 04:05:36PM +1100, David Gibson wrote:
> > > On Tue, Dec 10, 2019 at 03:03:01PM +1100, Alexey Kardashevskiy wrote:
> > > > 
> > > > 
> > > > On 10/12/2019 14:50, Bharata B Rao wrote:
> > > > > On Tue, Dec 10, 2019 at 02:28:51PM +1100, David Gibson wrote:
> > > > >> On Mon, Dec 09, 2019 at 12:30:12PM +0530, Bharata B Rao wrote:
> > > > >>> A pseries guest can be run as a secure guest on Ultravisor-enabled
> > > > >>> POWER platforms. When such a secure guest is reset, we need to
> > > > >>> release/reset a few resources both on ultravisor and hypervisor 
> > > > >>> side.
> > > > >>> This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
> > > > >>> machine reset path.
> > > > >>>
> > > > >>> As part of this ioctl, the secure guest is essentially transitioned
> > > > >>> back to normal mode so that it can reboot like a regular guest and
> > > > >>> become secure again.
> > > > >>>
> > > > >>> This ioctl has no effect when invoked for a normal guest.
> > > > >>>
> > > > >>> Signed-off-by: Bharata B Rao 
> > > > >>> ---
> > > > >>>  hw/ppc/spapr.c   | 1 +
> > > > >>>  target/ppc/kvm.c | 7 +++
> > > > >>>  target/ppc/kvm_ppc.h | 6 ++
> > > > >>>  3 files changed, 14 insertions(+)
> > > > >>>
> > > > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > >>> index f11422fc41..4c7ad3400d 100644
> > > > >>> --- a/hw/ppc/spapr.c
> > > > >>> +++ b/hw/ppc/spapr.c
> > > > >>> @@ -1597,6 +1597,7 @@ static void spapr_machine_reset(MachineState 
> > > > >>> *machine)
> > > > >>>  void *fdt;
> > > > >>>  int rc;
> > > > >>>  
> > > > >>> +kvmppc_svm_off();
> > > > >>
> > > > >> If you're going to have this return an error value, you should really
> > > > >> check it here.
> > > > > 
> > > > > I could, by spapr_machine_reset() and the callers don't propagate the
> > > > > errors up. So may be I could print a warning instead when ioctl fails?
> > > > 
> > > > An error here means you cannot restart the machine and should probably
> > > > suspend, or try until it is not EBUSY (==all threads have stopped?).
> > > 
> > > Right, if this fails, something has gone badly wrong.  You should
> > > absolutely print a message, and in fact it might be appropriate to
> > > quit outright.  IIUC the way PEF resets work, a failure here means you
> > > won't be able to boot after the reset, since the guest memory will
> > > still be inaccessible to the host.
> > 
> > Correct. I will send next version with a message and abort() added in
> > the ioctl failure path.
> 
> abort() or assert() isn't right either - that's reserved for things
> that are definitely caused by a qemu code bug.  This should be an
> exit(EXIT_FAILURE).

Ok, but I see a problem with checking the return value of this
ioctl from userspace. If this ioctl is run on older kernels that don't
support this ioctl, we get -ENOTTY as return value. We shouldn't be
exiting in that case.

It looks like we may need a new KVM capability to advertise the presence
of KVM_PPC_SVM_OFF ioctl (or more generally, to advertise host kernel's
capability to support secure guests). 

Paul - Do you think we should add such a KVM capability? Here is the
summary of the problem:

1. QEMU invokes KVM_PPC_SVM_OFF ioctl from machine reset path and currently
   we don't check for its return value.
2. On host kernels that support secure guests,
   2a. this ioctl returns 0 for regular guests and has no effect.
   2b. the ioctl can fail for secure guests and here we could check the
   return value and exit the guest right away.
3. On host kernels that don't support secure guests, ioctl returns -ENOTTY
   but we ignore the return value and continue the reset as this is
   for a non-secure guest.

If we have such a KVM capability, we could invoke the ioctl only if it
is supported and handle the return value appropriately.

Regards,
Bharata.

Re: guest / host buffer sharing ...

2019-12-10 Thread David Stevens

There are three issues being discussed here that aren't being clearly
delineated: sharing guest allocated memory with the host, sharing host
allocated memory with the guest, and sharing buffers between devices.

Right now, guest allocated memory can be shared with the host through
the virtqueues or by passing a scatterlist in the virtio payload (i.e.
what virtio-gpu does). Host memory can be shared with the guest using
the new shared memory regions. As far as I can tell, these mechanisms
should be sufficient for sharing memory between the guest and host and
vice versa.

Where things are not sufficient is when we talk about sharing buffers
between devices. For starters, a 'buffer' as we're discussing here is
not something that is currently defined by the virtio spec. The
original proposal defines a buffer as a generic object that is guest
ram+id+metadata, and is created by a special buffer allocation device.
With this approach, buffers can be cleanly shared between devices.

An alternative that Tomasz suggested would be to avoid defining a
generic buffer object, and instead state that the scatterlist which
virtio-gpu currently uses is the 'correct' way for virtio device
protocols to define buffers. With this approach, sharing buffers
between devices potentially requires the host to map different
scatterlists back to a consistent representation of a buffer.

None of the proposals directly address the use case of sharing host
allocated buffers between devices, but I think they can be extended to
support it. Host buffers can be identified by the following tuple:
(transport type enum, transport specific device address, shmid,
offset). I think this is sufficient even for host-allocated buffers
that aren't visible to the guest (e.g. protected memory, vram), since
they can still be given address space in some shared memory region,
even if those addresses are actually inaccessible to the guest. At
this point, the host buffer identifier can simply be passed in place
of the guest ram scatterlist with either proposed buffer sharing
mechanism.

I think the main question here is whether or not the complexity of
generic buffers and a buffer sharing device is worth it compared to
the more implicit definition of buffers. Personally, I lean towards
the implicit definition of buffers, since a buffer sharing device
brings a lot of complexity and there aren't any clear clients of the
buffer metadata feature.

Cheers,
David

On Thu, Dec 5, 2019 at 7:22 AM Dylan Reid  wrote:
>
> On Thu, Nov 21, 2019 at 4:59 PM Tomasz Figa  wrote:
> >
> > On Thu, Nov 21, 2019 at 6:41 AM Geoffrey McRae  
> > wrote:
> > >
> > >
> > >
> > > On 2019-11-20 23:13, Tomasz Figa wrote:
> > > > Hi Geoffrey,
> > > >
> > > > On Thu, Nov 7, 2019 at 7:28 AM Geoffrey McRae 
> > > > wrote:
> > > >>
> > > >>
> > > >>
> > > >> On 2019-11-06 23:41, Gerd Hoffmann wrote:
> > > >> > On Wed, Nov 06, 2019 at 05:36:22PM +0900, David Stevens wrote:
> > > >> >> > (1) The virtio device
> > > >> >> > =
> > > >> >> >
> > > >> >> > Has a single virtio queue, so the guest can send commands to 
> > > >> >> > register
> > > >> >> > and unregister buffers.  Buffers are allocated in guest ram.  
> > > >> >> > Each buffer
> > > >> >> > has a list of memory ranges for the data. Each buffer also has 
> > > >> >> > some
> > > >> >>
> > > >> >> Allocating from guest ram would work most of the time, but I think
> > > >> >> it's insufficient for many use cases. It doesn't really support 
> > > >> >> things
> > > >> >> such as contiguous allocations, allocations from carveouts or <4GB,
> > > >> >> protected buffers, etc.
> > > >> >
> > > >> > If there are additional constrains (due to gpu hardware I guess)
> > > >> > I think it is better to leave the buffer allocation to virtio-gpu.
> > > >>
> > > >> The entire point of this for our purposes is due to the fact that we
> > > >> can
> > > >> not allocate the buffer, it's either provided by the GPU driver or
> > > >> DirectX. If virtio-gpu were to allocate the buffer we might as well
> > > >> forget
> > > >> all this and continue using the ivshmem device.
> > > >
> > > > I don't understand why virtio-gpu couldn't allocate those buffers.
> > > > Allocation doesn't necessarily mean creating new memory. Since the
> > > > virtio-gpu device on the host talks to the GPU driver (or DirectX?),
> > > > why couldn't it return one of the buffers provided by those if
> > > > BIND_SCANOUT is requested?
> > > >
> > >
> > > Because in our application we are a user-mode application in windows
> > > that is provided with buffers that were allocated by the video stack in
> > > windows. We are not using a virtual GPU but a physical GPU via vfio
> > > passthrough and as such we are limited in what we can do. Unless I have
> > > completely missed what virtio-gpu does, from what I understand it's
> > > attempting to be a virtual GPU in its own right, which is not at all
> > > suitable for our requirements.
> >
> > Not necessarily. virtio-gpu

Re: [PATCH v3 2/3] spapr: Add NVDIMM device support

2019-12-10 Thread Shivaprasad G Bhat




On 12/06/2019 07:22 AM, David Gibson wrote:

On Wed, Nov 27, 2019 at 09:50:54AM +0530, Bharata B Rao wrote:

On Fri, Nov 22, 2019 at 10:42 AM David Gibson
 wrote:

Ok.  A number of queries about this.

1) The PAPR spec for ibm,dynamic-memory-v2 says that the first word in
each entry is the number of LMBs, but for NVDIMMs you use the
not-necessarily-equal scm_block_size instead.  Does the NVDIMM
amendment for PAPR really specify to use different block sizes for
these cases?  (In which case that's a really stupid spec decision, but
that wouldn't surprise me at this point).

SCM block sizes can be different from LMB sizes, but here we enforce
that SCM device size (excluding metadata) to multiple of LMB size so
that we don't end up memory range that is not aligned to LMB size.

Right, but it still doesn't make sense to use scm_block_size when you
create the dynamic-memory-v2 property.


Right, I should use LMB size here as I will be creating holes here to 
disallow DIMMs

to claim those LMBs marking them INVALID as Bharata Suggested before.


  As far as the thing
interpreting that goes, it *must* be LMB size, not SCM block size.  If
those are required to be the same at this point, you should use an
assert().


SCM block size should be a multiple for LMB size, need not be equal. 
I'll add an assert
for that, checking if equal. There is no benefit I see as of now having 
higher

SCM block size as the bind/unbind are already done before the bind hcall.


2) Similarly, the ibm,dynamic-memory-v2 description says that the
memory block described by the entry has a whole batch of contiguous
DRCs starting at the DRC index given and continuing for #LMBs DRCs.
For NVDIMMs it appears that you just have one DRC for the whole
NVDIMM.  Is that right?

One NVDIMM has one DRC, In our case, we need to mark the LMBs
corresponding to that address range in ibm,dynamic-memory-v2 as
reserved and invalid.

Ok, that fits very weirdly with the DRC allocation for the rest of
pluggable memory, but I suppose that's PAPR for you.

Having these in together is very inscrutable though, and relies on a
heap of non-obvious constraints about placement of DIMMs and NVDIMMs
relative to each other.  I really wonder if it would be better to have
a completely different address range for the NVDIMMs.


The backend object for both DIMM and NVDIMM are memory-backend-*
and they use the address from the same space. Separating it would mean
using/introducing different backend object. I dont think we have a 
choice here.





3) You're not setting *any* extra flags on the entry.  How is the
guest supposed to know which are NVDIMM entries and which are regular
DIMM entries?  AFAICT in this version the NVDIMM slots are
indistinguishable from the unassigned hotplug memory (which makes the
difference in LMB and DRC numbering even more troubling).

For NVDIMM case, this patch should populate the LMB set in
ibm,dynamic-memory-v2 something like below:
 elem = spapr_get_drconf_cell(size /lmb_size, addr,
  0, -1,
SPAPR_LMB_FLAGS_RESERVED | SPAPR_LMB_FLAGS_DRC_INVALID);

This will ensure that the NVDIMM range will never be considered as
valid memory range for memory hotplug.

Hrm.  Ok so we already have code that does that for any gaps between
DIMMs.  I don't think there's actually anything that that code will do
differently than the code you have for NVDIMMs, so you could just skip
over the NVDIMMs here and it should do the right thing.

The *interpretation* of those entries will become different: for space
into which a regular DIMM is later inserted, we'll assume the DRC
index given is a base and there are more DRCs following it, but for
NVDIMMs we'll assume the same DRC throughout.  This is nuts, but IIUC
that's what PAPR says and we can't do much about it.


My current patch is buggy as Bharata pointed out. The NVDIMM DRCs
are not to be populated here, but mark the LMB DRCs as RESERVED and INVALID
so that no malicious attempts to online those LMBs at those NVDIMM address
ranges are attempted.




4) AFAICT these are _present_ NVDIMMs, so why is
SPAPR_LMB_FLAGS_ASSIGNED not set for them?  (and why is the node
forced to -1, regardless of di->node).


  QSIMPLEQ_INSERT_TAIL(_queue, elem, entry);
  nr_entries++;
  cur_addr = addr + size;
@@ -1261,6 +1273,85 @@ static void spapr_dt_hypervisor(SpaprMachineState 
*spapr, void *fdt)
  }
  }

+static void spapr_create_nvdimm_dr_connectors(SpaprMachineState *spapr)
+{
+MachineState *machine = MACHINE(spapr);
+int i;
+
+for (i = 0; i < machine->ram_slots; i++) {
+spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_PMEM, i);

What happens if you try to plug an NVDIMM to one of these slots, but a
regular DIMM has already taken it?

NVDIMM hotplug won't get that occupied slot.

Ok.

Re: [RFC PATCH 1/9] vfio/pci: introduce mediate ops to intercept vfio-pci ops

2019-12-10 Thread Yan Zhao

On Wed, Dec 11, 2019 at 12:58:24AM +0800, Alex Williamson wrote:
> On Mon, 9 Dec 2019 21:44:23 -0500
> Yan Zhao  wrote:
> 
> > > > > > Currently, yes, i40e has build dependency on vfio-pci.
> > > > > > It's like this, if i40e decides to support SRIOV and compiles in vf
> > > > > > related code who depends on vfio-pci, it will also have build 
> > > > > > dependency
> > > > > > on vfio-pci. isn't it natural?
> > > > > 
> > > > > No, this is not natural.  There are certainly i40e VF use cases that
> > > > > have no interest in vfio and having dependencies between the two
> > > > > modules is unacceptable.  I think you probably want to modularize the
> > > > > i40e vfio support code and then perhaps register a table in vfio-pci
> > > > > that the vfio-pci code can perform a module request when using a
> > > > > compatible device.  Just and idea, there might be better options.  I
> > > > > will not accept a solution that requires unloading the i40e driver in
> > > > > order to unload the vfio-pci driver.  It's inconvenient with just one
> > > > > NIC driver, imagine how poorly that scales.
> > > > > 
> > > > what about this way:
> > > > mediate driver registers a module notifier and every time when
> > > > vfio_pci is loaded, register to vfio_pci its mediate ops?
> > > > (Just like in below sample code)
> > > > This way vfio-pci is free to unload and this registering only gives
> > > > vfio-pci a name of what module to request.
> > > > After that,
> > > > in vfio_pci_open(), vfio-pci requests the mediate driver. (or puts
> > > > the mediate driver when mediate driver does not support mediating the
> > > > device)
> > > > in vfio_pci_release(), vfio-pci puts the mediate driver.
> > > > 
> > > > static void register_mediate_ops(void)
> > > > {
> > > > int (*func)(struct vfio_pci_mediate_ops *ops) = NULL;
> > > > 
> > > > func = symbol_get(vfio_pci_register_mediate_ops);
> > > > 
> > > > if (func) {
> > > > func(_dt_ops);
> > > > symbol_put(vfio_pci_register_mediate_ops);
> > > > }
> > > > }
> > > > 
> > > > static int igd_module_notify(struct notifier_block *self,
> > > >   unsigned long val, void *data)
> > > > {
> > > > struct module *mod = data;
> > > > int ret = 0;
> > > > 
> > > > switch (val) {
> > > > case MODULE_STATE_LIVE:
> > > > if (!strcmp(mod->name, "vfio_pci"))
> > > > register_mediate_ops();
> > > > break;
> > > > case MODULE_STATE_GOING:
> > > > break;
> > > > default:
> > > > break;
> > > > }
> > > > return ret;
> > > > }
> > > > 
> > > > static struct notifier_block igd_module_nb = {
> > > > .notifier_call = igd_module_notify,
> > > > .priority = 0,
> > > > };
> > > > 
> > > > 
> > > > 
> > > > static int __init igd_dt_init(void)
> > > > {
> > > > ...
> > > > register_mediate_ops();
> > > > register_module_notifier(_module_nb);
> > > > ...
> > > > return 0;
> > > > }  
> > > 
> > > 
> > > No, this is bad.  Please look at MODULE_ALIAS() and request_module() as
> > > used in the vfio-platform for loading reset driver modules.  I think
> > > the correct approach is that vfio-pci should perform a request_module()
> > > based on the device being probed.  Having the mediation provider
> > > listening for vfio-pci and registering itself regardless of whether we
> > > intend to use it assumes that we will want to use it and assumes that
> > > the mediation provider module is already loaded.  We should be able to
> > > support demand loading of modules that may serve no other purpose than
> > > providing this mediation.  Thanks,  
> > hi Alex
> > Thanks for this message.
> > So is it good to create a separate module as mediation provider driver,
> > and alias its module name to "vfio-pci-mediate-vid-did".
> > Then when vfio-pci probes the device, it requests module of that name ?
> 
> I think this would give us an option to have the mediator as a separate
> module, but not require it.  Maybe rather than a request_module(),
> where if we follow the platform reset example we'd then expect the init
> code for the module to register into a list, we could do a
> symbol_request().  AIUI, this would give us a reference to the symbol
> if the module providing it is already loaded, and request a module
> (perhaps via an alias) if it's not already load.  Thanks,
> 
ok. got it!
Thank you :)

Yan

[PATCH] vhost-user-test: fix a memory leak

2019-12-10 Thread pannengyuan

From: Pan Nengyuan 

Spotted by ASAN.

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 
---
 tests/vhost-user-test.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 91ea373..54be931 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -717,6 +717,8 @@ static void test_migrate(void *obj, void *arg, 
QGuestAllocator *alloc)
 guint64 size;
 
 if (!wait_for_fds(s)) {
+g_free(uri);
+test_server_free(dest);
 return;
 }
 
-- 
2.7.2.windows.1

Re: [PATCH] ppc: Drop useless extern annotation for functions

2019-12-10 Thread David Gibson

On Tue, Dec 10, 2019 at 09:09:20AM +0100, Greg Kurz wrote:

Applied to ppc-for-5.0.

> Signed-off-by: Greg Kurz 
> ---
>  include/hw/ppc/pnv_xscom.h |   22 +++---
>  include/hw/ppc/spapr_vio.h |6 +++---
>  2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
> index a40d2a2a2a98..1c1d76bf9be5 100644
> --- a/include/hw/ppc/pnv_xscom.h
> +++ b/include/hw/ppc/pnv_xscom.h
> @@ -109,16 +109,16 @@ typedef struct PnvXScomInterfaceClass {
>  #define PNV10_XSCOM_PSIHB_BASE 0x3011D00
>  #define PNV10_XSCOM_PSIHB_SIZE 0x100
>  
> -extern void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
> -extern int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
> -
> -extern void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
> -MemoryRegion *mr);
> -extern void pnv_xscom_region_init(MemoryRegion *mr,
> -  struct Object *owner,
> -  const MemoryRegionOps *ops,
> -  void *opaque,
> -  const char *name,
> -  uint64_t size);
> +void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
> +int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
> +
> +void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
> + MemoryRegion *mr);
> +void pnv_xscom_region_init(MemoryRegion *mr,
> +   struct Object *owner,
> +   const MemoryRegionOps *ops,
> +   void *opaque,
> +   const char *name,
> +   uint64_t size);
>  
>  #endif /* PPC_PNV_XSCOM_H */
> diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
> index 72762ed16b70..ce6d9b0c6605 100644
> --- a/include/hw/ppc/spapr_vio.h
> +++ b/include/hw/ppc/spapr_vio.h
> @@ -80,10 +80,10 @@ struct SpaprVioBus {
>  uint32_t next_reg;
>  };
>  
> -extern SpaprVioBus *spapr_vio_bus_init(void);
> -extern SpaprVioDevice *spapr_vio_find_by_reg(SpaprVioBus *bus, uint32_t reg);
> +SpaprVioBus *spapr_vio_bus_init(void);
> +SpaprVioDevice *spapr_vio_find_by_reg(SpaprVioBus *bus, uint32_t reg);
>  void spapr_dt_vdevice(SpaprVioBus *bus, void *fdt);
> -extern gchar *spapr_vio_stdout_path(SpaprVioBus *bus);
> +gchar *spapr_vio_stdout_path(SpaprVioBus *bus);
>  
>  static inline void spapr_vio_irq_pulse(SpaprVioDevice *dev)
>  {
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH 1/2] ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

2019-12-10 Thread David Gibson

On Tue, Dec 10, 2019 at 05:49:01PM +0100, Greg Kurz wrote:
> On Tue, 10 Dec 2019 14:58:44 +0100
> Cédric Le Goater  wrote:
> 
> > Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
> 
> I didn't find any trace of PnvPBCQState in the code... what is it ?
> 
> > the first layer, so we need to loop on the whole object hierarchy to
> > catch them.
> > 
> > Signed-off-by: Cédric Le Goater 
> > ---
> >  hw/ppc/pnv_xscom.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> > index bed41840845e..006d87e970d9 100644
> > --- a/hw/ppc/pnv_xscom.c
> > +++ b/hw/ppc/pnv_xscom.c
> > @@ -326,7 +326,12 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
> > root_offset)
> >  args.fdt = fdt;
> >  args.xscom_offset = xscom_offset;
> >  
> > -object_child_foreach(OBJECT(chip), xscom_dt_child, );
> > +/*
> > + * Loop on the whole object hierarchy to catch all
> > + * PnvXScomInterface objects which can lie a bit deeper the first
> 
> s/deeper the first/deeper than the first/

Fixed during commit.

> 
> > + * layer.
> > + */
> > +object_child_foreach_recursive(OBJECT(chip), xscom_dt_child, );
> >  return 0;
> >  }
> >  
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH 0/2] ppc/pnv: minor XSCOM fixes

2019-12-10 Thread David Gibson

On Tue, Dec 10, 2019 at 02:58:43PM +0100, Cédric Le Goater wrote:
> Hello,
> 
> Here are a couple of fixes/cleanups for the PowerNV XSCOM bus.

Applied to ppc-for-5.0, thanks.

> 
> Thanks,
> 
> C. 
> 
> Cédric Le Goater (2):
>   ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM
> nodes
>   ppc/pnv: populate the DT with realized XSCOM devices
> 
>  hw/ppc/pnv_xscom.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH 2/5] ppc/pnv: Introduce a POWER10 PnvChip and a powernv10 machine

2019-12-10 Thread David Gibson

On Tue, Dec 10, 2019 at 09:52:05AM +0100, Cédric Le Goater wrote:
> On 10/12/2019 04:34, David Gibson wrote:
> >> +static inline bool pnv_chip_is_power10(const PnvChip *chip)
> >> +{
> >> +return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER10;
> >> +}
> >> +
> >> +static inline bool pnv_is_power10(PnvMachineState *pnv)
> >> +{
> >> +return pnv_chip_is_power10(pnv->chips[0]);
> >> +}
> >
> 
> I agree this is starting to be ugly.
> 
> > It's not in scope for this series, but now that we have P8/9/10
> > specific chip object types and powernv8/powernv9, we should be able to
> > remove the ugly chip_type field, and just do object class checks on
> > the chip and or machine objects themselves.
>  
> 
> So we would use object_class_dynamic_cast() instead of field
> chip_type ?

Just object_dynamic_cast() should suffice, since you have access to
the chip and machine instances.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH v1 ppc-for-5.0 2/2] ppc/spapr: Support reboot of secure pseries guest

2019-12-10 Thread David Gibson

On Tue, Dec 10, 2019 at 12:20:07PM +0530, Bharata B Rao wrote:
> On Tue, Dec 10, 2019 at 04:05:36PM +1100, David Gibson wrote:
> > On Tue, Dec 10, 2019 at 03:03:01PM +1100, Alexey Kardashevskiy wrote:
> > > 
> > > 
> > > On 10/12/2019 14:50, Bharata B Rao wrote:
> > > > On Tue, Dec 10, 2019 at 02:28:51PM +1100, David Gibson wrote:
> > > >> On Mon, Dec 09, 2019 at 12:30:12PM +0530, Bharata B Rao wrote:
> > > >>> A pseries guest can be run as a secure guest on Ultravisor-enabled
> > > >>> POWER platforms. When such a secure guest is reset, we need to
> > > >>> release/reset a few resources both on ultravisor and hypervisor side.
> > > >>> This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
> > > >>> machine reset path.
> > > >>>
> > > >>> As part of this ioctl, the secure guest is essentially transitioned
> > > >>> back to normal mode so that it can reboot like a regular guest and
> > > >>> become secure again.
> > > >>>
> > > >>> This ioctl has no effect when invoked for a normal guest.
> > > >>>
> > > >>> Signed-off-by: Bharata B Rao 
> > > >>> ---
> > > >>>  hw/ppc/spapr.c   | 1 +
> > > >>>  target/ppc/kvm.c | 7 +++
> > > >>>  target/ppc/kvm_ppc.h | 6 ++
> > > >>>  3 files changed, 14 insertions(+)
> > > >>>
> > > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > >>> index f11422fc41..4c7ad3400d 100644
> > > >>> --- a/hw/ppc/spapr.c
> > > >>> +++ b/hw/ppc/spapr.c
> > > >>> @@ -1597,6 +1597,7 @@ static void spapr_machine_reset(MachineState 
> > > >>> *machine)
> > > >>>  void *fdt;
> > > >>>  int rc;
> > > >>>  
> > > >>> +kvmppc_svm_off();
> > > >>
> > > >> If you're going to have this return an error value, you should really
> > > >> check it here.
> > > > 
> > > > I could, by spapr_machine_reset() and the callers don't propagate the
> > > > errors up. So may be I could print a warning instead when ioctl fails?
> > > 
> > > An error here means you cannot restart the machine and should probably
> > > suspend, or try until it is not EBUSY (==all threads have stopped?).
> > 
> > Right, if this fails, something has gone badly wrong.  You should
> > absolutely print a message, and in fact it might be appropriate to
> > quit outright.  IIUC the way PEF resets work, a failure here means you
> > won't be able to boot after the reset, since the guest memory will
> > still be inaccessible to the host.
> 
> Correct. I will send next version with a message and abort() added in
> the ioctl failure path.

abort() or assert() isn't right either - that's reserved for things
that are definitely caused by a qemu code bug.  This should be an
exit(EXIT_FAILURE).

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH v2 1/1] display/bochs-display: fix memory leak

2019-12-10 Thread Philippe Mathieu-Daudé


On 12/11/19 12:28 AM, Aleksandar Markovic wrote:


On Wednesday, December 11, 2019, Philippe Mathieu-Daudé 
mailto:phi...@redhat.com>> wrote:


On 12/10/19 10:27 PM, Cameron Esfahani via wrote:

Fix memory leak in bochs_display_update().  Leaks 304 bytes per
frame.

Signed-off-by: Cameron Esfahani mailto:di...@apple.com>>


Funny to have a dirty@ email fixing a DirtyBitmapSnapshot leak =)

Fixes: 33ebad54056


Dirty fixing bad in a snap. What a fix. :-o


Nice one! =)

Re: [PATCH v2 1/1] display/bochs-display: fix memory leak

2019-12-10 Thread Aleksandar Markovic

On Wednesday, December 11, 2019, Philippe Mathieu-Daudé 
wrote:

> On 12/10/19 10:27 PM, Cameron Esfahani via wrote:
>
>> Fix memory leak in bochs_display_update().  Leaks 304 bytes per frame.
>>
>> Signed-off-by: Cameron Esfahani 
>>
>
> Funny to have a dirty@ email fixing a DirtyBitmapSnapshot leak =)
>
> Fixes: 33ebad54056


Dirty fixing bad in a snap. What a fix. :-o


> Reviewed-by: Philippe Mathieu-Daudé 
>
> ---
>>   hw/display/bochs-display.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
>> index dc1bd1641d..215db9a231 100644
>> --- a/hw/display/bochs-display.c
>> +++ b/hw/display/bochs-display.c
>> @@ -252,6 +252,8 @@ static void bochs_display_update(void *opaque)
>>   dpy_gfx_update(s->con, 0, ys,
>>  mode.width, y - ys);
>>   }
>> +
>> +g_free(snap);
>>   }
>>   }
>>
>>
>
>
>

Re: [PATCH v2 1/1] display/bochs-display: fix memory leak

2019-12-10 Thread Philippe Mathieu-Daudé


On 12/10/19 10:27 PM, Cameron Esfahani via wrote:

Fix memory leak in bochs_display_update().  Leaks 304 bytes per frame.

Signed-off-by: Cameron Esfahani 


Funny to have a dirty@ email fixing a DirtyBitmapSnapshot leak =)

Fixes: 33ebad54056
Reviewed-by: Philippe Mathieu-Daudé 


---
  hw/display/bochs-display.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
index dc1bd1641d..215db9a231 100644
--- a/hw/display/bochs-display.c
+++ b/hw/display/bochs-display.c
@@ -252,6 +252,8 @@ static void bochs_display_update(void *opaque)
  dpy_gfx_update(s->con, 0, ys,
 mode.width, y - ys);
  }
+
+g_free(snap);
  }
  }

Re: [PATCH 0/2] hw/arm: ast2600: Wire up eMMC controller

2019-12-10 Thread Andrew Jeffery




On Tue, 10 Dec 2019, at 19:23, Cédric Le Goater wrote:
> On 10/12/2019 01:52, Andrew Jeffery wrote:
> > Hello,
> > 
> > The AST2600 has an additional SDHCI intended for use as an eMMC boot source.
> 
> Have you also considered booting the QEMU Aspeed AST2600 machine 
> from the eMMC device ?
> 

I hadn't got that far. I was surprised we hadn't yet wired up the eMMC 
controller at
all, so I solved that problem first :)

Andrew

Re: [PATCH 1/2] hw/sd: Configure number of slots exposed by the ASPEED SDHCI model

2019-12-10 Thread Andrew Jeffery




On Tue, 10 Dec 2019, at 19:26, Cédric Le Goater wrote:
> On 10/12/2019 01:52, Andrew Jeffery wrote:
> > The AST2600 includes a second cut-down version of the SD/MMC controller
> > found in the AST2500, named the eMMC controller. It's cut down in the
> > sense that it only supports one slot rather than two, but it brings the
> > total number of slots supported by the AST2600 to three.
> > 
> > The existing code assumed that the SD controller always provided two
> > slots. Rework the SDHCI object to expose the number of slots as a
> > property to be set by the SoC configuration.
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> Reviewed-by: Cédric Le Goater 
> 
> One minor question below.
> 
> 
> > ---
> >  hw/arm/aspeed.c  |  2 +-
> >  hw/arm/aspeed_ast2600.c  |  2 ++
> >  hw/arm/aspeed_soc.c  |  3 +++
> >  hw/sd/aspeed_sdhci.c | 11 +--
> >  include/hw/sd/aspeed_sdhci.h |  1 +
> >  5 files changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> > index 028191ff36fc..862549b1f3a9 100644
> > --- a/hw/arm/aspeed.c
> > +++ b/hw/arm/aspeed.c
> > @@ -259,7 +259,7 @@ static void aspeed_board_init(MachineState *machine,
> >  cfg->i2c_init(bmc);
> >  }
> >  
> > -for (i = 0; i < ARRAY_SIZE(bmc->soc.sdhci.slots); i++) {
> > +for (i = 0; i < bmc->soc.sdhci.num_slots; i++) {
> >  SDHCIState *sdhci = >soc.sdhci.slots[i];
> >  DriveInfo *dinfo = drive_get_next(IF_SD);
> >  BlockBackend *blk;
> > diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c
> > index 931887ac681f..931ee5aae183 100644
> > --- a/hw/arm/aspeed_ast2600.c
> > +++ b/hw/arm/aspeed_ast2600.c
> > @@ -208,6 +208,8 @@ static void aspeed_soc_ast2600_init(Object *obj)
> >  sysbus_init_child_obj(obj, "sdc", OBJECT(>sdhci), sizeof(s->sdhci),
> >TYPE_ASPEED_SDHCI);
> >  
> > +object_property_set_int(OBJECT(>sdhci), 2, "num-slots", 
> > _abort);
> 
> OK. This defines 2 SDHCI slots for the ast2600 SoC, but
> 
> > +
> >  /* Init sd card slot class here so that they're under the correct 
> > parent */
> >  for (i = 0; i < ASPEED_SDHCI_NUM_SLOTS; ++i) {
> >  sysbus_init_child_obj(obj, "sdhci[*]", OBJECT(>sdhci.slots[i]),
> > diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
> > index f4fe243458fd..3498f55603f2 100644
> > --- a/hw/arm/aspeed_soc.c
> > +++ b/hw/arm/aspeed_soc.c
> > @@ -215,6 +215,9 @@ static void aspeed_soc_init(Object *obj)
> >  sysbus_init_child_obj(obj, "sdc", OBJECT(>sdhci), sizeof(s->sdhci),
> >TYPE_ASPEED_SDHCI);
> >  
> > +object_property_set_int(OBJECT(>sdhci), ASPEED_SDHCI_NUM_SLOTS,
> > +"num-slots", _abort);
> 
> 
> why use ASPEED_SDHCI_NUM_SLOTS here ?

No good reason. I'll just switch it to '2' like in the 2600.

Andrew

Re: [PATCH 2/2] hw/arm: ast2600: Wire up the eMMC controller

2019-12-10 Thread Andrew Jeffery




On Tue, 10 Dec 2019, at 23:22, Cédric Le Goater wrote:
> On 10/12/2019 01:52, Andrew Jeffery wrote:
> > Initialise another SDHCI model instance for the AST2600's eMMC
> > controller and use the SDHCI's num_slots value introduced previously to
> > determine whether we should create an SD card instance for the new slot.
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> LGTM. One comment.
> 
> > ---
> >  hw/arm/aspeed.c | 13 +
> >  hw/arm/aspeed_ast2600.c | 21 +
> >  include/hw/arm/aspeed_soc.h |  2 ++
> >  3 files changed, 36 insertions(+)
> > 
> > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> > index 862549b1f3a9..0e08d62e9ff3 100644
> > --- a/hw/arm/aspeed.c
> > +++ b/hw/arm/aspeed.c
> > @@ -272,6 +272,19 @@ static void aspeed_board_init(MachineState *machine,
> >  object_property_set_bool(OBJECT(card), true, "realized", 
> > _fatal);
> >  }
> >  
> > +if (bmc->soc.emmc.num_slots) {
> > +SDHCIState *emmc = >soc.emmc.slots[0];
> > +DriveInfo *dinfo = drive_get_next(IF_SD);
> > +BlockBackend *blk;
> > +DeviceState *card;
> > +
> > +blk = dinfo ? blk_by_legacy_dinfo(dinfo) : NULL;
> > +card = qdev_create(qdev_get_child_bus(DEVICE(emmc), "sd-bus"),
> > +   TYPE_SD_CARD);
> > +qdev_prop_set_drive(card, "drive", blk, _fatal);
> > +object_property_set_bool(OBJECT(card), true, "realized", 
> > _fatal);
> > +}
> 
> I think we could use a function for the above ^

Yep, I'll refactor that.

Andrew

[PATCH v2 0/1] Fix bochs memory leak

2019-12-10 Thread Cameron Esfahani via

Fix a small memory leak in the Bochs display driver.

Each frame would leak about 304 bytes.

v2: Add missing signed-off-by line.

Cameron Esfahani (1):
  display/bochs-display: fix memory leak

 hw/display/bochs-display.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.24.0

[PATCH v2 1/1] display/bochs-display: fix memory leak

2019-12-10 Thread Cameron Esfahani via

Fix memory leak in bochs_display_update().  Leaks 304 bytes per frame.

Signed-off-by: Cameron Esfahani 
---
 hw/display/bochs-display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
index dc1bd1641d..215db9a231 100644
--- a/hw/display/bochs-display.c
+++ b/hw/display/bochs-display.c
@@ -252,6 +252,8 @@ static void bochs_display_update(void *opaque)
 dpy_gfx_update(s->con, 0, ys,
mode.width, y - ys);
 }
+
+g_free(snap);
 }
 }
 
-- 
2.24.0

Re: [PATCH 2/2] ppc/pnv: populate the DT with realized XSCOM devices

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 14:58:45 +0100
Cédric Le Goater  wrote:

> Some devices could be initialized in the instance_init handler but not
> realized for configuration reasons. Nodes should not be added in the DT
> for such devices.
> 
> Signed-off-by: Cédric Le Goater 
> ---

Reviewed-by: Greg Kurz 

>  hw/ppc/pnv_xscom.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index 006d87e970d9..6d3745a49e50 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -272,7 +272,10 @@ static int xscom_dt_child(Object *child, void *opaque)
>  PnvXScomInterface *xd = PNV_XSCOM_INTERFACE(child);
>  PnvXScomInterfaceClass *xc = PNV_XSCOM_INTERFACE_GET_CLASS(xd);
>  
> -if (xc->dt_xscom) {
> +/*
> + * Only "realized" devices should be configured in the DT
> + */
> +if (xc->dt_xscom && DEVICE(child)->realized) {
>  _FDT((xc->dt_xscom(xd, args->fdt, args->xscom_offset)));
>  }
>  }

Re: [PATCH 1/2] ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 14:58:44 +0100
Cédric Le Goater  wrote:

> Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
> the first layer, so we need to loop on the whole object hierarchy to
> catch them.
> 
> Signed-off-by: Cédric Le Goater 
> ---

Reviewed-by: Greg Kurz 

>  hw/ppc/pnv_xscom.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index bed41840845e..006d87e970d9 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -326,7 +326,12 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
> root_offset)
>  args.fdt = fdt;
>  args.xscom_offset = xscom_offset;
>  
> -object_child_foreach(OBJECT(chip), xscom_dt_child, );
> +/*
> + * Loop on the whole object hierarchy to catch all
> + * PnvXScomInterface objects which can lie a bit deeper the first
> + * layer.
> + */
> +object_child_foreach_recursive(OBJECT(chip), xscom_dt_child, );
>  return 0;
>  }
>

Re: [PATCH v1 0/1] Fix bochs memory leak

2019-12-10 Thread no-reply

Patchew URL: https://patchew.org/QEMU/cover.1576012190.git.di...@apple.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v1 0/1] Fix bochs memory leak
Type: series
Message-id: cover.1576012190.git.di...@apple.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
a3c9e5d display/bochs-display: fix memory leak

=== OUTPUT BEGIN ===
ERROR: Missing Signed-off-by: line(s)

total: 1 errors, 0 warnings, 8 lines checked

Commit a3c9e5dc8c32 (display/bochs-display: fix memory leak) has style 
problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/cover.1576012190.git.di...@apple.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH v1 1/1] display/bochs-display: fix memory leak

2019-12-10 Thread Cameron Esfahani via

Fix memory leak in bochs_display_update().  Leaks 304 bytes per frame.
---
 hw/display/bochs-display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
index dc1bd1641d..215db9a231 100644
--- a/hw/display/bochs-display.c
+++ b/hw/display/bochs-display.c
@@ -252,6 +252,8 @@ static void bochs_display_update(void *opaque)
 dpy_gfx_update(s->con, 0, ys,
mode.width, y - ys);
 }
+
+g_free(snap);
 }
 }
 
-- 
2.24.0

[PATCH v1 0/1] Fix bochs memory leak

2019-12-10 Thread Cameron Esfahani via

Fix a small memory leak in the Bochs display driver.

Each frame would leak about 304 bytes.

Cameron Esfahani (1):
  display/bochs-display: fix memory leak

 hw/display/bochs-display.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.24.0

Re: [PATCH v38 04/22] target/avr: Add instruction translation - Registers definition

2019-12-10 Thread Aleksandar Markovic

On Sun, Dec 8, 2019 at 7:40 PM Michael Rolnik  wrote:
>
> Signed-off-by: Michael Rolnik 
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
> ---
>  target/avr/translate.c | 143 +
>  1 file changed, 143 insertions(+)
>  create mode 100644 target/avr/translate.c
>
> diff --git a/target/avr/translate.c b/target/avr/translate.c
> new file mode 100644
> index 00..0139bcabb1
> --- /dev/null
> +++ b/target/avr/translate.c
> @@ -0,0 +1,143 @@
> +/*
> + * QEMU AVR CPU
> + *
> + * Copyright (c) 2019 Michael Rolnik
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see
> + * 
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/qemu-print.h"
> +#include "tcg/tcg.h"
> +#include "cpu.h"
> +#include "exec/exec-all.h"
> +#include "tcg-op.h"
> +#include "exec/cpu_ldst.h"
> +#include "exec/helper-proto.h"
> +#include "exec/helper-gen.h"
> +#include "exec/log.h"
> +#include "exec/translator.h"
> +#include "exec/gen-icount.h"
> +
> +/*
> + *  Define if you want a BREAK instruction translated to a breakpoint
> + *  Active debugging connection is assumed
> + *  This is for
> + *  https://github.com/seharris/qemu-avr-tests/tree/master/instruction-tests
> + *  tests
> + */
> +#undef BREAKPOINT_ON_BREAK
> +
> +static TCGv cpu_pc;
> +
> +static TCGv cpu_Cf;
> +static TCGv cpu_Zf;
> +static TCGv cpu_Nf;
> +static TCGv cpu_Vf;
> +static TCGv cpu_Sf;
> +static TCGv cpu_Hf;
> +static TCGv cpu_Tf;
> +static TCGv cpu_If;
> +
> +static TCGv cpu_rampD;
> +static TCGv cpu_rampX;
> +static TCGv cpu_rampY;
> +static TCGv cpu_rampZ;
> +
> +static TCGv cpu_r[NUMBER_OF_CPU_REGISTERS];
> +static TCGv cpu_eind;
> +static TCGv cpu_sp;
> +
> +static TCGv cpu_skip;
> +
> +static const char reg_names[NUMBER_OF_CPU_REGISTERS][8] = {
> +"r0",  "r1",  "r2",  "r3",  "r4",  "r5",  "r6",  "r7",
> +"r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
> +"r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23",
> +"r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31",
> +};
> +#define REG(x) (cpu_r[x])
> +
> +enum {
> +DISAS_EXIT   = DISAS_TARGET_0,  /* We want return to the cpu main loop.  
> */
> +DISAS_LOOKUP = DISAS_TARGET_1,  /* We have a variable condition exit.  */
> +DISAS_CHAIN  = DISAS_TARGET_2,  /* We have a single condition exit.  */
> +};
> +
> +typedef struct DisasContext DisasContext;
> +
> +/* This is the state at translation time. */
> +struct DisasContext {
> +TranslationBlock *tb;
> +
> +CPUAVRState *env;
> +CPUState *cs;
> +
> +target_long npc;
> +uint32_t opcode;
> +
> +/* Routine used to access memory */
> +int memidx;
> +int bstate;
> +int singlestep;
> +
> +TCGv skip_var0;
> +TCGv skip_var1;
> +TCGCond skip_cond;
> +bool free_skip_var0;
> +};

Still waiting on the comment...

> +
> +static int to_regs_16_31_by_one(DisasContext *ctx, int indx)
> +{
> +return 16 + (indx % 16);
> +}
> +
> +static int to_regs_16_23_by_one(DisasContext *ctx, int indx)
> +{
> +return 16 + (indx % 8);
> +}
> +static int to_regs_24_30_by_two(DisasContext *ctx, int indx)
> +{
> +return 24 + (indx % 4) * 2;
> +}
> +static int to_regs_00_30_by_two(DisasContext *ctx, int indx)
> +{
> +return (indx % 16) * 2;
> +}
> +
> +static uint16_t next_word(DisasContext *ctx)
> +{
> +return cpu_lduw_code(ctx->env, ctx->npc++ * 2);
> +}
> +
> +static int append_16(DisasContext *ctx, int x)
> +{
> +return x << 16 | next_word(ctx);
> +}
> +
> +
> +static bool avr_have_feature(DisasContext *ctx, int feature)
> +{
> +if (!avr_feature(ctx->env, feature)) {
> +gen_helper_unsupported(cpu_env);
> +ctx->bstate = DISAS_NORETURN;
> +return false;
> +}
> +return true;
> +}
> +
> +static bool decode_insn(DisasContext *ctx, uint16_t insn);
> +#include "decode_insn.inc.c"
> +
> --
> 2.17.2 (Apple Git-113)
>

Re: [PATCH v38 17/22] target/avr: Register AVR support with the rest of QEMU

2019-12-10 Thread Aleksandar Markovic

On Mon, Dec 9, 2019 at 7:31 PM Michael Rolnik  wrote:
>
> I prefer to remove it, as nobody uses it. what do you think? the full list is 
> in target/avr/cpu.h file
>

I have mixed filings about that.

I can just imagine someone in future might make a "superassembler"
that uses this header, and than avr info would be than missing if you
delete this hunk...

But I don't have a strong preference.

Thanks,
Aleksandar

> On Mon, Dec 9, 2019 at 8:16 PM Aleksandar Markovic 
>  wrote:
>>
>>
>>
>> On Sunday, December 8, 2019, Michael Rolnik  wrote:
>>>
>>> Add AVR related definitions into QEMU
>>>
>>> Signed-off-by: Michael Rolnik 
>>> Tested-by: Philippe Mathieu-Daudé 
>>> Reviewed-by: Aleksandar Markovic 
>>> ---
>>>  qapi/machine.json  | 3 ++-
>>>  include/disas/dis-asm.h| 6 ++
>>>  include/sysemu/arch_init.h | 1 +
>>>  arch_init.c| 2 ++
>>>  4 files changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>> index ca26779f1a..8c6df54921 100644
>>> --- a/qapi/machine.json
>>> +++ b/qapi/machine.json
>>> @@ -21,11 +21,12 @@
>>>  #is true even for "qemu-system-x86_64".
>>>  #
>>>  # ppcemb: dropped in 3.1
>>> +# avr: since 5.0
>>>  #
>>>  # Since: 3.0
>>>  ##
>>>  { 'enum' : 'SysEmuTarget',
>>> -  'data' : [ 'aarch64', 'alpha', 'arm', 'cris', 'hppa', 'i386', 'lm32',
>>> +  'data' : [ 'aarch64', 'alpha', 'arm', 'avr', 'cris', 'hppa', 'i386', 
>>> 'lm32',
>>>
>>>   'm68k', 'microblaze', 'microblazeel', 'mips', 'mips64',
>>>   'mips64el', 'mipsel', 'moxie', 'nios2', 'or1k', 'ppc',
>>>   'ppc64', 'riscv32', 'riscv64', 's390x', 'sh4',
>>> diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
>>> index e9c7dd8eb4..8bedce17ac 100644
>>> --- a/include/disas/dis-asm.h
>>> +++ b/include/disas/dis-asm.h
>>> @@ -211,6 +211,12 @@ enum bfd_architecture
>>>  #define bfd_mach_m32r  0  /* backwards compatibility */
>>>bfd_arch_mn10200,/* Matsushita MN10200 */
>>>bfd_arch_mn10300,/* Matsushita MN10300 */
>>> +  bfd_arch_avr,   /* Atmel AVR microcontrollers.  */
>>> +#define bfd_mach_avr1  1
>>> +#define bfd_mach_avr2  2
>>> +#define bfd_mach_avr3  3
>>> +#define bfd_mach_avr4  4
>>> +#define bfd_mach_avr5  5
>>
>>
>> Incomplete list. I already explained why in reply to v37.
>>
>>
>>>
>>>bfd_arch_cris,   /* Axis CRIS */
>>>  #define bfd_mach_cris_v0_v10   255
>>>  #define bfd_mach_cris_v32  32
>>> diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h
>>> index 62c6fe4cf1..893df26ce2 100644
>>> --- a/include/sysemu/arch_init.h
>>> +++ b/include/sysemu/arch_init.h
>>> @@ -24,6 +24,7 @@ enum {
>>>  QEMU_ARCH_NIOS2 = (1 << 17),
>>>  QEMU_ARCH_HPPA = (1 << 18),
>>>  QEMU_ARCH_RISCV = (1 << 19),
>>> +QEMU_ARCH_AVR = (1 << 20),
>>>  };
>>>
>>>  extern const uint32_t arch_type;
>>> diff --git a/arch_init.c b/arch_init.c
>>> index 705d0b94ad..6a741165b2 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -89,6 +89,8 @@ int graphic_depth = 32;
>>>  #define QEMU_ARCH QEMU_ARCH_UNICORE32
>>>  #elif defined(TARGET_XTENSA)
>>>  #define QEMU_ARCH QEMU_ARCH_XTENSA
>>> +#elif defined(TARGET_AVR)
>>> +#define QEMU_ARCH QEMU_ARCH_AVR
>>>  #endif
>>>
>>>  const uint32_t arch_type = QEMU_ARCH;
>>> --
>>> 2.17.2 (Apple Git-113)
>>>
>
>
> --
> Best Regards,
> Michael Rolnik

Re: [PATCH v38 05/22] target/avr: Add instruction translation - Arithmetic and Logic Instructions

2019-12-10 Thread Aleksandar Markovic

On Sun, Dec 8, 2019 at 7:40 PM Michael Rolnik  wrote:

> +
> +/*
> + *  Performs the logical AND between the contents of register Rd and register
> + *  Rr and places the result in the destination register Rd.
> + */
> +static bool trans_AND(DisasContext *ctx, arg_AND *a)
> +{
> +TCGv Rd = cpu_r[a->rd];
> +TCGv Rr = cpu_r[a->rr];
> +TCGv R = tcg_temp_new_i32();
> +
> +tcg_gen_and_tl(R, Rd, Rr); /* Rd = Rd and Rr */
> +tcg_gen_movi_tl(cpu_Vf, 0); /* Vf = 0 */
> +tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
> +gen_ZNSf(R);
> +tcg_gen_mov_tl(Rd, R);
> +
> +tcg_temp_free_i32(R);
> +
> +return true;
> +}
> +
> +

In the v37 comment I suggested to you make "update status register"
portions more compact (but also to mark it with an one-line comment),
but you overdid it and extended this compacting all the way to the
main part of the handler. And now it is still hard to discern the core
of the handler ("and" operation) and the update-status-register
portion.

Yours,
Aleksandar

[ANNOUNCE] QEMU 4.2.0-rc5 is now available

2019-12-10 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
sixth release candidate for the QEMU 4.2 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-4.2.0-rc5.tar.xz
  http://download.qemu-project.org/qemu-4.2.0-rc5.tar.xz.sig

A note from the maintainer:

  rc5 contains three fixes for late-breaking release critical bugs:
   * a QEMU crash when using KVM with an x86 host kernel with nested=0
   * a guest crash in PPC spapr machines with pci bridges
   * a crash involving bitmap handling in the qcow2 image format
  
  We now plan to make the final release with no further
  changes on Thursday, 2019-12-12.

You can help improve the quality of the QEMU 4.2 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/4.2

Please add entries to the ChangeLog for the 4.2 release below:

  http://wiki.qemu.org/ChangeLog/4.2

Thank you to everyone involved!

Changes since rc4:

52901abf94: Update version for v4.2.0-rc5 release (Peter Maydell)
f56281abd9: block/qcow2-bitmap: fix crash bug in 
qcow2_co_remove_persistent_dirty_bitmap (Vladimir Sementsov-Ogievskiy)
a2fad86497: pseries: Update SLOF firmware image (Alexey Kardashevskiy)
2605188240: target/i386: disable VMX features if nested=0 (Yang Zhong)

Re: [bugfix ping2] Re: [PATCH v2 0/2] fix qcow2_can_store_new_dirty_bitmap

2019-12-10 Thread John Snow

On 12/10/19 8:24 AM, Max Reitz wrote:
> On 10.12.19 09:11, Max Reitz wrote:
>> On 09.12.19 23:03, Eric Blake wrote:
>>> On 12/9/19 11:58 AM, Max Reitz wrote:
 On 09.12.19 17:30, Max Reitz wrote:
> On 02.12.19 15:09, Vladimir Sementsov-Ogievskiy wrote:
>> Hi again!
>>
>> Still forgotten bug-fix :(
>>
>> Is it too late for 4.2?
>
> Sorry. :-/
>
> Yes, I think I just forgot it.  I don’t think it’s too important for
> 4.2, so, well, it isn’t too bad, but...  Sorry.
>
>> I can't imagine better test, and it tests exactly what written in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1712636
>>
>> (Hmm, actually, I doubt that it is real use-case, more probably it's
>> a bug in management layer)
>>
>> So, take this with test or without test, to 4.2 or 5.0.
>
> I was thinking of seeing whether I could write a quicker test, but of
> course we should take the patch either way.

 OK, I give up.  It’s very much possible to create an image with 65535
 bitmaps very quickly (like, under a second) outside of qemu, but just
 opening it takes 2:30 min (because of the quadratic complexity of
 checking whether a bitmap of the same name already exists).
>>>
>>> Can we fix that to use a hash table for amortized O(1) lookup rather
>>> than the current O(n) lookup?
>>
>> Not unreasonable, considering that this is probably what we would’ve
>> done from the start in any language where hash tables are built in.
>>
>> But OTOH when you have 66k bitmaps, you probably have other problems.
>> Like, writes being incredibly slow, because all those bitmaps have to be
>> updated.
>>
>> (Well, you can technically have 99 % of them disabled, but who’d do such
>> a thing?)
>>
>> ((Maybe I’ll look into it.))
> 
> Hmm, now I did.  This gets the test down to 24 s.  Still not sure
> whether it’s worth it, though...
> 
> Max
> 

I agree we very likely have other problems once we reach resource usage
of this level.

Still, if we want to make this blazing fast for the love of doing so:

(1) Read in the directory *once*, and cache it. We have avoided doing
this largely to feel more confident that the code is correct and is
never working on an "outdated" version of the directory.

[On cache invalidation, we can write the directory back out to the
bitmap, and delete our cache. The next time we need the list, we can
reload it. This should alleviate consistency concerns.]

(2) Store the entries in an rbtree! 65536 entries is only ~16 lookups
maximum in the worst case. I took a look at the linux rbtree
implementation and did a very quick back-of-the-envelope benchmarking
for inserting strings (len=32) into a tree:

name generation 53151 usec
insert [0-10] 5 usec
insert [10-100] 14 usec
insert [100-1000] 195 usec
insert [1000-1] 2919 usec
insert [1-65536] 41485 usec

This seems fast enough that we're likely going to be eclipsed just by
other string handling concerns.

--js

Re: [PATCH 00/10] Add Allwinner H3 SoC and Orange Pi PC Machine

2019-12-10 Thread Niek Linnenbank

Hi Philippe,

On Tue, Dec 10, 2019 at 9:26 AM Philippe Mathieu-Daudé 
wrote:

> On 12/9/19 10:37 PM, Niek Linnenbank wrote:
> > Hi Philippe,
> >
> > On Tue, Dec 3, 2019 at 9:47 AM Philippe Mathieu-Daudé  > > wrote:
> >
> > On 12/2/19 10:09 PM, Niek Linnenbank wrote:
> >  > Dear QEMU developers,
> >  >
> >  > Hereby I would like to contribute the following set of patches to
> > QEMU
> >  > which add support for the Allwinner H3 System on Chip and the
> >  > Orange Pi PC machine. The following features and devices are
> > supported:
> >  >
> >  >   * SMP (Quad Core Cortex A7)
> >  >   * Generic Interrupt Controller configuration
> >  >   * SRAM mappings
> >  >   * Timer device (re-used from Allwinner A10)
> >  >   * UART
> >  >   * SD/MMC storage controller
> >  >   * EMAC ethernet connectivity
> >  >   * USB 2.0 interfaces
> >  >   * Clock Control Unit
> >  >   * System Control module
> >  >   * Security Identifier device
> >
> > Awesome!
> >
> >  > Functionality related to graphical output such as HDMI, GPU,
> >  > Display Engine and audio are not included. Recently released
> >  > mainline Linux kernels (4.19 up to latest master) and mainline
> U-Boot
> >  > are known to work. The SD/MMC code is tested using bonnie++ and
> >  > various tools such as fsck, dd and fdisk. The EMAC is verified
> > with iperf3
> >  > using -netdev socket.
> >  >
> >  > To build a Linux mainline kernel that can be booted by the Orange
> > Pi PC
> >  > machine, simply configure the kernel using the sunxi_defconfig
> > configuration:
> >  >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make mrproper
> >  >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make sunxi_defconfig
> >  >
> >  > To be able to use USB storage, you need to manually enable the
> > corresponding
> >  > configuration item. Start the kconfig configuration tool:
> >  >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make menuconfig
> >  >
> >  > Navigate to the following item, enable it and save your
> > configuration:
> >  >   Device Drivers > USB support > USB Mass Storage support
> >  >
> >  > Build the Linux kernel with:
> >  >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make -j5
> >  >
> >  > To boot the newly build linux kernel in QEMU with the Orange Pi
> > PC machine, use:
> >  >   $ qemu-system-arm -M orangepi -m 512 -nic user -nographic \
> >  >   -kernel /path/to/linux/arch/arm/boot/zImage \
> >  >   -append 'console=ttyS0,115200' \
> >  >   -dtb
> /path/to/linux/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dtb
> >  >
> >  > Note that this kernel does not have a root filesystem. You may
> > provide it
> >  > with an official Orange Pi PC image [1] either as an SD card or as
> >  > USB mass storage. To boot using the Orange Pi PC Debian image on
> > SD card,
> >  > simply add the -sd argument and provide the proper root= kernel
> > parameter:
> >  >   $ qemu-system-arm -M orangepi -m 512 -nic user -nographic \
> >  >   -kernel /path/to/linux/arch/arm/boot/zImage \
> >  >   -append 'console=ttyS0,115200 root=/dev/mmcblk0p2' \
> >  >   -dtb
> > /path/to/linux/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dtb \
> >  >   -sd OrangePi_pc_debian_stretch_server_linux5.3.5_v1.0.img
> >  >
> >  > Alternatively, you can also choose to build and boot a recent
> > buildroot [2]
> >  > using the orangepi_pc_defconfig or Armbian image [3] for Orange
> > Pi PC.
> >
> > Richard, trying the Armbian image from
> > https://apt.armbian.com/pool/main/l/linux-4.20.7-sunxi/ I get:
> >
> > $ arm-softmmu/qemu-system-arm -M orangepi -m 512 -nic user \
> > -append 'console=ttyS0,115200' \
> > -kernel boot/vmlinuz-4.20.7-sunxi \
> > -dtb usr/lib/linux-image-dev-sunxi/sun8i-h3-orangepi-pc.dtb \
> > -serial stdio -d unimp
> > Uncompressing Linux... done, booting the kernel.
> > rtc: unimplemented device write (size 4, value 0x16aa0001, offset
> 0x0)
> > rtc: unimplemented device read (size 4, offset 0x0)
> > rtc: unimplemented device read (size 4, offset 0x0)
> > rtc: unimplemented device read (size 4, offset 0x8)
> > qemu-system-arm: target/arm/helper.c:11359: cpu_get_tb_cpu_state:
> > Assertion `flags == rebuild_hflags_internal(env)' failed.
> > Aborted (core dumped)
> >
> >
> > I'm trying to reproduce the error you reported here with my patch set on
> > latest master,
> > but so far without any result. The host OS I'm using is Ubuntu 18.04.3
> > LTS on x86_64.
> > I ran several times using the same 4.20.7-sunxi kernel and same command
> > line.
> >
> > Some questions that might help:
> > 1) Are there any specific steps you did in order to produce this error?
>
> I build QEMU

Re: [PATCH for-5.0 v11 18/20] virtio-iommu: Support migration

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:41PM +0100, Eric Auger wrote:
> +static const VMStateDescription vmstate_virtio_iommu_device = {
> +.name = "virtio-iommu-device",
> +.minimum_version_id = 1,
> +.version_id = 1,
> +.post_load = iommu_post_load,
> +.fields = (VMStateField[]) {
> +VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 1,
> +   _domain, viommu_domain),
> +VMSTATE_GTREE_DIRECT_KEY_V(endpoints, VirtIOIOMMU, 1,
> +   _endpoint, viommu_endpoint),

IIUC vmstate_domain already contains all the endpoint information (in
endpoint_list of vmstate_domain), but here we migrate it twice.  I
suppose that's why now we need reconstruct_ep_domain_link() to fixup
the duplicated migration?

Then I'll instead ask whether we can skip migrating here?  Then in
post_load we simply:

  foreach(domain)
foreach(endpoint in domain)
  g_tree_insert(s->endpoints);

It might help to avoid the reconstruct_ep_domain_link ugliness?

And besides, I also agree with Jean that the endpoint data structure
could be reused with IOMMUDevice somehow.

Thanks,

-- 
Peter Xu

Re: [PATCH 00/10] Add Allwinner H3 SoC and Orange Pi PC Machine

2019-12-10 Thread Niek Linnenbank

Hi Frederic,

On Tue, Dec 10, 2019 at 11:34 AM KONRAD Frederic <
frederic.kon...@adacore.com> wrote:

>
>
> Le 12/2/19 à 10:09 PM, Niek Linnenbank a écrit :
> > Dear QEMU developers,
> >
> > Hereby I would like to contribute the following set of patches to QEMU
> > which add support for the Allwinner H3 System on Chip and the
> > Orange Pi PC machine. The following features and devices are supported:
> >
> >   * SMP (Quad Core Cortex A7)
> >   * Generic Interrupt Controller configuration
> >   * SRAM mappings
> >   * Timer device (re-used from Allwinner A10)
> >   * UART
> >   * SD/MMC storage controller
> >   * EMAC ethernet connectivity
> >   * USB 2.0 interfaces
> >   * Clock Control Unit
> >   * System Control module
> >   * Security Identifier device
> >
> > Functionality related to graphical output such as HDMI, GPU,
> > Display Engine and audio are not included. Recently released
> > mainline Linux kernels (4.19 up to latest master) and mainline U-Boot
> > are known to work. The SD/MMC code is tested using bonnie++ and
> > various tools such as fsck, dd and fdisk. The EMAC is verified with
> iperf3
> > using -netdev socket.
> >
> > To build a Linux mainline kernel that can be booted by the Orange Pi PC
> > machine, simply configure the kernel using the sunxi_defconfig
> configuration:
> >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make mrproper
> >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make sunxi_defconfig
> >
> > To be able to use USB storage, you need to manually enable the
> corresponding
> > configuration item. Start the kconfig configuration tool:
> >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make menuconfig
> >
> > Navigate to the following item, enable it and save your configuration:
> >   Device Drivers > USB support > USB Mass Storage support
> >
> > Build the Linux kernel with:
> >   $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make -j5
> >
> > To boot the newly build linux kernel in QEMU with the Orange Pi PC
> machine, use:
> >   $ qemu-system-arm -M orangepi -m 512 -nic user -nographic \
> >   -kernel /path/to/linux/arch/arm/boot/zImage \
> >   -append 'console=ttyS0,115200' \
> >   -dtb /path/to/linux/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dtb
> >
> > Note that this kernel does not have a root filesystem. You may provide it
> > with an official Orange Pi PC image [1] either as an SD card or as
> > USB mass storage. To boot using the Orange Pi PC Debian image on SD card,
> > simply add the -sd argument and provide the proper root= kernel
> parameter:
> >   $ qemu-system-arm -M orangepi -m 512 -nic user -nographic \
> >   -kernel /path/to/linux/arch/arm/boot/zImage \
> >   -append 'console=ttyS0,115200 root=/dev/mmcblk0p2' \
> >   -dtb /path/to/linux/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dtb \
> >   -sd OrangePi_pc_debian_stretch_server_linux5.3.5_v1.0.img
> >
> > Alternatively, you can also choose to build and boot a recent buildroot
> [2]
> > using the orangepi_pc_defconfig or Armbian image [3] for Orange Pi PC.
> > To attach an USB mass storage device to the machine, simply append to
> the command:
> >   -drive if=none,id=stick,file=myimage.img \
> >   -device usb-storage,bus=usb-bus.0,drive=stick
> >
> > U-Boot mainline can be build and configured using the
> orangepi_pc_defconfig
> > using similar commands as describe above for Linux. To start U-Boot using
> > the Orange Pi PC machine, provide the u-boot binary to the -kernel
> argument:
> >   $ qemu-system-arm -M orangepi -m 512 -nic user -nographic \
> >   -kernel /path/to/uboot/u-boot -sd disk.img
> >
> > Use the following U-boot commands to load and boot a Linux kernel from
> SD card:
> >   -> setenv bootargs console=ttyS0,115200
> >   -> ext2load mmc 0 0x4200 zImage
> >   -> ext2load mmc 0 0x4300 sun8i-h2-plus-orangepi-zero.dtb
> >   -> bootz 0x4200 - 0x4300
> >
> > Looking forward to your review comments. I will do my best
> > to update the patches where needed.
> >
> > With kind regards,
> >
> > Niek Linnenbank
> >
> > [1] http://www.orangepi.org/downloadresources/
> > [2] https://buildroot.org/download.html
> > [3] https://www.armbian.com/orange-pi-pc/
>
> Works well on my side with vanilla linux-4.9.13 built with gcc-8.3.0 +
> busybox
> and sun8i-h3-orangepi-one.dtb.
>

> Tested-by: KONRAD Frederic 
>
> Thank you for testing! Great, I'll add the tag for the next v2 of the
patches.

Regards,
Niek


> >
> > Niek Linnenbank (10):
> >hw: arm: add Allwinner H3 System-on-Chip
> >hw: arm: add Xunlong Orange Pi PC machine
> >arm: allwinner-h3: add Clock Control Unit
> >arm: allwinner-h3: add USB host controller
> >arm: allwinner-h3: add System Control module
> >arm/arm-powerctl: set NSACR.{CP11,CP10} bits in arm_set_cpu_on()
> >arm: allwinner-h3: add CPU Configuration module
> >arm: allwinner-h3: add Security Identifier device
> >arm: allwinner-h3: add SD/MMC host controller
> >arm: allwinner-h3: add EMAC ethernet device
> >
> >

Re: BeagleBone support, omap1, omap2, omap3, etc.

2019-12-10 Thread Niek Linnenbank

Hello Philippe and Esteban,

On Tue, Dec 10, 2019 at 10:55 AM Philippe Mathieu-Daudé 
wrote:

> Hi Esteban,
>
> On 12/3/19 4:24 PM, Esteban Bosse wrote:
> > Ping
> >
> > El mié., 6 nov. 2019 16:04, Esteban Bosse  > > escribió:
> >
> > Hello!
> >
> > Some months ago I started to work trying to port the Beaglebone
> > support from the old qemu-linaro fork to the new QEMU mainstream.
> >
> > During my work I found that the Beaglebone have an OMAP3 mpu this
> > mpu has very strong relation with the OMAP2 and OMAP1 in qemu, they
> > implement a lot of functions in common.
> >
> > Then I understood that the omap1 and omap2 don't implement things
> > like QOM and needs a lot of work to upgrade it, at the same time
> > they are some boards like: omap1_sx, palm, nseries that implement
> > this mpus.
> >
> > Looking the datasheet of the omap1 I realized that it's an very old
> > device and some questions like "make sense work with this old
> > device?" comes to my mind.
>
> The OMAP3 reuse various components of the OMAP1/2.
> Although in old shape, the OMAP1/2 are in the codebase and work.
> It make sense to me to start upgrading the OMAP1/2 to new QOM standard,
> then add the OMAP3 missing parts.
>
> The previous recommendations from Peter are still valid:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg636936.html
>
> Or you can use the schema followed by Niek when adding the Allwinner H3:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg662591.html
>
> That is:
>
> - Add tests using old code (booting Linux, network access in guest)
> - Add an empty board
> - Plug an empty OMAP SoC into the board, add the PoP LPDRAM
> - Add a ARM926 core into the SoC
> - Add most of the devices as UnimplementedDevice
> - Add the interrupt controller in the SoC
> - Add the UART in the SoC
> - Add the Timers in the SoC
> - Try to boot a Linux kernel (UART, TMR, then IRQ tested)
> - Add the SD controller in the SoC
> - Plug a drive to the SD in the board
> - Try to boot u-boot
> - You can now start the OMAP2 using a ARM1136 core
> - Add the missing UNIMP devices (loop to previous steps)
> - Add network controller
> - Run tests (booting Linux, network access in guest)
> - Remove old code
>
> > When I went to the KVM Forum the last week I talked with some of
> > you, and you help my with different ideas and proposal to make this
> > task, but I can't see the right way to make this work because it is
> > a lot of work.
> >
> > My motivation is learn more about embedded devices, architecture,
> > kernel, etc. and of course contribute to the community.
> >
> > I would love to hear your opinions about this 3 related devices with
> > they respected boards.
> >
> > Maybe someone is interested to work with me.
> > I dream to make this work beautiful (like the musca board with the
> > armsse and armv7m modules) with a good variety of tests. And in the
> > same time I would like to write some documentation about the process
> > with the final idea to "make an easier way for new contributors".
>
> Very good idea.
>
> Niek, since you recently did the same, do you mind sharing your
> experience, tell us what was not clear or hard to understand, so we can
> have a better idea what part of the documentation/process we should
> improve first, to help and welcome new contributors?
>

Sure! Based on my own experience with the Allwinner H3, I can fully
recommend the steps
described above by Philippe to get the work done. Those are mostly the
things I did as well.

I think the best advice I can give you to get started is, start with the
bare minimum: kernel output.
What I mean by that is, get the linux source and compile it for your target
machine. Next, take the QEMU source and choose
any existing machine that come closest to the machine or SoC that you want
to implement.
Then, just try to get the kernel output working via the serial console by
loading it with -kernel, -append and -dtb arguments.

If you are lucky, serial output already works since the machine is similar
to the one you want to implement. If not,
you may need to check for things like the load address and DRAM addresses
first and try to get output
by reading the kernel dmesg via GDB [1]. If you start QEMU with -s -S
arguments, connect with gdb
and give the 'lx-dmesg' command you'll read the kernel output before it
goes to the serial device.
If you at least selected the right processor and things like the load
address are OK, chances are good
that you at least get some logging.  And then, you have a starting point to
start the real work using the
steps described above by Philippe.

Regards,
Niek

[1]
https://www.kernel.org/doc/html/v4.10/dev-tools/gdb-kernel-debugging.html



>
> >
> > If someone want to work with me in this task, should know that I
> > don't have to much experience and I'm doing this job in my free time
> >

Re: [PATCH for-5.0 v11 14/20] virtio-iommu: Handle reserved regions in the translation process

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:37PM +0100, Eric Auger wrote:
> When translating an address we need to check if it belongs to
> a reserved virtual address range. If it does, there are 2 cases:
> 
> - it belongs to a RESERVED region: the guest should neither use
>   this address in a MAP not instruct the end-point to DMA on
>   them. We report an error
> 
> - It belongs to an MSI region: we bypass the translation.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v10 -> v11:
> - directly use the reserved_regions properties array
> 
> v9 -> v10:
> - in case of MSI region, we immediatly return
> ---
>  hw/virtio/virtio-iommu.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 1ce2218935..c5b202fab7 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -548,6 +548,7 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>  uint32_t sid, flags;
>  bool bypass_allowed;
>  bool found;
> +int i;
>  
>  interval.low = addr;
>  interval.high = addr + 1;
> @@ -580,6 +581,22 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>  goto unlock;
>  }
>  
> +for (i = 0; i < s->nb_reserved_regions; i++) {
> +if (interval.low >= s->reserved_regions[i].low &&
> +interval.low <= s->reserved_regions[i].high) {
> +switch (s->reserved_regions[i].type) {
> +case VIRTIO_IOMMU_RESV_MEM_T_MSI:
> +entry.perm = flag;
> +goto unlock;

Might be a bit clearer to break here instead of goto, then..

> +case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:

   /* Passthrough */

> +default:
> +virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
> +  0, sid, addr);
> +goto unlock;

.. do the same thing here, and...

> +   }

.. goto unlock here..

> +}
> +}
> +
>  if (!ep->domain) {
>  if (!bypass_allowed) {
>  qemu_log_mask(LOG_GUEST_ERROR,
> -- 
> 2.20.1
> 

-- 
Peter Xu

Re: [PATCH for-5.0 v11 13/20] virtio-iommu: Implement probe request

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:36PM +0100, Eric Auger wrote:
> This patch implements the PROBE request. At the moment,
> no reserved regions are returned as none are registered
> per device. Only a NONE property is returned.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH for-5.0 v11 08/20] virtio-iommu: Implement translate

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:31PM +0100, Eric Auger wrote:
> This patch implements the translate callback
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v10 -> v11:
> - take into account the new value struct and use
>   g_tree_lookup_extended
> - switched to error_report_once
> 
> v6 -> v7:
> - implemented bypass-mode
> 
> v5 -> v6:
> - replace error_report by qemu_log_mask
> 
> v4 -> v5:
> - check the device domain is not NULL
> - s/printf/error_report
> - set flags to IOMMU_NONE in case of all translation faults
> ---
>  hw/virtio/trace-events   |  1 +
>  hw/virtio/virtio-iommu.c | 63 +++-
>  2 files changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index f25359cee2..de7cbb3c8f 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -72,3 +72,4 @@ virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc 
> endpoint=%d"
>  virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
>  virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
>  virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
> +virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t 
> sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index f0a56833a2..a83666557b 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -412,19 +412,80 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>  int iommu_idx)
>  {
>  IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> +viommu_interval interval, *mapping_key;
> +viommu_mapping *mapping_value;
> +VirtIOIOMMU *s = sdev->viommu;
> +viommu_endpoint *ep;
> +bool bypass_allowed;
>  uint32_t sid;
> +bool found;
> +
> +interval.low = addr;
> +interval.high = addr + 1;
>  
>  IOMMUTLBEntry entry = {
>  .target_as = _space_memory,
>  .iova = addr,
>  .translated_addr = addr,
> -.addr_mask = ~(hwaddr)0,
> +.addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
>  .perm = IOMMU_NONE,
>  };
>  
> +bypass_allowed = virtio_has_feature(s->acked_features,
> +VIRTIO_IOMMU_F_BYPASS);
> +

Would it be easier to check bypass_allowed here once and then drop the
latter [1] and [2] check?

>  sid = virtio_iommu_get_sid(sdev);
>  
>  trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
> +qemu_mutex_lock(>mutex);
> +
> +ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
> +if (!ep) {
> +if (!bypass_allowed) {

[1]

> +error_report_once("%s sid=%d is not known!!", __func__, sid);
> +} else {
> +entry.perm = flag;
> +}
> +goto unlock;
> +}
> +
> +if (!ep->domain) {
> +if (!bypass_allowed) {

[2]

> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "%s %02x:%02x.%01x not attached to any domain\n",
> +  __func__, PCI_BUS_NUM(sid),
> +  PCI_SLOT(sid), PCI_FUNC(sid));
> +} else {
> +entry.perm = flag;
> +}
> +goto unlock;
> +}
> +
> +found = g_tree_lookup_extended(ep->domain->mappings, 
> (gpointer)(),
> +   (void **)_key,
> +   (void **)_value);
> +if (!found) {
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "%s no mapping for 0x%"PRIx64" for sid=%d\n",
> +  __func__, addr, sid);

I would still suggest that we use the same logging interface (either
error_report_once() or qemu_log_mask(), not use them randomly).

> +goto unlock;
> +}
> +
> +if (((flag & IOMMU_RO) &&
> +!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
> +((flag & IOMMU_WO) &&
> +!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "Permission error on 0x%"PRIx64"(%d): allowed=%d\n",
> +  addr, flag, mapping_value->flags);

(Btw, IIUC this may not be a guest error. Say, what if the device is
 simply broken?)

> +goto unlock;
> +}
> +entry.translated_addr = addr - mapping_key->low + 
> mapping_value->phys_addr;
> +entry.perm = flag;
> +trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
> +
> +unlock:
> +qemu_mutex_unlock(>mutex);
>  return entry;
>  }
>  
> -- 
> 2.20.1
> 

-- 
Peter Xu

Re: [PATCH 2/2] analyze-migration.py: replace numpy with python 3.2

2019-12-10 Thread Eduardo Habkost

On Tue, Dec 10, 2019 at 12:57:28PM -0500, Cleber Rosa wrote:
> On Tue, Dec 10, 2019 at 12:49:11PM -0500, Cleber Rosa wrote:
> > On Tue, Dec 10, 2019 at 02:14:18PM +0400, Marc-André Lureau wrote:
> > > Hi
> > > 
> > > On Tue, Dec 10, 2019 at 6:59 AM Eduardo Habkost  
> > > wrote:
> > > >
> > > > On Fri, Dec 06, 2019 at 09:27:23AM -0500, Cleber Rosa wrote:
> > > > > On Wed, Nov 27, 2019 at 02:10:38PM +0400, Marc-André Lureau wrote:
> > > > > > Use int.from_bytes() from python 3.2 instead.
> > > > > >
> > > > > > Signed-off-by: Marc-André Lureau 
> > > > > > ---
> > > > > >  scripts/analyze-migration.py | 35 
> > > > > > +++
> > > > > >  1 file changed, 19 insertions(+), 16 deletions(-)
> > > > > >
> > > > > > diff --git a/scripts/analyze-migration.py 
> > > > > > b/scripts/analyze-migration.py
> > > > > > index 2b835d9b70..96a31d3974 100755
> > > > > > --- a/scripts/analyze-migration.py
> > > > > > +++ b/scripts/analyze-migration.py
> > > > > > @@ -1,4 +1,4 @@
> > > > > > -#!/usr/bin/env python
> > > > > > +#!/usr/bin/env python3
> > > > [...]
> > > > >
> > > > > Marc-André, I couldn't yet pinpoint the reason yet, but this patch
> > > > > changes the parsing of bool fields.  This is a diff between the output
> > > > > pre/post this patch on the same images:
> > > > >
> > > > > $ diff -u out_x8664_pre out_x8664_post
> > > > > --- out_x8664_pre   2019-12-06 09:14:16.128943264 -0500
> > > > > +++ out_x8664_post  2019-12-06 09:23:35.861378600 -0500
> > > > > @@ -3039,7 +3039,7 @@
> > > > >  "mac_reg[RADV]": "0x",
> > > > >  "mac_reg[TADV]": "0x",
> > > > >  "mac_reg[ITR]": "0x",
> > > > > -"mit_irq_level": true
> > > > > +"mit_irq_level": false
> > > > >  },
> > > > >  "e1000/full_mac_state": {
> > > > >  "mac_reg": [
> > > > > @@ -36010,10 +36010,10 @@
> > > > >  ],
> > > > >  "smb_auxctl": "0x02",
> > > > >  "smb_blkdata": "0x00",
> > > > > -"i2c_enable": true,
> > > > > +"i2c_enable": false,
> > > > >  "op_done": true,
> > > > > -"in_i2c_block_read": true,
> > > > > -"start_transaction_on_status_read": true
> > > > > +"in_i2c_block_read": false,
> > > > > +"start_transaction_on_status_read": false
> > > > >  },
> > > > >  "ar.tmr.timer": "ff ff ff ff ff ff ff ff",
> > > > >  "ar.tmr.overflow_time": "0x",
> > > > >
> > > > > This true/false flipping is consistent across various images (tried on
> > > > > images generated on a few other targets).
> > > >
> > > > It looks like moving to python3 accidentally fixes a bug.
> > > >
> > > > This is VMSDFieldBool.read:
> > > >
> > > > def read(self):
> > > > super(VMSDFieldBool, self).read()
> > > > if self.data[0] == 0:
> > > > self.data = False
> > > > else:
> > > > self.data = True
> > > > return self.data
> > > >
> > > > On python2, MigrationFile.readvar() returned a string, so the
> > > > (self.data[0] == 0) check was never true.  This means all boolean
> > > > fields were always initialized to True.
> > > >
> > > > On python3, MigrationFile.readvar() returns a bytearray, so the
> > > > (self.data[0] == 0) check now works as expected.
> > > 
> > > Ah! nice surprise. Do you mind updating the commit message on commit?
> > > Or should I resend?
> > > 
> > > thanks
> > > 
> > > -- 
> > > Marc-André Lureau
> > > 
> > 
> > Yep, I'm queueing this with an updated commit message.
> > 
> > Eduardo, does your comment imply a "Reviewed-by"?

Sure!

Reviewed-by: Eduardo Habkost 


> > 
> > Cheers,
> > - Cleber.
> 
> Eduardo,
> 
> I only noticed now that you queued patch 1/2.  Do you want me to queue
> that one instead?  Or do you wanto to queue both on this series?

If you are planning a new pull request soon, feel free to queue
both, and also to pull all patches from my python-next branch so
it is included in a single pull request.  Thanks!

-- 
Eduardo

Re: Avocado notes from KVM forum 2019

2019-12-10 Thread Willian Rampazzo

On Mon, Nov 25, 2019 at 3:10 PM Cleber Rosa  wrote:
>
> On Mon, Nov 25, 2019 at 10:58:02AM -0300, Eduardo Habkost wrote:
> > Thank you, Philippe, those are great ideas.  I have copied them
> > to the Avocado+QEMU Trello board so we don't forget about them:
> > https://trello.com/b/6Qi1pxVn/avocado-qemu
> >
> > Additional comments below:
> >
> > On Mon, Nov 25, 2019 at 01:35:13PM +0100, Philippe Mathieu-Daudé wrote:
> > > Hi Cleber,
> > >
> > > Here are my notes from talking about Avocado with various people during 
> > > the
> > > KVM forum in Lyon last month.
> > >
> > > All comments are QEMU oriented.
> > >
> > >
> > > 1) Working offline
> > >
> > > Various people complained Avocado requires online access, and they would
> > > like to use it offline.
> > >
> > >   Maintainer workflow example is:
> > >
> > >   - run avocado
> > >   - hack QEMU, build
> > >   - git pull
> > >   - build
> > >   - hack QEMU
> > >   (go offline)
> > >   - hack QEMU
> > >   - build
> > >   - run avocado <- FAILS
> > >
> >
> > Ouch.  This shouldn't happen even with no explicit --offline
> > option.  Failure to download artifacts shouldn't make tests
> > report failure.
> >
> >
>
> Agreed.  There are a number of work items already to cover this.  One
> is a more generic test metadata collection system:
>
>https://trello.com/c/lumR8u8Y/1526-rfc-nrunner-extended-metadata
>
> We already have code that can find the required assets, and with that,
> we can let the job (not the test) attempt to fulfill those
> requirements, skipping the tests if they can not be fulfilled.
>
> Until this is available, we can wrap the "fetch_asset()" calls and
> cancel the test if it fails.
>
> > > Failure is because mainstream added new tests, which got pulled in, and 
> > > the
> > > user only notice when running avocado again, but offline.
> > > Test example is boot_linux_console.py, which added various tests from 
> > > other
> > > subsystems, so the maintainer has to disable the new tests manually to be
> > > able to run his previous tests.
> > >
> > > Expected solution: skip tests when artifact is not available, eventually
> > > when the --offline option is used
> > >
> > >
> > > 2) Add artifacts manually to the cache
> > >
> > > Not all artifacts can be easily downloadable, some are public but require
> > > the user to accept an End User License Agreement.
> > > Users would like to share their tests with the documentation about 
> > > where/how
> > > to download the requisite files (accepting the EULA) to run the tests.
> > >
> > >
> > > 2b) Add reference to artifact to the cache
> > >
> > > Group of users might share group of files (like NFS storage) and would 
> > > like
> > > to use directly their remote read-only files, instead of copying it to 
> > > their
> > > home directory.
> >
> > This sounds nice and useful, but I don't know how to make the
> > interface for this usable.
> >
> >
>
> I guess this would require an Avocado installation-wide configuration
> entry for the available cache directories.  IMO given that
> configuration is applied, the tests should transparently find assets
> in the configured locations.
>
> > >
> > >
> > > 3) Provide qemu/avocado-qemu Python packages
> > >
> > > The mainstream project uses Avocado to test QEMU. Others projects use QEMU
> > > to test their code, and would like to automatize that using Avocado. They
> > > usually not rebuild QEMU but use a stable binary from distributions. The
> > > python classes are not available, so they have to clone QEMU to use 
> > > Avocado
> > > (I guess they only need 5 python files).
> > > When running on Continuous Integration, this is overkill, because when you
> > > clone QEMU you also clone various other submodules.
> >
> > I only have one concern, here: I don't think we have the
> > bandwidth to start maintaining a stable external Python API.
> > Users of those packages will need to be aware that future
> > versions of the modules might have incompatible APIs.
> >
>
> My understanding is that we would publish those files as a Python
> module with versions matching QEMU.  No stability would be promised.
> Users can always require a specific version of the Python module that
> matches the QEMU version they expect/want to use.
>
> > >
> > >
> > > 4) Warn the user when Avocado is too old for the tests
> > >
> > > Some users tried Avocado following the examples on the mailing list and 
> > > the
> > > one in some commit's descriptions where we simply show "avocado run ...".
> >
> > Oops.
> >
> > > They installed the distribution Avocado package and tried and it fails for
> > > few of them with no obvious reason (the .log file is hard to read when you
> > > are not custom to). IIUC their distribution provides a older Avocado (69?)
> > > while we use recent features (72).
> > >
> > > We never noticed it because we use 'make check-venv' and do not test the
> > > distrib Avocado. While we can not test all distribs, we could add a 
> > > version
> > > test if the

Re: [PATCH for-5.0 v11 04/20] virtio-iommu: Add the iommu regions

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:27PM +0100, Eric Auger wrote:
> This patch initializes the iommu memory regions so that
> PCIe end point transactions get translated. The translation
> function is not yet implemented though.
> 
> Signed-off-by: Eric Auger 

Either with/without Jean's comment addressed:

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH 01/10] hw: arm: add Allwinner H3 System-on-Chip

2019-12-10 Thread Niek Linnenbank

Hi Philippe,

On Tue, Dec 10, 2019 at 10:02 AM Philippe Mathieu-Daudé 
wrote:

> On 12/2/19 10:09 PM, Niek Linnenbank wrote:
> > The Allwinner H3 is a System on Chip containing four ARM Cortex A7
> > processor cores. Features and specifications include DDR2/DDR3 memory,
> > SD/MMC storage cards, 10/100/1000Mbit ethernet, USB 2.0, HDMI and
> > various I/O modules. This commit adds support for the Allwinner H3
> > System on Chip.
> >
> > Signed-off-by: Niek Linnenbank 
> > ---
> [...]
> > +
> > +/* UART */
> > +if (serial_hd(0)) {
>
> As the uart0 is always mapped in the SoC, don't use 'if serial_hd()',
> instead map it regardless a console is connected.
>

Indeed, the UARTs should always be mapped for this SoC.
Noted, I'll solve this too for in the v2 patch update.

Regards,
Niek


>
> > +serial_mm_init(get_system_memory(), AW_H3_UART0_REG_BASE, 2,
> > +   s->irq[AW_H3_GIC_SPI_UART0], 115200,
> serial_hd(0),
> > +   DEVICE_NATIVE_ENDIAN);
> > +}
> > +
> > +/* Unimplemented devices */
> > +create_unimplemented_device("display-engine", AW_H3_DE_BASE,
> AW_H3_DE_SIZE);
> > +create_unimplemented_device("dma", AW_H3_DMA_BASE, AW_H3_DMA_SIZE);
> > +create_unimplemented_device("lcd0", AW_H3_LCD0_BASE,
> AW_H3_LCD0_SIZE);
> > +create_unimplemented_device("lcd1", AW_H3_LCD1_BASE,
> AW_H3_LCD1_SIZE);
> > +create_unimplemented_device("gpu", AW_H3_GPU_BASE, AW_H3_GPU_SIZE);
> > +create_unimplemented_device("hdmi", AW_H3_HDMI_BASE,
> AW_H3_HDMI_SIZE);
> > +create_unimplemented_device("rtc", AW_H3_RTC_BASE, AW_H3_RTC_SIZE);
> > +create_unimplemented_device("audio-codec", AW_H3_AC_BASE,
> AW_H3_AC_SIZE);
>
>

-- 
Niek Linnenbank

Re: [PATCH 02/10] hw: arm: add Xunlong Orange Pi PC machine

2019-12-10 Thread Niek Linnenbank

Hi Philippe,

On Tue, Dec 10, 2019 at 9:59 AM Philippe Mathieu-Daudé 
wrote:

> On 12/6/19 11:15 PM, Niek Linnenbank wrote:
> [...]
> >  >  > +static void orangepi_machine_init(MachineClass *mc)
> >  >  > +{
> >  >  > +mc->desc = "Orange Pi PC";
> >  >  > +mc->init = orangepi_init;
> >  >  > +mc->units_per_default_bus = 1;
> >  >  > +mc->min_cpus = AW_H3_NUM_CPUS;
> >  >  > +mc->max_cpus = AW_H3_NUM_CPUS;
> >  >  > +mc->default_cpus = AW_H3_NUM_CPUS;
> >  >
> >  >  mc->default_cpu_type =
> ARM_CPU_TYPE_NAME("cortex-a7");
> >  >
> >  >  > +mc->ignore_memory_transaction_failures = true;
> >  >
> >  > You should not use this flag in new design. See the
> > documentation in
> >  > include/hw/boards.h:
> >  >
> >  >* @ignore_memory_transaction_failures:
> >  >*[...] New board models
> >  >*should instead use "unimplemented-device" for all
> memory
> >  > ranges where
> >  >*the guest will attempt to probe for a device that
> > QEMU doesn't
> >  >*implement and a stub device is required.
> >  >
> >  > You already use the "unimplemented-device".
> >  >
> >  > Thanks, I'm working on this now. I think that at least I'll need
> > to add
> >  > all of the devices mentioned in the 4.1 Memory Mapping chapter of
> > the
> >  > datasheet
> >  > as an unimplemented device. Previously I only added some that I
> > thought
> >  > were relevant.
> >  >
> >  > I added all the missing devices as unimplemented and removed the
> >  > ignore_memory_transaction_failures flag
> >
> > I was going to say, "instead of adding *all* the devices regions you
> > can
> > add the likely bus decoding regions", probably:
> >
> > 0x01c0.   128KiB   AMBA AXI
> > 0x01c2.   64KiBAMBA APB
> >
> > But too late.
> >
> >
> > Hehe its okey, I can change it to whichever is preferable: the minimum
> set
> > with unimplemented device entries to get a working linux kernel / u-boot
> or
> > just cover the full memory space from the datasheet. My initial thought
> > was that if
> > we only provide the minimum set, and the linux kernel later adds a new
> > driver for a device
> > which is not marked unimplemented, it will trigger the data abort and
> > potentially resulting in a non-booting kernel.
> >
> > But I'm not sure what is normally done here. I do see other board files
> > using the create_unimplemented_device() function,
> > but I dont know if they are covering the whole memory space or not.
>
> If they don't cover all memory regions, the guest code can trigger a
> data abort indeed. Since we don't know the memory layout of old board,
> they are still supported with ignore_memory_transaction_failures=true,
> so guest still run unaffected.
> We expect new boards to implement the minimum layout.
> As long as your guest is happy and usable, UNIMP devices are fine,
> either as big region or individual device (this requires someone with
> access to the datasheet to verify). If you can add a UNIMP for each
> device - which is what you did - it is even better because the the unimp
> access log will be more useful, having finer granularity.
>
>  > I added all the missing devices as unimplemented and removed the
>  > ignore_memory_transaction_failures flag
>
> IOW, you already did the best you could do :)
>
> >  > from the machine. Now it seems Linux gets a data abort while
> > probing the
> >  > uart1 serial device at 0x01c28400,
> >
> > Did you add the UART1 as UNIMP or 16550?
> >
> >
> > I discovered what goes wrong here. See this kernel oops message:
> >
> > [1.084985] [f08600f8] *pgd=6f00a811, *pte=01c28653, *ppte=01c28453
> > [1.085564] Internal error: : 8 [#1] SMP ARM
> > [1.085698] Modules linked in:
> > [1.085940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 5.4.0-11747-g2f13437b8917 #4
> > [1.085968] Hardware name: Allwinner sun8i Family
> > [1.086447] PC is at dw8250_setup_port+0x10/0x10c
> > [1.086478] LR is at dw8250_probe+0x500/0x56c
> >
> > It tries to access the UART0 at base address 0x01c28400, which I did
> > provide. The strange
> > thing is that is accesses offset 0xf8, thus address 0x01c284f8. The
> > datasheet does not mention this register
> > but if we provide the 1KiB (0x400) I/O space it should at least read as
> > zero and writes ignored. Unfortunately the emulated
> > serial driver only maps a small portion until 0x1f:
> >
> > (qemu) info mtree
> > ...
> >  01c28000-01c2801f (prio 0, i/o): serial
> >  01c28400-01c2841f (prio 0, i/o): serial
> >  01c28800-01c2881f (prio 0, i/o): serial
> >
> >
> > Apparently, the register that the mainline linux kernel is using is
> > DesignWare specific:
> >
> >

Re: [PATCH for-5.0 v11 03/20] virtio-iommu: Decode the command payload

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 07:29:26PM +0100, Eric Auger wrote:
> This patch adds the command payload decoding and
> introduces the functions that will do the actual
> command handling. Those functions are not yet implemented.
> 
> Signed-off-by: Eric Auger 

I would simply squash this into previous patch to avoid removing lines
from newly introduced, but this is ok too so to keep Jean's r-b:

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH 04/10] arm: allwinner-h3: add USB host controller

2019-12-10 Thread Niek Linnenbank

Hi Gerd,

On Tue, Dec 10, 2019 at 9:29 AM Gerd Hoffmann  wrote:

> On Tue, Dec 10, 2019 at 08:56:02AM +0100, Philippe Mathieu-Daudé wrote:
> > On 12/2/19 10:09 PM, Niek Linnenbank wrote:
> > > The Allwinner H3 System on Chip contains multiple USB 2.0 bus
> > > connections which provide software access using the Enhanced
> > > Host Controller Interface (EHCI) and Open Host Controller
> > > Interface (OHCI) interfaces. This commit adds support for
> > > both interfaces in the Allwinner H3 System on Chip.
> > >
> > > Signed-off-by: Niek Linnenbank 
> > > ---
> > >   hw/arm/allwinner-h3.c| 20 
> > >   hw/usb/hcd-ehci-sysbus.c | 17 +
> > >   hw/usb/hcd-ehci.h|  1 +
> >
> > Cc'ing Gerd, the maintainer of these files.
>
> Looks all reasonable.
> Reviewed-by: Gerd Hoffmann 
>
> (assuming this will be merged through arm tree not usb).
>

Thanks for reviewing! I'll add the tag to the commit message.

Regards,
Niek

>
> >
> > Reviewed-by: Philippe Mathieu-Daudé 
> >
> > >   3 files changed, 38 insertions(+)
> > >
> > > diff --git a/hw/arm/allwinner-h3.c b/hw/arm/allwinner-h3.c
> > > index 5566e979ec..afeb49c0ac 100644
> > > --- a/hw/arm/allwinner-h3.c
> > > +++ b/hw/arm/allwinner-h3.c
> > > @@ -26,6 +26,7 @@
> > >   #include "hw/sysbus.h"
> > >   #include "hw/arm/allwinner-h3.h"
> > >   #include "hw/misc/unimp.h"
> > > +#include "hw/usb/hcd-ehci.h"
> > >   #include "sysemu/sysemu.h"
> > >   static void aw_h3_init(Object *obj)
> > > @@ -183,6 +184,25 @@ static void aw_h3_realize(DeviceState *dev, Error
> **errp)
> > >   }
> > >   sysbus_mmio_map(SYS_BUS_DEVICE(>ccu), 0, AW_H3_CCU_BASE);
> > > +/* Universal Serial Bus */
> > > +sysbus_create_simple(TYPE_AW_H3_EHCI, AW_H3_EHCI0_BASE,
> > > + s->irq[AW_H3_GIC_SPI_EHCI0]);
> > > +sysbus_create_simple(TYPE_AW_H3_EHCI, AW_H3_EHCI1_BASE,
> > > + s->irq[AW_H3_GIC_SPI_EHCI1]);
> > > +sysbus_create_simple(TYPE_AW_H3_EHCI, AW_H3_EHCI2_BASE,
> > > + s->irq[AW_H3_GIC_SPI_EHCI2]);
> > > +sysbus_create_simple(TYPE_AW_H3_EHCI, AW_H3_EHCI3_BASE,
> > > + s->irq[AW_H3_GIC_SPI_EHCI3]);
> > > +
> > > +sysbus_create_simple("sysbus-ohci", AW_H3_OHCI0_BASE,
> > > + s->irq[AW_H3_GIC_SPI_OHCI0]);
> > > +sysbus_create_simple("sysbus-ohci", AW_H3_OHCI1_BASE,
> > > + s->irq[AW_H3_GIC_SPI_OHCI1]);
> > > +sysbus_create_simple("sysbus-ohci", AW_H3_OHCI2_BASE,
> > > + s->irq[AW_H3_GIC_SPI_OHCI2]);
> > > +sysbus_create_simple("sysbus-ohci", AW_H3_OHCI3_BASE,
> > > + s->irq[AW_H3_GIC_SPI_OHCI3]);
> > > +
> > >   /* UART */
> > >   if (serial_hd(0)) {
> > >   serial_mm_init(get_system_memory(), AW_H3_UART0_REG_BASE, 2,
> > > diff --git a/hw/usb/hcd-ehci-sysbus.c b/hw/usb/hcd-ehci-sysbus.c
> > > index 020211fd10..174c3446ef 100644
> > > --- a/hw/usb/hcd-ehci-sysbus.c
> > > +++ b/hw/usb/hcd-ehci-sysbus.c
> > > @@ -145,6 +145,22 @@ static const TypeInfo ehci_exynos4210_type_info =
> {
> > >   .class_init= ehci_exynos4210_class_init,
> > >   };
> > > +static void ehci_aw_h3_class_init(ObjectClass *oc, void *data)
> > > +{
> > > +SysBusEHCIClass *sec = SYS_BUS_EHCI_CLASS(oc);
> > > +DeviceClass *dc = DEVICE_CLASS(oc);
> > > +
> > > +sec->capsbase = 0x0;
> > > +sec->opregbase = 0x10;
> > > +set_bit(DEVICE_CATEGORY_USB, dc->categories);
> > > +}
> > > +
> > > +static const TypeInfo ehci_aw_h3_type_info = {
> > > +.name  = TYPE_AW_H3_EHCI,
> > > +.parent= TYPE_SYS_BUS_EHCI,
> > > +.class_init= ehci_aw_h3_class_init,
> > > +};
> > > +
> > >   static void ehci_tegra2_class_init(ObjectClass *oc, void *data)
> > >   {
> > >   SysBusEHCIClass *sec = SYS_BUS_EHCI_CLASS(oc);
> > > @@ -267,6 +283,7 @@ static void ehci_sysbus_register_types(void)
> > >   type_register_static(_platform_type_info);
> > >   type_register_static(_xlnx_type_info);
> > >   type_register_static(_exynos4210_type_info);
> > > +type_register_static(_aw_h3_type_info);
> > >   type_register_static(_tegra2_type_info);
> > >   type_register_static(_ppc4xx_type_info);
> > >   type_register_static(_fusbh200_type_info);
> > > diff --git a/hw/usb/hcd-ehci.h b/hw/usb/hcd-ehci.h
> > > index 0298238f0b..edb59311c4 100644
> > > --- a/hw/usb/hcd-ehci.h
> > > +++ b/hw/usb/hcd-ehci.h
> > > @@ -342,6 +342,7 @@ typedef struct EHCIPCIState {
> > >   #define TYPE_SYS_BUS_EHCI "sysbus-ehci-usb"
> > >   #define TYPE_PLATFORM_EHCI "platform-ehci-usb"
> > >   #define TYPE_EXYNOS4210_EHCI "exynos4210-ehci-usb"
> > > +#define TYPE_AW_H3_EHCI "aw-h3-ehci-usb"
> > >   #define TYPE_TEGRA2_EHCI "tegra2-ehci-usb"
> > >   #define TYPE_PPC4xx_EHCI "ppc4xx-ehci-usb"
> > >   #define TYPE_FUSBH200_EHCI "fusbh200-ehci-usb"
> > >
> >
>
>

-- 
Niek Linnenbank

Re: [PATCH v1 00/36] Add RISC-V Hypervisor Extension v0.5

2019-12-10 Thread Aleksandar Markovic

On Tue, Dec 10, 2019 at 1:04 AM Alistair Francis  wrote:
>
> On Mon, Dec 9, 2019 at 2:55 PM Aleksandar Markovic
>  wrote:
> >
> >
> >
> > On Monday, December 9, 2019, Alistair Francis  
> > wrote:
> >>
> >> This patch series adds the RISC-V Hypervisor extension v0.5. This is the
> >> latest draft spec of the Hypervisor extension.
> >>
> >
> > Hi, Alistair,
> >
> > I have a question for you:
> >
> > Let's say this series is accepted. And let's say, next year, the draft spec 
> > of RISC-V Hypervisor extension v0.6 is released, and you or somebody else 
> > comes up with series on QEMU support for it, and that series is accepted 
> > too. What would happen afterwards:
> >
> > A. Both support for v0.5 and v0.6 would continue to coexist perpetually
> >
> > B. Support for v0.5 would be deprecated according to QEMU deprecation 
> > rules, and in two cycle would disappear
> >
> > C. Support for v0.5 would abruptly stop existing
>
> My current plan is to upgrade the implementation to the next version
> and drop v0.5 when a new spec release. happens.
>
> The justification for this is that the Hypervisor is a draft
> extension, disabled by default and marked as experimental. Therefore I
> think it's ok to just support the latest version of the spec.
>

OK. Thanks for clarification.

> Alistair
>
> >
> > D. Something else
> >
> > ?
> >
> > Thanks,
> > Aleksandar
> >
> >
> >>
> >> The Hypervisor extension is disabled by default, so this series should
> >> result in no changes to anyone using QEMU unless they enable the
> >> extension. The extention can be enabled with the -cpu property (see
> >> below).
> >>
> >> Testing of this implementation has been done by using the baremetal
> >> Xvisor Hypervisor. We are able to run two Linux guests (that's all I
> >> have tried) as guests in 64-bit. In 32-bit so far I can only run
> >> baremetal guests, but I think this is a baremetal boot loader issue and
> >> not an issue in QEMU.
> >>
> >> The RISC-V KVM implementation was also written using these patches. The
> >> KVM implementation is currently under review.
> >>
> >> At the moment this spec is in a draft state and is subject to change. As
> >> QEMU is extreamly useful in early bring up I think it makes sense for
> >> QEMU to support non-frozen extensions.
> >>
> >> Thanks to Anup for doing the initial port of Xvisor. The port is avaliable 
> >> here:
> >> https://github.com/avpatel/xvisor-next and will run on QEMU.
> >>
> >> Also thanks to Atish for implementing the SBI call support in Xvisor and
> >> for lots of help debugging.
> >>
> >> To run this yourself:
> >>  1. Apply this patch series to QEMU. The latest branch can be found here:
> >>   
> >> https://github.com/alistair23/qemu/tree/mainline/alistair/riscv-hyp-ext-v0.5.next
> >>  2. Get the version of OpenSBI that supports the H extension. This can
> >> be found here:
> >>   https://github.com/avpatel/opensbi/tree/riscv_hyp_ext_0_5_v1
> >>  3. Build the next release of Xvisor. It is available here:
> >>   https://github.com/avpatel/xvisor-next
> >>  4. Make sure you build the Xvisor tests, see here for details:
> >>   
> >> https://github.com/avpatel/xvisor-next/tree/master/tests/riscv/virt64/linux
> >>  5. Run QEMU:
> >>  ./riscv64-softmmu/qemu-system-riscv64 -nographic \
> >>-machine virt -cpu rv64,x-h=true \
> >>-serial mon:stdio -serial null -m 4G \
> >>-device loader,file=vmm.bin,addr=0x8020 \
> >>-kernel fw_jump.elf \
> >>-initrd vmm-disk-linux.img \
> >>-append "vmm.console=uart@1000 vmm.bootcmd=\"vfs mount initrd 
> >> /;vfs run /boot.xscript;vfs cat /system/banner.txt\""
> >>
> >>Once you get to the prompt you can start the geust by running:
> >>  guest kick guest0
> >>You can then bind to the serial port using:
> >>  vserial bind guest0/uart0
> >>Then you can start Linux using:
> >>  autoexec
> >>
> >>  This was all tested with the mainline 5.2/5.3 kernels.
> >>
> >> There is very early work on a Xen port as well which is avaliable here:
> >> https://github.com/alistair23/xen/tree/alistair/riscv-port
> >>
> >> ToDo/Issues
> >>  - Get 32-bit fully working
> >>
> >>
> >>
> >> Alistair Francis (36):
> >>   target/riscv: Convert MIP CSR to target_ulong
> >>   target/riscv: Don't set write permissions on dirty PTEs
> >>   target/riscv: Add the Hypervisor extension
> >>   target/riscv: Add the Hypervisor CSRs to CPUState
> >>   target/riscv: Add support for the new execption numbers
> >>   target/riscv: Rename the H irqs to VS irqs
> >>   target/riscv: Add the virtulisation mode
> >>   target/riscv: Add the force HS exception mode
> >>   target/riscv: Fix CSR perm checking for HS mode
> >>   target/riscv: Print priv and virt in disas log
> >>   target/riscv: Dump Hypervisor registers if enabled
> >>   target/riscv: Add Hypervisor CSR access functions
> >>   target/riscv: Add Hypervisor virtual CSRs accesses
> >>   target/riscv: Add Hypervisor virtual CSRs

Re: [PATCH] riscv/sifive_u: fix a memory leak in soc_realize()

2019-12-10 Thread Alistair Francis

On Mon, Dec 9, 2019 at 11:15 PM  wrote:
>
> From: Pan Nengyuan 
>
> Fix a minor memory leak in riscv_sifive_u_soc_realize()
>
> Reported-by: Euler Robot 
> Signed-off-by: Pan Nengyuan 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/sifive_u.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 0140e95..0e12b3c 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -542,6 +542,7 @@ static void riscv_sifive_u_soc_realize(DeviceState *dev, 
> Error **errp)
>  SIFIVE_U_PLIC_CONTEXT_BASE,
>  SIFIVE_U_PLIC_CONTEXT_STRIDE,
>  memmap[SIFIVE_U_PLIC].size);
> +g_free(plic_hart_config);
>  sifive_uart_create(system_memory, memmap[SIFIVE_U_UART0].base,
>  serial_hd(0), qdev_get_gpio_in(DEVICE(s->plic), SIFIVE_U_UART0_IRQ));
>  sifive_uart_create(system_memory, memmap[SIFIVE_U_UART1].base,
> --
> 2.7.2.windows.1
>
>
>

Re: [PATCH 1/2] ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

2019-12-10 Thread Cédric Le Goater

On 10/12/2019 18:08, Cédric Le Goater wrote:
> On 10/12/2019 17:49, Greg Kurz wrote:
>> On Tue, 10 Dec 2019 14:58:44 +0100
>> Cédric Le Goater  wrote:
>>
>>> Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
>>
>> I didn't find any trace of PnvPBCQState in the code... what is it ?
> 
> PHB4, which is not merged yet.  

In fact, PnvPBCQState  is for PHB3. But we have a similar need for PHB4.

C.

Re: [PATCH 2/2] analyze-migration.py: replace numpy with python 3.2

2019-12-10 Thread Cleber Rosa

On Tue, Dec 10, 2019 at 12:49:11PM -0500, Cleber Rosa wrote:
> On Tue, Dec 10, 2019 at 02:14:18PM +0400, Marc-André Lureau wrote:
> > Hi
> > 
> > On Tue, Dec 10, 2019 at 6:59 AM Eduardo Habkost  wrote:
> > >
> > > On Fri, Dec 06, 2019 at 09:27:23AM -0500, Cleber Rosa wrote:
> > > > On Wed, Nov 27, 2019 at 02:10:38PM +0400, Marc-André Lureau wrote:
> > > > > Use int.from_bytes() from python 3.2 instead.
> > > > >
> > > > > Signed-off-by: Marc-André Lureau 
> > > > > ---
> > > > >  scripts/analyze-migration.py | 35 +++
> > > > >  1 file changed, 19 insertions(+), 16 deletions(-)
> > > > >
> > > > > diff --git a/scripts/analyze-migration.py 
> > > > > b/scripts/analyze-migration.py
> > > > > index 2b835d9b70..96a31d3974 100755
> > > > > --- a/scripts/analyze-migration.py
> > > > > +++ b/scripts/analyze-migration.py
> > > > > @@ -1,4 +1,4 @@
> > > > > -#!/usr/bin/env python
> > > > > +#!/usr/bin/env python3
> > > [...]
> > > >
> > > > Marc-André, I couldn't yet pinpoint the reason yet, but this patch
> > > > changes the parsing of bool fields.  This is a diff between the output
> > > > pre/post this patch on the same images:
> > > >
> > > > $ diff -u out_x8664_pre out_x8664_post
> > > > --- out_x8664_pre   2019-12-06 09:14:16.128943264 -0500
> > > > +++ out_x8664_post  2019-12-06 09:23:35.861378600 -0500
> > > > @@ -3039,7 +3039,7 @@
> > > >  "mac_reg[RADV]": "0x",
> > > >  "mac_reg[TADV]": "0x",
> > > >  "mac_reg[ITR]": "0x",
> > > > -"mit_irq_level": true
> > > > +"mit_irq_level": false
> > > >  },
> > > >  "e1000/full_mac_state": {
> > > >  "mac_reg": [
> > > > @@ -36010,10 +36010,10 @@
> > > >  ],
> > > >  "smb_auxctl": "0x02",
> > > >  "smb_blkdata": "0x00",
> > > > -"i2c_enable": true,
> > > > +"i2c_enable": false,
> > > >  "op_done": true,
> > > > -"in_i2c_block_read": true,
> > > > -"start_transaction_on_status_read": true
> > > > +"in_i2c_block_read": false,
> > > > +"start_transaction_on_status_read": false
> > > >  },
> > > >  "ar.tmr.timer": "ff ff ff ff ff ff ff ff",
> > > >  "ar.tmr.overflow_time": "0x",
> > > >
> > > > This true/false flipping is consistent across various images (tried on
> > > > images generated on a few other targets).
> > >
> > > It looks like moving to python3 accidentally fixes a bug.
> > >
> > > This is VMSDFieldBool.read:
> > >
> > > def read(self):
> > > super(VMSDFieldBool, self).read()
> > > if self.data[0] == 0:
> > > self.data = False
> > > else:
> > > self.data = True
> > > return self.data
> > >
> > > On python2, MigrationFile.readvar() returned a string, so the
> > > (self.data[0] == 0) check was never true.  This means all boolean
> > > fields were always initialized to True.
> > >
> > > On python3, MigrationFile.readvar() returns a bytearray, so the
> > > (self.data[0] == 0) check now works as expected.
> > 
> > Ah! nice surprise. Do you mind updating the commit message on commit?
> > Or should I resend?
> > 
> > thanks
> > 
> > -- 
> > Marc-André Lureau
> > 
> 
> Yep, I'm queueing this with an updated commit message.
> 
> Eduardo, does your comment imply a "Reviewed-by"?
> 
> Cheers,
> - Cleber.

Eduardo,

I only noticed now that you queued patch 1/2.  Do you want me to queue
that one instead?  Or do you wanto to queue both on this series?

Cheers,
- Cleber.


signature.asc
Description: PGP signature

Re: [PATCH 2/2] analyze-migration.py: replace numpy with python 3.2

2019-12-10 Thread Cleber Rosa

On Tue, Dec 10, 2019 at 02:14:18PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Dec 10, 2019 at 6:59 AM Eduardo Habkost  wrote:
> >
> > On Fri, Dec 06, 2019 at 09:27:23AM -0500, Cleber Rosa wrote:
> > > On Wed, Nov 27, 2019 at 02:10:38PM +0400, Marc-André Lureau wrote:
> > > > Use int.from_bytes() from python 3.2 instead.
> > > >
> > > > Signed-off-by: Marc-André Lureau 
> > > > ---
> > > >  scripts/analyze-migration.py | 35 +++
> > > >  1 file changed, 19 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
> > > > index 2b835d9b70..96a31d3974 100755
> > > > --- a/scripts/analyze-migration.py
> > > > +++ b/scripts/analyze-migration.py
> > > > @@ -1,4 +1,4 @@
> > > > -#!/usr/bin/env python
> > > > +#!/usr/bin/env python3
> > [...]
> > >
> > > Marc-André, I couldn't yet pinpoint the reason yet, but this patch
> > > changes the parsing of bool fields.  This is a diff between the output
> > > pre/post this patch on the same images:
> > >
> > > $ diff -u out_x8664_pre out_x8664_post
> > > --- out_x8664_pre   2019-12-06 09:14:16.128943264 -0500
> > > +++ out_x8664_post  2019-12-06 09:23:35.861378600 -0500
> > > @@ -3039,7 +3039,7 @@
> > >  "mac_reg[RADV]": "0x",
> > >  "mac_reg[TADV]": "0x",
> > >  "mac_reg[ITR]": "0x",
> > > -"mit_irq_level": true
> > > +"mit_irq_level": false
> > >  },
> > >  "e1000/full_mac_state": {
> > >  "mac_reg": [
> > > @@ -36010,10 +36010,10 @@
> > >  ],
> > >  "smb_auxctl": "0x02",
> > >  "smb_blkdata": "0x00",
> > > -"i2c_enable": true,
> > > +"i2c_enable": false,
> > >  "op_done": true,
> > > -"in_i2c_block_read": true,
> > > -"start_transaction_on_status_read": true
> > > +"in_i2c_block_read": false,
> > > +"start_transaction_on_status_read": false
> > >  },
> > >  "ar.tmr.timer": "ff ff ff ff ff ff ff ff",
> > >  "ar.tmr.overflow_time": "0x",
> > >
> > > This true/false flipping is consistent across various images (tried on
> > > images generated on a few other targets).
> >
> > It looks like moving to python3 accidentally fixes a bug.
> >
> > This is VMSDFieldBool.read:
> >
> > def read(self):
> > super(VMSDFieldBool, self).read()
> > if self.data[0] == 0:
> > self.data = False
> > else:
> > self.data = True
> > return self.data
> >
> > On python2, MigrationFile.readvar() returned a string, so the
> > (self.data[0] == 0) check was never true.  This means all boolean
> > fields were always initialized to True.
> >
> > On python3, MigrationFile.readvar() returns a bytearray, so the
> > (self.data[0] == 0) check now works as expected.
> 
> Ah! nice surprise. Do you mind updating the commit message on commit?
> Or should I resend?
> 
> thanks
> 
> -- 
> Marc-André Lureau
> 

Yep, I'm queueing this with an updated commit message.

Eduardo, does your comment imply a "Reviewed-by"?

Cheers,
- Cleber.


signature.asc
Description: PGP signature

Re: [PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays

2019-12-10 Thread Kevin Wolf

Am 22.11.2019 um 17:05 hat Kevin Wolf geschrieben:
> See patch 4 for the description of the bug fixed.

I'm applying patches 3 and 5-7 to the block branch because they make
sense on their own.

The real fix will need another approach because the error handling is
broken in this one: If zeroing out fails (either because of NO_FALLBACK
or because of some other I/O error), bdrv_co_truncate() will return
failure, but the image size has already been increased, with potentially
incorrect data in the new area.

To fix this, we need to make sure that zeros will be read before we
commit the new image size to the image file (e.g. qcow2 header) and to
bs->total_sectors. In other words, it must become the responsibility of
the block driver.

To this effect, I'm planning to introduce a PREALLOC_MODE_ZERO_INIT flag
that can be or'ed to the preallocation mode. This will fail by default
because it looks like just another unimplemented preallocation mode to
block drivers. It will be requested explicitly by commit jobs and
automatically added by bdrv_co_truncate() if the backing file would
become visible (like in this series, but now for all preallocation
modes). I'm planning to implement it for qcow2 and file-posix for now,
which should cover most interesting cases.

Does this make sense to you?

Kevin

Re: [PATCH v2 15/18] xen: convert "-machine igd-passthru" to an accelerator property

2019-12-10 Thread Paolo Bonzini

On 10/12/19 13:56, Marc-André Lureau wrote:
>> +if (g_str_equal(qom_name, "igd-passthru")) {
>> +object_register_sugar_prop(ACCEL_CLASS_NAME("xen"), qom_name, 
>> value);
> 
> shouldn't it warn about deprecation?

The old version is not deprecated (I'm only deprecating -tb-size because
it's a toplevel option, but not the -machine versions; they have been
there forever and probably have too many users).

Paolo

Re: [PATCH v6] migration: Support QLIST migration

2019-12-10 Thread Peter Xu

On Fri, Nov 22, 2019 at 06:16:21PM +0100, Eric Auger wrote:
> Support QLIST migration using the same principle as QTAILQ:
> 94869d5c52 ("migration: migrate QTAILQ").
> 
> The VMSTATE_QLIST_V macro has the same proto as VMSTATE_QTAILQ_V.
> The change mainly resides in QLIST RAW macros: QLIST_RAW_INSERT_HEAD
> and QLIST_RAW_REVERSE.
> 
> Tests also are provided.
> 
> Signed-off-by: Eric Auger 

https://patchew.org/QEMU/20191023150237.17324-1-eric.au...@redhat.com/diff/20191122171621.1307-1-eric.au...@redhat.com/

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani

On Tue, 2019-12-10 at 17:08 +0100, Andrew Jones wrote:
> I don't have a strong enough opinion about kvm-adjvtime vs.
> kvm-no-adjvtime to insist one way or another. I agree double inversions
> are easier to mess up, but I also like the way the '-no-' better
> communicates that the default is [probably] 'yes'.
> 
> All interested parties, please vote. I'll be sending v2 soon and I can
> call this thing anything the majority (or the dominate minority) prefer.

I like kvm-adjvtime better because it avoids the double negative,
but on the other hand if the new default is be to adjust the time
then I don't expect many people will actually need to use the
parameter, so the name doesn't matter that much after all :)

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH 2/2] ppc/pnv: populate the DT with realized XSCOM devices

2019-12-10 Thread Cédric Le Goater

On 10/12/2019 17:53, Greg Kurz wrote:
> On Tue, 10 Dec 2019 14:58:45 +0100
> Cédric Le Goater  wrote:
> 
>> Some devices could be initialized in the instance_init handler but not
>> realized for configuration reasons. Nodes should not be added in the DT
>> for such devices.
>>
> 
> Do you have examples of such devices to share ?

PHB4 again. 

C. 


> 
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  hw/ppc/pnv_xscom.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
>> index 006d87e970d9..6d3745a49e50 100644
>> --- a/hw/ppc/pnv_xscom.c
>> +++ b/hw/ppc/pnv_xscom.c
>> @@ -272,7 +272,10 @@ static int xscom_dt_child(Object *child, void *opaque)
>>  PnvXScomInterface *xd = PNV_XSCOM_INTERFACE(child);
>>  PnvXScomInterfaceClass *xc = PNV_XSCOM_INTERFACE_GET_CLASS(xd);
>>  
>> -if (xc->dt_xscom) {
>> +/*
>> + * Only "realized" devices should be configured in the DT
>> + */
>> +if (xc->dt_xscom && DEVICE(child)->realized) {
>>  _FDT((xc->dt_xscom(xd, args->fdt, args->xscom_offset)));
>>  }
>>  }
>

Re: [PATCH 1/2] ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

2019-12-10 Thread Cédric Le Goater

On 10/12/2019 17:49, Greg Kurz wrote:
> On Tue, 10 Dec 2019 14:58:44 +0100
> Cédric Le Goater  wrote:
> 
>> Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
> 
> I didn't find any trace of PnvPBCQState in the code... what is it ?

PHB4, which is not merged yet.  

C. 

> 
>> the first layer, so we need to loop on the whole object hierarchy to
>> catch them.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  hw/ppc/pnv_xscom.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
>> index bed41840845e..006d87e970d9 100644
>> --- a/hw/ppc/pnv_xscom.c
>> +++ b/hw/ppc/pnv_xscom.c
>> @@ -326,7 +326,12 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
>> root_offset)
>>  args.fdt = fdt;
>>  args.xscom_offset = xscom_offset;
>>  
>> -object_child_foreach(OBJECT(chip), xscom_dt_child, );
>> +/*
>> + * Loop on the whole object hierarchy to catch all
>> + * PnvXScomInterface objects which can lie a bit deeper the first
> 
> s/deeper the first/deeper than the first/
> 
>> + * layer.
>> + */
>> +object_child_foreach_recursive(OBJECT(chip), xscom_dt_child, );
>>  return 0;
>>  }
>>  
>

Re: [PATCH v3] qga: fence guest-set-time if hwclock not available

2019-12-10 Thread Cornelia Huck

On Thu,  5 Dec 2019 12:53:50 +0100
Cornelia Huck  wrote:

> The Posix implementation of guest-set-time invokes hwclock to
> set/retrieve the time to/from the hardware clock. If hwclock
> is not available, the user is currently informed that "hwclock
> failed to set hardware clock to system time", which is quite
> misleading. This may happen e.g. on s390x, which has a different
> timekeeping concept anyway.
> 
> Let's check for the availability of the hwclock command and
> return QERR_UNSUPPORTED for guest-set-time if it is not available.
> 
> Reviewed-by: Laszlo Ersek 
> Reviewed-by: Daniel P. Berrangé 
> Reviewed-by: Michael Roth 
> Signed-off-by: Cornelia Huck 
> ---
> 
> v2->v3:
>   - added 'static' keyword to hwclock_path
> 
> Not sure what tree this is going through; if there's no better place,
> I can also take this through the s390 tree.
> 
> ---
>  qga/commands-posix.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)

Queued to s390-next.

Re: [PATCH 2/2] ppc/pnv: populate the DT with realized XSCOM devices

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 14:58:45 +0100
Cédric Le Goater  wrote:

> Some devices could be initialized in the instance_init handler but not
> realized for configuration reasons. Nodes should not be added in the DT
> for such devices.
> 

Do you have examples of such devices to share ?

> Signed-off-by: Cédric Le Goater 
> ---
>  hw/ppc/pnv_xscom.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index 006d87e970d9..6d3745a49e50 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -272,7 +272,10 @@ static int xscom_dt_child(Object *child, void *opaque)
>  PnvXScomInterface *xd = PNV_XSCOM_INTERFACE(child);
>  PnvXScomInterfaceClass *xc = PNV_XSCOM_INTERFACE_GET_CLASS(xd);
>  
> -if (xc->dt_xscom) {
> +/*
> + * Only "realized" devices should be configured in the DT
> + */
> +if (xc->dt_xscom && DEVICE(child)->realized) {
>  _FDT((xc->dt_xscom(xd, args->fdt, args->xscom_offset)));
>  }
>  }

Re: [RFC PATCH 1/9] vfio/pci: introduce mediate ops to intercept vfio-pci ops

2019-12-10 Thread Alex Williamson

On Mon, 9 Dec 2019 21:44:23 -0500
Yan Zhao  wrote:

> > > > > Currently, yes, i40e has build dependency on vfio-pci.
> > > > > It's like this, if i40e decides to support SRIOV and compiles in vf
> > > > > related code who depends on vfio-pci, it will also have build 
> > > > > dependency
> > > > > on vfio-pci. isn't it natural?
> > > > 
> > > > No, this is not natural.  There are certainly i40e VF use cases that
> > > > have no interest in vfio and having dependencies between the two
> > > > modules is unacceptable.  I think you probably want to modularize the
> > > > i40e vfio support code and then perhaps register a table in vfio-pci
> > > > that the vfio-pci code can perform a module request when using a
> > > > compatible device.  Just and idea, there might be better options.  I
> > > > will not accept a solution that requires unloading the i40e driver in
> > > > order to unload the vfio-pci driver.  It's inconvenient with just one
> > > > NIC driver, imagine how poorly that scales.
> > > > 
> > > what about this way:
> > > mediate driver registers a module notifier and every time when
> > > vfio_pci is loaded, register to vfio_pci its mediate ops?
> > > (Just like in below sample code)
> > > This way vfio-pci is free to unload and this registering only gives
> > > vfio-pci a name of what module to request.
> > > After that,
> > > in vfio_pci_open(), vfio-pci requests the mediate driver. (or puts
> > > the mediate driver when mediate driver does not support mediating the
> > > device)
> > > in vfio_pci_release(), vfio-pci puts the mediate driver.
> > > 
> > > static void register_mediate_ops(void)
> > > {
> > > int (*func)(struct vfio_pci_mediate_ops *ops) = NULL;
> > > 
> > > func = symbol_get(vfio_pci_register_mediate_ops);
> > > 
> > > if (func) {
> > > func(_dt_ops);
> > > symbol_put(vfio_pci_register_mediate_ops);
> > > }
> > > }
> > > 
> > > static int igd_module_notify(struct notifier_block *self,
> > >   unsigned long val, void *data)
> > > {
> > > struct module *mod = data;
> > > int ret = 0;
> > > 
> > > switch (val) {
> > > case MODULE_STATE_LIVE:
> > > if (!strcmp(mod->name, "vfio_pci"))
> > > register_mediate_ops();
> > > break;
> > > case MODULE_STATE_GOING:
> > > break;
> > > default:
> > > break;
> > > }
> > > return ret;
> > > }
> > > 
> > > static struct notifier_block igd_module_nb = {
> > > .notifier_call = igd_module_notify,
> > > .priority = 0,
> > > };
> > > 
> > > 
> > > 
> > > static int __init igd_dt_init(void)
> > > {
> > >   ...
> > >   register_mediate_ops();
> > >   register_module_notifier(_module_nb);
> > >   ...
> > >   return 0;
> > > }  
> > 
> > 
> > No, this is bad.  Please look at MODULE_ALIAS() and request_module() as
> > used in the vfio-platform for loading reset driver modules.  I think
> > the correct approach is that vfio-pci should perform a request_module()
> > based on the device being probed.  Having the mediation provider
> > listening for vfio-pci and registering itself regardless of whether we
> > intend to use it assumes that we will want to use it and assumes that
> > the mediation provider module is already loaded.  We should be able to
> > support demand loading of modules that may serve no other purpose than
> > providing this mediation.  Thanks,  
> hi Alex
> Thanks for this message.
> So is it good to create a separate module as mediation provider driver,
> and alias its module name to "vfio-pci-mediate-vid-did".
> Then when vfio-pci probes the device, it requests module of that name ?

I think this would give us an option to have the mediator as a separate
module, but not require it.  Maybe rather than a request_module(),
where if we follow the platform reset example we'd then expect the init
code for the module to register into a list, we could do a
symbol_request().  AIUI, this would give us a reference to the symbol
if the module providing it is already loaded, and request a module
(perhaps via an alias) if it's not already load.  Thanks,

Alex

Re: [PATCH] hw/arm/virt: Second uart for normal-world

2019-12-10 Thread Daniel Thompson

On Mon, Dec 09, 2019 at 05:10:30PM +, Peter Maydell wrote:
> On Mon, 9 Dec 2019 at 17:08, Daniel Thompson  
> wrote:
> > I don't object to making it command line dependant (it is certainly
> > lower risk) but, out of interest, has using /aliases to force the
> > kernel to enumerate the serial nodes in the existing order been ruled
> > out for any reason.
> 
> No, I don't think anybody's investigated that (I wasn't aware
> that you could do something like that). Bear in mind that the
> kernel is not the only consumer of the DT, though -- you need
> to use a mechanism that all DT consumers will handle correctly.

The syntax for /aliases is standardized (in the DT documentation) but
AFAIK the exact semantic meaning of an alias relies somewhat on idiom.
It is true that the DT binding documentation for some serial drivers
does include details of /aliases but sadly PL011 is not amoung them.

I took a fairly detailed look at FreeBSD. I don't think /aliases is
used to control enumeration order but that appears to be because
aliases are handled in a different way to Linux. For example 
FreeBSD allows a custom console to be selected using FDT syntax
(hw.fdt.console=serial0 or hw.fdt.console=/path/to/fdt-uart ) which
means the Linux-like approach (such as console=ttyAMA0) need not be
used.

In summary I think that support for /aliases can and should be added
since it the best way to help DT systems figure out how to match qemu
uart numbering to its own naming.

However I agree we still need a way to create systems with only a
single UART even if I have not yet been able to come up with a test
case that proves /aliases is insufficient ;-)

Daniel.

Re: [PATCH for-5.0 v11 19/20] pc: Add support for virtio-iommu-pci

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:42PM +0100, Eric Auger wrote:
> The virtio-iommu-pci is instantiated through the -device QEMU
> option. However if instantiated it also requires an IORT ACPI table
> to describe the ID mappings between the root complex and the iommu.
> 
> This patch adds the generation of the IORT table if the
> virtio-iommu-pci device is instantiated.
> 
> We also declare the [0xfee0 - 0xfeef] MSI reserved region
> so that it gets bypassed by the IOMMU.
> 
> Signed-off-by: Eric Auger 

It would be nice to factor the IORT code with arm, but this looks OK.

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 17/20] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:40PM +0100, Eric Auger wrote:
> This patch builds the virtio-iommu node in the ACPI IORT table.
> 
> The RID space of the root complex, which spans 0x0-0x1
> maps to streamid space 0x0-0x1 in the virtio-iommu which in
> turn maps to deviceid space 0x0-0x1 in the ITS group.
> 
> The iommu RID is excluded as described in virtio-iommu
> specification.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker 

Although VIOT changes the layout of the IORT node slightly, the
implementation should stay pretty much the same.

Re: [PATCH for-5.0 v11 13/20] virtio-iommu: Implement probe request

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:36PM +0100, Eric Auger wrote:
> This patch implements the PROBE request. At the moment,
> no reserved regions are returned as none are registered
> per device. Only a NONE property is returned.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 16/20] hw/arm/virt-acpi-build: Introduce fill_iort_idmap helper

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:39PM +0100, Eric Auger wrote:
>  build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  {
> @@ -426,13 +437,12 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  smmu->gerr_gsiv = cpu_to_le32(irq + 2);
>  smmu->sync_gsiv = cpu_to_le32(irq + 3);
>  
> -/* Identity RID mapping covering the whole input RID range */
> -idmap = >id_mapping_array[0];
> -idmap->input_base = 0;
> -idmap->id_count = cpu_to_le32(0x);
> -idmap->output_base = 0;
> -/* output IORT node is the ITS group node (the first node) */
> -idmap->output_reference = cpu_to_le32(iort_node_offset);
> +/*
> + * Identity RID mapping covering the whole input RID range.
> + * The output IORT node is the ITS group node (the first node).
> + */
> +fill_iort_idmap(smmu->id_mapping_array, 0, 0, 0x, 0,

nit: the other calls use uppercase hex digits

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH 1/2] ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 14:58:44 +0100
Cédric Le Goater  wrote:

> Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than

I didn't find any trace of PnvPBCQState in the code... what is it ?

> the first layer, so we need to loop on the whole object hierarchy to
> catch them.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  hw/ppc/pnv_xscom.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index bed41840845e..006d87e970d9 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -326,7 +326,12 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
> root_offset)
>  args.fdt = fdt;
>  args.xscom_offset = xscom_offset;
>  
> -object_child_foreach(OBJECT(chip), xscom_dt_child, );
> +/*
> + * Loop on the whole object hierarchy to catch all
> + * PnvXScomInterface objects which can lie a bit deeper the first

s/deeper the first/deeper than the first/

> + * layer.
> + */
> +object_child_foreach_recursive(OBJECT(chip), xscom_dt_child, );
>  return 0;
>  }
>

Re: [PATCH for-5.0 v11 11/20] hw/arm/virt: Add the virtio-iommu device tree mappings

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:34PM +0100, Eric Auger wrote:
> diff --git a/hw/virtio/virtio-iommu-pci.c b/hw/virtio/virtio-iommu-pci.c
> index 280230b31e..4cfae1f9df 100644
> --- a/hw/virtio/virtio-iommu-pci.c
> +++ b/hw/virtio/virtio-iommu-pci.c
> @@ -31,9 +31,6 @@ struct VirtIOIOMMUPCI {
>  
>  static Property virtio_iommu_pci_properties[] = {
>  DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
> -DEFINE_PROP_ARRAY("reserved-regions", VirtIOIOMMUPCI,
> -  vdev.nb_reserved_regions, vdev.reserved_regions,
> -  qdev_prop_interval, Interval),

Belongs in patch 10?

Apart from that 
Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 18/20] virtio-iommu: Support migration

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:41PM +0100, Eric Auger wrote:
> +static const VMStateDescription vmstate_virtio_iommu_device = {
> +.name = "virtio-iommu-device",
> +.minimum_version_id = 1,
> +.version_id = 1,
> +.post_load = iommu_post_load,
> +.fields = (VMStateField[]) {
> +VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 1,
> +   _domain, viommu_domain),
> +VMSTATE_GTREE_DIRECT_KEY_V(endpoints, VirtIOIOMMU, 1,
> +   _endpoint, viommu_endpoint),

So if I understand correctly these fields are state that is modified by
the guest? We don't need to save/load fields that cannot be modified by
the guest, static information that is created from the QEMU command-line. 

I think the above covers everything we need to migrate in VirtIOIOMMU
then, except for acked_features, which (as I pointed out on another patch)
seems redundant anyway since there is vdev->guest_features.

Reviewed-by: Jean-Philippe Brucker 

> +VMSTATE_END_OF_LIST()
> +},
> +};
> +
> +
>  static const VMStateDescription vmstate_virtio_iommu = {
>  .name = "virtio-iommu",
>  .minimum_version_id = 1,
> -- 
> 2.20.1
> 
>

Re: [PATCH for-5.0 v11 15/20] virtio-iommu-pci: Add array of Interval properties

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:38PM +0100, Eric Auger wrote:
> The machine may need to pass reserved regions to the
> virtio-iommu-pci device (such as the MSI window on x86).
> So let's add an array of Interval properties.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 14/20] virtio-iommu: Handle reserved regions in the translation process

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:37PM +0100, Eric Auger wrote:
> +for (i = 0; i < s->nb_reserved_regions; i++) {
> +if (interval.low >= s->reserved_regions[i].low &&
> +interval.low <= s->reserved_regions[i].high) {
> +switch (s->reserved_regions[i].type) {
> +case VIRTIO_IOMMU_RESV_MEM_T_MSI:
> +entry.perm = flag;
> +goto unlock;
> +case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
> +default:
> +virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
> +  0, sid, addr);

Needs the VIRTIO_IOMMU_FAULT_F_ADDRESS flag.

Thanks,
Jean

Re: [PATCH for-5.0 v11 10/20] virtio-iommu-pci: Add virtio iommu pci support

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:33PM +0100, Eric Auger wrote:
> This patch adds virtio-iommu-pci, which is the pci proxy for
> the virtio-iommu device.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 09/20] virtio-iommu: Implement fault reporting

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:32PM +0100, Eric Auger wrote:
> @@ -443,6 +489,8 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>  if (!ep) {
>  if (!bypass_allowed) {
>  error_report_once("%s sid=%d is not known!!", __func__, sid);
> +virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_UNKNOWN,
> +  0, sid, 0);

I guess we could report the faulting address as well, it can be useful for
diagnostics.

>  } else {
>  entry.perm = flag;
>  }
> @@ -455,6 +503,8 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>"%s %02x:%02x.%01x not attached to any domain\n",
>__func__, PCI_BUS_NUM(sid),
>PCI_SLOT(sid), PCI_FUNC(sid));
> +virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_DOMAIN,
> +  0, sid, 0);

Here as well, especially since that error would get propagated by a linux
guest to the device driver

>  } else {
>  entry.perm = flag;
>  }
> @@ -468,16 +518,25 @@ static IOMMUTLBEntry 
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>  qemu_log_mask(LOG_GUEST_ERROR,
>"%s no mapping for 0x%"PRIx64" for sid=%d\n",
>__func__, addr, sid);
> +virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
> +  0, sid, addr);

Flag VIRTIO_IOMMU_FAULT_F_ADDRESS denotes a valid address field

Thanks,
Jean

Re: [PATCH for-5.0 v11 08/20] virtio-iommu: Implement translate

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:31PM +0100, Eric Auger wrote:
> This patch implements the translate callback
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker

Re: [PATCH for-5.0 v11 07/20] virtio-iommu: Implement map/unmap

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:30PM +0100, Eric Auger wrote:
> @@ -238,10 +244,35 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>  uint64_t virt_start = le64_to_cpu(req->virt_start);
>  uint64_t virt_end = le64_to_cpu(req->virt_end);
>  uint32_t flags = le32_to_cpu(req->flags);
> +viommu_domain *domain;
> +viommu_interval *interval;
> +viommu_mapping *mapping;

Additional checks would be good. Most importantly we need to return
S_INVAL if we don't recognize a bit in flags (a MUST in the spec). It
might be good to check that addresses are aligned on the page granule as
well, and return S_RANGE if they aren't (a SHOULD in the spec), but I
don't care as much.

> +
> +interval = g_malloc0(sizeof(*interval));
> +
> +interval->low = virt_start;
> +interval->high = virt_end;
> +
> +domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
> +if (!domain) {
> +return VIRTIO_IOMMU_S_NOENT;

Leaks interval, I guess you could allocate it after this block.

Thanks,
Jean

> +}
> +
> +mapping = g_tree_lookup(domain->mappings, (gpointer)interval);
> +if (mapping) {
> +g_free(interval);
> +return VIRTIO_IOMMU_S_INVAL;
> +}
>  
>  trace_virtio_iommu_map(domain_id, virt_start, virt_end, phys_start, 
> flags);
>  
> -return VIRTIO_IOMMU_S_UNSUPP;
> +mapping = g_malloc0(sizeof(*mapping));
> +mapping->phys_addr = phys_start;
> +mapping->flags = flags;
> +
> +g_tree_insert(domain->mappings, interval, mapping);
> +
> +return VIRTIO_IOMMU_S_OK;

Re: [PATCH for-5.0 v11 06/20] virtio-iommu: Implement attach/detach command

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:29PM +0100, Eric Auger wrote:
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 235bde2203..138d5b2a9c 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -77,11 +77,12 @@ static gint interval_cmp(gconstpointer a, gconstpointer 
> b, gpointer user_data)
>  static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
>  {
>  QLIST_REMOVE(ep, next);
> +g_tree_unref(ep->domain->mappings);
>  ep->domain = NULL;
>  }
>  
> -viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id);
> -viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id)
> +static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
> +  uint32_t ep_id)
>  {
>  viommu_endpoint *ep;
>  
> @@ -102,15 +103,14 @@ static void virtio_iommu_put_endpoint(gpointer data)
>  
>  if (ep->domain) {
>  virtio_iommu_detach_endpoint_from_domain(ep);
> -g_tree_unref(ep->domain->mappings);
>  }
>  
>  trace_virtio_iommu_put_endpoint(ep->id);
>  g_free(ep);
>  }
>  
> -viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id);
> -viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id)
> +static viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s,
> +  uint32_t domain_id)

Looks like the above change belong to patch 5?

>  {
>  viommu_domain *domain;
>  
> @@ -137,7 +137,6 @@ static void virtio_iommu_put_domain(gpointer data)
>  QLIST_FOREACH_SAFE(iter, >endpoint_list, next, tmp) {
>  virtio_iommu_detach_endpoint_from_domain(iter);
>  }
> -g_tree_destroy(domain->mappings);

When created by virtio_iommu_get_domain(), mappings has one reference.
Then for each attach (including the first one) an additional reference is
taken, and freed by virtio_iommu_detach_endpoint_from_domain(). So I think
there are two problems:

* virtio_iommu_put_domain() drops one ref for each endpoint, but we still
  have one reference to mappings, so they're not freed. We do need this
  g_tree_destroy()

* After detaching all the endpoints, the guest may reuse the domain ID for
  another domain, but the previous mappings haven't been erased. Not sure
  how to fix this using the g_tree refs, because dropping all the
  references will free the internal tree data and it won't be reusable.

Thanks,
Jean

Re: [PATCH v1 2/5] timer: arm: Introduce functions to get the host cntfrq

2019-12-10 Thread Andrew Jones

On Tue, Dec 10, 2019 at 03:47:39PM +, Peter Maydell wrote:
> On Wed, 16 Oct 2019 at 15:34, Andrew Jones  wrote:
> >
> > When acceleration like KVM is in use it's necessary to use the host's
> > counter frequency when converting ticks to or from time units.
> >
> > Signed-off-by: Andrew Jones 
> > Reviewed-by: Richard Henderson 
> > ---
> >  include/qemu/timer.h | 16 
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> > index 85bc6eb00b21..8941ddea8242 100644
> > --- a/include/qemu/timer.h
> > +++ b/include/qemu/timer.h
> > @@ -1006,6 +1006,22 @@ static inline int64_t cpu_get_host_ticks(void)
> >  }
> >  #endif
> >
> > +#if defined(__aarch64__)
> > +static inline uint32_t cpu_get_host_tick_frequency(void)
> > +{
> > +uint64_t frq;
> > +asm volatile("mrs %0, cntfrq_el0" : "=r" (frq));
> > +return frq;
> > +}
> > +#elif defined(__arm__)
> > +static inline uint32_t cpu_get_host_tick_frequency(void)
> > +{
> > +uint32_t frq;
> > +asm volatile("mrc p15, 0, %0, c14, c0, 0" : "=r" (frq));
> > +return frq;
> > +}
> > +#endif
> 
> Don't we want to know what the guest counter frequency
> is, not the host counter frequency? That is, I would have
> expected that we get this value via doing a KVM ONE_REG ioctl
> to ask the kernel what the guest cntfrq is. Are we using
> the host value on the assumption that the guest might have
> misprogrammed their copy of the register?
>

Indeed it would be best to get whatever KVM is using for the given VCPU's
CNTFRQ through GET_ONE_REG, but KVM doesn't seem to allow it. I hadn't
tested before, but to be sure, I did now with the following kselftests
test

 #include "kvm_util.h"
 #include "processor.h"
 
 static void guest_code(void) { }
 
 int main(void)
 {
   struct kvm_vm *vm = vm_create_default(0, 0, guest_code);
   uint64_t reg;
 
   get_reg(vm, 0, ARM64_SYS_REG(3, 3, 14, 0, 0), );
   printf("%lx\n", reg);
   return 0;
 }

The vcpu ioctl fails with ENOENT. Currently KVM requires all guests to
have the same frequency as the host, but I guess that will change
eventually. It's definitely best to do the save/restore thing, as you
suggest.

Thanks,
drew

Re: [RFC PATCH 4/9] vfio-pci: register default dynamic-trap-bar-info region

2019-12-10 Thread Alex Williamson

On Tue, 10 Dec 2019 02:44:44 -0500
Yan Zhao  wrote:

> On Tue, Dec 10, 2019 at 05:16:08AM +0800, Alex Williamson wrote:
> > On Mon, 9 Dec 2019 01:22:12 -0500
> > Yan Zhao  wrote:
> >   
> > > On Fri, Dec 06, 2019 at 11:20:38PM +0800, Alex Williamson wrote:  
> > > > On Fri, 6 Dec 2019 01:04:07 -0500
> > > > Yan Zhao  wrote:
> > > > 
> > > > > On Fri, Dec 06, 2019 at 07:55:30AM +0800, Alex Williamson wrote:
> > > > > > On Wed,  4 Dec 2019 22:26:50 -0500
> > > > > > Yan Zhao  wrote:
> > > > > >   
> > > > > > > Dynamic trap bar info region is a channel for QEMU and vendor 
> > > > > > > driver to
> > > > > > > communicate dynamic trap info. It is of type
> > > > > > > VFIO_REGION_TYPE_DYNAMIC_TRAP_BAR_INFO and subtype
> > > > > > > VFIO_REGION_SUBTYPE_DYNAMIC_TRAP_BAR_INFO.
> > > > > > > 
> > > > > > > This region has two fields: dt_fd and trap.
> > > > > > > When QEMU detects a device regions of this type, it will create an
> > > > > > > eventfd and write its eventfd id to dt_fd field.
> > > > > > > When vendor drivre signals this eventfd, QEMU reads trap field of 
> > > > > > > this
> > > > > > > info region.
> > > > > > > - If trap is true, QEMU would search the device's PCI BAR
> > > > > > > regions and disable all the sparse mmaped subregions (if the 
> > > > > > > sparse
> > > > > > > mmaped subregion is disablable).
> > > > > > > - If trap is false, QEMU would re-enable those subregions.
> > > > > > > 
> > > > > > > A typical usage is
> > > > > > > 1. vendor driver first cuts its bar 0 into several sections, all 
> > > > > > > in a
> > > > > > > sparse mmap array. So initally, all its bar 0 are passthroughed.
> > > > > > > 2. vendor driver specifys part of bar 0 sections to be disablable.
> > > > > > > 3. on migration starts, vendor driver signals dt_fd and set trap 
> > > > > > > to true
> > > > > > > to notify QEMU disabling the bar 0 sections of disablable flags 
> > > > > > > on.
> > > > > > > 4. QEMU disables those bar 0 section and hence let vendor driver 
> > > > > > > be able
> > > > > > > to trap access of bar 0 registers and make dirty page tracking 
> > > > > > > possible.
> > > > > > > 5. on migration failure, vendor driver signals dt_fd to QEMU 
> > > > > > > again.
> > > > > > > QEMU reads trap field of this info region which is false and QEMU
> > > > > > > re-passthrough the whole bar 0 region.
> > > > > > > 
> > > > > > > Vendor driver specifies whether it supports dynamic-trap-bar-info 
> > > > > > > region
> > > > > > > through cap VFIO_PCI_DEVICE_CAP_DYNAMIC_TRAP_BAR in
> > > > > > > vfio_pci_mediate_ops->open().
> > > > > > > 
> > > > > > > If vfio-pci detects this cap, it will create a default
> > > > > > > dynamic_trap_bar_info region on behalf of vendor driver with 
> > > > > > > region len=0
> > > > > > > and region->ops=null.
> > > > > > > Vvendor driver should override this region's len, flags, rw, mmap 
> > > > > > > in its
> > > > > > > vfio_pci_mediate_ops.  
> > > > > > 
> > > > > > TBH, I don't like this interface at all.  Userspace doesn't pass 
> > > > > > data
> > > > > > to the kernel via INFO ioctls.  We have a SET_IRQS ioctl for
> > > > > > configuring user signaling with eventfds.  I think we only need to
> > > > > > define an IRQ type that tells the user to re-evaluate the sparse 
> > > > > > mmap
> > > > > > information for a region.  The user would enumerate the device IRQs 
> > > > > > via
> > > > > > GET_IRQ_INFO, find one of this type where the IRQ info would also
> > > > > > indicate which region(s) should be re-evaluated on signaling.  The 
> > > > > > user
> > > > > > would enable that signaling via SET_IRQS and simply re-evaluate the 
> > > > > >  
> > > > > ok. I'll try to switch to this way. Thanks for this suggestion.
> > > > > 
> > > > > > sparse mmap capability for the associated regions when signaled.
> > > > > >   
> > > > > 
> > > > > Do you like the "disablable" flag of sparse mmap ?
> > > > > I think it's a lightweight way for user to switch mmap state of a 
> > > > > whole region,
> > > > > otherwise going through a complete flow of GET_REGION_INFO and 
> > > > > re-setup
> > > > > region might be too heavy.
> > > > 
> > > > No, I don't like the disable-able flag.  At what frequency do we expect
> > > > regions to change?  It seems like we'd only change when switching into
> > > > and out of the _SAVING state, which is rare.  It seems easy for
> > > > userspace, at least QEMU, to drop the entire mmap configuration and
> > > ok. I'll try this way.
> > >   
> > > > re-read it.  Another concern here is how do we synchronize the event?
> > > > Are we assuming that this event would occur when a user switch to
> > > > _SAVING mode on the device?  That operation is synchronous, the device
> > > > must be in saving mode after the write to device state completes, but
> > > > it seems like this might be trying to add an asynchronous dependency.
> > > > Will the write to device_state only complete once the user

Re: [PATCH v3] qga: fence guest-set-time if hwclock not available

2019-12-10 Thread Philippe Mathieu-Daudé


On 12/9/19 7:33 PM, Cornelia Huck wrote:

On Fri, 06 Dec 2019 08:17:27 +0100
Markus Armbruster  wrote:


Cornelia Huck  writes:


On Thu, 5 Dec 2019 14:05:19 +0100
Philippe Mathieu-Daudé  wrote:
  

Hi Cornelia,

On 12/5/19 12:53 PM, Cornelia Huck wrote:

The Posix implementation of guest-set-time invokes hwclock to
set/retrieve the time to/from the hardware clock. If hwclock
is not available, the user is currently informed that "hwclock
failed to set hardware clock to system time", which is quite
misleading. This may happen e.g. on s390x, which has a different
timekeeping concept anyway.

Let's check for the availability of the hwclock command and
return QERR_UNSUPPORTED for guest-set-time if it is not available.

Reviewed-by: Laszlo Ersek 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Michael Roth 
Signed-off-by: Cornelia Huck 
---

v2->v3:
- added 'static' keyword to hwclock_path

Not sure what tree this is going through; if there's no better place,
I can also take this through the s390 tree.


s390 or trivial trees seems appropriate.
   


---
   qga/commands-posix.c | 13 -
   1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 1c1a165daed8..0be301a4ea77 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -156,6 +156,17 @@ void qmp_guest_set_time(bool has_time, int64_t time_ns, 
Error **errp)
   pid_t pid;
   Error *local_err = NULL;
   struct timeval tv;
+static const char hwclock_path[] = "/sbin/hwclock";
+static int hwclock_available = -1;
+
+if (hwclock_available < 0) {
+hwclock_available = (access(hwclock_path, X_OK) == 0);
+}
+
+if (!hwclock_available) {
+error_setg(errp, QERR_UNSUPPORTED);


In include/qapi/qmp/qerror.h we have:

/*
   * These macros will go away, please don't use in new code, and do not
   * add new ones!
   */


Sigh, it is really hard to keep track here :( I just copied from other
callers in this file...


I'm not faulting you for that.

I think this new use is acceptable.  For details, see my other reply in
this thread.


Ok, thanks for your explanation there.

I guess I'll queue this on s390-next... Philippe, any objections to
adding your R-b to the unmodified patch?


Certainly, sorry for the delay/noise on this trivial patch, I learned 
the subtle differences between comments in code and reality :)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH for-5.0 v11 05/20] virtio-iommu: Endpoint and domains structs and helpers

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:28PM +0100, Eric Auger wrote:
> +typedef struct viommu_domain {
> +uint32_t id;
> +GTree *mappings;
> +QLIST_HEAD(, viommu_endpoint) endpoint_list;
> +} viommu_domain;
> +
> +typedef struct viommu_endpoint {
> +uint32_t id;
> +viommu_domain *domain;
> +QLIST_ENTRY(viommu_endpoint) next;
> +} viommu_endpoint;

There might be a way to merge viommu_endpoint and the IOMMUDevice
structure introduced in patch 4, since they both represent one endpoint.
Maybe virtio_iommu_find_add_pci_as() could add the IOMMUDevice to
s->endpoints, and IOMMUDevice could store the endpoint ID rather than bus
and devfn.

> +typedef struct viommu_interval {
> +uint64_t low;
> +uint64_t high;
> +} viommu_interval;

I guess these should be named in CamelCase? Although if we're allowed to
choose my vote goes to underscores :)

Thanks,
Jean

Re: [PATCH for-5.0 v11 04/20] virtio-iommu: Add the iommu regions

2019-12-10 Thread Jean-Philippe Brucker

Two small things below, but looks good overall

Reviewed-by: Jean-Philippe Brucker 

On Fri, Nov 22, 2019 at 07:29:27PM +0100, Eric Auger wrote:
> +static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
> +  int devfn)
> +{
> +VirtIOIOMMU *s = opaque;
> +IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
> +static uint32_t mr_index;
> +IOMMUDevice *sdev;
> +
> +if (!sbus) {
> +sbus = g_malloc0(sizeof(IOMMUPciBus) +
> + sizeof(IOMMUDevice *) * IOMMU_PCI_DEVFN_MAX);
> +sbus->bus = bus;
> +g_hash_table_insert(s->as_by_busptr, bus, sbus);
> +}
> +
> +sdev = sbus->pbdev[devfn];
> +if (!sdev) {
> +char *name = g_strdup_printf("%s-%d-%d",
> + TYPE_VIRTIO_IOMMU_MEMORY_REGION,
> + mr_index++, devfn);
> +sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(IOMMUDevice));
> +
> +sdev->viommu = s;
> +sdev->bus = bus;
> +sdev->devfn = devfn;

It might be better to store the endpoint ID in IOMMUDevice, then you could
get rid of virtio_iommu_get_sid(), and remove a tiny bit of overhead in
virtio_iommu_translate(). But I doubt it's significant.

[...]
> +static const TypeInfo virtio_iommu_memory_region_info = {
> +.parent = TYPE_IOMMU_MEMORY_REGION,
> +.name = TYPE_VIRTIO_IOMMU_MEMORY_REGION,
> +.class_init = virtio_iommu_memory_region_class_init,
> +};
> +
> +

nit: newline.

Thanks,
Jean

Re: [for-5.0 PATCH v2 1/3] cpu: Introduce CPUReset callback typedef

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 10:42:51 +
Peter Maydell  wrote:

> On Tue, 10 Dec 2019 at 10:39, Markus Armbruster  wrote:
> >
> > Greg Kurz  writes:
> > > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > > index 77c6f0529903..047e3972ecaf 100644
> > > --- a/include/hw/core/cpu.h
> > > +++ b/include/hw/core/cpu.h
> > > @@ -74,6 +74,8 @@ typedef struct CPUWatchpoint CPUWatchpoint;
> > >
> > >  struct TranslationBlock;
> > >
> > > +typedef void (*CPUReset)(CPUState *cpu);
> > > +
> > >  /**
> > >   * CPUClass:
> > >   * @class_by_name: Callback to map -cpu command line model name to an
> > > @@ -165,7 +167,7 @@ typedef struct CPUClass {
> > >  ObjectClass *(*class_by_name)(const char *cpu_model);
> > >  void (*parse_features)(const char *typename, char *str, Error 
> > > **errp);
> > >
> > > -void (*reset)(CPUState *cpu);
> > > +CPUReset reset;
> > >  int reset_dump_flags;
> > >  bool (*has_work)(CPUState *cpu);
> > >  void (*do_interrupt)(CPUState *cpu);
> > [...]
> >
> > Opinion, not objection: such typedefs make the code less obvious.
> 
> It's particularly odd here where this class has several
> methods but we've only chosen one to privilege with a typedef.
> 

Yes, children classes don't do the overloading-and-call-the-parent for
other methods which was the initial motivation for the typedef.

> Personal preference: if you use a typedef, typedef the
> function type, not the pointer-to-the-function-type.
> But I would just leave it be.
> 

Thinking again, I'm not sure the typedef really helps here. Markus
doesn't like it either. I'll try without.

> thanks
> -- PMM

Re: [PATCH v38 15/22] target/avr: Add example board configuration

2019-12-10 Thread Aleksandar Markovic

On Mon, Dec 9, 2019 at 7:38 PM Michael Rolnik  wrote:
>
> I will check again.
>

On my test bed:

...

  CC  riscv32-softmmu/hw/virtio/virtio-serial-pci.o
  CC  riscv32-softmmu/hw/riscv/boot.o
/home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c: In function ‘riscv_load_kernel’:
/home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:123:36: error: passing
argument 10 of ‘load_elf_ram_sym’ makes pointer from integer without a
cast [-Werror=int-conversion]
  EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
^
In file included from /home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:26:0:
/home/rtrk/Build/qemu-rolnik/include/hw/loader.h:130:5: note: expected
‘uint32_t * {aka unsigned int *}’ but argument is of type ‘int’
 int load_elf_ram_sym(const char *filename,
 ^~~~
/home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:123:42: error: passing
argument 12 of ‘load_elf_ram_sym’ makes integer from pointer without a
cast [-Werror=int-conversion]
  EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
  ^~~~
In file included from /home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:26:0:
/home/rtrk/Build/qemu-rolnik/include/hw/loader.h:130:5: note: expected
‘int’ but argument is of type ‘void *’
 int load_elf_ram_sym(const char *filename,
 ^~~~
/home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:123:48: error: passing
argument 13 of ‘load_elf_ram_sym’ makes pointer from integer without a
cast [-Werror=int-conversion]
  EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
^~~~
In file included from /home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:26:0:
/home/rtrk/Build/qemu-rolnik/include/hw/loader.h:130:5: note: expected
‘AddressSpace * {aka struct AddressSpace *}’ but argument is of type
‘int’
 int load_elf_ram_sym(const char *filename,
 ^~~~
/home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:121:9: error: too few
arguments to function ‘load_elf_ram_sym’
 if (load_elf_ram_sym(kernel_filename, NULL, NULL, NULL,
 ^~~~
In file included from /home/rtrk/Build/qemu-rolnik/hw/riscv/boot.c:26:0:
/home/rtrk/Build/qemu-rolnik/include/hw/loader.h:130:5: note: declared here
 int load_elf_ram_sym(const char *filename,
 ^~~~
cc1: all warnings being treated as errors
/home/rtrk/Build/qemu-rolnik/rules.mak:69: recipe for target
'hw/riscv/boot.o' failed
make[1]: *** [hw/riscv/boot.o] Error 1
Makefile:491: recipe for target 'riscv32-softmmu/all' failed
make: *** [riscv32-softmmu/all] Error 2



> On Mon, Dec 9, 2019 at 8:30 PM Michael Rolnik  wrote:
>>
>> Yes, I did compile other platforms.
>>
>> On Mon, Dec 9, 2019 at 8:24 PM Aleksandar Markovic 
>>  wrote:
>>>
>>>
>>>
>>> On Sunday, December 8, 2019, Michael Rolnik  wrote:

 A simple board setup that configures an AVR CPU to run a given firmware 
 image.
 This is all that's useful to implement without peripheral emulation as AVR 
 CPUs include a lot of on-board peripherals.

 NOTE: this is not a real board 
 NOTE: it's used for CPU testing

 Signed-off-by: Michael Rolnik 
 Reviewed-by: Aleksandar Markovic 
 Nacked-by: Philippe Mathieu-Daudé 
 ---
  include/elf.h|   2 +
  include/hw/elf_ops.h |   6 +-
  include/hw/loader.h  |   3 +-
  hw/avr/sample.c  | 293 +++
  hw/core/loader.c |  13 +-
  hw/Kconfig   |   1 +
  hw/avr/Kconfig   |   6 +
  hw/avr/Makefile.objs |   1 +
  8 files changed, 317 insertions(+), 8 deletions(-)
  create mode 100644 hw/avr/sample.c
  create mode 100644 hw/avr/Kconfig
  create mode 100644 hw/avr/Makefile.objs

 diff --git a/include/elf.h b/include/elf.h
 index 3501e0c8d0..53cdfa23b7 100644
 --- a/include/elf.h
 +++ b/include/elf.h
 @@ -202,6 +202,8 @@ typedef struct mips_elf_abiflags_v0 {
  #define EM_MOXIE   223 /* Moxie processor family */
  #define EM_MOXIE_OLD   0xFEED

 +#define EM_AVR 83 /* AVR 8-bit microcontroller */
 +
  /* This is the info that is needed to parse the dynamic section of the 
 file */
  #define DT_NULL0
  #define DT_NEEDED  1
 diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
 index e07d276df7..9f28c16490 100644
 --- a/include/hw/elf_ops.h
 +++ b/include/hw/elf_ops.h
 @@ -316,7 +316,8 @@ static int glue(load_elf, SZ)(const char *name, int fd,
void *translate_opaque,
int must_swab, uint64_t *pentry,
uint64_t *lowaddr, uint64_t *highaddr,
 -  int elf_machine, int clear_lsb, int 
 data_swab,
 +  int elf_machine, uint32_t *pflags,
 +

Re: [PATCH for-5.0 v11 03/20] virtio-iommu: Decode the command payload

2019-12-10 Thread Jean-Philippe Brucker

On Fri, Nov 22, 2019 at 07:29:26PM +0100, Eric Auger wrote:
> This patch adds the command payload decoding and
> introduces the functions that will do the actual
> command handling. Those functions are not yet implemented.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Jean-Philippe Brucker 

Which isn't worth much since I don't have prior QEMU experience but I did
stare at this code for a while and work on future extensions.

Thanks,
Jean

Re: [PATCH for-5.0 v11 02/20] virtio-iommu: Add skeleton

2019-12-10 Thread Jean-Philippe Brucker

Hi Eric,

On Fri, Nov 22, 2019 at 07:29:25PM +0100, Eric Auger wrote:
> +typedef struct VirtIOIOMMU {
> +VirtIODevice parent_obj;
> +VirtQueue *req_vq;
> +VirtQueue *event_vq;
> +struct virtio_iommu_config config;
> +uint64_t features;
> +uint64_t acked_features;

We already have guest_features in the parent object.

> +GHashTable *as_by_busptr;
> +IOMMUPciBus *as_by_bus_num[IOMMU_PCI_BUS_MAX];

Doesn't seem used anymore.

Thanks,
Jean

> +PCIBus *primary_bus;
> +GTree *domains;
> +QemuMutex mutex;
> +GTree *endpoints;
> +} VirtIOIOMMU;
> +
> +#endif
> -- 
> 2.20.1
> 
>

Re: [for-5.0 PATCH v2 1/3] cpu: Introduce CPUReset callback typedef

2019-12-10 Thread Greg Kurz

On Tue, 10 Dec 2019 11:39:00 +0100
Markus Armbruster  wrote:

> Greg Kurz  writes:
> 
> > Use it in include/hw/core/cpu.h and convert all targets to use it as
> > well with:
> >
> > perl -pi \
> >  -e 's/void\s+\(\*(parent_reset)\)\(CPUState\s+\*\w+\)/CPUReset \1/;' \
> >  $(git ls-files 'target/*.h')
> >
> > Signed-off-by: Greg Kurz 
> > Acked-by: David Gibson 
> > Reviewed-by: Alistair Francis 
> > ---
> >  include/hw/core/cpu.h   |4 +++-
> >  target/alpha/cpu-qom.h  |2 +-
> >  target/arm/cpu-qom.h|2 +-
> >  target/cris/cpu-qom.h   |2 +-
> >  target/hppa/cpu-qom.h   |2 +-
> >  target/i386/cpu-qom.h   |2 +-
> >  target/lm32/cpu-qom.h   |2 +-
> >  target/m68k/cpu-qom.h   |2 +-
> >  target/microblaze/cpu-qom.h |2 +-
> >  target/mips/cpu-qom.h   |2 +-
> >  target/moxie/cpu.h  |2 +-
> >  target/nios2/cpu.h  |2 +-
> >  target/openrisc/cpu.h   |2 +-
> >  target/ppc/cpu-qom.h|2 +-
> >  target/riscv/cpu.h  |2 +-
> >  target/s390x/cpu-qom.h  |2 +-
> >  target/sh4/cpu-qom.h|2 +-
> >  target/sparc/cpu-qom.h  |2 +-
> >  target/tilegx/cpu.h |2 +-
> >  target/tricore/cpu-qom.h|2 +-
> >  target/xtensa/cpu-qom.h |2 +-
> >  21 files changed, 23 insertions(+), 21 deletions(-)
> >
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index 77c6f0529903..047e3972ecaf 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -74,6 +74,8 @@ typedef struct CPUWatchpoint CPUWatchpoint;
> >  
> >  struct TranslationBlock;
> >  
> > +typedef void (*CPUReset)(CPUState *cpu);
> > +
> >  /**
> >   * CPUClass:
> >   * @class_by_name: Callback to map -cpu command line model name to an
> > @@ -165,7 +167,7 @@ typedef struct CPUClass {
> >  ObjectClass *(*class_by_name)(const char *cpu_model);
> >  void (*parse_features)(const char *typename, char *str, Error **errp);
> >  
> > -void (*reset)(CPUState *cpu);
> > +CPUReset reset;
> >  int reset_dump_flags;
> >  bool (*has_work)(CPUState *cpu);
> >  void (*do_interrupt)(CPUState *cpu);
> [...]
> 
> Opinion, not objection: such typedefs make the code less obvious.
> 

I merely followed what we have in qdev, but I tend to agree. And, as
mentioned by Peter in another mail, it looks odd to only convert the
reset method.

Re: [PATCH v38 00/22] QEMU AVR 8 bit cores

2019-12-10 Thread Aleksandar Markovic

On Sun, Dec 8, 2019 at 7:39 PM Michael Rolnik  wrote:
>
> This series of patches adds 8bit AVR cores to QEMU.

I get several strange messages about "new blank lines at EOF" while
applying this series via patchwork: (check the last line of involved
files)

rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278219
Applying patch #11278219 using u'git am'
Description: [v38,01/22] target/avr: Add outward facing interfaces and
core CPU logic
Applying: target/avr: Add outward facing interfaces and core CPU logic
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278213
Applying patch #11278213 using u'git am'
Description: [v38,02/22] target/avr: Add instruction helpers
Applying: target/avr: Add instruction helpers
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278211
Applying patch #11278211 using u'git am'
Description: [v38,03/22] target/avr: Add instruction decoding
Applying: target/avr: Add instruction decoding
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278227
Applying patch #11278227 using u'git am'
Description: [v38,04/22] target/avr: Add instruction translation -
Registers definition
Applying: target/avr: Add instruction translation - Registers definition
.git/rebase-apply/patch:154: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278215
Applying patch #11278215 using u'git am'
Description: [v38,05/22] target/avr: Add instruction translation -
Arithmetic and Logic Instructions
Applying: target/avr: Add instruction translation - Arithmetic and
Logic Instructions
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278217
Applying patch #11278217 using u'git am'
Description: [v38,06/22] target/avr: Add instruction translation -
Branch Instructions
Applying: target/avr: Add instruction translation - Branch Instructions
.git/rebase-apply/patch:556: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278241
Applying patch #11278241 using u'git am'
Description: [v38,07/22] target/avr: Add instruction translation -
Data Transfer Instructions
Applying: target/avr: Add instruction translation - Data Transfer Instructions
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278235
Applying patch #11278235 using u'git am'
Description: [v38,08/22] target/avr: Add instruction translation - Bit
and Bit-test Instructions
Applying: target/avr: Add instruction translation - Bit and Bit-test
Instructions
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278225
Applying patch #11278225 using u'git am'
Description: [v38,09/22] target/avr: Add instruction translation - MCU
Control Instructions
Applying: target/avr: Add instruction translation - MCU Control Instructions
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278223
Applying patch #11278223 using u'git am'
Description: [v38,10/22] target/avr: Add instruction translation - CPU
main translation function
Applying: target/avr: Add instruction translation - CPU main
translation function
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278221
Applying patch #11278221 using u'git am'
Description: [v38,11/22] target/avr: Add instruction disassembly function
Applying: target/avr: Add instruction disassembly function
.git/rebase-apply/patch:265: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278229
Applying patch #11278229 using u'git am'
Description: [v38,12/22] target/avr: Add limited support for USART peripheral
Applying: target/avr: Add limited support for USART peripheral
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278255
Applying patch #11278255 using u'git am'
Description: [v38,13/22] target/avr: Add limited support for 16 bit
timer peripheral
Applying: target/avr: Add limited support for 16 bit timer peripheral
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278231
Applying patch #11278231 using u'git am'
Description: [v38,14/22] target/avr: Add dummy mask device
Applying: target/avr: Add dummy mask device
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278237
Applying patch #11278237 using u'git am'
Description: [v38,15/22] target/avr: Add example board configuration
Applying: target/avr: Add example board configuration
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278251
Applying patch #11278251 using u'git am'
Description: [v38,16/22] target/avr: Add section about AVR into QEMU
documentation
Applying: target/avr: Add section about AVR into QEMU documentation
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278243
Applying patch #11278243 using u'git am'
Description: [v38,17/22] target/avr: Register AVR support with the rest of QEMU
Applying: target/avr: Register AVR support with the rest of QEMU
rtrk@rtrkw774-lin:~/Build/qemu-rolnik$ ~/pwclient git-am 11278233
Applying patch #11278233 using u'git am'

Re: [PATCH v1 3/5] target/arm/kvm: Implement cpu feature kvm-adjvtime

2019-12-10 Thread Andrew Jones

On Tue, Dec 10, 2019 at 03:54:11PM +, Peter Maydell wrote:
> On Wed, 16 Oct 2019 at 15:34, Andrew Jones  wrote:
> >
> > When kvm-adjvtime is enabled the guest's cntvct[_el0] won't count
> > the time when the VM is stopped. That time is skipped by updating
> > cntvoff[_el2] on each transition to vm_running using the current
> > QEMU_CLOCK_VIRTUAL time. QEMU_CLOCK_VIRTUAL only ticks when the VM
> > is running.
> >
> > This patch only provides the implementation. A subsequent patch
> > will provide the CPU property allowing the feature to be enabled.
> 
> 
> > +void kvm_arm_set_virtual_time(CPUState *cs)
> > +{
> > +uint64_t cnt;
> > +struct kvm_one_reg reg = {
> > +.id = KVM_REG_ARM_TIMER_CNT,
> > +.addr = (uintptr_t),
> > +};
> > +int ret;
> > +
> > +cnt = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> > +   cpu_get_host_tick_frequency(),
> > +   NANOSECONDS_PER_SECOND);
> > +
> > +ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, );
> > +if (ret) {
> > +error_report("Failed to set KVM_REG_ARM_TIMER_CNT");
> > +abort();
> > +}
> 
> The commit message (and the doc comment for this function)
> say that we're updating the counter offset, but the
> kvm_one_reg operation here is updating the timer count
> (and relying on the kernel's handling of "if we update
> the timer count implement that by changing the offset").
> That seems a bit confusing.
> 
> Would it be possible to implement "cntvct should not change while the
> VM is stopped" with "read cntvct when the VM stops, and just write
> back that value when the VM is restarted", rather than
> "write back a new value calculated from QEMU_CLOCK_VIRTUAL"?
> If I understand commit 00f4d64ee76e873be8 correctly, that's
> basically how x86 is doing it. It would also let you sidestep
> the need to know the tick frequency of the counter.

That's definitely worth some experimenting. Will do.

Thanks,
drew

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrew Jones

On Tue, Dec 10, 2019 at 04:47:49PM +0100, Andrea Bolognani wrote:
> On Tue, 2019-12-10 at 15:33 +0100, Andrew Jones wrote:
> > On Tue, Dec 10, 2019 at 03:21:02PM +0100, Andrea Bolognani wrote:
> > > I agree with everything except the naming: why would
> > > 
> > >   kvm-no-adjvtime=off  vtime is adjusted  (default)
> > >   kvm-no-adjvtime=on   vtime is not adjusted
> > > 
> > > be better than
> > > 
> > >   kvm-adjvtime=on   vtime is adjusted  (default)
> > >   kvm-adjvtime=off  vtime is not adjusted
> > > 
> > > ? Both offer the exact same amount of flexibility, but the latter has
> > > the advantage of not introducing any unwieldy double negatives.
> > 
> > A default of 'off' == 0 means not setting anything at all. There's
> > already precedent for 'kvm-no*' prefixed cpu features,
> > 
> > kvm-no-smi-migration
> > kvm-nopiodelay
> 
> Sorry, I'm not sure I understand. Do you mean that the array where
> you store CPU features is 0-inizialized, so it's more convenient to
> have off (0) as the default because it means you don't have to touch
> it beforehand? Or something different?
>

Right. The CPU feature flag (a boolean member of the CPU state) will
be zero by default because C. It's not a big deal, though, because the
property default can easily be set to true while it's added to a cpu
type.

I don't have a strong enough opinion about kvm-adjvtime vs.
kvm-no-adjvtime to insist one way or another. I agree double inversions
are easier to mess up, but I also like the way the '-no-' better
communicates that the default is [probably] 'yes'.

All interested parties, please vote. I'll be sending v2 soon and I can
call this thing anything the majority (or the dominate minority) prefer.

Thanks,
drew

Re: [PATCH v1 3/5] target/arm/kvm: Implement cpu feature kvm-adjvtime

2019-12-10 Thread Peter Maydell

On Wed, 16 Oct 2019 at 15:34, Andrew Jones  wrote:
>
> When kvm-adjvtime is enabled the guest's cntvct[_el0] won't count
> the time when the VM is stopped. That time is skipped by updating
> cntvoff[_el2] on each transition to vm_running using the current
> QEMU_CLOCK_VIRTUAL time. QEMU_CLOCK_VIRTUAL only ticks when the VM
> is running.
>
> This patch only provides the implementation. A subsequent patch
> will provide the CPU property allowing the feature to be enabled.

> +void kvm_arm_set_virtual_time(CPUState *cs)
> +{
> +uint64_t cnt;
> +struct kvm_one_reg reg = {
> +.id = KVM_REG_ARM_TIMER_CNT,
> +.addr = (uintptr_t),
> +};
> +int ret;
> +
> +cnt = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> +   cpu_get_host_tick_frequency(),
> +   NANOSECONDS_PER_SECOND);
> +
> +ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, );
> +if (ret) {
> +error_report("Failed to set KVM_REG_ARM_TIMER_CNT");
> +abort();
> +}

The commit message (and the doc comment for this function)
say that we're updating the counter offset, but the
kvm_one_reg operation here is updating the timer count
(and relying on the kernel's handling of "if we update
the timer count implement that by changing the offset").
That seems a bit confusing.

Would it be possible to implement "cntvct should not change while the
VM is stopped" with "read cntvct when the VM stops, and just write
back that value when the VM is restarted", rather than
"write back a new value calculated from QEMU_CLOCK_VIRTUAL"?
If I understand commit 00f4d64ee76e873be8 correctly, that's
basically how x86 is doing it. It would also let you sidestep
the need to know the tick frequency of the counter.

thanks
-- PMM

Re: [PATCH v1 2/5] timer: arm: Introduce functions to get the host cntfrq

2019-12-10 Thread Peter Maydell

On Wed, 16 Oct 2019 at 15:34, Andrew Jones  wrote:
>
> When acceleration like KVM is in use it's necessary to use the host's
> counter frequency when converting ticks to or from time units.
>
> Signed-off-by: Andrew Jones 
> Reviewed-by: Richard Henderson 
> ---
>  include/qemu/timer.h | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 85bc6eb00b21..8941ddea8242 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -1006,6 +1006,22 @@ static inline int64_t cpu_get_host_ticks(void)
>  }
>  #endif
>
> +#if defined(__aarch64__)
> +static inline uint32_t cpu_get_host_tick_frequency(void)
> +{
> +uint64_t frq;
> +asm volatile("mrs %0, cntfrq_el0" : "=r" (frq));
> +return frq;
> +}
> +#elif defined(__arm__)
> +static inline uint32_t cpu_get_host_tick_frequency(void)
> +{
> +uint32_t frq;
> +asm volatile("mrc p15, 0, %0, c14, c0, 0" : "=r" (frq));
> +return frq;
> +}
> +#endif

Don't we want to know what the guest counter frequency
is, not the host counter frequency? That is, I would have
expected that we get this value via doing a KVM ONE_REG ioctl
to ask the kernel what the guest cntfrq is. Are we using
the host value on the assumption that the guest might have
misprogrammed their copy of the register?

thanks
-- PMM

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani

On Tue, 2019-12-10 at 15:33 +0100, Andrew Jones wrote:
> On Tue, Dec 10, 2019 at 03:21:02PM +0100, Andrea Bolognani wrote:
> > I agree with everything except the naming: why would
> > 
> >   kvm-no-adjvtime=off  vtime is adjusted  (default)
> >   kvm-no-adjvtime=on   vtime is not adjusted
> > 
> > be better than
> > 
> >   kvm-adjvtime=on   vtime is adjusted  (default)
> >   kvm-adjvtime=off  vtime is not adjusted
> > 
> > ? Both offer the exact same amount of flexibility, but the latter has
> > the advantage of not introducing any unwieldy double negatives.
> 
> A default of 'off' == 0 means not setting anything at all. There's
> already precedent for 'kvm-no*' prefixed cpu features,
> 
> kvm-no-smi-migration
> kvm-nopiodelay

Sorry, I'm not sure I understand. Do you mean that the array where
you store CPU features is 0-inizialized, so it's more convenient to
have off (0) as the default because it means you don't have to touch
it beforehand? Or something different?

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Peter Maydell

On Tue, 10 Dec 2019 at 13:33, Andrew Jones  wrote:
> So the ins and outs of this particular timekeeping issue (to the best of
> my knowledge) is that x86 has implemented this behavior since
> 00f4d64ee76e ("kvmclock: clock should count only if vm is running"), which
> was committed over six years ago. Possibly x86 VM time would behave more
> like arm VM time if kvmclock was disabled, but that's not a recommended
> configuration.
>
> PPC got an equivalent patch to the x86 one in 2017, 42043e4f1241 ("spapr:
> clock should count only if vm is running"), but when stopping time during
> pause on spapr they actually *keep* 'date' and 'hwclock' in synch. I guess
> whatever clocksource 'hwclock' uses on spapr was already stopping when
> paused? For x86 those values diverge, and for arm without this series they
> stay the same but experience jumps, and with this series they diverge like
> x86. I don't see any way to disable the behaviour 42043e4f1241 introduces.
>
> s390x got what appears to be its equivalent patch last year 9bc9d3d1ae3b
> ("s390x/tod: Properly stop the KVM TOD while the guest is not running").
> The commit message doesn't state how hwclock and date values change /
> don't change, and I don't see any way to disable the behavior.
>
> MIPS has had this implemented since KVM support was introduced. No way
> to disable it that I know of.
>
> So why is this arm-specific? arm is just trying to catch up, but also
> offer a switch allowing the current behavior to be selected. If other
> architectures see value in the switch then they're free to adopt it too.
> After having done this git mining, it looks more and more like we should
> at least consider naming this feature 'kvm-no-adjvtime' and probably
> also change arm's default.

Thanks for pulling up the handling by other architectures.
I think I agree that we should change the arm default (ie
we should just call this a bug fix, since the old behaviour
seems unhelpful generally and is more random accident than
a deliberate choice), with a switch provided just in case
anybody had something depending on the old behaviour.

-- PMM

Re: [PATCH] qemu-img: fix info --backing-chain --image-opts

2019-12-10 Thread Kevin Wolf

Am 05.12.2019 um 14:46 hat Stefan Hajnoczi geschrieben:
> Only apply --image-opts to the topmost image when listing an entire
> backing chain.  It is incorrect to treat backing filenames as image
> options.  Assuming we have the backing chain t.IMGFMT.base <-
> t.IMGFMT.mid <- t.IMGFMT, qemu-img info fails as follows:
> 
>   $ qemu-img info --backing-chain --image-opts \
>   driver=qcow2,file.driver=file,file.filename=t.IMGFMT
>   qemu-img: Could not open 'TEST_DIR/t.IMGFMT.mid': Cannot find 
> device=TEST_DIR/t.IMGFMT.mid nor node_name=TEST_DIR/t.IMGFMT.mid
> 
> Signed-off-by: Stefan Hajnoczi 

> diff --git a/tests/qemu-iotests/279 b/tests/qemu-iotests/279
> new file mode 100755
> index 00..b555a92859
> --- /dev/null
> +++ b/tests/qemu-iotests/279
> @@ -0,0 +1,56 @@
> +#!/usr/bin/env bash
> +#
> +# Test qemu-img --backing-chain --image-opts
> +#
> +# Copyright (C) 2019 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +seq=$(basename "$0")
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +_cleanup_test_img

I'm squashing in this fixup:

diff --git a/tests/qemu-iotests/279 b/tests/qemu-iotests/279
index b555a92859..6682376808 100755
--- a/tests/qemu-iotests/279
+++ b/tests/qemu-iotests/279
@@ -26,6 +26,7 @@ status=1  # failure is the default!
 _cleanup()
 {
 _cleanup_test_img
+rm -f "$TEST_IMG.mid"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15


Thanks, applied to the block branch.

Kevin

Re: [PATCH] qapi: better document NVMe blockdev @device parameter

2019-12-10 Thread Kevin Wolf

Am 06.12.2019 um 15:38 hat Daniel P. Berrangé geschrieben:
> Mention that this is a PCI device address & give the format it is
> expected it. Also mention that it must be first unbound from any
> host kernel driver.
> 
> Signed-off-by: Daniel P. Berrangé 

Thanks, applied to the block branch.

Kevin

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrew Jones

On Tue, Dec 10, 2019 at 03:21:02PM +0100, Andrea Bolognani wrote:
> On Tue, 2019-12-10 at 14:32 +0100, Andrew Jones wrote:
> > After having done this git mining, it looks more and more like we should
> > at least consider naming this feature 'kvm-no-adjvtime' and probably
> > also change arm's default.
> 
> I agree with everything except the naming: why would
> 
>   kvm-no-adjvtime=off  vtime is adjusted  (default)
>   kvm-no-adjvtime=on   vtime is not adjusted
> 
> be better than
> 
>   kvm-adjvtime=on   vtime is adjusted  (default)
>   kvm-adjvtime=off  vtime is not adjusted
> 
> ? Both offer the exact same amount of flexibility, but the latter has
> the advantage of not introducing any unwieldy double negatives.
>

A default of 'off' == 0 means not setting anything at all. There's
already precedent for 'kvm-no*' prefixed cpu features,

kvm-no-smi-migration
kvm-nopiodelay

(Unless there's something called a 'nopio' then I guess that one is
missing a -, but whatever...)

Thanks,
drew

Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani

On Tue, 2019-12-10 at 14:32 +0100, Andrew Jones wrote:
> After having done this git mining, it looks more and more like we should
> at least consider naming this feature 'kvm-no-adjvtime' and probably
> also change arm's default.

I agree with everything except the naming: why would

  kvm-no-adjvtime=off  vtime is adjusted  (default)
  kvm-no-adjvtime=on   vtime is not adjusted

be better than

  kvm-adjvtime=on   vtime is adjusted  (default)
  kvm-adjvtime=off  vtime is not adjusted

? Both offer the exact same amount of flexibility, but the latter has
the advantage of not introducing any unwieldy double negatives.

-- 
Andrea Bolognani / Red Hat / Virtualization

[PATCH v3] net/imx_fec: Adding support for MAC filtering in the FEC IP implementation.

2019-12-10 Thread bilalwasim676

From: bwasim 

This addition ensures that the IP does NOT boot up in promiscuous mode
by default, and so the software only receives the desired
packets(Unicast, Broadcast, Unicast / Multicast hashed) by default.
The software running on-top of QEMU can also modify these settings and
disable reception of broadcast frames or make the IP receive all packets (PROM 
mode).
This patch greatly reduces the number of packets received by the
software running on-top of the QEMU model. Tested with the armv7-a SABRE_LITE 
machine.
Testing included running a custom OS with IPv4 / IPv6 support. Hashing
and filtering of packets is tested to work well. Skeleton taken from
the CADENCE_GEM IP and hash generation algorithm from the Linux Kernel.

Signed-off-by: Bilal Wasim 
---
 hw/net/imx_fec.c | 109 ++-
 include/hw/net/imx_fec.h |  10 
 2 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index bd99236864..d248f39fb0 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -419,6 +419,79 @@ static void imx_enet_write_bd(IMXENETBufDesc *bd, 
dma_addr_t addr)
 dma_memory_write(_space_memory, addr, bd, sizeof(*bd));
 }
 
+/*
+ * Calculate a FEC MAC Address hash index
+ */
+static unsigned calc_mac_hash(const uint8_t *mac, uint8_t mac_length)
+{
+uint32_t crc = net_crc32_le(mac, mac_length);
+
+/*
+ * only upper 6 bits (FEC_HASH_BITS) are used
+ * which point to specific bit in the hash registers
+ */
+return (crc >> (32 - FEC_HASH_BITS)) & 0x3f;
+}
+
+/*
+ * fec_mac_address_filter:
+ * Accept or reject this destination address?
+ */
+static int fec_mac_address_filter(IMXFECState *s, const uint8_t *packet)
+{
+const uint8_t broadcast_addr[] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };
+uint32_t addr1, addr2;
+uint8_t  hash;
+
+/* Promiscuous mode? */
+if (s->regs[ENET_RCR] & ENET_RCR_PROM) {
+/* Accept all packets in promiscuous mode (even if bc_rej is set). */
+return FEC_RX_PROMISCUOUS_ACCEPT;
+}
+
+/* Broadcast packet? */
+if (!memcmp(packet, broadcast_addr, 6)) {
+/* Reject broadcast packets? */
+if (s->regs[ENET_RCR] & ENET_RCR_BC_REJ) {
+return FEC_RX_REJECT;
+}
+/* Accept packets from broadcast address. */
+return FEC_RX_BROADCAST_ACCEPT;
+}
+
+/* Accept packets -w- hash match? */
+hash = calc_mac_hash(packet, 6);
+
+/* Accept packets -w- multicast hash match? */
+if ((packet[0] & 0x01) == 0x01) {
+/* Computed hash matches GAUR / GALR register ? */
+if (((hash < 32) && (s->regs[ENET_GALR] & (1 << hash)))
+|| ((hash > 31) && (s->regs[ENET_GAUR] & (1 << (hash - 32) 
{
+/* Accept multicast hash enabled address. */
+return FEC_RX_MULTICAST_HASH_ACCEPT;
+}
+} else {
+/* Computed hash matches IAUR / IALR register ? */
+if (((hash < 32) && (s->regs[ENET_IALR] & (1 << hash)))
+|| ((hash > 31) && (s->regs[ENET_IAUR] & (1 << (hash - 32) 
{
+/* Accept multicast hash enabled address. */
+return FEC_RX_UNICAST_HASH_ACCEPT;
+}
+}
+
+/* Match Unicast address. */
+addr1  = g_htonl(s->regs[ENET_PALR]);
+addr2  = g_htonl(s->regs[ENET_PAUR]);
+if (!(memcmp(packet, (uint8_t *) , 4) ||
+  memcmp(packet + 4, (uint8_t *) , 2))) {
+/* Accept packet because it matches my unicast address. */
+return FEC_RX_UNICAST_ACCEPT;
+}
+
+/* Return -1 because we do NOT support MAC address filtering.. */
+return FEC_RX_REJECT;
+}
+
 static void imx_eth_update(IMXFECState *s)
 {
 /*
@@ -984,7 +1057,7 @@ static void imx_eth_write(void *opaque, hwaddr offset, 
uint64_t value,
 case ENET_IALR:
 case ENET_GAUR:
 case ENET_GALR:
-/* TODO: implement MAC hash filtering.  */
+s->regs[index] |= value;
 break;
 case ENET_TFWR:
 if (s->is_fec) {
@@ -1066,8 +1139,15 @@ static ssize_t imx_fec_receive(NetClientState *nc, const 
uint8_t *buf,
 uint32_t buf_addr;
 uint8_t *crc_ptr;
 unsigned int buf_len;
+int maf;
 size_t size = len;
 
+/* Is this destination MAC address "for us" ? */
+maf = fec_mac_address_filter(s, buf);
+if (maf == FEC_RX_REJECT) {
+return FEC_RX_REJECT;
+}
+
 FEC_PRINTF("len %d\n", (int)size);
 
 if (!s->regs[ENET_RDAR]) {
@@ -1133,6 +1213,16 @@ static ssize_t imx_fec_receive(NetClientState *nc, const 
uint8_t *buf,
 } else {
 s->regs[ENET_EIR] |= ENET_INT_RXB;
 }
+
+/* Update descriptor based on the "maf" flag. */
+if (maf == FEC_RX_BROADCAST_ACCEPT) {
+/* The packet is destined for the "broadcast" address. */
+bd.flags |= ENET_BD_BC;
+} else if (maf == FEC_RX_MULTICAST_HASH_ACCEPT) {
+/* The packet is destined for a "multicast" address.

Re: 回复: 回复: 答复: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.

2019-12-10 Thread Dr. David Alan Gilbert

* Lin Ma (l...@suse.com) wrote:
> Hi Dave,
> 
> The patch fixed the issue, The rate limit with hugepages works well.
> Thanks for your help!

No problem; thank you for reporting and testing it.

Dave

> Lin
> 
> > -邮件原件-
> > 发件人: Dr. David Alan Gilbert 
> > 发送时间: 2019年12月5日 18:32
> > 收件人: Lin Ma 
> > 抄送: qemu-devel@nongnu.org
> > 主题: Re: 回复: 答复: [Qemu-devel] migrate_set_speed has no effect if the
> > guest is using hugepages.
> > 
> > Hi Lin,
> >   I've just posted 'migration: Rate limit inside host pages'; please let me 
> > know if
> > it helps for you.
> > 
> > Dave
> > 
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

1 2 >

1 - 100 of 167 matches

Mail list logo