Re: [PATCH] hw/core/loader: Fix possible crash in rom_copy()

2019-09-25 Thread Thomas Huth
On 25/09/2019 22.51, Philippe Mathieu-Daudé wrote:
[...]
> Let's say I have write access to a LAN TFTP server used by some PXE
> bootloader where I can store my crafted nasty kernel, then I get this score:
> 
> https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:A/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H/E:P/RL:O/RC:C&version=3.1
> 
> CVSS Base Score: 9.6
> CVSS Temporal Score: 8.6
> 
> Which seems quite high.

I don't think you can trigger this bug this way. If you load your kernel
via a PXE server, the ELF parsing will be done by the bootloader, won't
it? I think you can only trigger this bug here if you load your kernel
via the "-kernel" command line parameter of QEMU (or the generic-loader
device), so it's not a real guest escape, as far as I can see.

 Thomas



[Bug 1841990] Re: instruction 'denbcdq' misbehaving

2019-09-25 Thread Mark Cave-Ayland
That's looking much better :)  And finally, how many failures do you get
running the same test under QEMU 3.1? If that gives you zero failures
then I'll need to look a lot closer at the changes to try and figure out
what is going on.

As a matter of interest, which tests are the ones that are failing?

https://bugs.launchpad.net/bugs/1841990

Title:
  instruction 'denbcdq' misbehaving

Status in QEMU:
  New

Bug description:
  Instruction 'denbcdq' appears to have no effect.  Test case attached.

  On ppc64le native:
  --
  gcc -g -O -mcpu=power9 bcdcfsq.c test-denbcdq.c -o test-denbcdq
  $ ./test-denbcdq
  0x
  0x000c
  0x2208
  $ ./test-denbcdq 1
  0x0001
  0x001c
  0x22080001
  $ ./test-denbcdq $(seq 0 99)
  0x0064
  0x100c
  0x22080080
  --

  With "qemu-ppc64le -cpu power9"
  --
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq
  0x
  0x000c
  0x000c
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq 1
  0x0001
  0x001c
  0x001c
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq $(seq 100)
  0x0064
  0x100c
  0x100c
  --

  I started looking at the code, but I got confused rather quickly.
  Could be related to endianness? I think denbcdq arrived on the scene
  before little-endian was a big deal.  Maybe something to do with
  utilizing implicit floating-point register pairs...  I don't think the
  right data is getting to helper_denbcdq, which would point back to the
  gen_fprp_ptr uses in dfp-impl.inc.c (GEN_DFP_T_FPR_I32_Rc).  (Maybe?)
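
For readers comparing the outputs above: the second row of each run (0x...1c, 0x...100c) looks like signed packed BCD, one decimal digit per nibble with a sign nibble of 0xc at the bottom. That reading is an assumption based on the printed values and the bcdcfsq.c file name, since the test source is not shown here. A minimal sketch of that encoding, using a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the attached test case): pack a
 * non-negative integer into signed packed BCD. Each decimal digit takes
 * one nibble and the lowest nibble holds the sign code 0xC ("plus").
 * A 64-bit result has room for 15 digits; higher digits are dropped.
 */
static uint64_t to_signed_bcd(uint64_t value)
{
    uint64_t bcd = 0xC;      /* sign nibble */
    unsigned shift = 4;      /* first digit sits just above the sign */

    while (value != 0 && shift < 64) {
        bcd |= (value % 10) << shift;
        value /= 10;
        shift += 4;
    }
    return bcd;
}
```

With this encoding 1 becomes 0x1c and 100 becomes 0x100c, which matches the low digits of the second output row. The third row is the part that differs between native and QEMU runs.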




Re: [PATCH] hw/core/loader: Fix possible crash in rom_copy()

2019-09-25 Thread Thomas Huth
On 25/09/2019 22.51, Philippe Mathieu-Daudé wrote:
> Hi Thomas,
> 
> On 9/25/19 3:03 PM, Thomas Huth wrote:
>> Both, "rom->addr" and "addr" are derived from the binary image
>> that can be loaded with the "-kernel" parameter. The code in
>> rom_copy() then calculates:
>>
>> d = dest + (rom->addr - addr);
>>
>> and uses "d" as destination in a memcpy() some lines later. Now with
>> bad kernel images, it is possible that rom->addr is smaller than addr,
>> thus "rom->addr - addr" gets negative and the memcpy() then tries to
>> copy contents from the image to a bad memory location. In the best case,
>> this just crashes QEMU, in the worst case, this could maybe be used to
>> inject code from the kernel image into the QEMU binary, so we better fix
>> it with an additional sanity check here.
>>
>> Cc: qemu-sta...@nongnu.org
>> Reported-by: Guangming Liu
>> Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
> 
> "This page does not exist, or you may not have permission to see it."
> 
> This seems security related. Shouldn't we open a CVE for this?
> https://wiki.qemu.org/SecurityProcess#CVE_allocation

I wrote to the security team before writing the patch, so I assume a CVE
number is already on the way. I'll reply to this thread when it is
available.

 Thomas
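
To illustrate the kind of sanity check discussed in the quoted commit message, here is a minimal sketch. It is not the code from the patch and not QEMU's rom_copy(); the function name and parameters are hypothetical. It only shows why the subtraction has to be guarded before the result is used as a memcpy() destination.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU's rom_copy(): "dest" represents the
 * address window [base, base + size). Copy a blob that claims to live
 * at rom_addr into that window, refusing offsets that fall outside it.
 * Returns the number of bytes copied.
 */
static size_t copy_rom_checked(uint8_t *dest, uint64_t base, size_t size,
                               const uint8_t *data, uint64_t rom_addr,
                               size_t rom_len)
{
    uint64_t off;
    size_t n = rom_len;

    if (rom_addr < base) {
        /* rom_addr - base would wrap around: this is the bad case. */
        return 0;
    }
    off = rom_addr - base;
    if (off >= size) {
        return 0;            /* starts beyond the end of the window */
    }
    if (n > size - off) {
        n = (size_t)(size - off);   /* clamp to what fits */
    }
    memcpy(dest + off, data, n);
    return n;
}
```

Without the first check, an image with rom_addr below base yields a huge unsigned offset (or a negative one with signed types), and the copy lands outside the destination buffer.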



Re: [PATCH v5 2/2] target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/26/19 4:10 AM, Tao Xu wrote:
> Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
> it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
> CPUID_7_0_ECX_AVX512_VBMI2.
> 
> Acked-by: Stefano Garzarella 
> Signed-off-by: Tao Xu 
> ---
>  target/i386/cpu.c   | 8 
>  target/i386/cpu.h   | 5 ++---
>  target/i386/hvf/x86_cpuid.c | 2 +-
>  3 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 9e0bac31e8..71034aeb5a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2412,8 +2412,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
>  CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
>  CPUID_7_0_EBX_SMAP,
>  .features[FEAT_7_0_ECX] =
> -CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> -CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
> +CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> +CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>  CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>  CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>  CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
> @@ -2470,8 +2470,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
>  CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
>  CPUID_7_0_EBX_AVX512VL | CPUID_7_0_EBX_CLFLUSHOPT,
>  .features[FEAT_7_0_ECX] =
> -CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> -CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
> +CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> +CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>  CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>  CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>  CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57,
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index fa4c4cad79..8e090acd74 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -695,8 +695,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>  #define CPUID_7_0_EBX_AVX512VL  (1U << 31)
>  
>  /* AVX-512 Vector Byte Manipulation Instruction */
> -#define CPUID_7_0_ECX_AVX512BMI (1U << 1)
> -#define CPUID_7_0_ECX_VBMI  (1U << 1)
> +#define CPUID_7_0_ECX_AVX512_VBMI   (1U << 1)
>  /* User-Mode Instruction Prevention */
>  #define CPUID_7_0_ECX_UMIP  (1U << 2)
>  /* Protection Keys for User-mode Pages */
> @@ -704,7 +703,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>  /* OS Enable Protection Keys */
>  #define CPUID_7_0_ECX_OSPKE (1U << 4)
>  /* Additional AVX-512 Vector Byte Manipulation Instruction */
> -#define CPUID_7_0_ECX_VBMI2 (1U << 6)
> +#define CPUID_7_0_ECX_AVX512_VBMI2  (1U << 6)
>  /* Galois Field New Instructions */
>  #define CPUID_7_0_ECX_GFNI  (1U << 8)
>  /* Vector AES Instructions */
> diff --git a/target/i386/hvf/x86_cpuid.c b/target/i386/hvf/x86_cpuid.c
> index 4d957fe896..16762b6eb4 100644
> --- a/target/i386/hvf/x86_cpuid.c
> +++ b/target/i386/hvf/x86_cpuid.c
> @@ -89,7 +89,7 @@ uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
>  ebx &= ~CPUID_7_0_EBX_INVPCID;
>  }
>  
> -ecx &= CPUID_7_0_ECX_AVX512BMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
> +ecx &= CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
>  edx &= CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS;
>  } else {
>  ebx = 0;
> 

Reviewed-by: Philippe Mathieu-Daudé 
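
For context, a small sketch of how feature-bit macros like the renamed ones are typically used. The two macro values are the ones shown in the cpu.h hunk above; the helper functions are hypothetical and only illustrate the masking pattern seen in the hvf hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as they appear in the patch to target/i386/cpu.h. */
#define CPUID_7_0_ECX_AVX512_VBMI   (1U << 1)
#define CPUID_7_0_ECX_AVX512_VBMI2  (1U << 6)

/* Hypothetical helper: test one feature bit in a CPUID.7.0:ECX value. */
static bool ecx_has_vbmi(uint32_t ecx)
{
    return (ecx & CPUID_7_0_ECX_AVX512_VBMI) != 0;
}

/*
 * Hypothetical helper: keep only the bits named in "allowed", the same
 * shape as the "ecx &= ..." line in hvf_get_supported_cpuid().
 */
static uint32_t filter_ecx(uint32_t ecx, uint32_t allowed)
{
    return ecx & allowed;
}
```

Because the old CPUID_7_0_ECX_AVX512BMI and CPUID_7_0_ECX_VBMI names both expanded to bit 1, either spelling produced the same mask, which is why a single name is enough.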



Re: [PATCH v3 29/33] docker: remove 'deprecated' image definitions

2019-09-25 Thread Philippe Mathieu-Daudé
Hi Alex,

On 9/26/19 1:34 AM, Alex Bennée wrote:
> Philippe Mathieu-Daudé  writes:
>> On 9/24/19 11:01 PM, Alex Bennée wrote:
>>> From: John Snow 
>>>
>>> There isn't a debian.dockerfile anymore,
>>> so perform some ghost-busting.
>>
>> Won't we deprecate other images in the future?
> 
> Sure but we can just drop them from dockerfiles. It's not like we
> allowed people to use them as we filtered them out.

This patch isn't about removing a deprecated image, but about removing
the handy DOCKER_DEPRECATED_IMAGES variable used to start a deprecation
process.

Fam once reminded us that we should respect the QEMU deprecation policy even
with docker images, because there might be users relying on them, so we
want to give them time to adapt. I cannot find a thread on the list, so
we might have discussed that over IRC. The related commits are:

$ git show bcaf457786c

docker: do not display deprecated images in 'make docker' help

the 'debian' base image is deprecated since 3e11974988d8

$ git show 3e11974988d8

docker: warn users to use newer debian8/debian9 base image

to stay backward incompatible.

I'd rather keep the DOCKER_DEPRECATED_IMAGES variable empty, maybe with
a comment describing why it exists. What do you think?

Thanks,

Phil.

>>> Signed-off-by: John Snow 
>>> Message-Id: <20190923181140.7235-4-js...@redhat.com>
>>> Signed-off-by: Alex Bennée 
>>> ---
>>>  tests/docker/Makefile.include | 7 +++
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
>>> index 82d5a8a5393..fd6f470fbf8 100644
>>> --- a/tests/docker/Makefile.include
>>> +++ b/tests/docker/Makefile.include
>>> @@ -4,11 +4,10 @@
>>>
>>>  DOCKER_SUFFIX := .docker
>>>  DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
>>> -DOCKER_DEPRECATED_IMAGES := debian
>>>  # we don't run tests on intermediate images (used as base by another image)
>>> -DOCKER_PARTIAL_IMAGES := debian debian9 debian10 debian-sid
>>> +DOCKER_PARTIAL_IMAGES := debian9 debian10 debian-sid
>>>  DOCKER_PARTIAL_IMAGES += debian9-mxe debian-ports debian-bootstrap
>>> -DOCKER_IMAGES := $(filter-out $(DOCKER_DEPRECATED_IMAGES),$(sort $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker)))))
>>> +DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker))))
>>>  DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
>>>  # Use a global constant ccache directory to speed up repetitive builds
>>>  DOCKER_CCACHE_DIR := $$HOME/.cache/qemu-docker-ccache
>>> @@ -160,7 +159,7 @@ docker-image-debian-powerpc-user-cross: docker-binfmt-image-debian-powerpc-user
>>>  DOCKER_USER_IMAGES += debian-powerpc-user
>>>
>>>  # Expand all the pre-requistes for each docker image and test combination
>>> -$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES) $(DOCKER_DEPRECATED_IMAGES)), \
>>> +$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES)), \
>>> $(foreach t,$(DOCKER_TESTS) $(DOCKER_TOOLS), \
>>> $(eval .PHONY: docker-$t@$i) \
>>> $(eval docker-$t@$i: docker-image-$i docker-run-$t@$i) \
>>>
> 
> 
> --
> Alex Bennée
> 



Re: [PATCH] docker: fix uid maping with podman

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/26/19 1:31 AM, no-re...@patchew.org wrote:
> Patchew URL: 
> https://patchew.org/QEMU/4b9204cc8ade1c965dc5412c53c6f7c5b4f019a2.1569413332.git.tgole...@redhat.com/
> 
> Hi,
> 
> This series failed the asan build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce
> it locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> export ARCH=x86_64
> make docker-image-fedora V=1 NETWORK=1
> time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
> The full log is available at
> http://patchew.org/logs/4b9204cc8ade1c965dc5412c53c6f7c5b4f019a2.1569413332.git.tgole...@redhat.com/testing.asan/?type=message.

The issue does not seem related to this particular patch:

  SPHINX  docs/specs
Exception occurred:
  File
"/usr/lib/python3.7/site-packages/sphinx/environment/__init__.py", line
612, in get_doctree
doctree = pickle.load(f)
_pickle.UnpicklingError: pickle data was truncated
The full traceback has been saved in /tmp/sphinx-err-d58j1r8p.log, if
you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error
message can be provided next time.
A bug report can be filed in the tracker at
. Thanks!
make: *** [Makefile:990: docs/interop/index.html] Error 2



RE: [PATCH v8 01/13] vfio: KABI for migration interface

2019-09-25 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, September 26, 2019 3:06 AM
[...]
> > > > The second point is about write-protection:
> > > >
> > > > > There is another value of recording GPA in VFIO. Vendor drivers
> > > > > (e.g. GVT-g) may need to selectively write-protect guest memory
> > > > > pages when interpreting certain workload descriptors. Those pages
> > > > > are recorded in IOVA when vIOMMU is enabled, however the KVM
> > > > > write-protection API only knows GPA. So currently vIOMMU must
> > > > > be disabled on Intel vGPUs when GVT-g is enabled. To make it
> > > > > > working we need a way to translate IOVA into GPA in the vendor drivers.
> > > > > There are two options. One is having KVM export a new API for such
> > > > > translation purpose. But as you explained earlier it's not good to
> > > > > have vendor drivers depend on KVM. The other is having VFIO
> > > > > maintaining such knowledge through extended MAP interface,
> > > > > then providing a uniform API for all vendor drivers to use.
> > >
> > > So the argument is that in order to interact with KVM (write protecting
> > > guest memory) there's a missing feature (IOVA to GPA translation), but
> > > we don't want to add an API to KVM for this feature because that would
> > > create a dependency on KVM (for interacting with KVM), so lets add an
> > > API to vfio instead.  That makes no sense to me.  What am I missing?
> > > Thanks,
> > >
> >
> > Then do you have a recommendation how such feature can be
> > implemented cleanly in vendor driver, without introducing direct
> > dependency on KVM?
> 
> I think the disconnect is that these sorts of extensions don't reflect
> things that a physical device can actually do.  The idea of vfio is
> that it's a userspace driver interface.  It provides a channel for the
> user to interact with the device, map device resources, receive
> interrupts, map system memory through the iommu, etc.  Mediated devices
> augment this by replacing the physical device the user accesses with a
> software virtualized device.  So then the question becomes why this
> device virtualizing software, ie. the mdev vendor driver, needs to do
> things that a physical device clearly cannot do.  For example, how can
> a physical device write-protect portions of system memory?  Or even,
> why would it need to?  It makes me suspect that mdev is being used to
> bypass the hypervisor, or maybe fill in the gaps for hardware that
> isn't as "mediation friendly" as it claims to be.

We do have one such example on Intel GPU. To support direct cmd
submission from userspace (SVA), kernel driver allocates a doorbell
page (in system memory) for each application and then registers
the page to the GPU. Once the doorbell is armed, the GPU starts
to monitor CPU writes to that page. Then the application can ring the 
GPU by simply writing to the doorbell page to submit cmds. This
possibly makes sense only for integrated devices.

In case direct submission is not allowed in a mediated device
(some auditing work is required in GVT-g), we need to write-protect
the doorbell page with hypervisor help to mimic the hardware
behavior. We have prototype work internally, but haven't sent it out.

> 
> In the case of a physical device discovering an iova translation, this
> is what device iotlbs are for, but as an acceleration and offload
> mechanism for the system iommu rather than a lookup mechanism as seems
> to be wanted here.  If we had a system iommu with dirty page tracking,
> I believe that tracking would live in the iommu page tables and
> therefore reflect dirty pages relative to iova.  We'd need to consume
> those dirty page bits before we tear down the iova mappings, much like
> we're suggesting QEMU do here.

Yes. There are two cases:

1) iova shadowing. Say using only 2nd level as today. Here the dirty 
bits are associated to iova. When Qemu is revised to invoke log_sync 
before tearing down any iova mapping, vfio can get the dirty info 
from iommu driver for affected range.

2) iova nesting, where iova->gpa is in 1st level and gpa->hpa is in
2nd level. In that case the iova carried in the map/unmap ioctl is
actually gpa, thus the dirty bits are associated to gpa. In such case,
Qemu should continue to consume gpa-based dirty bitmap, as if
viommu is disabled.
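
A rough sketch of what case 1 implies: dirty bits recorded per IOVA page have to be folded into a GPA-indexed bitmap while the IOVA mapping still exists. Everything here (the structure, the function names, the bitmap layout) is a hypothetical illustration, not the vfio or QEMU interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous IOVA -> GPA mapping. */
struct iova_map {
    uint64_t iova_pfn;   /* first IOVA page of the mapping */
    uint64_t gpa_pfn;    /* first guest-physical page it maps to */
    uint64_t npages;     /* length of the mapping in pages */
};

static void set_bit64(uint64_t *bm, uint64_t n)
{
    bm[n / 64] |= 1ULL << (n % 64);
}

static bool test_bit64(const uint64_t *bm, uint64_t n)
{
    return (bm[n / 64] >> (n % 64)) & 1;
}

/*
 * Fold IOVA-relative dirty bits into a GPA-indexed bitmap. Bit i of
 * iova_dirty refers to page (m->iova_pfn + i). This has to run before
 * the mapping is torn down, because afterwards the IOVA -> GPA
 * relation is gone. Returns the number of dirty pages transferred.
 */
static size_t sync_dirty_iova_to_gpa(const struct iova_map *m,
                                     const uint64_t *iova_dirty,
                                     uint64_t *gpa_dirty,
                                     uint64_t gpa_pages)
{
    size_t count = 0;

    for (uint64_t i = 0; i < m->npages; i++) {
        uint64_t gpa = m->gpa_pfn + i;

        if (!test_bit64(iova_dirty, i) || gpa >= gpa_pages) {
            continue;
        }
        set_bit64(gpa_dirty, gpa);
        count++;
    }
    return count;
}
```

In case 2 no such step is needed, since the addresses seen in map/unmap are already GPAs and the bitmap can be consumed as-is.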

> 
> Unfortunately I also think that KVM and vhost are not really the best
> examples of what we need to do for vfio.  KVM is intimately involved
> with GPAs, so clearly dirty page tracking at that level is not an
> issue.  Vhost tends to circumvent the viommu; it's trying to poke
> directly into guest memory without the help of a physical iommu.  So
> I can't say that I have much faith that QEMU is already properly wired
> with respect to viommu and dirty page tracking, leaving open the
> possibility that a log_sync on iommu region unmap is simply a gap in
> the QEMU migration story.  The vfio migration interface we have on the
> table 

Re: [PATCH 04/20] xics: Eliminate reset hook

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 09:59:52AM +0200, Greg Kurz wrote:
> On Wed, 25 Sep 2019 16:45:18 +1000
> David Gibson  wrote:
> 
> > Currently TYPE_XICS_BASE and TYPE_XICS_SIMPLE have their own reset methods,
> > using the standard technique for having the subtype call the supertype's
> > methods before doing its own thing.
> > 
> > But TYPE_XICS_SIMPLE is the only subtype of TYPE_XICS_BASE ever
> > instantiated, so there's no point having the split here.  Merge them
> > together into just an ics_reset() function.
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/intc/xics.c| 57 ++-
> >  include/hw/ppc/xics.h |  1 -
> >  2 files changed, 24 insertions(+), 34 deletions(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 310dc72b46..82e6f09259 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -547,11 +547,28 @@ static void ics_eoi(ICSState *ics, uint32_t nr)
> >  }
> >  }
> >  
> > -static void ics_simple_reset(DeviceState *dev)
> > +static void ics_reset_irq(ICSIRQState *irq)
> >  {
> > -ICSStateClass *icsc = ICS_BASE_GET_CLASS(dev);
> > +irq->priority = 0xff;
> > +irq->saved_priority = 0xff;
> > +}
> >  
> > -icsc->parent_reset(dev);
> > +static void ics_reset(DeviceState *dev)
> > +{
> > +ICSState *ics = ICS_BASE(dev);
> > +int i;
> > +uint8_t flags[ics->nr_irqs];
> > +
> > +for (i = 0; i < ics->nr_irqs; i++) {
> > +flags[i] = ics->irqs[i].flags;
> > +}
> > +
> > +memset(ics->irqs, 0, sizeof(ICSIRQState) * ics->nr_irqs);
> > +
> > +for (i = 0; i < ics->nr_irqs; i++) {
> > +ics_reset_irq(ics->irqs + i);
> > +ics->irqs[i].flags = flags[i];
> > +}
> >  
> >  if (kvm_irqchip_in_kernel()) {
> >  Error *local_err = NULL;
> > @@ -563,9 +580,9 @@ static void ics_simple_reset(DeviceState *dev)
> >  }
> >  }
> >  
> > -static void ics_simple_reset_handler(void *dev)
> > +static void ics_reset_handler(void *dev)
> >  {
> > -ics_simple_reset(dev);
> > +ics_reset(dev);
> >  }
> >  
> >  static void ics_simple_realize(DeviceState *dev, Error **errp)
> > @@ -580,7 +597,7 @@ static void ics_simple_realize(DeviceState *dev, Error **errp)
> >  return;
> >  }
> >  
> > -qemu_register_reset(ics_simple_reset_handler, ics);
> > +qemu_register_reset(ics_reset_handler, ics);
> 
> As suggested by Philippe, this could be the opportunity to add
> a comment that explain why we rely on qemu_register_reset()
> rather than dc->reset.

I don't think it's really in scope for this patch, since it was there
just as bare before.  I'm considering it for another patch, but I'm
still thinking about exactly what I want to do with the reset.

> 
> >  }
> >  
> >  static void ics_simple_class_init(ObjectClass *klass, void *data)
> > @@ -590,8 +607,6 @@ static void ics_simple_class_init(ObjectClass *klass, void *data)
> >  
> >  device_class_set_parent_realize(dc, ics_simple_realize,
> >  >parent_realize);
> > -device_class_set_parent_reset(dc, ics_simple_reset,
> > -  >parent_reset);
> >  }
> >  
> >  static const TypeInfo ics_simple_info = {
> > @@ -602,30 +617,6 @@ static const TypeInfo ics_simple_info = {
> >  .class_size = sizeof(ICSStateClass),
> >  };
> >  
> > -static void ics_reset_irq(ICSIRQState *irq)
> > -{
> > -irq->priority = 0xff;
> > -irq->saved_priority = 0xff;
> > -}
> > -
> > -static void ics_base_reset(DeviceState *dev)
> > -{
> > -ICSState *ics = ICS_BASE(dev);
> > -int i;
> > -uint8_t flags[ics->nr_irqs];
> > -
> > -for (i = 0; i < ics->nr_irqs; i++) {
> > -flags[i] = ics->irqs[i].flags;
> > -}
> > -
> > -memset(ics->irqs, 0, sizeof(ICSIRQState) * ics->nr_irqs);
> > -
> > -for (i = 0; i < ics->nr_irqs; i++) {
> > -ics_reset_irq(ics->irqs + i);
> > -ics->irqs[i].flags = flags[i];
> > -}
> > -}
> > -
> >  static void ics_base_realize(DeviceState *dev, Error **errp)
> >  {
> >  ICSState *ics = ICS_BASE(dev);
> > @@ -726,7 +717,7 @@ static void ics_base_class_init(ObjectClass *klass, void *data)
> >  
> >  dc->realize = ics_base_realize;
> >  dc->props = ics_base_properties;
> > -dc->reset = ics_base_reset;
> > +dc->reset = ics_reset;
> 
> I hadn't spotted it previously but since you're removing the call to
> device_class_set_parent_reset(), we don't need dc->reset anymore.

Hrm, I'd prefer to leave it in there, even though it's not strictly
necessary - this way calling device_reset() on the ICS will do what
you'd expect.

> 
> This basically reverts:
> 
> commit eeefd43b3cf342d1696128462a16e092995ff1b5
> Author: Cédric Le Goater 
> Date:   Mon Jun 25 11:17:16 2018 +0200
> 
> ppx/xics: introduce a parent_reset in ICSStateClass
> 
> Just like for the realize handlers, this makes possible to move the
> common ICSState code of the reset handlers in 

Re: [PATCH 11/20] spapr: Fix indexing of XICS irqs

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 10:17:46PM +0200, Greg Kurz wrote:
> On Wed, 25 Sep 2019 16:45:25 +1000
> David Gibson  wrote:
> 
> > spapr global irq numbers are different from the source numbers on the ICS
> > when using XICS - they're offset by XICS_IRQ_BASE (0x1000).  But
> > spapr_irq_set_irq_xics() was passing through the global irq number to
> > the ICS code unmodified.
> > 
> > We only got away with this because of a counteracting bug - we were
> > incorrectly adjusting the qemu_irq we returned for a requested global irq
> > number.
> > 
> > That approach mostly worked but is very confusing, incorrectly relies on
> > the way the qemu_irq array is allocated, and undermines the intention of
> > having the global array of qemu_irqs for spapr have a consistent meaning
> > regardless of irq backend.
> > 
> > So, fix both set_irq and qemu_irq indexing.  We rename some parameters at
> > the same time to make it clear that they are referring to spapr global
> > irq numbers.
> > 
> > Signed-off-by: David Gibson 
> > ---
> 
> Reviewed-by: Greg Kurz 
> 
> Further cleanup could be to have the XICS backend to only take global
> irq numbers and to convert them to ICS source numbers internally. This
> would put an end to the confusion between srcno/irq in the frontend
> code.

Yeah, maybe.  But the local srcnos do actually make sense from within
the perspective of ICS, so I'm not all that keen to do that.
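
For readers following along, the relation between the two numbering spaces can be sketched as below. XICS_IRQ_BASE is the value quoted in the commit message; the helper names and the bounds check are hypothetical and are not the functions touched by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define XICS_IRQ_BASE 0x1000   /* offset quoted in the commit message */

/*
 * Hypothetical helper: translate a spapr global irq number into an ICS
 * source number. Fails if the number is below the base or beyond the
 * number of sources the ICS has.
 */
static bool irq_to_srcno(uint32_t irq, uint32_t nr_irqs, uint32_t *srcno)
{
    if (irq < XICS_IRQ_BASE || irq - XICS_IRQ_BASE >= nr_irqs) {
        return false;
    }
    *srcno = irq - XICS_IRQ_BASE;
    return true;
}

/* Hypothetical helper: the reverse direction. */
static uint32_t srcno_to_irq(uint32_t srcno)
{
    return srcno + XICS_IRQ_BASE;
}
```

Passing a global number straight to code that expects a source number skips this subtraction, which is the bug described above.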

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH 0/4] xics: Eliminate unnecessary class

2019-09-25 Thread David Gibson
On Tue, Sep 24, 2019 at 01:00:00PM +0200, Cédric Le Goater wrote:
> On 24/09/2019 12:04, Philippe Mathieu-Daudé wrote:
> > Hi Cédric,
> > 
> > On 9/24/19 11:55 AM, Cédric Le Goater wrote:
> >> On 24/09/2019 09:52, Greg Kurz wrote:
> >>> On Tue, 24 Sep 2019 07:22:51 +0200
> >>> Cédric Le Goater  wrote:
> >>>
>  On 24/09/2019 06:59, David Gibson wrote:
> > The XICS interrupt controller device used to have separate subtypes
> > for the KVM and non-KVM variant of the device.  That was a bad idea,
> > because it leaked information that should be entirely host-side
> > implementation specific to the kinda-sorta guest visible QOM class
> > names.
> >
> > We eliminated the KVM specific class some time ago, but it's left
> > behind a distinction between the TYPE_ICS_BASE abstract class and
> > TYPE_ICS_SIMPLE subtype which no longer serves any purpose.
> >
> > This series collapses the two types back into one.
> 
>  OK. Is it migration compatible ? because this is why we have kept
> >>>
> >>> Yeah, the types themselves don't matter, only the format of the
> >>> binary stream described in the VMStateDescription does.
> >>>
>  this subclass. I am glad we can remove it finally. 
> 
> >>>
> >>> I was thinking we were keeping that for pnv...
> >>
> >>
> >> Yes, also. See the resend and reject handler in the code 
> >> below.
> >>
> >>
> >> I have been maintaining this patch since QEMU 2.6. I think 
> >> it is time for me to declare defeat on getting it merged 
> >> and QEMU 4.2 will be the last rebase. 
> > 
> > Do you remember what is missing for being merged upstream?
> 
> lack of interest I would say. Here is the patch.

So, I had a look at what this is doing with ICS.

AFAICT, what it really wants is something *more* abstract than ICS_BASE
- the main reason for its own resend and reject hooks is that it
uses a different representation of the irq state - the RBA
registers rather than the full set of state bits per irq that are in
ICSState.

I'm also not entirely convinced the current PHB3 draft patch keeps
the stuff in ICSState properly synchronized with its own
representation.

I can see two ways we could handle that 1) just use ICS_SIMPLE and
convert to the RBA representation at MMIO time, simple but slow (but
maybe not enough that we'd care).  2) Have a different ics base class
(let's call it ICS_ABSTRACT) that has the reject/resend/eoi hooks but
whose instance has no state at all.

In any case (2) is different enough from what we have now that I now
tend to agree that the way forward is to collapse the types together
as I've proposed, and if we resurrect the PHB3 patch we can
reimplement it in the way it actually needs.
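
To make option (2) a bit more concrete, here is a plain-C sketch of a base "class" that carries only the hooks and no irq state, with a subtype keeping state in its own representation. All names are hypothetical, it does not use QOM, and the hook bodies are placeholder behaviour chosen only so the example does something observable.

```c
#include <stdint.h>

typedef struct IcsAbstract IcsAbstract;

/* Hooks only: the base type holds no per-irq state at all. */
typedef struct IcsAbstractOps {
    void (*reject)(IcsAbstract *ics, uint32_t srcno);
    void (*resend)(IcsAbstract *ics);
    void (*eoi)(IcsAbstract *ics, uint32_t srcno);
} IcsAbstractOps;

struct IcsAbstract {
    const IcsAbstractOps *ops;
};

/* A subtype with its own compact state, one bit per source (max 64). */
typedef struct BitmapIcs {
    IcsAbstract parent;    /* first member, so the casts below are valid */
    uint64_t rejected;
    unsigned resends;
} BitmapIcs;

static void bitmap_reject(IcsAbstract *ics, uint32_t srcno)
{
    BitmapIcs *b = (BitmapIcs *)ics;
    if (srcno < 64) {
        b->rejected |= 1ULL << srcno;
    }
}

static void bitmap_resend(IcsAbstract *ics)
{
    BitmapIcs *b = (BitmapIcs *)ics;
    b->resends++;
    b->rejected = 0;       /* placeholder: treat everything as resent */
}

static void bitmap_eoi(IcsAbstract *ics, uint32_t srcno)
{
    BitmapIcs *b = (BitmapIcs *)ics;
    if (srcno < 64) {
        b->rejected &= ~(1ULL << srcno);
    }
}

static const IcsAbstractOps bitmap_ops = {
    bitmap_reject, bitmap_resend, bitmap_eoi,
};

static void bitmap_ics_init(BitmapIcs *b)
{
    b->parent.ops = &bitmap_ops;
    b->rejected = 0;
    b->resends = 0;
}
```

Callers only ever see IcsAbstract and its hooks, so a PHB3-style device could keep its RBA registers as the single source of truth without mirroring them into ICSState.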



> 
> C.
> 
> >From b254a0279f5cf13a5db2db0bb209fe806af59ad0 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= 
> Date: Wed, 18 Sep 2019 18:02:03 +0200
> Subject: [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> This is a model of the PCIe Host Bridge (PHB3) found on a Power8
> processor. It includes the PowerBus logic interface (PBCQ), IOMMU
> support, a single PCIe Gen.3 Root Complex, and support for MSI and LSI
> interrupt sources as found on a Power8 system using the XICS interrupt
> controller.
> 
> The Power8 processor comes in different flavors: Venice, Murano,
> Naples, each having a different number of PHBs. To make things simpler,
> the PHB3 model provides 3 per chip. Some platforms, like the
> Firestone, can also couple PHBs on the first chip to provide more
> bandwidth but this is too specific to model in QEMU.
> 
> No default device layout is provided and PCI devices can be added on
> any of the available PCIe Root Port (pcie.0 .. 2 of a Power8 chip)
> with address 0x0 as the firmware (skiboot) only accepts a single
> device per root port. To run a simple system with a network and a
> storage adapter, use command line options such as:
> 
>   -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0
>   -netdev bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=hostnet0
> 
>   -device megasas,id=scsi0,bus=pcie.1,addr=0x0
>   -drive file=$disk,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none
>   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2
> 
> If more are needed, include a bridge.
> 
> Multi chip is supported, each chip adding its set of PHB3 controllers
> and its PCI busses. The model doesn't emulate the EEH error handling
> (and may never do).
> 
> Originally from Benjamin Herrenschmidt. I am taking ownership since I
> have been maintaining the model since QEMU 2.6.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  include/hw/pci-host/pnv_phb3.h  |  173 
>  include/hw/pci-host/pnv_phb3_regs.h |  454 ++
>  include/hw/ppc/pnv.h|7 +
>  

Re: [PATCH] spapr/irq: Fix migration of older machine types with XIVE

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 06:07:40PM +0200, Greg Kurz wrote:
> Recent patch "spapr/irq: Only claim VALID interrupts at the KVM level"
> broke migration of older machine types started with ic-mode=xive:
> 
> qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: Group 3 attr 0x1300: Invalid argument
> qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'
> qemu-system-ppc64: load of migration failed: Operation not permitted
> 
> This is because we should set the interrupt source in KVM at post load,
> since we no longer do it unconditionally at reset time for all interrupts.
> 
> Signed-off-by: Greg Kurz 
> ---
> 
> David,
> 
> I guess you should probably fold this fix directly into Cedric's
> patch (currently SHA1 966d526cdfd9 in ppc-for-4.2) to avoid
> bisection breakage.

Done.

> ---
>  hw/intc/spapr_xive_kvm.c |   11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 71b88d7797bc..2006f96aece1 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -678,6 +678,17 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
>  continue;
>  }
>  
> +/*
> + * We can only restore the source config if the source has been
> + * previously set in KVM. Since we don't do that for all interrupts
> + * at reset time anymore, let's do it now.
> + */
> +kvmppc_xive_source_reset_one(&xive->source, i, &local_err);
> +if (local_err) {
> +error_report_err(local_err);
> +return -1;
> +}
> +
>  kvmppc_xive_set_source_config(xive, i, &xive->eat[i], &local_err);
>  if (local_err) {
>  error_report_err(local_err);
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH v5 2/2] target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro

2019-09-25 Thread Tao Xu
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.

Acked-by: Stefano Garzarella 
Signed-off-by: Tao Xu 
---
 target/i386/cpu.c   | 8 
 target/i386/cpu.h   | 5 ++---
 target/i386/hvf/x86_cpuid.c | 2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9e0bac31e8..71034aeb5a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2412,8 +2412,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
 CPUID_7_0_EBX_SMAP,
 .features[FEAT_7_0_ECX] =
-CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
-CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
+CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
+CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
 CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
 CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
 CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
@@ -2470,8 +2470,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
 CPUID_7_0_EBX_AVX512VL | CPUID_7_0_EBX_CLFLUSHOPT,
 .features[FEAT_7_0_ECX] =
-CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
-CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
+CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
+CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
 CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
 CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
 CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fa4c4cad79..8e090acd74 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -695,8 +695,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_AVX512VL  (1U << 31)
 
 /* AVX-512 Vector Byte Manipulation Instruction */
-#define CPUID_7_0_ECX_AVX512BMI (1U << 1)
-#define CPUID_7_0_ECX_VBMI  (1U << 1)
+#define CPUID_7_0_ECX_AVX512_VBMI   (1U << 1)
 /* User-Mode Instruction Prevention */
 #define CPUID_7_0_ECX_UMIP  (1U << 2)
 /* Protection Keys for User-mode Pages */
@@ -704,7 +703,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 /* OS Enable Protection Keys */
 #define CPUID_7_0_ECX_OSPKE (1U << 4)
 /* Additional AVX-512 Vector Byte Manipulation Instruction */
-#define CPUID_7_0_ECX_VBMI2 (1U << 6)
+#define CPUID_7_0_ECX_AVX512_VBMI2  (1U << 6)
 /* Galois Field New Instructions */
 #define CPUID_7_0_ECX_GFNI  (1U << 8)
 /* Vector AES Instructions */
diff --git a/target/i386/hvf/x86_cpuid.c b/target/i386/hvf/x86_cpuid.c
index 4d957fe896..16762b6eb4 100644
--- a/target/i386/hvf/x86_cpuid.c
+++ b/target/i386/hvf/x86_cpuid.c
@@ -89,7 +89,7 @@ uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
 ebx &= ~CPUID_7_0_EBX_INVPCID;
 }
 
-ecx &= CPUID_7_0_ECX_AVX512BMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
+ecx &= CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
 edx &= CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS;
 } else {
 ebx = 0;
-- 
2.20.1




[PATCH v5 0/2] target/i386: cpu.h macros clean up

2019-09-25 Thread Tao Xu
Add some comments and rewrap comments that exceed 80 characters per
line. Also remove the stray line break and extra spaces in the comment
for CPUID_8000_0008_EBX_WBNOINVD.

Drop the duplicated definition of the CPUID AVX512_VBMI macro and rename
it to CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2
to CPUID_7_0_ECX_AVX512_VBMI2.
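
To illustrate why two names for the same bit are redundant, here is a minimal standalone sketch of how these masks are consumed. The three bit positions are copied from the patch; the helpers filter_ecx() and has_feature() are hypothetical names invented for this illustration and are not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as they appear in the patch (CPUID leaf 7, sub-leaf 0, ECX). */
#define CPUID_7_0_ECX_AVX512_VBMI       (1U << 1)
#define CPUID_7_0_ECX_AVX512_VBMI2      (1U << 6)
#define CPUID_7_0_ECX_AVX512_VPOPCNTDQ  (1U << 14)

/* Hypothetical helper: keep only the feature bits listed in 'allowed'. */
static uint32_t filter_ecx(uint32_t ecx, uint32_t allowed)
{
    return ecx & allowed;
}

/* Hypothetical helper: test one feature bit in an ECX feature word. */
static int has_feature(uint32_t ecx, uint32_t bit)
{
    /* The caller must pass a mask with exactly one bit set. */
    assert(bit != 0 && (bit & (bit - 1)) == 0);
    return (ecx & bit) != 0;
}
```

Because a macro only names a bit position, a second macro with the same value selects exactly the same bit, so keeping a single name per bit loses nothing.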

Changelog:
v5:
- correct commit messages. (Suggested by Stefano Garzarella)
v4:
- rename CPUID_7_0_ECX_VBMI2 as CPUID_7_0_ECX_AVX512_VBMI2.
  (Suggested by Stefano Garzarella)
v3:
- split the patch into 2 patches. (Suggested by Stefano Garzarella
  and Eduardo Habkost)
v2:
- correct the comments over 80 chars per line. (Suggested by
  Philippe Mathieu-Daudé)

Tao Xu (2):
  target/i386: clean up comments over 80 chars per line
  target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro

 target/i386/cpu.c   |   8 +-
 target/i386/cpu.h   | 163 +++-
 target/i386/hvf/x86_cpuid.c |   2 +-
 3 files changed, 111 insertions(+), 62 deletions(-)

-- 
2.20.1




[PATCH v5 1/2] target/i386: clean up comments over 80 chars per line

2019-09-25 Thread Tao Xu
Add some comments and rewrap comments that exceed 80 characters per
line. Also remove the stray line break and extra spaces in the comment
for CPUID_8000_0008_EBX_WBNOINVD.

Acked-by: Stefano Garzarella 
Signed-off-by: Tao Xu 
---
 target/i386/cpu.h | 164 ++
 1 file changed, 107 insertions(+), 57 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5f6e3a029a..fa4c4cad79 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -641,63 +641,113 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_SVM_PAUSEFILTER  (1U << 10)
 #define CPUID_SVM_PFTHRESHOLD  (1U << 12)
 
-#define CPUID_7_0_EBX_FSGSBASE (1U << 0)
-#define CPUID_7_0_EBX_BMI1 (1U << 3)
-#define CPUID_7_0_EBX_HLE  (1U << 4)
-#define CPUID_7_0_EBX_AVX2 (1U << 5)
-#define CPUID_7_0_EBX_SMEP (1U << 7)
-#define CPUID_7_0_EBX_BMI2 (1U << 8)
-#define CPUID_7_0_EBX_ERMS (1U << 9)
-#define CPUID_7_0_EBX_INVPCID  (1U << 10)
-#define CPUID_7_0_EBX_RTM  (1U << 11)
-#define CPUID_7_0_EBX_MPX  (1U << 14)
-#define CPUID_7_0_EBX_AVX512F  (1U << 16) /* AVX-512 Foundation */
-#define CPUID_7_0_EBX_AVX512DQ (1U << 17) /* AVX-512 Doubleword & Quadword Instrs */
-#define CPUID_7_0_EBX_RDSEED   (1U << 18)
-#define CPUID_7_0_EBX_ADX  (1U << 19)
-#define CPUID_7_0_EBX_SMAP (1U << 20)
-#define CPUID_7_0_EBX_AVX512IFMA (1U << 21) /* AVX-512 Integer Fused Multiply Add */
-#define CPUID_7_0_EBX_PCOMMIT  (1U << 22) /* Persistent Commit */
-#define CPUID_7_0_EBX_CLFLUSHOPT (1U << 23) /* Flush a Cache Line Optimized */
-#define CPUID_7_0_EBX_CLWB (1U << 24) /* Cache Line Write Back */
-#define CPUID_7_0_EBX_INTEL_PT (1U << 25) /* Intel Processor Trace */
-#define CPUID_7_0_EBX_AVX512PF (1U << 26) /* AVX-512 Prefetch */
-#define CPUID_7_0_EBX_AVX512ER (1U << 27) /* AVX-512 Exponential and Reciprocal */
-#define CPUID_7_0_EBX_AVX512CD (1U << 28) /* AVX-512 Conflict Detection */
-#define CPUID_7_0_EBX_SHA_NI   (1U << 29) /* SHA1/SHA256 Instruction Extensions */
-#define CPUID_7_0_EBX_AVX512BW (1U << 30) /* AVX-512 Byte and Word Instructions */
-#define CPUID_7_0_EBX_AVX512VL (1U << 31) /* AVX-512 Vector Length Extensions */
-
-#define CPUID_7_0_ECX_AVX512BMI (1U << 1)
-#define CPUID_7_0_ECX_VBMI (1U << 1)  /* AVX-512 Vector Byte Manipulation Instrs */
-#define CPUID_7_0_ECX_UMIP (1U << 2)
-#define CPUID_7_0_ECX_PKU  (1U << 3)
-#define CPUID_7_0_ECX_OSPKE (1U << 4)
-#define CPUID_7_0_ECX_VBMI2 (1U << 6) /* Additional VBMI Instrs */
-#define CPUID_7_0_ECX_GFNI (1U << 8)
-#define CPUID_7_0_ECX_VAES (1U << 9)
-#define CPUID_7_0_ECX_VPCLMULQDQ (1U << 10)
-#define CPUID_7_0_ECX_AVX512VNNI (1U << 11)
-#define CPUID_7_0_ECX_AVX512BITALG (1U << 12)
-#define CPUID_7_0_ECX_AVX512_VPOPCNTDQ (1U << 14) /* POPCNT for vectors of DW/QW */
-#define CPUID_7_0_ECX_LA57 (1U << 16)
-#define CPUID_7_0_ECX_RDPID (1U << 22)
-#define CPUID_7_0_ECX_CLDEMOTE (1U << 25)  /* CLDEMOTE Instruction */
-#define CPUID_7_0_ECX_MOVDIRI  (1U << 27)  /* MOVDIRI Instruction */
-#define CPUID_7_0_ECX_MOVDIR64B (1U << 28) /* MOVDIR64B Instruction */
-
-#define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2) /* AVX512 Neural Network Instructions */
-#define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3) /* AVX512 Multiply Accumulation Single Precision */
-#define CPUID_7_0_EDX_SPEC_CTRL (1U << 26) /* Speculation Control */
-#define CPUID_7_0_EDX_ARCH_CAPABILITIES (1U << 29)  /*Arch Capabilities*/
-#define CPUID_7_0_EDX_CORE_CAPABILITY   (1U << 30)  /*Core Capability*/
-#define CPUID_7_0_EDX_SPEC_CTRL_SSBD  (1U << 31) /* Speculative Store Bypass Disable */
-
-#define CPUID_7_1_EAX_AVX512_BF16 (1U << 5) /* AVX512 BFloat16 Instruction */
-
-#define CPUID_8000_0008_EBX_WBNOINVD  (1U << 9)  /* Write back and
- do not invalidate cache */
-#define CPUID_8000_0008_EBX_IBPB (1U << 12) /* Indirect Branch Prediction Barrier */
+/* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
+#define CPUID_7_0_EBX_FSGSBASE  (1U << 0)
+/* 1st Group of Advanced Bit Manipulation Extensions */
+#define CPUID_7_0_EBX_BMI1  (1U << 3)
+/* Hardware Lock Elision */
+#define CPUID_7_0_EBX_HLE   (1U << 4)
+/* Intel Advanced Vector Extensions 2 */
+#define CPUID_7_0_EBX_AVX2  (1U << 5)
+/* Supervisor-mode Execution Prevention */
+#define CPUID_7_0_EBX_SMEP  (1U << 7)
+/* 2nd Group of Advanced Bit Manipulation Extensions */
+#define CPUID_7_0_EBX_BMI2  (1U << 8)
+/* Enhanced REP MOVSB/STOSB */
+#define CPUID_7_0_EBX_ERMS  (1U << 9)
+/* Invalidate Process-Context Identifier */
+#define CPUID_7_0_EBX_INVPCID   (1U << 10)
+/* Restricted Transactional Memory */
+#define CPUID_7_0_EBX_RTM   (1U << 11)
+/* Memory Protection Extension */
+#define CPUID_7_0_EBX_MPX   (1U << 14)
+/* AVX-512 Foundation */
+#define CPUID_7_0_EBX_AVX512F  

Re: [PATCH 05/20] xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classes

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 10:31:12AM +0200, Greg Kurz wrote:
> On Wed, 25 Sep 2019 10:16:10 +0200
> Greg Kurz  wrote:
> 
> > On Wed, 25 Sep 2019 16:45:19 +1000
> > David Gibson  wrote:
> > 
> > > TYPE_ICS_SIMPLE is the only subtype of TYPE_ICS_BASE that's ever
> > > instantiated, and the only one we're ever likely to want.  The
>   ^^
> This seems to be kind of contradicted by the next patch BTW. Maybe just
> drop that to avoid confusion ? The rest of the changelog explains why we
> should merge the classes well enough IMHO.

Yeah, good point.

> 
> > > existence of different classes is just a hang over from when we
> > > (misguidedly) had separate subtypes for the KVM and non-KVM version of
> > > the device.
> > > 
> > > So, collapse the two classes together into just TYPE_ICS.
> > > 
> > > Signed-off-by: David Gibson 
> > > ---
> > 
> > So this also kills the realize hook, unlike in your previous series
> > where this was done along with the reset hook change. Makes sense
> > when merging parent/child class as well.
> > 
> > Reviewed-by: Greg Kurz 
> > 
> > >  hw/intc/xics.c| 86 ++-
> > >  hw/ppc/pnv_psi.c  |  2 +-
> > >  hw/ppc/spapr_irq.c|  4 +-
> > >  include/hw/ppc/xics.h | 16 +++-
> > >  4 files changed, 36 insertions(+), 72 deletions(-)
> > > 
> > > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > > index 82e6f09259..dfe7dbd254 100644
> > > --- a/hw/intc/xics.c
> > > +++ b/hw/intc/xics.c
> > > @@ -555,7 +555,7 @@ static void ics_reset_irq(ICSIRQState *irq)
> > >  
> > >  static void ics_reset(DeviceState *dev)
> > >  {
> > > -ICSState *ics = ICS_BASE(dev);
> > > +ICSState *ics = ICS(dev);
> > >  int i;
> > >  uint8_t flags[ics->nr_irqs];
> > >  
> > > @@ -573,7 +573,7 @@ static void ics_reset(DeviceState *dev)
> > >  if (kvm_irqchip_in_kernel()) {
> > >  Error *local_err = NULL;
> > >  
> > > -ics_set_kvm_state(ICS_BASE(dev), &local_err);
> > > +ics_set_kvm_state(ICS(dev), &local_err);
> > >  if (local_err) {
> > >  error_report_err(local_err);
> > >  }
> > > @@ -585,47 +585,15 @@ static void ics_reset_handler(void *dev)
> > >  ics_reset(dev);
> > >  }
> > >  
> > > -static void ics_simple_realize(DeviceState *dev, Error **errp)
> > > +static void ics_realize(DeviceState *dev, Error **errp)
> > >  {
> > > -ICSState *ics = ICS_SIMPLE(dev);
> > > -ICSStateClass *icsc = ICS_BASE_GET_CLASS(ics);
> > > +ICSState *ics = ICS(dev);
> > >  Error *local_err = NULL;
> > > -
> > > -icsc->parent_realize(dev, &local_err);
> > > -if (local_err) {
> > > -error_propagate(errp, local_err);
> > > -return;
> > > -}
> > > -
> > > -qemu_register_reset(ics_reset_handler, ics);
> > > -}
> > > -
> > > -static void ics_simple_class_init(ObjectClass *klass, void *data)
> > > -{
> > > -DeviceClass *dc = DEVICE_CLASS(klass);
> > > -ICSStateClass *isc = ICS_BASE_CLASS(klass);
> > > -
> > > -device_class_set_parent_realize(dc, ics_simple_realize,
> > > -&isc->parent_realize);
> > > -}
> > > -
> > > -static const TypeInfo ics_simple_info = {
> > > -.name = TYPE_ICS_SIMPLE,
> > > -.parent = TYPE_ICS_BASE,
> > > -.instance_size = sizeof(ICSState),
> > > -.class_init = ics_simple_class_init,
> > > -.class_size = sizeof(ICSStateClass),
> > > -};
> > > -
> > > -static void ics_base_realize(DeviceState *dev, Error **errp)
> > > -{
> > > -ICSState *ics = ICS_BASE(dev);
> > >  Object *obj;
> > > -Error *err = NULL;
> > >  
> > > -obj = object_property_get_link(OBJECT(dev), ICS_PROP_XICS, &err);
> > > +obj = object_property_get_link(OBJECT(dev), ICS_PROP_XICS, &local_err);
> > >  if (!obj) {
> > > -error_propagate_prepend(errp, err,
> > > +error_propagate_prepend(errp, local_err,
> > >  "required link '" ICS_PROP_XICS
> > >  "' not found: ");
> > >  return;
> > > @@ -637,16 +605,18 @@ static void ics_base_realize(DeviceState *dev, 
> > > Error **errp)
> > >  return;
> > >  }
> > >  ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> > > +
> > > +qemu_register_reset(ics_reset_handler, ics);
> > >  }
> > >  
> > > -static void ics_base_instance_init(Object *obj)
> > > +static void ics_instance_init(Object *obj)
> > >  {
> > > -ICSState *ics = ICS_BASE(obj);
> > > +ICSState *ics = ICS(obj);
> > >  
> > >  ics->offset = XICS_IRQ_BASE;
> > >  }
> > >  
> > > -static int ics_base_pre_save(void *opaque)
> > > +static int ics_pre_save(void *opaque)
> > >  {
> > >  ICSState *ics = opaque;
> > >  
> > > @@ -657,7 +627,7 @@ static int ics_base_pre_save(void *opaque)
> > >  return 0;
> > >  }
> > >  
> > > -static int ics_base_post_load(void *opaque, int version_id)
> > > +static int ics_post_load(void *opaque, int 

Re: [PATCH 20/20] spapr: Eliminate SpaprIrq::init hook

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 09:31:54AM +0200, Cédric Le Goater wrote:
> On 25/09/2019 08:45, David Gibson wrote:
> > This method is used to set up the interrupt backends for the current
> > configuration.  However, this means some confusing redirection between
> > the "dual" mode init and the init hooks for xics only and xive only modes.
> > 
> > Since we now have simple flags indicating whether XICS and/or XIVE are
> > supported, it's easier to just open code each initialization directly in
> > spapr_irq_init().  This will also make some future cleanups simpler.
> > 
> > Signed-off-by: David Gibson 
> 
> Reviewed-by: Cédric Le Goater 
> 
> one comment below,
> 
> > ---
> >  hw/ppc/spapr_irq.c  | 138 
> >  include/hw/ppc/spapr_irq.h  |   1 -
> >  include/hw/ppc/xics_spapr.h |   1 +
> >  3 files changed, 63 insertions(+), 77 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 073f375ba2..62647dd5a3 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -91,27 +91,6 @@ static void spapr_irq_init_kvm(SpaprMachineState *spapr,
> >  /*
> >   * XICS IRQ backend.
> >   */
> > -
> > -static void spapr_irq_init_xics(SpaprMachineState *spapr, Error **errp)
> > -{
> > -Object *obj;
> > -Error *local_err = NULL;
> > -
> > -obj = object_new(TYPE_ICS_SPAPR);
> > -object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> > -object_property_add_const_link(obj, ICS_PROP_XICS, OBJECT(spapr),
> > -   &error_fatal);
> > -object_property_set_int(obj, spapr->irq->nr_xirqs,
> > -"nr-irqs",  &error_fatal);
> > -object_property_set_bool(obj, true, "realized", &local_err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -return;
> > -}
> > -
> > -spapr->ics = ICS_SPAPR(obj);
> > -}
> > -
> >  static void spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool 
> > lsi,
> >   Error **errp)
> >  {
> > @@ -212,7 +191,6 @@ SpaprIrq spapr_irq_xics = {
> >  .xics= true,
> >  .xive= false,
> >  
> > -.init= spapr_irq_init_xics,
> >  .claim   = spapr_irq_claim_xics,
> >  .free= spapr_irq_free_xics,
> >  .print_info  = spapr_irq_print_info_xics,
> > @@ -227,37 +205,6 @@ SpaprIrq spapr_irq_xics = {
> >  /*
> >   * XIVE IRQ backend.
> >   */
> > -static void spapr_irq_init_xive(SpaprMachineState *spapr, Error **errp)
> > -{
> > -uint32_t nr_servers = spapr_max_server_number(spapr);
> > -DeviceState *dev;
> > -int i;
> > -
> > -dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
> > -qdev_prop_set_uint32(dev, "nr-irqs",
> > - spapr->irq->nr_xirqs + SPAPR_XIRQ_BASE);
> > -/*
> > - * 8 XIVE END structures per CPU. One for each available priority
> > - */
> > -qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > -qdev_init_nofail(dev);
> > -
> > -spapr->xive = SPAPR_XIVE(dev);
> > -
> > -/* Enable the CPU IPIs */
> > -for (i = 0; i < nr_servers; ++i) {
> > -Error *local_err = NULL;
> > -
> > -spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false, &local_err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -return;
> > -}
> > -}
> > -
> > -spapr_xive_hcall_init(spapr);
> > -}
> > -
> >  static void spapr_irq_claim_xive(SpaprMachineState *spapr, int irq, bool 
> > lsi,
> >   Error **errp)
> >  {
> > @@ -361,7 +308,6 @@ SpaprIrq spapr_irq_xive = {
> >  .xics= false,
> >  .xive= true,
> >  
> > -.init= spapr_irq_init_xive,
> >  .claim   = spapr_irq_claim_xive,
> >  .free= spapr_irq_free_xive,
> >  .print_info  = spapr_irq_print_info_xive,
> > @@ -392,23 +338,6 @@ static SpaprIrq *spapr_irq_current(SpaprMachineState 
> > *spapr)
> >  &spapr_irq_xive : &spapr_irq_xics;
> >  }
> >  
> > -static void spapr_irq_init_dual(SpaprMachineState *spapr, Error **errp)
> > -{
> > -Error *local_err = NULL;
> > -
> > -spapr_irq_xics.init(spapr, &local_err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -return;
> > -}
> > -
> > -spapr_irq_xive.init(spapr, &local_err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -return;
> > -}
> > -}
> > -
> >  static void spapr_irq_claim_dual(SpaprMachineState *spapr, int irq, bool 
> > lsi,
> >   Error **errp)
> >  {
> > @@ -520,7 +449,6 @@ SpaprIrq spapr_irq_dual = {
> >  .xics= true,
> >  .xive= true,
> >  
> > -.init= spapr_irq_init_dual,
> >  .claim   = spapr_irq_claim_dual,
> >  .free= spapr_irq_free_dual,
> >  .print_info  = spapr_irq_print_info_dual,
> > @@ -608,8 +536,7 @@ void 

Re: [PATCH 06/20] xics: Create sPAPR specific ICS subtype

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 10:55:35AM +0200, Cédric Le Goater wrote:
> On 25/09/2019 10:40, Greg Kurz wrote:
> > On Wed, 25 Sep 2019 16:45:20 +1000
> > David Gibson  wrote:
> > 
> >> We create a subtype of TYPE_ICS specifically for sPAPR.  For now all this
> >> does is move the setup of the PAPR specific hcalls and RTAS calls to
> >> the realize() function for this, rather than requiring the PAPR code to
> >> explicitly call xics_spapr_init().  In future it will gain more
> >> functionality.
> >>
> >> Signed-off-by: David Gibson 
> >> ---
> > 
> > LGTM, but for extra safety I would also introduce a SpaprIcsState typedef
> 
> why ? we have ICS_SPAPR() for the checks already.

Eh.. using typedefs when we haven't actually extended a base type
isn't common QOM practice.  Yes, it's not as typesafe as it could be,
but I'm not really inclined to go to the extra effort here.

> 
> > and use it everywhere where we only expect this subtype. Especially in
> > the definition of SpaprMachineState.
> > 
> >>  hw/intc/xics_spapr.c| 34 +-
> >>  hw/ppc/spapr_irq.c  |  6 ++
> >>  include/hw/ppc/xics_spapr.h |  4 +++-
> >>  3 files changed, 38 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> >> index 3e9444813a..e6dd004587 100644
> >> --- a/hw/intc/xics_spapr.c
> >> +++ b/hw/intc/xics_spapr.c
> >> @@ -283,8 +283,18 @@ static void rtas_int_on(PowerPCCPU *cpu, 
> >> SpaprMachineState *spapr,
> >>  rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> >>  }
> >>  
> >> -void xics_spapr_init(SpaprMachineState *spapr)
> >> +static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >>  {
> >> +ICSState *ics = ICS_SPAPR(dev);
> >> +ICSStateClass *icsc = ICS_GET_CLASS(ics);
> >> +Error *local_err = NULL;
> >> +
> >> +icsc->parent_realize(dev, &local_err);
> >> +if (local_err) {
> >> +error_propagate(errp, local_err);
> >> +return;
> >> +}
> >> +
> >>  spapr_rtas_register(RTAS_IBM_SET_XIVE, "ibm,set-xive", rtas_set_xive);
> >>  spapr_rtas_register(RTAS_IBM_GET_XIVE, "ibm,get-xive", rtas_get_xive);
> >>  spapr_rtas_register(RTAS_IBM_INT_OFF, "ibm,int-off", rtas_int_off);
> >> @@ -319,3 +329,25 @@ void spapr_dt_xics(SpaprMachineState *spapr, uint32_t 
> >> nr_servers, void *fdt,
> >>  _FDT(fdt_setprop_cell(fdt, node, "linux,phandle", phandle));
> >>  _FDT(fdt_setprop_cell(fdt, node, "phandle", phandle));
> >>  }
> >> +
> >> +static void ics_spapr_class_init(ObjectClass *klass, void *data)
> >> +{
> >> +DeviceClass *dc = DEVICE_CLASS(klass);
> >> +ICSStateClass *isc = ICS_CLASS(klass);
> >> +
> >> +device_class_set_parent_realize(dc, ics_spapr_realize,
> >> +&isc->parent_realize);
> >> +}
> >> +
> >> +static const TypeInfo ics_spapr_info = {
> >> +.name = TYPE_ICS_SPAPR,
> >> +.parent = TYPE_ICS,
> >> +.class_init = ics_spapr_class_init,
> >> +};
> >> +
> >> +static void xics_spapr_register_types(void)
> >> +{
> >> +type_register_static(&ics_spapr_info);
> >> +}
> >> +
> >> +type_init(xics_spapr_register_types)
> >> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >> index 6c45d2a3c0..8c26fa2d1e 100644
> >> --- a/hw/ppc/spapr_irq.c
> >> +++ b/hw/ppc/spapr_irq.c
> >> @@ -98,7 +98,7 @@ static void spapr_irq_init_xics(SpaprMachineState 
> >> *spapr, int nr_irqs,
> >>  Object *obj;
> >>  Error *local_err = NULL;
> >>  
> >> -obj = object_new(TYPE_ICS);
> >> +obj = object_new(TYPE_ICS_SPAPR);
> >>  object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> >>  object_property_add_const_link(obj, ICS_PROP_XICS, OBJECT(spapr),
> >> &error_fatal);
> >> @@ -109,9 +109,7 @@ static void spapr_irq_init_xics(SpaprMachineState 
> >> *spapr, int nr_irqs,
> >>  return;
> >>  }
> >>  
> >> -spapr->ics = ICS(obj);
> >> -
> >> -xics_spapr_init(spapr);
> >> +spapr->ics = ICS_SPAPR(obj);
> >>  }
> >>  
> >>  static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool 
> >> lsi,
> >> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> >> index 5dabc9a138..691a6d00f7 100644
> >> --- a/include/hw/ppc/xics_spapr.h
> >> +++ b/include/hw/ppc/xics_spapr.h
> >> @@ -31,11 +31,13 @@
> >>  
> >>  #define XICS_NODENAME "interrupt-controller"
> >>  
> >> +#define TYPE_ICS_SPAPR "ics-spapr"
> >> +#define ICS_SPAPR(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS_SPAPR)
> >> +
> >>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void 
> >> *fdt,
> >> uint32_t phandle);
> >>  int xics_kvm_connect(SpaprMachineState *spapr, Error **errp);
> >>  void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp);
> >>  bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
> >> -void xics_spapr_init(SpaprMachineState *spapr);
> >>  
> >>  #endif /* XICS_SPAPR_H */
> > 
> 

-- 
David Gibson| I'll 

Re: [PATCH 09/20] spapr: Clarify and fix handling of nr_irqs

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 09:05:34AM +0200, Cédric Le Goater wrote:
> On 25/09/2019 08:45, David Gibson wrote:
> > Both the XICS and XIVE interrupt backends have a "nr-irqs" property, but
> > it means slightly different things.  For XICS (or, strictly, the ICS) it
> > indicates the number of "real" external IRQs.  Those start at XICS_IRQ_BASE
> > (0x1000) and don't include the special IPI vector.  For XIVE, however, it
> > includes the whole IRQ space, including XIVE's many IPI vectors.
> > 
> > The spapr code currently doesn't handle this sensibly, with the nr_irqs
> > value in SpaprIrq having different meanings depending on the backend.
> > We fix this by renaming nr_irqs to nr_xirqs and making it always indicate
> > just the number of external irqs, adjusting the value we pass to XIVE
> > accordingly.  We also move to using common constants in most of the
> > irq configurations, to make it clearer that the IRQ space looks the same
> > to the guest (and emulated devices), even if the backend is different.
> > 
> > Signed-off-by: David Gibson 
> 
> Reviewed-by: Cédric Le Goater 
> 
> one comment below,
> 
> > ---
> >  hw/ppc/spapr_irq.c | 48 +++---
> >  include/hw/ppc/spapr_irq.h | 19 +--
> >  2 files changed, 31 insertions(+), 36 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 8c26fa2d1e..5190a33e08 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -92,7 +92,7 @@ static void spapr_irq_init_kvm(SpaprMachineState *spapr,
> >   * XICS IRQ backend.
> >   */
> >  
> > -static void spapr_irq_init_xics(SpaprMachineState *spapr, int nr_irqs,
> > +static void spapr_irq_init_xics(SpaprMachineState *spapr, int nr_xirqs,
> >  Error **errp)
> >  {
> >  Object *obj;
> > @@ -102,7 +102,7 @@ static void spapr_irq_init_xics(SpaprMachineState 
> > *spapr, int nr_irqs,
> >  object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> >  object_property_add_const_link(obj, ICS_PROP_XICS, OBJECT(spapr),
> > &error_fatal);
> > -object_property_set_int(obj, nr_irqs, "nr-irqs",  &error_fatal);
> > +object_property_set_int(obj, nr_xirqs, "nr-irqs",  &error_fatal);
> >  object_property_set_bool(obj, true, "realized", &local_err);
> >  if (local_err) {
> >  error_propagate(errp, local_err);
> > @@ -234,13 +234,9 @@ static void spapr_irq_init_kvm_xics(SpaprMachineState 
> > *spapr, Error **errp)
> >  }
> >  }
> >  
> > -#define SPAPR_IRQ_XICS_NR_IRQS 0x1000
> > -#define SPAPR_IRQ_XICS_NR_MSIS \
> > -(XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> > -
> >  SpaprIrq spapr_irq_xics = {
> > -.nr_irqs = SPAPR_IRQ_XICS_NR_IRQS,
> > -.nr_msis = SPAPR_IRQ_XICS_NR_MSIS,
> > +.nr_xirqs= SPAPR_NR_XIRQS,
> > +.nr_msis = SPAPR_NR_MSIS,
> >  .ov5 = SPAPR_OV5_XIVE_LEGACY,
> >  
> >  .init= spapr_irq_init_xics,
> > @@ -260,7 +256,7 @@ SpaprIrq spapr_irq_xics = {
> >  /*
> >   * XIVE IRQ backend.
> >   */
> > -static void spapr_irq_init_xive(SpaprMachineState *spapr, int nr_irqs,
> > +static void spapr_irq_init_xive(SpaprMachineState *spapr, int nr_xirqs,
> >  Error **errp)
> >  {
> >  uint32_t nr_servers = spapr_max_server_number(spapr);
> > @@ -268,7 +264,7 @@ static void spapr_irq_init_xive(SpaprMachineState 
> > *spapr, int nr_irqs,
> >  int i;
> >  
> >  dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
> > -qdev_prop_set_uint32(dev, "nr-irqs", nr_irqs);
> > +qdev_prop_set_uint32(dev, "nr-irqs", nr_xirqs + SPAPR_XIRQ_BASE);
> >  /*
> >   * 8 XIVE END structures per CPU. One for each available priority
> >   */
> > @@ -308,7 +304,7 @@ static qemu_irq spapr_qirq_xive(SpaprMachineState 
> > *spapr, int irq)
> >  {
> >  SpaprXive *xive = spapr->xive;
> >  
> > -if (irq >= xive->nr_irqs) {
> > +if ((irq < SPAPR_XIRQ_BASE) || (irq >= xive->nr_irqs)) {
> 
> So IPIs cannot be pulsed ? I think that is OK in QEMU.

They can be pulsed, they just can't be retrieved via the spapr_qirq()
interface.  Since that interface basically exists for the spapr root
devices (VIO and PHBs) to find the qemu_irqs to wire themselves up to,
I think that's fine.

If we discover some reason we need to grab IPI qirqs by global number
then we can revisit this.

I'll add a comment to clarify this in the later patch where I unify
the qirq implementations.
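
To make the numbering concrete, here is a small standalone sketch of the two "nr-irqs" conventions and of the range check added to spapr_qirq_xive(). XICS_IRQ_BASE (0x1000) comes from the commit message. The values used for SPAPR_XIRQ_BASE and SPAPR_NR_XIRQS are assumptions, since their definitions are not shown in this thread, and the helper functions are hypothetical names rather than QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XICS_IRQ_BASE    0x1000u        /* stated in the commit message */
#define SPAPR_XIRQ_BASE  XICS_IRQ_BASE  /* assumption: same base as XICS */
#define SPAPR_NR_XIRQS   0x1000u        /* assumption: old XICS NR_IRQS value */

/* The ICS counts only the external IRQs. */
static uint32_t ics_nr_irqs(uint32_t nr_xirqs)
{
    return nr_xirqs;
}

/* XIVE counts the whole IRQ space, including everything below the base. */
static uint32_t xive_nr_irqs(uint32_t nr_xirqs)
{
    return nr_xirqs + SPAPR_XIRQ_BASE;
}

/*
 * Mirrors the new check in the patch: only external IRQs may be looked up
 * by global number. IPIs sit below SPAPR_XIRQ_BASE and are rejected here,
 * although nothing stops them from being triggered by other means.
 */
static bool qirq_lookup_allowed(uint32_t irq, uint32_t xive_total)
{
    assert(xive_total >= SPAPR_XIRQ_BASE);
    return irq >= SPAPR_XIRQ_BASE && irq < xive_total;
}
```

Under these assumptions XIVE is configured with 0x2000 interrupts, which matches the removed SPAPR_IRQ_XIVE_NR_IRQS constant, so the guest-visible layout does not change.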

> XIVE unifies all the interrupts at the controller level. Any one can trigger 
> an interrupt with a store on the associate ESB page.

Absolutely, and nothing's stopping them.
> 
> >  return NULL;
> >  }
> >  
> > @@ -409,12 +405,9 @@ static void spapr_irq_init_kvm_xive(SpaprMachineState 
> > *spapr, Error **errp)
> >   * with XICS.
> >   */
> >  
> > -#define SPAPR_IRQ_XIVE_NR_IRQS 0x2000
> > -#define SPAPR_IRQ_XIVE_NR_MSIS (SPAPR_IRQ_XIVE_NR_IRQS - SPAPR_IRQ_MSI)

Re: [PATCH 05/20] xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classes

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 10:16:10AM +0200, Greg Kurz wrote:
> On Wed, 25 Sep 2019 16:45:19 +1000
> David Gibson  wrote:
> 
> > TYPE_ICS_SIMPLE is the only subtype of TYPE_ICS_BASE that's ever
> > instantiated, and the only one we're ever likely to want.  The
> > existence of different classes is just a hang over from when we
> > (misguidedly) had separate subtypes for the KVM and non-KVM version of
> > the device.
> > 
> > So, collapse the two classes together into just TYPE_ICS.
> > 
> > Signed-off-by: David Gibson 
> > ---
> 
> So this also kills the realize hook, unlike in your previous series
> where this was done along with the reset hook change. Makes sense
> when merging parent/child class as well.

Actually, it doesn't.  There's now only one actual implementation of
the realize handler, but I left the parent_realize hook in there for
future use.
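
For readers unfamiliar with the idiom, the following standalone sketch shows roughly what keeping a parent_realize hook enables. It is a simplified model, not QOM: the DemoDevice and DemoDeviceClass types and the demo_set_parent_realize() helper are hypothetical, and QEMU's real device_class_set_parent_realize() takes a different set of arguments.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoDevice DemoDevice;
typedef void (*DemoRealizeFn)(DemoDevice *dev);

typedef struct DemoDeviceClass {
    DemoRealizeFn realize;         /* handler called to realize a device */
    DemoRealizeFn parent_realize;  /* saved handler of the parent class */
} DemoDeviceClass;

struct DemoDevice {
    const DemoDeviceClass *klass;
    int base_ready;
    int sub_ready;
};

/* Save the current handler as the parent's, then install the child's. */
static void demo_set_parent_realize(DemoDeviceClass *dc, DemoRealizeFn child)
{
    dc->parent_realize = dc->realize;
    dc->realize = child;
}

/* The single "base" implementation, analogous to ics_realize(). */
static void demo_base_realize(DemoDevice *dev)
{
    dev->base_ready = 1;
}

/* A subtype handler chains to the parent before doing its own setup. */
static void demo_sub_realize(DemoDevice *dev)
{
    assert(dev->klass->parent_realize != NULL);
    dev->klass->parent_realize(dev);
    dev->sub_ready = 1;
}
```

With only one class the hook is unused, but a later subtype can install its own handler this way without touching the base implementation.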

> Reviewed-by: Greg Kurz 
> 
> >  hw/intc/xics.c| 86 ++-
> >  hw/ppc/pnv_psi.c  |  2 +-
> >  hw/ppc/spapr_irq.c|  4 +-
> >  include/hw/ppc/xics.h | 16 +++-
> >  4 files changed, 36 insertions(+), 72 deletions(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 82e6f09259..dfe7dbd254 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -555,7 +555,7 @@ static void ics_reset_irq(ICSIRQState *irq)
> >  
> >  static void ics_reset(DeviceState *dev)
> >  {
> > -ICSState *ics = ICS_BASE(dev);
> > +ICSState *ics = ICS(dev);
> >  int i;
> >  uint8_t flags[ics->nr_irqs];
> >  
> > @@ -573,7 +573,7 @@ static void ics_reset(DeviceState *dev)
> >  if (kvm_irqchip_in_kernel()) {
> >  Error *local_err = NULL;
> >  
> > -ics_set_kvm_state(ICS_BASE(dev), &local_err);
> > +ics_set_kvm_state(ICS(dev), &local_err);
> >  if (local_err) {
> >  error_report_err(local_err);
> >  }
> > @@ -585,47 +585,15 @@ static void ics_reset_handler(void *dev)
> >  ics_reset(dev);
> >  }
> >  
> > -static void ics_simple_realize(DeviceState *dev, Error **errp)
> > +static void ics_realize(DeviceState *dev, Error **errp)
> >  {
> > -ICSState *ics = ICS_SIMPLE(dev);
> > -ICSStateClass *icsc = ICS_BASE_GET_CLASS(ics);
> > +ICSState *ics = ICS(dev);
> >  Error *local_err = NULL;
> > -
> > -icsc->parent_realize(dev, &local_err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -return;
> > -}
> > -
> > -qemu_register_reset(ics_reset_handler, ics);
> > -}
> > -
> > -static void ics_simple_class_init(ObjectClass *klass, void *data)
> > -{
> > -DeviceClass *dc = DEVICE_CLASS(klass);
> > -ICSStateClass *isc = ICS_BASE_CLASS(klass);
> > -
> > -device_class_set_parent_realize(dc, ics_simple_realize,
> > -&isc->parent_realize);
> > -}
> > -
> > -static const TypeInfo ics_simple_info = {
> > -.name = TYPE_ICS_SIMPLE,
> > -.parent = TYPE_ICS_BASE,
> > -.instance_size = sizeof(ICSState),
> > -.class_init = ics_simple_class_init,
> > -.class_size = sizeof(ICSStateClass),
> > -};
> > -
> > -static void ics_base_realize(DeviceState *dev, Error **errp)
> > -{
> > -ICSState *ics = ICS_BASE(dev);
> >  Object *obj;
> > -Error *err = NULL;
> >  
> > -obj = object_property_get_link(OBJECT(dev), ICS_PROP_XICS, &err);
> > +obj = object_property_get_link(OBJECT(dev), ICS_PROP_XICS, &local_err);
> >  if (!obj) {
> > -error_propagate_prepend(errp, err,
> > +error_propagate_prepend(errp, local_err,
> >  "required link '" ICS_PROP_XICS
> >  "' not found: ");
> >  return;
> > @@ -637,16 +605,18 @@ static void ics_base_realize(DeviceState *dev, Error 
> > **errp)
> >  return;
> >  }
> >  ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> > +
> > +qemu_register_reset(ics_reset_handler, ics);
> >  }
> >  
> > -static void ics_base_instance_init(Object *obj)
> > +static void ics_instance_init(Object *obj)
> >  {
> > -ICSState *ics = ICS_BASE(obj);
> > +ICSState *ics = ICS(obj);
> >  
> >  ics->offset = XICS_IRQ_BASE;
> >  }
> >  
> > -static int ics_base_pre_save(void *opaque)
> > +static int ics_pre_save(void *opaque)
> >  {
> >  ICSState *ics = opaque;
> >  
> > @@ -657,7 +627,7 @@ static int ics_base_pre_save(void *opaque)
> >  return 0;
> >  }
> >  
> > -static int ics_base_post_load(void *opaque, int version_id)
> > +static int ics_post_load(void *opaque, int version_id)
> >  {
> >  ICSState *ics = opaque;
> >  
> > @@ -675,7 +645,7 @@ static int ics_base_post_load(void *opaque, int 
> > version_id)
> >  return 0;
> >  }
> >  
> > -static const VMStateDescription vmstate_ics_base_irq = {
> > +static const VMStateDescription vmstate_ics_irq = {
> >  .name = "ics/irq",
> >  .version_id = 2,
> >  .minimum_version_id = 1,
> > @@ -689,45 +659,44 @@ static const VMStateDescription 

Re: [PATCH 18/20] xive: Improve irq claim/free path

2019-09-25 Thread David Gibson
On Wed, Sep 25, 2019 at 09:25:47AM +0200, Cédric Le Goater wrote:
> On 25/09/2019 08:45, David Gibson wrote:
> > spapr_xive_irq_claim() returns a bool to indicate if it succeeded.  But
> > most of the callers and one callee use a richer Error * instead.  So use
> > that instead of a bool return so we can actually pass more informative
> > errors up the stack.
> > 
> > In addition it didn't actually check if the irq was already claimed, which
> > is one of the primary purposes of the claim path, so do that.
> > 
> > spapr_xive_irq_free() also returned a bool... which no callers checked, so
> > just drop it.
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/intc/spapr_xive.c| 17 ++---
> >  hw/ppc/spapr_irq.c  | 12 
> >  include/hw/ppc/spapr_xive.h |  5 +++--
> >  3 files changed, 21 insertions(+), 13 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 47b5ec0b56..5a56a58299 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -528,12 +528,18 @@ static void spapr_xive_register_types(void)
> >  
> >  type_init(spapr_xive_register_types)
> >  
> > -bool spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi)
> > +void spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi,
> > +  Error **errp)
> >  {
> >  XiveSource *xsrc = &xive->source;
> >  
> >  assert(lisn < xive->nr_irqs);
> >  
> > +if (be64_to_cpu(xive->eat[lisn].w) & EAS_VALID) {
> 
> please use xive_eas_is_valid()

Oops, missed that that existed.  Fixed.

> with that change,
> 
> Reviewed-by: Cédric Le Goater 

> 
> 
> C. 
> 
> > +error_setg(errp, "IRQ %d is not free", lisn);
> > +return;
> > +}
> > +
> >  /*
> >   * Set default values when allocating an IRQ number
> >   */
> > @@ -547,20 +553,17 @@ bool spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi)
> >  
> >  kvmppc_xive_source_reset_one(xsrc, lisn, &local_err);
> >  if (local_err) {
> > -error_report_err(local_err);
> > -return false;
> > +error_propagate(errp, local_err);
> > +return;
> >  }
> >  }
> > -
> > -return true;
> >  }
> >  
> > -bool spapr_xive_irq_free(SpaprXive *xive, uint32_t lisn)
> > +void spapr_xive_irq_free(SpaprXive *xive, uint32_t lisn)
> >  {
> >  assert(lisn < xive->nr_irqs);
> >  
> >  xive->eat[lisn].w &= cpu_to_be64(~EAS_VALID);
> > -return true;
> >  }
> >  
> >  /*
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 2673a90604..f53544e45e 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -245,7 +245,13 @@ static void spapr_irq_init_xive(SpaprMachineState *spapr, Error **errp)
> >  
> >  /* Enable the CPU IPIs */
> >  for (i = 0; i < nr_servers; ++i) {
> > -spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
> > +Error *local_err = NULL;
> > +
> > +spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false, &local_err);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> >  }
> >  
> >  spapr_xive_hcall_init(spapr);
> > @@ -254,9 +260,7 @@ static void spapr_irq_init_xive(SpaprMachineState *spapr, Error **errp)
> >  static void spapr_irq_claim_xive(SpaprMachineState *spapr, int irq, bool lsi,
> >   Error **errp)
> >  {
> > -if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
> > -error_setg(errp, "IRQ %d is invalid", irq);
> > -}
> > +spapr_xive_irq_claim(spapr->xive, irq, lsi, errp);
> >  }
> >  
> >  static void spapr_irq_free_xive(SpaprMachineState *spapr, int irq)
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index bfd40f01d8..69df3793e1 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -54,8 +54,9 @@ typedef struct SpaprXive {
> >   */
> >  #define SPAPR_XIVE_BLOCK_ID 0x0
> >  
> > -bool spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi);
> > -bool spapr_xive_irq_free(SpaprXive *xive, uint32_t lisn);
> > +void spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi,
> > +  Error **errp);
> > +void spapr_xive_irq_free(SpaprXive *xive, uint32_t lisn);
> >  void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon);
> >  int spapr_xive_post_load(SpaprXive *xive, int version_id);
> >  
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
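
A minimal sketch of the claim-path change discussed in this thread: failure
is reported through an error out-parameter instead of a bool return, and an
interrupt number that is already claimed is refused. The Err type,
err_set_not_free(), irq_claim(), irq_free() and the valid[] array are
stand-ins invented for illustration; the real code uses QEMU's Error **errp,
error_setg() and the XIVE EAT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NR_IRQS 8                 /* hypothetical table size */

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Err {
    bool set;
    char msg[64];
} Err;

static bool valid[NR_IRQS];       /* stand-in for the per-entry valid bit */

static void err_set_not_free(Err *err, unsigned irq)
{
    if (err) {                    /* callers may pass NULL to ignore errors */
        err->set = true;
        snprintf(err->msg, sizeof(err->msg), "IRQ %u is not free", irq);
    }
}

/* Claim an interrupt; a double claim is reported via err, not a bool. */
static void irq_claim(unsigned irq, Err *err)
{
    assert(irq < NR_IRQS);
    if (valid[irq]) {
        err_set_not_free(err, irq);
        return;
    }
    valid[irq] = true;
}

/* Freeing cannot fail here, so there is nothing to return. */
static void irq_free(unsigned irq)
{
    assert(irq < NR_IRQS);
    valid[irq] = false;
}
```

A caller that wants to stop on the first failure checks err.set after each
irq_claim() call, much as the patched spapr_irq_init_xive() checks local_err.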



Re: [PATCH v4 2/2] target/i386: drop the duplicated definition of cpuid AVX512_VBMI marco

2019-09-25 Thread Tao Xu

On 9/25/2019 4:42 PM, Stefano Garzarella wrote:

> Hi Tao,
>
> Typo in the commit title and message? s/marco/macro?
>
> On Tue, Sep 24, 2019 at 09:02:09AM +0800, Tao Xu wrote:
> > Drop the duplicated definition of cpuid AVX512_VBMI and marco and
>
> I'm not native speaker but I'd remove some 'and'  ^ this
>
> > rename it as CPUID_7_0_ECX_AVX512_VBMI. And rename CPUID_7_0_ECX_VBMI2
>
>    ^ this

Oh, my mistake, I will correct these. Thank you for reminding me.

> > as CPUID_7_0_ECX_AVX512_VBMI2.
> >
> > Signed-off-by: Tao Xu 
> > ---
> >   target/i386/cpu.c   | 8 
> >   target/i386/cpu.h   | 5 ++---
> >   target/i386/hvf/x86_cpuid.c | 2 +-
> >   3 files changed, 7 insertions(+), 8 deletions(-)
>
> The rest LGTM:
>
> Acked-by: Stefano Garzarella 
>
> Thanks,
> Stefano
>
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 9e0bac31e8..71034aeb5a 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -2412,8 +2412,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
> >  CPUID_7_0_EBX_SMAP,
> >  .features[FEAT_7_0_ECX] =
> > -CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> > -CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
> > +CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> > +CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
> >  CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
> >  CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
> >  CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
> > @@ -2470,8 +2470,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
> >  CPUID_7_0_EBX_AVX512VL | CPUID_7_0_EBX_CLFLUSHOPT,
> >  .features[FEAT_7_0_ECX] =
> > -CPUID_7_0_ECX_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> > -CPUID_7_0_ECX_VBMI2 | CPUID_7_0_ECX_GFNI |
> > +CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> > +CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
> >  CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
> >  CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
> >  CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57,
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index fa4c4cad79..8e090acd74 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -695,8 +695,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
> >  #define CPUID_7_0_EBX_AVX512VL  (1U << 31)
> >  
> >  /* AVX-512 Vector Byte Manipulation Instruction */
> > -#define CPUID_7_0_ECX_AVX512BMI (1U << 1)
> > -#define CPUID_7_0_ECX_VBMI  (1U << 1)
> > +#define CPUID_7_0_ECX_AVX512_VBMI   (1U << 1)
> >  /* User-Mode Instruction Prevention */
> >  #define CPUID_7_0_ECX_UMIP  (1U << 2)
> >  /* Protection Keys for User-mode Pages */
> > @@ -704,7 +703,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
> >  /* OS Enable Protection Keys */
> >  #define CPUID_7_0_ECX_OSPKE (1U << 4)
> >  /* Additional AVX-512 Vector Byte Manipulation Instruction */
> > -#define CPUID_7_0_ECX_VBMI2 (1U << 6)
> > +#define CPUID_7_0_ECX_AVX512_VBMI2  (1U << 6)
> >  /* Galois Field New Instructions */
> >  #define CPUID_7_0_ECX_GFNI  (1U << 8)
> >  /* Vector AES Instructions */
> > diff --git a/target/i386/hvf/x86_cpuid.c b/target/i386/hvf/x86_cpuid.c
> > index 4d957fe896..16762b6eb4 100644
> > --- a/target/i386/hvf/x86_cpuid.c
> > +++ b/target/i386/hvf/x86_cpuid.c
> > @@ -89,7 +89,7 @@ uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
> >  ebx &= ~CPUID_7_0_EBX_INVPCID;
> >  }
> >  
> > -ecx &= CPUID_7_0_ECX_AVX512BMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
> > +ecx &= CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_AVX512_VPOPCNTDQ;
> >  edx &= CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS;
> >  } else {
> >  ebx = 0;
> > --
> > 2.20.1








[PATCH V6] target/riscv: Ignore reserved bits in PTE for RV64

2019-09-25 Thread guoren
From: Guo Ren 

Highest 10 bits of PTE are reserved in riscv-privileged, ref: [1], so we
need to ignore them. They cannot be a part of ppn.

1: The RISC-V Instruction Set Manual, Volume II: Privileged Architecture
   4.4 Sv39: Page-Based 39-bit Virtual-Memory System
   4.5 Sv48: Page-Based 48-bit Virtual-Memory System

Signed-off-by: Guo Ren 
Tested-by: Bin Meng 
Reviewed-by: Liu Zhiwei 
Reviewed-by: Bin Meng 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu_bits.h   | 7 +++
 target/riscv/cpu_helper.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

 Changelog V6:
  - Add Reviewer: Alistair Francis

 Changelog V5:
  - Add Reviewer and Tester: Bin Meng

 Changelog V4:
  - Change title to Ignore not Bugfix
  - Use PTE_PPN_MASK for RV32 and RV64

 Changelog V3:
  - Use UUL define for PTE_RESERVED
  - Keep ppn >> PTE_PPN_SHIFT

 Changelog V2:
  - Bugfix pte destroyed cause boot fail
  - Change to AND with a mask instead of shifting both directions

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index e998348..399c2c6 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -473,6 +473,13 @@
 /* Page table PPN shift amount */
 #define PTE_PPN_SHIFT   10
 
+/* Page table PPN mask */
+#if defined(TARGET_RISCV32)
+#define PTE_PPN_MASK0xUL
+#elif defined(TARGET_RISCV64)
+#define PTE_PPN_MASK0x3fULL
+#endif
+
 /* Leaf page shift amount */
 #define PGSHIFT 12
 
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 87dd6a6..9961b37 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -261,7 +261,7 @@ restart:
 #elif defined(TARGET_RISCV64)
 target_ulong pte = ldq_phys(cs->as, pte_addr);
 #endif
-hwaddr ppn = pte >> PTE_PPN_SHIFT;
+hwaddr ppn = (pte & PTE_PPN_MASK) >> PTE_PPN_SHIFT;
 
 if (!(pte & PTE_V)) {
 /* Invalid PTE */
-- 
2.7.4
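
A sketch of why the mask matters on RV64, assuming the layout the commit
message describes: the top 10 bits of the 64-bit PTE are reserved and the
low 10 bits hold flags, so the PPN sits in bits 53..10. EX_PTE_PPN_SHIFT,
EX_PTE_PPN_MASK and both helper names are illustrative and derived from that
description; they are not copied from cpu_bits.h.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PTE_PPN_SHIFT 10
/* Low 54 bits set: everything except the 10 reserved top bits (assumed). */
#define EX_PTE_PPN_MASK  ((UINT64_C(1) << 54) - 1)

/* Unmasked form: reserved bits 63..54 leak into the PPN. */
static uint64_t ppn_unmasked(uint64_t pte)
{
    return pte >> EX_PTE_PPN_SHIFT;
}

/* Masked form: reserved bits are dropped before shifting. */
static uint64_t ppn_masked(uint64_t pte)
{
    return (pte & EX_PTE_PPN_MASK) >> EX_PTE_PPN_SHIFT;
}
```

For a PTE whose reserved bits are all zero the two helpers agree, so the
difference only shows up when software sets one of the reserved bits.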




Re: [PATCH v7 4/4] s390: do not call memory_region_allocate_system_memory() multiple times

2019-09-25 Thread Peter Xu
On Wed, Sep 25, 2019 at 01:51:05PM +0200, Igor Mammedov wrote:
> On Wed, 25 Sep 2019 11:27:00 +0800
> Peter Xu  wrote:
> 
> > On Tue, Sep 24, 2019 at 10:47:51AM -0400, Igor Mammedov wrote:
> > > s390 was trying to solve the limited KVM memslot size issue by abusing
> > > memory_region_allocate_system_memory(), which breaks the API contract
> > > that the function may be called only once.
> > > 
> > > Besides being an invalid use of the API, the approach also introduced a
> > > migration issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are
> > > transferred in the migration stream as separate RAMBlocks.
> > > 
> > > After discussion [1], it was agreed to break migration from older
> > > QEMU for guests with RAM >8TB (as it was relatively new (since 2.12)
> > > and considered to be not actually used downstream).
> > > Migration should keep working for guests with less than 8TB and for
> > > more than 8TB with QEMU 4.2 and newer binaries.
> > > If a user tries to migrate a guest with more than 8TB between
> > > incompatible QEMU versions, migration should fail gracefully due to
> > > a non-existing RAMBlock ID or a RAMBlock size mismatch.
> > > 
> > > Taking into account the above, and that KVM code is now able to split
> > > a too big MemorySection into several memslots, partially revert commit
> > >  (bb223055b s390-ccw-virtio: allow for systems larger that 7.999TB)
> > > and use kvm_set_max_memslot_size() to set KVMSlot size to
> > > KVM_SLOT_MAX_BYTES.
> > > 
> > > 1) [PATCH RFC v2 4/4] s390: do not call  
> > > memory_region_allocate_system_memory() multiple times
> > > 
> > > Signed-off-by: Igor Mammedov   
> > 
> > Acked-by: Peter Xu 
> > 
> > IMHO it would be good to at least mention bb223055b9 in the commit
> > message even if not with a "Fixed:" tag.  May be amended during commit
> > if anyone prefers.
> 
> /me confused, bb223055b9 is mentioned in commit message

I'm sorry, I overlooked that.

>  
> > Also, this only applies the split limitation to s390.  Would that be a
> > good thing to some other archs as well?
> 
> Don't we have the similar bitmap size issue in KVM for other archs?

Yes I thought we had.  So I feel like it would be good to also allow
other archs to support >8TB mem as well.  Thanks,

-- 
Peter Xu



Re: [PATCH V5] target/riscv: Ignore reserved bits in PTE for RV64

2019-09-25 Thread Guo Ren
Thx, Sincerely

On Thu, Sep 26, 2019 at 6:52 AM Alistair Francis  wrote:
>
> On Wed, Sep 25, 2019 at 5:05 AM  wrote:
> >
> > From: Guo Ren 
> >
> > Highest 10 bits of PTE are reserved in riscv-privileged, ref: [1], so we
> > need to ignore them. They cannot be a part of ppn.
> >
> > 1: The RISC-V Instruction Set Manual, Volume II: Privileged Architecture
> >4.4 Sv39: Page-Based 39-bit Virtual-Memory System
> >4.5 Sv48: Page-Based 48-bit Virtual-Memory System
> >
> > Signed-off-by: Guo Ren 
> > Tested-by: Bin Meng 
> > Reviewed-by: Liu Zhiwei 
> > Reviewed-by: Bin Meng 
>
> Reviewed-by: Alistair Francis 
>
> Alistair
>
> > ---
> >  target/riscv/cpu_bits.h   | 7 +++
> >  target/riscv/cpu_helper.c | 2 +-
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> >
> >  Changelog V5:
> >   - Update Reviewer and Tester.
> >
> >  Changelog V4:
> >   - Change title to Ignore not Bugfix
> >   - Use PTE_PPN_MASK for RV32 and RV64
> >
> >  Changelog V3:
> >   - Use UUL define for PTE_RESERVED
> >   - Keep ppn >> PTE_PPN_SHIFT
> >
> >  Changelog V2:
> >   - Bugfix pte destroyed cause boot fail
> >   - Change to AND with a mask instead of shifting both directions
> >
> > diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> > index e998348..399c2c6 100644
> > --- a/target/riscv/cpu_bits.h
> > +++ b/target/riscv/cpu_bits.h
> > @@ -473,6 +473,13 @@
> >  /* Page table PPN shift amount */
> >  #define PTE_PPN_SHIFT   10
> >
> > +/* Page table PPN mask */
> > +#if defined(TARGET_RISCV32)
> > +#define PTE_PPN_MASK0xUL
> > +#elif defined(TARGET_RISCV64)
> > +#define PTE_PPN_MASK0x3fULL
> > +#endif
> > +
> >  /* Leaf page shift amount */
> >  #define PGSHIFT 12
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 87dd6a6..9961b37 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -261,7 +261,7 @@ restart:
> >  #elif defined(TARGET_RISCV64)
> >  target_ulong pte = ldq_phys(cs->as, pte_addr);
> >  #endif
> > -hwaddr ppn = pte >> PTE_PPN_SHIFT;
> > +hwaddr ppn = (pte & PTE_PPN_MASK) >> PTE_PPN_SHIFT;
> >
> >  if (!(pte & PTE_V)) {
> >  /* Invalid PTE */
> > --
> > 2.7.4
> >



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/



Re: [PATCH v7 3/4] kvm: split too big memory section on several memslots

2019-09-25 Thread Peter Xu
On Wed, Sep 25, 2019 at 02:09:15PM +0200, Igor Mammedov wrote:
> On Wed, 25 Sep 2019 11:12:11 +0800
> Peter Xu  wrote:
> 
> > On Tue, Sep 24, 2019 at 10:47:50AM -0400, Igor Mammedov wrote:
> > 
> > [...]
> > 
> > > @@ -2877,6 +2912,7 @@ static bool kvm_accel_has_memory(MachineState *ms, AddressSpace *as,
> > >  
> > >  for (i = 0; i < kvm->nr_as; ++i) {
> > >  if (kvm->as[i].as == as && kvm->as[i].ml) {
> > > +size = MIN(kvm_max_slot_size, size);
> > >  return NULL != kvm_lookup_matching_slot(kvm->as[i].ml,
> > >  start_addr, size);
> > >  }  
> > 
> > Ideally we could also check that the whole (start_addr, size) region
> > is covered by KVM memslots here, but with current code I can't think
> > of a case where the result doesn't match with only checking the 1st
> > memslot. So I assume it's fine.
> yep, it's a micro-optimization that works on the assumption that the whole
> memory section is always covered by memslots; the original semantics only
> worked if start_addr/size covered the whole memory section.
> 
> Sole user mtree_print_flatview() is not performance sensitive,
> so if you'd like I can post an additional patch that iterates
> over whole range.

No need, it's fine, thanks!

-- 
Peter Xu
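
A small sketch of the arithmetic behind the discussion above, assuming a
memory section is cut into consecutive slots of at most max_slot_size bytes.
slots_needed() and first_slot_size() are hypothetical helpers written for
illustration and are not functions from QEMU's KVM code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of slots needed for a section of `size` bytes when no slot
 * may exceed `max_slot_size` bytes (rounded up).
 */
static uint64_t slots_needed(uint64_t size, uint64_t max_slot_size)
{
    assert(max_slot_size > 0);
    return size / max_slot_size + (size % max_slot_size != 0);
}

/*
 * Size of the first slot of the section: min(max_slot_size, size).
 * A lookup that only checks the first slot has to use this size.
 */
static uint64_t first_slot_size(uint64_t size, uint64_t max_slot_size)
{
    return size < max_slot_size ? size : max_slot_size;
}
```

Checking only the first slot gives the same answer as walking the whole
range as long as every section is fully covered by slots, which is the
assumption stated in the thread.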



Re: [PATCH v3 29/33] docker: remove 'deprecated' image definitions

2019-09-25 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> On 9/24/19 11:01 PM, Alex Bennée wrote:
>> From: John Snow 
>>
>> There isn't a debian.dockerfile anymore,
>> so perform some ghost-busting.
>
> Won't we deprecate other images in the future?

Sure but we can just drop them from dockerfiles. It's not like we
allowed people to use them as we filtered them out.

>
>> Signed-off-by: John Snow 
>> Message-Id: <20190923181140.7235-4-js...@redhat.com>
>> Signed-off-by: Alex Bennée 
>> ---
>>  tests/docker/Makefile.include | 7 +++
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
>> index 82d5a8a5393..fd6f470fbf8 100644
>> --- a/tests/docker/Makefile.include
>> +++ b/tests/docker/Makefile.include
>> @@ -4,11 +4,10 @@
>>
>>  DOCKER_SUFFIX := .docker
>>  DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
>> -DOCKER_DEPRECATED_IMAGES := debian
>>  # we don't run tests on intermediate images (used as base by another image)
>> -DOCKER_PARTIAL_IMAGES := debian debian9 debian10 debian-sid
>> +DOCKER_PARTIAL_IMAGES := debian9 debian10 debian-sid
>>  DOCKER_PARTIAL_IMAGES += debian9-mxe debian-ports debian-bootstrap
>> -DOCKER_IMAGES := $(filter-out $(DOCKER_DEPRECATED_IMAGES),$(sort $(notdir 
>> $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker)
>> +DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard 
>> $(DOCKER_FILES_DIR)/*.docker
>>  DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
>>  # Use a global constant ccache directory to speed up repetitive builds
>>  DOCKER_CCACHE_DIR := $$HOME/.cache/qemu-docker-ccache
>> @@ -160,7 +159,7 @@ docker-image-debian-powerpc-user-cross: docker-binfmt-image-debian-powerpc-user
>>  DOCKER_USER_IMAGES += debian-powerpc-user
>>
>>  # Expand all the pre-requistes for each docker image and test combination
>> -$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES) $(DOCKER_DEPRECATED_IMAGES)), \
>> +$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES)), \
>>  $(foreach t,$(DOCKER_TESTS) $(DOCKER_TOOLS), \
>>  $(eval .PHONY: docker-$t@$i) \
>>  $(eval docker-$t@$i: docker-image-$i docker-run-$t@$i) \
>>


--
Alex Bennée



Re: [PATCH] docker: fix uid maping with podman

2019-09-25 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/4b9204cc8ade1c965dc5412c53c6f7c5b4f019a2.1569413332.git.tgole...@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===




The full log is available at
http://patchew.org/logs/4b9204cc8ade1c965dc5412c53c6f7c5b4f019a2.1569413332.git.tgole...@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[RFC PATCH] configure: deprecate 32 bit build hosts

2019-09-25 Thread Alex Bennée
The 32 bit hosts are already second class citizens, especially with
support for running 64 bit guests under TCG. We are also limited by
testing, as actual working 32 bit machines are getting quite rare in
developers' personal menageries. For TCG, supporting newer types like
Int128 is a lot harder with 32 bit calling conventions compared to
their larger bit sized cousins. Fundamentally, address space is the
most useful thing for the translator to have; even for a 32 bit guest,
a 32 bit host is quite constrained.

As far as I'm aware 32 bit KVM users are even less numerous. Even
ILP32 doesn't make much sense given the address space QEMU needs to
manage.

Let's mark these machines as deprecated so we can have the wailing and
gnashing of teeth now and look to actually dropping the support in a
couple of cycles.

Signed-off-by: Alex Bennée 
Cc: Richard Henderson 
---
 configure | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/configure b/configure
index 542f6aea3f..776fd460b5 100755
--- a/configure
+++ b/configure
@@ -728,7 +728,7 @@ ARCH=
 # Normalise host CPU name and set ARCH.
 # Note that this case should only have supported host CPUs, not guests.
 case "$cpu" in
-  ppc|ppc64|s390|s390x|sparc64|x32|riscv32|riscv64)
+  ppc64|s390|s390x|sparc64|riscv64)
 supported_cpu="yes"
   ;;
   ppc64le)
@@ -737,7 +737,6 @@ case "$cpu" in
   ;;
   i386|i486|i586|i686|i86pc|BePC)
 cpu="i386"
-supported_cpu="yes"
   ;;
   x86_64|amd64)
 cpu="x86_64"
@@ -745,19 +744,22 @@ case "$cpu" in
   ;;
   armv*b|armv*l|arm)
 cpu="arm"
-supported_cpu="yes"
   ;;
   aarch64)
 cpu="aarch64"
 supported_cpu="yes"
   ;;
-  mips*)
+  mips64*)
 cpu="mips"
 supported_cpu="yes"
   ;;
+  mips*)
+cpu="mips"
+  ;;
   sparc|sun4[cdmuv])
 cpu="sparc"
-supported_cpu="yes"
+  ;;
+  x32|riscv32)
   ;;
   *)
 # This will result in either an error or falling back to TCI later
@@ -6438,12 +6440,12 @@ if test "$supported_cpu" = "no"; then
 echo "WARNING: SUPPORT FOR THIS HOST CPU WILL GO AWAY IN FUTURE RELEASES!"
 echo
 echo "CPU host architecture $cpu support is not currently maintained."
-echo "The QEMU project intends to remove support for this host CPU in"
-echo "a future release if nobody volunteers to maintain it and to"
-echo "provide a build host for our continuous integration setup."
-echo "configure has succeeded and you can continue to build, but"
-echo "if you care about QEMU on this platform you should contact"
-echo "us upstream at qemu-devel@nongnu.org."
+echo "The QEMU project intends to remove support for all 32 bit host"
+echo "CPUs in a future release. 64 bit hosts will need a volunteer"
+echo "to maintain it and to provide a build host for our continuous"
+echo "integration setup. configure has succeeded and you can continue"
+echo "to build, but if you care about QEMU on this platform you"
+echo "should contact us upstream at qemu-devel@nongnu.org."
 fi
 
 if test "$supported_os" = "no"; then
-- 
2.20.1




Re: [PATCH V5] target/riscv: Ignore reserved bits in PTE for RV64

2019-09-25 Thread Alistair Francis
On Wed, Sep 25, 2019 at 5:05 AM  wrote:
>
> From: Guo Ren 
>
> Highest 10 bits of PTE are reserved in riscv-privileged, ref: [1], so we
> need to ignore them. They cannot be a part of ppn.
>
> 1: The RISC-V Instruction Set Manual, Volume II: Privileged Architecture
>4.4 Sv39: Page-Based 39-bit Virtual-Memory System
>4.5 Sv48: Page-Based 48-bit Virtual-Memory System
>
> Signed-off-by: Guo Ren 
> Tested-by: Bin Meng 
> Reviewed-by: Liu Zhiwei 
> Reviewed-by: Bin Meng 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu_bits.h   | 7 +++
>  target/riscv/cpu_helper.c | 2 +-
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
>  Changelog V5:
>   - Update Reviewer and Tester.
>
>  Changelog V4:
>   - Change title to Ignore not Bugfix
>   - Use PTE_PPN_MASK for RV32 and RV64
>
>  Changelog V3:
>   - Use UUL define for PTE_RESERVED
>   - Keep ppn >> PTE_PPN_SHIFT
>
>  Changelog V2:
>   - Bugfix pte destroyed cause boot fail
>   - Change to AND with a mask instead of shifting both directions
>
> diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> index e998348..399c2c6 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -473,6 +473,13 @@
>  /* Page table PPN shift amount */
>  #define PTE_PPN_SHIFT   10
>
> +/* Page table PPN mask */
> +#if defined(TARGET_RISCV32)
> +#define PTE_PPN_MASK0xUL
> +#elif defined(TARGET_RISCV64)
> +#define PTE_PPN_MASK0x3fULL
> +#endif
> +
>  /* Leaf page shift amount */
>  #define PGSHIFT 12
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 87dd6a6..9961b37 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -261,7 +261,7 @@ restart:
>  #elif defined(TARGET_RISCV64)
>  target_ulong pte = ldq_phys(cs->as, pte_addr);
>  #endif
> -hwaddr ppn = pte >> PTE_PPN_SHIFT;
> +hwaddr ppn = (pte & PTE_PPN_MASK) >> PTE_PPN_SHIFT;
>
>  if (!(pte & PTE_V)) {
>  /* Invalid PTE */
> --
> 2.7.4
>



Re: [PATCH V3] target/riscv: Bugfix reserved bits in PTE for RV64

2019-09-25 Thread Alistair Francis
On Wed, Sep 25, 2019 at 9:16 AM Guo Ren  wrote:
>
> "Bits 63–54 are reserved for future use and must be
> zeroed by software for forward compatibility."
>
> That doesn't mean bits 63-54 belong to the ppn; they are reserved for
> future use and nobody knows whether 63-54 will become part of the ppn.
> The current riscv qemu ppn implementation is obviously wrong. It shouldn't
> depend on the software's behavior; please follow the spec.

You have convinced me, I think this is an acceptable change.

Alistair

>
> On Wed, Sep 25, 2019 at 11:58 PM Jonathan Behrens  wrote:
> >
> > > The specification is very clear: these bits are not part of ppn, not
> > > part of the translation target address. The current code is against
> > > the riscv-privilege specification.
> >
> > If all of the reserved bits are zero then the patch changes nothing.
> > Further the only normative mention of the reserved bits in the spec
> > says they must be: "Bits 63–54 are reserved for future use and must be
> > zeroed by software for forward compatibility." Provided that software
> > follows the spec, current QEMU will behave properly. For software that
> > ignores that directive and sets some of those bits, the spec says
> > nothing about what hardware should do, so both the old and the new
> > behavior are fine.
> >
> > Jonathan
>
>
>
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/



Re: [PATCH v3 01/33] target/alpha: Use array for FPCR_DYN conversion

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: Richard Henderson 
> 
> This is a bit more straight-forward than using a switch statement.
> No functional change.
> 
> Signed-off-by: Richard Henderson 
> Signed-off-by: Alex Bennée 
> Message-Id: <20190921043256.4575-2-richard.hender...@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  target/alpha/helper.c | 24 
>  1 file changed, 8 insertions(+), 16 deletions(-)
> 
> diff --git a/target/alpha/helper.c b/target/alpha/helper.c
> index 19cda0a2db5..6c1703682e0 100644
> --- a/target/alpha/helper.c
> +++ b/target/alpha/helper.c
> @@ -36,6 +36,13 @@ uint64_t cpu_alpha_load_fpcr(CPUAlphaState *env)
>  
>  void cpu_alpha_store_fpcr(CPUAlphaState *env, uint64_t val)
>  {
> +static const uint8_t rm_map[] = {
> +[FPCR_DYN_NORMAL >> FPCR_DYN_SHIFT] = float_round_nearest_even,
> +[FPCR_DYN_CHOPPED >> FPCR_DYN_SHIFT] = float_round_to_zero,
> +[FPCR_DYN_MINUS >> FPCR_DYN_SHIFT] = float_round_down,
> +[FPCR_DYN_PLUS >> FPCR_DYN_SHIFT] = float_round_up,
> +};
> +
>  uint32_t fpcr = val >> 32;
>  uint32_t t = 0;
>  
> @@ -48,22 +55,7 @@ void cpu_alpha_store_fpcr(CPUAlphaState *env, uint64_t val)
>  env->fpcr = fpcr;
>  env->fpcr_exc_enable = ~t & FPCR_STATUS_MASK;
>  
> -switch (fpcr & FPCR_DYN_MASK) {
> -case FPCR_DYN_NORMAL:
> -default:
> -t = float_round_nearest_even;
> -break;
> -case FPCR_DYN_CHOPPED:
> -t = float_round_to_zero;
> -break;
> -case FPCR_DYN_MINUS:
> -t = float_round_down;
> -break;
> -case FPCR_DYN_PLUS:
> -t = float_round_up;
> -break;
> -}
> -env->fpcr_dyn_round = t;
> +env->fpcr_dyn_round = rm_map[(fpcr & FPCR_DYN_MASK) >> FPCR_DYN_SHIFT];
>  
>  env->fpcr_flush_to_zero = (fpcr & FPCR_UNFD) && (fpcr & FPCR_UNDZ);
>  env->fp_status.flush_inputs_to_zero = (fpcr & FPCR_DNZ) != 0;
> 
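
A self-contained sketch of the switch-to-table conversion this patch makes.
The EX_DYN_* field values and the ex_round enum are assumptions chosen for
illustration; they stand in for FPCR_DYN_* and the softfloat rounding modes
and are not taken from the Alpha headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 2-bit field layout (assumed, not the real FPCR). */
#define EX_DYN_SHIFT   2
#define EX_DYN_MASK    (3u << EX_DYN_SHIFT)
#define EX_DYN_CHOPPED (0u << EX_DYN_SHIFT)
#define EX_DYN_MINUS   (1u << EX_DYN_SHIFT)
#define EX_DYN_NORMAL  (2u << EX_DYN_SHIFT)
#define EX_DYN_PLUS    (3u << EX_DYN_SHIFT)

enum ex_round { EX_NEAREST_EVEN, EX_TO_ZERO, EX_DOWN, EX_UP };

/* Switch form, in the shape of the code the patch removes. */
static enum ex_round round_switch(uint32_t fpcr)
{
    switch (fpcr & EX_DYN_MASK) {
    case EX_DYN_CHOPPED:
        return EX_TO_ZERO;
    case EX_DYN_MINUS:
        return EX_DOWN;
    case EX_DYN_PLUS:
        return EX_UP;
    case EX_DYN_NORMAL:
    default:
        return EX_NEAREST_EVEN;
    }
}

/* Table form: designated initializers indexed by the shifted field. */
static enum ex_round round_table(uint32_t fpcr)
{
    static const uint8_t rm_map[] = {
        [EX_DYN_NORMAL >> EX_DYN_SHIFT] = EX_NEAREST_EVEN,
        [EX_DYN_CHOPPED >> EX_DYN_SHIFT] = EX_TO_ZERO,
        [EX_DYN_MINUS >> EX_DYN_SHIFT] = EX_DOWN,
        [EX_DYN_PLUS >> EX_DYN_SHIFT] = EX_UP,
    };

    return (enum ex_round)rm_map[(fpcr & EX_DYN_MASK) >> EX_DYN_SHIFT];
}
```

The table lookup is safe without a default case because the masked, shifted
field can only take four values and every one of them has an entry.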



Re: [PATCH v3 03/33] target/alpha: Fix SWCR_TRAP_ENABLE_MASK

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: Richard Henderson 
> 
> The CONFIG_USER_ONLY adjustment blindly mashed the swcr
> exception enable bits into the fpcr exception disable bits.
> 
> However, fpcr_exc_enable has already converted the exception
> disable bits into the exception status bits in order to make
> it easier to mask status bits at runtime.
> 
> Instead, merge the swcr enable bits with the fpcr before we
> convert to status bits.
> 
> Signed-off-by: Richard Henderson 
> Signed-off-by: Alex Bennée 
> Message-Id: <20190921043256.4575-4-richard.hender...@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  target/alpha/helper.c | 23 ++-
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/target/alpha/helper.c b/target/alpha/helper.c
> index 10602fb3394..e21c488aa32 100644
> --- a/target/alpha/helper.c
> +++ b/target/alpha/helper.c
> @@ -46,34 +46,39 @@ void cpu_alpha_store_fpcr(CPUAlphaState *env, uint64_t val)
>  uint32_t fpcr = val >> 32;
>  uint32_t t = 0;
>  
> +/* Record the raw value before adjusting for linux-user.  */
> +env->fpcr = fpcr;
> +
> +#ifdef CONFIG_USER_ONLY
> +/*
> + * Override some of these bits with the contents of ENV->SWCR.
> + * In system mode, some of these would trap to the kernel, at
> + * which point the kernel's handler would emulate and apply
> + * the software exception mask.
> + */
> +uint32_t soft_fpcr = alpha_ieee_swcr_to_fpcr(env->swcr) >> 32;
> +fpcr |= soft_fpcr & FPCR_STATUS_MASK;
> +#endif
> +
>  t |= CONVERT_BIT(fpcr, FPCR_INED, FPCR_INE);
>  t |= CONVERT_BIT(fpcr, FPCR_UNFD, FPCR_UNF);
>  t |= CONVERT_BIT(fpcr, FPCR_OVFD, FPCR_OVF);
>  t |= CONVERT_BIT(fpcr, FPCR_DZED, FPCR_DZE);
>  t |= CONVERT_BIT(fpcr, FPCR_INVD, FPCR_INV);
>  
> -env->fpcr = fpcr;
>  env->fpcr_exc_enable = ~t & FPCR_STATUS_MASK;
>  
>  env->fpcr_dyn_round = rm_map[(fpcr & FPCR_DYN_MASK) >> FPCR_DYN_SHIFT];
>  
>  env->fpcr_flush_to_zero = (fpcr & FPCR_UNFD) && (fpcr & FPCR_UNDZ);
>  env->fp_status.flush_inputs_to_zero = (fpcr & FPCR_DNZ) != 0;
> -
>  #ifdef CONFIG_USER_ONLY
> -/*
> - * Override some of these bits with the contents of ENV->SWCR.
> - * In system mode, some of these would trap to the kernel, at
> - * which point the kernel's handler would emulate and apply
> - * the software exception mask.
> - */
>  if (env->swcr & SWCR_MAP_DMZ) {
>  env->fp_status.flush_inputs_to_zero = 1;
>  }
>  if (env->swcr & SWCR_MAP_UMZ) {
>  env->fpcr_flush_to_zero = 1;
>  }
> -env->fpcr_exc_enable &= ~(alpha_ieee_swcr_to_fpcr(env->swcr) >> 32);
>  #endif
>  }
>  
> 



Re: [PATCH v3 07/33] target/alpha: Tidy helper_fp_exc_raise_s

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: Richard Henderson 
> 
> Remove a redundant masking of ignore.  Once that's gone it is
> obvious that the system-mode inner test is redundant with the
> outer test.  Move the fpcr_exc_enable masking up and tidy.
> 
> No functional change.
> 
> Signed-off-by: Richard Henderson 
> Signed-off-by: Alex Bennée 
> Message-Id: <20190921043256.4575-8-richard.hender...@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  target/alpha/fpu_helper.c | 15 ---
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/target/alpha/fpu_helper.c b/target/alpha/fpu_helper.c
> index 62a066d9020..df8b58963ba 100644
> --- a/target/alpha/fpu_helper.c
> +++ b/target/alpha/fpu_helper.c
> @@ -90,25 +90,18 @@ void helper_fp_exc_raise_s(CPUAlphaState *env, uint32_t ignore, uint32_t regno)
>  uint32_t exc = env->error_code & ~ignore;
>  if (exc) {
>  env->fpcr |= exc;
> -exc &= ~ignore;
> -#ifdef CONFIG_USER_ONLY
> -/*
> - * In user mode, the kernel's software handler only
> - * delivers a signal if the exception is enabled.
> - */
> -if (!(exc & env->fpcr_exc_enable)) {
> -return;
> -}
> -#else
> +exc &= env->fpcr_exc_enable;
>  /*
>   * In system mode, the software handler gets invoked
>   * for any non-ignored exception.
> + * In user mode, the kernel's software handler only
> + * delivers a signal if the exception is enabled.
>   */
> +#ifdef CONFIG_USER_ONLY
>  if (!exc) {
>  return;
>  }
>  #endif
> -exc &= env->fpcr_exc_enable;
>  fp_exc_raise1(env, GETPC(), exc, regno, EXC_M_SWC);
>  }
>  }
> 



Re: [PATCH v3 08/33] tests/migration: Fail on unexpected migration states

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: "Dr. David Alan Gilbert" 
> 
> We've got various places where we wait for a migration to enter
> a given state; but if we enter an unexpected state we tend to fail
> in odd ways; add a mechanism for explicitly testing for any state
> which we shouldn't be in.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Signed-off-by: Alex Bennée 
> Message-Id: <20190923131022.15498-2-dgilb...@redhat.com>
> ---
>  tests/migration-test.c | 23 +--
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/migration-test.c b/tests/migration-test.c
> index 258aa064d48..9c62ee5331b 100644
> --- a/tests/migration-test.c
> +++ b/tests/migration-test.c
> @@ -255,15 +255,19 @@ static void read_blocktime(QTestState *who)
>  }
>  
>  static void wait_for_migration_status(QTestState *who,
> -  const char *goal)
> +  const char *goal,
> +  const char **ungoals)
>  {
>  while (true) {
>  bool completed;
>  char *status;
> +const char **ungoal;
>  
>  status = migrate_query_status(who);
>  completed = strcmp(status, goal) == 0;
> -g_assert_cmpstr(status, !=,  "failed");
> +for (ungoal = ungoals; *ungoal; ungoal++) {
> +g_assert_cmpstr(status, !=,  *ungoal);

:)

Reviewed-by: Philippe Mathieu-Daudé 

> +}
>  g_free(status);
>  if (completed) {
>  return;
> @@ -274,7 +278,8 @@ static void wait_for_migration_status(QTestState *who,
>  
>  static void wait_for_migration_complete(QTestState *who)
>  {
> -wait_for_migration_status(who, "completed");
> +wait_for_migration_status(who, "completed",
> +  (const char * []) { "failed", NULL });
>  }
>  
>  static void wait_for_migration_pass(QTestState *who)
> @@ -809,7 +814,9 @@ static void test_postcopy_recovery(void)
>   * Wait until postcopy is really started; we can only run the
>   * migrate-pause command during a postcopy
>   */
> -wait_for_migration_status(from, "postcopy-active");
> +wait_for_migration_status(from, "postcopy-active",
> +  (const char * []) { "failed",
> +  "completed", NULL });
>  
>  /*
>   * Manually stop the postcopy migration. This emulates a network
> @@ -822,7 +829,9 @@ static void test_postcopy_recovery(void)
>   * migrate-recover command can only succeed if destination machine
>   * is in the paused state
>   */
> -wait_for_migration_status(to, "postcopy-paused");
> +wait_for_migration_status(to, "postcopy-paused",
> +  (const char * []) { "failed", "active",
> +  "completed", NULL });
>  
>  /*
>   * Create a new socket to emulate a new channel that is different
> @@ -836,7 +845,9 @@ static void test_postcopy_recovery(void)
>   * Try to rebuild the migration channel using the resume flag and
>   * the newly created channel
>   */
> -wait_for_migration_status(from, "postcopy-paused");
> +wait_for_migration_status(from, "postcopy-paused",
> +  (const char * []) { "failed", "active",
> +  "completed", NULL });
>  migrate(from, uri, "{'resume': true}");
>  g_free(uri);
>  
> 
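
For anyone following along, the "ungoals" check in the patch above boils down to walking a NULL-terminated array of strings. Below is a minimal standalone sketch of that pattern, not the test code itself: status_is_forbidden() is a hypothetical helper name used only for illustration, and it returns a bool via plain strcmp() where the patch asserts with g_assert_cmpstr().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return true if 'status'
 * matches any entry of the NULL-terminated 'ungoals' array.
 */
static bool status_is_forbidden(const char *status, const char **ungoals)
{
    const char **ungoal;

    /* Stop at the NULL sentinel, exactly like the loop in the patch. */
    for (ungoal = ungoals; *ungoal; ungoal++) {
        if (strcmp(status, *ungoal) == 0) {
            return true;
        }
    }
    return false;
}
```

A caller can pass the list inline as a compound literal, the same way the patch does, e.g. status_is_forbidden(status, (const char * []) { "failed", NULL }). An array holding only NULL forbids nothing.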



Re: [PATCH v3 14/33] tests/docker: remove python2.7 from debian9-mxe

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: John Snow 
> 
> When it was based on debian8 which uses python-minimal, it needed this.
> It no longer does.
> 
> Goodbye, python2.7.
> 
> Signed-off-by: John Snow 
> Message-Id: <20190918222546.11696-1-js...@redhat.com>
> [AJB: fixed up commit message]
> Signed-off-by: Alex Bennée 
> ---
>  tests/docker/dockerfiles/debian9-mxe.docker | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/tests/docker/dockerfiles/debian9-mxe.docker 
> b/tests/docker/dockerfiles/debian9-mxe.docker
> index 7431168dad9..62ff1cecf2d 100644
> --- a/tests/docker/dockerfiles/debian9-mxe.docker
> +++ b/tests/docker/dockerfiles/debian9-mxe.docker
> @@ -16,7 +16,6 @@ RUN apt-key adv --keyserver keyserver.ubuntu.com 
> --recv-keys C6BF758A33A3A276 &&
>  RUN apt-get update && \
>  DEBIAN_FRONTEND=noninteractive eatmydata \
>  apt-get install -y --no-install-recommends \
> -libpython2.7-stdlib \
>  $(apt-get -s install -y --no-install-recommends 
> gw32.shared-mingw-w64 | egrep "^Inst mxe-x86-64-unknown-" | cut -d\  -f2)
>  
> -ENV PATH $PATH:/usr/lib/mxe/usr/bin/ 
> +ENV PATH $PATH:/usr/lib/mxe/usr/bin/
> 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 



Re: [PATCH v3 22/33] configure: preserve PKG_CONFIG for subdir builds

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> The slirp sub-module complains about not being able to find the glib
> library on cross-compiles because it is using the default pkg-config
> tool (which isn't installed in our cross-build docker images).
> Preserve PKG_CONFIG in our host config and pass it down to slirp.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> ---
>  Makefile  | 6 +-
>  configure | 1 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index a0c1430b407..8da33595edd 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -510,7 +510,11 @@ capstone/all: .git-submodule-status
>  
>  .PHONY: slirp/all
>  slirp/all: .git-submodule-status
> - $(call quiet-command,$(MAKE) -C $(SRC_PATH)/slirp 
> BUILD_DIR="$(BUILD_DIR)/slirp" CC="$(CC)" AR="$(AR)" LD="$(LD)" 
> RANLIB="$(RANLIB)" CFLAGS="$(QEMU_CFLAGS) $(CFLAGS)" LDFLAGS="$(LDFLAGS)")
> + $(call quiet-command,$(MAKE) -C $(SRC_PATH)/slirp   \
> + BUILD_DIR="$(BUILD_DIR)/slirp"  \
> + PKG_CONFIG="$(PKG_CONFIG)"  \

Eh it was that easy... nice.

Reviewed-by: Philippe Mathieu-Daudé 

> + CC="$(CC)" AR="$(AR)"   LD="$(LD)" RANLIB="$(RANLIB)"   \
> + CFLAGS="$(QEMU_CFLAGS) $(CFLAGS)" LDFLAGS="$(LDFLAGS)")
>  
>  # Compatibility gunk to keep make working across the rename of targets
>  # for recursion, to be removed some time after 4.1.
> diff --git a/configure b/configure
> index 397bb476e19..542f6aea3f6 100755
> --- a/configure
> +++ b/configure
> @@ -7302,6 +7302,7 @@ echo "OBJCOPY=$objcopy" >> $config_host_mak
>  echo "LD=$ld" >> $config_host_mak
>  echo "RANLIB=$ranlib" >> $config_host_mak
>  echo "NM=$nm" >> $config_host_mak
> +echo "PKG_CONFIG=$pkg_config_exe" >> $config_host_mak
>  echo "WINDRES=$windres" >> $config_host_mak
>  echo "CFLAGS=$CFLAGS" >> $config_host_mak
>  echo "CFLAGS_NOPIE=$CFLAGS_NOPIE" >> $config_host_mak
> 



Re: [PATCH v3 20/33] tests/tcg: add generic version of float_convs

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> This is broadly similar to the existing fcvt test for ARM but using
> the generic float testing framework. We should be able to pare down
> the ARM fcvt test case to purely half-precision with or without the
> Alt HP provision.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> ---
>  tests/tcg/aarch64/float_convs.ref   | 748 
>  tests/tcg/arm/float_convs.ref   | 748 
>  tests/tcg/multiarch/Makefile.target |   6 +-
>  tests/tcg/multiarch/float_convs.c   | 105 
>  4 files changed, 1604 insertions(+), 3 deletions(-)
>  create mode 100755 tests/tcg/aarch64/float_convs.ref
>  create mode 100644 tests/tcg/arm/float_convs.ref
>  create mode 100644 tests/tcg/multiarch/float_convs.c
[...]

Tested-by: Philippe Mathieu-Daudé 



Re: [PATCH v3 31/33] docker: remove unused debian-sid

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:01 PM, Alex Bennée wrote:
> From: John Snow 
> 
> debian-sid is listed as a partial image, so we cannot run tests against it.
> Since it isn't used by any other testable image, remove it for now as it
> is prone to bitrot.
> 
> Signed-off-by: John Snow 
> Message-Id: <20190923181140.7235-6-js...@redhat.com>
> Signed-off-by: Alex Bennée 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/docker/Makefile.include  |  2 +-
>  tests/docker/dockerfiles/debian-sid.docker | 35 --
>  2 files changed, 1 insertion(+), 36 deletions(-)
>  delete mode 100644 tests/docker/dockerfiles/debian-sid.docker
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index 053c418d8cd..180e5439ef9 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -5,7 +5,7 @@
>  DOCKER_SUFFIX := .docker
>  DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
>  # we don't run tests on intermediate images (used as base by another image)
> -DOCKER_PARTIAL_IMAGES := debian9 debian10 debian-sid
> +DOCKER_PARTIAL_IMAGES := debian9 debian10
>  DOCKER_PARTIAL_IMAGES += debian9-mxe debian-bootstrap
>  DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard 
> $(DOCKER_FILES_DIR)/*.docker
>  DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
> diff --git a/tests/docker/dockerfiles/debian-sid.docker 
> b/tests/docker/dockerfiles/debian-sid.docker
> deleted file mode 100644
> index 2a1bcc33b24..000
> --- a/tests/docker/dockerfiles/debian-sid.docker
> +++ /dev/null
> @@ -1,35 +0,0 @@
> -#
> -# Debian Sid Base
> -#
> -# Currently we can build all our guests with cross-compilers in the
> -# latest Debian release (Buster). However new compilers will first
> -# arrive in Sid. However Sid is a rolling distro which may be broken
> -# at any particular time. To try and mitigate this we use Debian's
> -# snapshot archive which provides a "stable" view of what state Sid
> -# was in.
> -#
> -
> -# This must be earlier than the snapshot date we are aiming for
> -FROM debian:sid-20190812-slim
> -
> - # Use a snapshot known to work (see http://snapshot.debian.org/#Usage)
> -ENV DEBIAN_SNAPSHOT_DATE "20190820"
> -RUN sed -i "s%^deb \(https\?://\)deb.debian.org/debian/\? \(.*\)%deb 
> [check-valid-until=no] 
> \1snapshot.debian.org/archive/debian/${DEBIAN_SNAPSHOT_DATE} \2%" 
> /etc/apt/sources.list
> -
> -# Duplicate deb line as deb-src
> -RUN cat /etc/apt/sources.list | sed "s/^deb\ /deb-src /" >> 
> /etc/apt/sources.list
> -
> -# Install common build utilities
> -RUN apt update && \
> -DEBIAN_FRONTEND=noninteractive apt install -yy eatmydata && \
> -DEBIAN_FRONTEND=noninteractive eatmydata \
> -apt install -y --no-install-recommends \
> -bison \
> -build-essential \
> -ca-certificates \
> -flex \
> -git \
> -pkg-config \
> -psmisc \
> -python \
> -texinfo || { echo "Failed to build - see debian-sid.docker notes"; 
> exit 1; }
> 



Re: [PATCH v3 29/33] docker: remove 'deprecated' image definitions

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:01 PM, Alex Bennée wrote:
> From: John Snow 
> 
> There isn't a debian.dockerfile anymore,
> so perform some ghost-busting.

Won't we deprecate other images in the future?

> Signed-off-by: John Snow 
> Message-Id: <20190923181140.7235-4-js...@redhat.com>
> Signed-off-by: Alex Bennée 
> ---
>  tests/docker/Makefile.include | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index 82d5a8a5393..fd6f470fbf8 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -4,11 +4,10 @@
>  
>  DOCKER_SUFFIX := .docker
>  DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
> -DOCKER_DEPRECATED_IMAGES := debian
>  # we don't run tests on intermediate images (used as base by another image)
> -DOCKER_PARTIAL_IMAGES := debian debian9 debian10 debian-sid
> +DOCKER_PARTIAL_IMAGES := debian9 debian10 debian-sid
>  DOCKER_PARTIAL_IMAGES += debian9-mxe debian-ports debian-bootstrap
> -DOCKER_IMAGES := $(filter-out $(DOCKER_DEPRECATED_IMAGES),$(sort $(notdir 
> $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker)
> +DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard 
> $(DOCKER_FILES_DIR)/*.docker
>  DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
>  # Use a global constant ccache directory to speed up repetitive builds
>  DOCKER_CCACHE_DIR := $$HOME/.cache/qemu-docker-ccache
> @@ -160,7 +159,7 @@ docker-image-debian-powerpc-user-cross: 
> docker-binfmt-image-debian-powerpc-user
>  DOCKER_USER_IMAGES += debian-powerpc-user
>  
>  # Expand all the pre-requistes for each docker image and test combination
> -$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES) 
> $(DOCKER_DEPRECATED_IMAGES)), \
> +$(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES)), \
>   $(foreach t,$(DOCKER_TESTS) $(DOCKER_TOOLS), \
>   $(eval .PHONY: docker-$t@$i) \
>   $(eval docker-$t@$i: docker-image-$i docker-run-$t@$i) \
> 



Re: [PATCH v3 32/33] docker: move tests from python2 to python3

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:01 PM, Alex Bennée wrote:
> From: John Snow 
> 
> As part of the push to drop python2 support, replace any explicit python2
> dependencies with python3 versions.
> 
> For centos, python2 still exists as an implicit dependency, but by adding
> python3 we will be able to build even if the configure script begins to
> require python 3.5+.
> 
> Tested with centos7, fedora, ubuntu, ubuntu1804, and debian 9 (amd64).
> Tested under a custom configure script that requires Python 3.5+.
> 
> the travis dockerfile is also moved to using python3, which was tested
> by running `make docker-test-build@travis`, which I hope is sufficient.
> 
> Signed-off-by: John Snow 
> Message-Id: <20190923181140.7235-7-js...@redhat.com>
> Signed-off-by: Alex Bennée 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/docker/dockerfiles/centos7.docker | 2 +-
>  tests/docker/dockerfiles/debian-xtensa-cross.docker | 2 +-
>  tests/docker/dockerfiles/debian10.docker| 2 +-
>  tests/docker/dockerfiles/debian9.docker | 2 +-
>  tests/docker/dockerfiles/travis.docker  | 2 +-
>  tests/docker/dockerfiles/ubuntu.docker  | 2 +-
>  tests/docker/dockerfiles/ubuntu1804.docker  | 2 +-
>  7 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/tests/docker/dockerfiles/centos7.docker 
> b/tests/docker/dockerfiles/centos7.docker
> index e0b9d7dbe9f..953637065c4 100644
> --- a/tests/docker/dockerfiles/centos7.docker
> +++ b/tests/docker/dockerfiles/centos7.docker
> @@ -25,6 +25,7 @@ ENV PACKAGES \
>  nettle-devel \
>  perl-Test-Harness \
>  pixman-devel \
> +python3 \
>  SDL-devel \
>  spice-glib-devel \
>  spice-server-devel \
> @@ -34,4 +35,3 @@ ENV PACKAGES \
>  zlib-devel
>  RUN yum install -y $PACKAGES
>  RUN rpm -q $PACKAGES | sort > /packages.txt
> -
> diff --git a/tests/docker/dockerfiles/debian-xtensa-cross.docker 
> b/tests/docker/dockerfiles/debian-xtensa-cross.docker
> index b9c2e2e5317..e6f93f65ee2 100644
> --- a/tests/docker/dockerfiles/debian-xtensa-cross.docker
> +++ b/tests/docker/dockerfiles/debian-xtensa-cross.docker
> @@ -18,7 +18,7 @@ RUN apt-get update && \
>  flex \
>  gettext \
>  git \
> -python-minimal
> +python3-minimal
>  
>  ENV CPU_LIST csp dc232b dc233c
>  ENV TOOLCHAIN_RELEASE 2018.02
> diff --git a/tests/docker/dockerfiles/debian10.docker 
> b/tests/docker/dockerfiles/debian10.docker
> index 30a78813f27..dad498b52e3 100644
> --- a/tests/docker/dockerfiles/debian10.docker
> +++ b/tests/docker/dockerfiles/debian10.docker
> @@ -26,7 +26,7 @@ RUN apt update && \
>  git \
>  pkg-config \
>  psmisc \
> -python \
> +python3 \
>  python3-sphinx \
>  texinfo \
>  $(apt-get -s build-dep qemu | egrep ^Inst | fgrep '[all]' | cut -d\  
> -f2)
> diff --git a/tests/docker/dockerfiles/debian9.docker 
> b/tests/docker/dockerfiles/debian9.docker
> index b36f1d4ed83..8cbd742bb5f 100644
> --- a/tests/docker/dockerfiles/debian9.docker
> +++ b/tests/docker/dockerfiles/debian9.docker
> @@ -26,7 +26,7 @@ RUN apt update && \
>  git \
>  pkg-config \
>  psmisc \
> -python \
> +python3 \
>  python3-sphinx \
>  texinfo \
>  $(apt-get -s build-dep qemu | egrep ^Inst | fgrep '[all]' | cut -d\  
> -f2)
> diff --git a/tests/docker/dockerfiles/travis.docker 
> b/tests/docker/dockerfiles/travis.docker
> index e72dc85ca7a..ea14da29d97 100644
> --- a/tests/docker/dockerfiles/travis.docker
> +++ b/tests/docker/dockerfiles/travis.docker
> @@ -5,7 +5,7 @@ ENV LC_ALL en_US.UTF-8
>  RUN sed -i "s/# deb-src/deb-src/" /etc/apt/sources.list
>  RUN apt-get update
>  RUN apt-get -y build-dep qemu
> -RUN apt-get -y install device-tree-compiler python2.7 python-yaml 
> dh-autoreconf gdb strace lsof net-tools gcovr
> +RUN apt-get -y install device-tree-compiler python3 python3-yaml 
> dh-autoreconf gdb strace lsof net-tools gcovr
>  # Travis tools require PhantomJS / Neo4j / Maven accessible
>  # in their PATH (QEMU build won't access them).
>  ENV PATH 
> /usr/local/phantomjs/bin:/usr/local/phantomjs:/usr/local/neo4j-3.2.7/bin:/usr/local/maven-3.5.2/bin:/usr/local/cmake-3.9.2/bin:/usr/local/clang-5.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> diff --git a/tests/docker/dockerfiles/ubuntu.docker 
> b/tests/docker/dockerfiles/ubuntu.docker
> index a4f601395c8..f4864922240 100644
> --- a/tests/docker/dockerfiles/ubuntu.docker
> +++ b/tests/docker/dockerfiles/ubuntu.docker
> @@ -60,7 +60,7 @@ ENV PACKAGES flex bison \
>  libvte-2.91-dev \
>  libxen-dev \
>  make \
> -python-yaml \
> +python3-yaml \
>  python3-sphinx \
>  sparse \
>  texinfo \
> diff --git a/tests/docker/dockerfiles/ubuntu1804.docker 
> b/tests/docker/dockerfiles/ubuntu1804.docker
> index 883f9bcf31c..3cc4f492c4a 100644
> --- 

Re: [PATCH v3 30/33] docker: remove unused debian-ports

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:01 PM, Alex Bennée wrote:
> From: John Snow 
> 
> debian-ports is listed as a partial image, so we cannot run tests against it.
> Since it isn't used by any other testable image, remove it for now as it
> is prone to bitrot.
> 
> Signed-off-by: John Snow 
> Message-Id: <20190923181140.7235-5-js...@redhat.com>
> Signed-off-by: Alex Bennée 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/docker/Makefile.include|  2 +-
>  tests/docker/dockerfiles/debian-ports.docker | 36 
>  2 files changed, 1 insertion(+), 37 deletions(-)
>  delete mode 100644 tests/docker/dockerfiles/debian-ports.docker
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index fd6f470fbf8..053c418d8cd 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -6,7 +6,7 @@ DOCKER_SUFFIX := .docker
>  DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
>  # we don't run tests on intermediate images (used as base by another image)
>  DOCKER_PARTIAL_IMAGES := debian9 debian10 debian-sid
> -DOCKER_PARTIAL_IMAGES += debian9-mxe debian-ports debian-bootstrap
> +DOCKER_PARTIAL_IMAGES += debian9-mxe debian-bootstrap
>  DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard 
> $(DOCKER_FILES_DIR)/*.docker
>  DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
>  # Use a global constant ccache directory to speed up repetitive builds
> diff --git a/tests/docker/dockerfiles/debian-ports.docker 
> b/tests/docker/dockerfiles/debian-ports.docker
> deleted file mode 100644
> index 61bc3f2993a..000
> --- a/tests/docker/dockerfiles/debian-ports.docker
> +++ /dev/null
> @@ -1,36 +0,0 @@
> -#
> -# Docker multiarch cross-compiler target
> -#
> -# This docker target is builds on Debian Ports cross compiler targets
> -# to build distro with a selection of cross compilers for building test 
> binaries.
> -#
> -# On its own you can't build much but the docker-foo-cross targets
> -# build on top of the base debian image.
> -#
> -FROM debian:unstable
> -
> -MAINTAINER Philippe Mathieu-Daudé 
> -
> -RUN echo "deb [arch=amd64] http://deb.debian.org/debian unstable main" > 
> /etc/apt/sources.list
> -
> -# Duplicate deb line as deb-src
> -RUN cat /etc/apt/sources.list | sed -ne "s/^deb\ \(\[.*\]\ 
> \)\?\(.*\)/deb-src \2/p" >> /etc/apt/sources.list
> -
> -# Setup some basic tools we need
> -RUN apt-get update && \
> -DEBIAN_FRONTEND=noninteractive apt install -yy eatmydata && \
> -DEBIAN_FRONTEND=noninteractive eatmydata \
> -apt-get install -y --no-install-recommends \
> -bison \
> -build-essential \
> -ca-certificates \
> -clang \
> -debian-ports-archive-keyring \
> -flex \
> -gettext \
> -git \
> -pkg-config \
> -psmisc \
> -python \
> -texinfo \
> -$(apt-get -s build-dep qemu | egrep ^Inst | fgrep '[all]' | cut -d\  
> -f2)
> 



Re: [PATCH v3 33/33] tests/docker: remove debian-powerpc-user-cross

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:01 PM, Alex Bennée wrote:
> Despite our attempts in 4d26c7fef4 to keep this going it still gets in
> the way of "make docker-test-build" completing because of course we
> can't build a modern QEMU with the image. Let's put the thing out of
> its misery and remove it.
> 
> People who really care about building on powerpc can still use the
> binfmt_misc support to manually build an image (or just run the build
> from pre this commit).
> 
> Signed-off-by: Alex Bennée 
> Cc: Mark Cave-Ayland 
> ---
>  tests/docker/Makefile.include |  9 
>  .../debian-powerpc-user-cross.docker  | 21 ---
>  2 files changed, 30 deletions(-)
>  delete mode 100644 tests/docker/dockerfiles/debian-powerpc-user-cross.docker
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index 180e5439ef9..dcc37093138 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -149,15 +149,6 @@ DOCKER_PARTIAL_IMAGES += fedora-i386-cross 
> fedora-cris-cross
>  # work around issues with poorly working multi-arch systems and broken
>  # packages.
>  
> -# Jessie is the last supported release for powerpc, but multi-arch is
> -# broken so we need a qemu-linux-user for this target
> -docker-binfmt-image-debian-powerpc-user: DEB_ARCH = powerpc
> -docker-binfmt-image-debian-powerpc-user: DEB_TYPE = jessie
> -docker-binfmt-image-debian-powerpc-user: DEB_URL = 
> http://snapshot.debian.org/archive/debian/20180615T211437Z
> -docker-binfmt-image-debian-powerpc-user: EXECUTABLE = 
> ${BUILD_DIR}/ppc-linux-user/qemu-ppc
> -docker-image-debian-powerpc-user-cross: 
> docker-binfmt-image-debian-powerpc-user
> -DOCKER_USER_IMAGES += debian-powerpc-user
> -
>  # Expand all the pre-requistes for each docker image and test combination
>  $(foreach i,$(filter-out $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES)), \
>   $(foreach t,$(DOCKER_TESTS) $(DOCKER_TOOLS), \
> diff --git a/tests/docker/dockerfiles/debian-powerpc-user-cross.docker 
> b/tests/docker/dockerfiles/debian-powerpc-user-cross.docker
> deleted file mode 100644
> index 83749b0abb8..000
> --- a/tests/docker/dockerfiles/debian-powerpc-user-cross.docker
> +++ /dev/null
> @@ -1,21 +0,0 @@
> -#
> -# Docker powerpc cross-compiler target for QEMU
> -#
> -# We can't use current Debian stable cross-compilers to build powerpc
> -# as it has been dropped as a release architecture. Using Debian Sid
> -# is just far too sketchy a build environment. This leaves us the
> -# final option of using linux-user. This image is based of the
> -# debootstrapped qemu:debian-powerpc-user but doesn't need any extra
> -# magic once it is setup.
> -#
> -# It can be used to build old versions of QEMU, current versions need
> -# newer dependencies than Jessie provides.
> -#
> -FROM qemu:debian-powerpc-user
> -
> -RUN echo man-db man-db/auto-update boolean false | debconf-set-selections
> -RUN apt-get update && \
> -DEBIAN_FRONTEND=noninteractive apt-get build-dep -yy qemu
> -
> -ENV QEMU_CONFIGURE_OPTS --disable-werror
> -ENV DEF_TARGET_LIST powerpc-softmmu,arm-linux-user,aarch64-linux-user
> 

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v3 21/33] tests/tcg: add simple record/replay smoke test for aarch64

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> This adds two new tests that re-use the memory test to check basic
> record replay functionality is still working. We have to define our
> own runners rather than using the default pattern as we want to change
> the test name but re-use the memory binary.
> 
> We declare the test binaries as PHONY as they don't really exist.
> 
> [AJB: A better test would output some sort of timer value or other
> otherwise variable value so we could compare the record and replay
> outputs and ensure they match]
> 
> Signed-off-by: Alex Bennée 
> Cc: Pavel Dovgalyuk 
> ---
>  tests/tcg/aarch64/Makefile.softmmu-target | 21 +
>  1 file changed, 21 insertions(+)
> 
> diff --git a/tests/tcg/aarch64/Makefile.softmmu-target 
> b/tests/tcg/aarch64/Makefile.softmmu-target
> index 4c4aaf61dd3..b4b39579634 100644
> --- a/tests/tcg/aarch64/Makefile.softmmu-target
> +++ b/tests/tcg/aarch64/Makefile.softmmu-target
> @@ -32,3 +32,24 @@ memory: CFLAGS+=-DCHECK_UNALIGNED=1
>  
>  # Running
>  QEMU_OPTS+=-M virt -cpu max -display none -semihosting-config 
> enable=on,target=native,chardev=output -kernel
> +
> +# Simple Record/Replay Test
> +.PHONY: memory-record
> +run-memory-record: memory-record memory
> + $(call run-test, $<, \
> +   $(QEMU) -monitor none -display none \
> +   -chardev file$(COMMA)path=$<.out$(COMMA)id=output \
> +   -icount shift=5$(COMMA)rr=record$(COMMA)rrfile=record.bin \
> +   $(QEMU_OPTS) memory, \
> +   "$< on $(TARGET_NAME)")
> +
> +.PHONY: memory-replay
> +run-memory-replay: memory-replay run-memory-record
> + $(call run-test, $<, \
> +   $(QEMU) -monitor none -display none \
> +   -chardev file$(COMMA)path=$<.out$(COMMA)id=output \
> +   -icount shift=5$(COMMA)rr=replay$(COMMA)rrfile=record.bin \
> +   $(QEMU_OPTS) memory, \
> +   "$< on $(TARGET_NAME)")
> +
> +TESTS+=memory-record memory-replay
> 

Tested-by: Philippe Mathieu-Daudé 



Re: [PATCH v3 19/33] tests/tcg: add float_madds test to multiarch

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> This is a generic floating point multiply and accumulate test for
> single precision floating point values. I've split off the common float
> functions into a helper library so additional tests can use the same
> common code.
> 
> As I don't have references for all architectures I've allowed some
> flexibility for tests to pass without reference files. They can be
> added as we collect them.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> ---
> v2
>   - allow tests to add additional patterns to the list
>   - conditional diff-out
>   - use __builtin_fmaf instead of forcing optimisation
>   - use hex floating point definitions and output
> v3
>   - remove add_const stuff, make explicit tests explicit
>   - various style clean-ups
> ---
>  tests/tcg/Makefile.target   |   9 +
>  tests/tcg/aarch64/float_madds.ref   | 768 
>  tests/tcg/arm/Makefile.target   |   3 +
>  tests/tcg/arm/float_madds.ref   | 768 
>  tests/tcg/multiarch/Makefile.target |  12 +-
>  tests/tcg/multiarch/float_helpers.c | 230 +
>  tests/tcg/multiarch/float_helpers.h |  26 +
>  tests/tcg/multiarch/float_madds.c   | 103 
>  8 files changed, 1918 insertions(+), 1 deletion(-)
>  create mode 100644 tests/tcg/aarch64/float_madds.ref
>  create mode 100644 tests/tcg/arm/float_madds.ref
>  create mode 100644 tests/tcg/multiarch/float_helpers.c
>  create mode 100644 tests/tcg/multiarch/float_helpers.h
>  create mode 100644 tests/tcg/multiarch/float_madds.c
[...]

Tested-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 23/33] docs/devel: add "check-tcg" to testing.rst

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> It was pointed out we haven't documented the check-tcg part of the
> build system. Attempt to rectify that now.
> 
> Signed-off-by: Alex Bennée 
> ---
>  docs/devel/testing.rst | 62 ++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
> index bf75675fb04..1feee3ad101 100644
> --- a/docs/devel/testing.rst
> +++ b/docs/devel/testing.rst
> @@ -266,6 +266,8 @@ another application on the host may have locked the file, 
> possibly leading to a
>  test failure.  If using such devices are explicitly desired, consider adding
>  ``locking=off`` option to disable image locking.
>  
> +.. _docker-ref:
> +
>  Docker based tests
>  ==
>  
> @@ -799,3 +801,63 @@ And remove any package you want with::
>  
>  If you've used ``make check-acceptance``, the Python virtual environment 
> where
>  Avocado is installed will be cleaned up as part of ``make check-clean``.
> +
> +Testing with "make check-tcg"
> +=
> +
> +The check-tcg tests are intended for simple smoke tests of both
> +linux-user and softmmu TCG functionality. However to build test
> +programs for guest targets you need to have cross compilers available.
> +If your distribution supports cross compilers you can do something as
> +simple as::
> +
> +  apt install gcc-aarch64-linux-gnu
> +
> +The configure script will automatically pick up their presence.
> +Sometimes compilers have slightly odd names so the availability of
> +them can be prompted by passing in the appropriate configure option
> +for the architecture in question, for example::
> +
> +  $(configure) --cross-cc-aarch64=aarch64-cc
> +
> +There is also a ``--cross-cc-flags-ARCH`` flag in case additional
> +compiler flags are needed to build for a given target.
> +
> +If you have the ability to run containers as the user you can also
> +take advantage of the build systems "Docker" support. It will then use
> +containers to build any test case for an enabled guest where there is
> +no system compiler available. See :ref: `_docker-ref` for details.

Maybe you can add a line saying there is an easy way to run all tests
for a single target, using 'make run-tcg-tests-$TARGET'?

> +TCG test dependencies
> +-
> +
> +The TCG tests are deliberately very light on dependencies and are
> +either totally bare with minimal gcc lib support (for softmmu tests)
> +or just glibc (for linux-user tests). This is because getting a cross
> +compiler to work with additional libraries can be challenging.
> +
> +Other TCG Tests
> +---
> +
> +There are a number of out-of-tree test suites that are used for more
> +extensive testing of processor features.
> +
> +KVM Unit Tests
> +~~
> +
> +The KVM unit tests are designed to run as a Guest OS under KVM but
> +there is no reason why they can't exercise the TCG as well. It
> +provides a minimal OS kernel with hooks for enabling the MMU as well
> +as reporting test results via a special device::
> +
> +  https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
> +
> +Linux Test Project
> +~~
> +
> +The LTP is focused on exercising the syscall interface of a Linux
> +kernel. It checks that syscalls behave as documented and strives to
> +exercise as many corner cases as possible. It is a useful test suite
> +to run to exercise QEMU's linux-user code::
> +
> +  https://linux-test-project.github.io/
> 



Re: [PATCH v3 18/33] tests/tcg: re-enable linux-test for ppc64abi32

2019-09-25 Thread Philippe Mathieu-Daudé
Hi Alex,

On 9/24/19 11:00 PM, Alex Bennée wrote:
> Now we have fixed the signal delivary bug we can remove this horrible

"delivery"

> hack from the system.
> 
> Cc: Richard Henderson 
> Signed-off-by: Alex Bennée 

Can you reorder this patch so it comes directly after "target/ppc: fix
signal delivery for ppc64abi32"?

> 
> ---
> v2
>   - drop un-needed cflags
> ---
>  tests/tcg/multiarch/Makefile.target | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/tests/tcg/multiarch/Makefile.target 
> b/tests/tcg/multiarch/Makefile.target
> index 6b1e30e2fec..657a04f802d 100644
> --- a/tests/tcg/multiarch/Makefile.target
> +++ b/tests/tcg/multiarch/Makefile.target
> @@ -12,14 +12,6 @@ VPATH  += $(MULTIARCH_SRC)
>  MULTIARCH_SRCS   =$(notdir $(wildcard $(MULTIARCH_SRC)/*.c))
>  MULTIARCH_TESTS  =$(MULTIARCH_SRCS:.c=)
>  
> -# FIXME: ppc64abi32 linux-test seems to have issues but the other basic 
> tests work
> -ifeq ($(TARGET_NAME),ppc64abi32)
> -BROKEN_TESTS = linux-test
> -endif
> -
> -# Update TESTS
> -TESTS+= $(filter-out $(BROKEN_TESTS), $(MULTIARCH_TESTS))
> -
>  #
>  # The following are any additional rules needed to build things
>  #
> @@ -39,3 +31,6 @@ run-test-mmap: test-mmap
>  run-test-mmap-%: test-mmap
>   $(call run-test, test-mmap-$*, $(QEMU) -p $* $<,\
>   "$< ($* byte pages) on $(TARGET_NAME)")
> +
> +# Update TESTS
> +TESTS += $(MULTIARCH_TESTS)
> 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
(via make 'run-tcg-tests-ppc64abi32-linux-user')



Re: [PATCH v3 17/33] tests/tcg: clean-up some comments after the de-tangling

2019-09-25 Thread Philippe Mathieu-Daudé
On Thu, Sep 26, 2019 at 12:07 AM Philippe Mathieu-Daudé
 wrote:
> On 9/24/19 11:00 PM, Alex Bennée wrote:
> > These were missed in the recent de-tangling so have been updated to be
> > more actuate. I've also built up ARM_TESTS in a manner similar to
> > AARCH64_TESTS for better consistency.
> >
> > Signed-off-by: Alex Bennée 
> > Reviewed-by: Peter Maydell 
> > ---
> >  tests/tcg/Makefile.target |  7 +--
> >  tests/tcg/aarch64/Makefile.target |  3 ++-
> >  tests/tcg/arm/Makefile.target | 15 ---
> >  3 files changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/tests/tcg/Makefile.target b/tests/tcg/Makefile.target
> > index 8808beaf74b..679eb56bd37 100644
> > --- a/tests/tcg/Makefile.target
> > +++ b/tests/tcg/Makefile.target
> > @@ -74,8 +74,11 @@ TIMEOUT=15
> >  endif
> >
> >  ifdef CONFIG_USER_ONLY
> > -# The order we include is important. We include multiarch, base arch
> > -# and finally arch if it's not the same as base arch.
> > +# The order we include is important. We include multiarch first and
> > +# then the target. If there are common tests shared between
> > +# sub-targets (e.g. ARM & AArch64) then it is up to
> > +# $(TARGET_NAME)/Makefile.target to include the common parent
> > +# architecture in its VPATH.
> >  -include $(SRC_PATH)/tests/tcg/multiarch/Makefile.target
> >  -include $(SRC_PATH)/tests/tcg/$(TARGET_NAME)/Makefile.target
> >
> > diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
> > index e763dd9da37..9758f89f905 100644
> > --- a/tests/tcg/aarch64/Makefile.target
> > +++ b/tests/tcg/aarch64/Makefile.target
> > @@ -8,7 +8,7 @@ VPATH += $(ARM_SRC)
> >  AARCH64_SRC=$(SRC_PATH)/tests/tcg/aarch64
> >  VPATH+= $(AARCH64_SRC)
> >
> > -# we don't build any other ARM test
> > +# Float-convert Tests
> >  AARCH64_TESTS=fcvt
> >
> >  fcvt: LDFLAGS+=-lm
> > @@ -17,6 +17,7 @@ run-fcvt: fcvt
> >   $(call run-test,$<,$(QEMU) $<, "$< on $(TARGET_NAME)")
> >   $(call diff-out,$<,$(AARCH64_SRC)/fcvt.ref)
> >
> > +# Pauth Tests
> >  AARCH64_TESTS += pauth-1 pauth-2
> >  run-pauth-%: QEMU_OPTS += -cpu max
> >
> > diff --git a/tests/tcg/arm/Makefile.target b/tests/tcg/arm/Makefile.target
> > index aa4e4e3782c..7347d3d0adb 100644
> > --- a/tests/tcg/arm/Makefile.target
> > +++ b/tests/tcg/arm/Makefile.target
> > @@ -8,25 +8,26 @@ ARM_SRC=$(SRC_PATH)/tests/tcg/arm
> >  # Set search path for all sources
> >  VPATH+= $(ARM_SRC)
> >
> > -ARM_TESTS=hello-arm test-arm-iwmmxt
> > -
> > -TESTS += $(ARM_TESTS) fcvt
> > -
> > +# Basic Hello World
> > +ARM_TESTS = hello-arm
> >  hello-arm: CFLAGS+=-marm -ffreestanding
> >  hello-arm: LDFLAGS+=-nostdlib
> >
> > +# IWMXT floating point extensions
> > +ARM_TESTS += test-arm-iwmmxt
> >  test-arm-iwmmxt: CFLAGS+=-marm -march=iwmmxt -mabi=aapcs -mfpu=fpv4-sp-d16
> >  test-arm-iwmmxt: test-arm-iwmmxt.S
> >   $(CC) $(CFLAGS) $< -o $@ $(LDFLAGS)
> >
> > -ifeq ($(TARGET_NAME), arm)
> > +# Float-convert Tests
> > +ARM_TESTS += fcvt
> >  fcvt: LDFLAGS+=-lm
> >  # fcvt: CFLAGS+=-march=armv8.2-a+fp16 -mfpu=neon-fp-armv8
> > -
> >  run-fcvt: fcvt
> >   $(call run-test,fcvt,$(QEMU) $<,"$< on $(TARGET_NAME)")
> >   $(call diff-out,fcvt,$(ARM_SRC)/fcvt.ref)
> > -endif
> > +
> > +TESTS += $(ARM_TESTS)
> >
> >  # On ARM Linux only supports 4k pages
> >  EXTRA_RUNS+=run-test-mmap-4096
> >
>
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
> (via make run-tcg-tests-arm-softmmu)

Err I meant 'make run-tcg-tests-arm-linux-user' ;)



Re: [PATCH v3 17/33] tests/tcg: clean-up some comments after the de-tangling

2019-09-25 Thread Philippe Mathieu-Daudé
On 9/24/19 11:00 PM, Alex Bennée wrote:
> These were missed in the recent de-tangling so have been updated to be
> more accurate. I've also built up ARM_TESTS in a manner similar to
> AARCH64_TESTS for better consistency.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Peter Maydell 
> ---
>  tests/tcg/Makefile.target |  7 +--
>  tests/tcg/aarch64/Makefile.target |  3 ++-
>  tests/tcg/arm/Makefile.target | 15 ---
>  3 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/tests/tcg/Makefile.target b/tests/tcg/Makefile.target
> index 8808beaf74b..679eb56bd37 100644
> --- a/tests/tcg/Makefile.target
> +++ b/tests/tcg/Makefile.target
> @@ -74,8 +74,11 @@ TIMEOUT=15
>  endif
>  
>  ifdef CONFIG_USER_ONLY
> -# The order we include is important. We include multiarch, base arch
> -# and finally arch if it's not the same as base arch.
> +# The order we include is important. We include multiarch first and
> +# then the target. If there are common tests shared between
> +# sub-targets (e.g. ARM & AArch64) then it is up to
> +# $(TARGET_NAME)/Makefile.target to include the common parent
> +# architecture in its VPATH.
>  -include $(SRC_PATH)/tests/tcg/multiarch/Makefile.target
>  -include $(SRC_PATH)/tests/tcg/$(TARGET_NAME)/Makefile.target
>  
> diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
> index e763dd9da37..9758f89f905 100644
> --- a/tests/tcg/aarch64/Makefile.target
> +++ b/tests/tcg/aarch64/Makefile.target
> @@ -8,7 +8,7 @@ VPATH += $(ARM_SRC)
>  AARCH64_SRC=$(SRC_PATH)/tests/tcg/aarch64
>  VPATH+= $(AARCH64_SRC)
>  
> -# we don't build any other ARM test
> +# Float-convert Tests
>  AARCH64_TESTS=fcvt
>  
>  fcvt: LDFLAGS+=-lm
> @@ -17,6 +17,7 @@ run-fcvt: fcvt
>   $(call run-test,$<,$(QEMU) $<, "$< on $(TARGET_NAME)")
>   $(call diff-out,$<,$(AARCH64_SRC)/fcvt.ref)
>  
> +# Pauth Tests
>  AARCH64_TESTS += pauth-1 pauth-2
>  run-pauth-%: QEMU_OPTS += -cpu max
>  
> diff --git a/tests/tcg/arm/Makefile.target b/tests/tcg/arm/Makefile.target
> index aa4e4e3782c..7347d3d0adb 100644
> --- a/tests/tcg/arm/Makefile.target
> +++ b/tests/tcg/arm/Makefile.target
> @@ -8,25 +8,26 @@ ARM_SRC=$(SRC_PATH)/tests/tcg/arm
>  # Set search path for all sources
>  VPATH+= $(ARM_SRC)
>  
> -ARM_TESTS=hello-arm test-arm-iwmmxt
> -
> -TESTS += $(ARM_TESTS) fcvt
> -
> +# Basic Hello World
> +ARM_TESTS = hello-arm
>  hello-arm: CFLAGS+=-marm -ffreestanding
>  hello-arm: LDFLAGS+=-nostdlib
>  
> +# IWMXT floating point extensions
> +ARM_TESTS += test-arm-iwmmxt
>  test-arm-iwmmxt: CFLAGS+=-marm -march=iwmmxt -mabi=aapcs -mfpu=fpv4-sp-d16
>  test-arm-iwmmxt: test-arm-iwmmxt.S
>   $(CC) $(CFLAGS) $< -o $@ $(LDFLAGS)
>  
> -ifeq ($(TARGET_NAME), arm)
> +# Float-convert Tests
> +ARM_TESTS += fcvt
>  fcvt: LDFLAGS+=-lm
>  # fcvt: CFLAGS+=-march=armv8.2-a+fp16 -mfpu=neon-fp-armv8
> -
>  run-fcvt: fcvt
>   $(call run-test,fcvt,$(QEMU) $<,"$< on $(TARGET_NAME)")
>   $(call diff-out,fcvt,$(ARM_SRC)/fcvt.ref)
> -endif
> +
> +TESTS += $(ARM_TESTS)
>  
>  # On ARM Linux only supports 4k pages
>  EXTRA_RUNS+=run-test-mmap-4096
> 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
(via make run-tcg-tests-arm-softmmu)



[PATCH] i386: Add CPUID bit for CLZERO and XSAVEERPTR

2019-09-25 Thread Sebastian Andrzej Siewior
The CPUID bits CLZERO and XSAVEERPTR are available on AMD's Zen platform
and could be passed to the guest.

Signed-off-by: Sebastian Andrzej Siewior 
---

I tweaked the kernel to expose these flags and found that this change is
also needed in order to see those bits in the guest.
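
For illustration, the two new bits can be decoded from a raw EBX value of
CPUID leaf 0x80000008 roughly as sketched below. The bit positions (0 for
CLZERO, 2 for XSAVEERPTR) are the ones used in this patch; the macro and
function names are made up for the example and are not QEMU identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions in CPUID leaf 0x80000008, register EBX, as used by the
 * patch. The EXAMPLE_* names are hypothetical and exist only for this
 * sketch.
 */
#define EXAMPLE_EBX_CLZERO      (1U << 0)  /* CLZERO instruction */
#define EXAMPLE_EBX_XSAVEERPTR  (1U << 2)  /* always save/restore FP error pointers */

/* Returns true if the given EBX value advertises CLZERO. */
static bool example_has_clzero(uint32_t ebx)
{
    return (ebx & EXAMPLE_EBX_CLZERO) != 0;
}

/* Returns true if the given EBX value advertises XSAVEERPTR. */
static bool example_has_xsaveerptr(uint32_t ebx)
{
    return (ebx & EXAMPLE_EBX_XSAVEERPTR) != 0;
}
```

The helpers take the register value as a parameter so that the sketch does
not depend on any particular way of executing CPUID.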

 target/i386/cpu.c | 2 +-
 target/i386/cpu.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fbed2eb804e32..e00ef3c917391 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1113,7 +1113,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 [FEAT_8000_0008_EBX] = {
 .type = CPUID_FEATURE_WORD,
 .feat_names = {
-NULL, NULL, NULL, NULL,
+"clzero", NULL, "xsaveerptr", NULL,
 NULL, NULL, NULL, NULL,
 NULL, "wbnoinvd", NULL, NULL,
 "ibpb", NULL, NULL, NULL,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 0732e059ec989..cc475c703fc4d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -689,6 +689,8 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EDX_ARCH_CAPABILITIES (1U << 29)  /*Arch Capabilities*/
 #define CPUID_7_0_EDX_SPEC_CTRL_SSBD  (1U << 31) /* Speculative Store Bypass Disable */
 
+#define CPUID_8000_0008_EBX_CLZERO      (1U << 0) /* CLZERO instruction */
+#define CPUID_8000_0008_EBX_XSAVEERPTR  (1U << 2) /* Always save/restore FP error pointers */
 #define CPUID_8000_0008_EBX_WBNOINVD  (1U << 9)  /* Write back and do not invalidate cache */
 #define CPUID_8000_0008_EBX_IBPB      (1U << 12) /* Indirect Branch Prediction Barrier */
-- 
2.23.0




[PATCH v2 11/13] qcrypto-luks: simplify the math used for keyslot locations

2019-09-25 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 63 -
 1 file changed, 40 insertions(+), 23 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 6d4e9eb348..a53d5d1916 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -409,6 +409,30 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
 }
 }
 
+/*
+ * Returns number of sectors needed to store the key material
+ * given number of anti forensic stripes
+ */
+static int
+qcrypto_block_luks_splitkeylen_sectors(const QCryptoBlockLUKS *luks,
+   unsigned int header_sectors,
+   unsigned int stripes)
+{
+/*
+ * This calculation doesn't match that shown in the spec,
+ * but instead follows the cryptsetup implementation.
+ */
+
+size_t splitkeylen = luks->header.master_key_len * stripes;
+
+/* First align the key material size to block size*/
+size_t splitkeylen_sectors =
+DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE);
+
+/* Then also align the key material size to the size of the header */
+return ROUND_UP(splitkeylen_sectors, header_sectors);
+}
+
 /*
  * Stores the main LUKS header, taking care of endianess
  */
@@ -1114,7 +1138,8 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 QCryptoBlockCreateOptionsLUKS luks_opts;
 Error *local_err = NULL;
 g_autofree uint8_t *masterkey = NULL;
-size_t splitkeylen = 0;
+size_t header_sectors;
+size_t split_key_sectors;
 size_t i;
 g_autofree char *password = NULL;
 const char *cipher_alg;
@@ -1333,37 +1358,29 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 goto error;
 }
 
+/* start with the sector that follows the header*/
+header_sectors = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
+QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+
+split_key_sectors =
+qcrypto_block_luks_splitkeylen_sectors(luks,
+   header_sectors,
+   QCRYPTO_BLOCK_LUKS_STRIPES);
 
-/* Although LUKS has multiple key slots, we're just going
- * to use the first key slot */
-splitkeylen = luks->header.master_key_len * QCRYPTO_BLOCK_LUKS_STRIPES;
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
-luks->header.key_slots[i].active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
-luks->header.key_slots[i].stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
+QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[i];
+slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
 
-/* This calculation doesn't match that shown in the spec,
- * but instead follows the cryptsetup implementation.
- */
-luks->header.key_slots[i].key_offset_sector =
-(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
- QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
-(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
-  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
-   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) * i);
+slot->key_offset_sector = header_sectors + i * split_key_sectors;
+slot->stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
 }
 
-
 /* The total size of the LUKS headers is the partition header + key
  * slot headers, rounded up to the nearest sector, combined with
  * the size of each master key material region, also rounded up
  * to the nearest sector */
-luks->header.payload_offset_sector =
-(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
- QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
-(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
-  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
-   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) *
- QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
+luks->header.payload_offset_sector = header_sectors +
+QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS * split_key_sectors;
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset = luks->header.payload_offset_sector *
-- 
2.17.2
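
The keyslot layout computed by the patch above can be sketched as a small
worked example. The constants (512-byte sectors, a 4096-byte header area,
4000 stripes, 8 key slots) are assumptions standing in for the
QCRYPTO_BLOCK_LUKS_* macros, and all ex_*/EX_* names are hypothetical.

```c
#include <stdint.h>

/* Assumed LUKS1 constants; hypothetical stand-ins for QCRYPTO_BLOCK_LUKS_*. */
#define EX_SECTOR_SIZE      512u   /* bytes per sector */
#define EX_KEY_SLOT_OFFSET  4096u  /* bytes reserved for the header */
#define EX_STRIPES          4000u  /* anti-forensic stripes */
#define EX_NUM_KEY_SLOTS    8u

#define EX_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define EX_ROUND_UP(n, d)     (EX_DIV_ROUND_UP(n, d) * (d))

/* Sectors occupied by one key slot's split key material. */
static uint32_t ex_splitkey_sectors(uint32_t master_key_len,
                                    uint32_t header_sectors,
                                    uint32_t stripes)
{
    uint64_t splitkeylen = (uint64_t)master_key_len * stripes;

    /* First round the key material up to whole sectors ... */
    uint64_t sectors = EX_DIV_ROUND_UP(splitkeylen, EX_SECTOR_SIZE);

    /* ... then align it to the size of the header, as the patch does. */
    return (uint32_t)EX_ROUND_UP(sectors, header_sectors);
}

/* First sector of key slot number `slot`. */
static uint32_t ex_key_offset_sector(uint32_t master_key_len, unsigned slot)
{
    uint32_t header_sectors = EX_KEY_SLOT_OFFSET / EX_SECTOR_SIZE;

    return header_sectors +
           slot * ex_splitkey_sectors(master_key_len, header_sectors,
                                      EX_STRIPES);
}

/* First payload sector: the slot offset one past the last key slot. */
static uint32_t ex_payload_offset_sector(uint32_t master_key_len)
{
    return ex_key_offset_sector(master_key_len, EX_NUM_KEY_SLOTS);
}
```

Under those assumed constants, a 32-byte master key gives 128000 bytes of
split key material, which is 250 sectors, aligned up to 256. Slot i then
starts at sector 8 + 256 * i and the payload starts at sector 2056.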




[PATCH v2 09/13] qcrypto-luks: extract check and parse header

2019-09-25 Thread Maxim Levitsky
This is just to make qcrypto_block_luks_open more
reasonable in size.
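
The cipher-mode string that the extracted parser handles has the form
mode-ivgen[:ivhash], for example cbc-essiv:sha256 or cbc-plain64. A
standalone sketch of that split is shown below; it copies into fixed
buffers instead of modifying a duplicated string in place as the patch
does, and the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three parts of a cipher-mode spec. */
struct ex_mode_spec {
    char mode[32];
    char ivgen[32];
    char ivhash[32];  /* empty string when no hash is given */
};

/*
 * Split "mode-ivgen[:ivhash]" into its parts.
 * Returns 0 on success, -1 if the '-' separator is missing or a part
 * does not fit into its buffer.
 */
static int ex_parse_mode_spec(const char *spec, struct ex_mode_spec *out)
{
    const char *dash = strchr(spec, '-');
    const char *colon;
    size_t mode_len, ivgen_len;

    if (!dash) {
        return -1;
    }
    mode_len = (size_t)(dash - spec);

    /* The optional hash name follows the first ':' after the ivgen name. */
    colon = strchr(dash + 1, ':');
    ivgen_len = colon ? (size_t)(colon - (dash + 1)) : strlen(dash + 1);

    if (mode_len >= sizeof(out->mode) || ivgen_len >= sizeof(out->ivgen)) {
        return -1;
    }
    if (colon && strlen(colon + 1) >= sizeof(out->ivhash)) {
        return -1;
    }

    memcpy(out->mode, spec, mode_len);
    out->mode[mode_len] = '\0';
    memcpy(out->ivgen, dash + 1, ivgen_len);
    out->ivgen[ivgen_len] = '\0';
    strcpy(out->ivhash, colon ? colon + 1 : "");
    return 0;
}
```

As in the patch, a missing hash name is not an error at this stage; whether
one is required depends on the IV generator (ESSIV needs it, plain and
plain64 ignore it).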

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 223 +---
 1 file changed, 125 insertions(+), 98 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 47371edf13..fa799fd21d 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -500,6 +500,129 @@ qcrypto_block_luks_load_header(QCryptoBlock *block,
 return 0;
 }
 
+/*
+ * Does basic sanity checks on the LUKS header
+ */
+static int
+qcrypto_block_luks_check_header(const QCryptoBlockLUKS *luks, Error **errp)
+{
+if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
+   QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
+error_setg(errp, "Volume is not in LUKS format");
+return -1;
+}
+
+if (luks->header.version != QCRYPTO_BLOCK_LUKS_VERSION) {
+error_setg(errp, "LUKS version %" PRIu32 " is not supported",
+   luks->header.version);
+return -1;
+}
+return 0;
+}
+
+/*
+ * Parses the crypto parameters that are stored in the LUKS header
+ */
+
+static int
+qcrypto_block_luks_parse_header(QCryptoBlockLUKS *luks, Error **errp)
+{
+g_autofree char *cipher_mode = g_strdup(luks->header.cipher_mode);
+char *ivgen_name, *ivhash_name;
+Error *local_err = NULL;
+
+/*
+ * The cipher_mode header contains a string that we have
+ * to further parse, of the format
+ *
+ *    <cipher-mode>-<iv-generator>[:<iv-hash>]
+ *
+ * eg  cbc-essiv:sha256, cbc-plain64
+ */
+ivgen_name = strchr(cipher_mode, '-');
+if (!ivgen_name) {
+error_setg(errp, "Unexpected cipher mode string format %s",
+   luks->header.cipher_mode);
+return -1;
+}
+*ivgen_name = '\0';
+ivgen_name++;
+
+ivhash_name = strchr(ivgen_name, ':');
+if (!ivhash_name) {
+luks->ivgen_hash_alg = 0;
+} else {
+*ivhash_name = '\0';
+ivhash_name++;
+
+luks->ivgen_hash_alg = qcrypto_block_luks_hash_name_lookup(ivhash_name,
+   &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+}
+
+luks->cipher_mode = qcrypto_block_luks_cipher_mode_lookup(cipher_mode,
+  &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+
+luks->cipher_alg =
+qcrypto_block_luks_cipher_name_lookup(luks->header.cipher_name,
+  luks->cipher_mode,
+  luks->header.master_key_len,
+  &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+
+luks->hash_alg =
+qcrypto_block_luks_hash_name_lookup(luks->header.hash_spec,
+&local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+
+luks->ivgen_alg = qcrypto_block_luks_ivgen_name_lookup(ivgen_name,
+   &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+
+if (luks->ivgen_alg == QCRYPTO_IVGEN_ALG_ESSIV) {
+if (!ivhash_name) {
+error_setg(errp, "Missing IV generator hash specification");
+return -1;
+}
+luks->ivgen_cipher_alg =
+qcrypto_block_luks_essiv_cipher(luks->cipher_alg,
+luks->ivgen_hash_alg,
+&local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+} else {
+
+/*
+ * Note we parsed the ivhash_name earlier in the cipher_mode
+ * spec string even with plain/plain64 ivgens, but we
+ * will ignore it, since it is irrelevant for these ivgens.
+ * This is for compat with dm-crypt which will silently
+ * ignore hash names with these ivgens rather than report
+ * an error about the invalid usage
+ */
+luks->ivgen_cipher_alg = luks->cipher_alg;
+}
+return 0;
+}
+
 /*
  * Given a key slot, and user password, this will attempt to unlock
  * the master encryption key from the key slot.
@@ -712,11 +835,8 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 Error **errp)
 {
 QCryptoBlockLUKS *luks = NULL;
-Error *local_err = NULL;
 g_autofree uint8_t *masterkey = NULL;
-char *ivgen_name, *ivhash_name;
 g_autofree char *password = NULL;
-g_autofree char *cipher_mode = NULL;
 
 if (!(flags & QCRYPTO_BLOCK_OPEN_NO_IO)) {
 if (!options->u.luks.key_secret) {
@@ -738,107 

[PATCH v2 13/13] LUKS: better error message when creating too large files

2019-09-25 Thread Maxim Levitsky
Currently, if you attempt to create a file that is too large with LUKS, you
get the following error message:

Formatting 'test.luks', fmt=luks size=17592186044416 key-secret=sec0
qemu-img: test.luks: Could not resize file: File too large

While for the raw format the error message is:
qemu-img: test.img: The image size is too large for file format 'raw'


The reason for this is that qemu-img checks the errno of the failure
and presents the latter error when it is -EFBIG.

However, the generic crypto code 'swallows' the errno and replaces it
with -EIO.

As an attempt to improve this, we can make the LUKS driver
detect -EFBIG and in that case present a better error message,
which is what this patch does.

The new error message is:

qemu-img: error creating test.luks: The requested file size is too large
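
The message selection described above boils down to a check on the negative
errno value. A minimal sketch, assuming a hypothetical helper name and a
fallback message supplied by the lower layer (this is not the QEMU Error
API):

```c
#include <errno.h>

/*
 * Hypothetical helper: choose the message to report for a failed create.
 * `ret` is a negative errno value; `fallback` is the message that the
 * lower layer already produced.
 */
static const char *ex_create_error_message(int ret, const char *fallback)
{
    if (ret == -EFBIG) {
        /* Replace the generic message with a more specific one. */
        return "The requested file size is too large";
    }
    return fallback;
}
```

Any other error code keeps the message from the lower layer, which matches
the error_propagate() branch in the patch.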

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534898
Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 6e822c6e50..19c2ac602c 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -102,10 +102,12 @@ static ssize_t block_crypto_create_init_func(QCryptoBlock *block,
   Error **errp)
 {
 struct BlockCryptoCreateData *data = opaque;
+Error *local_error = NULL;
+int ret;
 
 if (data->size > INT64_MAX || headerlen > INT64_MAX - data->size) {
-error_setg(errp, "The requested file size is too large");
-return -EFBIG;
+ret = -EFBIG;
+goto error;
 }
 
 /*
@@ -115,6 +117,21 @@ static ssize_t block_crypto_create_init_func(QCryptoBlock *block,
  */
 return blk_truncate(data->blk, data->size + headerlen, data->prealloc,
 errp);
+
+if (ret >= 0) {
+return ret;
+}
+
+error:
+if (ret == -EFBIG) {
+/* Replace the error message with a better one */
+error_free(local_error);
+error_setg(errp, "The requested file size is too large");
+} else {
+error_propagate(errp, local_error);
+}
+
+return ret;
 }
 
 
-- 
2.17.2




[PATCH v2 07/13] qcrypto-luks: purge unused error codes from open callback

2019-09-25 Thread Maxim Levitsky
These values are not used by generic crypto code anyway

Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 45 +
 1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index f3bfc921b2..b8f9b9c20a 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -622,9 +622,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 {
 QCryptoBlockLUKS *luks = NULL;
 Error *local_err = NULL;
-int ret = 0;
 size_t i;
-ssize_t rv;
 g_autofree uint8_t *masterkey = NULL;
 char *ivgen_name, *ivhash_name;
 g_autofree char *password = NULL;
@@ -648,13 +646,11 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 
 /* Read the entire LUKS header, minus the key material from
  * the underlying device */
-rv = readfunc(block, 0,
-  (uint8_t *)&luks->header,
-  sizeof(luks->header),
-  opaque,
-  errp);
-if (rv < 0) {
-ret = rv;
+if (readfunc(block, 0,
+ (uint8_t *)&luks->header,
+ sizeof(luks->header),
+ opaque,
+ errp) < 0) {
 goto fail;
 }
 
@@ -675,13 +671,11 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
 error_setg(errp, "Volume is not in LUKS format");
-ret = -EINVAL;
 goto fail;
 }
 if (luks->header.version != QCRYPTO_BLOCK_LUKS_VERSION) {
 error_setg(errp, "LUKS version %" PRIu32 " is not supported",
luks->header.version);
-ret = -ENOTSUP;
 goto fail;
 }
 
@@ -697,7 +691,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
  */
 ivgen_name = strchr(cipher_mode, '-');
 if (!ivgen_name) {
-ret = -EINVAL;
 error_setg(errp, "Unexpected cipher mode string format %s",
cipher_mode);
 goto fail;
@@ -715,7 +708,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 luks->ivgen_hash_alg = qcrypto_block_luks_hash_name_lookup(ivhash_name,
 &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
@@ -724,7 +716,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 luks->cipher_mode = qcrypto_block_luks_cipher_mode_lookup(cipher_mode,
   &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
@@ -735,7 +726,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
   luks->header.master_key_len,
   &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
@@ -744,7 +734,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 qcrypto_block_luks_hash_name_lookup(luks->header.hash_spec,
 &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
@@ -752,14 +741,12 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 luks->ivgen_alg = qcrypto_block_luks_ivgen_name_lookup(ivgen_name,
 &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
 
 if (luks->ivgen_alg == QCRYPTO_IVGEN_ALG_ESSIV) {
 if (!ivhash_name) {
-ret = -EINVAL;
 error_setg(errp, "Missing IV generator hash specification");
 goto fail;
 }
@@ -768,7 +755,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 luks->ivgen_hash_alg,
 &local_err);
 if (local_err) {
-ret = -ENOTSUP;
 error_propagate(errp, local_err);
 goto fail;
 }
@@ -795,7 +781,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 masterkey,
 readfunc, opaque,
 errp) < 0) {
-ret = -EACCES;
 goto fail;
 }
 
@@ -813,19 +798,16 @@ qcrypto_block_luks_open(QCryptoBlock *block,
  luks->header.master_key_len,
  errp);
 if (!block->ivgen) {
-ret = -ENOTSUP;
 goto fail;
 }
 
-ret = qcrypto_block_init_cipher(block,
-luks->cipher_alg,
-luks->cipher_mode,
-   

[PATCH v2 08/13] qcrypto-luks: extract store and load header

2019-09-25 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 155 ++--
 1 file changed, 93 insertions(+), 62 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index b8f9b9c20a..47371edf13 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -409,6 +409,97 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
 }
 }
 
+/*
+ * Stores the main LUKS header, taking care of endianess
+ */
+static int
+qcrypto_block_luks_store_header(QCryptoBlock *block,
+QCryptoBlockWriteFunc writefunc,
+void *opaque,
+Error **errp)
+{
+const QCryptoBlockLUKS *luks = block->opaque;
+Error *local_err = NULL;
+size_t i;
+g_autofree QCryptoBlockLUKSHeader *hdr_copy = NULL;
+
+/* Create a copy of the header */
+hdr_copy = g_new0(QCryptoBlockLUKSHeader, 1);
+memcpy(hdr_copy, &luks->header, sizeof(QCryptoBlockLUKSHeader));
+
+/*
+ * Everything on disk uses Big Endian (tm), so flip header fields
+ * before writing them
+ */
+cpu_to_be16s(&hdr_copy->version);
+cpu_to_be32s(&hdr_copy->payload_offset_sector);
+cpu_to_be32s(&hdr_copy->master_key_len);
+cpu_to_be32s(&hdr_copy->master_key_iterations);
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+cpu_to_be32s(&hdr_copy->key_slots[i].active);
+cpu_to_be32s(&hdr_copy->key_slots[i].iterations);
+cpu_to_be32s(&hdr_copy->key_slots[i].key_offset_sector);
+cpu_to_be32s(&hdr_copy->key_slots[i].stripes);
+}
+
+/* Write out the partition header and key slot headers */
+writefunc(block, 0, (const uint8_t *)hdr_copy, sizeof(*hdr_copy),
+  opaque, &local_err);
+
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+return 0;
+}
+
+/*
+ * Loads the main LUKS header,and byteswaps it to native endianess
+ * And run basic sanity checks on it
+ */
+static int
+qcrypto_block_luks_load_header(QCryptoBlock *block,
+QCryptoBlockReadFunc readfunc,
+void *opaque,
+Error **errp)
+{
+ssize_t rv;
+size_t i;
+QCryptoBlockLUKS *luks = block->opaque;
+
+/*
+ * Read the entire LUKS header, minus the key material from
+ * the underlying device
+ */
+rv = readfunc(block, 0,
+  (uint8_t *)&luks->header,
+  sizeof(luks->header),
+  opaque,
+  errp);
+if (rv < 0) {
+return rv;
+}
+
+/*
+ * The header is always stored in big-endian format, so
+ * convert everything to native
+ */
+be16_to_cpus(&luks->header.version);
+be32_to_cpus(&luks->header.payload_offset_sector);
+be32_to_cpus(&luks->header.master_key_len);
+be32_to_cpus(&luks->header.master_key_iterations);
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+be32_to_cpus(&luks->header.key_slots[i].active);
+be32_to_cpus(&luks->header.key_slots[i].iterations);
+be32_to_cpus(&luks->header.key_slots[i].key_offset_sector);
+be32_to_cpus(&luks->header.key_slots[i].stripes);
+}
+
+return 0;
+}
+
 /*
  * Given a key slot, and user password, this will attempt to unlock
  * the master encryption key from the key slot.
@@ -622,7 +713,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 {
 QCryptoBlockLUKS *luks = NULL;
 Error *local_err = NULL;
-size_t i;
 g_autofree uint8_t *masterkey = NULL;
 char *ivgen_name, *ivhash_name;
 g_autofree char *password = NULL;
@@ -644,30 +734,10 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 luks = g_new0(QCryptoBlockLUKS, 1);
 block->opaque = luks;
 
-/* Read the entire LUKS header, minus the key material from
- * the underlying device */
-if (readfunc(block, 0,
- (uint8_t *)&luks->header,
- sizeof(luks->header),
- opaque,
- errp) < 0) {
+if (qcrypto_block_luks_load_header(block, readfunc, opaque, errp) < 0) {
 goto fail;
 }
 
-/* The header is always stored in big-endian format, so
- * convert everything to native */
-be16_to_cpus(&luks->header.version);
-be32_to_cpus(&luks->header.payload_offset_sector);
-be32_to_cpus(&luks->header.master_key_len);
-be32_to_cpus(&luks->header.master_key_iterations);
-
-for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
-be32_to_cpus(&luks->header.key_slots[i].active);
-be32_to_cpus(&luks->header.key_slots[i].iterations);
-be32_to_cpus(&luks->header.key_slots[i].key_offset_sector);
-be32_to_cpus(&luks->header.key_slots[i].stripes);
-}
-
 if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
 error_setg(errp, "Volume is not in LUKS format");
@@ -1216,46 +1286,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 goto error;
 }

[PATCH v2 10/13] qcrypto-luks: extract store key function

2019-09-25 Thread Maxim Levitsky
This function will be used later to store
new keys to the luks metadata
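
One step inside the extracted function scales the PBKDF2 iteration count,
measured for one second of compute time, to the requested iteration time in
milliseconds, with overflow checks and a lower bound. A standalone sketch
of that step follows; the function name is hypothetical, the floor of 1000
iterations is an assumed value standing in for
QCRYPTO_BLOCK_LUKS_MIN_SLOT_KEY_ITERS, and the guard against a zero
iteration time is an addition that the patch does not have.

```c
#include <stdint.h>

/* Assumed floor; stands in for QCRYPTO_BLOCK_LUKS_MIN_SLOT_KEY_ITERS. */
#define EX_MIN_SLOT_KEY_ITERS 1000u

/*
 * Scale a one-second iteration count to `iter_time_ms` milliseconds.
 * Returns 0 and stores the result in *out on success, or -1 if the
 * computation would overflow or the result does not fit in 32 bits.
 */
static int ex_scale_iterations(uint64_t iters_per_sec,
                               uint64_t iter_time_ms,
                               uint32_t *out)
{
    uint64_t iters;

    /* Extra guard (not in the patch): avoid dividing by zero below. */
    if (iter_time_ms == 0) {
        return -1;
    }
    /* Reject inputs whose product would overflow 64 bits. */
    if (iters_per_sec > UINT64_MAX / iter_time_ms) {
        return -1;
    }

    /* The time is in milliseconds, the count was measured per second. */
    iters = iters_per_sec * iter_time_ms / 1000;

    /* The on-disk field is 32 bits wide. */
    if (iters > UINT32_MAX) {
        return -1;
    }

    *out = iters < EX_MIN_SLOT_KEY_ITERS ? EX_MIN_SLOT_KEY_ITERS
                                         : (uint32_t)iters;
    return 0;
}
```

For example, 500000 iterations per second and an iteration time of 2000 ms
yield 1000000 iterations, while very small results are raised to the floor.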

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 304 ++--
 1 file changed, 181 insertions(+), 123 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index fa799fd21d..6d4e9eb348 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -623,6 +623,176 @@ qcrypto_block_luks_parse_header(QCryptoBlockLUKS *luks, Error **errp)
 return 0;
 }
 
+/*
+ * Given a key slot,  user password, and the master key,
+ * will store the encrypted master key there, and update the
+ * in-memory header. User must then write the in-memory header
+ *
+ * Returns:
+ *0 if the keyslot was written successfully
+ *  with the provided password
+ *   -1 if a fatal error occurred while storing the key
+ */
+static int
+qcrypto_block_luks_store_key(QCryptoBlock *block,
+ unsigned int slot_idx,
+ const char *password,
+ uint8_t *masterkey,
+ uint64_t iter_time,
+ QCryptoBlockWriteFunc writefunc,
+ void *opaque,
+ Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
+g_autofree uint8_t *splitkey = NULL;
+size_t splitkeylen;
+g_autofree uint8_t *slotkey = NULL;
+g_autoptr(QCryptoCipher) cipher = NULL;
+g_autoptr(QCryptoIVGen) ivgen = NULL;
+Error *local_err = NULL;
+uint64_t iters;
+int ret = -1;
+
+if (qcrypto_random_bytes(slot->salt,
+ QCRYPTO_BLOCK_LUKS_SALT_LEN,
+ errp) < 0) {
+goto cleanup;
+}
+
+splitkeylen = luks->header.master_key_len * slot->stripes;
+
+/*
+ * Determine how many iterations are required to
+ * hash the user password while consuming 1 second of compute
+ * time
+ */
+iters = qcrypto_pbkdf2_count_iters(luks->hash_alg,
+   (uint8_t *)password, strlen(password),
+   slot->salt,
+   QCRYPTO_BLOCK_LUKS_SALT_LEN,
+   luks->header.master_key_len,
+   &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto cleanup;
+}
+
+if (iters > (ULLONG_MAX / iter_time)) {
+error_setg_errno(errp, ERANGE,
+ "PBKDF iterations %llu too large to scale",
+ (unsigned long long)iters);
+goto cleanup;
+}
+
+/* iter_time was in millis, but count_iters reported for secs */
+iters = iters * iter_time / 1000;
+
+if (iters > UINT32_MAX) {
+error_setg_errno(errp, ERANGE,
+ "PBKDF iterations %llu larger than %u",
+ (unsigned long long)iters, UINT32_MAX);
+goto cleanup;
+}
+
+slot->iterations =
+MAX(iters, QCRYPTO_BLOCK_LUKS_MIN_SLOT_KEY_ITERS);
+
+
+/*
+ * Generate a key that we'll use to encrypt the master
+ * key, from the user's password
+ */
+slotkey = g_new0(uint8_t, luks->header.master_key_len);
+if (qcrypto_pbkdf2(luks->hash_alg,
+   (uint8_t *)password, strlen(password),
+   slot->salt,
+   QCRYPTO_BLOCK_LUKS_SALT_LEN,
+   slot->iterations,
+   slotkey, luks->header.master_key_len,
+   errp) < 0) {
+goto cleanup;
+}
+
+
+/*
+ * Setup the encryption objects needed to encrypt the
+ * master key material
+ */
+cipher = qcrypto_cipher_new(luks->cipher_alg,
+luks->cipher_mode,
+slotkey, luks->header.master_key_len,
+errp);
+if (!cipher) {
+goto cleanup;
+}
+
+ivgen = qcrypto_ivgen_new(luks->ivgen_alg,
+  luks->ivgen_cipher_alg,
+  luks->ivgen_hash_alg,
+  slotkey, luks->header.master_key_len,
+  errp);
+if (!ivgen) {
+goto cleanup;
+}
+
+/*
+ * Before storing the master key, we need to vastly
+ * increase its size, as protection against forensic
+ * disk data recovery
+ */
+splitkey = g_new0(uint8_t, splitkeylen);
+
+if (qcrypto_afsplit_encode(luks->hash_alg,
+   luks->header.master_key_len,
+   slot->stripes,
+   masterkey,
+   splitkey,
+   errp) < 0) {
+goto cleanup;
+}
+
+/*
+ * Now we encrypt 

[PATCH v2 05/13] qcrypto-luks: pass keyslot index rather than pointer to the keyslot

2019-09-25 Thread Maxim Levitsky
Another minor refactoring

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 9e59a791a6..b759cc8d19 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -410,7 +410,7 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
  */
 static int
 qcrypto_block_luks_load_key(QCryptoBlock *block,
-QCryptoBlockLUKSKeySlot *slot,
+size_t slot_idx,
 const char *password,
 QCryptoCipherAlgorithm cipheralg,
 QCryptoCipherMode ciphermode,
@@ -424,6 +424,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 Error **errp)
 {
 QCryptoBlockLUKS *luks = block->opaque;
+const QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
 g_autofree uint8_t *splitkey = NULL;
 size_t splitkeylen;
 g_autofree uint8_t *possiblekey = NULL;
@@ -580,13 +581,12 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 void *opaque,
 Error **errp)
 {
-QCryptoBlockLUKS *luks = block->opaque;
 size_t i;
 int rv;
 
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 rv = qcrypto_block_luks_load_key(block,
- &luks->header.key_slots[i],
+ i,
  password,
  cipheralg,
  ciphermode,
-- 
2.17.2




[PATCH v2 06/13] qcrypto-luks: use the parsed encryption settings in QCryptoBlockLUKS

2019-09-25 Thread Maxim Levitsky
Prior to this patch, the parsed encryption settings
were already stored in QCryptoBlockLUKS but were not
used anywhere except in qcrypto_block_luks_get_info.

Using them simplifies the code.

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 169 +---
 1 file changed, 79 insertions(+), 90 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index b759cc8d19..f3bfc921b2 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -199,13 +199,25 @@ QEMU_BUILD_BUG_ON(sizeof(struct QCryptoBlockLUKSHeader) != 592);
 struct QCryptoBlockLUKS {
 QCryptoBlockLUKSHeader header;
 
-/* Cache parsed versions of what's in header fields,
- * as we can't rely on QCryptoBlock.cipher being
- * non-NULL */
+/* Main encryption algorithm used for encryption*/
 QCryptoCipherAlgorithm cipher_alg;
+
+/* Mode of encryption for the selected encryption algorithm */
 QCryptoCipherMode cipher_mode;
+
+/* Initialization vector generation algorithm */
 QCryptoIVGenAlgorithm ivgen_alg;
+
+/* Hash algorithm used for IV generation*/
 QCryptoHashAlgorithm ivgen_hash_alg;
+
+/*
+ * Encryption algorithm used for IV generation.
+ * Usually the same as main encryption algorithm
+ */
+QCryptoCipherAlgorithm ivgen_cipher_alg;
+
+/* Hash algorithm used in pbkdf2 function */
 QCryptoHashAlgorithm hash_alg;
 };
 
@@ -412,12 +424,6 @@ static int
 qcrypto_block_luks_load_key(QCryptoBlock *block,
 size_t slot_idx,
 const char *password,
-QCryptoCipherAlgorithm cipheralg,
-QCryptoCipherMode ciphermode,
-QCryptoHashAlgorithm hash,
-QCryptoIVGenAlgorithm ivalg,
-QCryptoCipherAlgorithm ivcipheralg,
-QCryptoHashAlgorithm ivhash,
 uint8_t *masterkey,
 QCryptoBlockReadFunc readfunc,
 void *opaque,
@@ -449,7 +455,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * the key is correct and validate the results of
  * decryption later.
  */
-if (qcrypto_pbkdf2(hash,
+if (qcrypto_pbkdf2(luks->hash_alg,
(const uint8_t *)password, strlen(password),
slot->salt, QCRYPTO_BLOCK_LUKS_SALT_LEN,
slot->iterations,
@@ -477,19 +483,23 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 
 /* Setup the cipher/ivgen that we'll use to try to decrypt
  * the split master key material */
-cipher = qcrypto_cipher_new(cipheralg, ciphermode,
-possiblekey, luks->header.master_key_len,
+cipher = qcrypto_cipher_new(luks->cipher_alg,
+luks->cipher_mode,
+possiblekey,
+luks->header.master_key_len,
 errp);
 if (!cipher) {
 return -1;
 }
 
-niv = qcrypto_cipher_get_iv_len(cipheralg,
-ciphermode);
-ivgen = qcrypto_ivgen_new(ivalg,
-  ivcipheralg,
-  ivhash,
-  possiblekey, luks->header.master_key_len,
+niv = qcrypto_cipher_get_iv_len(luks->cipher_alg,
+luks->cipher_mode);
+
+ivgen = qcrypto_ivgen_new(luks->ivgen_alg,
+  luks->ivgen_cipher_alg,
+  luks->ivgen_hash_alg,
+  possiblekey,
+  luks->header.master_key_len,
   errp);
 if (!ivgen) {
 return -1;
@@ -518,7 +528,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * Now we've decrypted the split master key, join
  * it back together to get the actual master key.
  */
-if (qcrypto_afsplit_decode(hash,
+if (qcrypto_afsplit_decode(luks->hash_alg,
luks->header.master_key_len,
slot->stripes,
splitkey,
@@ -536,7 +546,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * then comparing that to the hash stored in the key slot
  * header
  */
-if (qcrypto_pbkdf2(hash,
+if (qcrypto_pbkdf2(luks->hash_alg,
masterkey,
luks->header.master_key_len,
luks->header.master_key_salt,
@@ -570,12 +580,6 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 static int
 qcrypto_block_luks_find_key(QCryptoBlock *block,
 const char *password,
-QCryptoCipherAlgorithm cipheralg,
-QCryptoCipherMode 

[PATCH v2 12/13] qcrypto-luks: more rigorous header checking

2019-09-25 Thread Maxim Levitsky
Check that keyslots don't overlap with the data,
and check that keyslots don't overlap with each other.
(This is done using naive O(n^2) nested loops,
but since there are just 8 keyslots, this doesn't really matter.)

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 52 +
 1 file changed, 52 insertions(+)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index a53d5d1916..4861db810c 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -530,6 +530,11 @@ qcrypto_block_luks_load_header(QCryptoBlock *block,
 static int
 qcrypto_block_luks_check_header(const QCryptoBlockLUKS *luks, Error **errp)
 {
+size_t i, j;
+
+unsigned int header_sectors = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
+QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+
 if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
 error_setg(errp, "Volume is not in LUKS format");
@@ -541,6 +546,53 @@ qcrypto_block_luks_check_header(const QCryptoBlockLUKS *luks, Error **errp)
luks->header.version);
 return -1;
 }
+
+/* Check all keyslots for corruption  */
+for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ; i++) {
+
+const QCryptoBlockLUKSKeySlot *slot1 = &luks->header.key_slots[i];
+unsigned int start1 = slot1->key_offset_sector;
+unsigned int len1 =
+qcrypto_block_luks_splitkeylen_sectors(luks,
+   header_sectors,
+   slot1->stripes);
+
+if (slot1->stripes == 0) {
+error_setg(errp, "Keyslot %zu is corrupted (stripes == 0)", i);
+return -1;
+}
+
+if (slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED &&
+slot1->active != QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED) {
+error_setg(errp,
+   "Keyslot %zu state (active/disable) is corrupted", i);
+return -1;
+}
+
+if (start1 + len1 > luks->header.payload_offset_sector) {
+error_setg(errp,
+   "Keyslot %zu is overlapping with the encrypted payload",
+   i);
+return -1;
+}
+
+for (j = i + 1 ; j < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ; j++) {
+const QCryptoBlockLUKSKeySlot *slot2 = &luks->header.key_slots[j];
+unsigned int start2 = slot2->key_offset_sector;
+unsigned int len2 =
+qcrypto_block_luks_splitkeylen_sectors(luks,
+   header_sectors,
+   slot2->stripes);
+
+if (start1 + len1 > start2 && start2 + len2 > start1) {
+error_setg(errp,
+   "Keyslots %zu and %zu are overlapping in the header",
+   i, j);
+return -1;
+}
+}
+
+}
 return 0;
 }
 
-- 
2.17.2
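
For readers following the series, the two checks described in this patch
(keyslot versus payload, and keyslot versus keyslot) can be sketched
outside QEMU as plain interval arithmetic. This is only a minimal
illustration under stated assumptions, not the patch's code: SlotExtent,
extents_overlap() and find_layout_conflict() are hypothetical names, and
widening start + len to 64 bits is an addition made here so that the sums
cannot wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one keyslot's on-disk extent, in sectors. */
typedef struct {
    uint32_t start; /* first sector of the key material */
    uint32_t len;   /* number of sectors it occupies */
} SlotExtent;

/*
 * Two half-open ranges [start, start + len) overlap exactly when each
 * one begins before the other one ends.
 */
static bool extents_overlap(SlotExtent a, SlotExtent b)
{
    /* Widen to 64 bits so that start + len cannot wrap around. */
    uint64_t a_end = (uint64_t)a.start + a.len;
    uint64_t b_end = (uint64_t)b.start + b.len;

    return a_end > b.start && b_end > a.start;
}

/*
 * Return the index of the first slot that runs into the payload or
 * into a later slot, or -1 if the layout is consistent.
 * O(n^2), which is fine for a handful of slots.
 */
static int find_layout_conflict(const SlotExtent *slots, size_t n,
                                uint32_t payload_start)
{
    assert(slots != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if ((uint64_t)slots[i].start + slots[i].len > payload_start) {
            return (int)i;
        }
        for (size_t j = i + 1; j < n; j++) {
            if (extents_overlap(slots[i], slots[j])) {
                return (int)i;
            }
        }
    }
    return -1;
}
```

Slots that merely touch (one ends at the sector where the next begins)
are not reported, which matches the strict comparisons used in the diff.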




[PATCH v2 02/13] qcrypto-luks: rename some fields in QCryptoBlockLUKSHeader

2019-09-25 Thread Maxim Levitsky
* key_bytes -> master_key_len
* payload_offset -> payload_offset_sector (to emphasise that this isn't a byte offset)
* key_offset -> key_offset_sector - same as above for luks slots

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 91 +++--
 1 file changed, 47 insertions(+), 44 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 743949adbf..f12fa2d270 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -143,7 +143,7 @@ struct QCryptoBlockLUKSKeySlot {
 /* salt for PBKDF2 */
 uint8_t salt[QCRYPTO_BLOCK_LUKS_SALT_LEN];
 /* start sector of key material */
-uint32_t key_offset;
+uint32_t key_offset_sector;
 /* number of anti-forensic stripes */
 uint32_t stripes;
 };
@@ -172,10 +172,10 @@ struct QCryptoBlockLUKSHeader {
 char hash_spec[QCRYPTO_BLOCK_LUKS_HASH_SPEC_LEN];
 
 /* start offset of the volume data (in 512 byte sectors) */
-uint32_t payload_offset;
+uint32_t payload_offset_sector;
 
 /* Number of key bytes */
-uint32_t key_bytes;
+uint32_t master_key_len;
 
 /* master key checksum after PBKDF2 */
 uint8_t master_key_digest[QCRYPTO_BLOCK_LUKS_DIGEST_LEN];
@@ -466,7 +466,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * then encrypted.
  */
 rv = readfunc(block,
-  slot->key_offset * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
+  slot->key_offset_sector * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
   splitkey, splitkeylen,
   opaque,
   errp);
@@ -584,8 +584,8 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 size_t i;
 int rv;
 
-*masterkey = g_new0(uint8_t, luks->header.key_bytes);
-*masterkeylen = luks->header.key_bytes;
+*masterkey = g_new0(uint8_t, luks->header.master_key_len);
+*masterkeylen = luks->header.master_key_len;
 
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 rv = qcrypto_block_luks_load_key(block,
@@ -677,14 +677,14 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 /* The header is always stored in big-endian format, so
  * convert everything to native */
 be16_to_cpus(&luks->header.version);
-be32_to_cpus(&luks->header.payload_offset);
-be32_to_cpus(&luks->header.key_bytes);
+be32_to_cpus(&luks->header.payload_offset_sector);
+be32_to_cpus(&luks->header.master_key_len);
 be32_to_cpus(&luks->header.master_key_iterations);
 
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 be32_to_cpus(&luks->header.key_slots[i].active);
 be32_to_cpus(&luks->header.key_slots[i].iterations);
-be32_to_cpus(&luks->header.key_slots[i].key_offset);
+be32_to_cpus(&luks->header.key_slots[i].key_offset_sector);
 be32_to_cpus(&luks->header.key_slots[i].stripes);
 }
 
@@ -743,10 +743,11 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 goto fail;
 }
 
-cipheralg = qcrypto_block_luks_cipher_name_lookup(luks->header.cipher_name,
-  ciphermode,
-  luks->header.key_bytes,
-  &local_err);
+cipheralg =
+qcrypto_block_luks_cipher_name_lookup(luks->header.cipher_name,
+  ciphermode,
+  luks->header.master_key_len,
+  &local_err);
 if (local_err) {
 ret = -ENOTSUP;
 error_propagate(errp, local_err);
@@ -838,7 +839,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 }
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset *
+block->payload_offset = luks->header.payload_offset_sector *
 block->sector_size;
 
 luks->cipher_alg = cipheralg;
@@ -993,9 +994,11 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 strcpy(luks->header.cipher_mode, cipher_mode_spec);
 strcpy(luks->header.hash_spec, hash_alg);
 
-luks->header.key_bytes = qcrypto_cipher_get_key_len(luks_opts.cipher_alg);
+luks->header.master_key_len =
+qcrypto_cipher_get_key_len(luks_opts.cipher_alg);
+
 if (luks_opts.cipher_mode == QCRYPTO_CIPHER_MODE_XTS) {
-luks->header.key_bytes *= 2;
+luks->header.master_key_len *= 2;
 }
 
 /* Generate the salt used for hashing the master key
@@ -1008,9 +1011,9 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 }
 
 /* Generate random master key */
-masterkey = g_new0(uint8_t, luks->header.key_bytes);
+masterkey = g_new0(uint8_t, luks->header.master_key_len);
 if (qcrypto_random_bytes(masterkey,
- luks->header.key_bytes, errp) < 0) {
+ luks->header.master_key_len, errp) < 0) {
 goto error;
 }
 
@@ -1018,7 +1021,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 /* 

[PATCH v2 04/13] qcrypto-luks: simplify masterkey and masterkey length

2019-09-25 Thread Maxim Levitsky
Let the caller allocate masterkey.
Always use the master key length from the header.

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 44 +---
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 25f8a9f1c4..9e59a791a6 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -419,7 +419,6 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 QCryptoCipherAlgorithm ivcipheralg,
 QCryptoHashAlgorithm ivhash,
 uint8_t *masterkey,
-size_t masterkeylen,
 QCryptoBlockReadFunc readfunc,
 void *opaque,
 Error **errp)
@@ -438,9 +437,9 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 return 0;
 }
 
-splitkeylen = masterkeylen * slot->stripes;
+splitkeylen = luks->header.master_key_len * slot->stripes;
 splitkey = g_new0(uint8_t, splitkeylen);
-possiblekey = g_new0(uint8_t, masterkeylen);
+possiblekey = g_new0(uint8_t, luks->header.master_key_len);
 
 /*
  * The user password is used to generate a (possible)
@@ -453,7 +452,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
(const uint8_t *)password, strlen(password),
slot->salt, QCRYPTO_BLOCK_LUKS_SALT_LEN,
slot->iterations,
-   possiblekey, masterkeylen,
+   possiblekey, luks->header.master_key_len,
errp) < 0) {
 return -1;
 }
@@ -478,7 +477,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 /* Setup the cipher/ivgen that we'll use to try to decrypt
  * the split master key material */
 cipher = qcrypto_cipher_new(cipheralg, ciphermode,
-possiblekey, masterkeylen,
+possiblekey, luks->header.master_key_len,
 errp);
 if (!cipher) {
 return -1;
@@ -489,7 +488,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 ivgen = qcrypto_ivgen_new(ivalg,
   ivcipheralg,
   ivhash,
-  possiblekey, masterkeylen,
+  possiblekey, luks->header.master_key_len,
   errp);
 if (!ivgen) {
 return -1;
@@ -519,7 +518,7 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * it back together to get the actual master key.
  */
 if (qcrypto_afsplit_decode(hash,
-   masterkeylen,
+   luks->header.master_key_len,
slot->stripes,
splitkey,
masterkey,
@@ -537,11 +536,13 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * header
  */
 if (qcrypto_pbkdf2(hash,
-   masterkey, masterkeylen,
+   masterkey,
+   luks->header.master_key_len,
luks->header.master_key_salt,
QCRYPTO_BLOCK_LUKS_SALT_LEN,
luks->header.master_key_iterations,
-   keydigest, G_N_ELEMENTS(keydigest),
+   keydigest,
+   G_N_ELEMENTS(keydigest),
errp) < 0) {
 return -1;
 }
@@ -574,8 +575,7 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 QCryptoIVGenAlgorithm ivalg,
 QCryptoCipherAlgorithm ivcipheralg,
 QCryptoHashAlgorithm ivhash,
-uint8_t **masterkey,
-size_t *masterkeylen,
+uint8_t *masterkey,
 QCryptoBlockReadFunc readfunc,
 void *opaque,
 Error **errp)
@@ -584,9 +584,6 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 size_t i;
 int rv;
 
-*masterkey = g_new0(uint8_t, luks->header.master_key_len);
-*masterkeylen = luks->header.master_key_len;
-
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 rv = qcrypto_block_luks_load_key(block,
  &luks->header.key_slots[i],
@@ -597,8 +594,7 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
  ivalg,
  ivcipheralg,
  ivhash,
- *masterkey,
- *masterkeylen,
+ masterkey,
  readfunc,

[PATCH v2 03/13] qcrypto-luks: don't overwrite cipher_mode in header

2019-09-25 Thread Maxim Levitsky
This way we can keep the header exactly as we loaded it,
which the key management code will need.

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block-luks.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index f12fa2d270..25f8a9f1c4 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -645,6 +645,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 QCryptoHashAlgorithm hash;
 QCryptoHashAlgorithm ivhash;
 g_autofree char *password = NULL;
+g_autofree char *cipher_mode = NULL;
 
 if (!(flags & QCRYPTO_BLOCK_OPEN_NO_IO)) {
 if (!options->u.luks.key_secret) {
@@ -701,6 +702,8 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 goto fail;
 }
 
+cipher_mode = g_strdup(luks->header.cipher_mode);
+
 /*
  * The cipher_mode header contains a string that we have
  * to further parse, of the format
@@ -709,11 +712,11 @@ qcrypto_block_luks_open(QCryptoBlock *block,
  *
  * eg  cbc-essiv:sha256, cbc-plain64
  */
-ivgen_name = strchr(luks->header.cipher_mode, '-');
+ivgen_name = strchr(cipher_mode, '-');
 if (!ivgen_name) {
 ret = -EINVAL;
 error_setg(errp, "Unexpected cipher mode string format %s",
-   luks->header.cipher_mode);
+   cipher_mode);
 goto fail;
 }
 *ivgen_name = '\0';
@@ -735,7 +738,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 }
 }
 
-ciphermode = qcrypto_block_luks_cipher_mode_lookup(luks->header.cipher_mode,
+ciphermode = qcrypto_block_luks_cipher_mode_lookup(cipher_mode,
 &local_err);
 if (local_err) {
 ret = -ENOTSUP;
-- 
2.17.2
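
The approach in this patch, parsing a private copy of the cipher mode
string so that the header field stays intact, can be sketched in
standard C as follows. This is an illustration under assumptions, not
QEMU code: split_cipher_mode() is a hypothetical helper, and it uses
malloc/free where the patch uses g_strdup with g_autofree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a "mode-ivgen" string such as "cbc-essiv:sha256" without
 * modifying `spec`. Returns a newly allocated copy holding the mode
 * part ("cbc"); *ivgen_out is set to point at the IV generator part
 * ("essiv:sha256") inside that same copy. Returns NULL if there is no
 * '-' separator or if allocation fails. The caller frees the result.
 */
static char *split_cipher_mode(const char *spec, const char **ivgen_out)
{
    size_t len;
    char *copy;
    char *dash;

    assert(spec != NULL && ivgen_out != NULL);

    len = strlen(spec) + 1;
    copy = malloc(len);
    if (!copy) {
        return NULL;
    }
    memcpy(copy, spec, len);

    dash = strchr(copy, '-');
    if (!dash) {
        free(copy);
        return NULL;
    }

    /* Terminate the mode name in the copy only; `spec` is untouched. */
    *dash = '\0';
    *ivgen_out = dash + 1;
    return copy;
}
```

Because only the copy is cut at the '-', the original string can later
be written back to disk unchanged, which is the property the commit
message is after.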




[PATCH v2 00/13] crypto/luks: preparation for encryption key management

2019-09-25 Thread Maxim Levitsky
Hi!

This patch series is the refactoring/preparation part of the
patch series I sent earlier, which adds support for LUKS
key management.

V2:
I addressed all the review comments.
I also added another minor patch to improve the error message
shown when trying to create too large a file, for which I have
an open bug waiting to be closed.
It is also a form of refactoring, so I think it makes
sense to include it here.

Best regards,
Maxim Levitsky

Maxim Levitsky (13):
  block-crypto: misc refactoring
  qcrypto-luks: rename some fields in QCryptoBlockLUKSHeader
  qcrypto-luks: don't overwrite cipher_mode in header
  qcrypto-luks: simplify masterkey and masterkey length
  qcrypto-luks: pass keyslot index rather than pointer to the keyslot
  qcrypto-luks: use the parsed encryption settings in QCryptoBlockLUKS
  qcrypto-luks: purge unused error codes from open callback
  qcrypto-luks: extract store and load header
  qcrypto-luks: extract check and parse header
  qcrypto-luks: extract store key function
  qcrypto-luks: simplify the math used for keyslot locations
  qcrypto-luks: more rigorous header checking
  LUKS: better error message when creating too large files

 block/crypto.c  |   33 +-
 crypto/block-luks.c | 1025 +--
 2 files changed, 617 insertions(+), 441 deletions(-)

-- 
2.17.2




[PATCH v2 01/13] block-crypto: misc refactoring

2019-09-25 Thread Maxim Levitsky
* rename the write_func to create_write_func,
  and init_func to create_init_func.
  This is preparation for another write_func that will
  be used to update the encryption keys.

No functional changes.

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 7eb698774e..6e822c6e50 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -78,7 +78,7 @@ struct BlockCryptoCreateData {
 };
 
 
-static ssize_t block_crypto_write_func(QCryptoBlock *block,
+static ssize_t block_crypto_create_write_func(QCryptoBlock *block,
size_t offset,
const uint8_t *buf,
size_t buflen,
@@ -96,8 +96,7 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
 return ret;
 }
 
-
-static ssize_t block_crypto_init_func(QCryptoBlock *block,
+static ssize_t block_crypto_create_init_func(QCryptoBlock *block,
   size_t headerlen,
   void *opaque,
   Error **errp)
@@ -109,7 +108,8 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
 return -EFBIG;
 }
 
-/* User provided size should reflect amount of space made
+/*
+ * User provided size should reflect amount of space made
  * available to the guest, so we must take account of that
  * which will be used by the crypto header
  */
@@ -279,8 +279,8 @@ static int block_crypto_co_create_generic(BlockDriverState *bs,
 };
 
 crypto = qcrypto_block_create(opts, NULL,
-  block_crypto_init_func,
-  block_crypto_write_func,
+  block_crypto_create_init_func,
+  block_crypto_create_write_func,
   &data,
   errp);
 
-- 
2.17.2




Re: [PATCH 3/3] iotests: Use stat -c %b in 125

2019-09-25 Thread Eric Blake

On 9/25/19 1:32 PM, Max Reitz wrote:

125 should not use qemu-img to get the on-disk image size, because that
reports it in a human-readable format that is useless to us.  Just use
stat instead (like we do to get the image file length).

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/125 | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/125 b/tests/qemu-iotests/125
index 0ef51f1e21..4e31aa4e5f 100755
--- a/tests/qemu-iotests/125
+++ b/tests/qemu-iotests/125
@@ -34,8 +34,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  
  get_image_size_on_host()

  {
-$QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" | grep "disk size" \
-| sed -e 's/^[^0-9]*\([0-9]\+\).*$/\1/'
+echo $(($(stat -c '%b * %B' "$TEST_IMG_FILE")))


Cute use of $(()) around $().

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
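
For comparison, roughly the same on-disk size that the new shell line
computes can be read from C with stat(2). The sketch below is only an
illustration: allocated_bytes() and image_size_on_host() are
hypothetical helpers, and it assumes st_blocks is counted in 512-byte
units, which holds on Linux but is not fixed by POSIX (the shell
version sidesteps this by asking stat for the unit via %B).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * Pure helper: number of allocated blocks times the size of one block,
 * i.e. what the shell evaluates for "%b * %B".
 */
static int64_t allocated_bytes(int64_t blocks, int64_t unit)
{
    assert(blocks >= 0 && unit > 0);
    return blocks * unit;
}

/*
 * Return the number of bytes actually allocated for `path` on the
 * host, or -1 if stat() fails.
 * Assumption: st_blocks uses 512-byte units (true on Linux).
 */
static int64_t image_size_on_host(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        return -1;
    }
    return allocated_bytes((int64_t)st.st_blocks, 512);
}
```

Unlike the human-readable "disk size" printed by qemu-img info, this
value is an exact byte count, so two measurements can be compared
directly.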



[Bug 1841990] Re: instruction 'denbcdq' misbehaving

2019-09-25 Thread Paul Clarke
> Did you see the follow up email indicating the typo that I found in
patch 6?

I did, then forgot to include it in my build.  I've included that change
now...

> Does that help any more tests to pass?

I'm down from 22 failures to 8.




Re: [PATCH 1/3] iotests: Fix 125 for growth_mode = metadata

2019-09-25 Thread Eric Blake

On 9/25/19 1:32 PM, Max Reitz wrote:

If we use growth_mode = metadata, it is very much possible that the file
uses more disk space after we have written something to the added area.
We did indeed want to test for this case, but unfortunately we evidently
just copied the code from the "Test creation preallocation" section and
forgot to replace "$create_mode" by "$growth_mode".

We never noticed because we only read the first number from qemu-img
info's "disk size" output -- and that is effectively useless, because
qemu-img prints a human-readable value (which generally includes a
decimal point).  That will be fixed in the patch after the next one.

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/125 | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



Reviewed-by: Eric Blake 


diff --git a/tests/qemu-iotests/125 b/tests/qemu-iotests/125
index dc4b8f5fb9..df328a63a6 100755
--- a/tests/qemu-iotests/125
+++ b/tests/qemu-iotests/125
@@ -111,7 +111,7 @@ for GROWTH_SIZE in 16 48 80; do
  if [ $file_length_2 -gt $file_length_1 ]; then
  echo "ERROR (grow): Image length has grown from $file_length_1 to $file_length_2"
  fi
-if [ $create_mode != metadata ]; then
+if [ $growth_mode != metadata ]; then
  # The host size should not have grown either
  if [ $host_size_2 -gt $host_size_1 ]; then
  echo "ERROR (grow): Host size has grown from $host_size_1 to $host_size_2"



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



Re: [PATCH 2/3] iotests: Disable 125 on broken XFS versions

2019-09-25 Thread Eric Blake

On 9/25/19 1:32 PM, Max Reitz wrote:

And by that I mean all XFS versions, as far as I can tell.  All details
are in the comment below.

We never noticed this problem because we only read the first number from
qemu-img info's "disk size" output -- and that is effectively useless,
because qemu-img prints a human-readable value (which generally includes
a decimal point).  That will be fixed in the next patch.

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/125 | 40 
  1 file changed, 40 insertions(+)

diff --git a/tests/qemu-iotests/125 b/tests/qemu-iotests/125
index df328a63a6..0ef51f1e21 100755
--- a/tests/qemu-iotests/125
+++ b/tests/qemu-iotests/125
@@ -49,6 +49,46 @@ if [ -z "$TEST_IMG_FILE" ]; then
  TEST_IMG_FILE=$TEST_IMG
  fi
  
+# Test whether we are running on a broken XFS version.  There is this

+# bug:
+
+# $ rm -f foo
+# $ touch foo
+# $ block_size=4096 # Your FS's block size
+# $ fallocate -o $((block_size / 2)) -l $block_size foo
+# $ LANG=C xfs_bmap foo | grep hole
+# 1: [8..15]: hole
+#
+# The problem is that the XFS driver rounds down the offset and
+# rounds up the length to the block size, but independently.


Eww. I concur you uncovered a bug.  Have you reported this to xfs folks?


+
+touch "$TEST_IMG_FILE"
+# Assuming there is no FS with a block size greater than 64k
+fallocate -o 65535 -l 2 "$TEST_IMG_FILE"
+len0=$(get_image_size_on_host)
+
+# Write to something that in theory we have just fallocated
+# (Thus, the on-disk size should not increase)
+poke_file "$TEST_IMG_FILE" 65536 42
+len1=$(get_image_size_on_host)
+
+if [ $len1 -gt $len0 ]; then
+_notrun "the test filesystem's fallocate() is broken"
+fi
+
+rm -f "$TEST_IMG_FILE"


Reviewed-by: Eric Blake 


+
  # Generally, we create some image with or without existing preallocation and
  # then resize it. Then we write some data into the image and verify that its
  # size does not change if we have used preallocation.



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
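
The rounding problem described in the quoted comment can be worked
through with a few lines of arithmetic. The sketch below only models
the behaviour as that comment describes it; it is not XFS code, and
ByteRange, range_independent() and range_covering() are hypothetical
names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
typedef struct {
    uint64_t start;
    uint64_t end;
} ByteRange;

static uint64_t round_down(uint64_t x, uint64_t bs)
{
    assert(bs > 0);
    return x - x % bs;
}

static uint64_t round_up(uint64_t x, uint64_t bs)
{
    return round_down(x + bs - 1, bs);
}

/*
 * Behaviour as described in the comment: the offset is rounded down
 * and the length is rounded up, each on its own.
 */
static ByteRange range_independent(uint64_t off, uint64_t len, uint64_t bs)
{
    ByteRange r;

    r.start = round_down(off, bs);
    r.end = r.start + round_up(len, bs);
    return r;
}

/*
 * Rounding that covers every block the request touches: round the
 * start down and the end (offset + length) up.
 */
static ByteRange range_covering(uint64_t off, uint64_t len, uint64_t bs)
{
    ByteRange r;

    r.start = round_down(off, bs);
    r.end = round_up(off + len, bs);
    return r;
}
```

With a 4096-byte block size, the request from the comment (offset 2048,
length 4096) gives [0, 4096) when rounded independently but [0, 8192)
when every touched block is covered. The missing bytes 4096..8191 are
512-byte sectors 8..15, the hole that xfs_bmap reports. The same model
explains the probe in the test: offset 65535 with length 2 never reaches
byte 65536 for any block size up to 64k.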



Re: [PATCH v4] qga: add command guest-get-devices for reporting VirtIO devices

2019-09-25 Thread Eric Blake

On 9/25/19 4:03 PM, Tomáš Golembiovský wrote:

Add a command for reporting devices on a Windows guest. The intent is not
so much to report the devices but, more importantly, the driver (and its
version) that is assigned to each device. This tells the caller whether
VirtIO drivers are installed and/or whether an inadequate driver is used
on a device (e.g. a QXL device with the base VGA driver).

Signed-off-by: Tomáš Golembiovský 
---


It's nice to mention here, after the --- separator, how v4 differs from 
earlier versions, to let reviewers that saw the earlier version check 
the differences.




+++ b/qga/qapi-schema.json
@@ -1242,3 +1242,35 @@
  ##
  { 'command': 'guest-get-osinfo',
'returns': 'GuestOSInfo' }
+
+##
+# @GuestDeviceInfo:
+#
+# @vendor-id: vendor ID
+# @device-id: device ID
+# @driver-name: name of the associated driver
+# @driver-date: driver release date in format YYYY-MM-DD
+# @driver-version: driver version
+#
+# Since: 4.2
+##
+{ 'struct': 'GuestDeviceInfo',
+  'data': {
+  'vendor-id': 'uint16',
+  'device-id': 'uint16',
+  'driver-name': 'str',
+  'driver-date': 'str',
+  'driver-version': 'str'
+  } }
+
+##
+# @guest-get-devices:
+#
+# Retrieve information about device drivers in Windows guest
+#
+# Returns: @GuestDeviceInfo
+#
+# Since: 4.2
+##
+{ 'command': 'guest-get-devices',
+  'returns': ['GuestDeviceInfo'] }




I'm not spotting any obvious problems with the interface itself, but am 
not comfortable enough with the rest of the code for a full review.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



Re: [PATCH v3 25/33] tests/docker: Add fedora-win10sdk-cross image

2019-09-25 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> Hi Alex,
>
> On 9/24/19 11:00 PM, Alex Bennée wrote:
>> From: Philippe Mathieu-Daudé 
>>
>> To build WHPX (Windows Hypervisor) binaries, we need the WHPX
>> headers provided by the Windows SDK.
>
> Justin is checking with his company if this patch is OK with them,
> I'd rather wait before merging it:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg646351.html
>
> Can you unqueue this and the next patch (which depends on it) meanwhile
> please?
>

OK, done.

> Thanks,
>
> Phil.
>
>> Add a script that fetches the required MSI/CAB files from the
>> latest SDK (currently 10.0.18362.1).
>>
>> Headers are accessible under /opt/win10sdk/include.
>>
>> Set the QEMU_CONFIGURE_OPTS environment variable accordingly,
>> enabling HAX and WHPX. Due to CPP warnings related to Microsoft
>> specific #pragmas, we also need to use the '--disable-werror'
>> configure flag.
>>
>> Cc: Justin Terry 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> Signed-off-by: Alex Bennée 
>> Message-Id: <20190920113329.16787-3-phi...@redhat.com>
>> ---
>>  tests/docker/Makefile.include |  2 ++
>>  .../dockerfiles/fedora-win10sdk-cross.docker  | 23 
>>  tests/docker/dockerfiles/win10sdk-dl.sh   | 27 +++
>>  3 files changed, 52 insertions(+)
>>  create mode 100644 tests/docker/dockerfiles/fedora-win10sdk-cross.docker
>>  create mode 100755 tests/docker/dockerfiles/win10sdk-dl.sh
>>
>> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
>> index 3fc7a863e51..e85e73025ba 100644
>> --- a/tests/docker/Makefile.include
>> +++ b/tests/docker/Makefile.include
>> @@ -125,6 +125,8 @@ docker-image-debian-ppc64-cross: docker-image-debian10
>>  docker-image-debian-riscv64-cross: docker-image-debian10
>>  docker-image-debian-sh4-cross: docker-image-debian10
>>  docker-image-debian-sparc64-cross: docker-image-debian10
>> +docker-image-fedora-win10sdk-cross: docker-image-fedora
>> +docker-image-fedora-win10sdk-cross: 
>> EXTRA_FILES:=$(DOCKER_FILES_DIR)/win10sdk-dl.sh
>>
>>  docker-image-travis: NOUSER=1
>>
>> diff --git a/tests/docker/dockerfiles/fedora-win10sdk-cross.docker 
>> b/tests/docker/dockerfiles/fedora-win10sdk-cross.docker
>> new file mode 100644
>> index 000..55ca933d40d
>> --- /dev/null
>> +++ b/tests/docker/dockerfiles/fedora-win10sdk-cross.docker
>> @@ -0,0 +1,23 @@
>> +#
>> +# Docker MinGW64 cross-compiler target with WHPX header installed
>> +#
>> +# This docker target builds on the Fedora 30 base image.
>> +#
>> +# SPDX-License-Identifier: GPL-2.0-or-later
>> +#
>> +FROM qemu:fedora
>> +
>> +RUN dnf install -y \
>> +cabextract \
>> +msitools \
>> +wget
>> +
>> +# Install WHPX headers from Windows Software Development Kit:
>> +# https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
>> +ADD win10sdk-dl.sh /usr/local/bin/win10sdk-dl.sh
>> +RUN /usr/local/bin/win10sdk-dl.sh
>> +
>> +ENV QEMU_CONFIGURE_OPTS ${QEMU_CONFIGURE_OPTS} \
>> +--cross-prefix=x86_64-w64-mingw32- \
>> +--extra-cflags=-I/opt/win10sdk/include --disable-werror \
>> +--enable-hax --enable-whpx
>> diff --git a/tests/docker/dockerfiles/win10sdk-dl.sh 
>> b/tests/docker/dockerfiles/win10sdk-dl.sh
>> new file mode 100755
>> index 000..1c35c2a2524
>> --- /dev/null
>> +++ b/tests/docker/dockerfiles/win10sdk-dl.sh
>> @@ -0,0 +1,27 @@
>> +#!/bin/bash
>> +#
>> +# Install WHPX headers from Windows Software Development Kit
>> +# https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
>> +#
>> +# SPDX-License-Identifier: GPL-2.0-or-later
>> +
>> +WINDIR=/opt/win10sdk
>> +mkdir -p ${WINDIR}
>> +pushd ${WINDIR}
>> +# Get the bundle base for Windows SDK v10.0.18362.1
>> +BASE_URL=$(curl --silent --include 
>> 'http://go.microsoft.com/fwlink/?prd=11966=1.0=0x409=0x409=Windows10=SDK=10.0.18362.1'
>>  | sed -nE 's_Location: (.*)/\r_\1_p')/Installers
>> +# Fetch the MSI containing the headers
>> +wget --no-verbose ${BASE_URL}/'Windows SDK Desktop Headers 
>> x86-x86_en-us.msi'
>> +while true; do
>> +# Fetch all cabinets required by this MSI
>> +CAB_NAME=$(msiextract Windows\ SDK\ Desktop\ Headers\ x86-x86_en-us.msi 
>> 3>&1 2>&3 3>&-| sed -nE "s_.*Error opening file $PWD/(.*): No such file or 
>> directory_\1_p")
>> +test -z "${CAB_NAME}" && break
>> +wget --no-verbose ${BASE_URL}/${CAB_NAME}
>> +done
>> +rm *.{cab,msi}
>> +mkdir /opt/win10sdk/include
>> +# Only keep the WHPX headers
>> +for inc in "${WINDIR}/Program Files/Windows 
>> Kits/10/Include/10.0.18362.0/um"/WinHv*; do
>> +ln -s "${inc}" /opt/win10sdk/include
>> +done
>> +popd > /dev/null
>>


--
Alex Bennée



[PATCH v4] qga: add command guest-get-devices for reporting VirtIO devices

2019-09-25 Thread Tomáš Golembiovský
Add a command for reporting devices on a Windows guest. The intent is not
so much to report the devices but, more importantly, the driver (and its
version) that is assigned to each device. This tells the caller whether
VirtIO drivers are installed and/or whether an inadequate driver is used
on a device (e.g. a QXL device with the base VGA driver).

Signed-off-by: Tomáš Golembiovský 
---
 qga/commands-posix.c |   9 ++
 qga/commands-win32.c | 204 ++-
 qga/qapi-schema.json |  32 +++
 3 files changed, 244 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index dfc05f5b8a..58e93feef9 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -2757,6 +2757,8 @@ GList *ga_command_blacklist_init(GList *blacklist)
 blacklist = g_list_append(blacklist, g_strdup("guest-fstrim"));
 #endif
 
+blacklist = g_list_append(blacklist, g_strdup("guest-get-devices"));
+
 return blacklist;
 }
 
@@ -2977,3 +2979,10 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
 
 return info;
 }
+
+GuestDeviceInfoList *qmp_guest_get_devices(Error **errp)
+{
+error_setg(errp, QERR_UNSUPPORTED);
+
+return NULL;
+}
diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 6b67f16faf..ec07a5b3ef 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -21,10 +21,11 @@
 #ifdef CONFIG_QGA_NTDDSCSI
 #include 
 #include 
+#endif
 #include 
 #include 
 #include 
-#endif
+#include 
 #include 
 #include 
 #include 
@@ -38,6 +39,36 @@
 #include "qemu/host-utils.h"
 #include "qemu/base64.h"
 
+/*
+ * The following should be in devpkey.h, but it isn't. The key names were
+ * prefixed to avoid (future) name clashes. Once the definitions get into
+ * mingw the following lines can be removed.
+ */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_NAME, 0xb725f130, 0x47ef, 0x101a, 0xa5,
+0xf1, 0x02, 0x60, 0x8c, 0x9e, 0xeb, 0xac, 10);
+/* DEVPROP_TYPE_STRING */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_HardwareIds, 0xa45c254e, 0xdf1c,
+0x4efd, 0x80, 0x20, 0x67, 0xd1, 0x46, 0xa8, 0x50, 0xe0, 3);
+/* DEVPROP_TYPE_STRING_LIST */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverDate, 0xa8b865dd, 0x2e3d,
+0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 2);
+/* DEVPROP_TYPE_FILETIME */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverVersion, 0xa8b865dd, 0x2e3d,
+0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 3);
+/* DEVPROP_TYPE_STRING */
+/* The following should be in cfgmgr32.h, but it isn't */
+#ifndef CM_Get_DevNode_Property
+CMAPI CONFIGRET WINAPI CM_Get_DevNode_PropertyW(
+DEVINST  dnDevInst,
+CONST DEVPROPKEY * PropertyKey,
+DEVPROPTYPE  * PropertyType,
+PBYTEPropertyBuffer,
+PULONG   PropertyBufferSize,
+ULONGulFlags
+);
+#define CM_Get_DevNode_Property CM_Get_DevNode_PropertyW
+#endif
+
 #ifndef SHTDN_REASON_FLAG_PLANNED
 #define SHTDN_REASON_FLAG_PLANNED 0x8000
 #endif
@@ -92,6 +123,8 @@ static OpenFlags guest_file_open_modes[] = {
 g_free(suffix); \
 } while (0)
 
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(GuestDeviceInfo, qapi_free_GuestDeviceInfo)
+
 static OpenFlags *find_open_flag(const char *mode_str)
 {
 int mode;
@@ -2234,3 +2267,172 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
 
 return info;
 }
+
+/*
+ * Safely get device property. Returned strings are using wide characters.
+ * Caller is responsible for freeing the buffer.
+ */
+static LPBYTE cm_get_property(DEVINST devInst, const DEVPROPKEY *propName,
+PDEVPROPTYPE propType)
+{
+CONFIGRET cr;
+g_autofree LPBYTE buffer = NULL;
+ULONG buffer_len = 0;
+
+/* First query for needed space */
+cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
+buffer, &buffer_len, 0);
+if (cr != CR_SUCCESS && cr != CR_BUFFER_SMALL) {
+
+slog("failed to get property size, error=0x%lx", cr);
+return NULL;
+}
+buffer = g_new0(BYTE, buffer_len + 1);
+cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
+buffer, &buffer_len, 0);
+if (cr != CR_SUCCESS) {
+slog("failed to get device property, error=0x%lx", cr);
+return NULL;
+}
+return g_steal_pointer(&buffer);
+}
+
+static GStrv ga_get_hardware_ids(DEVINST devInstance)
+{
+GStrv hw_ids = NULL;
+GArray *values = NULL;
+DEVPROPTYPE cm_type;
+LPWSTR id;
+g_autofree LPWSTR property = (LPWSTR)cm_get_property(devInstance,
+&qga_DEVPKEY_Device_HardwareIds, &cm_type);
+if (property == NULL) {
+slog("failed to get hardware IDs");
+return NULL;
+}
+if (*property == '\0') {
+/* empty list */
+return NULL;
+}
+values = g_array_new(TRUE, TRUE, sizeof(gchar *));
+for (id = property; '\0' != *id; id += lstrlenW(id) + 1) {
+gchar *id8 = g_utf16_to_utf8(id, -1, NULL, NULL, NULL);
+g_array_append_val(values, id8);
+}
+hw_ids = (GStrv)g_array_free(values, FALSE);
+values = NULL;
+

[PATCH v3] qga: add command guest-get-devices for reporting VirtIO devices

2019-09-25 Thread Tomáš Golembiovský
Add command for reporting devices on Windows guest. The intent is not so
much to report the devices but more importantly the driver (and its
version) that is assigned to the device. This tells the caller
whether VirtIO drivers are installed and/or whether an
inadequate driver is used on a device (e.g. QXL device with base VGA
driver).

Signed-off-by: Tomáš Golembiovský 
---
 qga/commands-posix.c |   9 ++
 qga/commands-win32.c | 204 ++-
 qga/qapi-schema.json |  32 +++
 3 files changed, 244 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index dfc05f5b8a..58e93feef9 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -2757,6 +2757,8 @@ GList *ga_command_blacklist_init(GList *blacklist)
 blacklist = g_list_append(blacklist, g_strdup("guest-fstrim"));
 #endif
 
+blacklist = g_list_append(blacklist, g_strdup("guest-get-devices"));
+
 return blacklist;
 }
 
@@ -2977,3 +2979,10 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
 
 return info;
 }
+
+GuestDeviceInfoList *qmp_guest_get_devices(Error **errp)
+{
+error_setg(errp, QERR_UNSUPPORTED);
+
+return NULL;
+}
diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 6b67f16faf..139dbd7c9a 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -21,10 +21,11 @@
 #ifdef CONFIG_QGA_NTDDSCSI
 #include 
 #include 
+#endif
 #include 
 #include 
 #include 
-#endif
+#include 
 #include 
 #include 
 #include 
@@ -38,6 +39,36 @@
 #include "qemu/host-utils.h"
 #include "qemu/base64.h"
 
+/*
+ * The following should be in devpkey.h, but it isn't. The key names were
+ * prefixed to avoid (future) name clashes. Once the definitions get into
+ * mingw the following lines can be removed.
+ */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_NAME, 0xb725f130, 0x47ef, 0x101a, 0xa5,
+0xf1, 0x02, 0x60, 0x8c, 0x9e, 0xeb, 0xac, 10);
+/* DEVPROP_TYPE_STRING */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_HardwareIds, 0xa45c254e, 0xdf1c,
+0x4efd, 0x80, 0x20, 0x67, 0xd1, 0x46, 0xa8, 0x50, 0xe0, 3);
+/* DEVPROP_TYPE_STRING_LIST */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverDate, 0xa8b865dd, 0x2e3d,
+0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 2);
+/* DEVPROP_TYPE_FILETIME */
+DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverVersion, 0xa8b865dd, 0x2e3d,
+0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 3);
+/* DEVPROP_TYPE_STRING */
+/* The following should be in cfgmgr32.h, but it isn't */
+#ifndef CM_Get_DevNode_Property
+CMAPI CONFIGRET WINAPI CM_Get_DevNode_PropertyW(
+DEVINST  dnDevInst,
+CONST DEVPROPKEY * PropertyKey,
+DEVPROPTYPE  * PropertyType,
+PBYTEPropertyBuffer,
+PULONG   PropertyBufferSize,
+ULONGulFlags
+);
+#define CM_Get_DevNode_Property CM_Get_DevNode_PropertyW
+#endif
+
 #ifndef SHTDN_REASON_FLAG_PLANNED
 #define SHTDN_REASON_FLAG_PLANNED 0x8000
 #endif
@@ -92,6 +123,8 @@ static OpenFlags guest_file_open_modes[] = {
 g_free(suffix); \
 } while (0)
 
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(GuestDeviceInfo, qapi_free_GuestDeviceInfo)
+
 static OpenFlags *find_open_flag(const char *mode_str)
 {
 int mode;
@@ -2234,3 +2267,172 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
 
 return info;
 }
+
+/*
+ * Safely get device property. Returned strings are using wide characters.
+ * Caller is responsible for freeing the buffer.
+ */
+static LPBYTE cm_get_property(DEVINST devInst, const DEVPROPKEY *propName,
+PDEVPROPTYPE propType)
+{
+CONFIGRET cr;
+g_autofree LPBYTE buffer = NULL;
+ULONG buffer_len = 0;
+
+/* First query for needed space */
+cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
+buffer, &buffer_len, 0);
+if (cr != CR_SUCCESS && cr != CR_BUFFER_SMALL) {
+
+slog("failed to get property size, error=0x%lx", cr);
+return NULL;
+}
+buffer = g_new0(BYTE, buffer_len + 1);
+cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
+buffer, &buffer_len, 0);
+if (cr != CR_SUCCESS) {
+slog("failed to get device property, error=0x%lx", cr);
+return NULL;
+}
+return g_steal_pointer(&buffer);
+}
+
+static GStrv ga_get_hardware_ids(DEVINST devInstance)
+{
+GStrv hw_ids = NULL;
+GArray *values = NULL;
+DEVPROPTYPE cm_type;
+LPWSTR id;
+g_autofree LPWSTR property = (LPWSTR)cm_get_property(devInstance,
+&qga_DEVPKEY_Device_HardwareIds, &cm_type);
+if (property == NULL) {
+slog("failed to get hardware IDs");
+return NULL;
+}
+if (*property == '\0') {
+/* empty list */
+return NULL;
+}
+values = g_array_new(TRUE, TRUE, sizeof(gchar*));
+for (id = property; '\0' != *id; id += lstrlenW(id) + 1) {
+gchar* id8 = g_utf16_to_utf8(id, -1, NULL, NULL, NULL);
+g_array_append_val(values, id8);
+}
+hw_ids = (GStrv)g_array_free(values, FALSE);
+values = NULL;
+return 
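
The loop in ga_get_hardware_ids() walks a multi-string: consecutive
NUL-terminated wide strings, with an empty string ending the list. A
portable sketch of that walk follows. It is not the patch's code: it uses
wchar_t and wcslen() instead of LPWSTR and lstrlenW(), and the helper names
multi_sz_count() and multi_sz_get() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Count the entries of a multi-string: consecutive NUL-terminated wide
 * strings, with an empty string marking the end of the list.
 */
static size_t multi_sz_count(const wchar_t *list)
{
    size_t n = 0;
    const wchar_t *p;

    assert(list != NULL);
    /* Each step skips one string plus its terminating NUL. */
    for (p = list; *p != L'\0'; p += wcslen(p) + 1) {
        n++;
    }
    return n;
}

/* Return entry number idx (0-based), or NULL if the list is shorter. */
static const wchar_t *multi_sz_get(const wchar_t *list, size_t idx)
{
    const wchar_t *p;

    assert(list != NULL);
    for (p = list; *p != L'\0'; p += wcslen(p) + 1) {
        if (idx-- == 0) {
            return p;
        }
    }
    return NULL;
}
```

Because the walk stops at the first empty string, an empty list needs no
special case here; the patch checks for it up front only to avoid
allocating the array.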

Re: [PATCH] hw/core/loader: Fix possible crash in rom_copy()

2019-09-25 Thread Philippe Mathieu-Daudé
Hi Thomas,

On 9/25/19 3:03 PM, Thomas Huth wrote:
> Both, "rom->addr" and "addr" are derived from the binary image
> that can be loaded with the "-kernel" parameter. The code in
> rom_copy() then calculates:
> 
> d = dest + (rom->addr - addr);
> 
> and uses "d" as destination in a memcpy() some lines later. Now with
> bad kernel images, it is possible that rom->addr is smaller than addr,
> thus "rom->addr - addr" gets negative and the memcpy() then tries to
> copy contents from the image to a bad memory location. In the best case,
> this just crashes QEMU, in the worst case, this could maybe be used to
> inject code from the kernel image into the QEMU binary, so we better fix
> it with an additional sanity check here.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Guangming Liu
> Buglink: https://bugs.launchpad.net/qemu/+bug/1844635

"This page does not exist, or you may not have permission to see it."

This seems security related. Shouldn't we open a CVE for this?
https://wiki.qemu.org/SecurityProcess#CVE_allocation

Let's say I have write access to a LAN TFTP server used by some PXE
bootloader where I can store my crafted nasty kernel, then I get this score:

https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:A/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H/E:P/RL:O/RC:C=3.1

CVSS Base Score: 9.6
CVSS Temporal Score: 8.6

Which seems quite high.

> Signed-off-by: Thomas Huth 
> ---
>  hw/core/loader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 0d60219364..5099f27dc8 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -1281,7 +1281,7 @@ int rom_copy(uint8_t *dest, hwaddr addr, size_t size)

$ git show 235f86ef014
Date:   Thu Nov 12 21:53:11 2009 +0100

This function is old and poorly documented.

>  if (rom->addr + rom->romsize < addr) {
>  continue;
>  }
> -if (rom->addr > end) {
> +if (rom->addr > end || rom->addr < addr) {

Reviewed-by: Philippe Mathieu-Daudé 

>  break;
>  }
>  
> 
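
For readers following along, here is a small standalone sketch of why the
extra "rom->addr < addr" test matters. It is not the QEMU code: hwaddr_t,
rom_offset() and rom_starts_in_window() are hypothetical names, and the
only assumption taken from the patch is that the addresses are unsigned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned 64-bit guest physical address. */
typedef uint64_t hwaddr_t;

/*
 * Offset of a ROM blob inside a destination window starting at win_addr.
 * With unsigned arithmetic, rom_addr < win_addr does not yield a negative
 * number: the result wraps around to a huge positive offset.
 */
static hwaddr_t rom_offset(hwaddr_t rom_addr, hwaddr_t win_addr)
{
    return rom_addr - win_addr;
}

/*
 * Sketch of the sanity check: mirror the patched condition and accept a
 * ROM only if it starts inside [win_addr, win_addr + win_size].
 */
static bool rom_starts_in_window(hwaddr_t rom_addr, hwaddr_t win_addr,
                                 uint64_t win_size)
{
    hwaddr_t end = win_addr + win_size;

    if (rom_addr > end || rom_addr < win_addr) {
        return false;
    }
    return true;
}
```

Adding the wrapped offset to a destination pointer is what sends the
memcpy() to a bad location, so the range has to be rejected before the
offset is ever computed.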



Re: [PATCH 6/7] target/ppc: use existing VsrD() macro to eliminate HI_IDX and LO_IDX from dfp_helper.c

2019-09-25 Thread Mark Cave-Ayland
On 24/09/2019 22:46, Richard Henderson wrote:

> On 9/24/19 8:35 AM, Mark Cave-Ayland wrote:
>> Switch over all accesses to the decimal numbers held in struct PPC_DFP from
>> using HI_IDX and LO_IDX to using the VsrD() macro instead. Not only does this
>> allow the compiler to ensure that the various dfp_* functions are being passed
>> a ppc_vsr_t rather than an arbitrary uint64_t pointer, but also allows the
>> host endian-specific HI_IDX and LO_IDX to be completely removed from
>> dfp_helper.c.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  target/ppc/dfp_helper.c | 70 ++---
>>  1 file changed, 31 insertions(+), 39 deletions(-)
> 
> Ho hum, vs patch 5 that was me not realizing how many different places really
> want to manipulate a 128-bit value.  Do go ahead and keep ppc_vsr_t for now.

Yes, it's a little bit confusing in places as some operations are done on the
decNumber whilst others are done on the decimal representation. After trying a
few different approaches, using ppc_vsr_t seemed to be the easiest and most
readable solution here.

I see now that you've given R-b tags for patches 3-7, and having slept on it I'm
inclined to leave patches 1-2 as they are now, i.e. no code changes other than
introducing the get/set helpers to help keep the patchset as mechanical as
possible. Do you think that seems a reasonable approach?

> It does look like we might be well served by using Int128 at some point, so
> that these operations can expand to int128_t on appropriate hw so that the
> compiler can DTRT with these.

Certainly ppc_vsr_t already has __uint128_t and Int128 elements but the
impression I got from the #ifdef is that not all compilers would support it?
Although having said that, making such a change is not something that's really
on my radar.


ATB,

Mark.
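
As an illustration of the idea discussed above (one accessor that always
names the most significant doubleword, instead of host-specific HI_IDX and
LO_IDX), here is a rough standalone sketch. The type vsr128_t and the
helpers host_is_little_endian() and vsr_dword() are invented for this
example, and it picks the index at run time, whereas a macro such as
VsrD() would presumably resolve it at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit register image, stored in host byte order. */
typedef union {
    uint64_t u64[2];
    uint8_t  u8[16];
} vsr128_t;

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}

/*
 * Pointer to doubleword i, where i == 0 always names the most
 * significant half, whatever the host byte order is.
 */
static uint64_t *vsr_dword(vsr128_t *v, int i)
{
    assert(i == 0 || i == 1);
    return &v->u64[host_is_little_endian() ? 1 - i : i];
}
```

Callers then write vsr_dword(&v, 0) for the high half on any host, and
passing a pointer to the union rather than a bare uint64_t pointer lets
the compiler check the argument type.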



Re: [PATCH v7 0/8] Add Qemu to SeaBIOS LCHS interface

2019-09-25 Thread John Snow



On 9/25/19 7:06 AM, Sam Eiderman via wrote:
> v1:
> 
> Non-standard logical geometries break under QEMU.
> 
> A virtual disk which contains an operating system which depends on
> logical geometries (consistent values being reported from BIOS INT13
> AH=08) will most likely break under QEMU/SeaBIOS if it has non-standard
> logical geometries - for example 56 SPT (sectors per track).
> No matter what QEMU will guess - SeaBIOS, for large enough disks - will
> use LBA translation, which will report 63 SPT instead.
> 
> In addition we can not enforce SeaBIOS to rely on physical geometries at
> all. A virtio-blk-pci virtual disk with 255 physical heads can not
> report more than 16 physical heads when moved to an IDE controller, the
> ATA spec allows a maximum of 16 heads - this is an artifact of
> virtualization.
> 
> By supplying the logical geometries directly we are able to support such
> "exotic" disks.
> 
> We will use fw_cfg to do just that.
> 
> v2:
> 
> Fix missing parenthesis check in
> "hd-geo-test: Add tests for lchs override"
> 
> v3:
> 
> * Rename fw_cfg key to "bios-geometry".
> * Remove "extendible" interface.
> * Add cpu_to_le32 fix as Laszlo suggested for big endian hosts
> * Fix last qtest commit - automatic docker tester for some reason does not have qemu-img set
> 
> v4:
> 
> * Change fw_cfg interface from mixed textual/binary to textual only
> 
> v5:
> 
> * Fix line > 80 chars in tests/hd-geo-test.c
> 
> v6:
> 
> * Small fixes for issues pointed by Max
> * (&conf->conf)->lcyls to conf->conf.lcyls and so on
> * Remove scsi_unrealize from everything other than scsi-hd
> * Add proper include to sysemu.h
> * scsi_device_unrealize() after scsi_device_purge_requests()
> 
> v7:
> 
> * Adapted last commit (tests) to changes in qtest
> 
> Sam Eiderman (8):
>   block: Refactor macros - fix tabbing
>   block: Support providing LCHS from user
>   bootdevice: Add interface to gather LCHS
>   scsi: Propagate unrealize() callback to scsi-hd
>   bootdevice: Gather LCHS from all relevant devices
>   bootdevice: Refactor get_boot_devices_list
>   bootdevice: FW_CFG interface for LCHS values
>   hd-geo-test: Add tests for lchs override
> 
>  bootdevice.c | 148 --
>  hw/block/virtio-blk.c|   6 +
>  hw/ide/qdev.c|   7 +-
>  hw/nvram/fw_cfg.c|  14 +-
>  hw/scsi/scsi-bus.c   |  16 ++
>  hw/scsi/scsi-disk.c  |  12 +
>  include/hw/block/block.h |  22 +-
>  include/hw/scsi/scsi.h   |   1 +
>  include/sysemu/sysemu.h  |   4 +
>  tests/Makefile.include   |   2 +-
>  tests/hd-geo-test.c  | 589 +++
>  11 files changed, 780 insertions(+), 41 deletions(-)
> 

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js

Is that the right tree? Nope, but time's marching on without us. If any
other maintainer has an objection, you have until Friday before I send
the PR!
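
To make the cover letter's point about sectors per track concrete, here is
a small sketch of the classic CHS to LBA mapping. It is not taken from the
series; chs_geometry and chs_to_lba() are hypothetical names. It shows
that the same C/H/S triple lands on a different sector once the BIOS
reports 63 SPT for a disk that was laid out with 56 SPT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry triple; only heads and spt enter the mapping. */
typedef struct {
    uint32_t cyls;
    uint32_t heads;
    uint32_t spt;   /* sectors per track */
} chs_geometry;

/* Classic CHS -> LBA mapping; sector numbers start at 1. */
static uint64_t chs_to_lba(const chs_geometry *g,
                           uint32_t c, uint32_t h, uint32_t s)
{
    assert(h < g->heads && s >= 1 && s <= g->spt);
    return ((uint64_t)c * g->heads + h) * g->spt + (s - 1);
}
```

With 16 heads, C/H/S 1/0/1 maps to LBA 896 under 56 SPT but to LBA 1008
under 63 SPT, so a guest that computes addresses from the reported
geometry reads the wrong sectors.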



Re: [PATCH 12/20] spapr: Simplify spapr_qirq() handling

2019-09-25 Thread Greg Kurz
On Wed, 25 Sep 2019 16:45:26 +1000
David Gibson  wrote:

> Currently spapr_qirq(), used to find the qemu_irq for an spapr global irq
> number, redirects through the SpaprIrq::qirq method.  But the array of
> qemu_irqs is allocated in the PAPR layer, not the backends, and so the
> method implementations all return the same thing, just differing in the
> preliminary checks they make.
> 
> So, we can remove the method, and just implement spapr_qirq() directly,
> including all the relevant checks in one place.  We change all those
> checks into assert()s as well, since a failure here indicates an error in
> the calling code.
> 
> Signed-off-by: David Gibson 
> ---

Reviewed-by: Greg Kurz 

>  hw/ppc/spapr_irq.c | 47 ++
>  include/hw/ppc/spapr_irq.h |  1 -
>  2 files changed, 12 insertions(+), 36 deletions(-)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 9a9e486eb5..038bf4 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -150,17 +150,6 @@ static void spapr_irq_free_xics(SpaprMachineState *spapr, int irq, int num)
>  }
>  }
>  
> -static qemu_irq spapr_qirq_xics(SpaprMachineState *spapr, int irq)
> -{
> -ICSState *ics = spapr->ics;
> -
> -if (ics_valid_irq(ics, irq)) {
> -return spapr->qirqs[irq];
> -}
> -
> -return NULL;
> -}
> -
>  static void spapr_irq_print_info_xics(SpaprMachineState *spapr, Monitor *mon)
>  {
>  CPUState *cs;
> @@ -242,7 +231,6 @@ SpaprIrq spapr_irq_xics = {
>  .init= spapr_irq_init_xics,
>  .claim   = spapr_irq_claim_xics,
>  .free= spapr_irq_free_xics,
> -.qirq= spapr_qirq_xics,
>  .print_info  = spapr_irq_print_info_xics,
>  .dt_populate = spapr_dt_xics,
>  .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
> @@ -300,20 +288,6 @@ static void spapr_irq_free_xive(SpaprMachineState *spapr, int irq, int num)
>  }
>  }
>  
> -static qemu_irq spapr_qirq_xive(SpaprMachineState *spapr, int irq)
> -{
> -SpaprXive *xive = spapr->xive;
> -
> -if ((irq < SPAPR_XIRQ_BASE) || (irq >= xive->nr_irqs)) {
> -return NULL;
> -}
> -
> -/* The sPAPR machine/device should have claimed the IRQ before */
> -assert(xive_eas_is_valid(&xive->eat[irq]));
> -
> -return spapr->qirqs[irq];
> -}
> -
>  static void spapr_irq_print_info_xive(SpaprMachineState *spapr,
>Monitor *mon)
>  {
> @@ -413,7 +387,6 @@ SpaprIrq spapr_irq_xive = {
>  .init= spapr_irq_init_xive,
>  .claim   = spapr_irq_claim_xive,
>  .free= spapr_irq_free_xive,
> -.qirq= spapr_qirq_xive,
>  .print_info  = spapr_irq_print_info_xive,
>  .dt_populate = spapr_dt_xive,
>  .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
> @@ -487,11 +460,6 @@ static void spapr_irq_free_dual(SpaprMachineState *spapr, int irq, int num)
>  spapr_irq_xive.free(spapr, irq, num);
>  }
>  
> -static qemu_irq spapr_qirq_dual(SpaprMachineState *spapr, int irq)
> -{
> -return spapr_irq_current(spapr)->qirq(spapr, irq);
> -}
> -
>  static void spapr_irq_print_info_dual(SpaprMachineState *spapr, Monitor *mon)
>  {
>  spapr_irq_current(spapr)->print_info(spapr, mon);
> @@ -586,7 +554,6 @@ SpaprIrq spapr_irq_dual = {
>  .init= spapr_irq_init_dual,
>  .claim   = spapr_irq_claim_dual,
>  .free= spapr_irq_free_dual,
> -.qirq= spapr_qirq_dual,
>  .print_info  = spapr_irq_print_info_dual,
>  .dt_populate = spapr_irq_dt_populate_dual,
>  .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
> @@ -700,7 +667,18 @@ void spapr_irq_free(SpaprMachineState *spapr, int irq, int num)
>  
>  qemu_irq spapr_qirq(SpaprMachineState *spapr, int irq)
>  {
> -return spapr->irq->qirq(spapr, irq);
> +assert(irq >= SPAPR_XIRQ_BASE);
> +assert(irq < (spapr->irq->nr_xirqs + SPAPR_XIRQ_BASE));
> +
> +if (spapr->ics) {
> +assert(ics_valid_irq(spapr->ics, irq));
> +}
> +if (spapr->xive) {
> +assert(irq < spapr->xive->nr_irqs);
> +assert(xive_eas_is_valid(&spapr->xive->eat[irq]));
> +}
> +
> +return spapr->qirqs[irq];
>  }
>  
>  int spapr_irq_post_load(SpaprMachineState *spapr, int version_id)
> @@ -803,7 +781,6 @@ SpaprIrq spapr_irq_xics_legacy = {
>  .init= spapr_irq_init_xics,
>  .claim   = spapr_irq_claim_xics,
>  .free= spapr_irq_free_xics,
> -.qirq= spapr_qirq_xics,
>  .print_info  = spapr_irq_print_info_xics,
>  .dt_populate = spapr_dt_xics,
>  .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 7e26288fcd..a4e790ef60 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -44,7 +44,6 @@ typedef struct SpaprIrq {
>  void (*init)(SpaprMachineState *spapr, Error **errp);
>  int (*claim)(SpaprMachineState *spapr, int irq, bool 

Re: [PATCH v3 25/33] tests/docker: Add fedora-win10sdk-cross image

2019-09-25 Thread Philippe Mathieu-Daudé
Hi Alex,

On 9/24/19 11:00 PM, Alex Bennée wrote:
> From: Philippe Mathieu-Daudé 
> 
> To build WHPX (Windows Hypervisor) binaries, we need the WHPX
> headers provided by the Windows SDK.

Justin is checking with his company if this patch is OK with them,
I'd rather wait before merging it:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg646351.html

Can you unqueue this and the next patch (which depends on it) meanwhile
please?

Thanks,

Phil.

> Add a script that fetches the required MSI/CAB files from the
> latest SDK (currently 10.0.18362.1).
> 
> Headers are accessible under /opt/win10sdk/include.
> 
> Set the QEMU_CONFIGURE_OPTS environment variable accordingly,
> enabling HAX and WHPX. Due to CPP warnings related to Microsoft
> specific #pragmas, we also need to use the '--disable-werror'
> configure flag.
> 
> Cc: Justin Terry 
> Signed-off-by: Philippe Mathieu-Daudé 
> Signed-off-by: Alex Bennée 
> Message-Id: <20190920113329.16787-3-phi...@redhat.com>
> ---
>  tests/docker/Makefile.include |  2 ++
>  .../dockerfiles/fedora-win10sdk-cross.docker  | 23 
>  tests/docker/dockerfiles/win10sdk-dl.sh   | 27 +++
>  3 files changed, 52 insertions(+)
>  create mode 100644 tests/docker/dockerfiles/fedora-win10sdk-cross.docker
>  create mode 100755 tests/docker/dockerfiles/win10sdk-dl.sh
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index 3fc7a863e51..e85e73025ba 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -125,6 +125,8 @@ docker-image-debian-ppc64-cross: docker-image-debian10
>  docker-image-debian-riscv64-cross: docker-image-debian10
>  docker-image-debian-sh4-cross: docker-image-debian10
>  docker-image-debian-sparc64-cross: docker-image-debian10
> +docker-image-fedora-win10sdk-cross: docker-image-fedora
> +docker-image-fedora-win10sdk-cross: EXTRA_FILES:=$(DOCKER_FILES_DIR)/win10sdk-dl.sh
>  
>  docker-image-travis: NOUSER=1
>  
> diff --git a/tests/docker/dockerfiles/fedora-win10sdk-cross.docker b/tests/docker/dockerfiles/fedora-win10sdk-cross.docker
> new file mode 100644
> index 000..55ca933d40d
> --- /dev/null
> +++ b/tests/docker/dockerfiles/fedora-win10sdk-cross.docker
> @@ -0,0 +1,23 @@
> +#
> +# Docker MinGW64 cross-compiler target with WHPX header installed
> +#
> +# This docker target builds on the Fedora 30 base image.
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +#
> +FROM qemu:fedora
> +
> +RUN dnf install -y \
> +cabextract \
> +msitools \
> +wget
> +
> +# Install WHPX headers from Windows Software Development Kit:
> +# https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
> +ADD win10sdk-dl.sh /usr/local/bin/win10sdk-dl.sh
> +RUN /usr/local/bin/win10sdk-dl.sh
> +
> +ENV QEMU_CONFIGURE_OPTS ${QEMU_CONFIGURE_OPTS} \
> +--cross-prefix=x86_64-w64-mingw32- \
> +--extra-cflags=-I/opt/win10sdk/include --disable-werror \
> +--enable-hax --enable-whpx
> diff --git a/tests/docker/dockerfiles/win10sdk-dl.sh b/tests/docker/dockerfiles/win10sdk-dl.sh
> new file mode 100755
> index 000..1c35c2a2524
> --- /dev/null
> +++ b/tests/docker/dockerfiles/win10sdk-dl.sh
> @@ -0,0 +1,27 @@
> +#!/bin/bash
> +#
> +# Install WHPX headers from Windows Software Development Kit
> +# https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +WINDIR=/opt/win10sdk
> +mkdir -p ${WINDIR}
> +pushd ${WINDIR}
> +# Get the bundle base for Windows SDK v10.0.18362.1
> +BASE_URL=$(curl --silent --include 
> 'http://go.microsoft.com/fwlink/?prd=11966=1.0=0x409=0x409=Windows10=SDK=10.0.18362.1'
>  | sed -nE 's_Location: (.*)/\r_\1_p')/Installers
> +# Fetch the MSI containing the headers
> +wget --no-verbose ${BASE_URL}/'Windows SDK Desktop Headers x86-x86_en-us.msi'
> +while true; do
> +# Fetch all cabinets required by this MSI
> +CAB_NAME=$(msiextract Windows\ SDK\ Desktop\ Headers\ x86-x86_en-us.msi 3>&1 2>&3 3>&-| sed -nE "s_.*Error opening file $PWD/(.*): No such file or directory_\1_p")
> +test -z "${CAB_NAME}" && break
> +wget --no-verbose ${BASE_URL}/${CAB_NAME}
> +done
> +rm *.{cab,msi}
> +mkdir /opt/win10sdk/include
> +# Only keep the WHPX headers
> +for inc in "${WINDIR}/Program Files/Windows Kits/10/Include/10.0.18362.0/um"/WinHv*; do
> +ln -s "${inc}" /opt/win10sdk/include
> +done
> +popd > /dev/null
> 



Re: [PATCH 11/20] spapr: Fix indexing of XICS irqs

2019-09-25 Thread Greg Kurz
On Wed, 25 Sep 2019 16:45:25 +1000
David Gibson  wrote:

> spapr global irq numbers are different from the source numbers on the ICS
> when using XICS - they're offset by XICS_IRQ_BASE (0x1000).  But
> spapr_irq_set_irq_xics() was passing through the global irq number to
> the ICS code unmodified.
> 
> We only got away with this because of a counteracting bug - we were
> incorrectly adjusting the qemu_irq we returned for a requested global irq
> number.
> 
> That approach mostly worked but is very confusing, incorrectly relies on
> the way the qemu_irq array is allocated, and undermines the intention of
> having the global array of qemu_irqs for spapr have a consistent meaning
> regardless of irq backend.
> 
> So, fix both set_irq and qemu_irq indexing.  We rename some parameters at
> the same time to make it clear that they are referring to spapr global
> irq numbers.
> 
> Signed-off-by: David Gibson 
> ---

Reviewed-by: Greg Kurz 

A further cleanup could be to have the XICS backend only take global
irq numbers and convert them to ICS source numbers internally. This
would put an end to the confusion between srcno/irq in the frontend
code.

>  hw/ppc/spapr_irq.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 300c65be3a..9a9e486eb5 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -153,10 +153,9 @@ static void spapr_irq_free_xics(SpaprMachineState *spapr, int irq, int num)
>  static qemu_irq spapr_qirq_xics(SpaprMachineState *spapr, int irq)
>  {
>  ICSState *ics = spapr->ics;
> -uint32_t srcno = irq - ics->offset;
>  
>  if (ics_valid_irq(ics, irq)) {
> -return spapr->qirqs[srcno];
> +return spapr->qirqs[irq];
>  }
>  
>  return NULL;
> @@ -204,9 +203,10 @@ static int spapr_irq_post_load_xics(SpaprMachineState *spapr, int version_id)
>  return 0;
>  }
>  
> -static void spapr_irq_set_irq_xics(void *opaque, int srcno, int val)
> +static void spapr_irq_set_irq_xics(void *opaque, int irq, int val)
>  {
>  SpaprMachineState *spapr = opaque;
> +uint32_t srcno = irq - spapr->ics->offset;
>  
>  ics_set_irq(spapr->ics, srcno, val);
>  }
> @@ -377,14 +377,14 @@ static void spapr_irq_reset_xive(SpaprMachineState *spapr, Error **errp)
>  spapr_xive_mmio_set_enabled(spapr->xive, true);
>  }
>  
> -static void spapr_irq_set_irq_xive(void *opaque, int srcno, int val)
> +static void spapr_irq_set_irq_xive(void *opaque, int irq, int val)
>  {
>  SpaprMachineState *spapr = opaque;
>  
>  if (kvm_irqchip_in_kernel()) {
> -kvmppc_xive_source_set_irq(&spapr->xive->source, srcno, val);
> +kvmppc_xive_source_set_irq(&spapr->xive->source, irq, val);
>  } else {
> -xive_source_set_irq(&spapr->xive->source, srcno, val);
> +xive_source_set_irq(&spapr->xive->source, irq, val);
>  }
>  }
>  
> @@ -563,11 +563,11 @@ static void spapr_irq_reset_dual(SpaprMachineState *spapr, Error **errp)
>  spapr_irq_current(spapr)->reset(spapr, errp);
>  }
>  
> -static void spapr_irq_set_irq_dual(void *opaque, int srcno, int val)
> +static void spapr_irq_set_irq_dual(void *opaque, int irq, int val)
>  {
>  SpaprMachineState *spapr = opaque;
>  
> -spapr_irq_current(spapr)->set_irq(spapr, srcno, val);
> +spapr_irq_current(spapr)->set_irq(spapr, irq, val);
>  }
>  
>  static const char *spapr_irq_get_nodename_dual(SpaprMachineState *spapr)
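
A tiny standalone sketch of the convention the patch settles on: callers
pass global irq numbers everywhere, and the conversion to an ICS source
number happens in one place. The XICS_IRQ_BASE value is the one quoted in
the commit message; ics_sketch and the two helpers are hypothetical
stand-ins, not the real ICSState API.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of XICS interrupt numbers, as quoted in the commit message. */
#define XICS_IRQ_BASE 0x1000

/* Hypothetical, stripped-down stand-in for the ICS state. */
typedef struct {
    uint32_t offset;   /* first global irq number served by this ICS */
    uint32_t nr_irqs;  /* number of sources */
} ics_sketch;

/* Is this global irq number served by the ICS? */
static int ics_sketch_valid_irq(const ics_sketch *ics, uint32_t irq)
{
    return irq >= ics->offset && irq < ics->offset + ics->nr_irqs;
}

/* Convert a global irq number to an ICS source number. */
static uint32_t ics_sketch_srcno(const ics_sketch *ics, uint32_t irq)
{
    assert(ics_sketch_valid_irq(ics, irq));
    return irq - ics->offset;
}
```

An array indexed by global irq number keeps the same meaning for every
backend; only code that talks to the ICS itself needs the source number.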




Re: Debian support lifetime (was Re: [PATCH] docker: move tests from python2 to python3)

2019-09-25 Thread Eduardo Habkost
On Tue, Sep 24, 2019 at 08:35:13AM +0100, Daniel P. Berrangé wrote:
> On Mon, Sep 23, 2019 at 04:05:33PM -0300, Eduardo Habkost wrote:
[...]
> > Even for other long-lifetime distros, I really think "2 years
> > after the new major version is released" is too long, and I'd
> > like to shorten this to 1 year.
> 
> I guess this is ok, since this is still quite a long life time of
> support for distros. eg RHEL has a 3-4 year gap between major
> releases, that gives 4-5 years for each release being supported by
> QEMU. Other LTS distros are similar

Do you mean the 2 years period is OK (and shouldn't be changed),
or that shortening it to 1 year is OK?

-- 
Eduardo



Re: [PATCH v4 10/16] cputlb: Partially inline memory_region_section_get_iotlb

2019-09-25 Thread David Hildenbrand
On 25.09.19 19:55, Richard Henderson wrote:
> On 9/24/19 12:59 AM, David Hildenbrand wrote:
>>> +is_ram = memory_region_is_ram(section->mr);
>>> +is_romd = memory_region_is_romd(section->mr);
>>> +
>>> +if (is_ram || is_romd) {
>>> +/* RAM and ROMD both have associated host memory. */
>>>  addend = (uintptr_t)memory_region_get_ram_ptr(section->mr) + xlat;
>>> +} else {
>>> +/* I/O does not; force the host address to NULL. */
>>> +addend = 0;
>>> +}
>>> +
>>> +write_address = address;
>>
>> I guess the only "suboptimal" change is that you now have two checks for
>> "prot & PAGE_WRITE" in the case of ram instead of one.
> 
> It's a single bit test on a register operand -- as cheap as can be.  If you
> look at the entire code, there *must* be more than one test.  You can rearrange
> the code to choose exactly where those tests are, but you'll have to have them
> somewhere.
> 
>>> +/* I/O or ROMD */
>>> +iotlb = memory_region_section_get_iotlb(cpu, section) + xlat;
>>> +/*
>>> + * Writes to romd devices must go through MMIO to enable write.
>>> + * Reads to romd devices go through the ram_ptr found above,
>>> + * but of course reads to I/O must go through MMIO.
>>> + */
>>> +write_address |= TLB_MMIO;
>>
>> ... and here you calculate write_address even if probably unused.
> 
> Well... while the page might not be writable (but I'd bet that it is -- I/O
> memory is almost never read-only), and therefore write_address is technically
> unused, the variable is practically used in the next line:
> 
> if (!is_romd) {
> address = write_address;
> }
> 
> which will compile to a conditional move.
> 
>> Can you move the calculation of the write_address completely into the
>> "prot & PAGE_WRITE" case below?
> 
> We'd prefer not to, since the code below is within the cpu tlb lock region.
> We'd prefer to keep all of the expensive operations outside that.

Makes all sense to me then and looks sane :)

> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb



Re: [PATCH v2 4/7] s390x/mmu: Inject PGM_ADDRESSING on boguous table addresses

2019-09-25 Thread David Hildenbrand
On 25.09.19 21:25, Richard Henderson wrote:
> On 9/25/19 5:52 AM, David Hildenbrand wrote:
>> +static inline int read_table_entry(hwaddr gaddr, uint64_t *entry)
>> +{
>> +/*
>> + * According to the PoP, these table addresses are "unpredictably real
>> + * or absolute". Also, "it is unpredictable whether the address wraps
>> + * or an addressing exception is recognized".
>> + *
>> + * We treat them as absolute addresses and don't wrap them.
>> + */
>> +if (unlikely(address_space_read(&address_space_memory, gaddr,
>> + MEMTXATTRS_UNSPECIFIED, (uint8_t *)entry, sizeof(*entry)) !=
>> + MEMTX_OK)) {
>> +return -EFAULT;
>> +}
>> +*entry = be64_to_cpu(*entry);
>> +return 0;
>> +}
> 
> Maybe I've been away from the kernel too long, but I don't find returning
> -EFAULT helpful.  I would return true/false for success/failure so that...
> 
> 
>> +if (read_table_entry(origin + offs, &new_entry)) {
>> +return PGM_ADDRESSING;
>> +}
> 
> ... this gets written
> 
> if (!read_table_entry(...)) {
> return PGM_ADDRESSING;
> }
> 
> This statement, to me, reads "If we did not read_table_entry, return an
> addressing exception."
> 
> If you *really* want to return non-zero on failure, I would prefer returning
> PGM_ADDRESSING instead of the out-of-context -EFAULT.

I'll go for your suggestion with a bool!

> 
>> -new_entry = ldq_phys(cs->as, origin + offs);
>> +if (read_table_entry(origin + offs, &new_entry)) {
> 
> Do you really want to replace cs->as with address_space_memory?
> 

I guess it shouldn't make a difference (unless I am missing something),
but I can just keep using cs->as.

Thanks!

> 
> r~
> 


-- 

Thanks,

David / dhildenb
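
For the record, the bool-returning variant agreed on above could look
roughly like the sketch below. This is not the actual patch: guest_mem and
read_table_entry_sketch() are made-up names, and a flat byte buffer stands
in for the address space so that the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat guest memory, standing in for the address space. */
typedef struct {
    const uint8_t *bytes;
    size_t size;
} guest_mem;

/*
 * Read one big-endian 64-bit table entry at gaddr.
 * Returns true on success, false if the access falls outside memory.
 */
static bool read_table_entry_sketch(const guest_mem *mem, uint64_t gaddr,
                                    uint64_t *entry)
{
    uint64_t val = 0;
    size_t i;

    assert(mem != NULL && entry != NULL);
    /* Reject accesses that start or end beyond the buffer, without wrapping. */
    if (gaddr > mem->size || mem->size - gaddr < sizeof(val)) {
        return false;
    }
    /* Assemble the value byte by byte, most significant byte first. */
    for (i = 0; i < sizeof(val); i++) {
        val = (val << 8) | mem->bytes[gaddr + i];
    }
    *entry = val;
    return true;
}
```

The call site then reads "if (!read_table_entry_sketch(...)) return
PGM_ADDRESSING;", which matches the wording suggested in the review.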



Re: [PATCH v2 5/7] s390x/mmu: Use TARGET_PAGE_MASK in mmu_translate_pte()

2019-09-25 Thread Richard Henderson
On 9/25/19 5:52 AM, David Hildenbrand wrote:
> While ASCE_ORIGIN is not wrong, it is certainly confusing. We want a
> page frame address.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/mmu_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~



Re: [PATCH v13 00/15] backup-top filter driver for backup

2019-09-25 Thread Vladimir Sementsov-Ogievskiy
Oops, I sent an unfinished message.

On 25.09.2019 22:19, Vladimir Sementsov-Ogievskiy wrote:
> Ogh :(
> 
> And I realized that there is bigger problem with design:
> 
> Assume failed copy in filter request: we want to mark bits dirty again
> and release range lock on source.. But if we have some write reguests
> in parallel, they may already passed backup-top filter, and they are
> only waiting for range lock. When lock is free the will go on and will
> not see bitmap changes..
> 
> That means that we can't use range lock: waiting request must wait on
> backup-top level, but range lock will not work on it, as they will
> interfer with original write request.

With such a design we can't mark bits dirty again. We can switch to another
behavior: on a failed block-copy in the filter, just cancel the whole
block-job. But actually I think both behaviors should be available to the
user:
1. if the backup is important, better to fail guest writes if needed
2. if the guest is important, better to fail the backup job if
copy-before-write fails
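
A minimal sketch of how the two behaviors could be exposed as a policy. All
names below (SketchCbwPolicy, sketch_on_cbw_result and so on) are
hypothetical and exist only for illustration; nothing like this is in the
series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy knob; the names are invented for this sketch. */
typedef enum {
    SKETCH_CBW_FAIL_GUEST_WRITE, /* 1. backup is important */
    SKETCH_CBW_FAIL_BACKUP_JOB   /* 2. guest is important */
} SketchCbwPolicy;

/* What the filter would do after a copy-before-write attempt. */
typedef struct {
    bool fail_guest_write;
    bool cancel_backup_job;
} SketchCbwOutcome;

/*
 * copy_ret follows the usual convention: negative means the
 * copy-before-write operation failed, anything else is success.
 */
static SketchCbwOutcome sketch_on_cbw_result(SketchCbwPolicy policy,
                                             int copy_ret)
{
    SketchCbwOutcome out = { false, false };

    if (copy_ret >= 0) {
        return out; /* copy succeeded, the guest write may proceed */
    }

    switch (policy) {
    case SKETCH_CBW_FAIL_GUEST_WRITE:
        out.fail_guest_write = true;
        break;
    case SKETCH_CBW_FAIL_BACKUP_JOB:
        out.cancel_backup_job = true;
        break;
    default:
        assert(!"unknown policy");
    }
    return out;
}
```

With the second policy the guest write is never failed, so no bits need to
be marked dirty again after a failed copy.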

> 
> I have to rething it somehow, a kind of "intersecting requests" possibly
> will be kept. I still don't like that current backup write-notifier
> locks the whole region, even non-dirty bits, instead we should lock only
> the region which we are handling at the moment.
> 
> Patches 01-11 are still good themselves, as a preparation, let's keep them
> 
> Patches 12-13 are good, but range lock is not appropriate for backup..
> May be they will be used for rewriting copy-on-read filter to copy in
> filter code.. Still I'm not sure, as COR should work through block-copy
> finally, and may possibly reuse same locking.

Better to drop 12-13 for now.

Patch 14 is good, let's keep it. It has a correct abort() in
backup_top_cbw(), it doesn't depend on 12-13, and it's waiting for a
corrected combination of backup-top, backup and block-copy.

And patch 15 is bad, I'll rewrite it. So 16 is not needed either.

> 
> On 20.09.2019 17:20, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>>
>> These series introduce backup-top driver. It's a filter-node, which
>> do copy-before-write operation. Mirror uses filter-node for handling
>> guest writes, let's move to filter-node (from write-notifiers) for
>> backup too.
>>
>> v11,v12 -> v13 changes:
>>
>> [v12 was two fixes in separate: [PATCH v12 0/2] backup: copy_range fixes]
>>
>> 01: new in v12, in v13 change comment
>> 02: in v12: add "Fixes: " to commit msg, in v13 add John's r-b
>> 05: rebase on 01
>> 07: rebase on 01. It still a clean movement, keep r-b
>>
>> Vladimir Sementsov-Ogievskiy (15):
>> block/backup: fix max_transfer handling for copy_range
>> block/backup: fix backup_cow_with_offload for last cluster
>> block/backup: split shareable copying part from backup_do_cow
>> block/backup: improve comment about image fleecing
>> block/backup: introduce BlockCopyState
>> block/backup: fix block-comment style
>> block: move block_copy from block/backup.c to separate file
>> block: teach bdrv_debug_breakpoint skip filters with backing
>> iotests: prepare 124 and 257 bitmap querying for backup-top filter
>> iotests: 257: drop unused Drive.device field
>> iotests: 257: drop device_add
>> block/io: refactor wait_serialising_requests
>> block: add lock/unlock range functions
>> block: introduce backup-top filter driver
>> block/backup: use backup-top instead of write notifiers
>>
>>qapi/block-core.json  |   8 +-
>>block/backup-top.h|  37 ++
>>include/block/block-copy.h|  84 
>>include/block/block_int.h |   5 +
>>block.c   |  34 +-
>>block/backup-top.c| 240 
>>block/backup.c| 440 -
>>block/block-copy.c| 346 
>>block/io.c|  68 +++-
>>block/replication.c   |   2 +-
>>blockdev.c|   1 +
>>block/Makefile.objs   |   3 +
>>block/trace-events|  14 +-
>>tests/qemu-iotests/056|   8 +-
>>tests/qemu-iotests/124|  83 ++--
>>tests/qemu-iotests/257|  91 ++---
>>tests/qemu-iotests/257.out| 714 ++
>>tests/qemu-iotests/iotests.py |  27 ++
>>18 files changed, 1287 insertions(+), 918 deletions(-)
>>create mode 100644 block/backup-top.h
>>create mode 100644 include/block/block-copy.h
>>create mode 100644 block/backup-top.c
>>create mode 100644 block/block-copy.c
>>



Re: [PATCH v2 3/7] s390x/mmu: Inject DAT exceptions from a single place

2019-09-25 Thread Richard Henderson
On 9/25/19 5:52 AM, David Hildenbrand wrote:
> Let's return the PGM from the translation functions on error and inject
> based on that.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/mmu_helper.c | 63 +++
>  1 file changed, 17 insertions(+), 46 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [PATCH v13 00/15] backup-top filter driver for backup

2019-09-25 Thread Vladimir Sementsov-Ogievskiy
Ogh :(

And I realized that there is a bigger problem with the design:

Assume a failed copy in a filter request: we want to mark the bits dirty
again and release the range lock on the source. But if we have some write
requests in parallel, they may have already passed the backup-top filter
and are only waiting for the range lock. When the lock is released they
will go on and will not see the bitmap changes.

That means that we can't use the range lock: a waiting request must wait
at the backup-top level, but the range lock will not work there, as they
will interfere with the original write request.

I have to rethink it somehow; a kind of "intersecting requests" will
possibly be kept. I still don't like that the current backup
write-notifier locks the whole region, even non-dirty bits; instead we
should lock only the region which we are handling at the moment.

Patches 01-11 are still good by themselves, as preparation; let's keep them.

Patches 12-13 are good, but the range lock is not appropriate for backup.
Maybe they will be used for rewriting the copy-on-read filter to copy in
filter code. Still, I'm not sure, as COR should eventually work through
block-copy and may possibly reuse the same locking.

On 20.09.2019 17:20, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> These series introduce backup-top driver. It's a filter-node, which
> do copy-before-write operation. Mirror uses filter-node for handling
> guest writes, let's move to filter-node (from write-notifiers) for
> backup too.
> 
> v11,v12 -> v13 changes:
> 
> [v12 was two fixes in separate: [PATCH v12 0/2] backup: copy_range fixes]
> 
> 01: new in v12, in v13 change comment
> 02: in v12: add "Fixes: " to commit msg, in v13 add John's r-b
> 05: rebase on 01
> 07: rebase on 01. It still a clean movement, keep r-b
> 
> Vladimir Sementsov-Ogievskiy (15):
>block/backup: fix max_transfer handling for copy_range
>block/backup: fix backup_cow_with_offload for last cluster
>block/backup: split shareable copying part from backup_do_cow
>block/backup: improve comment about image fleecing
>block/backup: introduce BlockCopyState
>block/backup: fix block-comment style
>block: move block_copy from block/backup.c to separate file
>block: teach bdrv_debug_breakpoint skip filters with backing
>iotests: prepare 124 and 257 bitmap querying for backup-top filter
>iotests: 257: drop unused Drive.device field
>iotests: 257: drop device_add
>block/io: refactor wait_serialising_requests
>block: add lock/unlock range functions
>block: introduce backup-top filter driver
>block/backup: use backup-top instead of write notifiers
> 
>   qapi/block-core.json  |   8 +-
>   block/backup-top.h|  37 ++
>   include/block/block-copy.h|  84 
>   include/block/block_int.h |   5 +
>   block.c   |  34 +-
>   block/backup-top.c| 240 
>   block/backup.c| 440 -
>   block/block-copy.c| 346 
>   block/io.c|  68 +++-
>   block/replication.c   |   2 +-
>   blockdev.c|   1 +
>   block/Makefile.objs   |   3 +
>   block/trace-events|  14 +-
>   tests/qemu-iotests/056|   8 +-
>   tests/qemu-iotests/124|  83 ++--
>   tests/qemu-iotests/257|  91 ++---
>   tests/qemu-iotests/257.out| 714 ++
>   tests/qemu-iotests/iotests.py |  27 ++
>   18 files changed, 1287 insertions(+), 918 deletions(-)
>   create mode 100644 block/backup-top.h
>   create mode 100644 include/block/block-copy.h
>   create mode 100644 block/backup-top.c
>   create mode 100644 block/block-copy.c
> 



Re: [PATCH v2 4/7] s390x/mmu: Inject PGM_ADDRESSING on boguous table addresses

2019-09-25 Thread Richard Henderson
On 9/25/19 5:52 AM, David Hildenbrand wrote:
> +static inline int read_table_entry(hwaddr gaddr, uint64_t *entry)
> +{
> +/*
> + * According to the PoP, these table addresses are "unpredictably real
> + * or absolute". Also, "it is unpredictable whether the address wraps
> + * or an addressing exception is recognized".
> + *
> + * We treat them as absolute addresses and don't wrap them.
> + */
> +if (unlikely(address_space_read(&address_space_memory, gaddr,
> + MEMTXATTRS_UNSPECIFIED, (uint8_t *)entry, sizeof(*entry)) !=
> + MEMTX_OK)) {
> +return -EFAULT;
> +}
> +*entry = be64_to_cpu(*entry);
> +return 0;
> +}

Maybe I've been away from the kernel too long, but I don't find returning
-EFAULT helpful.  I would return true/false for success/failure so that...


> +if (read_table_entry(origin + offs, &new_entry)) {
> +return PGM_ADDRESSING;
> +}

... this gets written

if (!read_table_entry(...)) {
return PGM_ADDRESSING;
}

This statement, to me, reads "If we did not read_table_entry, return an
addressing exception."

If you *really* want to return non-zero on failure, I would prefer returning
PGM_ADDRESSING instead of the out-of-context -EFAULT.
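
As a rough, self-contained illustration of the bool-returning shape (the
sketch_* names, the toy memory and the placeholder exception code are made
up for this example and are not QEMU code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch only; not QEMU definitions. */
#define SKETCH_PGM_ADDRESSING 0x05
#define SKETCH_MEM_WORDS 4

static const uint64_t sketch_mem[SKETCH_MEM_WORDS] = {
    0x11, 0x22, 0x33, 0x44
};

/*
 * Returns true on success and false on failure. The toy memory only
 * handles aligned 8-byte reads inside sketch_mem.
 */
static bool sketch_read_table_entry(uint64_t gaddr, uint64_t *entry)
{
    assert(entry != NULL);

    if (gaddr % sizeof(uint64_t) != 0 ||
        gaddr / sizeof(uint64_t) >= SKETCH_MEM_WORDS) {
        return false;
    }
    *entry = sketch_mem[gaddr / sizeof(uint64_t)];
    return true;
}

/* Caller: "if we did not read the table entry, report an exception". */
static int sketch_translate(uint64_t origin, uint64_t offs, uint64_t *out)
{
    uint64_t new_entry;

    if (!sketch_read_table_entry(origin + offs, &new_entry)) {
        return SKETCH_PGM_ADDRESSING;
    }
    *out = new_entry;
    return 0;
}
```

The errno-style value never leaves the helper this way, and the call site
reads the same as the prose description.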

> -new_entry = ldq_phys(cs->as, origin + offs);
> +if (read_table_entry(origin + offs, &new_entry)) {

Do you really want to replace cs->as with address_space_memory?


r~



Re: [PATCH v2 1/7] s390x/mmu: Drop debug logging from MMU code

2019-09-25 Thread Richard Henderson
On 9/25/19 5:52 AM, David Hildenbrand wrote:
> Let's get it out of the way to make some further refactorings easier.
> Personally, I've never used these debug statements at all. And if I had
> to debug issue, I used plain GDB instead (debug prints are just way too
> much noise in the MMU). We might want to introduce tracing at some point
> instead, so we can able selected events on demand.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/mmu_helper.c | 51 ---
>  1 file changed, 51 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [PATCH v2 2/7] s390x/mmu: Move DAT protection handling out of mmu_translate_asce()

2019-09-25 Thread Richard Henderson
On 9/25/19 5:52 AM, David Hildenbrand wrote:
> We'll reuse the ilen and tec definitions in mmu_translate
> soon also for all other DAT exceptions we inject. Move it to the caller,
> where we can later pair it up with other protection checks, like IEP.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/mmu_helper.c | 39 ---
>  1 file changed, 16 insertions(+), 23 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [PATCH v8 01/13] vfio: KABI for migration interface

2019-09-25 Thread Alex Williamson
On Tue, 24 Sep 2019 23:04:22 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Wednesday, September 25, 2019 2:03 AM
> > 
> > On Tue, 24 Sep 2019 02:19:15 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Tian, Kevin
> > > > Sent: Friday, September 13, 2019 7:00 AM
> > > >  
> > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > Sent: Thursday, September 12, 2019 10:41 PM
> > > > >
> > > > > On Tue, 3 Sep 2019 06:57:27 +
> > > > > "Tian, Kevin"  wrote:
> > > > >  
> > > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > > > Sent: Saturday, August 31, 2019 12:33 AM
> > > > > > >
> > > > > > > On Fri, 30 Aug 2019 08:06:32 +
> > > > > > > "Tian, Kevin"  wrote:
> > > > > > >  
> > > > > > > > > From: Tian, Kevin
> > > > > > > > > Sent: Friday, August 30, 2019 3:26 PM
> > > > > > > > >  
> > > > > > > > [...]  
> > > > > > > > > > How does QEMU handle the fact that IOVAs are potentially  
> > > > > dynamic  
> > > > > > > while  
> > > > > > > > > > performing the live portion of a migration?  For example,  
> > each  
> > > > > time a  
> > > > > > > > > > guest driver calls dma_map_page() or dma_unmap_page(), a
> > > > > > > > > > MemoryRegionSection pops in or out of the AddressSpace for  
> > > > the  
> > > > > device  
> > > > > > > > > > (I'm assuming a vIOMMU where the device AddressSpace is  
> > not  
> > > > > > > > > > system_memory).  I don't see any QEMU code that intercepts  
> > > > that  
> > > > > > > change  
> > > > > > > > > > in the AddressSpace such that the IOVA dirty pfns could be  
> > > > > recorded and  
> > > > > > > > > > translated to GFNs.  The vendor driver can't track these  
> > beyond  
> > > > > getting  
> > > > > > > > > > an unmap notification since it only knows the IOVA pfns,  
> > which  
> > > > > can be  
> > > > > > > > > > re-used with different GFN backing.  Once the DMA mapping  
> > is  
> > > > > torn  
> > > > > > > down,  
> > > > > > > > > > it seems those dirty pfns are lost in the ether.  If this 
> > > > > > > > > > works in  
> > > > > QEMU,  
> > > > > > > > > > please help me find the code that handles it.  
> > > > > > > > >
> > > > > > > > > I'm curious about this part too. Interestingly, I didn't find 
> > > > > > > > > any  
> > > > > log_sync  
> > > > > > > > > callback registered by emulated devices in Qemu. Looks dirty  
> > > > pages  
> > > > > > > > > by emulated DMAs are recorded in some implicit way. But KVM  
> > > > > always  
> > > > > > > > > reports dirty page in GFN instead of IOVA, regardless of the  
> > > > > presence of  
> > > > > > > > > vIOMMU. If Qemu also tracks dirty pages in GFN for emulated  
> > > > DMAs  
> > > > > > > > >  (translation can be done when DMA happens), then we don't  
> > > > need  
> > > > > > > > > worry about transient mapping from IOVA to GFN. Along this  
> > way  
> > > > > we  
> > > > > > > > > also want GFN-based dirty bitmap being reported through VFIO,
> > > > > > > > > similar to what KVM does. For vendor drivers, it needs to  
> > translate  
> > > > > > > > > from IOVA to HVA to GFN when tracking DMA activities on  
> > VFIO  
> > > > > > > > > devices. IOVA->HVA is provided by VFIO. for HVA->GFN, it can  
> > be  
> > > > > > > > > provided by KVM but I'm not sure whether it's exposed now.
> > > > > > > > >  
> > > > > > > >
> > > > > > > > HVA->GFN can be done through hva_to_gfn_memslot in  
> > kvm_host.h.  
> > > > > > >
> > > > > > > I thought it was bad enough that we have vendor drivers that  
> > depend  
> > > > > on  
> > > > > > > KVM, but designing a vfio interface that only supports a KVM  
> > interface  
> > > > > > > is more undesirable.  I also note without comment that  
> > > > > gfn_to_memslot()  
> > > > > > > is a GPL symbol.  Thanks,  
> > > > > >
> > > > > > yes it is bad, but sometimes inevitable. If you recall our 
> > > > > > discussions
> > > > > > back to 3yrs (when discussing the 1st mdev framework), there were  
> > > > > similar  
> > > > > > hypervisor dependencies in GVT-g, e.g. querying gpa->hpa when
> > > > > > creating some shadow structures. gpa->hpa is definitely hypervisor
> > > > > > specific knowledge, which is easy in KVM (gpa->hva->hpa), but  
> > needs  
> > > > > > hypercall in Xen. but VFIO already makes assumption based on  
> > KVM-  
> > > > > > only flavor when implementing vfio_{un}pin_page_external.  
> > > > >
> > > > > Where's the KVM assumption there?  The MAP_DMA ioctl takes an  
> > IOVA  
> > > > > and
> > > > > HVA.  When an mdev vendor driver calls vfio_pin_pages(), we GUP the  
> > > > HVA  
> > > > > to get an HPA and provide an array of HPA pfns back to the caller.  
> > > > > The
> > > > > other vGPU mdev vendor manages to make use of this without KVM...  
> > the  
> > > > > KVM interface used by GVT-g is GPL-only.  
> > > >
> > > > To be clear it's the assumption on the host-based hypervisors e.g. KVM.
> > > > GUP is 

Re: [PATCH v3 23/33] docs/devel: add "check-tcg" to testing.rst

2019-09-25 Thread Richard Henderson
On 9/24/19 2:00 PM, Alex Bennée wrote:
> It was pointed out we haven't documented the check-tcg part of the
> build system. Attempt to rectify that now.
> 
> Signed-off-by: Alex Bennée 
> ---
>  docs/devel/testing.rst | 62 ++
>  1 file changed, 62 insertions(+)

Reviewed-by: Richard Henderson 


r~




[Bug 1841990] Re: instruction 'denbcdq' misbehaving

2019-09-25 Thread Mark Cave-Ayland
That certainly sounds like progress. Did you see the follow up email
indicating the typo that I found in patch 6? It can be fixed by applying
the following diff on top:

diff --git a/target/ppc/dfp_helper.c b/target/ppc/dfp_helper.c
index c2d335e928..b801acbedc 100644
--- a/target/ppc/dfp_helper.c
+++ b/target/ppc/dfp_helper.c
@@ -1054,7 +1054,7 @@ static inline void dfp_set_sign_64(ppc_vsr_t *t, uint8_t sgn)
 static inline void dfp_set_sign_128(ppc_vsr_t *t, uint8_t sgn)
 {
 t->VsrD(0) <<= 4;
-t->VsrD(0) |= (t->VsrD(0) >> 60);
+t->VsrD(0) |= (t->VsrD(1) >> 60);
 t->VsrD(1) <<= 4;
 t->VsrD(1) |= (sgn & 0xF);
 }

Does that help any more tests to pass? Also the changes to the FP
register layout were made in QEMU 4.0 and so it seems to me that even if
some tests fail, if the results between QEMU 3.1 and QEMU git master
with the patchset applied are equivalent then we can assume that the
patchset functionality is correct.
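
For anyone following along, here is a small standalone sketch of what the
corrected helper is meant to do. It assumes a plain two-field struct in
place of ppc_vsr_t, with hi standing in for VsrD(0) and lo for VsrD(1); the
Sketch128 and sketch_set_sign_128 names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 128-bit register pair. */
typedef struct {
    uint64_t hi; /* plays the role of VsrD(0) */
    uint64_t lo; /* plays the role of VsrD(1) */
} Sketch128;

/*
 * Shift the 128-bit value left by one 4-bit digit and place the sign
 * nibble in the lowest digit. The top nibble of the low half has to be
 * carried into the high half before the low half is shifted.
 */
static void sketch_set_sign_128(Sketch128 *t, uint8_t sgn)
{
    assert(t != NULL);

    t->hi <<= 4;
    t->hi |= (t->lo >> 60); /* carry from the low half, not from hi */
    t->lo <<= 4;
    t->lo |= (sgn & 0xF);
}
```

With hi = 0x1 and lo = 0xA000000000000002 and a sign of 0xC, this gives
hi = 0x1A and lo = 0x2C. Taking the carry from the already shifted high
half instead would leave hi = 0x10 and lose the 0xA digit.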

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841990

Title:
  instruction 'denbcdq' misbehaving

Status in QEMU:
  New

Bug description:
  Instruction 'denbcdq' appears to have no effect.  Test case attached.

  On ppc64le native:
  --
  gcc -g -O -mcpu=power9 bcdcfsq.c test-denbcdq.c -o test-denbcdq
  $ ./test-denbcdq
  0x
  0x000c
  0x2208
  $ ./test-denbcdq 1
  0x0001
  0x001c
  0x22080001
  $ ./test-denbcdq $(seq 0 99)
  0x0064
  0x100c
  0x22080080
  --

  With "qemu-ppc64le -cpu power9"
  --
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq
  0x
  0x000c
  0x000c
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq 1
  0x0001
  0x001c
  0x001c
  $ qemu-ppc64le -cpu power9 -L [...] ./test-denbcdq $(seq 100)
  0x0064
  0x100c
  0x100c
  --

  I started looking at the code, but I got confused rather quickly.
  Could be related to endianness? I think denbcdq arrived on the scene
  before little-endian was a big deal.  Maybe something to do with
  utilizing implicit floating-point register pairs...  I don't think the
  right data is getting to helper_denbcdq, which would point back to the
  gen_fprp_ptr uses in dfp-impl.inc.c (GEN_DFP_T_FPR_I32_Rc).  (Maybe?)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841990/+subscriptions



Re: [PATCH v3 33/33] tests/docker: remove debian-powerpc-user-cross

2019-09-25 Thread Richard Henderson
On 9/24/19 2:01 PM, Alex Bennée wrote:
> Despite our attempts in 4d26c7fef4 to keep this going it still gets in
> the way of "make docker-test-build" completing because of course we
> can't build a modern QEMU with the image. Let's put the thing out of
> it's misery and remove it.
> 
> People who really care about building on powerpc can still use the
> binfmt_misc support to manually build an image (or just run the build
> from pre this commit).
> 
> Signed-off-by: Alex Bennée 
> Cc: Mark Cave-Ayland 
> ---
>  tests/docker/Makefile.include |  9 
>  .../debian-powerpc-user-cross.docker  | 21 ---
>  2 files changed, 30 deletions(-)
>  delete mode 100644 tests/docker/dockerfiles/debian-powerpc-user-cross.docker

Reviewed-by: Richard Henderson 


r~




[PULL 16/16] cputlb: Pass retaddr to tb_check_watchpoint

2019-09-25 Thread Richard Henderson
Fixes the previous TLB_WATCHPOINT patches because we are currently
failing to set cpu->mem_io_pc with the call to cpu_check_watchpoint.
Pass down the retaddr directly because it's readily available.

Fixes: 50b107c5d61
Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.h | 2 +-
 accel/tcg/translate-all.c | 6 +++---
 exec.c| 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translate-all.h b/accel/tcg/translate-all.h
index 135c1ea96a..a557b4e2bb 100644
--- a/accel/tcg/translate-all.h
+++ b/accel/tcg/translate-all.h
@@ -30,7 +30,7 @@ void tb_invalidate_phys_page_fast(struct page_collection *pages,
   tb_page_addr_t start, int len,
   uintptr_t retaddr);
 void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end);
-void tb_check_watchpoint(CPUState *cpu);
+void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr);
 
 #ifdef CONFIG_USER_ONLY
 int page_unprotect(target_ulong address, uintptr_t pc);
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index db77fb221b..66d4bc4341 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -2142,16 +2142,16 @@ static bool tb_invalidate_phys_page(tb_page_addr_t addr, uintptr_t pc)
 #endif
 
 /* user-mode: call with mmap_lock held */
-void tb_check_watchpoint(CPUState *cpu)
+void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr)
 {
 TranslationBlock *tb;
 
 assert_memory_lock();
 
-tb = tcg_tb_lookup(cpu->mem_io_pc);
+tb = tcg_tb_lookup(retaddr);
 if (tb) {
 /* We can use retranslation to find the PC.  */
-cpu_restore_state_from_tb(cpu, tb, cpu->mem_io_pc, true);
+cpu_restore_state_from_tb(cpu, tb, retaddr, true);
 tb_phys_invalidate(tb, -1);
 } else {
 /* The exception probably happened in a helper.  The CPU state should
diff --git a/exec.c b/exec.c
index b3df826039..8a0a6613b1 100644
--- a/exec.c
+++ b/exec.c
@@ -2758,7 +2758,7 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
 cpu->watchpoint_hit = wp;
 
 mmap_lock();
-tb_check_watchpoint(cpu);
+tb_check_watchpoint(cpu, ra);
 if (wp->flags & BP_STOP_BEFORE_ACCESS) {
 cpu->exception_index = EXCP_DEBUG;
 mmap_unlock();
-- 
2.17.1




Re: [PATCH v4 00/16] Move rom and notdirty handling to cputlb

2019-09-25 Thread Mark Cave-Ayland
On 25/09/2019 19:52, Mark Cave-Ayland wrote:

> On 23/09/2019 23:59, Richard Henderson wrote:
> 
>> Changes since v3:
>>   * Don't accidentally include the TARGET_PAGE_BITS_VARY patch set.  ;-)
>>   * Remove __has_attribute(__always_inline__).
>>   * Use single load/store_memop function instead of separate small wrappers.
>>   * Introduce optimize_away to assert the code folds away as expected.
>>
>> Patches without review:
>>
>> 0003-qemu-compiler.h-Add-optimize_away.patch
>> 0004-cputlb-Use-optimize_away-in-load-store_helpers.patch
>> 0005-cputlb-Split-out-load-store_memop.patch
>> 0010-cputlb-Partially-inline-memory_region_section_get.patch
>> 0011-cputlb-Merge-and-move-memory_notdirty_write_-prep.patch
>> 0012-cputlb-Handle-TLB_NOTDIRTY-in-probe_access.patch
>>
>>
>> r~
>>
>>
>> Richard Henderson (16):
>>   exec: Use TARGET_PAGE_BITS_MIN for TLB flags
>>   cputlb: Disable __always_inline__ without optimization
>>   qemu/compiler.h: Add optimize_away
>>   cputlb: Use optimize_away in load/store_helpers
>>   cputlb: Split out load/store_memop
>>   cputlb: Introduce TLB_BSWAP
>>   exec: Adjust notdirty tracing
>>   cputlb: Move ROM handling from I/O path to TLB path
>>   cputlb: Move NOTDIRTY handling from I/O path to TLB path
>>   cputlb: Partially inline memory_region_section_get_iotlb
>>   cputlb: Merge and move memory_notdirty_write_{prepare,complete}
>>   cputlb: Handle TLB_NOTDIRTY in probe_access
>>   cputlb: Remove cpu->mem_io_vaddr
>>   cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access
>>   cputlb: Pass retaddr to tb_invalidate_phys_page_fast
>>   cputlb: Pass retaddr to tb_check_watchpoint
>>
>>  accel/tcg/translate-all.h  |   8 +-
>>  include/exec/cpu-all.h |  23 ++-
>>  include/exec/cpu-common.h  |   3 -
>>  include/exec/exec-all.h|   6 +-
>>  include/exec/memory-internal.h |  65 ---
>>  include/hw/core/cpu.h  |   2 -
>>  include/qemu/compiler.h|  26 +++
>>  accel/tcg/cputlb.c | 340 +++--
>>  accel/tcg/translate-all.c  |  51 +++--
>>  exec.c | 158 +--
>>  hw/core/cpu.c  |   1 -
>>  memory.c   |  20 --
>>  trace-events   |   4 +-
>>  13 files changed, 279 insertions(+), 428 deletions(-)
> 
> Am I right in thinking that this is now the latest version of the patchset 
> which
> fixes up the byte swaps in RAM?
> 
> I'm not sure that I can offer much in the way of review, however is there any 
> testing
> I can do to help out here?

Ha okay, I've just seen the TCG PR appear in my inbox so I'll assume that
everyone is happy and everything is working as intended :)


ATB,

Mark.



Re: [PATCH v3 18/33] tests/tcg: re-enable linux-test for ppc64abi32

2019-09-25 Thread Richard Henderson
On 9/24/19 2:00 PM, Alex Bennée wrote:
> Now we have fixed the signal delivary bug we can remove this horrible
> hack from the system.
> 
> Cc: Richard Henderson 
> Signed-off-by: Alex Bennée 
> 
> ---
> v2
>   - drop un-needed cflags
> ---
>  tests/tcg/multiarch/Makefile.target | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)

Reviewed-by: Richard Henderson 


r~



[PULL 15/16] cputlb: Pass retaddr to tb_invalidate_phys_page_fast

2019-09-25 Thread Richard Henderson
Rather than rely on cpu->mem_io_pc, pass retaddr down directly.

Within tb_invalidate_phys_page_range__locked, the is_cpu_write_access
parameter is non-zero exactly when retaddr would be non-zero, so that
is a simple replacement.

Recognize that current_tb_not_found is true only when mem_io_pc
(and now retaddr) are also non-zero, so remove a redundant test.

Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.h |  3 ++-
 accel/tcg/cputlb.c|  6 +-
 accel/tcg/translate-all.c | 39 +++
 3 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/accel/tcg/translate-all.h b/accel/tcg/translate-all.h
index 31f2117188..135c1ea96a 100644
--- a/accel/tcg/translate-all.h
+++ b/accel/tcg/translate-all.h
@@ -27,7 +27,8 @@ struct page_collection *page_collection_lock(tb_page_addr_t start,
  tb_page_addr_t end);
 void page_collection_unlock(struct page_collection *set);
 void tb_invalidate_phys_page_fast(struct page_collection *pages,
-  tb_page_addr_t start, int len);
+  tb_page_addr_t start, int len,
+  uintptr_t retaddr);
 void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end);
 void tb_check_watchpoint(CPUState *cpu);
 
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 4b24811ce7..defc8d5929 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1094,11 +1094,7 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
 if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
 struct page_collection *pages
 = page_collection_lock(ram_addr, ram_addr + size);
-
-/* We require mem_io_pc in tb_invalidate_phys_page_range.  */
-cpu->mem_io_pc = retaddr;
-
-tb_invalidate_phys_page_fast(pages, ram_addr, size);
+tb_invalidate_phys_page_fast(pages, ram_addr, size, retaddr);
 page_collection_unlock(pages);
 }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index de4b697163..db77fb221b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1889,7 +1889,7 @@ static void
 tb_invalidate_phys_page_range__locked(struct page_collection *pages,
   PageDesc *p, tb_page_addr_t start,
   tb_page_addr_t end,
-  int is_cpu_write_access)
+  uintptr_t retaddr)
 {
 TranslationBlock *tb;
 tb_page_addr_t tb_start, tb_end;
@@ -1897,9 +1897,9 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 #ifdef TARGET_HAS_PRECISE_SMC
 CPUState *cpu = current_cpu;
 CPUArchState *env = NULL;
-int current_tb_not_found = is_cpu_write_access;
+bool current_tb_not_found = retaddr != 0;
+bool current_tb_modified = false;
 TranslationBlock *current_tb = NULL;
-int current_tb_modified = 0;
 target_ulong current_pc = 0;
 target_ulong current_cs_base = 0;
 uint32_t current_flags = 0;
@@ -1931,24 +1931,21 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 if (!(tb_end <= start || tb_start >= end)) {
 #ifdef TARGET_HAS_PRECISE_SMC
 if (current_tb_not_found) {
-current_tb_not_found = 0;
-current_tb = NULL;
-if (cpu->mem_io_pc) {
-/* now we have a real cpu fault */
-current_tb = tcg_tb_lookup(cpu->mem_io_pc);
-}
+current_tb_not_found = false;
+/* now we have a real cpu fault */
+current_tb = tcg_tb_lookup(retaddr);
 }
 if (current_tb == tb &&
 (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
-/* If we are modifying the current TB, we must stop
-its execution. We could be more precise by checking
-that the modification is after the current PC, but it
-would require a specialized function to partially
-restore the CPU state */
-
-current_tb_modified = 1;
-cpu_restore_state_from_tb(cpu, current_tb,
-  cpu->mem_io_pc, true);
+/*
+ * If we are modifying the current TB, we must stop
+ * its execution. We could be more precise by checking
+ * that the modification is after the current PC, but it
+ * would require a specialized function to partially
+ * restore the CPU state.
+ */
+current_tb_modified = true;
+cpu_restore_state_from_tb(cpu, current_tb, retaddr, true);
 cpu_get_tb_cpu_state(env, &current_pc, 

Re: [PATCH v3 15/33] tests/docker: reduce scary warnings by cleaning up clean up

2019-09-25 Thread Richard Henderson
On 9/24/19 2:00 PM, Alex Bennée wrote:
> There was in the clean-up code caused by attempting to inspect images
> which finished before we got there. Clean up the clean up code by:
> 
>   - only track the one instance at a time
>   - use --filter for docker ps instead of doing it by hand
>   - just call docker rm -f to be done with it
>   - use uuid.uuid4() for a random uid
> 
> Signed-off-by: Alex Bennée 
> 
> ---
> v2
>   - drop the try/except approach and be smarter
>   - use uuid4 as uuid1 can generate clashes in parallel builds
> 
> fixup! tests/docker: reduce scary warnings by cleaning up clean up
> ---
>  tests/docker/docker.py | 34 --
>  1 file changed, 16 insertions(+), 18 deletions(-)

Reviewed-by: Richard Henderson 


r~




[PULL 14/16] cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access

2019-09-25 Thread Richard Henderson
All callers pass false to this argument.  Remove it and pass the
constant on to tb_invalidate_phys_page_range__locked.

Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.h | 3 +--
 accel/tcg/translate-all.c | 6 ++
 exec.c| 4 ++--
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/translate-all.h b/accel/tcg/translate-all.h
index 64f5fd9a05..31f2117188 100644
--- a/accel/tcg/translate-all.h
+++ b/accel/tcg/translate-all.h
@@ -28,8 +28,7 @@ struct page_collection *page_collection_lock(tb_page_addr_t start,
 void page_collection_unlock(struct page_collection *set);
 void tb_invalidate_phys_page_fast(struct page_collection *pages,
   tb_page_addr_t start, int len);
-void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
-   int is_cpu_write_access);
+void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end);
 void tb_check_watchpoint(CPUState *cpu);
 
 #ifdef CONFIG_USER_ONLY
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 5d1e08b169..de4b697163 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1983,8 +1983,7 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
  *
  * Called with mmap_lock held for user-mode emulation
  */
-void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
-   int is_cpu_write_access)
+void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end)
 {
 struct page_collection *pages;
 PageDesc *p;
@@ -1996,8 +1995,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 return;
 }
 pages = page_collection_lock(start, end);
-tb_invalidate_phys_page_range__locked(pages, p, start, end,
-  is_cpu_write_access);
+tb_invalidate_phys_page_range__locked(pages, p, start, end, 0);
 page_collection_unlock(pages);
 }
 
diff --git a/exec.c b/exec.c
index 7d835b1a2b..b3df826039 100644
--- a/exec.c
+++ b/exec.c
@@ -1012,7 +1012,7 @@ const char *parse_cpu_option(const char *cpu_option)
 void tb_invalidate_phys_addr(target_ulong addr)
 {
 mmap_lock();
-tb_invalidate_phys_page_range(addr, addr + 1, 0);
+tb_invalidate_phys_page_range(addr, addr + 1);
 mmap_unlock();
 }
 
@@ -1039,7 +1039,7 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
 return;
 }
 ram_addr = memory_region_get_ram_addr(mr) + addr;
-tb_invalidate_phys_page_range(ram_addr, ram_addr + 1, 0);
+tb_invalidate_phys_page_range(ram_addr, ram_addr + 1);
 rcu_read_unlock();
 }
 
-- 
2.17.1




[PULL 12/16] cputlb: Handle TLB_NOTDIRTY in probe_access

2019-09-25 Thread Richard Henderson
We can use notdirty_write for the write and return a valid host
pointer for this case.

Reviewed-by: David Hildenbrand 
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3e91838519..b56e9ddf8c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1168,16 +1168,24 @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
 return NULL;
 }
 
-/* Handle watchpoints.  */
-if (tlb_addr & TLB_WATCHPOINT) {
-cpu_check_watchpoint(env_cpu(env), addr, size,
- env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
- wp_access, retaddr);
-}
+if (unlikely(tlb_addr & TLB_FLAGS_MASK)) {
+CPUIOTLBEntry *iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
 
-/* Reject I/O access, or other required slow-path.  */
-if (tlb_addr & (TLB_NOTDIRTY | TLB_MMIO | TLB_BSWAP | TLB_DISCARD_WRITE)) {
-return NULL;
+/* Reject I/O access, or other required slow-path.  */
+if (tlb_addr & (TLB_MMIO | TLB_BSWAP | TLB_DISCARD_WRITE)) {
+return NULL;
+}
+
+/* Handle watchpoints.  */
+if (tlb_addr & TLB_WATCHPOINT) {
+cpu_check_watchpoint(env_cpu(env), addr, size,
+ iotlbentry->attrs, wp_access, retaddr);
+}
+
+/* Handle clean RAM pages.  */
+if (tlb_addr & TLB_NOTDIRTY) {
+notdirty_write(env_cpu(env), addr, size, iotlbentry, retaddr);
+}
 }
 
 return (void *)((uintptr_t)addr + entry->addend);
-- 
2.17.1
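
As a rough illustration of the reordered flag handling in probe_access
above, here is a minimal standalone sketch. It is not QEMU code: the TOY_*
flag values, the ToyPage type and toy_probe_write() are hypothetical
stand-ins, and the real notdirty_write() does more than set a flag. The only
point is the order of the checks: reject the slow-path cases first, then
handle watchpoints, then mark a clean RAM page dirty while still returning a
host pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not QEMU's real TLB_* values. */
enum {
    TOY_TLB_NOTDIRTY      = 1u << 0,
    TOY_TLB_MMIO          = 1u << 1,
    TOY_TLB_WATCHPOINT    = 1u << 2,
    TOY_TLB_BSWAP         = 1u << 3,
    TOY_TLB_DISCARD_WRITE = 1u << 4,
    TOY_TLB_FLAGS_MASK    = (1u << 5) - 1,
};

/* Toy stand-in for a RAM page plus the bookkeeping we care about. */
typedef struct ToyPage {
    uint8_t data[64];
    bool dirty;          /* set once the page has been written */
    unsigned watch_hits; /* counts simulated watchpoint checks */
} ToyPage;

/*
 * Return a host pointer for a write at 'offset', or NULL when the
 * access must take a slow path.
 */
static void *toy_probe_write(ToyPage *page, unsigned flags, size_t offset)
{
    assert(offset < sizeof(page->data));

    if (flags & TOY_TLB_FLAGS_MASK) {
        /* Reject I/O access, or other required slow-path. */
        if (flags & (TOY_TLB_MMIO | TOY_TLB_BSWAP | TOY_TLB_DISCARD_WRITE)) {
            return NULL;
        }
        /* Handle watchpoints. */
        if (flags & TOY_TLB_WATCHPOINT) {
            page->watch_hits++;
        }
        /* Handle clean RAM pages: record the write, keep the fast path. */
        if (flags & TOY_TLB_NOTDIRTY) {
            page->dirty = true;
        }
    }
    return &page->data[offset];
}
```

With this model, a probe with only TOY_TLB_NOTDIRTY set returns a usable
pointer and leaves the page marked dirty, whereas before the patch the
equivalent case returned NULL.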




Re: [PATCH v4 00/16] Move rom and notdirty handling to cputlb

2019-09-25 Thread Mark Cave-Ayland
On 23/09/2019 23:59, Richard Henderson wrote:

> Changes since v3:
>   * Don't accidentally include the TARGET_PAGE_BITS_VARY patch set.  ;-)
>   * Remove __has_attribute(__always_inline__).
>   * Use single load/store_memop function instead of separate small wrappers.
>   * Introduce optimize_away to assert the code folds away as expected.
> 
> Patches without review:
> 
> 0003-qemu-compiler.h-Add-optimize_away.patch
> 0004-cputlb-Use-optimize_away-in-load-store_helpers.patch
> 0005-cputlb-Split-out-load-store_memop.patch
> 0010-cputlb-Partially-inline-memory_region_section_get.patch
> 0011-cputlb-Merge-and-move-memory_notdirty_write_-prep.patch
> 0012-cputlb-Handle-TLB_NOTDIRTY-in-probe_access.patch
> 
> 
> r~
> 
> 
> Richard Henderson (16):
>   exec: Use TARGET_PAGE_BITS_MIN for TLB flags
>   cputlb: Disable __always_inline__ without optimization
>   qemu/compiler.h: Add optimize_away
>   cputlb: Use optimize_away in load/store_helpers
>   cputlb: Split out load/store_memop
>   cputlb: Introduce TLB_BSWAP
>   exec: Adjust notdirty tracing
>   cputlb: Move ROM handling from I/O path to TLB path
>   cputlb: Move NOTDIRTY handling from I/O path to TLB path
>   cputlb: Partially inline memory_region_section_get_iotlb
>   cputlb: Merge and move memory_notdirty_write_{prepare,complete}
>   cputlb: Handle TLB_NOTDIRTY in probe_access
>   cputlb: Remove cpu->mem_io_vaddr
>   cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access
>   cputlb: Pass retaddr to tb_invalidate_phys_page_fast
>   cputlb: Pass retaddr to tb_check_watchpoint
> 
>  accel/tcg/translate-all.h      |   8 +-
>  include/exec/cpu-all.h         |  23 ++-
>  include/exec/cpu-common.h      |   3 -
>  include/exec/exec-all.h        |   6 +-
>  include/exec/memory-internal.h |  65 ---
>  include/hw/core/cpu.h          |   2 -
>  include/qemu/compiler.h        |  26 +++
>  accel/tcg/cputlb.c             | 340 +++--
>  accel/tcg/translate-all.c      |  51 +++--
>  exec.c                         | 158 +--
>  hw/core/cpu.c                  |   1 -
>  memory.c                       |  20 --
>  trace-events                   |   4 +-
>  13 files changed, 279 insertions(+), 428 deletions(-)

Am I right in thinking that this is now the latest version of the patchset which
fixes up the byte swaps in RAM?

I'm not sure that I can offer much in the way of review; however, is there any
testing I can do to help out here?


ATB,

Mark.



[PULL 09/16] cputlb: Move NOTDIRTY handling from I/O path to TLB path

2019-09-25 Thread Richard Henderson
Pages that we want to track for NOTDIRTY are RAM.  We do not
really need to go through the I/O path to handle them.
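
A minimal sketch of the idea, using a toy page model rather than QEMU's
data structures: ToyRamPage, toy_bswap32() and toy_store32() are
hypothetical names, and the "prepare" and "complete" steps are reduced to
flag updates. It only shows that a store to a clean RAM page can be done
directly on host memory, bracketed by the dirty-tracking bookkeeping,
without a trip through an I/O memory region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a guest RAM page and its tracking state. */
typedef struct ToyRamPage {
    uint8_t bytes[16];
    bool dirty;               /* false while the page is still "clean" */
    bool has_translated_code; /* cached translations derived from the page */
} ToyRamPage;

/* Plain 32-bit byte swap. */
static uint32_t toy_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Store 32 bits at 'off', handling the clean-page case inline. */
static void toy_store32(ToyRamPage *page, size_t off, uint32_t val,
                        bool need_swap)
{
    assert(off + sizeof(val) <= sizeof(page->bytes));

    if (!page->dirty) {
        /* "Prepare": translations of this page are about to go stale. */
        page->has_translated_code = false;
    }

    if (need_swap) {
        val = toy_bswap32(val);
    }
    /* The store itself goes straight to host memory. */
    memcpy(&page->bytes[off], &val, sizeof(val));

    /* "Complete": later stores can skip the bookkeeping above. */
    page->dirty = true;
}
```

After the first store the page is dirty, so later stores in this model take
the plain memcpy path with no extra work.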

Acked-by: David Hildenbrand 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-common.h |  2 --
 accel/tcg/cputlb.c        | 26 +---
 exec.c                    | 50 ---
 memory.c                  | 16 -
 4 files changed, 23 insertions(+), 71 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 1c0e03ddc2..81753bbb34 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -100,8 +100,6 @@ void qemu_flush_coalesced_mmio_buffer(void);
 
 void cpu_flush_icache_range(hwaddr start, hwaddr len);
 
-extern struct MemoryRegion io_mem_notdirty;
-
 typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque);
 
 int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 404ec57a4e..7e9a0f7ac8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -905,7 +905,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
 mr = section->mr;
 mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
 cpu->mem_io_pc = retaddr;
-if (mr != &io_mem_notdirty && !cpu->can_do_io) {
+if (!cpu->can_do_io) {
 cpu_io_recompile(cpu, retaddr);
 }
 
@@ -946,7 +946,7 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
 section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
 mr = section->mr;
 mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
-if (mr != &io_mem_notdirty && !cpu->can_do_io) {
+if (!cpu->can_do_io) {
 cpu_io_recompile(cpu, retaddr);
 }
 cpu->mem_io_vaddr = addr;
@@ -1612,7 +1612,7 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
 
 /* Handle I/O access.  */
-if (likely(tlb_addr & (TLB_MMIO | TLB_NOTDIRTY))) {
+if (tlb_addr & TLB_MMIO) {
 io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr,
   op ^ (need_swap * MO_BSWAP));
 return;
@@ -1625,6 +1625,26 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 
 haddr = (void *)((uintptr_t)addr + entry->addend);
 
+/* Handle clean RAM pages.  */
+if (tlb_addr & TLB_NOTDIRTY) {
+NotDirtyInfo ndi;
+
+/* We require mem_io_pc in tb_invalidate_phys_page_range.  */
+env_cpu(env)->mem_io_pc = retaddr;
+
+memory_notdirty_write_prepare(&ndi, env_cpu(env), addr,
+  addr + iotlbentry->addr, size);
+
+if (unlikely(need_swap)) {
+store_memop(haddr, val, op ^ MO_BSWAP);
+} else {
+store_memop(haddr, val, op);
+}
+
+memory_notdirty_write_complete(&ndi);
+return;
+}
+
 /*
  * Keep these two store_memop separate to ensure that the compiler
  * is able to fold the entire function to a single instruction.
diff --git a/exec.c b/exec.c
index ea8c0b18ac..dc7001f115 100644
--- a/exec.c
+++ b/exec.c
@@ -88,7 +88,6 @@ static MemoryRegion *system_io;
 AddressSpace address_space_io;
 AddressSpace address_space_memory;
 
-MemoryRegion io_mem_notdirty;
 static MemoryRegion io_mem_unassigned;
 #endif
 
@@ -191,7 +190,6 @@ typedef struct subpage_t {
 } subpage_t;
 
 #define PHYS_SECTION_UNASSIGNED 0
-#define PHYS_SECTION_NOTDIRTY 1
 
 static void io_mem_init(void);
 static void memory_map_init(void);
@@ -1472,9 +1470,6 @@ hwaddr memory_region_section_get_iotlb(CPUState *cpu,
 if (memory_region_is_ram(section->mr)) {
 /* Normal RAM.  */
 iotlb = memory_region_get_ram_addr(section->mr) + xlat;
-if (!section->readonly) {
-iotlb |= PHYS_SECTION_NOTDIRTY;
-}
 } else {
 AddressSpaceDispatch *d;
 
@@ -2783,42 +2778,6 @@ void memory_notdirty_write_complete(NotDirtyInfo *ndi)
 }
 }
 
-/* Called within RCU critical section.  */
-static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
-   uint64_t val, unsigned size)
-{
-NotDirtyInfo ndi;
-
-memory_notdirty_write_prepare(&ndi, current_cpu, current_cpu->mem_io_vaddr,
- ram_addr, size);
-
-stn_p(qemu_map_ram_ptr(NULL, ram_addr), size, val);
-memory_notdirty_write_complete(&ndi);
-}
-
-static bool notdirty_mem_accepts(void *opaque, hwaddr addr,
- unsigned size, bool is_write,
- MemTxAttrs attrs)
-{
-return is_write;
-}
-
-static const MemoryRegionOps notdirty_mem_ops = {
-.write = notdirty_mem_write,
-.valid.accepts = notdirty_mem_accepts,
-.endianness = DEVICE_NATIVE_ENDIAN,
-.valid = {
-
