date:20171003

Re: [Qemu-devel] [PATCH v8 0/6] Initial support for keycodemapdb GIT submodule

2017-10-03 Thread Fam Zheng

On Mon, 10/02 11:52, Programmingkid wrote:
> 
> > On Oct 2, 2017, at 6:11 AM, qemu-devel-requ...@gnu.org wrote:
> > 
> > Message: 21
> > Date: Mon, 2 Oct 2017 10:56:27 +0100
> > From: "Daniel P. Berrange" 
> > To: qemu-devel@nongnu.org
> > Cc: Fam Zheng , Gerd Hoffmann ,
> > Peter Maydell ,   Paolo Bonzini
> > 
> > Subject: Re: [Qemu-devel] [PATCH v8 0/6] Initial support for
> > keycodemapdb GIT submodule
> > Message-ID: <20171002095627.gd27...@redhat.com>
> > Content-Type: text/plain; charset=utf-8
> > 
> > FYI, in case people were wondering, patchew successfully passed all
> > tests. So it looks like 8th time lucky for getting submodules working
> > correctly unless someone can break it again.
> > 
> >  http://patchew.org/QEMU/20170929101201.21039-1-berra...@redhat.com/
> 
> Using the above link I was able to copy and paste your patches to my 
> computer. The problem was 'git am' kept saying "Patch format detection 
> failed". Using 'git apply' was what I used instead. After running 'make' I 
> saw this error: 

Since you've noticed the link: at the top of that page there is also an
easy way to get the patches with a single command:

> git fetch https://github.com/patchew-project/qemu 
> patchew/20170929101201.21039-1-berra...@redhat.com

Fam

Re: [Qemu-devel] [PATCH v3 2/2] qemu-options: Deprecate -nodefconfig

2017-10-03 Thread Markus Armbruster

Eduardo Habkost  writes:

> Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
> have no default config files that would be disabled using
> -nodefconfig.  Update documentation and document -nodefconfig as
> deprecated.
>
> Cc: Markus Armbruster 
> Acked-by: Alistair Francis 
> Signed-off-by: Eduardo Habkost 

Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [PATCH v4 2/2] vl: Deprecate auto-loading of qemu.conf

2017-10-03 Thread Markus Armbruster

Eduardo Habkost  writes:

> In case there were options set in the default config file, print
> a warning so users can update their scripts.
>
> If somebody wants to keep the config file as-is, avoid the
> warning and use a command-line that will work in future QEMU
> versions, they can use:
>
>  $QEMU -no-user-config -readconfig /etc/qemu/qemu.conf
>
> I was going to include the suggestion in the warning message, but
> I thought it could make it more confusing.  The suggestion is
> documented in qemu-doc.texi.
>
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v3 -> v4:
> * Use warn_report() instead of error_report("warning: ...")
>   (Eric Blake)
> * Document as a deprecated feature in qemu-doc.texi
> * Update subject line
>   (was: "vl: Print warning when a default config file is loaded")
>
> Changes v2 -> v3:
> * Rebase (no code changes)
> * Commit message update: suggest -no-user-config
> ---
>  vl.c  | 6 ++
>  qemu-doc.texi | 8 
>  2 files changed, 14 insertions(+)
>
> diff --git a/vl.c b/vl.c
> index 3fed457921..1b0ecdf74e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -3066,6 +3066,12 @@ static int qemu_read_default_config_file(void)
>  return ret;
>  }
>  
> +if (ret > 0) {
> +loc_set_none();

Sure we need this here?

> +warn_report("Future QEMU versions won't load %s automatically",
> + CONFIG_QEMU_CONFDIR "/qemu.conf");
> +}
> +
>  return 0;
>  }
>  
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index ecd186a159..a81a09d05c 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -2370,6 +2370,14 @@ they were first deprecated in the 2.10.0 release.
>  What follows is a list of all features currently marked as
>  deprecated.
>  
> +@section Automatic loading of @file{qemu.conf} (since 2.11.0)
> +
> +The automatic loading of an user-provided @file{qemu.conf} file from the QEMU
> +config directory is deprecated and behavior will change in future QEMU versions.
> +To load an user-provided @file{qemu.conf} file and keep compatibility with
> +future versions, the arguments @samp{-no-user-config -readconfig
> +@var{CONFDIR}/qemu.conf} may be used.
> +
>  @section System emulator command line arguments
>  
>  @subsection -drive boot=on|off (since 1.3.0)

Re: [Qemu-devel] [PATCH RFC 0/6] xen: xen-domid-restrict improvements

2017-10-03 Thread Fam Zheng

On Tue, 10/03 18:24, Ian Jackson wrote:
> no-re...@patchew.org writes ("Re: [Qemu-devel] [PATCH RFC 0/6] xen: 
> xen-domid-restrict improvements"):
> > This series seems to have some coding style problems. See output below for
> > more information:
> 
> Thanks for this automatic mail.  I have sorted out most of these.
> However:
> 
> > ERROR: consider using qemu_strtoul in preference to strtoul
> > #41: FILE: os-posix.c:159:
> > +lv = strtoul(optarg, , 0);
> 
> In one of these two cases, it is not possible to use qemu_strtoul
> because the expected terminator is '.'.  I have added a comment about
> this.

Thanks for taking a look at the report and helping explain. Yes, the error is a
false positive; let's ignore it.

Fam
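
For readers following along, here is a minimal sketch of the situation Ian describes: a number that is followed by '.' rather than by the end of the string, parsed with plain strtoul() and an end pointer. The helper name parse_dotted_pair() and the "<a>.<b>" input shape are assumptions for illustration; the actual os-posix.c code is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse "<a>.<b>" into two
 * unsigned longs.  Returns 0 on success, -EINVAL on malformed input.
 */
int parse_dotted_pair(const char *s, unsigned long *a, unsigned long *b)
{
    char *end;

    errno = 0;
    *a = strtoul(s, &end, 0);
    /* The first number must be non-empty and terminated by '.' */
    if (errno || end == s || *end != '.') {
        return -EINVAL;
    }

    s = end + 1;
    errno = 0;
    *b = strtoul(s, &end, 0);
    /* The second number must be non-empty and consume the rest */
    if (errno || end == s || *end != '\0') {
        return -EINVAL;
    }
    return 0;
}
```

The point of the sketch is only that the caller has to inspect the end pointer itself, because the first number is expected to stop at '.' and not at the end of the string.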

Re: [Qemu-devel] [PATCH v2 0/5] block: Avoid copy-on-read assertions

2017-10-03 Thread Fam Zheng

On Tue, 10/03 21:22, Eric Blake wrote:
> On 10/03/2017 09:16 PM, no-re...@patchew.org wrote:
> > Hi,
> > 
> > This series failed automatic build test. Please find the testing commands 
> > and
> > their output below. If you have docker installed, you can probably 
> > reproduce it
> > locally.
> > 
> 
> > 195 [not run] not suitable for this image format: raw
> > 197 - output mismatch (see 197.out.bad)
> > --- /tmp/qemu-test/src/tests/qemu-iotests/197.out   2017-10-04 
> > 01:52:59.0 +
> > +++ /tmp/qemu-test/build/tests/qemu-iotests/197.out.bad 2017-10-04 
> > 02:15:52.212004491 +
> > @@ -12,13 +12,18 @@
> >  128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >  read 0/0 bytes at offset 0
> >  0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > -read 2147483136/2147483136 bytes at offset 1024
> > -2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > +
> > +(process:16284): GLib-ERROR **: gmem.c:100: failed to allocate 2147483136 
> > bytes
> 
> Okay, a test that requires a nearly-2G read in one operation is fringe,
> and I can see it choking 32-bit platforms rather easily.  How do we
> modify the test to not be so mean to memory-starved systems?  And why
> didn't patchew complain about this on v1, which had the same ~2G read?

I don't know. The whole system (a Fedora VM) is dedicated to patchew testing, and no
concurrent task should be running. Maybe 2G is just in between the memory
watermarks.

Fam

Re: [Qemu-devel] [PATCHv2] dma/i82374: avoid double creation of i82374 device

2017-10-03 Thread Markus Armbruster

Eduardo Habkost  writes:

> On Mon, Oct 02, 2017 at 02:50:07PM +0200, Paolo Bonzini wrote:
>> On 29/09/2017 21:31, Eduardo Habkost wrote:
>> >> -void DMA_init(ISABus *bus, int high_page_enable)
>> >> +void DMA_init(ISABus *bus, int high_page_enable, Error **errp)
>> > 
>> > If you make the function return a boolean to indicate success (in
>> > addition to setting *errp), you avoid the need for a local_err
>> > variable on the caller.
>> 
>> I think in this case, rather than a bool, it would be better to return 0
>> or -EBUSY.  A check for "< 0" would be more self-explanatory in the caller.
>
> I'm OK with that, too.
>
> We really need to document the available and preferred error
> reporting styles somewhere (probably on qapi/error.h).  We
> discussed that a lot recently[1], but the conclusions were not
> documented anywhere.
>
> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg461702.html

Yes, we need to document it.  We also need to convert existing code.

In my experience, documentation is a great time saver when people ask
questions.  It's less successful at getting people to do the right thing.
For that, you have to make good examples common and bad examples
sufficiently rare.
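
As an illustration of the style discussed above (return 0 or a negative errno in addition to setting *errp, so the caller can test "< 0" without a local_err variable), here is a minimal self-contained sketch. The Error struct is a stand-in, not QEMU's real type from qapi/error.h, and dma_init_sketch() and board_init_sketch() are hypothetical names, not the functions from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for QEMU's Error object; the real one lives in qapi/error.h */
typedef struct Error {
    const char *msg;
} Error;

static Error busy_error = { "DMA controller already initialized" };
static int dma_initialized;

/*
 * Hypothetical init function: returns 0 on success or -EBUSY on
 * failure, and also sets *errp on failure.
 */
int dma_init_sketch(Error **errp)
{
    if (dma_initialized) {
        if (errp) {
            *errp = &busy_error;
        }
        return -EBUSY;
    }
    dma_initialized = 1;
    return 0;
}

/* Caller: checks "< 0" and passes errp straight through */
int board_init_sketch(Error **errp)
{
    if (dma_init_sketch(errp) < 0) {
        return -1;
    }
    return 0;
}
```

With a void function the caller would instead need its own local error variable and an explicit propagation step, which is the boilerplate the return value avoids.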

[Qemu-devel] [Bug 1352179] Re: could not open disk image

2017-10-03 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1352179

Title:
  could not open disk image

Status in QEMU:
  Expired

Bug description:
  After restarting the server, it shows this error:

  Error starting domain: internal error process exited while connecting to monitor: char device redirected to /dev/pts/1
  qemu-kvm: -drive file=/var/lib/nova/instances/b4535ce9-54b5-4581-a906-16b83bf1ba2f/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /var/lib/nova/instances/b4535ce9-54b5-4581-a906-16b83bf1ba2f/disk: No such file or directory

  The disk info shows:
   qemu-img info disk
  image: disk
  file format: qcow2
  virtual size: 100G (107374182400 bytes)
  disk size: 22G
  cluster_size: 65536
  backing file: /var/lib/nova/instances/_base/b4535ce9-54b5-4581-a906-16b83bf1ba2f

  But this file (backing file: /var/lib/nova/instances/_base/b4535ce9-54b5-4581-a906-16b83bf1ba2f) is empty,
  and none of the instances can find their disk image.

  We use CentOS release 6.5 (64bit)
  kernel version : 2.6.32-431.11.2.el6.x86_64
  qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64

  virsh version
  Compiled against library: libvirt 0.10.2
  Using library: libvirt 0.10.2
  Using API: QEMU 0.10.2
  Running hypervisor: QEMU 0.12.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1352179/+subscriptions

Re: [Qemu-devel] [PATCH v1 5/5] raspi: : Specify the valid CPUs

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 07:18:44PM -0300, Philippe Mathieu-Daudé wrote:
> On 10/03/2017 06:36 PM, Alistair Francis wrote:
> > On Tue, Oct 3, 2017 at 1:39 PM, Eduardo Habkost  wrote:
> >> On Tue, Oct 03, 2017 at 01:05:18PM -0700, Alistair Francis wrote:
> >>> List all possible valid CPU options.
> >>>
> >>> Signed-off-by: Alistair Francis 
> >>> ---
> >>>
> >>>  hw/arm/raspi.c | 6 ++
> >>>  1 file changed, 6 insertions(+)
> >>>
> >>> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
> >>> index 5941c9f751..555db0f258 100644
> >>> --- a/hw/arm/raspi.c
> >>> +++ b/hw/arm/raspi.c
> >>> @@ -158,6 +158,10 @@ static void raspi2_init(MachineState *machine)
> >>>  setup_boot(machine, 2, machine->ram_size - vcram_size);
> >>>  }
> >>>
> >>> +const char *raspi2_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a7"),
> >>> +NULL
> >>> +  };
> >>> +
> >>>  static void raspi2_machine_init(MachineClass *mc)
> >>>  {
> >>>  mc->desc = "Raspberry Pi 2";
> >>> @@ -169,5 +173,7 @@ static void raspi2_machine_init(MachineClass *mc)
> >>>  mc->max_cpus = BCM2836_NCPUS;
> >>>  mc->default_ram_size = 1024 * 1024 * 1024;
> >>>  mc->ignore_memory_transaction_failures = true;
> >>> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a7");
> >>> +mc->valid_cpu_types = raspi2_valid_cpus;
> >>
> >> I'm confused: bcm2836_init() is hardcoded to cortex-a15, not
> >> cortex-a7.
> > 
> > Odd. I just looked up the Raspberry Pi 2 and it says a Cortex-A7:
> > https://www.raspberrypi.org/products/raspberry-pi-2-model-b/
> 
> The BCM2836 SoC definitively is Cortex-A7.
> 
> git history says the A7 was added after (dcf578ed8cec) the raspi2 board
> (bad5623690b1).

Shouldn't we update TYPE_BCM2836 to use cpu_type instead of
cortex-a15 before applying this patch, then?

> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> > 
> > Thanks,
> > Alistair
> > 
> >>
> >>>  };
> >>>  DEFINE_MACHINE("raspi2", raspi2_machine_init)
> >>> --
> >>> 2.11.0
> >>>
> >>
> >> --
> >> Eduardo

-- 
Eduardo

Re: [Qemu-devel] [PATCH v1 3/5] xlnx-zcu102: Specify the valid CPUs

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 02:41:17PM -0700, Alistair Francis wrote:
> On Tue, Oct 3, 2017 at 1:36 PM, Eduardo Habkost  wrote:
> > On Tue, Oct 03, 2017 at 01:05:13PM -0700, Alistair Francis wrote:
> >> List all possible valid CPU options.
> >>
> >> Signed-off-by: Alistair Francis 
> >> ---
> >>
> >>  hw/arm/xlnx-zcu102.c | 10 ++
> >>  hw/arm/xlnx-zynqmp.c | 16 +---
> >>  include/hw/arm/xlnx-zynqmp.h |  1 +
> >>  3 files changed, 20 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c
> >> index 519a16ed98..039649e522 100644
> >> --- a/hw/arm/xlnx-zcu102.c
> >> +++ b/hw/arm/xlnx-zcu102.c
> >> @@ -98,6 +98,8 @@ static void xlnx_zynqmp_init(XlnxZCU102 *s, MachineState *machine)
> >>  object_property_add_child(OBJECT(machine), "soc", OBJECT(&s->soc),
> >>  &error_abort);
> >>
> >> +object_property_set_str(OBJECT(&s->soc), machine->cpu_type, "cpu-type",
> >> +&error_fatal);
> >
> > Do you have plans to support other CPU types to xlnx_zynqmp in
> > the future?  If not, I wouldn't bother adding the cpu-type
> > property and the extra boilerplate code if it's always going to
> > be set to cortex-a53.
> 
> No, it'll always be A53.
> 
> I did think of that, but I also wanted to use the new option! I also
> think there is an advantage in sanely handling users '-cpu' option,
> before now we just ignored it, so I think it still does give a
> benefit. That'll be especially important on the Xilinx tree (sometimes
> people use our machines with a different CPU to 'benchmark' or test
> other CPUs with our CoSimulation setup). So I think it does make sense
> to keep in.

I see.

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [Qemu-devel] [Qemu-ppc] [PATCH v5 1/6] ppc: spapr: Register and handle HCALL to receive updated RTAS region

2017-10-03 Thread Alexey Kardashevskiy

On 03/10/17 20:12, Alexey Kardashevskiy wrote:
> On 03/10/17 17:07, David Gibson wrote:
>> On Mon, Oct 02, 2017 at 02:02:19PM +1100, Alexey Kardashevskiy wrote:
>>> On 29/09/17 21:52, Nikunj A Dadhania wrote:
>>>> David Gibson writes:
>>>>
>>>>> On Thu, Sep 28, 2017 at 04:07:38PM +0530, Aravinda Prasad wrote:
>>>>>> Receive updates from SLOF about the updated rtas-base.
>>>>>> A separate patch for SLOF [1] (commit f9a60de3) adds
>>>>>> functionality to invoke a private HCALL whenever OS
>>>>>> issues instantiate-rtas with a new rtas-base.
>>>>>>
>>>>>> This is required as QEMU needs to know the updated rtas-base
>>>>>> as it allocates error reporting structure in RTAS space upon
>>>>>> a machine check exception.
>>>>>>
>>>>>> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120386.html
>>>>>>
>>>>>> Signed-off-by: Aravinda Prasad 
>>>>>> Reviewed-by: David Gibson 
>>>>>
>>>>> So I acked this earlier, but I've now realized there might be some
>>>>> connection between this and discussions taking place elsewhere about
>>>>> qemu not knowing what SLOF does with the device tree.
>>>>>
>>>>> At what point will SLOF call the UPDATE_RTAS hcall?  I'm guessing at
>>>>> the time of instantiate-rtas, is that right?
>>>>
>>>> The call happens from
>>>> arch/powerpc/kernel/prom_init.c:prom_instantiate_rtas() and after that
>>>> linux kernel makes two entries in the DT
>>>>
>>>>
>>>>     if (call_prom_ret("call-method", 3, 2, &entry,
>>>>                       ADDR("instantiate-rtas"),
>>>>                       rtas_inst, base) != 0
>>>>         || entry == 0) {
>>>>             prom_printf(" failed\n");
>>>>             return;
>>>>     }
>>>>     prom_printf(" done\n");
>>>>
>>>>     reserve_mem(base, size);
>>>>
>>>>     val = cpu_to_be32(base);
>>>>     prom_setprop(rtas_node, "/rtas", "linux,rtas-base",
>>>>                  &val, sizeof(val));
>>>>     val = cpu_to_be32(entry);
>>>>     prom_setprop(rtas_node, "/rtas", "linux,rtas-entry",
>>>>                  &val, sizeof(val));
>>>>
>>>>
>>>> Quiesce is called after this.
>>>>
>>>>> Does SLOF put the RTAS blob address in its internal device tree, or
>>>>> does it only pass it to the guest via the return parameters from
>>>>> instantiate-rtas?
>>>>
>>>> Entry was made to the DT by linux kernel prom_init code, will this be
>>>> visible to QEMU?
>>>
>>> With my recent SLOF FDT patch - yes:
>>>
>>> aik@fstn1-p1:~$ grep rtas dbg.dts
>>> rtas {
>>> linux,rtas-entry = <0x2fff>;
>>> linux,rtas-base = <0x2fff>;
>>> [...]
>>
>> Ah.. except.. isn't that relying on the kernel putting the RTAS
>> address into the device tree before it calls quiesce and kills SLOF?
>>
>> The SLOF image is bundled in with qemu, so it's ok for us to rely on
>> its behaviour up to a point.  It's not really ok for us to rely on the
>> kernel's behaviour here, unless that behaviour is mandated by PAPR,
>> which this isn't.
> 
> Fair point.
> 
>> So, I think we either need to have *SLOF* update the device tree with
>> that address at instantiate-rtas time,
> 
> I can do that, in a separate patch.


One comment though - if I create the properties in SLOF, I have to name
them differently, like rtas-entry/rtas-base or slof,rtas-entry/slof,rtas-base,
to avoid colliding with the ones created by the guest kernel.

So what do I name them? And do we need 2 copies of the same thing, do we
ever expect rtas-entry!=rtas-base? The guest can potentially get them
different (under powervm) but not with SLOF.


> 
>> or we'll need to resurrect
>> Aravinda's original UPDATE_RTAS hcall.



-- 
Alexey




[Qemu-devel] [PATCH v3 1/2] vl: Eliminate defconfig variable

2017-10-03 Thread Eduardo Habkost

Both -nodefconfig and -no-user-config options do the same thing
today, we only need one variable to keep track of them.

Suggested-by: Markus Armbruster 
Acked-by: Alistair Francis 
Reviewed-by: Markus Armbruster 
Signed-off-by: Eduardo Habkost 
---
 vl.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index 3fed457921..ebea42e0ea 100644
--- a/vl.c
+++ b/vl.c
@@ -3111,7 +3111,6 @@ int main(int argc, char **argv, char **envp)
 const char *qtest_log = NULL;
 const char *pid_file = NULL;
 const char *incoming = NULL;
-bool defconfig = true;
 bool userconfig = true;
 bool nographic = false;
 DisplayType display_type = DT_DEFAULT;
@@ -3213,8 +3212,6 @@ int main(int argc, char **argv, char **envp)
 popt = lookup_opt(argc, argv, &optarg, &optind);
 switch (popt->index) {
 case QEMU_OPTION_nodefconfig:
-defconfig = false;
-break;
 case QEMU_OPTION_nouserconfig:
 userconfig = false;
 break;
@@ -3222,7 +3219,7 @@ int main(int argc, char **argv, char **envp)
 }
 }
 
-if (defconfig && userconfig) {
+if (userconfig) {
 if (qemu_read_default_config_file() < 0) {
 exit(1);
 }
-- 
2.13.5

[Qemu-devel] [PATCH v3 2/2] qemu-options: Deprecate -nodefconfig

2017-10-03 Thread Eduardo Habkost

Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
have no default config files that would be disabled using
-nodefconfig.  Update documentation and document -nodefconfig as
deprecated.

Cc: Markus Armbruster 
Acked-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v2 -> v3:
* Move documentation to the right section of qemu-doc.texi

Changes v1 -> v2:
* Document at "Deprecated features" section in qemu-doc.texi
  (Daniel)
* Remove documentation for the option from qemu-options.hx
  (Markus)
---
 qemu-doc.texi   |  4 
 qemu-options.hx | 17 -
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index ecd186a159..d8bb2c664f 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2496,6 +2496,10 @@ would automatically enable USB support on the machine type.
 If using the new syntax, USB support must be explicitly
 enabled via the ``-machine usb=on'' argument.
 
+@subsection -nodefconfig (since 2.11.0)
+
+The ``-nodefconfig`` argument is a synonym for ``-no-user-config``.
+
 @section qemu-img command line arguments
 
 @subsection convert -s (since 2.0.0)
diff --git a/qemu-options.hx b/qemu-options.hx
index 39225ae6c3..981742d191 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4067,26 +4067,17 @@ Write device configuration to @var{file}. The @var{file} can be either filename
 command line and device configuration into file or dash @code{-}) character to print the
 output to stdout. This can be later used as input file for @code{-readconfig} option.
 ETEXI
-DEF("nodefconfig", 0, QEMU_OPTION_nodefconfig,
-"-nodefconfig\n"
-"do not load default config files at startup\n",
-QEMU_ARCH_ALL)
-STEXI
-@item -nodefconfig
-@findex -nodefconfig
-Normally QEMU loads configuration files from @var{sysconfdir} and @var{datadir} at startup.
-The @code{-nodefconfig} option will prevent QEMU from loading any of those config files.
-ETEXI
+HXCOMM Deprecated, same as -no-user-config
+DEF("nodefconfig", 0, QEMU_OPTION_nodefconfig, "", QEMU_ARCH_ALL)
 DEF("no-user-config", 0, QEMU_OPTION_nouserconfig,
 "-no-user-config\n"
-"do not load user-provided config files at startup\n",
+"do not load default user-provided config files at startup\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -no-user-config
 @findex -no-user-config
 The @code{-no-user-config} option makes QEMU not load any of the user-provided
-config files on @var{sysconfdir}, but won't make it skip the QEMU-provided config
-files from @var{datadir}.
+config files on @var{sysconfdir}.
 ETEXI
 DEF("trace", HAS_ARG, QEMU_OPTION_trace,
 "-trace [[enable=]][,events=][,file=]\n"
-- 
2.13.5

[Qemu-devel] [PATCH v3 0/2] Deprecate -nodefconfig

2017-10-03 Thread Eduardo Habkost

Changes v2 -> v3:
* Move documentation to the right section of qemu-doc.texi

Changes v1 -> v2:
* Document at "Deprecated features" section in qemu-doc.texi
  (Daniel)
* Remove documentation for the option from qemu-options.hx
  (Markus)

Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
have no default config files that would be disabled using
-nodefconfig.  This series cleans up the code, updates
documentation, and documents -nodefconfig as deprecated.

Eduardo Habkost (2):
  vl: Eliminate defconfig variable
  qemu-options: Deprecate -nodefconfig

 vl.c|  5 +
 qemu-doc.texi   |  4 
 qemu-options.hx | 17 -
 3 files changed, 9 insertions(+), 17 deletions(-)

-- 
2.13.5

[Qemu-devel] [PATCH v4 2/2] vl: Deprecate auto-loading of qemu.conf

2017-10-03 Thread Eduardo Habkost

In case there were options set in the default config file, print
a warning so users can update their scripts.

If somebody wants to keep the config file as-is, avoid the
warning and use a command-line that will work in future QEMU
versions, they can use:

 $QEMU -no-user-config -readconfig /etc/qemu/qemu.conf

I was going to include the suggestion in the warning message, but
I thought it could make it more confusing.  The suggestion is
documented in qemu-doc.texi.

Signed-off-by: Eduardo Habkost 
---
Changes v3 -> v4:
* Use warn_report() instead of error_report("warning: ...")
  (Eric Blake)
* Document as a deprecated feature in qemu-doc.texi
* Update subject line
  (was: "vl: Print warning when a default config file is loaded")

Changes v2 -> v3:
* Rebase (no code changes)
* Commit message update: suggest -no-user-config
---
 vl.c  | 6 ++
 qemu-doc.texi | 8 
 2 files changed, 14 insertions(+)

diff --git a/vl.c b/vl.c
index 3fed457921..1b0ecdf74e 100644
--- a/vl.c
+++ b/vl.c
@@ -3066,6 +3066,12 @@ static int qemu_read_default_config_file(void)
 return ret;
 }
 
+if (ret > 0) {
+loc_set_none();
+warn_report("Future QEMU versions won't load %s automatically",
+ CONFIG_QEMU_CONFDIR "/qemu.conf");
+}
+
 return 0;
 }
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index ecd186a159..a81a09d05c 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2370,6 +2370,14 @@ they were first deprecated in the 2.10.0 release.
 What follows is a list of all features currently marked as
 deprecated.
 
+@section Automatic loading of @file{qemu.conf} (since 2.11.0)
+
+The automatic loading of an user-provided @file{qemu.conf} file from the QEMU
+config directory is deprecated and behavior will change in future QEMU versions.
+To load an user-provided @file{qemu.conf} file and keep compatibility with
+future versions, the arguments @samp{-no-user-config -readconfig
+@var{CONFDIR}/qemu.conf} may be used.
+
 @section System emulator command line arguments
 
 @subsection -drive boot=on|off (since 1.3.0)
-- 
2.13.5

[Qemu-devel] [PATCH v4 1/2] config: qemu_config_parse() return number of config groups

2017-10-03 Thread Eduardo Habkost

Change qemu_config_parse() to return the number of config groups
on success and -EINVAL on error. This will allow callers of
qemu_config_parse() to check if something was really loaded from
the config file.

All existing callers of qemu_config_parse() and
qemu_read_config_file() only check if the return value was
negative, so the change shouldn't affect them.

Reviewed-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Signed-off-by: Eduardo Habkost 
---
Changes series v3 -> series v4:
* (none)

Changes v2 -> v3:
* None (rebase only)

Changes v1 -> v2:
* Remove unnecessary translation of qemu_config_parse()
  errors to -EINVAL at block/blkdebug.c:read_config()
  * Suggested-by: Markus Armbruster 
---
 block/blkdebug.c   |  1 -
 util/qemu-config.c | 15 +++
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 46e53f2f09..dfdf9b91aa 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -244,7 +244,6 @@ static int read_config(BDRVBlkdebugState *s, const char *filename,
 ret = qemu_config_parse(f, config_groups, filename);
 if (ret < 0) {
 error_setg(errp, "Could not parse blkdebug config file");
-ret = -EINVAL;
 goto fail;
 }
 }
diff --git a/util/qemu-config.c b/util/qemu-config.c
index 405dd1a1d7..99b0e46fa3 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -385,6 +385,7 @@ void qemu_config_write(FILE *fp)
 }
 }
 
+/* Returns number of config groups on success, -errno on error */
 int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 {
 char line[1024], group[64], id[64], arg[64], value[1024];
@@ -392,7 +393,8 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 QemuOptsList *list = NULL;
 Error *local_err = NULL;
 QemuOpts *opts = NULL;
-int res = -1, lno = 0;
+int res = -EINVAL, lno = 0;
+int count = 0;
 
 loc_push_none();
 while (fgets(line, sizeof(line), fp) != NULL) {
@@ -413,6 +415,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 goto out;
 }
 opts = qemu_opts_create(list, id, 1, NULL);
+count++;
 continue;
 }
 if (sscanf(line, "[%63[^]]]", group) == 1) {
@@ -423,6 +426,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 goto out;
 }
 opts = qemu_opts_create(list, NULL, 0, &error_abort);
+count++;
 continue;
 }
 value[0] = '\0';
@@ -447,7 +451,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 error_report("error reading file");
 goto out;
 }
-res = 0;
+res = count;
 out:
 loc_pop();
 return res;
@@ -464,12 +468,7 @@ int qemu_read_config_file(const char *filename)
 
 ret = qemu_config_parse(f, vm_config_groups, filename);
 fclose(f);
-
-if (ret == 0) {
-return 0;
-} else {
-return -EINVAL;
-}
+return ret;
 }
 
 static void config_parse_qdict_section(QDict *options, QemuOptsList *opts,
-- 
2.13.5
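
The return convention this patch introduces (number of config groups on success, negative errno on error) can be sketched in isolation as follows. This is a simplified stand-in, not the real qemu_config_parse(): the name count_groups_sketch() and the array-of-lines input are assumptions for illustration, and only the "[group]" header form is handled, using the same sscanf() pattern that appears in the diff context above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Simplified stand-in for qemu_config_parse(): counts "[group]" headers
 * in an array of lines.  Returns the number of groups on success and
 * -EINVAL if a line starting with '[' has no group name.
 */
int count_groups_sketch(const char *const *lines, int nlines)
{
    char group[64];
    int count = 0;

    for (int i = 0; i < nlines; i++) {
        if (lines[i][0] != '[') {
            continue;           /* option lines are ignored in this sketch */
        }
        if (sscanf(lines[i], "[%63[^]]]", group) != 1) {
            return -EINVAL;
        }
        count++;
    }
    return count;
}
```

A caller that only tests "ret < 0" behaves as before, while a caller that wants to know whether anything was actually loaded can additionally test "ret > 0", which is what patch 2/2 of this series does before printing its warning.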

[Qemu-devel] [PATCH v4 0/2] vl: Deprecate auto-loading of qemu.conf

2017-10-03 Thread Eduardo Habkost

This missed v2.9 and v2.10.  Let's try again: we can include this
on v2.11, and remove the default config file in QEMU 2.13 or
2.14.

Changes v3 -> v4:
* Use warn_report() instead of error_report("warning: ...")
  (Eric Blake)
* Document as a deprecated feature in qemu-doc.texi
* Updated Subject line
  (was "vl: Print warning if a non-empty default config file is found")

Changes v2 -> v3:
* Rebase to latest qemu.git master

Changes v1 -> v2:
* Remove unnecessary translation of qemu_config_parse()
  errors to -EINVAL at block/blkdebug.c:read_config()
  * Suggested-by: Markus Armbruster 

We plan to remove support for /etc/qemu/qemu.conf in the near
future. Make QEMU print a warning in case a non-empty
/etc/qemu/qemu.conf is loaded, so users have time to adapt.

Eduardo Habkost (2):
  config: qemu_config_parse() return number of config groups
  vl: Deprecate auto-loading of qemu.conf

 block/blkdebug.c   |  1 -
 util/qemu-config.c | 15 +++
 vl.c   |  6 ++
 qemu-doc.texi  |  8 
 4 files changed, 21 insertions(+), 9 deletions(-)

-- 
2.13.5

Re: [Qemu-devel] [PATCH v2 0/5] block: Avoid copy-on-read assertions

2017-10-03 Thread Eric Blake

On 10/03/2017 09:16 PM, no-re...@patchew.org wrote:
> Hi,
> 
> This series failed automatic build test. Please find the testing commands and
> their output below. If you have docker installed, you can probably reproduce 
> it
> locally.
> 

> 195 [not run] not suitable for this image format: raw
> 197 - output mismatch (see 197.out.bad)
> --- /tmp/qemu-test/src/tests/qemu-iotests/197.out 2017-10-04 
> 01:52:59.0 +
> +++ /tmp/qemu-test/build/tests/qemu-iotests/197.out.bad   2017-10-04 
> 02:15:52.212004491 +
> @@ -12,13 +12,18 @@
>  128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  read 0/0 bytes at offset 0
>  0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -read 2147483136/2147483136 bytes at offset 1024
> -2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +
> +(process:16284): GLib-ERROR **: gmem.c:100: failed to allocate 2147483136 
> bytes

Okay, a test that requires a nearly-2G read in one operation is fringe,
and I can see it choking 32-bit platforms rather easily.  How do we
modify the test to not be so mean to memory-starved systems?  And why
didn't patchew complain about this on v1, which had the same ~2G read?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




[Qemu-devel] [PATCH v5 21/23] block: Align block status requests

2017-10-03 Thread Eric Blake

Any device that has request_alignment greater than 512 should be
unable to report status at a finer granularity; it may also be
simpler for such devices to be guaranteed that the block layer
has rounded things out to the granularity boundary (the way the
block layer already rounds all other I/O out).  Besides, getting
the code correct for super-sector alignment also benefits us
for the fact that our public interface now has byte granularity,
even though none of our drivers have byte-level callbacks.

Add an assertion in blkdebug that proves that the block layer
never requests status of unaligned sections, similar to what it
does on other requests (while still keeping the generic helper
in place for when future patches add a throttle driver).  Note
that iotest 177 already covers this (it would fail if you use
just the blkdebug.c hunk without the io.c changes).  Meanwhile,
we can drop assertions in callers that no longer have to pass
in sector-aligned addresses.

There is a mid-function scope added for 'int count', for a
couple of reasons: first, an upcoming patch will add an 'if'
statement that checks whether a driver has an old- or new-style
callback, and can conveniently use the same scope for less
indentation churn at that time.  Second, since we are trying
to get rid of sector-based computations, wrapping things in
a scope makes it easier to group and see what will be deleted
in a final cleanup patch once all drivers have been converted
to the new-style callback.

Signed-off-by: Eric Blake 

---
v5: rebase to earlier changes, add more comments
v4: no change
v3: tweak commit message [Fam], rebase to context conflicts, ensure
we don't exceed 32-bit limit, drop R-b
v2: new patch
---
 include/block/block_int.h |  3 ++-
 block/io.c| 68 +--
 block/blkdebug.c  | 13 -
 3 files changed, 62 insertions(+), 22 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3b4158f576..41a229d933 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -207,7 +207,8 @@ struct BlockDriver {
  * according to the current layer, and should not set
  * BDRV_BLOCK_ALLOCATED, but may set BDRV_BLOCK_RAW.  See block.h
  * for the meaning of _DATA, _ZERO, and _OFFSET_VALID.  The block
- * layer guarantees non-NULL pnum and file.
+ * layer guarantees input aligned to request_alignment, as well as
+ * non-NULL pnum and file.
  */
 int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, int *pnum,
diff --git a/block/io.c b/block/io.c
index 8f0434ce4f..8619f82eae 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1818,7 +1818,8 @@ static int64_t coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
 int64_t ret, ret2;
 BlockDriverState *local_file = NULL;
 int64_t local_pnum = 0;
-int count; /* sectors */
+int64_t aligned_offset, aligned_bytes;
+uint32_t align;

 assert(pnum);
 total_size = bdrv_getlength(bs);
@@ -1851,32 +1852,58 @@ static int64_t coroutine_fn 
bdrv_co_block_status(BlockDriverState *bs,
 }

 bdrv_inc_in_flight(bs);
+
+/* Round out to request_alignment boundaries */
+/* TODO: until we have a byte-based driver callback, we also have to
+ * round out to sectors, even if that is bigger than request_alignment */
+align = MAX(bs->bl.request_alignment, BDRV_SECTOR_SIZE);
+aligned_offset = QEMU_ALIGN_DOWN(offset, align);
+aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
+
+{
+int count; /* sectors */
+
+assert(QEMU_IS_ALIGNED(aligned_offset | aligned_bytes,
+   BDRV_SECTOR_SIZE));
+/*
+ * The contract allows us to return pnum smaller than bytes, even
+ * if the next query would see the same status; we truncate the
+ * request to avoid overflowing the driver's 32-bit interface.
+ */
+ret = bs->drv->bdrv_co_get_block_status(
+bs, aligned_offset >> BDRV_SECTOR_BITS,
+MIN(INT_MAX, aligned_bytes) >> BDRV_SECTOR_BITS, &count,
+&local_file);
+if (ret < 0) {
+goto out;
+}
+local_pnum = count * BDRV_SECTOR_SIZE;
+}
+
 /*
- * TODO: Rather than require aligned offsets, we could instead
- * round to the driver's request_alignment here, then touch up
- * count afterwards back to the caller's expectations.
- */
-assert(QEMU_IS_ALIGNED(offset | bytes, BDRV_SECTOR_SIZE));
-/*
- * The contract allows us to return pnum smaller than bytes, even
- * if the next query would see the same status; we truncate the
- * request to avoid overflowing the driver's 32-bit interface.
+ * The driver's result must be a multiple of request_alignment.
+ * Clamp pnum and ret to original request; requires care if align
+ * is larger than a sector.

[Qemu-devel] [PATCH v5 23/23] qemu-io: Relax 'alloc' now that block-status doesn't assert

2017-10-03 Thread Eric Blake

Previously, the alloc command required that input parameters be
sector-aligned and clamped to 32 bits, because the underlying
bdrv_is_allocated used a 32-bit parameter and asserted aligned
inputs.  But now that we have fixed block status to report a
64-bit bytes value, and to properly round requests on behalf of
guests, we can pass any values, and can use qemu-io to add
coverage that our rounding is correct regardless of the guest
alignment constraints.

Update iotest 177 to intentionally probe block status at
unaligned boundaries as well as with a bytes value that does not
map to 32-bit sectors, which also required tweaking the image
prep to leave an unallocated portion in the image under test.

Signed-off-by: Eric Blake 

---
v3: also test huge bytes value, R-b dropped
v2: new patch
---
 qemu-io-cmds.c | 13 -
 tests/qemu-iotests/177 | 12 ++--
 tests/qemu-iotests/177.out | 19 ++-
 3 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 3727fb43f3..de8e3de726 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1769,10 +1769,6 @@ static int alloc_f(BlockBackend *blk, int argc, char 
**argv)
 if (offset < 0) {
 print_cvtnum_err(offset, argv[1]);
 return 0;
-} else if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) {
-printf("%" PRId64 " is not a sector-aligned value for 'offset'\n",
-   offset);
-return 0;
 }

 if (argc == 3) {
@@ -1780,19 +1776,10 @@ static int alloc_f(BlockBackend *blk, int argc, char 
**argv)
 if (count < 0) {
 print_cvtnum_err(count, argv[2]);
 return 0;
-} else if (count > INT_MAX * BDRV_SECTOR_SIZE) {
-printf("length argument cannot exceed %llu, given %s\n",
-   INT_MAX * BDRV_SECTOR_SIZE, argv[2]);
-return 0;
 }
 } else {
 count = BDRV_SECTOR_SIZE;
 }
-if (!QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE)) {
-printf("%" PRId64 " is not a sector-aligned value for 'count'\n",
-   count);
-return 0;
-}

 remaining = count;
 sum_alloc = 0;
diff --git a/tests/qemu-iotests/177 b/tests/qemu-iotests/177
index f8ed8fb86b..28990977f1 100755
--- a/tests/qemu-iotests/177
+++ b/tests/qemu-iotests/177
@@ -51,7 +51,7 @@ echo "== setting up files =="
 TEST_IMG="$TEST_IMG.base" _make_test_img $size
 $QEMU_IO -c "write -P 11 0 $size" "$TEST_IMG.base" | _filter_qemu_io
 _make_test_img -b "$TEST_IMG.base"
-$QEMU_IO -c "write -P 22 0 $size" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "write -P 22 0 110M" "$TEST_IMG" | _filter_qemu_io

 # Limited to 64k max-transfer
 echo
@@ -82,6 +82,13 @@ $QEMU_IO -c "open -o $options,$limits blkdebug::$TEST_IMG" \
  -c "discard 8001 30M" | _filter_qemu_io

 echo
+echo "== block status smaller than alignment =="
+limits=align=4k
+$QEMU_IO -c "open -o $options,$limits blkdebug::$TEST_IMG" \
+-c "alloc 1 1" -c "alloc 0x6dffff0 1000" -c "alloc 127m 5P" \
+-c map | _filter_qemu_io
+
+echo
 echo "== verify image content =="

 function verify_io()
@@ -103,7 +110,8 @@ function verify_io()
 echo read -P 0 32M 32M
 echo read -P 22 64M 13M
 echo read -P $discarded 77M 29M
-echo read -P 22 106M 22M
+echo read -P 22 106M 4M
+echo read -P 11 110M 18M
 }

 verify_io | $QEMU_IO -r "$TEST_IMG" | _filter_qemu_io
diff --git a/tests/qemu-iotests/177.out b/tests/qemu-iotests/177.out
index 43a777836c..f788b55e20 100644
--- a/tests/qemu-iotests/177.out
+++ b/tests/qemu-iotests/177.out
@@ -5,8 +5,8 @@ Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728
 wrote 134217728/134217728 bytes at offset 0
 128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/t.IMGFMT.base
-wrote 134217728/134217728 bytes at offset 0
-128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 115343360/115343360 bytes at offset 0
+110 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

 == constrained alignment and max-transfer ==
 wrote 131072/131072 bytes at offset 1000
@@ -26,6 +26,13 @@ wrote 33554432/33554432 bytes at offset 33554432
 discard 31457280/31457280 bytes at offset 8001
 30 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

+== block status smaller than alignment ==
+1/1 bytes allocated at offset 1 bytes
+16/1000 bytes allocated at offset 110 MiB
+0/1048576 bytes allocated at offset 127 MiB
+110 MiB (0x6e00000) bytes allocated at offset 0 bytes (0x0)
+18 MiB (0x1200000) bytes not allocated at offset 110 MiB (0x6e00000)
+
 == verify image content ==
 read 1000/1000 bytes at offset 0
 1000 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -43,12 +50,14 @@ read 13631488/13631488 bytes at offset 67108864
 13 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 30408704/30408704 bytes at offset 80740352
 29 MiB, X ops;

[Qemu-devel] [PATCH v5 20/23] qemu-img: Change img_compare() to be byte-based

2017-10-03 Thread Eric Blake

In the continuing quest to make more things byte-based, change
the internal iteration of img_compare().  We can finally drop the
TODO assertions added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.

Most of the change is mechanical ('total_sectors' becomes
'total_size', 'sector_num' becomes 'offset', 'nb_sectors' becomes
'chunk', 'progress_base' goes from sectors to bytes); some of it
is also a cleanup (sectors_to_bytes() is now unused, loss of
variable 'count' added earlier in commit 51b0a488).

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: rebase to earlier change, minor enough to keep R-b
v4: no change
v3: new patch
---
 qemu-img.c | 122 +++--
 1 file changed, 46 insertions(+), 76 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 2e74da978e..bcbe95dc29 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1186,11 +1186,6 @@ static int compare_buffers(const uint8_t *buf1, const 
uint8_t *buf2,

 #define IO_BUF_SIZE (2 * 1024 * 1024)

-static int64_t sectors_to_bytes(int64_t sectors)
-{
-return sectors << BDRV_SECTOR_BITS;
-}
-
 /*
  * Check if passed sectors are empty (not allocated or contain only 0 bytes)
  *
@@ -1241,7 +1236,7 @@ static int img_compare(int argc, char **argv)
 const char *fmt1 = NULL, *fmt2 = NULL, *cache, *filename1, *filename2;
 BlockBackend *blk1, *blk2;
 BlockDriverState *bs1, *bs2;
-int64_t total_sectors1, total_sectors2;
+int64_t total_size1, total_size2;
 uint8_t *buf1 = NULL, *buf2 = NULL;
 int64_t pnum1, pnum2;
 int allocated1, allocated2;
@@ -1249,9 +1244,9 @@ static int img_compare(int argc, char **argv)
 bool progress = false, quiet = false, strict = false;
 int flags;
 bool writethrough;
-int64_t total_sectors;
-int64_t sector_num = 0;
-int64_t nb_sectors;
+int64_t total_size;
+int64_t offset = 0;
+int64_t chunk;
 int c;
 uint64_t progress_base;
 bool image_opts = false;
@@ -1365,39 +1360,36 @@ static int img_compare(int argc, char **argv)

 buf1 = blk_blockalign(blk1, IO_BUF_SIZE);
 buf2 = blk_blockalign(blk2, IO_BUF_SIZE);
-total_sectors1 = blk_nb_sectors(blk1);
-if (total_sectors1 < 0) {
+total_size1 = blk_getlength(blk1);
+if (total_size1 < 0) {
 error_report("Can't get size of %s: %s",
- filename1, strerror(-total_sectors1));
+ filename1, strerror(-total_size1));
 ret = 4;
 goto out;
 }
-total_sectors2 = blk_nb_sectors(blk2);
-if (total_sectors2 < 0) {
+total_size2 = blk_getlength(blk2);
+if (total_size2 < 0) {
 error_report("Can't get size of %s: %s",
- filename2, strerror(-total_sectors2));
+ filename2, strerror(-total_size2));
 ret = 4;
 goto out;
 }
-total_sectors = MIN(total_sectors1, total_sectors2);
-progress_base = MAX(total_sectors1, total_sectors2);
+total_size = MIN(total_size1, total_size2);
+progress_base = MAX(total_size1, total_size2);

 qemu_progress_print(0, 100);

-if (strict && total_sectors1 != total_sectors2) {
+if (strict && total_size1 != total_size2) {
 ret = 1;
 qprintf(quiet, "Strict mode: Image size mismatch!\n");
 goto out;
 }

-while (sector_num < total_sectors) {
+while (offset < total_size) {
 int64_t status1, status2;

-status1 = bdrv_block_status_above(bs1, NULL,
-  sector_num * BDRV_SECTOR_SIZE,
-  (total_sectors1 - sector_num) *
-  BDRV_SECTOR_SIZE,
-  &pnum1, NULL);
+status1 = bdrv_block_status_above(bs1, NULL, offset,
+  total_size1 - offset, &pnum1, NULL);
 if (status1 < 0) {
 ret = 3;
 error_report("Sector allocation test failed for %s", filename1);
@@ -1405,31 +1397,24 @@ static int img_compare(int argc, char **argv)
 }
 allocated1 = status1 & BDRV_BLOCK_ALLOCATED;

-status2 = bdrv_block_status_above(bs2, NULL,
-  sector_num * BDRV_SECTOR_SIZE,
-  (total_sectors2 - sector_num) *
-  BDRV_SECTOR_SIZE,
-  &pnum2, NULL);
+status2 = bdrv_block_status_above(bs2, NULL, offset,
+  total_size2 - offset, &pnum2, NULL);
 if (status2 < 0) {
 ret = 3;
 error_report("Sector allocation test failed for %s", filename2);
 goto out;
 }
 allocated2 = status2 & BDRV_BLOCK_ALLOCATED;
-/* TODO: Relax this once comparison is byte-based, and we no longer

[Qemu-devel] [PATCH v5 15/23] qemu-img: Add find_nonzero()

2017-10-03 Thread Eric Blake

During 'qemu-img compare', when we are checking that an allocated
portion of one file is all zeros, we don't need to waste time
computing how many additional sectors after the first non-zero
byte are also non-zero.  Create a new helper find_nonzero() to do
the check for a first non-zero sector, and rebase
check_empty_sectors() to use it.

The new helper intentionally takes a byte count, even though it
still crawls the buffer a sector at a time; it is robust to a
partial sector at the end of the buffer.
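
For readers who want to try the scan outside of QEMU, here is a self-contained sketch of the same idea. all_zero() is a deliberately naive stand-in for QEMU's buffer_is_zero(), and find_nonzero_sketch() and SECTOR_SIZE are names assumed for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512  /* assumed stand-in for BDRV_SECTOR_SIZE */

/* Naive stand-in for buffer_is_zero(): true if all len bytes are 0. */
static int all_zero(const uint8_t *buf, int64_t len)
{
    for (int64_t i = 0; i < len; i++) {
        if (buf[i]) {
            return 0;
        }
    }
    return 1;
}

/*
 * Return the byte offset of the first sector that contains a non-zero
 * byte, or -1 if the whole buffer is zero.  A trailing partial sector
 * is checked on its own after the loop over whole sectors.
 */
static int64_t find_nonzero_sketch(const uint8_t *buf, int64_t n)
{
    int64_t i;
    int64_t end = n / SECTOR_SIZE * SECTOR_SIZE;

    assert(n >= 0);
    for (i = 0; i < end; i += SECTOR_SIZE) {
        if (!all_zero(buf + i, SECTOR_SIZE)) {
            return i;
        }
    }
    if (i < n && !all_zero(buf + i, n - end)) {
        return i;
    }
    return -1;
}
```

Note that the return value is the start of the offending sector, not the position of the non-zero byte itself, which matches the sector-granular "Content mismatch at offset" message.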

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: new patch
---
 qemu-img.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 43e3038894..12881f008e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1065,6 +1065,28 @@ done:
 }

 /*
+ * Returns -1 if 'buf' contains only zeroes, otherwise the byte index
+ * of the first sector boundary within buf where the sector contains a
+ * non-zero byte.  This function is robust to a buffer that is not
+ * sector-aligned.
+ */
+static int64_t find_nonzero(const uint8_t *buf, int64_t n)
+{
+int64_t i;
+int64_t end = QEMU_ALIGN_DOWN(n, BDRV_SECTOR_SIZE);
+
+for (i = 0; i < end; i += BDRV_SECTOR_SIZE) {
+if (!buffer_is_zero(buf + i, BDRV_SECTOR_SIZE)) {
+return i;
+}
+}
+if (i < n && !buffer_is_zero(buf + i, n - end)) {
+return i;
+}
+return -1;
+}
+
+/*
  * Returns true iff the first sector pointed to by 'buf' contains at least
  * a non-NUL byte.
  *
@@ -1189,7 +1211,9 @@ static int check_empty_sectors(BlockBackend *blk, int64_t 
sect_num,
int sect_count, const char *filename,
uint8_t *buffer, bool quiet)
 {
-int pnum, ret = 0;
+int ret = 0;
+int64_t idx;
+
 ret = blk_pread(blk, sect_num << BDRV_SECTOR_BITS, buffer,
 sect_count << BDRV_SECTOR_BITS);
 if (ret < 0) {
@@ -1197,10 +1221,10 @@ static int check_empty_sectors(BlockBackend *blk, 
int64_t sect_num,
  sectors_to_bytes(sect_num), filename, strerror(-ret));
 return ret;
 }
-ret = is_allocated_sectors(buffer, sect_count, &pnum);
-if (ret || pnum != sect_count) {
+idx = find_nonzero(buffer, sect_count * BDRV_SECTOR_SIZE);
+if (idx >= 0) {
 qprintf(quiet, "Content mismatch at offset %" PRId64 "!\n",
-sectors_to_bytes(ret ? sect_num : sect_num + pnum));
+sectors_to_bytes(sect_num) + idx);
 return 1;
 }

-- 
2.13.6

[Qemu-devel] [PATCH v5 11/23] block: Switch bdrv_co_get_block_status_above() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change), and rename it to match the corresponding
public function rename.

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v5: no change other than rebase to context
v4: no change
v3: rebase to allocation/mapping sense change, simple enough to keep R-b
v2: rebase to earlier changes
---
 block/io.c | 48 ++--
 1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/block/io.c b/block/io.c
index 4826751c27..ac7399ad41 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1935,12 +1935,12 @@ static int64_t coroutine_fn 
bdrv_co_block_status(BlockDriverState *bs,
 return ret;
 }

-static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState 
*bs,
+static int64_t coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
 BlockDriverState *base,
 bool mapping,
-int64_t sector_num,
-int nb_sectors,
-int *pnum,
+int64_t offset,
+int64_t bytes,
+int64_t *pnum,
 BlockDriverState **file)
 {
 BlockDriverState *p;
@@ -1949,17 +1949,10 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status_above(BlockDriverState *bs,

 assert(bs != base);
 for (p = bs; p != base; p = backing_bs(p)) {
-int64_t count;
-
-ret = bdrv_co_block_status(p, mapping,
-   sector_num * BDRV_SECTOR_SIZE,
-   nb_sectors * BDRV_SECTOR_SIZE, &count,
-   file);
+ret = bdrv_co_block_status(p, mapping, offset, bytes, pnum, file);
 if (ret < 0) {
 break;
 }
-assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
-*pnum = count >> BDRV_SECTOR_BITS;
 if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
 /*
  * Reading beyond the end of the file continues to read
@@ -1967,39 +1960,35 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status_above(BlockDriverState *bs,
  * unallocated length we learned from an earlier
  * iteration.
  */
-*pnum = nb_sectors;
+*pnum = bytes;
 }
 if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
 break;
 }
-/* [sector_num, pnum] unallocated on this layer, which could be only
- * the first part of [sector_num, nb_sectors].  */
-nb_sectors = MIN(nb_sectors, *pnum);
+/* [offset, pnum] unallocated on this layer, which could be only
+ * the first part of [offset, bytes].  */
+bytes = MIN(bytes, *pnum);
 first = false;
 }
 return ret;
 }

 /* Coroutine wrapper for bdrv_get_block_status_above() */
-static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
+static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque)
 {
 BdrvCoBlockStatusData *data = opaque;
-int n;

-data->ret = bdrv_co_get_block_status_above(data->bs, data->base,
-   data->mapping,
-   data->offset >> 
BDRV_SECTOR_BITS,
-   data->bytes >> BDRV_SECTOR_BITS,
-   &n,
-   data->file);
-*data->pnum = n * BDRV_SECTOR_SIZE;
+data->ret = bdrv_co_block_status_above(data->bs, data->base,
+   data->mapping,
+   data->offset, data->bytes,
+   data->pnum, data->file);
 data->done = true;
 }

 /*
- * Synchronous wrapper around bdrv_co_get_block_status_above().
+ * Synchronous wrapper around bdrv_co_block_status_above().
  *
- * See bdrv_co_get_block_status_above() for details.
+ * See bdrv_co_block_status_above() for details.
  */
 static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
@@ -2022,10 +2011,9 @@ static int64_t 
bdrv_common_block_status_above(BlockDriverState *bs,

 if (qemu_in_coroutine()) {
 /* Fast-path if already in coroutine context */
-bdrv_get_block_status_above_co_entry(&data);
+bdrv_block_status_above_co_entry(&data);
 } else {
-co = qemu_coroutine_create(bdrv_get_block_status_above_co_entry,
-   &data);
+co = qemu_coroutine_create(bdrv_block_status_above_co_entry, &data);
 bdrv_coroutine_enter(bs, co);
 BDRV_POLL_WHILE(bs, !data.done);
 }
-- 
2.13.6

[Qemu-devel] [PATCH v5 18/23] qemu-img: Change compare_sectors() to be byte-based

2017-10-03 Thread Eric Blake

In the continuing quest to make more things byte-based, change
compare_sectors(), renaming it to compare_buffers() in the
process.  Note that one caller (qemu-img compare) only cares
about the first difference, while the other (qemu-img rebase)
cares about how many consecutive sectors have the same
equal/different status; however, this patch does not bother to
micro-optimize the compare case to avoid the comparisons of
sectors beyond the first mismatch.  Both callers are always
passing valid buffers in, so the initial check for buffer size
can be turned into an assertion.
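
The byte-based comparison can be tried in isolation with the sketch below. It mirrors the loop in the diff, but compare_buffers_sketch(), MIN_I64() and SECTOR_SIZE are names assumed for this example rather than QEMU identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512  /* assumed stand-in for BDRV_SECTOR_SIZE */
#define MIN_I64(a, b) ((a) < (b) ? (a) : (b))

/*
 * Compare two buffers a sector at a time.  Returns 0 if the first
 * sector matches and 1 if it differs; *pnum receives the length of the
 * prefix whose sectors all share that same result.
 */
static int compare_buffers_sketch(const uint8_t *buf1, const uint8_t *buf2,
                                  int64_t bytes, int64_t *pnum)
{
    int64_t i = MIN_I64(bytes, SECTOR_SIZE);
    int res;

    assert(bytes > 0);
    res = !!memcmp(buf1, buf2, (size_t)i);
    while (i < bytes) {
        /* The last chunk may be shorter than a full sector. */
        int64_t len = MIN_I64(bytes - i, SECTOR_SIZE);

        if (!!memcmp(buf1 + i, buf2 + i, (size_t)len) != res) {
            break;
        }
        i += len;
    }
    *pnum = i;
    return res;
}
```

A caller that only wants the first difference (the compare case) can treat "non-zero result, or *pnum shorter than bytes" as a mismatch, while a caller like rebase can keep calling with the remaining bytes to walk runs of equal or different sectors.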

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: new patch
---
 qemu-img.c | 55 +++
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 016d0cc23a..b988c718aa 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1156,31 +1156,28 @@ static int is_allocated_sectors_min(const uint8_t *buf, 
int n, int *pnum,
 }

 /*
- * Compares two buffers sector by sector. Returns 0 if the first sector of both
- * buffers matches, non-zero otherwise.
+ * Compares two buffers sector by sector. Returns 0 if the first
+ * sector of each buffer matches, non-zero otherwise.
  *
- * pnum is set to the number of sectors (including and immediately following
- * the first one) that are known to have the same comparison result
+ * pnum is set to the sector-aligned size of the buffer prefix that
+ * has the same matching status as the first sector.
  */
-static int compare_sectors(const uint8_t *buf1, const uint8_t *buf2, int n,
-int *pnum)
+static int compare_buffers(const uint8_t *buf1, const uint8_t *buf2,
+   int64_t bytes, int64_t *pnum)
 {
 bool res;
-int i;
+int64_t i = MIN(bytes, BDRV_SECTOR_SIZE);

-if (n <= 0) {
-*pnum = 0;
-return 0;
-}
+assert(bytes > 0);

-res = !!memcmp(buf1, buf2, 512);
-for(i = 1; i < n; i++) {
-buf1 += 512;
-buf2 += 512;
+res = !!memcmp(buf1, buf2, i);
+while (i < bytes) {
+int64_t len = MIN(bytes - i, BDRV_SECTOR_SIZE);

-if (!!memcmp(buf1, buf2, 512) != res) {
+if (!!memcmp(buf1 + i, buf2 + i, len) != res) {
 break;
 }
+i += len;
 }

 *pnum = i;
@@ -1255,7 +1252,7 @@ static int img_compare(int argc, char **argv)
 int64_t total_sectors;
 int64_t sector_num = 0;
 int64_t nb_sectors;
-int c, pnum;
+int c;
 uint64_t progress_base;
 bool image_opts = false;
 bool force_share = false;
@@ -1440,6 +1437,8 @@ static int img_compare(int argc, char **argv)
 /* nothing to do */
 } else if (allocated1 == allocated2) {
 if (allocated1) {
+int64_t pnum;
+
 nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
 ret = blk_pread(blk1, sector_num << BDRV_SECTOR_BITS, buf1,
 nb_sectors << BDRV_SECTOR_BITS);
@@ -1459,11 +1458,11 @@ static int img_compare(int argc, char **argv)
 ret = 4;
 goto out;
 }
-ret = compare_sectors(buf1, buf2, nb_sectors, &pnum);
-if (ret || pnum != nb_sectors) {
+ret = compare_buffers(buf1, buf2,
+  nb_sectors * BDRV_SECTOR_SIZE, &pnum);
+if (ret || pnum != nb_sectors * BDRV_SECTOR_SIZE) {
 qprintf(quiet, "Content mismatch at offset %" PRId64 "!\n",
-sectors_to_bytes(
-ret ? sector_num : sector_num + pnum));
+sectors_to_bytes(sector_num) + (ret ? 0 : pnum));
 ret = 1;
 goto out;
 }
@@ -3354,16 +3353,16 @@ static int img_rebase(int argc, char **argv)
 /* If they differ, we need to write to the COW file */
 uint64_t written = 0;

-while (written < n) {
-int pnum;
+while (written < n * BDRV_SECTOR_SIZE) {
+int64_t pnum;

-if (compare_sectors(buf_old + written * 512,
-buf_new + written * 512, n - written, &pnum))
+if (compare_buffers(buf_old + written,
+buf_new + written,
+n * BDRV_SECTOR_SIZE - written, &pnum))
 {
 ret = blk_pwrite(blk,
- (sector + written) << BDRV_SECTOR_BITS,
- buf_old + written * 512,
- pnum << BDRV_SECTOR_BITS, 0);
+ (sector << BDRV_SECTOR_BITS) + written,
+ buf_old + written, pnum, 0);
 if (ret < 0) {

[Qemu-devel] [PATCH v5 10/23] block: Switch bdrv_common_block_status_above() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).
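
The conversion pattern used throughout this series, where a remaining sector-based entry point becomes a thin wrapper around a byte-based one, can be sketched as follows. status_bytes() is a toy query invented for this example (it pretends the first 4096 bytes are allocated); status_sectors() shows the scale-up, call, check-alignment, scale-down shape that bdrv_get_block_status_above() takes in the diff below.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9                     /* assumed: 512-byte sectors */
#define SECTOR_SIZE (1LL << SECTOR_BITS)
#define TOY_ALLOCATED_END 4096            /* hypothetical toy image layout */

/*
 * Toy byte-based query: returns 1 if 'offset' lies in the allocated
 * part of the toy image and 0 otherwise; *pnum is the number of bytes
 * starting at 'offset' that share that status.
 */
static int64_t status_bytes(int64_t offset, int64_t bytes, int64_t *pnum)
{
    if (offset < TOY_ALLOCATED_END) {
        int64_t left = TOY_ALLOCATED_END - offset;

        *pnum = bytes < left ? bytes : left;
        return 1;
    }
    *pnum = bytes;
    return 0;
}

/* Sector-based wrapper: convert in, call, verify alignment, convert out. */
static int64_t status_sectors(int64_t sector_num, int nb_sectors, int *pnum)
{
    int64_t n;
    int64_t ret = status_bytes(sector_num * SECTOR_SIZE,
                               nb_sectors * SECTOR_SIZE, &n);

    if (ret < 0) {
        return ret;
    }
    assert(n % SECTOR_SIZE == 0);
    *pnum = (int)(n >> SECTOR_BITS);
    return ret;
}
```

Keeping the alignment assertion in the wrapper, rather than in the byte-based core, is what lets later patches delete the wrapper without touching the core logic.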

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: rebase to allocation/mapping sense change, simple enough to keep R-b
v2: new patch
---
 block/io.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/block/io.c b/block/io.c
index 1857191187..4826751c27 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2004,19 +2004,18 @@ static void coroutine_fn 
bdrv_get_block_status_above_co_entry(void *opaque)
 static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
   bool mapping,
-  int64_t sector_num,
-  int nb_sectors, int *pnum,
+  int64_t offset,
+  int64_t bytes, int64_t *pnum,
   BlockDriverState **file)
 {
 Coroutine *co;
-int64_t n;
 BdrvCoBlockStatusData data = {
 .bs = bs,
 .base = base,
 .file = file,
-.offset = sector_num * BDRV_SECTOR_SIZE,
-.bytes = nb_sectors * BDRV_SECTOR_SIZE,
-.pnum = &n,
+.offset = offset,
+.bytes = bytes,
+.pnum = pnum,
 .mapping = mapping,
 .done = false,
 };
@@ -2030,8 +2029,6 @@ static int64_t 
bdrv_common_block_status_above(BlockDriverState *bs,
 bdrv_coroutine_enter(bs, co);
 BDRV_POLL_WHILE(bs, !data.done);
 }
-assert(data.ret < 0 || QEMU_IS_ALIGNED(n, BDRV_SECTOR_SIZE));
-*pnum = n >> BDRV_SECTOR_BITS;
 return data.ret;
 }

@@ -2041,8 +2038,19 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
 int nb_sectors, int *pnum,
 BlockDriverState **file)
 {
-return bdrv_common_block_status_above(bs, base, true, sector_num,
-  nb_sectors, pnum, file);
+int64_t ret;
+int64_t n;
+
+ret = bdrv_common_block_status_above(bs, base, true,
+ sector_num * BDRV_SECTOR_SIZE,
+ nb_sectors * BDRV_SECTOR_SIZE,
+ &n, file);
+if (ret < 0) {
+return ret;
+}
+assert(QEMU_IS_ALIGNED(n, BDRV_SECTOR_SIZE));
+*pnum = n >> BDRV_SECTOR_BITS;
+return ret;
 }

 int64_t bdrv_block_status(BlockDriverState *bs,
@@ -2071,20 +2079,13 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState 
*bs, int64_t offset,
int64_t bytes, int64_t *pnum)
 {
 int64_t ret;
-int psectors;
+int64_t dummy;

-assert(QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE));
-assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE) && bytes < INT_MAX);
-ret = bdrv_common_block_status_above(bs, backing_bs(bs), false,
- offset >> BDRV_SECTOR_BITS,
- bytes >> BDRV_SECTOR_BITS, &psectors,
- NULL);
+ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset,
+ bytes, pnum ? pnum : &dummy, NULL);
 if (ret < 0) {
 return ret;
 }
-if (pnum) {
-*pnum = psectors * BDRV_SECTOR_SIZE;
-}
 return !!(ret & BDRV_BLOCK_ALLOCATED);
 }

-- 
2.13.6

[Qemu-devel] [PATCH v5 16/23] qemu-img: Drop redundant error message in compare

2017-10-03 Thread Eric Blake

If a read error is encountered during 'qemu-img compare', we
were printing the "Error while reading offset ..." message twice;
this was because our helper function was awkward, printing output
on some but not all paths.  Fix it to consistently report errors
on all paths, so that the callers do not risk a redundant message,
and update the testsuite for the improved output.

Further simplify the code by hoisting the conversion from an error
message to an exit code into the helper function, rather than
repeating that logic at all callers (yes, the helper function is
now less generic, but it's a net win in lines of code).
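
A minimal sketch of the "report once, return the exit code" shape described above is shown here. It is not the QEMU helper: check_empty_sketch() takes the result of an earlier read as a parameter instead of calling blk_pread(), and the enum names are invented for this example. Only the 0/1/4 return convention comes from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Exit codes as described in the patch: 0 same, 1 mismatch, 4 read error. */
enum { CMP_SAME = 0, CMP_MISMATCH = 1, CMP_READ_ERROR = 4 };

/*
 * Hypothetical helper: 'read_ret' is the (negative errno) result of an
 * earlier read of 'len' bytes into 'buf'.  Every failure path prints
 * its own message, so the caller only needs "if (ret) goto out;".
 */
static int check_empty_sketch(int read_ret, const uint8_t *buf, int64_t len,
                              int64_t base_offset, const char *filename)
{
    if (read_ret < 0) {
        fprintf(stderr, "Error while reading offset %" PRId64 " of %s: %s\n",
                base_offset, filename, strerror(-read_ret));
        return CMP_READ_ERROR;
    }
    assert(len >= 0);
    for (int64_t i = 0; i < len; i++) {
        if (buf[i]) {
            printf("Content mismatch at offset %" PRId64 "!\n",
                   base_offset + i);
            return CMP_MISMATCH;
        }
    }
    return CMP_SAME;
}
```

Because the helper already returns a value suitable for the process exit status, the caller no longer needs to distinguish negative errno values from comparison failures.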

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: tweak commit message, but no code change
v4: no change
v3: new patch
---
 qemu-img.c | 19 +--
 tests/qemu-iotests/074.out |  2 --
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 12881f008e..a8ed5d990d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1197,8 +1197,10 @@ static int64_t sectors_to_bytes(int64_t sectors)
 /*
  * Check if passed sectors are empty (not allocated or contain only 0 bytes)
  *
- * Returns 0 in case sectors are filled with 0, 1 if sectors contain non-zero
- * data and negative value on error.
+ * Intended for use by 'qemu-img compare': Returns 0 in case sectors are
+ * filled with 0, 1 if sectors contain non-zero data (this is a comparison
+ * failure), and 4 on error (the exit status for read errors), after emitting
+ * an error message.
  *
  * @param blk:  BlockBackend for the image
  * @param sect_num: Number of first sector to check
@@ -1219,7 +1221,7 @@ static int check_empty_sectors(BlockBackend *blk, int64_t 
sect_num,
 if (ret < 0) {
 error_report("Error while reading offset %" PRId64 " of %s: %s",
  sectors_to_bytes(sect_num), filename, strerror(-ret));
-return ret;
+return 4;
 }
 idx = find_nonzero(buffer, sect_count * BDRV_SECTOR_SIZE);
 if (idx >= 0) {
@@ -1477,11 +1479,6 @@ static int img_compare(int argc, char **argv)
   filename2, buf1, quiet);
 }
 if (ret) {
-if (ret < 0) {
-error_report("Error while reading offset %" PRId64 ": %s",
- sectors_to_bytes(sector_num), strerror(-ret));
-ret = 4;
-}
 goto out;
 }
 }
@@ -1526,12 +1523,6 @@ static int img_compare(int argc, char **argv)
 ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
   filename_over, buf1, quiet);
 if (ret) {
-if (ret < 0) {
-error_report("Error while reading offset %" PRId64
- " of %s: %s", 
sectors_to_bytes(sector_num),
- filename_over, strerror(-ret));
-ret = 4;
-}
 goto out;
 }
 }
diff --git a/tests/qemu-iotests/074.out b/tests/qemu-iotests/074.out
index 8fba5aea9c..ede66c3f81 100644
--- a/tests/qemu-iotests/074.out
+++ b/tests/qemu-iotests/074.out
@@ -4,7 +4,6 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 wrote 512/512 bytes at offset 512
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qemu-img: Error while reading offset 0 of 
blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
-qemu-img: Error while reading offset 0: Input/output error
 4
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=0
@@ -12,7 +11,6 @@ Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=0
 wrote 512/512 bytes at offset 512
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qemu-img: Error while reading offset 0 of 
blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
-qemu-img: Error while reading offset 0 of 
blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
 Warning: Image size mismatch!
 4
 Cleanup
-- 
2.13.6

[Qemu-devel] [PATCH v5 19/23] qemu-img: Change img_rebase() to be byte-based

2017-10-03 Thread Eric Blake

In the continuing quest to make more things byte-based, change
the internal iteration of img_rebase().  We can finally drop the
TODO assertion added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.

Most of the change is mechanical ('num_sectors' becomes 'size',
'sector' becomes 'offset', 'n' goes from sectors to bytes); some
of it is also a cleanup (use of MIN() instead of open-coding,
loss of variable 'count' added earlier in commit d6a644bb).
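
The simplified iteration (advance by bytes, and size each step with MIN() instead of an open-coded if/else) has roughly the shape below. count_chunks() is a hypothetical function written for this example; the real loop additionally lets bdrv_is_allocated() shrink 'n' on each pass, which this sketch omits.

```c
#include <assert.h>
#include <stdint.h>

#define IO_BUF_SIZE (2 * 1024 * 1024)  /* same value as in qemu-img.c */

/*
 * Walk 'size' bytes in steps of at most IO_BUF_SIZE.  Returns how many
 * steps were taken and stores the length of the final step in
 * *last_chunk (0 if the image is empty).
 */
static int64_t count_chunks(int64_t size, int64_t *last_chunk)
{
    int64_t offset, n, chunks = 0;

    assert(size >= 0);
    *last_chunk = 0;
    for (offset = 0; offset < size; offset += n) {
        /* How many bytes can we handle with the next read? */
        n = size - offset < IO_BUF_SIZE ? size - offset : IO_BUF_SIZE;
        *last_chunk = n;
        chunks++;
    }
    return chunks;
}
```

With byte offsets the final, shorter step needs no special sector arithmetic: an image of 5 MiB plus 1000 bytes is covered by two full buffers and one step of 1 MiB plus 1000 bytes.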

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: new patch
---
 qemu-img.c | 84 +-
 1 file changed, 34 insertions(+), 50 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index b988c718aa..2e74da978e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3248,70 +3248,58 @@ static int img_rebase(int argc, char **argv)
  * the image is the same as the original one at any time.
  */
 if (!unsafe) {
-int64_t num_sectors;
-int64_t old_backing_num_sectors;
-int64_t new_backing_num_sectors = 0;
-uint64_t sector;
-int n;
-int64_t count;
+int64_t size;
+int64_t old_backing_size;
+int64_t new_backing_size = 0;
+uint64_t offset;
+int64_t n;
 float local_progress = 0;

 buf_old = blk_blockalign(blk, IO_BUF_SIZE);
 buf_new = blk_blockalign(blk, IO_BUF_SIZE);

-num_sectors = blk_nb_sectors(blk);
-if (num_sectors < 0) {
+size = blk_getlength(blk);
+if (size < 0) {
 error_report("Could not get size of '%s': %s",
- filename, strerror(-num_sectors));
+ filename, strerror(-size));
 ret = -1;
 goto out;
 }
-old_backing_num_sectors = blk_nb_sectors(blk_old_backing);
-if (old_backing_num_sectors < 0) {
+old_backing_size = blk_getlength(blk_old_backing);
+if (old_backing_size < 0) {
 char backing_name[PATH_MAX];

 bdrv_get_backing_filename(bs, backing_name, sizeof(backing_name));
 error_report("Could not get size of '%s': %s",
- backing_name, strerror(-old_backing_num_sectors));
+ backing_name, strerror(-old_backing_size));
 ret = -1;
 goto out;
 }
 if (blk_new_backing) {
-new_backing_num_sectors = blk_nb_sectors(blk_new_backing);
-if (new_backing_num_sectors < 0) {
+new_backing_size = blk_getlength(blk_new_backing);
+if (new_backing_size < 0) {
 error_report("Could not get size of '%s': %s",
- out_baseimg, strerror(-new_backing_num_sectors));
+ out_baseimg, strerror(-new_backing_size));
 ret = -1;
 goto out;
 }
 }

-if (num_sectors != 0) {
-local_progress = (float)100 /
-(num_sectors / MIN(num_sectors, IO_BUF_SIZE / 512));
+if (size != 0) {
+local_progress = (float)100 / (size / MIN(size, IO_BUF_SIZE));
 }

-for (sector = 0; sector < num_sectors; sector += n) {
-
-/* How many sectors can we handle with the next read? */
-if (sector + (IO_BUF_SIZE / 512) <= num_sectors) {
-n = (IO_BUF_SIZE / 512);
-} else {
-n = num_sectors - sector;
-}
+for (offset = 0; offset < size; offset += n) {
+/* How many bytes can we handle with the next read? */
+n = MIN(IO_BUF_SIZE, size - offset);

 /* If the cluster is allocated, we don't need to take action */
-ret = bdrv_is_allocated(bs, sector << BDRV_SECTOR_BITS,
-n << BDRV_SECTOR_BITS, &count);
+ret = bdrv_is_allocated(bs, offset, n, &n);
 if (ret < 0) {
 error_report("error while reading image metadata: %s",
  strerror(-ret));
 goto out;
 }
-/* TODO relax this once bdrv_is_allocated does not enforce
- * sector alignment */
-assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
-n = count >> BDRV_SECTOR_BITS;
 if (ret) {
 continue;
 }
@@ -3320,30 +3308,28 @@ static int img_rebase(int argc, char **argv)
  * Read old and new backing file and take into consideration that
  * backing files may be smaller than the COW image.
  */
-if (sector >= old_backing_num_sectors) {
-memset(buf_old, 0, n * BDRV_SECTOR_SIZE);
+if (offset >= old_backing_size) {
+memset(buf_old, 0, n);
 } else {
-if (sector + n >

[Qemu-devel] [PATCH v5 09/23] block: Switch BdrvCoGetBlockStatusData to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
type (no semantic change), and rename it to match the corresponding
public function rename.

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: rebase to context conflicts, simple enough to keep R-b
v2: rebase to earlier changes
---
 block/io.c | 31 ++-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/block/io.c b/block/io.c
index b879e26154..1857191187 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1744,17 +1744,17 @@ int bdrv_flush_all(void)
 }


-typedef struct BdrvCoGetBlockStatusData {
+typedef struct BdrvCoBlockStatusData {
 BlockDriverState *bs;
 BlockDriverState *base;
 BlockDriverState **file;
-int64_t sector_num;
-int nb_sectors;
-int *pnum;
+int64_t offset;
+int64_t bytes;
+int64_t *pnum;
 int64_t ret;
 bool mapping;
 bool done;
-} BdrvCoGetBlockStatusData;
+} BdrvCoBlockStatusData;

 int64_t coroutine_fn bdrv_co_get_block_status_from_file(BlockDriverState *bs,
 int64_t sector_num,
@@ -1983,14 +1983,16 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
 /* Coroutine wrapper for bdrv_get_block_status_above() */
 static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
 {
-BdrvCoGetBlockStatusData *data = opaque;
+BdrvCoBlockStatusData *data = opaque;
+int n;

 data->ret = bdrv_co_get_block_status_above(data->bs, data->base,
data->mapping,
-   data->sector_num,
-   data->nb_sectors,
-   data->pnum,
+   data->offset >> BDRV_SECTOR_BITS,
+   data->bytes >> BDRV_SECTOR_BITS,
+   &n,
data->file);
+*data->pnum = n * BDRV_SECTOR_SIZE;
 data->done = true;
 }

@@ -2007,13 +2009,14 @@ static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
   BlockDriverState **file)
 {
 Coroutine *co;
-BdrvCoGetBlockStatusData data = {
+int64_t n;
+BdrvCoBlockStatusData data = {
 .bs = bs,
 .base = base,
 .file = file,
-.sector_num = sector_num,
-.nb_sectors = nb_sectors,
-.pnum = pnum,
+.offset = sector_num * BDRV_SECTOR_SIZE,
+.bytes = nb_sectors * BDRV_SECTOR_SIZE,
+.pnum = &n,
 .mapping = mapping,
 .done = false,
 };
@@ -2027,6 +2030,8 @@ static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
 bdrv_coroutine_enter(bs, co);
 BDRV_POLL_WHILE(bs, !data.done);
 }
+assert(data.ret < 0 || QEMU_IS_ALIGNED(n, BDRV_SECTOR_SIZE));
+*pnum = n >> BDRV_SECTOR_BITS;
 return data.ret;
 }

-- 
2.13.6
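
The scaling pattern in the patch above (byte-based fields on the outside, a sector-based core underneath) can be sketched in isolation. This is a minimal illustration, not QEMU code: `sector_status()` and `byte_status()` are hypothetical names, and `SECTOR_BITS`/`SECTOR_SIZE` are stand-ins for `BDRV_SECTOR_BITS`/`BDRV_SECTOR_SIZE`, assuming 512-byte sectors.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for BDRV_SECTOR_BITS / BDRV_SECTOR_SIZE (assumed 512 bytes). */
#define SECTOR_BITS 9
#define SECTOR_SIZE (1LL << SECTOR_BITS)

/* Hypothetical sector-based core: reports that at most 8 sectors starting
 * at sector_num share one status, and returns 0 for success. */
static int sector_status(int64_t sector_num, int nb_sectors, int *pnum)
{
    (void)sector_num;
    *pnum = nb_sectors < 8 ? nb_sectors : 8;
    return 0;
}

/* Byte-based wrapper: scale down on the way in, scale up on the way out. */
static int byte_status(int64_t offset, int64_t bytes, int64_t *pnum)
{
    int n = 0;
    int ret;

    /* Callers must stay sector-aligned while the core counts sectors. */
    assert(offset % SECTOR_SIZE == 0 && bytes % SECTOR_SIZE == 0);
    ret = sector_status(offset >> SECTOR_BITS, (int)(bytes >> SECTOR_BITS), &n);
    *pnum = (int64_t)n * SECTOR_SIZE;
    return ret;
}
```

Once the core itself becomes byte-based, the shifts and the alignment assertion can be dropped without touching the callers of the wrapper.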

[Qemu-devel] [PATCH v5 22/23] block: Relax bdrv_aligned_preadv() assertion

2017-10-03 Thread Eric Blake

Now that bdrv_is_allocated accepts non-aligned inputs, we can
remove the TODO added in commit d6a644bb.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: new patch [Kevin]
---
 block/io.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/io.c b/block/io.c
index 8619f82eae..cf4217ec29 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1103,18 +1103,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
 }

 if (flags & BDRV_REQ_COPY_ON_READ) {
-/* TODO: Simplify further once bdrv_is_allocated no longer
- * requires sector alignment */
-int64_t start = QEMU_ALIGN_DOWN(offset, BDRV_SECTOR_SIZE);
-int64_t end = QEMU_ALIGN_UP(offset + bytes, BDRV_SECTOR_SIZE);
 int64_t pnum;

-ret = bdrv_is_allocated(bs, start, end - start, &pnum);
+ret = bdrv_is_allocated(bs, offset, bytes, &pnum);
 if (ret < 0) {
 goto out;
 }

-if (!ret || pnum != end - start) {
+if (!ret || pnum != bytes) {
 ret = bdrv_co_do_copy_on_readv(child, offset, bytes, qiov);
 goto out;
 }
-- 
2.13.6

[Qemu-devel] [PATCH v5 14/23] qemu-img: Speed up compare on pre-allocated larger file

2017-10-03 Thread Eric Blake

Compare the following images with all-zero contents:
$ truncate --size 1M A
$ qemu-img create -f qcow2 -o preallocation=off B 1G
$ qemu-img create -f qcow2 -o preallocation=metadata C 1G

On my machine, the difference is noticeable for pre-patch speeds,
with more than an order of magnitude in difference caused by the
choice of preallocation in the qcow2 file:

$ time ./qemu-img compare -f raw -F qcow2 A B
Warning: Image size mismatch!
Images are identical.

real    0m0.014s
user    0m0.007s
sys     0m0.007s

$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.

real    0m0.341s
user    0m0.144s
sys     0m0.188s

Why? Because bdrv_is_allocated() returns false for image B but
true for image C, throwing away the fact that both images know
via lseek(SEEK_HOLE) that the entire image still reads as zero.
From there, qemu-img ends up calling bdrv_pread() for every byte
of the tail, instead of quickly looking for the next allocation.
The solution: use block_status instead of is_allocated, giving:

$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.

real    0m0.014s
user    0m0.011s
sys     0m0.003s

which is on par with the speeds for no pre-allocation.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Vladimir Sementsov-Ogievskiy 

---
v4-v5: no change
v3: new patch
---
 qemu-img.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index abd289c0b5..43e3038894 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1481,11 +1481,11 @@ static int img_compare(int argc, char **argv)
 while (sector_num < progress_base) {
 int64_t count;

-ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL,
+ret = bdrv_block_status_above(blk_bs(blk_over), NULL,
   sector_num * BDRV_SECTOR_SIZE,
   (progress_base - sector_num) *
   BDRV_SECTOR_SIZE,
-  &count);
+  &count, NULL);
 if (ret < 0) {
 ret = 3;
 error_report("Sector allocation test failed for %s",
@@ -1493,11 +1493,11 @@ static int img_compare(int argc, char **argv)
 goto out;

 }
-/* TODO relax this once bdrv_is_allocated_above does not enforce
+/* TODO relax this once bdrv_block_status_above does not enforce
  * sector alignment */
 assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
 nb_sectors = count >> BDRV_SECTOR_BITS;
-if (ret) {
+if (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO)) {
 nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
 ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
   filename_over, buf1, quiet);
-- 
2.13.6
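
The new condition in the hunk above boils down to a two-flag test: the tail has to be read only when it is allocated and not already known to read as zero. A minimal sketch of that test follows; `must_read_tail()` is a hypothetical helper, and the flag values are illustrative stand-ins for the `BDRV_BLOCK_*` bits that QEMU defines in block.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for BDRV_BLOCK_ZERO and BDRV_BLOCK_ALLOCATED. */
#define BLOCK_ZERO      0x02
#define BLOCK_ALLOCATED 0x10

/* Decide whether the oversized tail must actually be read and scanned. */
static bool must_read_tail(int64_t status)
{
    assert(status >= 0); /* negative values are errors, handled by the caller */
    return (status & BLOCK_ALLOCATED) && !(status & BLOCK_ZERO);
}
```

With a plain allocated/unallocated answer, a preallocated but still zero range falls into the "must read" case, which is the slowdown measured for image C in the commit message.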

[Qemu-devel] [PATCH v5 05/23] block: Switch bdrv_make_zero() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of zeroing a device to track by bytes instead of
sectors (although we are still guaranteed that we iterate by steps
that are sector-aligned).

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v3-v5: no change
v2: rebase to earlier changes
---
 block/io.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/block/io.c b/block/io.c
index e4d5d33805..afba2da1c4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -696,38 +696,38 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
  */
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 {
-int64_t target_sectors, ret, nb_sectors, sector_num = 0;
+int64_t target_size, ret, bytes, offset = 0;
 BlockDriverState *bs = child->bs;
-int n;
+int n; /* sectors */

-target_sectors = bdrv_nb_sectors(bs);
-if (target_sectors < 0) {
-return target_sectors;
+target_size = bdrv_getlength(bs);
+if (target_size < 0) {
+return target_size;
 }

 for (;;) {
-nb_sectors = MIN(target_sectors - sector_num, BDRV_REQUEST_MAX_SECTORS);
-if (nb_sectors <= 0) {
+bytes = MIN(target_size - offset, BDRV_REQUEST_MAX_BYTES);
+if (bytes <= 0) {
 return 0;
 }
-ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, NULL);
+ret = bdrv_get_block_status(bs, offset >> BDRV_SECTOR_BITS,
+bytes >> BDRV_SECTOR_BITS, &n, NULL);
 if (ret < 0) {
-error_report("error getting block status at sector %" PRId64 ": %s",
- sector_num, strerror(-ret));
+error_report("error getting block status at offset %" PRId64 ": %s",
+ offset, strerror(-ret));
+ offset, strerror(-ret));
 return ret;
 }
 if (ret & BDRV_BLOCK_ZERO) {
-sector_num += n;
+offset += n * BDRV_SECTOR_BITS;
 continue;
 }
-ret = bdrv_pwrite_zeroes(child, sector_num << BDRV_SECTOR_BITS,
- n << BDRV_SECTOR_BITS, flags);
+ret = bdrv_pwrite_zeroes(child, offset, n * BDRV_SECTOR_SIZE, flags);
 if (ret < 0) {
-error_report("error writing zeroes at sector %" PRId64 ": %s",
- sector_num, strerror(-ret));
+error_report("error writing zeroes at offset %" PRId64 ": %s",
+ offset, strerror(-ret));
 return ret;
 }
-sector_num += n;
+offset += n * BDRV_SECTOR_SIZE;
 }
 }

-- 
2.13.6
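
The byte-based iteration described in this patch can be modelled on an in-memory image. The sketch below is only an illustration under stated assumptions: `run_status()` is a hypothetical stand-in for a block-status query, `MAX_REQUEST` stands in for `BDRV_REQUEST_MAX_BYTES`, and the image is a plain byte array rather than a `BlockDriverState`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define MAX_REQUEST (4 * SECTOR_SIZE) /* stand-in for BDRV_REQUEST_MAX_BYTES */

/* Does the sector starting at 'offset' contain only zero bytes? */
static bool sector_is_zero(const uint8_t *img, int64_t offset)
{
    for (int i = 0; i < SECTOR_SIZE; i++) {
        if (img[offset + i]) {
            return false;
        }
    }
    return true;
}

/* Hypothetical status query: sets *pnum to the length in bytes of the run
 * starting at 'offset' that shares one zero/non-zero state (capped at
 * 'bytes') and returns true if that run reads as zero. */
static bool run_status(const uint8_t *img, int64_t offset, int64_t bytes,
                       int64_t *pnum)
{
    bool zero = sector_is_zero(img, offset);
    int64_t n = SECTOR_SIZE;

    while (n < bytes && sector_is_zero(img, offset + n) == zero) {
        n += SECTOR_SIZE;
    }
    *pnum = n;
    return zero;
}

/* Zero the whole image; returns the number of bytes actually written. */
static int64_t make_zero(uint8_t *img, int64_t size)
{
    int64_t offset = 0, written = 0;

    assert(size % SECTOR_SIZE == 0);
    while (offset < size) {
        int64_t left = size - offset;
        int64_t bytes = left < MAX_REQUEST ? left : MAX_REQUEST;
        int64_t n;

        if (!run_status(img, offset, bytes, &n)) {
            memset(img + offset, 0, n); /* only non-zero runs are written */
            written += n;
        }
        offset += n; /* advance by a byte count, not a sector count */
    }
    return written;
}
```

Keeping the loop variable in bytes means every advance is a plain addition of the reported run length, with no sector scaling left to get wrong.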

[Qemu-devel] [PATCH v5 12/23] block: Convert bdrv_get_block_status_above() to bytes

2017-10-03 Thread Eric Blake

We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the name of the function from bdrv_get_block_status_above()
to bdrv_block_status_above() ensures that the compiler enforces that
all callers are updated.  For now, the io.c layer still assert()s
that all callers are sector-aligned, but that can be relaxed when a
later patch implements byte-based block status in the drivers.

For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_block_status().  But some
code, particularly bdrv_block_status(), gets a lot simpler because
it no longer has to mess with sectors.  Likewise, mirror code no
longer computes s->granularity >> BDRV_SECTOR_BITS, and can therefore
drop an assertion about alignment because the loop no longer depends
on alignment (never mind that we don't really have a driver that
reports sub-sector alignments, so it's not really possible to test
the effect of sub-sector mirroring).  Fix a neighboring assertion to
use is_power_of_2 while there.

For ease of review, bdrv_get_block_status() was tackled separately.

Signed-off-by: Eric Blake 

---
v5: assert alignment rather than rounding up in img_compare [John],
rebase to earlier changes
v4: rebase to earlier changes
v3: rebase to allocation/mapping sense change and qcow2-measure, tweak
mirror assertions, drop R-b
v2: rebase to earlier changes
---
 include/block/block.h | 10 +-
 block/io.c| 43 ---
 block/mirror.c| 16 +---
 block/qcow2.c | 25 +
 qemu-img.c| 40 
 5 files changed, 51 insertions(+), 83 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 4ecd2c4a65..b484e60509 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -427,11 +427,11 @@ bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
 int64_t bdrv_block_status(BlockDriverState *bs, int64_t offset,
   int64_t bytes, int64_t *pnum,
   BlockDriverState **file);
-int64_t bdrv_get_block_status_above(BlockDriverState *bs,
-BlockDriverState *base,
-int64_t sector_num,
-int nb_sectors, int *pnum,
-BlockDriverState **file);
+int64_t bdrv_block_status_above(BlockDriverState *bs,
+BlockDriverState *base,
+int64_t offset,
+int64_t bytes, int64_t *pnum,
+BlockDriverState **file);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
   int64_t *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
diff --git a/block/io.c b/block/io.c
index ac7399ad41..8f0434ce4f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1973,7 +1973,7 @@ static int64_t coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
 return ret;
 }

-/* Coroutine wrapper for bdrv_get_block_status_above() */
+/* Coroutine wrapper for bdrv_block_status_above() */
 static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque)
 {
 BdrvCoBlockStatusData *data = opaque;
@@ -2020,47 +2020,20 @@ static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
 return data.ret;
 }

-int64_t bdrv_get_block_status_above(BlockDriverState *bs,
-BlockDriverState *base,
-int64_t sector_num,
-int nb_sectors, int *pnum,
-BlockDriverState **file)
+int64_t bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
+int64_t offset, int64_t bytes, int64_t *pnum,
+BlockDriverState **file)
 {
-int64_t ret;
-int64_t n;
-
-ret = bdrv_common_block_status_above(bs, base, true,
- sector_num * BDRV_SECTOR_SIZE,
- nb_sectors * BDRV_SECTOR_SIZE,
- &n, file);
-if (ret < 0) {
-return ret;
-}
-assert(QEMU_IS_ALIGNED(n, BDRV_SECTOR_SIZE));
-*pnum = n >> BDRV_SECTOR_BITS;
-return ret;
+return bdrv_common_block_status_above(bs, base, true, offset, bytes,
+  pnum, file);
 }

 int64_t bdrv_block_status(BlockDriverState *bs,
   int64_t offset, int64_t bytes,

[Qemu-devel] [PATCH v5 17/23] qemu-img: Change check_empty_sectors() to byte-based

2017-10-03 Thread Eric Blake

Continue on the quest to make more things byte-based instead of
sector-based.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4-v5: no change
v3: new patch
---
 qemu-img.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index a8ed5d990d..016d0cc23a 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1203,30 +1203,29 @@ static int64_t sectors_to_bytes(int64_t sectors)
  * an error message.
  *
  * @param blk:  BlockBackend for the image
- * @param sect_num: Number of first sector to check
- * @param sect_count: Number of sectors to check
+ * @param offset: Starting offset to check
+ * @param bytes: Number of bytes to check
  * @param filename: Name of disk file we are checking (logging purpose)
  * @param buffer: Allocated buffer for storing read data
  * @param quiet: Flag for quiet mode
  */
-static int check_empty_sectors(BlockBackend *blk, int64_t sect_num,
-   int sect_count, const char *filename,
+static int check_empty_sectors(BlockBackend *blk, int64_t offset,
+   int64_t bytes, const char *filename,
uint8_t *buffer, bool quiet)
 {
 int ret = 0;
 int64_t idx;

-ret = blk_pread(blk, sect_num << BDRV_SECTOR_BITS, buffer,
-sect_count << BDRV_SECTOR_BITS);
+ret = blk_pread(blk, offset, buffer, bytes);
 if (ret < 0) {
 error_report("Error while reading offset %" PRId64 " of %s: %s",
- sectors_to_bytes(sect_num), filename, strerror(-ret));
+ offset, filename, strerror(-ret));
 return 4;
 }
-idx = find_nonzero(buffer, sect_count * BDRV_SECTOR_SIZE);
+idx = find_nonzero(buffer, bytes);
 if (idx >= 0) {
 qprintf(quiet, "Content mismatch at offset %" PRId64 "!\n",
-sectors_to_bytes(sect_num) + idx);
+offset + idx);
 return 1;
 }

@@ -1472,10 +1471,12 @@ static int img_compare(int argc, char **argv)
 } else {
 nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
 if (allocated1) {
-ret = check_empty_sectors(blk1, sector_num, nb_sectors,
+ret = check_empty_sectors(blk1, sector_num * BDRV_SECTOR_SIZE,
+  nb_sectors * BDRV_SECTOR_SIZE,
   filename1, buf1, quiet);
 } else {
-ret = check_empty_sectors(blk2, sector_num, nb_sectors,
+ret = check_empty_sectors(blk2, sector_num * BDRV_SECTOR_SIZE,
+  nb_sectors * BDRV_SECTOR_SIZE,
   filename2, buf1, quiet);
 }
 if (ret) {
@@ -1520,7 +1521,9 @@ static int img_compare(int argc, char **argv)
 nb_sectors = count >> BDRV_SECTOR_BITS;
 if (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO)) {
 nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
-ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
+ret = check_empty_sectors(blk_over,
+  sector_num * BDRV_SECTOR_SIZE,
+  nb_sectors * BDRV_SECTOR_SIZE,
   filename_over, buf1, quiet);
 if (ret) {
 goto out;
-- 
2.13.6
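
The byte-based check in the patch above reduces to "scan the buffer that was read from `offset`, and report the absolute position of the first non-zero byte". A minimal sketch, assuming the data has already been read into `buf`: `first_nonzero()` is a hypothetical stand-in for qemu-img's `find_nonzero()` helper, and `check_empty()` is not the real `check_empty_sectors()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for find_nonzero(): index of the first non-zero
 * byte in buf, or -1 if all 'bytes' bytes are zero. */
static int64_t first_nonzero(const uint8_t *buf, int64_t bytes)
{
    for (int64_t i = 0; i < bytes; i++) {
        if (buf[i]) {
            return i;
        }
    }
    return -1;
}

/* buf holds the data read from [offset, offset + bytes).  Returns the
 * absolute offset of the first mismatch, or -1 if the range is empty. */
static int64_t check_empty(const uint8_t *buf, int64_t offset, int64_t bytes)
{
    int64_t idx = first_nonzero(buf, bytes);

    return idx >= 0 ? offset + idx : -1;
}
```

Because `offset` is already in bytes, the mismatch position is a single addition, which is what lets the patch drop the `sectors_to_bytes()` calls from the error paths.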

[Qemu-devel] [PATCH v5 13/23] qemu-img: Simplify logic in img_compare()

2017-10-03 Thread Eric Blake

As long as we are querying the status for a chunk smaller than
the known image size, we are guaranteed that a successful return
will have set pnum to a non-zero size (pnum is zero only for
queries beyond the end of the file).  Use that to slightly
simplify the calculation of the current chunk size being compared.
Likewise, we don't have to shrink the amount of data operated on
until we know we have to read the file, and therefore have to fit
in the bounds of our buffer.  Also, note that 'total_sectors_over'
is equivalent to 'progress_base'.

With these changes in place, sectors_to_process() is now dead code,
and can be removed.

Signed-off-by: Eric Blake 

---
v5: rebase to alignment assertion [John]
v4: no change
v3: new patch
---
 qemu-img.c | 38 +++---
 1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index a351af09ac..abd289c0b5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1172,11 +1172,6 @@ static int64_t sectors_to_bytes(int64_t sectors)
 return sectors << BDRV_SECTOR_BITS;
 }

-static int64_t sectors_to_process(int64_t total, int64_t from)
-{
-return MIN(total - from, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
-}
-
 /*
  * Check if passed sectors are empty (not allocated or contain only 0 bytes)
  *
@@ -1373,13 +1368,9 @@ static int img_compare(int argc, char **argv)
 goto out;
 }

-for (;;) {
+while (sector_num < total_sectors) {
 int64_t status1, status2;

-nb_sectors = sectors_to_process(total_sectors, sector_num);
-if (nb_sectors <= 0) {
-break;
-}
 status1 = bdrv_block_status_above(bs1, NULL,
   sector_num * BDRV_SECTOR_SIZE,
   (total_sectors1 - sector_num) *
@@ -1406,12 +1397,9 @@ static int img_compare(int argc, char **argv)
 /* TODO: Relax this once comparison is byte-based, and we no longer
  * have to worry about sector alignment */
 assert(QEMU_IS_ALIGNED(pnum1 | pnum2, BDRV_SECTOR_SIZE));
-if (pnum1) {
-nb_sectors = MIN(nb_sectors, pnum1 >> BDRV_SECTOR_BITS);
-}
-if (pnum2) {
-nb_sectors = MIN(nb_sectors, pnum2 >> BDRV_SECTOR_BITS);
-}
+
+assert(pnum1 && pnum2);
+nb_sectors = MIN(pnum1, pnum2) >> BDRV_SECTOR_BITS;

 if (strict) {
 if ((status1 & ~BDRV_BLOCK_OFFSET_MASK) !=
@@ -1424,9 +1412,10 @@ static int img_compare(int argc, char **argv)
 }
 }
 if ((status1 & BDRV_BLOCK_ZERO) && (status2 & BDRV_BLOCK_ZERO)) {
-nb_sectors = DIV_ROUND_UP(MIN(pnum1, pnum2), BDRV_SECTOR_SIZE);
+/* nothing to do */
 } else if (allocated1 == allocated2) {
 if (allocated1) {
+nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
 ret = blk_pread(blk1, sector_num << BDRV_SECTOR_BITS, buf1,
 nb_sectors << BDRV_SECTOR_BITS);
 if (ret < 0) {
@@ -1455,7 +1444,7 @@ static int img_compare(int argc, char **argv)
 }
 }
 } else {
-
+nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
 if (allocated1) {
 ret = check_empty_sectors(blk1, sector_num, nb_sectors,
   filename1, buf1, quiet);
@@ -1478,30 +1467,24 @@ static int img_compare(int argc, char **argv)

 if (total_sectors1 != total_sectors2) {
 BlockBackend *blk_over;
-int64_t total_sectors_over;
 const char *filename_over;

 qprintf(quiet, "Warning: Image size mismatch!\n");
 if (total_sectors1 > total_sectors2) {
-total_sectors_over = total_sectors1;
 blk_over = blk1;
 filename_over = filename1;
 } else {
-total_sectors_over = total_sectors2;
 blk_over = blk2;
 filename_over = filename2;
 }

-for (;;) {
+while (sector_num < progress_base) {
 int64_t count;

-nb_sectors = sectors_to_process(total_sectors_over, sector_num);
-if (nb_sectors <= 0) {
-break;
-}
 ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL,
   sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE,
+  (progress_base - sector_num) *
+  BDRV_SECTOR_SIZE,
   &count);
 if (ret < 0) {
 ret = 3;
@@ -1515,6 +1498,7 @@ static int img_compare(int argc, char **argv)
 assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
 nb_sectors = count >> BDRV_SECTOR_BITS;
 if (ret) {
+

[Qemu-devel] [PATCH v5 01/23] block: Allow NULL file for bdrv_get_block_status()

2017-10-03 Thread Eric Blake

Not all callers care about which BDS owns the mapping for a given
range of the file.  This patch merely simplifies the callers by
consolidating the logic in the common call point, while guaranteeing
a non-NULL file to all the driver callbacks, for no semantic change.
The only caller that does not care about pnum is bdrv_is_allocated,
as invoked by vvfat; we can likewise add assertions that the rest
of the stack does not have to worry about a NULL pnum.

Furthermore, this will also set the stage for a future cleanup: when
a caller does not care about which BDS owns an offset, it would be
nice to allow the driver to optimize things to not have to return
BDRV_BLOCK_OFFSET_VALID in the first place.  In the case of fragmented
allocation (for example, it's fairly easy to create a qcow2 image
where consecutive guest addresses are not at consecutive host
addresses), the current contract requires bdrv_get_block_status()
to clamp *pnum to the limit where host addresses are no longer
consecutive, but allowing a NULL file means that *pnum could be
set to the full length of known-allocated data.

Signed-off-by: Eric Blake 

---
v5: use second label for cleaner exit logic [John], use local_pnum
v4: only context changes
v3: rebase to recent changes (qcow2_measure), dropped R-b
v2: use local variable and final transfer, rather than assignment
of parameter to local
[previously in different series]:
v2: new patch, 
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05645.html
---
 include/block/block_int.h | 10 
 block/io.c| 58 ++-
 block/mirror.c|  3 +--
 block/qcow2.c |  8 ++-
 qemu-img.c| 10 
 5 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 79366b94b5..3b4158f576 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -202,10 +202,12 @@ struct BlockDriver {
 int64_t offset, int bytes);

 /*
- * Building block for bdrv_block_status[_above]. The driver should
- * answer only according to the current layer, and should not
- * set BDRV_BLOCK_ALLOCATED, but may set BDRV_BLOCK_RAW.  See block.h
- * for the meaning of _DATA, _ZERO, and _OFFSET_VALID.
+ * Building block for bdrv_block_status[_above] and
+ * bdrv_is_allocated[_above].  The driver should answer only
+ * according to the current layer, and should not set
+ * BDRV_BLOCK_ALLOCATED, but may set BDRV_BLOCK_RAW.  See block.h
+ * for the meaning of _DATA, _ZERO, and _OFFSET_VALID.  The block
+ * layer guarantees non-NULL pnum and file.
  */
 int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, int *pnum,
diff --git a/block/io.c b/block/io.c
index 1e246315a7..e5a6f63eea 100644
--- a/block/io.c
+++ b/block/io.c
@@ -698,7 +698,6 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 {
 int64_t target_sectors, ret, nb_sectors, sector_num = 0;
 BlockDriverState *bs = child->bs;
-BlockDriverState *file;
 int n;

 target_sectors = bdrv_nb_sectors(bs);
@@ -711,7 +710,7 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 if (nb_sectors <= 0) {
 return 0;
 }
-ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, &file);
+ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, NULL);
 if (ret < 0) {
 error_report("error getting block status at sector %" PRId64 ": %s",
  sector_num, strerror(-ret));
@@ -1800,8 +1799,9 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
  * beyond the end of the disk image it will be clamped; if 'pnum' is set to
  * the end of the image, then the returned value will include BDRV_BLOCK_EOF.
  *
- * If returned value is positive and BDRV_BLOCK_OFFSET_VALID bit is set, 'file'
- * points to the BDS which the sector range is allocated in.
+ * If returned value is positive, BDRV_BLOCK_OFFSET_VALID bit is set, and
+ * 'file' is non-NULL, then '*file' points to the BDS which the sector range
+ * is allocated in.
  */
 static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
  int64_t sector_num,
@@ -1811,16 +1811,19 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 int64_t total_sectors;
 int64_t n;
 int64_t ret, ret2;
+BlockDriverState *local_file = NULL;
+int local_pnum = 0;

-*file = NULL;
+assert(pnum);
 total_sectors = bdrv_nb_sectors(bs);
 if (total_sectors < 0) {
-return total_sectors;
+ret = total_sectors;
+goto early_out;
 }

 if (sector_num >= total_sectors || !nb_sectors) {
-*pnum = 0;
-return sector_num >= total_sectors ? BDRV_BLOCK_EOF : 0;
+

[Qemu-devel] [PATCH v5 06/23] qemu-img: Switch get_block_status() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting
an internal function (no semantic change), and simplifying its
caller accordingly.

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v2-v5: no change
---
 qemu-img.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index e9c7b30c91..af3effdec5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2671,14 +2671,16 @@ static void dump_map_entry(OutputFormat output_format, MapEntry *e,
 }
 }

-static int get_block_status(BlockDriverState *bs, int64_t sector_num,
-int nb_sectors, MapEntry *e)
+static int get_block_status(BlockDriverState *bs, int64_t offset,
+int64_t bytes, MapEntry *e)
 {
 int64_t ret;
 int depth;
 BlockDriverState *file;
 bool has_offset;
+int nb_sectors = bytes >> BDRV_SECTOR_BITS;

+assert(bytes < INT_MAX);
 /* As an optimization, we could cache the current range of unallocated
  * clusters in each file of the chain, and avoid querying the same
  * range repeatedly.
@@ -2686,8 +2688,8 @@ static int get_block_status(BlockDriverState *bs, int64_t sector_num,

 depth = 0;
 for (;;) {
-ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &nb_sectors,
-&file);
+ret = bdrv_get_block_status(bs, offset >> BDRV_SECTOR_BITS, nb_sectors,
+&nb_sectors, &file);
 if (ret < 0) {
 return ret;
 }
@@ -2707,7 +2709,7 @@ static int get_block_status(BlockDriverState *bs, int64_t sector_num,
 has_offset = !!(ret & BDRV_BLOCK_OFFSET_VALID);

 *e = (MapEntry) {
-.start = sector_num * BDRV_SECTOR_SIZE,
+.start = offset,
 .length = nb_sectors * BDRV_SECTOR_SIZE,
 .data = !!(ret & BDRV_BLOCK_DATA),
 .zero = !!(ret & BDRV_BLOCK_ZERO),
@@ -2837,16 +2839,12 @@ static int img_map(int argc, char **argv)

 length = blk_getlength(blk);
 while (curr.start + curr.length < length) {
-int64_t nsectors_left;
-int64_t sector_num;
-int n;
-
-sector_num = (curr.start + curr.length) >> BDRV_SECTOR_BITS;
+int64_t offset = curr.start + curr.length;
+int64_t n;

 /* Probe up to 1 GiB at a time.  */
-nsectors_left = DIV_ROUND_UP(length, BDRV_SECTOR_SIZE) - sector_num;
-n = MIN(1 << (30 - BDRV_SECTOR_BITS), nsectors_left);
-ret = get_block_status(bs, sector_num, n, );
+n = QEMU_ALIGN_DOWN(MIN(1 << 30, length - offset), BDRV_SECTOR_SIZE);
+ret = get_block_status(bs, offset, n, );

 if (ret < 0) {
 error_report("Could not read file metadata: %s", strerror(-ret));
-- 
2.13.6
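
The new chunk-size computation in img_map() (at most 1 GiB per probe, rounded down to a sector multiple) can be checked on its own. This is a sketch with hypothetical names: `probe_length()` does not exist in qemu-img, and `PROBE_MAX` and `SECTOR_SIZE` are stand-ins for `1 << 30` and `BDRV_SECTOR_SIZE`.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512
#define PROBE_MAX (1LL << 30) /* probe up to 1 GiB at a time */

/* Number of bytes to query next, given the image length and the current
 * byte offset: MIN(1 GiB, remaining), rounded down to a sector multiple. */
static int64_t probe_length(int64_t length, int64_t offset)
{
    int64_t left = length - offset;
    int64_t n = left < PROBE_MAX ? left : PROBE_MAX;

    return n / SECTOR_SIZE * SECTOR_SIZE;
}
```

Note that in this sketch a remaining tail shorter than one sector rounds down to 0, so it is only suitable while lengths are sector-aligned.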

[Qemu-devel] [PATCH v5 04/23] qcow2: Switch is_zero_sectors() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change), and rename it to is_zero() in the
process.

Signed-off-by: Eric Blake 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 

---
v3-v5: no change
v2: rename function, rebase to upstream changes
---
 block/qcow2.c | 32 ++--
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index bcd5c4a34c..e0de46f530 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2972,21 +2972,28 @@ finish:
 }


-static bool is_zero_sectors(BlockDriverState *bs, int64_t start,
-uint32_t count)
+static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
 {
 int nr;
 int64_t res;
+int64_t start;

-if (start + count > bs->total_sectors) {
-count = bs->total_sectors - start;
+/* Widen to sector boundaries, then clamp to image length, before
+ * checking status of underlying sectors */
+start = QEMU_ALIGN_DOWN(offset, BDRV_SECTOR_SIZE);
+bytes = QEMU_ALIGN_UP(offset + bytes, BDRV_SECTOR_SIZE) - start;
+
+if (start + bytes > bs->total_sectors * BDRV_SECTOR_SIZE) {
+bytes = bs->total_sectors * BDRV_SECTOR_SIZE - start;
 }

-if (!count) {
+if (!bytes) {
 return true;
 }
-res = bdrv_get_block_status_above(bs, NULL, start, count, &nr, NULL);
-return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == count;
+res = bdrv_get_block_status_above(bs, NULL, start >> BDRV_SECTOR_BITS,
+  bytes >> BDRV_SECTOR_BITS, &nr, NULL);
+return res >= 0 && (res & BDRV_BLOCK_ZERO) &&
+nr * BDRV_SECTOR_SIZE == bytes;
 }

 static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
@@ -3004,24 +3011,21 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
 }

 if (head || tail) {
-int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
 uint64_t off;
 unsigned int nr;

 assert(head + bytes <= s->cluster_size);

 /* check whether remainder of cluster already reads as zero */
-if (!(is_zero_sectors(bs, cl_start,
-  DIV_ROUND_UP(head, BDRV_SECTOR_SIZE)) &&
-  is_zero_sectors(bs, (offset + bytes) >> BDRV_SECTOR_BITS,
-  DIV_ROUND_UP(-tail & (s->cluster_size - 1),
-   BDRV_SECTOR_SIZE)))) {
+if (!(is_zero(bs, offset - head, head) &&
+  is_zero(bs, offset + bytes,
+  tail ? s->cluster_size - tail : 0))) {
 return -ENOTSUP;
 }

 qemu_co_mutex_lock(&s->lock);
 /* We can have new write after previous check */
-offset = cl_start << BDRV_SECTOR_BITS;
+offset = QEMU_ALIGN_DOWN(offset, s->cluster_size);
 bytes = s->cluster_size;
 nr = s->cluster_size;
 ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
-- 
2.13.6
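
The "widen to sector boundaries, then clamp to image length" step in is_zero() can be sketched separately from the status query. This is an illustration only: `widen_range()` is a hypothetical helper, and `ALIGN_DOWN`/`ALIGN_UP` are local stand-ins for `QEMU_ALIGN_DOWN`/`QEMU_ALIGN_UP`.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define ALIGN_UP(n, m)   ALIGN_DOWN((n) + (m) - 1, (m))

/* Widen [offset, offset + bytes) to sector boundaries, then clamp the
 * result to image_size.  Stores the widened start in *start and returns
 * the widened length in bytes. */
static int64_t widen_range(int64_t offset, int64_t bytes, int64_t image_size,
                           int64_t *start)
{
    *start = ALIGN_DOWN(offset, SECTOR_SIZE);
    bytes = ALIGN_UP(offset + bytes, SECTOR_SIZE) - *start;

    if (*start + bytes > image_size) {
        bytes = image_size - *start;
    }
    return bytes;
}
```

A zero-length request at an aligned offset stays zero-length, which matches the early `return true` for `!bytes` in the patch.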

[Qemu-devel] [PATCH v5 07/23] block: Convert bdrv_get_block_status() to bytes

2017-10-03 Thread Eric Blake

We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the name of the function from bdrv_get_block_status() to
bdrv_block_status() ensures that the compiler enforces that all
callers are updated.  For now, the io.c layer still assert()s that
all callers are sector-aligned, but that can be relaxed when a later
patch implements byte-based block status in the drivers.

Note that we have an inherent limitation in the BDRV_BLOCK_* return
values: BDRV_BLOCK_OFFSET_VALID can only return the start of a
sector, even if we later relax the interface to query for the status
starting at an intermediate byte; document the obvious interpretation
that valid offsets are always sector-relative.

Therefore, for the most part this patch is just the addition of scaling
at the callers followed by inverse scaling at bdrv_block_status().  But
some code, particularly bdrv_is_allocated(), gets a lot simpler because
it no longer has to mess with sectors.

For ease of review, bdrv_get_block_status_above() will be tackled
separately.

Signed-off-by: Eric Blake 

---
v5: drop pointless 'if (pnum)' [John], add comment
v4: no change
v3: clamp bytes to 32-bits, rather than asserting
v2: rebase to earlier changes
---
 include/block/block.h | 12 +++-
 block/io.c| 35 +++
 block/qcow2-cluster.c |  2 +-
 qemu-img.c| 20 +++-
 4 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index be49c4ae9d..4ecd2c4a65 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -138,8 +138,10 @@ typedef struct HDGeometry {
  *
  * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 (BDRV_BLOCK_OFFSET_MASK)
  * represent the offset in the returned BDS that is allocated for the
- * corresponding raw data; however, whether that offset actually contains
- * data also depends on BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO, as follows:
+ * corresponding raw data.  Individual bytes are at the same sector-relative
+ * locations (and thus, this bit cannot be set for mappings which are
+ * not equivalent modulo 512).  However, whether that offset actually
+ * contains data also depends on BDRV_BLOCK_DATA, as follows:
  *
  * DATA ZERO OFFSET_VALID
  *  ttt   sectors read as zero, returned file is zero at offset
@@ -422,9 +424,9 @@ int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
-int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
-  int nb_sectors, int *pnum,
-  BlockDriverState **file);
+int64_t bdrv_block_status(BlockDriverState *bs, int64_t offset,
+  int64_t bytes, int64_t *pnum,
+  BlockDriverState **file);
 int64_t bdrv_get_block_status_above(BlockDriverState *bs,
 BlockDriverState *base,
 int64_t sector_num,
diff --git a/block/io.c b/block/io.c
index afba2da1c4..ab1853dc2d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -698,7 +698,6 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 {
 int64_t target_size, ret, bytes, offset = 0;
 BlockDriverState *bs = child->bs;
-int n; /* sectors */

 target_size = bdrv_getlength(bs);
 if (target_size < 0) {
@@ -710,24 +709,23 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 if (bytes <= 0) {
 return 0;
 }
-ret = bdrv_get_block_status(bs, offset >> BDRV_SECTOR_BITS,
-bytes >> BDRV_SECTOR_BITS, &n, NULL);
+ret = bdrv_block_status(bs, offset, bytes, &bytes, NULL);
 if (ret < 0) {
 error_report("error getting block status at offset %" PRId64 ": %s",
  offset, strerror(-ret));
 return ret;
 }
 if (ret & BDRV_BLOCK_ZERO) {
-offset += n * BDRV_SECTOR_BITS;
+offset += bytes;
 continue;
 }
-ret = bdrv_pwrite_zeroes(child, offset, n * BDRV_SECTOR_SIZE, flags);
+ret = bdrv_pwrite_zeroes(child, offset, bytes, flags);
 if (ret < 0) {
 error_report("error writing zeroes at offset %" PRId64 ": %s",
  offset, strerror(-ret));
 return ret;
 }
-offset += n * BDRV_SECTOR_SIZE;
+offset += bytes;
 }
 }

@@ -2021,13 +2019,26 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,

[Qemu-devel] [PATCH v5 08/23] block: Switch bdrv_co_get_block_status() to byte-based

2017-10-03 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change); and as with its public counterpart,
rename to bdrv_co_block_status() to make the compiler enforce that
we catch all uses.  For now, we assert that callers still pass
aligned data, but ultimately, this will be the function where we
hand off to a byte-based driver callback, and will eventually need
to add logic to ensure we round calls according to the driver's
request_alignment then touch up the result handed back to the
caller, to start permitting a caller to pass unaligned offsets.

Note that we are now prepared to accept 'bytes' larger than INT_MAX;
this is okay as long as we clamp things internally before violating
any 32-bit limits, and makes no difference to how a client will
use the information (clients looping over the entire file must
already be prepared for consecutive calls to return the same status,
as drivers are already free to return shorter-than-maximal status
due to any other convenient split points, such as when the L2 table
crosses cluster boundaries in qcow2).
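
As a rough sketch of why internal clamping is harmless to clients, the
following standalone example walks an image by asking for "everything
that is left" on each call and simply advancing by whatever comes back.
The sketch_ names and the 64 KiB split point are hypothetical and only
stand in for a driver's real behavior.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SKETCH_DATA 1

/* Hypothetical status query over an image of 'total' bytes that is all
 * data.  It clamps to the image size and to INT_MAX, and never reports
 * past the next 64 KiB boundary (a "convenient split point"). */
static int sketch_block_status(int64_t total, int64_t offset,
                               int64_t bytes, int64_t *pnum)
{
    const int64_t chunk = 65536;
    int64_t to_boundary;

    assert(offset >= 0 && offset < total && bytes > 0);
    if (bytes > total - offset) {
        bytes = total - offset;
    }
    if (bytes > INT_MAX) {
        bytes = INT_MAX;
    }
    to_boundary = chunk - offset % chunk;
    *pnum = bytes < to_boundary ? bytes : to_boundary;
    return SKETCH_DATA;
}

/* Client loop: tolerate short answers and consecutive calls that
 * return the same status.  Returns the number of calls made. */
static int sketch_count_data_bytes(int64_t total, int64_t *data_bytes)
{
    int64_t offset = 0;
    int calls = 0;

    *data_bytes = 0;
    while (offset < total) {
        int64_t pnum;
        int ret = sketch_block_status(total, offset, total - offset, &pnum);

        assert(pnum > 0);       /* guarantees forward progress */
        if (ret & SKETCH_DATA) {
            *data_bytes += pnum;
        }
        offset += pnum;
        calls++;
    }
    return calls;
}
```

The loop passes a request larger than INT_MAX on a multi-gigabyte
image, yet the total it accumulates is the same whether the answer
arrives in one piece or in many.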

Signed-off-by: Eric Blake 

---
v5: rebase to earlier changes in 1/23, add comment
v4: no change
v3: rebase to allocation/mapping sense change, clamp bytes to 32-bits
when needed, drop R-b
v2: rebase to earlier changes
---
 block/io.c | 103 +
 1 file changed, 62 insertions(+), 41 deletions(-)

diff --git a/block/io.c b/block/io.c
index ab1853dc2d..b879e26154 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1792,76 +1792,91 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
  * BDRV_BLOCK_ZERO where possible; otherwise, the result may omit those
  * bits particularly if it allows for a larger value in 'pnum'.
  *
- * If 'sector_num' is beyond the end of the disk image the return value is
+ * If 'offset' is beyond the end of the disk image the return value is
  * BDRV_BLOCK_EOF and 'pnum' is set to 0.
  *
- * 'pnum' is set to the number of sectors (including and immediately following
- * the specified sector) that are known to be in the same
- * allocated/unallocated state.
+ * 'pnum' is set to the number of bytes (including and immediately following
+ * the specified offset) that are known to be in the same
+ * allocated/unallocated state.  It may be NULL.
  *
- * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
+ * 'bytes' is the max value 'pnum' should be set to.  If bytes goes
  * beyond the end of the disk image it will be clamped; if 'pnum' is set to
  * the end of the image, then the returned value will include BDRV_BLOCK_EOF.
  *
  * If returned value is positive, BDRV_BLOCK_OFFSET_VALID bit is set, and
- * 'file' is non-NULL, then '*file' points to the BDS which the sector range
- * is allocated in.
+ * 'file' is non-NULL, then '*file' points to the BDS which owns the
+ * allocated sector that contains offset.
  */
-static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
- bool mapping,
- int64_t sector_num,
- int nb_sectors, int *pnum,
- BlockDriverState **file)
+static int64_t coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
+ bool mapping,
+ int64_t offset, int64_t bytes,
+ int64_t *pnum,
+ BlockDriverState **file)
 {
-int64_t total_sectors;
-int64_t n;
+int64_t total_size;
+int64_t n; /* bytes */
 int64_t ret, ret2;
 BlockDriverState *local_file = NULL;
-int local_pnum = 0;
+int64_t local_pnum = 0;
+int count; /* sectors */

 assert(pnum);
-total_sectors = bdrv_nb_sectors(bs);
-if (total_sectors < 0) {
-ret = total_sectors;
+total_size = bdrv_getlength(bs);
+if (total_size < 0) {
+ret = total_size;
 goto early_out;
 }

-if (sector_num >= total_sectors || !nb_sectors) {
-ret = sector_num >= total_sectors ? BDRV_BLOCK_EOF : 0;
+if (offset >= total_size || !bytes) {
+ret = offset >= total_size ? BDRV_BLOCK_EOF : 0;
 goto early_out;
 }

-n = total_sectors - sector_num;
-if (n < nb_sectors) {
-nb_sectors = n;
+n = total_size - offset;
+if (n < bytes) {
+bytes = n;
 }

 if (!bs->drv->bdrv_co_get_block_status) {
-local_pnum = nb_sectors;
+local_pnum = bytes;
 ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
-if (sector_num + nb_sectors == total_sectors) {
+if (offset + bytes == total_size) {
 ret |=

[Qemu-devel] [PATCH v5 03/23] block: Make bdrv_round_to_clusters() signature more useful

2017-10-03 Thread Eric Blake

In the process of converting sector-based interfaces to bytes,
I'm finding it easier to represent a byte count as a 64-bit
integer at the block layer (even if we are internally capped
by SIZE_MAX or even INT_MAX for individual transactions, it's
still nicer to not have to worry about truncation/overflow
issues on as many variables).  Update the signature of
bdrv_round_to_clusters() to uniformly use int64_t, matching
the signature already chosen for bdrv_is_allocated and the
fact that off_t is also a signed type, then adjust clients
according to the required fallout (even where the result could
now exceed 32 bits, no client is directly assigning the result
into a 32-bit value without breaking things into a loop first).
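
For readers who want to see the arithmetic, here is a minimal sketch of
rounding a byte range out to cluster boundaries with 64-bit types
throughout.  It takes the cluster size as a plain parameter instead of
querying a BlockDriverState, so sketch_round_to_clusters() is a
hypothetical stand-in and not the function touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Expand [offset, offset + bytes) to the smallest enclosing range that
 * starts and ends on a multiple of cluster_size. */
static void sketch_round_to_clusters(int64_t cluster_size,
                                     int64_t offset, int64_t bytes,
                                     int64_t *cluster_offset,
                                     int64_t *cluster_bytes)
{
    int64_t span;

    assert(cluster_size > 0 && offset >= 0 && bytes >= 0);

    /* Align the start down. */
    *cluster_offset = offset - offset % cluster_size;

    /* Align the length up, counting the bytes added in front. */
    span = (offset - *cluster_offset) + bytes;
    *cluster_bytes = (span + cluster_size - 1) / cluster_size * cluster_size;
}
```

With a 64 KiB cluster, a request of 3 << 31 bytes starting at offset 1
rounds to a length that no longer fits in 32 bits, which is the case a
caller has to split into a loop before handing it to anything limited
to INT_MAX.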

Signed-off-by: Eric Blake 

---
v5: depends on copy-on-read fixes [John], fix incorrect trace, update
commit message to document rounding considerations, drop R-b
v4: only context changes
v3: no change
v2: fix commit message [John], rebase to earlier changes, including
mirror_clip_bytes() signature update
---
 include/block/block.h | 4 ++--
 block/io.c| 6 +++---
 block/mirror.c| 7 +++
 block/trace-events| 2 +-
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 3c3af462e4..be49c4ae9d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -475,9 +475,9 @@ int bdrv_get_flags(BlockDriverState *bs);
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
 ImageInfoSpecific *bdrv_get_specific_info(BlockDriverState *bs);
 void bdrv_round_to_clusters(BlockDriverState *bs,
-int64_t offset, unsigned int bytes,
+int64_t offset, int64_t bytes,
 int64_t *cluster_offset,
-unsigned int *cluster_bytes);
+int64_t *cluster_bytes);

 const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
 void bdrv_get_backing_filename(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index 0d8cdab583..e4d5d33805 100644
--- a/block/io.c
+++ b/block/io.c
@@ -449,9 +449,9 @@ static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
  * Round a region to cluster boundaries
  */
 void bdrv_round_to_clusters(BlockDriverState *bs,
-int64_t offset, unsigned int bytes,
+int64_t offset, int64_t bytes,
 int64_t *cluster_offset,
-unsigned int *cluster_bytes)
+int64_t *cluster_bytes)
 {
 BlockDriverInfo bdi;

@@ -949,7 +949,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
 struct iovec iov;
 QEMUIOVector local_qiov;
 int64_t cluster_offset;
-unsigned int cluster_bytes;
+int64_t cluster_bytes;
 size_t skip_bytes;
 int ret;
 int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
diff --git a/block/mirror.c b/block/mirror.c
index e664a5dc5a..bac2324dce 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -190,10 +190,9 @@ static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
 bool need_cow;
 int ret = 0;
 int64_t align_offset = *offset;
-unsigned int align_bytes = *bytes;
+int64_t align_bytes = *bytes;
 int max_bytes = s->granularity * s->max_iov;

-assert(*bytes < INT_MAX);
 need_cow = !test_bit(*offset / s->granularity, s->cow_bitmap);
 need_cow |= !test_bit((*offset + *bytes - 1) / s->granularity,
   s->cow_bitmap);
@@ -388,7 +387,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 while (nb_chunks > 0 && offset < s->bdev_length) {
 int64_t ret;
 int io_sectors;
-unsigned int io_bytes;
+int64_t io_bytes;
 int64_t io_bytes_acct;
 enum MirrorMethod {
 MIRROR_METHOD_COPY,
@@ -413,7 +412,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 io_bytes = s->granularity;
 } else if (ret >= 0 && !(ret & BDRV_BLOCK_DATA)) {
 int64_t target_offset;
-unsigned int target_bytes;
+int64_t target_bytes;
 bdrv_round_to_clusters(blk_bs(s->target), offset, io_bytes,
 &target_offset, &target_bytes);
 if (target_offset == offset &&
diff --git a/block/trace-events b/block/trace-events
index 25dd5a3026..11c8d5f590 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -12,7 +12,7 @@ blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int flag
 bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
 bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
 bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int

[Qemu-devel] [PATCH v5 00/23] make bdrv_get_block_status byte-based

2017-10-03 Thread Eric Blake

There are patches floating around to add NBD_CMD_BLOCK_STATUS,
but NBD wants to report status on byte granularity (even if the
reporting will probably be naturally aligned to sectors or even
much higher levels).  I've therefore started the task of
converting our block status code to report at a byte granularity
rather than sectors.

Now that 2.11 is open, I'm rebasing/reposting the remaining patches.

The overall conversion currently looks like:
part 1: bdrv_is_allocated (merged, commit 51b0a488)
part 2: dirty-bitmap (v10 is queued [1])
part 3: bdrv_get_block_status (this series, v4 at [2])
part 4: .bdrv_co_block_status (v3 is posted [4], mostly reviewed)

Available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-status-v4

Based-on: <20170925145526.32690-1-ebl...@redhat.com>
([PATCH v10 00/20] make dirty-bitmap byte-based)
Based-on: <20171004014347.25099-1-ebl...@redhat.com>
([PATCH v2 0/5] block: Avoid copy-on-read assertions)

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg06848.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg03543.html
[3] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg03812.html
[4] https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg00524.html

Since v4:
- rebase to fixes for copy-on-read
- tweak bdrv_co_block_status goto/label usage for easier reading [John]
- more added comments and improved commit messages
- fix a couple of bugs, such as wrong trace-events usage
- add R-b where things didn't change drastically

001/23:[0042] [FC] 'block: Allow NULL file for bdrv_get_block_status()'
002/23:[0006] [FC] 'block: Add flag to avoid wasted work in bdrv_is_allocated()'
003/23:[0003] [FC] 'block: Make bdrv_round_to_clusters() signature more useful'
004/23:[----] [--] 'qcow2: Switch is_zero_sectors() to byte-based'
005/23:[----] [--] 'block: Switch bdrv_make_zero() to byte-based'
006/23:[----] [--] 'qemu-img: Switch get_block_status() to byte-based'
007/23:[0010] [FC] 'block: Convert bdrv_get_block_status() to bytes'
008/23:[0042] [FC] 'block: Switch bdrv_co_get_block_status() to byte-based'
009/23:[----] [--] 'block: Switch BdrvCoGetBlockStatusData to byte-based'
010/23:[----] [--] 'block: Switch bdrv_common_block_status_above() to byte-based'
011/23:[----] [-C] 'block: Switch bdrv_co_get_block_status_above() to byte-based'
012/23:[0019] [FC] 'block: Convert bdrv_get_block_status_above() to bytes'
013/23:[0008] [FC] 'qemu-img: Simplify logic in img_compare()'
014/23:[----] [--] 'qemu-img: Speed up compare on pre-allocated larger file'
015/23:[----] [--] 'qemu-img: Add find_nonzero()'
016/23:[----] [--] 'qemu-img: Drop redundant error message in compare'
017/23:[----] [--] 'qemu-img: Change check_empty_sectors() to byte-based'
018/23:[----] [--] 'qemu-img: Change compare_sectors() to be byte-based'
019/23:[----] [--] 'qemu-img: Change img_rebase() to be byte-based'
020/23:[0005] [FC] 'qemu-img: Change img_compare() to be byte-based'
021/23:[0061] [FC] 'block: Align block status requests'
022/23:[----] [--] 'block: Relax bdrv_aligned_preadv() assertion'
023/23:[----] [--] 'qemu-io: Relax 'alloc' now that block-status doesn't assert'

Eric Blake (23):
  block: Allow NULL file for bdrv_get_block_status()
  block: Add flag to avoid wasted work in bdrv_is_allocated()
  block: Make bdrv_round_to_clusters() signature more useful
  qcow2: Switch is_zero_sectors() to byte-based
  block: Switch bdrv_make_zero() to byte-based
  qemu-img: Switch get_block_status() to byte-based
  block: Convert bdrv_get_block_status() to bytes
  block: Switch bdrv_co_get_block_status() to byte-based
  block: Switch BdrvCoGetBlockStatusData to byte-based
  block: Switch bdrv_common_block_status_above() to byte-based
  block: Switch bdrv_co_get_block_status_above() to byte-based
  block: Convert bdrv_get_block_status_above() to bytes
  qemu-img: Simplify logic in img_compare()
  qemu-img: Speed up compare on pre-allocated larger file
  qemu-img: Add find_nonzero()
  qemu-img: Drop redundant error message in compare
  qemu-img: Change check_empty_sectors() to byte-based
  qemu-img: Change compare_sectors() to be byte-based
  qemu-img: Change img_rebase() to be byte-based
  qemu-img: Change img_compare() to be byte-based
  block: Align block status requests
  block: Relax bdrv_aligned_preadv() assertion
  qemu-io: Relax 'alloc' now that block-status doesn't assert

 include/block/block.h  |  26 ++--
 include/block/block_int.h  |  11 +-
 block/io.c | 303 +
 block/blkdebug.c   |  13 +-
 block/mirror.c |  24 +--
 block/qcow2-cluster.c  |   2 +-
 block/qcow2.c  |  53 +++
 qemu-img.c | 365 -
 qemu-io-cmds.c |  13 --
 block/trace-events |   2 +-
 tests/qemu-iotests/074.out |   2 -
 tests/qemu-iotests/177 |  12 +-
 tests/qemu-iotests/177.out |  19 ++-
 13 files changed, 431 insertions(+),

[Qemu-devel] [PATCH v5 02/23] block: Add flag to avoid wasted work in bdrv_is_allocated()

2017-10-03 Thread Eric Blake

Not all callers care about which BDS owns the mapping for a given
range of the file.  In particular, bdrv_is_allocated() cares more
about finding the largest run of allocated data from the guest
perspective, whether or not that data is consecutive from the
host perspective, and whether or not the data reads as zero.
Therefore, doing subsequent refinements such as checking how much
of the format-layer allocation also satisfies BDRV_BLOCK_ZERO at
the protocol layer is wasted work - in the best case, it just
costs extra CPU cycles during a single bdrv_is_allocated(), but
in the worst case, it results in a smaller *pnum, and forces
callers to iterate through more status probes when visiting the
entire file for even more extra CPU cycles.

This patch only optimizes the block layer (no behavior change when
mapping is true, but skip unnecessary effort when it is false).
Then when subsequent patches tweak the driver callback to be
byte-based, we can also pass this hint through to the driver.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: tweak commit message and one comment, rebase to previous changes,
minor enough to still add R-b
v4: only context changes
v3: s/allocation/mapping/ and flip sense of bool
v2: new patch
---
 block/io.c | 52 ++--
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/block/io.c b/block/io.c
index e5a6f63eea..0d8cdab583 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1754,6 +1754,7 @@ typedef struct BdrvCoGetBlockStatusData {
 int nb_sectors;
 int *pnum;
 int64_t ret;
+bool mapping;
 bool done;
 } BdrvCoGetBlockStatusData;

@@ -1788,6 +1789,11 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
  * Drivers not implementing the functionality are assumed to not support
  * backing files, hence all their sectors are reported as allocated.
  *
+ * If 'mapping' is true, the caller is querying for mapping purposes,
+ * and the result should include BDRV_BLOCK_OFFSET_VALID and
+ * BDRV_BLOCK_ZERO where possible; otherwise, the result may omit those
+ * bits particularly if it allows for a larger value in 'pnum'.
+ *
  * If 'sector_num' is beyond the end of the disk image the return value is
  * BDRV_BLOCK_EOF and 'pnum' is set to 0.
  *
@@ -1804,6 +1810,7 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
  * is allocated in.
  */
 static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
+ bool mapping,
  int64_t sector_num,
  int nb_sectors, int *pnum,
  BlockDriverState **file)
@@ -1854,14 +1861,15 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,

 if (ret & BDRV_BLOCK_RAW) {
 assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file);
-ret = bdrv_co_get_block_status(local_file, ret >> BDRV_SECTOR_BITS,
+ret = bdrv_co_get_block_status(local_file, mapping,
+   ret >> BDRV_SECTOR_BITS,
 local_pnum, &local_pnum, &local_file);
 goto out;
 }

 if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
 ret |= BDRV_BLOCK_ALLOCATED;
-} else {
+} else if (mapping) {
 if (bdrv_unallocated_blocks_are_zero(bs)) {
 ret |= BDRV_BLOCK_ZERO;
 } else if (bs->backing) {
@@ -1873,12 +1881,13 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 }
 }

-if (local_file && local_file != bs &&
+if (mapping && local_file && local_file != bs &&
 (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
 (ret & BDRV_BLOCK_OFFSET_VALID)) {
 int file_pnum;

-ret2 = bdrv_co_get_block_status(local_file, ret >> BDRV_SECTOR_BITS,
+ret2 = bdrv_co_get_block_status(local_file, mapping,
+ret >> BDRV_SECTOR_BITS,
 local_pnum, &file_pnum, NULL);
 if (ret2 >= 0) {
 /* Ignore errors.  This is just providing extra information, it
@@ -1915,6 +1924,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,

 static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
 BlockDriverState *base,
+bool mapping,
 int64_t sector_num,
 int nb_sectors,
 int *pnum,
@@ -1926,7 +1936,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,

 assert(bs != base);
 for (p = bs; p != base; p = backing_bs(p)) {
-ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
+ret = bdrv_co_get_block_status(p, mapping, sector_num, nb_sectors,
+

[Qemu-devel] [PATCH v2 3/4] blockjob: expose persistent property

2017-10-03 Thread John Snow

For drive-backup and blockdev-backup, expose the persistent
property, having it default to false. There are no universal
creation parameters, so it must be added to each job type that
it makes sense for individually.

Signed-off-by: John Snow 
---
 blockdev.c   | 10 --
 qapi/block-core.json | 21 -
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index c08d6fb..8bbbf2a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3198,6 +3198,9 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
 if (!backup->has_job_id) {
 backup->job_id = NULL;
 }
+if (!backup->has_persistent) {
+backup->persistent = false;
+}
 if (!backup->has_compress) {
 backup->compress = false;
 }
@@ -3290,7 +3293,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
 }
 }
 
-job = backup_job_create(backup->job_id, false, bs, target_bs,
+job = backup_job_create(backup->job_id, backup->persistent, bs, target_bs,
 backup->speed, backup->sync, bmap, backup->compress,
 backup->on_source_error, backup->on_target_error,
 BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
@@ -3341,6 +3344,9 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
 if (!backup->has_job_id) {
 backup->job_id = NULL;
 }
+if (!backup->has_persistent) {
+backup->persistent = false;
+}
 if (!backup->has_compress) {
 backup->compress = false;
 }
@@ -3369,7 +3375,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
 goto out;
 }
 }
-job = backup_job_create(backup->job_id, false, bs, target_bs,
+job = backup_job_create(backup->job_id, backup->persistent, bs, target_bs,
 backup->speed, backup->sync, NULL, backup->compress,
 backup->on_source_error, backup->on_target_error,
 BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5cce49d..4c7c17b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1104,6 +1104,11 @@
 # @job-id: identifier for the newly-created block job. If
 #  omitted, the device name will be used. (Since 2.7)
 #
+# @persistent: Whether or not the job created by this command needs to be
+#  cleaned up manually via block-job-reap or not. The default is
+#  false. When true, the job will remain in a "completed" state
+#  until reaped manually with block-job-reap. (Since 2.11)
+#
 # @device: the device name or node-name of a root node which should be copied.
 #
 # @target: the target of the new image. If the file exists, or if it
@@ -1144,9 +1149,10 @@
 # Since: 1.6
 ##
 { 'struct': 'DriveBackup',
-  'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'*format': 'str', 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
-'*speed': 'int', '*bitmap': 'str', '*compress': 'bool',
+  'data': { '*job-id': 'str', '*persistent': 'bool', 'device': 'str',
+'target': 'str', '*format': 'str', 'sync': 'MirrorSyncMode',
+'*mode': 'NewImageMode', '*speed': 'int', '*bitmap': 'str',
+'*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
 '*on-target-error': 'BlockdevOnError' } }
 
@@ -1156,6 +1162,11 @@
 # @job-id: identifier for the newly-created block job. If
 #  omitted, the device name will be used. (Since 2.7)
 #
+# @persistent: Whether or not the job created by this command needs to be
+#  cleaned up manually via block-job-reap or not. The default is
+#  false. When true, the job will remain in a "completed" state
+#  until reaped manually with block-job-reap. (Since 2.11)
+#
 # @device: the device name or node-name of a root node which should be copied.
 #
 # @target: the device name or node-name of the backup target node.
@@ -1185,8 +1196,8 @@
 # Since: 2.3
 ##
 { 'struct': 'BlockdevBackup',
-  'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'sync': 'MirrorSyncMode',
+  'data': { '*job-id': 'str', '*persistent': 'bool', 'device': 'str',
+'target': 'str', 'sync': 'MirrorSyncMode',
 '*speed': 'int',
 '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
-- 
2.9.5

[Qemu-devel] [PATCH v2 2/4] qmp: add block-job-reap command

2017-10-03 Thread John Snow

For jobs that have finished (either completed or canceled), allow the
user to dismiss the job's status reports via block-job-reap.

Signed-off-by: John Snow 
---
 block/trace-events   |  1 +
 blockdev.c   | 14 ++
 qapi/block-core.json | 21 +
 3 files changed, 36 insertions(+)

diff --git a/block/trace-events b/block/trace-events
index 25dd5a3..9580efa 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -46,6 +46,7 @@ qmp_block_job_cancel(void *job) "job %p"
 qmp_block_job_pause(void *job) "job %p"
 qmp_block_job_resume(void *job) "job %p"
 qmp_block_job_complete(void *job) "job %p"
+qmp_block_job_reap(void *job) "job %p"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
 # block/file-win32.c
diff --git a/blockdev.c b/blockdev.c
index eeb4986..c08d6fb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3766,6 +3766,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
 aio_context_release(aio_context);
 }
 
+void qmp_block_job_reap(const char *device, Error **errp)
+{
+AioContext *aio_context;
+BlockJob *job = find_block_job(device, &aio_context, errp);
+
+if (!job) {
+return;
+}
+
+trace_qmp_block_job_reap(job);
+block_job_reap(&job, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_change_backing_file(const char *device,
  const char *image_node_name,
  const char *backing_file,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a4f5e10..5cce49d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2161,6 +2161,27 @@
 { 'command': 'block-job-complete', 'data': { 'device': 'str' } }
 
 ##
+# @block-job-reap:
+#
+# For jobs that have already completed, remove them from the block-job-query
+# list. This command only needs to be run for jobs which were started with the
+# persistent=true option.
+#
+# This command will refuse to operate on any job that has not yet reached
+# its terminal state. "cancel" or "complete" will still need to be used as
+# appropriate.
+#
+# @device: The job identifier. This used to be a device name (hence
+#  the name of the parameter), but since QEMU 2.7 it can have
+#  other values.
+#
+# Returns: Nothing on success
+#
+# Since: 2.11
+##
+{ 'command': 'block-job-reap', 'data': { 'device': 'str' } }
+
+##
 # @BlockdevDiscardOptions:
 #
 # Determines how to handle discard requests.
-- 
2.9.5

[Qemu-devel] [PATCH v2 0/4] blockjobs: add explicit job reaping

2017-10-03 Thread John Snow

For jobs that complete when a monitor isn't looking, there's no way to
tell what the job's final return code was. We need to allow jobs to
remain in the list until queried for reliable management.

V2:
 - Added tests!
 - Changed property name (Jeff, Paolo)

RFC:
The next version will add tests for transactions.
Kevin, can you please take a look at bdrv_is_root_node and how it is
used with respect to do_drive_backup? I suspect that in this case that
"is root" should actually be "true", but a node in use by a job has
two roles; child_root and child_job, so it starts returning false here.

That's fine because we prevent a collision that way, but it makes the
error messages pretty bad and misleading. Do you have a quick suggestion?
(Should I just amend the loop to allow non-root nodes as long as they
happen to be jobs so that the job creation code can check permissions?)

John Snow (4):
  blockjob: add persistent property
  qmp: add block-job-reap command
  blockjob: expose persistent property
  iotests: test manual job reaping

 block/backup.c   |  20 ++--
 block/commit.c   |   2 +-
 block/mirror.c   |   2 +-
 block/replication.c  |   5 +-
 block/stream.c   |   2 +-
 block/trace-events   |   1 +
 blockdev.c   |  28 +-
 blockjob.c   |  46 -
 include/block/block_int.h|   8 +-
 include/block/blockjob.h |  21 
 include/block/blockjob_int.h |   2 +-
 qapi/block-core.json |  49 --
 tests/qemu-iotests/056   | 227 +++
 tests/qemu-iotests/056.out   |   4 +-
 tests/test-blockjob-txn.c|   2 +-
 tests/test-blockjob.c|   2 +-
 16 files changed, 384 insertions(+), 37 deletions(-)

-- 
2.9.5

[Qemu-devel] [PATCH v2 1/4] blockjob: add persistent property

2017-10-03 Thread John Snow

Add a persistent (manually reap) property to block jobs that forces
them to linger in the block job list (visible to QMP queries) until
the user explicitly dismisses them via QMP.

The reap command itself is implemented in the next commit, and the
feature is exposed to drive-backup and blockdev-backup in the subsequent
commit.
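
As a toy model of the intended lifecycle (not the blockjob.c code in
this series), the sketch below shows a job that either vanishes on
completion or, when marked persistent, stays visible with its return
code until it is explicitly reaped.  Every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_state { SKETCH_RUNNING, SKETCH_COMPLETED, SKETCH_GONE };

struct sketch_job {
    bool persistent;            /* linger after completion? */
    enum sketch_state state;
    int ret;                    /* final return code */
};

/* Called when the job finishes: a persistent job lingers in the
 * "completed" state, a non-persistent one disappears at once. */
static void sketch_job_finish(struct sketch_job *job, int ret)
{
    assert(job->state == SKETCH_RUNNING);
    job->ret = ret;
    job->state = job->persistent ? SKETCH_COMPLETED : SKETCH_GONE;
}

/* Reap a lingering job.  Refuses (returns -1) unless the job has
 * already reached its terminal state. */
static int sketch_job_reap(struct sketch_job *job)
{
    if (job->state != SKETCH_COMPLETED) {
        return -1;
    }
    job->state = SKETCH_GONE;
    return 0;
}

/* Would a query of the job list still show this job? */
static bool sketch_job_visible(const struct sketch_job *job)
{
    return job->state != SKETCH_GONE;
}
```

The point of the extra state is that a management client which was not
watching when the job ended can still read the return code before
dismissing the job.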

Signed-off-by: John Snow 
---
 block/backup.c   | 20 +--
 block/commit.c   |  2 +-
 block/mirror.c   |  2 +-
 block/replication.c  |  5 +++--
 block/stream.c   |  2 +-
 blockdev.c   |  8 
 blockjob.c   | 46 ++--
 include/block/block_int.h|  8 +---
 include/block/blockjob.h | 21 
 include/block/blockjob_int.h |  2 +-
 qapi/block-core.json |  7 ---
 tests/test-blockjob-txn.c|  2 +-
 tests/test-blockjob.c|  2 +-
 13 files changed, 97 insertions(+), 30 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 517c300..93ac194 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -532,15 +532,15 @@ static const BlockJobDriver backup_job_driver = {
 .drain  = backup_drain,
 };
 
-BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
-  BlockDriverState *target, int64_t speed,
-  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
-  bool compress,
-  BlockdevOnError on_source_error,
-  BlockdevOnError on_target_error,
-  int creation_flags,
-  BlockCompletionFunc *cb, void *opaque,
-  BlockJobTxn *txn, Error **errp)
+BlockJob *backup_job_create(const char *job_id, bool persistent,
+BlockDriverState *bs, BlockDriverState *target,
+int64_t speed, MirrorSyncMode sync_mode,
+BdrvDirtyBitmap *sync_bitmap, bool compress,
+BlockdevOnError on_source_error,
+BlockdevOnError on_target_error,
+int creation_flags,
+BlockCompletionFunc *cb, void *opaque,
+BlockJobTxn *txn, Error **errp)
 {
 int64_t len;
 BlockDriverInfo bdi;
@@ -608,7 +608,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 }
 
 /* job->common.len is fixed, so we can't allow resize */
-job = block_job_create(job_id, &backup_job_driver, bs,
+job = block_job_create(job_id, &backup_job_driver, persistent, bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
diff --git a/block/commit.c b/block/commit.c
index 8f0e835..308a5fd 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -304,7 +304,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 return;
 }
 
-s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
+s = block_job_create(job_id, &commit_job_driver, false, bs, 0, BLK_PERM_ALL,
  speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
 if (!s) {
 return;
diff --git a/block/mirror.c b/block/mirror.c
index 6f5cb9f..013e73a 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1180,7 +1180,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
 }
 
 /* Make sure that the source is not resized while the job is running */
-s = block_job_create(job_id, driver, mirror_top_bs,
+s = block_job_create(job_id, driver, false, mirror_top_bs,
  BLK_PERM_CONSISTENT_READ,
  BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
  BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
diff --git a/block/replication.c b/block/replication.c
index 3a4e682..6c59f00 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -539,8 +539,9 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
 bdrv_op_block_all(top_bs, s->blocker);
 bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
 
-job = backup_job_create(NULL, s->secondary_disk->bs, s->hidden_disk->bs,
-0, MIRROR_SYNC_MODE_NONE, NULL, false,
+job = backup_job_create(NULL, false, s->secondary_disk->bs,
+s->hidden_disk->bs, 0, MIRROR_SYNC_MODE_NONE,
+NULL, false,
 BLOCKDEV_ON_ERROR_REPORT,
 BLOCKDEV_ON_ERROR_REPORT, BLOCK_JOB_INTERNAL,
 backup_job_completed, bs, NULL, &local_err);
diff --git a/block/stream.c b/block/stream.c
index e6f7234..c644f34 100644
--- a/block/stream.c
+++ b/block/stream.c
@@

[Qemu-devel] [PATCH v2 4/4] iotests: test manual job reaping

2017-10-03 Thread John Snow

RFC: The error returned by a job creation command when that device
already has a job attached has become misleading; "Someone should
do something about that!"

Signed-off-by: John Snow 
---
 tests/qemu-iotests/056 | 227 +
 tests/qemu-iotests/056.out |   4 +-
 2 files changed, 229 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 04f2c3c..d6bed20 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -29,6 +29,26 @@ backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
+def img_create(img, fmt=iotests.imgfmt, size='64M', **kwargs):
+fullname = os.path.join(iotests.test_dir, '%s.%s' % (img, fmt))
+optargs = []
+for k,v in kwargs.iteritems():
+optargs = optargs + ['-o', '%s=%s' % (k,v)]
+args = ['create', '-f', fmt] + optargs + [fullname, size]
+iotests.qemu_img(*args)
+return fullname
+
+def try_remove(img):
+try:
+os.remove(img)
+except OSError:
+pass
+
+def io_write_patterns(img, patterns):
+for pattern in patterns:
+iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
+
+
 class TestSyncModesNoneAndTop(iotests.QMPTestCase):
 image_len = 64 * 1024 * 1024 # MB
 
@@ -108,5 +128,212 @@ class TestBeforeWriteNotifier(iotests.QMPTestCase):
 event = self.cancel_and_wait()
 self.assert_qmp(event, 'data/type', 'backup')
 
+class BackupTest(iotests.QMPTestCase):
+def setUp(self):
+self.vm = iotests.VM()
+self.test_img = img_create('test')
+self.dest_img = img_create('dest')
+self.vm.add_drive(self.test_img)
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+try_remove(self.test_img)
+try_remove(self.dest_img)
+
+def hmp_io_writes(self, drive, patterns):
+for pattern in patterns:
+self.vm.hmp_qemu_io(drive, 'write -P%s %s %s' % pattern)
+self.vm.hmp_qemu_io(drive, 'flush')
+
+def qmp_backup_and_wait(self, cmd='drive-backup', serror=None,
+aerror=None, **kwargs):
+return (self.qmp_backup(cmd, serror, **kwargs) and
+self.qmp_backup_wait(kwargs['device'], aerror))
+
+def qmp_backup(self, cmd='drive-backup',
+   error=None, **kwargs):
+self.assertTrue('device' in kwargs)
+res = self.vm.qmp(cmd, **kwargs)
+if error:
+self.assert_qmp(res, 'error/desc', error)
+return False
+self.assert_qmp(res, 'return', {})
+return True
+
+def qmp_backup_wait(self, device, error=None):
+event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
+   match={'data': {'device': device}})
+self.assertNotEqual(event, None)
+try:
+failure = self.dictpath(event, 'data/error')
+except AssertionError:
+# Backup succeeded.
+self.assert_qmp(event, 'data/offset', event['data']['len'])
+return True
+else:
+# Failure.
+self.assert_qmp(event, 'data/error', error)
+return False
+
+def test_reap_false(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img, persistent=False)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_reap_true(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img, persistent=True)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return[0]/finished', True)
+res = self.vm.qmp('block-job-reap', device='drive0')
+self.assert_qmp(res, 'return', {})
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_reap_bad_id(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+res = self.vm.qmp('block-job-reap', device='foobar')
+self.assert_qmp(res, 'error/class', 'DeviceNotActive')
+
+def test_reap_collision(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img, persistent=True)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return[0]/finished', True)
+# Leave zombie job un-reaped, observe a failure:

[Qemu-devel] [PATCH v2 4/5] block: Perform copy-on-read in loop

2017-10-03 Thread Eric Blake

Improve our braindead copy-on-read implementation.  Pre-patch,
we have multiple issues:
- we create a bounce buffer and perform a write for the entire
request, even if the active image already has 99% of the
clusters occupied, and really only needs to copy-on-read the
remaining 1% of the clusters
- our bounce buffer was as large as the read request, and can
needlessly exhaust our memory by using double the memory of
the request size (the original request plus our bounce buffer),
rather than a capped maximum overhead beyond the original
- if a driver has a max_transfer limit, we are bypassing the
normal code in bdrv_aligned_preadv() that fragments to that
limit, and instead attempt to read the entire buffer from the
driver in one go, which some drivers may assert on
- a client can issue a request of nearly 2G such that
rounding the request out to cluster boundaries results in a
byte count larger than 2G.  While this cannot exceed 32 bits,
it DOES have some follow-on problems:
-- the call to bdrv_driver_pread() can assert for exceeding
BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks
.bdrv_co_preadv
-- if the buffer is all zeroes, the subsequent call to
bdrv_co_do_pwrite_zeroes is a no-op due to a negative size,
which means we did not actually copy on read
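
To make the last bullet concrete, here is a small arithmetic sketch in
Python (not QEMU code). The 64 KiB cluster size and the particular offset
are assumptions picked for illustration, and round_to_clusters() is a
hypothetical stand-in for what bdrv_round_to_clusters() computes:

```python
CLUSTER = 64 * 1024        # assumed cluster size, for illustration only
INT_MAX = 2**31 - 1        # largest value a signed 32-bit int can hold


def round_to_clusters(offset, nbytes, cluster=CLUSTER):
    """Widen [offset, offset + nbytes) outward to cluster boundaries."""
    start = offset - offset % cluster
    end = -(-(offset + nbytes) // cluster) * cluster   # round up
    return start, end - start


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


# A read of just under 2G that starts one sector into the first cluster.
offset = 512
nbytes = 2**31 - 512

cluster_offset, cluster_bytes = round_to_clusters(offset, nbytes)

print(cluster_bytes > INT_MAX)        # rounding pushed us past 2G
print(cluster_bytes < 2**32)          # but it still fits in 32 bits
print(as_signed_32(cluster_bytes))    # negative once treated as an int
```

The rounded byte count fits in an unsigned 32-bit variable, but any callee
that takes it as a signed int sees a negative size.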

Fix all of these issues by breaking up the action into a loop,
where each iteration is capped to sane limits.  Also, querying
the allocation status allows us to optimize: when data is
already present in the active layer, we don't need to bounce.
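
The shape of that loop can be modelled in a few lines of Python. This is a
simplified sketch under stated assumptions, not the C code in the patch:
copy_on_read_chunks(), example_layout() and the tiny byte counts are all
hypothetical, and error handling from the allocation query is left out.

```python
def copy_on_read_chunks(cluster_offset, cluster_bytes, max_transfer,
                        max_bounce, is_allocated):
    """Yield (offset, length, needs_copy) pieces covering the range.

    is_allocated(offset, length) -> (allocated, pnum) is a hypothetical
    callback: it says whether the data at offset is already in the
    active layer, and for how many bytes (pnum <= length) that holds.
    """
    while cluster_bytes:
        allocated, pnum = is_allocated(cluster_offset,
                                       min(cluster_bytes, max_transfer))
        if not allocated:
            # Unallocated data goes through the bounce buffer, so cap it.
            pnum = min(pnum, max_bounce)
        assert pnum > 0, "each iteration must make progress"
        yield cluster_offset, pnum, not allocated
        cluster_offset += pnum
        cluster_bytes -= pnum


def example_layout(offset, length):
    """Hypothetical image: only the first 50 bytes are allocated."""
    if offset < 50:
        return True, min(length, 50 - offset)
    return False, length


chunks = list(copy_on_read_chunks(0, 100, max_transfer=40, max_bounce=16,
                                  is_allocated=example_layout))
for chunk in chunks:
    print(chunk)
```

With these made-up numbers, the allocated half of the range is read in
pieces bounded only by max_transfer, while the unallocated half is copied
in pieces no larger than the bounce buffer.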

Note that the code has a telling comment that copy-on-read
should probably be a filter driver rather than a bolt-in hack
in io.c; but that remains a task for another day.

CC: qemu-sta...@nongnu.org
Signed-off-by: Eric Blake 

---
v2: avoid uninit ret on 0-length op [patchew, Kevin]
---
 block/io.c | 120 +
 1 file changed, 82 insertions(+), 38 deletions(-)

diff --git a/block/io.c b/block/io.c
index d656a0485b..1e246315a7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -34,6 +34,9 @@

 #define NOT_DONE 0x7fff /* used while emulated sync operation in progress */

+/* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
+#define MAX_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
+
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 int64_t offset, int bytes, BdrvRequestFlags flags);

@@ -945,11 +948,14 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,

 BlockDriver *drv = bs->drv;
 struct iovec iov;
-QEMUIOVector bounce_qiov;
+QEMUIOVector local_qiov;
 int64_t cluster_offset;
 unsigned int cluster_bytes;
 size_t skip_bytes;
 int ret;
+int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+BDRV_REQUEST_MAX_BYTES);
+unsigned int progress = 0;

 /* FIXME We cannot require callers to have write permissions when all they
  * are doing is a read request. If we did things right, write permissions
@@ -961,53 +967,95 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
 // assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));

 /* Cover entire cluster so no additional backing file I/O is required when
- * allocating cluster in the image file.
+ * allocating cluster in the image file.  Note that this value may exceed
+ * BDRV_REQUEST_MAX_BYTES (even when the original read did not), which
+ * is one reason we loop rather than doing it all at once.
  */
 bdrv_round_to_clusters(bs, offset, bytes, _offset, _bytes);
+skip_bytes = offset - cluster_offset;

 trace_bdrv_co_do_copy_on_readv(bs, offset, bytes,
cluster_offset, cluster_bytes);

-iov.iov_len = cluster_bytes;
-iov.iov_base = bounce_buffer = qemu_try_blockalign(bs, iov.iov_len);
+bounce_buffer = qemu_try_blockalign(bs,
+MIN(MIN(max_transfer, cluster_bytes),
+MAX_BOUNCE_BUFFER));
 if (bounce_buffer == NULL) {
 ret = -ENOMEM;
 goto err;
 }

-qemu_iovec_init_external(_qiov, , 1);
+while (cluster_bytes) {
+int64_t pnum;

-ret = bdrv_driver_preadv(bs, cluster_offset, cluster_bytes,
- _qiov, 0);
-if (ret < 0) {
-goto err;
-}
+ret = bdrv_is_allocated(bs, cluster_offset,
+MIN(cluster_bytes, max_transfer), );
+if (ret < 0) {
+/* Safe to treat errors in querying allocation as if
+ * unallocated; we'll probably fail again soon on the
+ * read, but at least that will set a decent errno.
+ */
+pnum = MIN(cluster_bytes, max_transfer);
+}

-bdrv_debug_event(bs, BLKDBG_COR_WRITE);
-if (drv->bdrv_co_pwrite_zeroes &&
-

[Qemu-devel] [PATCH v2 5/5] iotests: Add test 197 for covering copy-on-read

2017-10-03 Thread Eric Blake

Add a test for qcow2 copy-on-read behavior, including exposure
for the just-fixed bugs.

The copy-on-read behavior is always to a qcow2 image, but the
test is careful to allow running with most image protocol/format
combos as the backing file being copied from (luks being the
exception, as it is harder to pass the right secret to all the
right places).  In fact, for './check nbd', this appears to be
the first time we've had a qcow2 image wrapping NBD, requiring
an additional line in _filter_img_create to match the similar
line in _filter_img_info.

Invoking blkdebug to prove we don't write too much took some
effort to get working; and it requires that $TEST_WRAP (based
on $TEST_DIR) not be subject to word splitting.  We may decide
later to have the entire iotests suite use relative rather than
absolute names, to avoid problems inherited by the absolute
name of $PWD or $TEST_DIR, at which point the sanity check in
this commit could be simplified.
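
For readers who want to see what that sanity check accepts, here is a
rough Python equivalent of the shell pattern *[^-_a-zA-Z0-9/]* used in the
test. It is a sketch only: is_safe_test_dir() is a hypothetical name, and
the regular expression assumes plain ASCII ranges, which the shell pattern
may not in every locale.

```python
import re

# Any character outside this set makes the directory name "suspicious".
_SUSPICIOUS = re.compile(r"[^-_a-zA-Z0-9/]")


def is_safe_test_dir(path):
    """Return True if path has only letters, digits, '-', '_' and '/'."""
    return _SUSPICIOUS.search(path) is None


print(is_safe_test_dir("/tmp/qemu-iotests/scratch_1"))
print(is_safe_test_dir("/tmp/my scratch dir"))
print(is_safe_test_dir("/home/user/build.d/scratch"))
```

Note that the check is deliberately conservative: a directory containing a
dot is rejected just like one containing a space.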

Signed-off-by: Eric Blake 

---
v2: test 0-length query [Kevin], sanity check TEST_DIR [Jeff]

I only tested with -raw, -qcow2, -qed, and -nbd. I won't be
surprised if the test fails in some other setup...
---
 tests/qemu-iotests/common.filter |   1 +
 tests/qemu-iotests/197   | 102 +++
 tests/qemu-iotests/197.out   |  26 ++
 tests/qemu-iotests/group |   1 +
 4 files changed, 130 insertions(+)
 create mode 100755 tests/qemu-iotests/197
 create mode 100644 tests/qemu-iotests/197.out

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 9d5442ecd9..227b37e941 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -111,6 +111,7 @@ _filter_img_create()
 sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
+-e 's#nbd:127.0.0.1:10810#TEST_DIR/t.IMGFMT#g' \
 -e "s# encryption=off##g" \
 -e "s# cluster_size=[0-9]\\+##g" \
 -e "s# table_size=[0-9]\\+##g" \
diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
new file mode 100755
index 00..cc85388039
--- /dev/null
+++ b/tests/qemu-iotests/197
@@ -0,0 +1,102 @@
+#!/bin/bash
+#
+# Test case for copy-on-read into qcow2
+#
+# Copyright (C) 2017 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=ebl...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+status=1 # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+TEST_WRAP="$TEST_DIR/t.wrap.qcow2"
+BLKDBG_CONF="$TEST_DIR/blkdebug.conf"
+
+# Sanity check: our use of blkdebug fails if $TEST_DIR contains spaces
+# or other problems
+case "$TEST_DIR" in
+*[^-_a-zA-Z0-9/]*)
+_notrun "Suspicious TEST_DIR='$TEST_DIR', cowardly refusing to run" ;;
+esac
+
+_cleanup()
+{
+_cleanup_test_img
+rm -f "$BLKDBG_CONF"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# Test is supported for any backing file; but we force qcow2 for our wrapper.
+_supported_fmt generic
+_supported_proto generic
+_supported_os Linux
+# LUKS support may be possible, but it complicates things.
+_unsupported_fmt luks
+
+echo
+echo '=== Copy-on-read ==='
+echo
+
+# Prep the images
+_make_test_img 4G
+$QEMU_IO -c "write -P 55 3G 1k" "$TEST_IMG" | _filter_qemu_io
+IMGPROTO=file IMGFMT=qcow2 IMGOPTS= TEST_IMG_FILE="$TEST_WRAP" \
+_make_test_img -F "$IMGFMT" -b "$TEST_IMG" | _filter_img_create
+$QEMU_IO -f qcow2 -c "write -z -u 1M 64k" "$TEST_WRAP" | _filter_qemu_io
+
+# Ensure that a read of two clusters, but where one is already allocated,
+# does not re-write the allocated cluster
+cat > "$BLKDBG_CONF" <&1 | _filter_testdir
+
+# Break the backing chain, and show that images are identical, and that
+# we properly copied over explicit zeros.
+$QEMU_IMG rebase -u -b "" -f qcow2 "$TEST_WRAP"
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"
+_check_test_img
+$QEMU_IMG compare -f $IMGFMT -F qcow2 "$TEST_IMG" "$TEST_WRAP"
+
+# success, all done
+echo '*** done'
+status=0
diff --git a/tests/qemu-iotests/197.out b/tests/qemu-iotests/197.out
new file mode 100644
index 00..52b4137d7b
--- /dev/null
+++ b/tests/qemu-iotests/197.out
@@ -0,0 +1,26 @@
+QA output created by 197
+
+=== Copy-on-read ===
+
+Formatting 'TEST_DIR/t.IMGFMT',

[Qemu-devel] [PATCH v2 3/5] block: Add blkdebug hook for copy-on-read

2017-10-03 Thread Eric Blake

Make it possible to inject errors on writes performed during a
read operation due to copy-on-read semantics.

Signed-off-by: Eric Blake 
Reviewed-by: Jeff Cody 
Reviewed-by: Kevin Wolf 
Reviewed-by: John Snow 
Reviewed-by: Stefan Hajnoczi 
---
 qapi/block-core.json | 5 -
 block/io.c   | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 750bb0c77c..ab96e348e6 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2538,6 +2538,8 @@
 #
 # @l1_shrink_free_l2_clusters: discard the l2 tables. (since 2.11)
 #
+# @cor_write: a write due to copy-on-read (since 2.11)
+#
 # Since: 2.9
 ##
 { 'enum': 'BlkdebugEvent', 'prefix': 'BLKDBG',
@@ -2555,7 +2557,8 @@
 'flush_to_disk', 'pwritev_rmw_head', 'pwritev_rmw_after_head',
 'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
 'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
-'l1_shrink_write_table', 'l1_shrink_free_l2_clusters' ] }
+'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
+'cor_write'] }

 ##
 # @BlkdebugInjectErrorOptions:
diff --git a/block/io.c b/block/io.c
index 1f5baac41d..d656a0485b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -983,6 +983,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
 goto err;
 }

+bdrv_debug_event(bs, BLKDBG_COR_WRITE);
 if (drv->bdrv_co_pwrite_zeroes &&
 buffer_is_zero(bounce_buffer, iov.iov_len)) {
 /* FIXME: Should we (perhaps conditionally) be setting
-- 
2.13.6

[Qemu-devel] [PATCH v2 2/5] block: Uniform handling of 0-length bdrv_get_block_status()

2017-10-03 Thread Eric Blake

Handle a 0-length block status request up front, with a uniform
return value claiming the area is not allocated.

Most callers don't pass a length of 0 to bdrv_get_block_status()
and friends; but it definitely happens with a 0-length read when
copy-on-read is enabled.  While we could audit all callers to
ensure that they never make a 0-length request, and then assert
that fact, it was just as easy to fix things to always report
success (as long as the callers are careful to not go into an
infinite loop).  However, we had inconsistent behavior on whether
the status is reported as allocated or defers to the backing
layer, depending on what callbacks the driver implements, and
possibly wasting quite a few CPU cycles to get to that answer.
Consistently reporting unallocated up front doesn't really hurt
anything, and makes it easier both for callers (0-length requests
now have well-defined behavior) and for drivers (drivers don't
have to deal with 0-length requests).
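
The up-front check in the diff below can be restated as a tiny Python
model. This is a sketch, not QEMU code: block_status_precheck() is a
hypothetical name, and the value given to BDRV_BLOCK_EOF is a placeholder
bit rather than a claim about QEMU's headers.

```python
BDRV_BLOCK_EOF = 0x20   # placeholder flag value, for illustration only


def block_status_precheck(sector_num, nb_sectors, total_sectors):
    """Answer trivial requests before the driver is consulted.

    Returns (status, pnum) when the request starts at or past the end of
    the image, or has zero length; returns None when the driver has to
    be asked.
    """
    if sector_num >= total_sectors or nb_sectors == 0:
        status = BDRV_BLOCK_EOF if sector_num >= total_sectors else 0
        return status, 0
    return None


print(block_status_precheck(0, 0, 100))     # 0-length: unallocated
print(block_status_precheck(100, 8, 100))   # at end of image: EOF
print(block_status_precheck(10, 8, 100))    # normal request: ask driver
```

Because pnum comes back as 0 for a 0-length request, a caller that loops
until its byte count is consumed must special-case that answer, which is
the infinite-loop caveat mentioned above.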

Signed-off-by: Eric Blake 

---
v2: new patch
---
 block/io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index e0f904583f..1f5baac41d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1773,9 +1773,9 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 return total_sectors;
 }

-if (sector_num >= total_sectors) {
+if (sector_num >= total_sectors || !nb_sectors) {
 *pnum = 0;
-return BDRV_BLOCK_EOF;
+return sector_num >= total_sectors ? BDRV_BLOCK_EOF : 0;
 }

 n = total_sectors - sector_num;
-- 
2.13.6

[Qemu-devel] [PATCH v2 1/5] qemu-io: Add -C for opening with copy-on-read

2017-10-03 Thread Eric Blake

Make it easier to enable copy-on-read during iotests, by
exposing a new bool option to main and open.
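
As an illustration of how the new letter slots into the option string, the
following Python sketch parses arguments with the same short-option string
the patched open_f() passes to getopt(). It is a model only:
parse_open_args() is hypothetical, just two of the options are acted on,
and the BDRV_O_COPY_ON_READ value is a placeholder rather than QEMU's
actual flag bit.

```python
import getopt

BDRV_O_COPY_ON_READ = 0x1          # placeholder bit, not QEMU's value
OPEN_OPTSTRING = "snCro:kt:d:U"    # option string from the patched open_f()


def parse_open_args(argv):
    """Return (flags, readonly, remaining_args) for an 'open' command."""
    opts, rest = getopt.getopt(argv, OPEN_OPTSTRING)
    flags = 0
    readonly = False
    for opt, _value in opts:
        if opt == "-C":
            flags |= BDRV_O_COPY_ON_READ
        elif opt == "-r":
            readonly = True
    return flags, readonly, rest


print(parse_open_args(["-C", "-r", "-t", "none", "test.qcow2"]))
print(parse_open_args(["-r", "test.qcow2"]))
```

Since 'C' is listed without a trailing colon, it takes no argument, which
matches the "-C, -- use copy-on-read" help text in the patch.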

Signed-off-by: Eric Blake 
Reviewed-by: Jeff Cody 
Reviewed-by: Kevin Wolf 
Reviewed-by: John Snow 
Reviewed-by: Stefan Hajnoczi 
---
 qemu-io.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index 265445ad89..c70bde3eb1 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -102,6 +102,7 @@ static void open_help(void)
 " Opens a file for subsequent use by all of the other qemu-io commands.\n"
 " -r, -- open file read-only\n"
 " -s, -- use snapshot file\n"
+" -C, -- use copy-on-read\n"
 " -n, -- disable host cache, short for -t none\n"
 " -U, -- force shared permissions\n"
 " -k, -- use kernel AIO implementation (on Linux only)\n"
@@ -120,7 +121,7 @@ static const cmdinfo_t open_cmd = {
 .argmin = 1,
 .argmax = -1,
 .flags  = CMD_NOFILE_OK,
-.args   = "[-rsnkU] [-t cache] [-d discard] [-o options] [path]",
+.args   = "[-rsCnkU] [-t cache] [-d discard] [-o options] [path]",
 .oneline= "open the file specified by path",
 .help   = open_help,
 };
@@ -145,7 +146,7 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
 QDict *opts;
 bool force_share = false;

-while ((c = getopt(argc, argv, "snro:kt:d:U")) != -1) {
+while ((c = getopt(argc, argv, "snCro:kt:d:U")) != -1) {
 switch (c) {
 case 's':
 flags |= BDRV_O_SNAPSHOT;
@@ -154,6 +155,9 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
 flags |= BDRV_O_NOCACHE;
 writethrough = false;
 break;
+case 'C':
+flags |= BDRV_O_COPY_ON_READ;
+break;
 case 'r':
 readonly = 1;
 break;
@@ -251,6 +255,7 @@ static void usage(const char *name)
 "  -r, --read-only  export read-only\n"
 "  -s, --snapshot   use snapshot file\n"
 "  -n, --nocachedisable host cache, short for -t none\n"
+"  -C, --copy-on-read   enable copy-on-read\n"
 "  -m, --misalign   misalign allocations for O_DIRECT\n"
 "  -k, --native-aio use kernel AIO implementation (on Linux only)\n"
 "  -t, --cache=MODE use the given cache mode for the image\n"
@@ -439,7 +444,7 @@ static QemuOptsList file_opts = {
 int main(int argc, char **argv)
 {
 int readonly = 0;
-const char *sopt = "hVc:d:f:rsnmkt:T:U";
+const char *sopt = "hVc:d:f:rsnCmkt:T:U";
 const struct option lopt[] = {
 { "help", no_argument, NULL, 'h' },
 { "version", no_argument, NULL, 'V' },
@@ -448,6 +453,7 @@ int main(int argc, char **argv)
 { "read-only", no_argument, NULL, 'r' },
 { "snapshot", no_argument, NULL, 's' },
 { "nocache", no_argument, NULL, 'n' },
+{ "copy-on-read", no_argument, NULL, 'C' },
 { "misalign", no_argument, NULL, 'm' },
 { "native-aio", no_argument, NULL, 'k' },
 { "discard", required_argument, NULL, 'd' },
@@ -492,6 +498,9 @@ int main(int argc, char **argv)
 flags |= BDRV_O_NOCACHE;
 writethrough = false;
 break;
+case 'C':
+flags |= BDRV_O_COPY_ON_READ;
+break;
 case 'd':
 if (bdrv_parse_discard_flags(optarg, ) < 0) {
 error_report("Invalid discard option: %s", optarg);
-- 
2.13.6

[Qemu-devel] [PATCH v2 0/5] block: Avoid copy-on-read assertions

2017-10-03 Thread Eric Blake

During my quest to switch block status to be byte-based, John
forced me to evaluate whether we have a situation during
copy-on-read where we could exceed BDRV_REQUEST_MAX_BYTES [1].
Sure enough, we have a number of pre-existing bugs in the
copy-on-read code.  Fix those, along with adding a test.

Available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-status-v4

Since v1 (available at [2]):
- tweak patch 3 (now 4) to avoid uninit variable [Kevin, patchew]
- tweak patch 4 (now 5) to add 0-length test [Kevin]
- tweak patch 4 (now 5) to skip if TEST_DIR contains spaces [Jeff]
- new patch 2 to make testing 0-length read easier

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg07286.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg08200.html

001/5:[] [--] 'qemu-io: Add -C for opening with copy-on-read'
002/5:[down] 'block: Uniform handling of 0-length bdrv_get_block_status()'
003/5:[] [--] 'block: Add blkdebug hook for copy-on-read'
004/5:[0001] [FC] 'block: Perform copy-on-read in loop'
005/5:[0025] [FC] 'iotests: Add test 197 for covering copy-on-read'

Eric Blake (5):
  qemu-io: Add -C for opening with copy-on-read
  block: Uniform handling of 0-length bdrv_get_block_status()
  block: Add blkdebug hook for copy-on-read
  block: Perform copy-on-read in loop
  iotests: Add test 197 for covering copy-on-read

 qapi/block-core.json |   5 +-
 block/io.c   | 123 ++-
 qemu-io.c|  15 -
 tests/qemu-iotests/common.filter |   1 +
 tests/qemu-iotests/197   | 102 
 tests/qemu-iotests/197.out   |  26 +
 tests/qemu-iotests/group |   1 +
 7 files changed, 230 insertions(+), 43 deletions(-)
 create mode 100755 tests/qemu-iotests/197
 create mode 100644 tests/qemu-iotests/197.out

-- 
2.13.6

Re: [Qemu-devel] [PATCH v5 5/6] ppc: spapr: Enable FWNMI capability

2017-10-03 Thread David Gibson

On Thu, Sep 28, 2017 at 04:08:21PM +0530, Aravinda Prasad wrote:
> Enable the KVM capability KVM_CAP_PPC_FWNMI so that
> the KVM causes guest exit with NMI as exit reason
> when it encounters a machine check exception on the
> address belonging to a guest. Without this capability
> enabled, KVM redirects machine check exceptions to
> guest's 0x200 vector.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr_rtas.c  |   15 +++
>  target/ppc/kvm.c |   13 +
>  target/ppc/kvm_ppc.h |6 ++
>  3 files changed, 34 insertions(+)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 08e9a5e..d017a67 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -46,6 +46,7 @@
>  #include "qemu/cutils.h"
>  #include "trace.h"
>  #include "hw/ppc/fdt.h"
> +#include "kvm_ppc.h"
>  
>  static void rtas_display_character(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> uint32_t token, uint32_t nargs,
> @@ -354,6 +355,20 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>target_ulong args,
>uint32_t nret, target_ulong rets)
>  {
> +int ret;
> +
> +ret = kvmppc_fwnmi_enable(cpu);

If you're enabling it here, doesn't that mean you need to disable it
on reset?

> +if (ret == 1) {
> +rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED);
> +return;
> +}
> +
> +if (ret < 0) {
> +rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> +return;
> +}
> +
>  spapr->guest_machine_check_addr = rtas_ld(args, 1);
>  rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>  }
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 7e4ce02..59b3322 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -92,6 +92,7 @@ static int cap_mmu_radix;
>  static int cap_mmu_hash_v3;
>  static int cap_resize_hpt;
>  static int cap_ppc_pvr_compat;
> +static int cap_fwnmi;
>  
>  static uint32_t debug_inst_opcode;
>  
> @@ -150,6 +151,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>  cap_mmu_radix = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_RADIX);
>  cap_mmu_hash_v3 = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_HASH_V3);
>  cap_resize_hpt = kvm_vm_check_extension(s, KVM_CAP_SPAPR_RESIZE_HPT);
> +cap_fwnmi = kvm_check_extension(s, KVM_CAP_PPC_FWNMI);
>  /*
>   * Note: setting it to false because there is not such capability
>   * in KVM at this moment.
> @@ -2142,6 +2144,17 @@ void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
>  }
>  }
>  
> +int kvmppc_fwnmi_enable(PowerPCCPU *cpu)
> +{
> +CPUState *cs = CPU(cpu);
> +
> +if (!cap_fwnmi) {
> +return 1;
> +}
> +
> +return kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_FWNMI, 0);

Yeah, this is no good.  It means migration from a host that's fwnmi
capable to one that isn't will be subtly broken.  Instead you need to
make fwnmi capability a machine property.  If the property is
requested and the host kernel doesn't support it, you need to outright
fail, rather than try to fall back.

> +}
> +
>  int kvmppc_smt_threads(void)
>  {
>  return cap_ppc_smt ? cap_ppc_smt : 1;
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 0139dae..55b6df2 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -28,6 +28,7 @@ void kvmppc_enable_clear_ref_mod_hcalls(void);
>  void kvmppc_set_papr(PowerPCCPU *cpu);
>  int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
>  void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
> +int kvmppc_fwnmi_enable(PowerPCCPU *cpu);
>  int kvmppc_smt_threads(void);
>  void kvmppc_hint_smt_possible(Error **errp);
>  int kvmppc_set_smt_threads(int smt);
> @@ -157,6 +158,11 @@ static inline void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
>  {
>  }
>  
> +int kvmppc_fwnmi_enable(PowerPCCPU *cpu)
> +{
> +return 1;

Likewise, this should be available, not banned, on TCG.  I think there
are existing problems with TCG<->KVM migration, but there's no
inherent reason they shouldn't work, so we don't want to introduce
extra reasons they don't.

Even if TCG will never generate fwnmis (for now), it should allow the
guest to register for them.

> +}
> +
>  static inline int kvmppc_smt_threads(void)
>  {
>  return 1;
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v5 6/6] migration: Block migration while handling machine check

2017-10-03 Thread David Gibson

On Thu, Sep 28, 2017 at 04:08:31PM +0530, Aravinda Prasad wrote:
> Block VM migration requests until the machine check
> error handling is complete as (i) these errors are
> specific to the source hardware and are irrelevant on
> the target hardware, (ii) these errors cause data
> corruption and should be handled before migration.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr_rtas.c|3 +++
>  include/hw/ppc/spapr.h |2 ++
>  target/ppc/kvm.c   |   17 +
>  3 files changed, 22 insertions(+)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index d017a67..17f6567 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -47,6 +47,7 @@
>  #include "trace.h"
>  #include "hw/ppc/fdt.h"
>  #include "kvm_ppc.h"
> +#include "migration/blocker.h"
>  
>  static void rtas_display_character(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> uint32_t token, uint32_t nargs,
> @@ -390,6 +391,8 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>  spapr->mc_status = -1;
>  qemu_cond_signal(>mc_delivery_cond);
>  rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +migrate_del_blocker(spapr->migration_blocker);
> +error_free(spapr->migration_blocker);
>  }
>  }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index a75e9cf..0890a44 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -7,6 +7,7 @@
>  #include "hw/ppc/spapr_drc.h"
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/ppc/spapr_ovec.h"
> +#include "qapi/error.h"
>  
>  struct VIOsPAPRBus;
>  struct sPAPRPHBState;
> @@ -136,6 +137,7 @@ struct sPAPRMachineState {
>  MemoryHotplugState hotplug_memory;
>  
>  const char *icp_type;
> +Error *migration_blocker;

This isn't a good name, because it's _specifically_ the fwnmi as a
migration blocker - trying to put another migration blocker in here
would break horribly, because nmi-interlock would clear it regardless.

>  };
>  
>  #define H_SUCCESS 0
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 59b3322..58de7ea 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -52,6 +52,7 @@
>  #endif
>  #include "elf.h"
>  #include "sysemu/kvm_int.h"
> +#include "migration/blocker.h"
>  
>  //#define DEBUG_KVM
>  
> @@ -2770,10 +2771,26 @@ int kvm_handle_nmi(PowerPCCPU *cpu, struct kvm_run *run)
>  sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>  target_ulong msr = 0;
> +Error *local_err = NULL;
> +int ret;
>  bool type, le;
>  
>  cpu_synchronize_state(CPU(cpu));
>  
> +error_setg(>migration_blocker,
> +"Live migration not supported during machine check error handling");

In fact, there's no real reason to generate the error here.  The
error's always the same so you could just create it at startup as a
global and just add/remove it to the block list.

> +ret = migrate_add_blocker(spapr->migration_blocker, _err);
> +if (ret < 0) {
> +/*
> + * We don't want to abort and let the migration to continue. In a
> + * rare case, the machine check handler will run on the target
> + * hardware. Though this is not preferable, it is better than aborting

Why is it not preferable?  I mean it's an edge case, but AFAICT it's
still the correct behaviour.

> + * the migration or killing the VM.
> + */
> +error_free(spapr->migration_blocker);
> +fprintf(stderr, "Warning: Machine check during VM migration\n");

Use error_report(), not fprintf().

> +}
> +
>  /*
>   * Properly set bits in MSR before we invoke the handler.
>   * SRR0/1, DAR and DSISR are properly set by KVM
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v5 4/6] target/ppc: Handle NMI guest exit

2017-10-03 Thread David Gibson

On Thu, Sep 28, 2017 at 04:08:10PM +0530, Aravinda Prasad wrote:
> Memory errors such as bit flips that cannot be corrected
> by hardware are passed on to the kernel for handling.
> If the memory address in error belongs to guest then
> the guest kernel is responsible for taking suitable action.
> Patch [1] enhances KVM to exit guest with exit reason
> set to KVM_EXIT_NMI in such cases.
> 
> This patch handles KVM_EXIT_NMI exit. If the guest OS
> has registered the machine check handling routine by
> calling "ibm,nmi-register", then the handler builds
> the error log and invokes the registered handler else
> invokes the handler at 0x200.
> 
> Note that FWNMI handles synchronous machine check exceptions
> triggered by the hardware and hence we do not extend
> such support to the "nmi" command available in the QEMU
> monitor. Hence, "nmi" command from the monitor will
> always go through 0x200 vector.
> 
> [1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
>   (e20bbd3d and related commits)

What does happen on KVM if an asynchronous machine check exception
occurs while in the guest?  Or under PowerVM for that matter.

> 
> Signed-off-by: Aravinda Prasad 
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  hw/ppc/spapr.c |4 +++
>  hw/ppc/spapr_events.c  |   62 
> 
>  include/hw/ppc/spapr.h |6 +
>  target/ppc/kvm.c   |   62 
> 
>  target/ppc/kvm_ppc.h   |   14 +++
>  5 files changed, 148 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d568ea6..7780434 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2453,6 +2453,10 @@ static void ppc_spapr_init(MachineState *machine)
>  error_report("Could not get size of LPAR rtas '%s'", filename);
>  exit(1);
>  }
> +
> +/* Resize blob to accommodate error log. */
> +spapr->rtas_size = spapr_get_rtas_size();
> +
>  spapr->rtas_blob = g_malloc(spapr->rtas_size);
>  if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>  error_report("Could not load LPAR rtas '%s'", filename);
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index e377fc7..ac93a7b 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -41,6 +41,7 @@
>  #include "qemu/bcd.h"
>  #include "hw/ppc/spapr_ovec.h"
>  #include 
> +#include 
>  
>  #define RTAS_LOG_VERSION_MASK   0xff00
>  #define   RTAS_LOG_VERSION_60x0600
> @@ -174,6 +175,22 @@ struct epow_extended_log {
>  struct rtas_event_log_v6_epow epow;
>  } QEMU_PACKED;
>  
> +/*
> + * Data format in RTAS Blob
> + *
> + * This structure contains error information related to Machine
> + * Check exception. This is filled up and copied to rtas blob
> + * upon machine check exception. The address of rtas blob is
> + * passed on to OS registered machine check notification
> + * routines upon machine check exception.
> + */
> +struct rtas_event_log_mce {
> +target_ulong r3;
> +struct rtas_error_log rtas_error_log;
> +unsigned char   buffer[1];  /* Start of extended log */

I believe we allow C99 extensions in qemu, so you can use buffer[], a
C99 flexible array member, rather than the length 1 hack.

> +} QEMU_PACKED;
> +
> +
>  union drc_identifier {
>  uint32_t index;
>  uint32_t count;
> @@ -623,6 +640,51 @@ void spapr_hotplug_req_remove_by_count_indexed(sPAPRDRConnectorType drc_type,
>  RTAS_LOG_V6_HP_ACTION_REMOVE, drc_type, _id);
>  }
>  
> +ssize_t spapr_get_rtas_size(void)
> +{
> +return RTAS_ERRLOG_OFFSET + sizeof(struct rtas_event_log_mce);

Erm.. because of the definition of rtas_event_log_mce, this only
allows for 1 byte of extended log buffer.  That doesn't seem right.
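
A minimal standalone sketch of the flexible-array variant and the sizing it implies. The struct layout and the SKETCH_EXT_LOG_MAX_LEN bound are stand-ins invented for illustration, not the definitions from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed part of the RTAS error log. */
struct sketch_error_log {
    uint32_t summary;
    uint32_t extended_length;
};

/* Hypothetical upper bound on the extended log; not from the patch. */
#define SKETCH_EXT_LOG_MAX_LEN 64

struct sketch_event_log_mce {
    uint64_t r3;
    struct sketch_error_log error_log;
    unsigned char buffer[];     /* C99 flexible array member */
};

/*
 * sizeof() of a struct with a flexible array member does not count the
 * array, so room for the extended log has to be added explicitly.
 */
static size_t sketch_mce_log_size(size_t ext_len)
{
    return sizeof(struct sketch_event_log_mce) + ext_len;
}
```

With this shape, the blob size would be computed as something like sketch_mce_log_size(SKETCH_EXT_LOG_MAX_LEN), so the reserved space tracks the largest extended log that can be written rather than a single byte.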

> +}
> +
> +target_ulong spapr_mce_req_event(target_ulong r3, hwaddr rtas_addr,
> + uint16_t flags, bool err_type, bool le)

err_type isn't a very informative name for a boolean.  'uncorrectable'
would be better.  Although, didn't you say only uncorrectable errors
are directed to the guest, so does this have any purpose anyway?

> +{
> +struct rtas_event_log_mce mc_log;
> +uint32_t summary;
> +
> +/* Set error log fields */
> +mc_log.r3 = r3;
> +
> +summary = RTAS_LOG_SEVERITY_ERROR_SYNC;
> +
> +if (flags & KVM_RUN_PPC_NMI_DISP_FULLY_RECOV) {

KVM specific flags shouldn't be here, this translation should happen
in the caller.
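
A rough sketch of that split, with placeholder constants and function names that are assumptions for illustration only (neither the real KVM flag nor the real RTAS values):

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the real constants. */
#define SKETCH_KVM_NMI_FULLY_RECOV        (1u << 0)
#define SKETCH_RTAS_DISP_FULLY_RECOVERED  0x0u
#define SKETCH_RTAS_DISP_NOT_RECOVERED    0x1u

/* KVM side: the only place that knows about KVM flag bits. */
static bool sketch_kvm_nmi_recovered(uint16_t kvm_flags)
{
    return (kvm_flags & SKETCH_KVM_NMI_FULLY_RECOV) != 0;
}

/* Machine side: takes a plain bool, so no KVM definitions are needed. */
static uint32_t sketch_mce_disposition(bool recovered)
{
    return recovered ? SKETCH_RTAS_DISP_FULLY_RECOVERED
                     : SKETCH_RTAS_DISP_NOT_RECOVERED;
}
```

The caller in the KVM code would then do the translation once and hand the event code a value that carries no KVM-specific encoding.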

> +summary |= RTAS_LOG_DISPOSITION_FULLY_RECOVERED;
> +} else {
> +summary |= RTAS_LOG_DISPOSITION_NOT_RECOVERED;
> +}
> +
> +summary |= (RTAS_LOG_INITIATOR_MEMORY | RTAS_LOG_TARGET_MEMORY);
> +
> +if (err_type) {
> +summary |= RTAS_LOG_TYPE_ECC_UNCORR;
> +} else {
> +summary |= RTAS_LOG_TYPE_ECC_CORR;
> +}
> +
> +

Re: [Qemu-devel] [PATCH 3/3] blockjob: expose manual-cull property

2017-10-03 Thread John Snow



On 10/03/2017 01:43 PM, Jeff Cody wrote:
> On Tue, Oct 03, 2017 at 11:59:28AM -0400, John Snow wrote:
>>
>>
>> On 10/03/2017 11:57 AM, Paolo Bonzini wrote:
>>> On 03/10/2017 05:15, John Snow wrote:
>>>> For drive-backup and blockdev-backup, expose the manual-cull
>>>> property, having it default to false. There are no universal
>>>> creation parameters, so it must be added to each job type that
>>>> it makes sense for individually.
>>>>
>>>> Signed-off-by: John Snow 
>>
>> [...]
>>
>>>
>>> The verb "cull" is a bit weird.  The only alternative that comes to mind
>>> though are "reap" (like processes). There's also "join" (like threads),
>>> but would imply waiting if the job hasn't completed yet, and we
>>> probably don't want it.
>>>
>>> Paolo
>>>
>>
>> Sure, open to suggestions. I think Kevin suggested "delete" which I have
>> reservations about because of people potentially confusing it with
>> "cancel" or "complete" -- it does not have the capacity to
>> end/terminate/finish/complete/cancel a job.
>>
>> "reap" might be fine. I don't really have any strong preference.
>>
> 
> As far as verbs go, I like both 'reap' and 'delete'.  As far as the
> property, naming it 'manual_verb' is a bit odd, too. Maybe a clearer term
> for the property would just be 'persistent', with the QMP command being
> 'block_job_reap' or 'block_job_delete'?
> 
> -Jeff
> 

As they say, two hard problems in Computer Science ...

[Qemu-devel] [Bug 1719196] Re: [arm64 ocata] newly created instances are unable to raise network interfaces

2017-10-03 Thread Sean Feole

Taken from the upgraded hypervisor:

ubuntu@awrep3:/var/lib/nova/instances/2cec409e-de92-4d29-ad68-3f1d1f8be7fc$ sudo qemu-system-aarch64 --version
QEMU emulator version 2.8.0(Debian 1:2.8+dfsg-3ubuntu2.3~cloud0)
Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers


ubuntu@awrep3:/var/lib/nova/instances/2cec409e-de92-4d29-ad68-3f1d1f8be7fc$ sudo libvirtd --version
libvirtd (libvirt) 2.5.0

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1719196

Title:
  [arm64 ocata] newly created instances are unable to raise network
  interfaces

Status in libvirt:
  New
Status in QEMU:
  New

Bug description:
  arm64 Ocata ,

  I'm testing to see I can get Ocata running on arm64 and using the
  openstack-base bundle to deploy it.  I have added the bundle to the
  log file attached to this bug.

  When I create a new instance via nova, the VM comes up and runs,
  however fails to raise its eth0 interface. This occurs on both
  internal and external networks.

  ubuntu@openstackaw:~$ nova list
  +--------------------------------------+---------+--------+------------+-------------+--------------------+
  | ID                                   | Name    | Status | Task State | Power State | Networks           |
  +--------------------------------------+---------+--------+------------+-------------+--------------------+
  | dcaf6d51-f81e-4cbd-ac77-0c5d21bde57c | sfeole1 | ACTIVE | -          | Running     | internal=10.5.5.3  |
  | aa0b8aee-5650-41f4-8fa0-aeccdc763425 | sfeole2 | ACTIVE | -          | Running     | internal=10.5.5.13 |
  +--------------------------------------+---------+--------+------------+-------------+--------------------+
  ubuntu@openstackaw:~$ nova show aa0b8aee-5650-41f4-8fa0-aeccdc763425
  +--------------------------------------+----------------------------------------------------------+
  | Property                             | Value                                                    |
  +--------------------------------------+----------------------------------------------------------+
  | OS-DCF:diskConfig                    | MANUAL                                                   |
  | OS-EXT-AZ:availability_zone          | nova                                                     |
  | OS-EXT-SRV-ATTR:host                 | awrep3                                                   |
  | OS-EXT-SRV-ATTR:hypervisor_hostname  | awrep3.maas                                              |
  | OS-EXT-SRV-ATTR:instance_name        | instance-0003                                            |
  | OS-EXT-STS:power_state               | 1                                                        |
  | OS-EXT-STS:task_state                | -                                                        |
  | OS-EXT-STS:vm_state                  | active                                                   |
  | OS-SRV-USG:launched_at               | 2017-09-24T14:23:08.00                                   |
  | OS-SRV-USG:terminated_at             | -                                                        |
  | accessIPv4                           |                                                          |
  | accessIPv6                           |                                                          |
  | config_drive                         |                                                          |
  | created                              | 2017-09-24T14:22:41Z                                     |
  | flavor                               | m1.small (717660ae-0440-4b19-a762-ffeb32a0575c)          |
  | hostId                               | 5612a00671c47255d2ebd6737a64ec9bd3a5866d1233ecf3e988b025 |
  | id                                   | aa0b8aee-5650-41f4-8fa0-aeccdc763425                     |
  | image                                | zestynosplash (e88fd1bd-f040-44d8-9e7c-c462ccf4b945)     |
  | internal network                     | 10.5.5.13                                                |
  | key_name                             | mykey                                                    |
  | metadata                             | {}                                                       |
  | name                                 | sfeole2                                                  |
  | os-extended-volumes:volumes_attached | []                                                       |
  | progress                             | 0                                                        |
  | security_groups                      | default                                                  |
  | status                               | ACTIVE                                                   |
  | tenant_id                            | 9f7a21c1ad264fec81abc09f3960ad1d                         |
  | updated                              | 2017-09-24T14:23:09Z                                     |

Re: [Qemu-devel] [PATCH] target/ppc: Remove unused PPC 460 and 460F definitions

2017-10-03 Thread David Gibson

On Tue, Oct 03, 2017 at 12:14:04PM +0200, Thomas Huth wrote:
> We don't have any 460 or 460F CPUs in QEMU, so the init functions
> are just dead code. Let's simply remove them (translate_init.c
> is already big enough without them).
> 
> Signed-off-by: Thomas Huth 

Applied to ppc-for-2.11, thanks.

> ---
>  target/ppc/translate_init.c | 217 
> 
>  1 file changed, 217 deletions(-)
> 
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index c6399a3..0d6379f 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -4176,223 +4176,6 @@ POWERPC_FAMILY(440x5wDFPU)(ObjectClass *oc, void *data)
>   POWERPC_FLAG_DE | POWERPC_FLAG_BUS_CLK;
>  }
>  
> -static void init_proc_460 (CPUPPCState *env)
> -{
> -/* Time base */
> -gen_tbl(env);
> -gen_spr_BookE(env, 0xULL);
> -gen_spr_440(env);
> -gen_spr_usprgh(env);
> -/* Processor identification */
> -spr_register(env, SPR_BOOKE_PIR, "PIR",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_pir,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_BOOKE_IAC3, "IAC3",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_BOOKE_IAC4, "IAC4",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_BOOKE_DVC1, "DVC1",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_BOOKE_DVC2, "DVC2",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_BOOKE_MCSR, "MCSR",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -spr_register(env, SPR_BOOKE_MCSRR0, "MCSRR0",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -spr_register(env, SPR_BOOKE_MCSRR1, "MCSRR1",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_440_CCR1, "CCR1",
> - SPR_NOACCESS, SPR_NOACCESS,
> - _read_generic, _write_generic,
> - 0x);
> -/* XXX : not implemented */
> -spr_register(env, SPR_DCRIPR, "SPR_DCRIPR",
> - _read_generic, _write_generic,
> - _read_generic, _write_generic,
> - 0x);
> -/* Memory management */
> -#if !defined(CONFIG_USER_ONLY)
> -env->nb_tlb = 64;
> -env->nb_ways = 1;
> -env->id_tlbs = 0;
> -env->tlb_type = TLB_EMB;
> -#endif
> -init_excp_BookE(env);
> -env->dcache_line_size = 32;
> -env->icache_line_size = 32;
> -/* XXX: TODO: allocate internal IRQ controller */
> -
> -SET_FIT_PERIOD(12, 16, 20, 24);
> -SET_WDT_PERIOD(20, 24, 28, 32);
> -}
> -
> -POWERPC_FAMILY(460)(ObjectClass *oc, void *data)
> -{
> -DeviceClass *dc = DEVICE_CLASS(oc);
> -PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
> -
> -dc->desc = "PowerPC 460 (guessed)";
> -pcc->init_proc = init_proc_460;
> -pcc->check_pow = check_pow_nocheck;
> -pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING |
> -   PPC_DCR | PPC_DCRX  | PPC_DCRUX |
> -   PPC_WRTEE | PPC_MFAPIDI | PPC_MFTB |
> -   PPC_CACHE | PPC_CACHE_ICBI |
> -   PPC_CACHE_DCBZ | PPC_CACHE_DCBA |
> -   PPC_MEM_TLBSYNC | PPC_TLBIVA |
> -   PPC_BOOKE | PPC_4xx_COMMON | PPC_405_MAC |
> -   PPC_440_SPEC;
> -pcc->msr_mask = (1ull << MSR_POW) |
> -(1ull << MSR_CE) |
> -(1ull << MSR_EE) |
> -(1ull << MSR_PR) |
> -(1ull << MSR_FP) |
> -(1ull << MSR_ME) |
> -(1ull << MSR_FE0) |
> -(1ull << MSR_DWE) |
> -(1ull << MSR_DE) |
> -(1ull << MSR_FE1) |
> -(1ull << MSR_IR) |
> -(1ull << MSR_DR);
> -pcc->mmu_model = POWERPC_MMU_BOOKE;
> -pcc->excp_model = POWERPC_EXCP_BOOKE;
> -pcc->bus_model = PPC_FLAGS_INPUT_BookE;
> -pcc->bfd_mach = bfd_mach_ppc_403;
> -pcc->flags = POWERPC_FLAG_CE | POWERPC_FLAG_DWE |
> -

[Qemu-devel] [Bug 1719196] Re: [arm64 ocata] newly created instances are unable to raise network interfaces

2017-10-03 Thread Sean Feole

Taken from the upgraded hypervisor

ubuntu@aw3:/var/lib/nova/instances/2cec409e-de92-4d29-ad68-3f1d1f8be7fc$ sudo qemu-system-aarch64 --version
QEMU emulator version 2.8.0(Debian 1:2.8+dfsg-3ubuntu2.3~cloud0)
Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers


ubuntu@aw3:/var/lib/nova/instances/2cec409e-de92-4d29-ad68-3f1d1f8be7fc$ sudo libvirtd --version
libvirtd (libvirt) 2.5.0


[Qemu-devel] [Bug 1719196] Re: [arm64 ocata] newly created instances are unable to raise network interfaces

2017-10-03 Thread Sean Feole

Today I ran some tests and installed a Newton Deployment on arm64, which
we already know works.  I upgraded QEMU and Libvirt on one of the
hypervisors from the xenial-updates/ocata cloud-archive.

See attached notes.

Libvirt - 1.3.1-1ubuntu10.14 -> 2.5.0-3ubuntu5.5~cloud0
QEMU - 1:2.5+dfsg-5ubuntu10.16 -> 1:2.8+dfsg-3ubuntu2.3~cloud0

I was able to reset the already built instance on the hypervisor and was
able to receive a dhcp response from the ovs tap device. Eth0 came up as
expected with an internal tenant IP.


Steps to reproduce:

1.) Install Newton & start a few VMs
2.) Choose 1 hypervisor, upgrade qemu & libvirt to versions from the Ocata cloud-archive.
3.) Reset the running VM so that it now runs with the latest QEMU/Libvirt
4.) Reset the Instance, see if it boots and network can be reached.


** Attachment added: "notes1.txt"
   
https://bugs.launchpad.net/libvirt/+bug/1719196/+attachment/4961802/+files/notes1.txt


Re: [Qemu-devel] [PATCH] hw/ppc: use 0 instead of fdt_path_offset(fdt, "/")

2017-10-03 Thread David Gibson

On Tue, Oct 03, 2017 at 04:13:11PM +0200, Greg Kurz wrote:
> The offset of the root node is guaranteed to be 0.
> 
> This doesn't fix anything, it's just trivial cleanup of the two
> remaining places where this was done under hw/ppc.
> 
> Signed-off-by: Greg Kurz 

Applied to ppc-for-2.11.

> ---
>  hw/ppc/pnv.c   |3 +--
>  hw/ppc/spapr.c |3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index d46d91c76f5c..84b2389ea60b 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -92,8 +92,7 @@ static int get_cpus_node(void *fdt)
>  int cpus_offset = fdt_path_offset(fdt, "/cpus");
>  
>  if (cpus_offset < 0) {
> -cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"),
> -  "cpus");
> +cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
>  if (cpus_offset) {
>  _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
>  _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ff87f155d55e..352ff3d614e8 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -353,8 +353,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>  
>  cpus_offset = fdt_path_offset(fdt, "/cpus");
>  if (cpus_offset < 0) {
> -cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"),
> -  "cpus");
> +cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
>  if (cpus_offset < 0) {
>  return cpus_offset;
>  }
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.10 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt

2017-10-03 Thread Michael Roth

Quoting Michael Roth (2017-07-26 20:30:52)
> This series was motivated by the discussion in this thread:
> 
>   https://www.redhat.com/archives/libvir-list/2017-June/msg01370.html
> 
> The issue this series addresses is that when libvirt unplugs a VFIO PCI device,
> it may attempt to bind the host device back to the host driver when QEMU emits
> the DEVICE_DELETED event for the corresponding vfio-pci device. However, the
> VFIO group FD is not actually cleaned up until the vfio-pci device is *finalized*
> by QEMU, whereas the event is emitted earlier during device_unparent.
> Depending on the host device and how long certain operations like resetting the
> device might take, this can result in libvirt trying to rebind the device
> back to the host while it is still in use by VFIO, leading to host crashes or
> other unexpected behavior.
> 
> In particular, Mellanox CX4 adapters on PowerNV hosts might not be fully
> quiesced by vfio-pci's finalize() routine until up to 6s after the
> DEVICE_DELETED was emitted, leading to detach-device on the libvirt side pretty
> much always crashing the host.
> 
> Implementing this change requires 2 prereqs to ensure the same information is
> available when the DEVICE_DELETED is finally emitted:
> 
> 1) Storing the path in the composition patch, which is addressed by PATCH 1,
>which was plucked from another pending series from Greg Kurz:
> 
>https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg07922.html
> 
>since we are now "disconnected" at the time the event is emitted, and
> 
> 2) Deferring qemu_opts_del of the DeviceState->QemuOpts till finalize, since
>that is where DeviceState->id is stored. This was actually how it was
>done in the past, so PATCH 2 simply reverts the change which moved it to
>device_unparent.
> 
> From there it's just a mechanical move of the event from device_unparent to
> device_finalize.

Ping.

The situation has changed somewhat since original posting as Alex now
has a fix on the kernel side for the VFIO issue noted above:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6586b561a91cd80a91c8f107ed0d144feb3eadc2

However, I think this series would still be useful for addressing the
issue for hosts using older kernels, and there seems to be general
interest from the libvirt side in aligning "DEVICE_DELETED" events
with the notion that QEMU is completely finished with a device.
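
To illustrate the ordering the series aims for, here is a toy reference-counting model. Every name in it is hypothetical and it is not the qdev code; it only shows why an event emitted at finalize cannot race with resource cleanup, while one emitted at unparent can:

```c
#include <stdbool.h>

/* Toy model; none of these names are QEMU's. */
struct toy_device {
    int refcount;
    bool parented;
    bool resources_released;   /* e.g. the VFIO group FD */
    bool deleted_event_sent;
};

static void toy_finalize(struct toy_device *dev)
{
    dev->resources_released = true;
    /* Emitting here means listeners only see the event after cleanup. */
    dev->deleted_event_sent = true;
}

static void toy_unref(struct toy_device *dev)
{
    if (--dev->refcount == 0) {
        toy_finalize(dev);
    }
}

static void toy_unparent(struct toy_device *dev)
{
    if (dev->parented) {
        dev->parented = false;
        /* Drop the parent's reference; others may still hold one. */
        toy_unref(dev);
    }
}
```

In this model a device with an outstanding reference can be unparented without the event firing; the event only appears once the last reference is dropped and the resources are gone.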

Patch 1/3 is also needed for another series that Greg is working on
and I don't want to hold that up for this series, so if it's
preferred that we just post that patch separately in the meantime
please let me know.

Thanks!

> 
>  hw/core/qdev.c | 30 +++---
>  include/hw/qdev-core.h |  1 +
>  2 files changed, 20 insertions(+), 11 deletions(-)
> 
>

Re: [Qemu-devel] [PATCH v1 5/5] raspi: : Specify the valid CPUs

2017-10-03 Thread Philippe Mathieu-Daudé

On 10/03/2017 06:36 PM, Alistair Francis wrote:
> On Tue, Oct 3, 2017 at 1:39 PM, Eduardo Habkost  wrote:
>> On Tue, Oct 03, 2017 at 01:05:18PM -0700, Alistair Francis wrote:
>>> List all possible valid CPU options.
>>>
>>> Signed-off-by: Alistair Francis 
>>> ---
>>>
>>>  hw/arm/raspi.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
>>> index 5941c9f751..555db0f258 100644
>>> --- a/hw/arm/raspi.c
>>> +++ b/hw/arm/raspi.c
>>> @@ -158,6 +158,10 @@ static void raspi2_init(MachineState *machine)
>>>  setup_boot(machine, 2, machine->ram_size - vcram_size);
>>>  }
>>>
>>> +const char *raspi2_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a7"),
>>> +NULL
>>> +  };
>>> +
>>>  static void raspi2_machine_init(MachineClass *mc)
>>>  {
>>>  mc->desc = "Raspberry Pi 2";
>>> @@ -169,5 +173,7 @@ static void raspi2_machine_init(MachineClass *mc)
>>>  mc->max_cpus = BCM2836_NCPUS;
>>>  mc->default_ram_size = 1024 * 1024 * 1024;
>>>  mc->ignore_memory_transaction_failures = true;
>>> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a7");
>>> +mc->valid_cpu_types = raspi2_valid_cpus;
>>
>> I'm confused: bcm2836_init() is hardcoded to cortex-a15, not
>> cortex-a7.
> 
> Odd. I just looked up the Raspberry Pi 2 and it says a Cortex-A7:
> https://www.raspberrypi.org/products/raspberry-pi-2-model-b/

The BCM2836 SoC definitively is Cortex-A7.

git history says the A7 was added after (dcf578ed8cec) the raspi2 board
(bad5623690b1).

Reviewed-by: Philippe Mathieu-Daudé 
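
For readers following along, a minimal sketch of how a NULL-terminated list like raspi2_valid_cpus could be checked. The helper name and the plain "cortex-a7" string are assumptions for illustration; the patch builds the names with ARM_CPU_TYPE_NAME() and the actual check lives elsewhere in the series:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list; the patch uses ARM_CPU_TYPE_NAME("cortex-a7"). */
static const char *sketch_valid_cpus[] = { "cortex-a7", NULL };

/* Walk a NULL-terminated list and report whether cpu_type is in it. */
static bool sketch_cpu_type_is_valid(const char **valid, const char *cpu_type)
{
    size_t i;

    for (i = 0; valid[i] != NULL; i++) {
        if (strcmp(valid[i], cpu_type) == 0) {
            return true;
        }
    }
    return false;
}
```

With such a check, a machine whose SoC code hardcodes a different CPU type than the one in its valid list would be caught as inconsistent, which is the mismatch being discussed here.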

> 
> Thanks,
> Alistair
> 
>>
>>>  };
>>>  DEFINE_MACHINE("raspi2", raspi2_machine_init)
>>> --
>>> 2.11.0
>>>
>>
>> --
>> Eduardo

[Qemu-devel] [PULL 3/3] vfio/pci: Add NVIDIA GPUDirect Cliques support

2017-10-03 Thread Alex Williamson

NVIDIA has defined a specification for creating GPUDirect "cliques",
where devices with the same clique ID support direct peer-to-peer DMA.
When running on bare-metal, tools like NVIDIA's p2pBandwidthLatencyTest
(part of cuda-samples) determine which GPUs can support peer-to-peer
based on chipset and topology.  When running in a VM, these tools have
no visibility to the physical hardware support or topology.  This
option allows the user to specify hints via a vendor defined
capability.  For instance:

  






  

This enables two cliques.  The first is a singleton clique with ID 0,
for the first hostdev defined in the XML (note that since cliques
define peer-to-peer sets, singleton clique offer no benefit).  The
subsequent two hostdevs are both added to clique ID 1, indicating
peer-to-peer is possible between these devices.

QEMU only validates that the clique ID is valid and applied to an
NVIDIA graphics device; any validation that the resulting cliques are
functional and valid is the user's responsibility.  The NVIDIA
specification allows a 4-bit clique ID, thus valid values are 0-15.
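
As a rough standalone illustration of the range check and of the 8-byte capability described by the layout comment in the patch: the function names below are made up, and the byte order is one reading of that layout diagram (little-endian, as PCI config space is), not a copy of the patch code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CAP_ID_VNDR 0x09   /* vendor-specific capability ID (9h) */
#define SKETCH_CAP_LEN     8

/* A 4-bit clique ID: anything outside 0-15 is rejected. */
static bool sketch_clique_id_valid(unsigned int id)
{
    return (id & ~0xFu) == 0;
}

/* Fill an 8-byte buffer following the layout diagram, low byte first. */
static int sketch_encode_clique_cap(uint8_t cap[SKETCH_CAP_LEN],
                                    unsigned int id)
{
    if (!sketch_clique_id_valid(id)) {
        return -1;
    }
    memset(cap, 0, SKETCH_CAP_LEN);
    cap[0] = SKETCH_CAP_ID_VNDR;    /* cap id (9h) */
    cap[1] = 0;                     /* next (0h): end of list */
    cap[2] = SKETCH_CAP_LEN;        /* vndr len (8h) */
    cap[3] = 'P';                   /* sig 7:0 */
    cap[4] = '2';                   /* sig 15:8 */
    cap[5] = 'P';                   /* sig 23:16 */
    cap[6] = (uint8_t)(id << 3);    /* id in bits 6:3, ver 2:0 = 0 */
    cap[7] = 0;                     /* rsvd */
    return 0;
}
```

For example, clique ID 1 would yield 0x08 in byte 6 under this reading, and ID 16 would be rejected before anything is written.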

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |  110 ++
 hw/vfio/pci.c|5 ++
 hw/vfio/pci.h|3 +
 3 files changed, 118 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 40aaae76feb9..14291c2a16b2 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -14,6 +14,7 @@
 #include "qemu/error-report.h"
 #include "qemu/range.h"
 #include "qapi/error.h"
+#include "qapi/visitor.h"
 #include "hw/nvram/fw_cfg.h"
 #include "pci.h"
 #include "trace.h"
@@ -1850,7 +1851,116 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
 break;
 }
 }
+
+/*
+ * The NVIDIA GPUDirect P2P Vendor capability allows the user to specify
+ * devices as a member of a clique.  Devices within the same clique ID
+ * are capable of direct P2P.  It's the user's responsibility that this
+ * is correct.  The spec says that this may reside at any unused config
+ * offset, but reserves and recommends hypervisors place this at C8h.
+ * The spec also states that the hypervisor should place this capability
+ * at the end of the capability list, thus next is defined as 0h.
+ *
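+ * +----------------+----------------+----------------+----------------+
+ * | sig 7:0 ('P')  |  vndr len (8h) |    next (0h)   |   cap id (9h)  |
+ * +----------------+----------------+----------------+----------------+
+ * | rsvd 15:7(0h),id 6:3,ver 2:0(0h)|          sig 23:8 ('P2')        |
+ * +---------------------------------+---------------------------------+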
+ * https://lists.gnu.org/archive/html/qemu-devel/2017-08/pdfUda5iEpgOS.pdf
+ */
+static void get_nv_gpudirect_clique_id(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+visit_type_uint8(v, name, ptr, errp);
+}
+
+static void set_nv_gpudirect_clique_id(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint8_t value, *ptr = qdev_get_prop_ptr(dev, prop);
+Error *local_err = NULL;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_uint8(v, name, &value, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+if (value & ~0xF) {
+error_setg(errp, "Property %s: valid range 0-15", name);
+return;
+}
+
+*ptr = value;
+}
+
+const PropertyInfo qdev_prop_nv_gpudirect_clique = {
+.name = "uint4",
+.description = "NVIDIA GPUDirect Clique ID (0 - 15)",
+.get = get_nv_gpudirect_clique_id,
+.set = set_nv_gpudirect_clique_id,
+};
+
+static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
+{
+PCIDevice *pdev = &vdev->pdev;
+int ret, pos = 0xC8;
+
+if (vdev->nv_gpudirect_clique == 0xFF) {
+return 0;
+}
+
+if (!vfio_pci_is(vdev, PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID)) {
+error_setg(errp, "NVIDIA GPUDirect Clique ID: invalid device vendor");
+return -EINVAL;
+}
+
+if (pci_get_byte(pdev->config + PCI_CLASS_DEVICE + 1) !=
+PCI_BASE_CLASS_DISPLAY) {
+error_setg(errp, "NVIDIA GPUDirect Clique ID: unsupported PCI class");
+return -EINVAL;
+}
+
+ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
+if (ret < 0) {
+error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
+return ret;
+}
+
+memset(vdev->emulated_config_bits + pos, 0xFF, 8);
+pos += PCI_CAP_FLAGS;
+

[Qemu-devel] [PULL 1/3] vfio/pci: Do not unwind on error

2017-10-03 Thread Alex Williamson

If vfio_add_std_cap() fails, jumping to the out label prepends
irrelevant errors for capabilities we haven't attempted to add as we
unwind our recursive stack.  Just return the error.
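
A toy model of the effect, with made-up names and a fixed-size message buffer (this is not the vfio code). The `unwind` argument selects between the old "goto out" behaviour in the caller and a plain return:

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_ERR_MAX 256

/* Prepend a prefix to an error message held in a fixed-size buffer. */
static void sketch_err_prepend(char *err, const char *prefix)
{
    char tmp[SKETCH_ERR_MAX];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, err);
    snprintf(err, SKETCH_ERR_MAX, "%s", tmp);
}

/*
 * Capabilities 0..ncaps-1 form a chain that is walked recursively to
 * its end before anything is added; capability `bad` fails to add.
 */
static int sketch_add_cap(int pos, int ncaps, int bad, int unwind, char *err)
{
    char prefix[64];
    int ret = 0;

    if (pos + 1 < ncaps) {
        ret = sketch_add_cap(pos + 1, ncaps, bad, unwind, err);
        if (ret) {
            if (!unwind) {
                return ret;         /* new behaviour: just return */
            }
            goto out;               /* old behaviour: fall into prepend */
        }
    }

    if (pos == bad) {
        snprintf(err, SKETCH_ERR_MAX, "simulated failure");
        ret = -1;
    }

out:
    if (ret < 0) {
        snprintf(prefix, sizeof(prefix), "failed to add cap %d: ", pos);
        sketch_err_prepend(err, prefix);
    }
    return ret;
}

/* Count how many "failed to add cap" prefixes ended up in the message. */
static int sketch_count_prefixes(const char *err)
{
    int n = 0;

    while ((err = strstr(err, "failed to add cap")) != NULL) {
        n++;
        err++;
    }
    return n;
}
```

With three chained capabilities and the last one failing, the unwinding variant stacks one prefix per level, including levels that never tried to add anything, while the returning variant reports only the capability that actually failed.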

Fixes: 7ef165b9a8d9 ("vfio/pci: Pass an error object to vfio_add_capabilities")
Signed-off-by: Alex Williamson 
---
 hw/vfio/pci.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 31e1edf44745..916d365dfab3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1826,7 +1826,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
 if (next) {
 ret = vfio_add_std_cap(vdev, next, errp);
 if (ret) {
-goto out;
+return ret;
 }
 } else {
 /* Begin the rebuild, use QEMU emulated list bits */
@@ -1862,7 +1862,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
 ret = pci_add_capability(pdev, cap_id, pos, size, errp);
 break;
 }
-out:
+
 if (ret < 0) {
 error_prepend(errp,
   "failed to add PCI capability 0x%x[0x%x]@0x%x: ",

[Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03

2017-10-03 Thread Alex Williamson

The following changes since commit d147f7e815f97cb477e223586bcb80c316ae10ea:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2017-10-03 16:27:24 +0100)

are available in the git repository at:


  git://github.com/awilliam/qemu-vfio.git tags/vfio-updates-20171003.0

for you to fetch changes up to dfbee78db8fdf7bc8c151c3d29504bb47438480b:

  vfio/pci: Add NVIDIA GPUDirect Cliques support (2017-10-03 12:57:36 -0600)


VFIO updates 2017-10-03

 - NVIDIA GPUDirect Cliques experimental support (Alex Williamson)


Alex Williamson (3):
  vfio/pci: Do not unwind on error
  vfio/pci: Add virtual capabilities quirk infrastructure
  vfio/pci: Add NVIDIA GPUDirect Cliques support

 hw/vfio/pci-quirks.c | 114 +++
 hw/vfio/pci.c|  17 +++-
 hw/vfio/pci.h|   4 ++
 3 files changed, 133 insertions(+), 2 deletions(-)

[Qemu-devel] [PULL 2/3] vfio/pci: Add virtual capabilities quirk infrastructure

2017-10-03 Thread Alex Williamson

If the hypervisor needs to add purely virtual capabilities, give us a
hook through quirks to do that.  Note that we determine the maximum
size for a capability based on the physical device; if we insert a
virtual capability, that can change.  Therefore, if the maximum size is
smaller after adding virtual capabilities, use that.
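
A small sketch of why the size can shrink, using a plain array of capability offsets instead of walking config space. The helper name and the example offsets are hypothetical; only the idea (the room for a capability ends at the nearest capability above it, or at the end of the 256-byte config space) is taken from the description above:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CONFIG_SPACE_SIZE 256

/*
 * Maximum room for the capability at `pos`: the distance to the nearest
 * capability placed above it, or to the end of config space if none.
 * `pos` is assumed to be a real capability offset (non-zero).
 */
static uint8_t sketch_cap_max_size(const uint8_t *caps, size_t ncaps,
                                   uint8_t pos)
{
    uint16_t next = SKETCH_CONFIG_SPACE_SIZE;
    size_t i;

    for (i = 0; i < ncaps; i++) {
        if (caps[i] > pos && caps[i] < next) {
            next = caps[i];
        }
    }
    return (uint8_t)(next - pos);
}
```

For instance, with physical capabilities at 0x40 and 0x60, the one at 0x60 has 0xA0 bytes available; once a virtual capability is placed at 0xC8, that drops to 0x68, which is why the size is re-clamped after the virtual capabilities are added.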

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |4 
 hw/vfio/pci.c|8 
 hw/vfio/pci.h|1 +
 3 files changed, 13 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 349085ea12bc..40aaae76feb9 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1850,3 +1850,7 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
 break;
 }
 }
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
+{
+return 0;
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 916d365dfab3..bfeaaef22d00 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1833,8 +1833,16 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
 pdev->config[PCI_CAPABILITY_LIST] = 0;
 vdev->emulated_config_bits[PCI_CAPABILITY_LIST] = 0xff;
 vdev->emulated_config_bits[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
+
+ret = vfio_add_virt_caps(vdev, errp);
+if (ret) {
+return ret;
+}
 }
 
+/* Scale down size, esp in case virt caps were added above */
+size = MIN(size, vfio_std_cap_max_size(pdev, pos));
+
 /* Use emulated next pointer to allow dropping caps */
 pci_set_byte(vdev->emulated_config_bits + pos + PCI_CAP_LIST_NEXT, 0xff);
 
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index a8366bb2a74a..958cee058b3b 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -160,6 +160,7 @@ void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr);
 void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr);
 void vfio_bar_quirk_finalize(VFIOPCIDevice *vdev, int nr);
 void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev);
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp);
 
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp);

Re: [Qemu-devel] [PATCH v1 4/5] xilinx_zynq: : Specify the valid CPUs

2017-10-03 Thread Philippe Mathieu-Daudé

On 10/03/2017 05:05 PM, Alistair Francis wrote:
> List all possible valid CPU options.
> 
> Signed-off-by: Alistair Francis 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
> 
>  hw/arm/xilinx_zynq.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
> index 1836a4ed45..de1e0bbce1 100644
> --- a/hw/arm/xilinx_zynq.c
> +++ b/hw/arm/xilinx_zynq.c
> @@ -313,6 +313,10 @@ static void zynq_init(MachineState *machine)
>  arm_load_kernel(ARM_CPU(first_cpu), &zynq_binfo);
>  }
>  
> +const char *xlnx_zynq_7000_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a9"),
> +NULL
> +  };
> +
>  static void zynq_machine_init(MachineClass *mc)
>  {
>  mc->desc = "Xilinx Zynq Platform Baseboard for Cortex-A9";
> @@ -321,6 +325,7 @@ static void zynq_machine_init(MachineClass *mc)
>  mc->no_sdcard = 1;
>  mc->ignore_memory_transaction_failures = true;
>  mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a9");
> +mc->valid_cpu_types = xlnx_zynq_7000_valid_cpus;
>  }
>  
>  DEFINE_MACHINE("xilinx-zynq-a9", zynq_machine_init)
>

Re: [Qemu-devel] [PULL 44/50] scripts: let checkpatch.pl process an entire GIT branch

2017-10-03 Thread Alex Williamson

On Tue, 19 Sep 2017 14:29:33 +0200
Paolo Bonzini  wrote:

> From: "Daniel P. Berrange" 
> 
> Currently before submitting a series, devs should run checkpatch.pl
> across each patch to be submitted. This can be automated using a
> command such as:
> 
>   git rebase -i master -x 'git show | ./scripts/checkpatch.pl -'
> 
> This is rather long winded to type, so this patch introduces a way
> to tell checkpatch.pl to validate a series of GIT revisions.
> 
> There are now three modes it can operate in: 1) check a patch, 2) check a
> source file, or 3) check a git branch.
> 
> If no flags are given, the mode is determined by checking the args passed to
> the command. If the args contain a literal ".." it is treated as a GIT
> revision list. If the args end in ".patch" or equal "-" it is treated as a
> patch file. Otherwise it is treated as a source file.
> 
> This automatic guessing can be overridden using --[no-]patch --[no-]file or
> --[no-]branch
> 
> For example to check a GIT revision list:
> 
> $ ./scripts/checkpatch.pl master..
> total: 0 errors, 0 warnings, 297 lines checked
> 
> b886d352a2bf58f0996471fb3991a138373a2957 has no obvious style problems 
> and is ready for submission.
> total: 0 errors, 0 warnings, 182 lines checked
> 
> 2a731f9a9ce145e0e0df6d42dd2a3ce4dfc543fa has no obvious style problems 
> and is ready for submission.
> total: 0 errors, 0 warnings, 102 lines checked
> 
> 11844169bcc0c8ed4449eb3744a69877ed329dd7 has no obvious style problems 
> and is ready for submission.
> 
> If a genuine patch filename contains the characters '..' it is
> possible to force interpretation of the arg as a patch
> 
>   $ ./scripts/checkpatch.pl --patch master..
> 
> will force it to load a patch file called "master..", or equivalently
> 
>   $ ./scripts/checkpatch.pl --no-branch master..
> 
> will simply turn off guessing of GIT revision lists.
> 
> Signed-off-by: Daniel P. Berrange 
> Message-Id: <20170913091000.9005-1-berra...@redhat.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  scripts/checkpatch.pl | 138 
> --
>  1 file changed, 111 insertions(+), 27 deletions(-)


This introduces the following regression for me:

$ ./scripts/checkpatch.pl patches-next/vfio-pci-add-virtual
ERROR: trailing whitespace
#44: FILE: vfio-pci-add-virtual:44:
+ $

ERROR: trailing whitespace
#50: FILE: vfio-pci-add-virtual:50:
+ $

ERROR: trailing whitespace
#60: FILE: vfio-pci-add-virtual:60:
+ $

ERROR: trailing whitespace
#62: FILE: vfio-pci-add-virtual:62:
+ $

total: 4 errors, 0 warnings, 62 lines checked

patches-next/vfio-pci-add-virtual has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

This is reported for the following patch, which contains no trailing
whitespace errors:

https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg05828.html

$ perl -v

This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

Thanks,
Alex
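
One possible reading of the output above: the file name does not end in
".patch", so under the guessing rules in the quoted commit message it would
be handled as a source file rather than a patch, which may be why blank
context lines are flagged. The following is only a minimal C sketch of those
rules as described in the commit message. The real logic is Perl inside
scripts/checkpatch.pl, and the names guess_mode, ends_with and the enum are
hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names; the real script is written in Perl. */
enum check_mode { MODE_BRANCH, MODE_PATCH, MODE_FILE };

/* Return true if s ends with the given suffix. */
static bool ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s);
    size_t lx = strlen(suffix);

    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/*
 * Guess the mode from the first argument, used only when none of
 * --patch, --file or --branch was given on the command line.
 */
static enum check_mode guess_mode(const char *arg)
{
    if (strstr(arg, "..")) {
        return MODE_BRANCH;     /* contains a literal ".." */
    }
    if (ends_with(arg, ".patch") || strcmp(arg, "-") == 0) {
        return MODE_PATCH;      /* "*.patch" or stdin */
    }
    return MODE_FILE;           /* everything else: source file */
}
```

With these rules, guess_mode("patches-next/vfio-pci-add-virtual") yields
MODE_FILE, so passing --patch explicitly would be the way to force patch
mode for such a file name.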

 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index fa47807..28d71b3 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -18,11 +18,12 @@ use Getopt::Long qw(:config no_auto_abbrev);
>  my $quiet = 0;
>  my $tree = 1;
>  my $chk_signoff = 1;
> -my $chk_patch = 1;
> +my $chk_patch = undef;
> +my $chk_branch = undef;
>  my $tst_only;
>  my $emacs = 0;
>  my $terse = 0;
> -my $file = 0;
> +my $file = undef;
>  my $no_warnings = 0;
>  my $summary = 1;
>  my $mailback = 0;
> @@ -35,14 +36,19 @@ sub help {
>   my ($exitcode) = @_;
>  
>   print << "EOM";
> -Usage: $P [OPTION]... [FILE]...
> +Usage:
> +
> +$P [OPTION]... [FILE]...
> +$P [OPTION]... [GIT-REV-LIST]
> +
>  Version: $V
>  
>  Options:
>-q, --quietquiet
>--no-tree  run without a kernel tree
>--no-signoff   do not check for 'Signed-off-by' line
> -  --patchtreat FILE as patchfile (default)
> +  --patchtreat FILE as patchfile
> +  --branch   treat args as GIT revision list
>--emacsemacs compile window format
>--terseone line per report
>-f, --file treat FILE as regular source file
> @@ -69,6 +75,7 @@ GetOptions(
>   'tree!' => \$tree,
>   'signoff!'  => \$chk_signoff,
>   'patch!'=> \$chk_patch,
> + 'branch!'   => \$chk_branch,
>   'emacs!'=> \$emacs,
>   'terse!'=> \$terse,
>   'f|file!'   => \$file,
> @@ -93,6 +100,49 @@ if ($#ARGV < 0) {
>   exit(1);
>  }
>  
> +if (!defined $chk_branch && !defined $chk_patch && !defined $file) {
> + $chk_branch = $ARGV[0] =~ /\.\./ ? 1 : 0;
> + $chk_patch = $chk_branch ? 0 :
> + $ARGV[0] =~ /\.patch$/ || $ARGV[0] eq "-" ? 1 : 0;
> +

Re: [Qemu-devel] [PATCH v1 2/5] netduino2: Specify the valid CPUs

2017-10-03 Thread Philippe Mathieu-Daudé

On 10/03/2017 05:05 PM, Alistair Francis wrote:
> List all possible valid CPU options.
> 
> Although the board only ever has a Cortex-M3 we mark the Cortex-M4 as
> supported because the Netduino2 Plus supports the Cortex-M4 and the
> Netduino2 Plus is similar to the Netduino2.
> 
> Signed-off-by: Alistair Francis 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
> 
> RFC v2:
>  - Use a NULL terminated list
>  - Add the Cortex-M4 for testing
> 
> 
>  hw/arm/netduino2.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/arm/netduino2.c b/hw/arm/netduino2.c
> index f936017d4a..b68ecf2c08 100644
> --- a/hw/arm/netduino2.c
> +++ b/hw/arm/netduino2.c
> @@ -34,18 +34,25 @@ static void netduino2_init(MachineState *machine)
>  DeviceState *dev;
>  
>  dev = qdev_create(NULL, TYPE_STM32F205_SOC);
> -qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
> +qdev_prop_set_string(dev, "cpu-type", machine->cpu_type);
>  object_property_set_bool(OBJECT(dev), true, "realized", &error_fatal);
>  
>  armv7m_load_kernel(ARM_CPU(first_cpu), machine->kernel_filename,
> FLASH_SIZE);
>  }
>  
> +const char *netduino_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-m3"),
> +  ARM_CPU_TYPE_NAME("cortex-m4"),
> +  NULL
> +};
> +
>  static void netduino2_machine_init(MachineClass *mc)
>  {
>  mc->desc = "Netduino 2 Machine";
>  mc->init = netduino2_init;
>  mc->ignore_memory_transaction_failures = true;
> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m3");
> +mc->valid_cpu_types = netduino_valid_cpus;
>  }
>  
>  DEFINE_MACHINE("netduino2", netduino2_machine_init)
>

[Qemu-devel] [ANNOUNCE] QEMU 2.10.1 Stable released

2017-10-03 Thread Michael Roth

Hi everyone,

I am pleased to announce that the QEMU v2.10.1 stable release is now
available.

You can grab the tarball from our download page here:

  https://www.qemu.org/download/#source

v2.10.1 is now tagged in the official qemu.git repository,
and the stable-2.10 branch has been updated accordingly:

  https://git.qemu.org/?p=qemu.git;a=shortlog;h=refs/heads/stable-2.10

This update contains security fixes addressing guest-induced crashing
of host QEMU process (CVE-2017-13672, CVE-2017-13673) and possible
code injection into host QEMU process via a crafted multiboot ELF
kernel when specified directly via QEMU command-line option
(CVE-2017-14167).

There are also the normal range of general fixes. Please see the
changelog for additional details and update accordingly.

Thank you to everyone involved!

CHANGELOG:

7851197b81: Update version for 2.10.1 release (Michael Roth)
547435f550: migration: disable auto-converge during bulk block migration (Peter Lieven)
17cd46fbdf: s390x/cpumodel: remove ais from z14 default model-> also for 2.10.1 (Christian Borntraeger)
6a903482b1: Revert "ACPI: don't call acpi_pcihp_device_plug_cb on xen" (Anthony PERARD)
8edf4c6adc: hw/acpi: Move acpi_set_pci_info to pcihp (Anthony PERARD)
2c3a8cc581: hw/acpi: Limit hotplug to root bus on legacy mode (Anthony PERARD)
0691b70a2a: nbd-client: avoid read_reply_co entry if send failed (Stefan Hajnoczi)
4d824886c8: accel/tcg/cputlb: avoid recursive BQL (fixes #1706296) (Alex Bennée)
780fb4ce48: block/qcow2-bitmap: fix use of uninitialized pointer (Vladimir Sementsov-Ogievskiy)
7496699ba6: block/throttle-groups.c: allocate RestartData on the heap (Manos Pitsidianakis)
33a599667a: osdep: Fix ROUND_UP(64-bit, 32-bit) (Eric Blake)
a432f419ab: s390x/ais: for 2.10 stable: disable ais facility (Christian Borntraeger)
a83858fdb5: 9pfs: check the size of transport buffer before marshaling (Jan Dakinevich)
d13a0bde83: 9pfs: fix name_to_path assertion in v9fs_complete_rename() (Jan Dakinevich)
e90997dc8f: 9pfs: fix readdir() for 9p2000.u (Jan Dakinevich)
7e1288cd0c: console: fix dpy_gfx_replace_surface assert (Gerd Hoffmann)
83b23fe55c: ide: ahci: unparent children buses before freeing their memory (Igor Mammedov)
e96002e0d1: hw/ide/microdrive: Mark the dscm1 device with user_creatable = false (Thomas Huth)
cc7dd3ad3f: hw/arm/aspeed_soc: Mark devices as user_creatable = false (Thomas Huth)
de4ad17a8e: hw/arm/digic: Mark device with user_creatable = false (Thomas Huth)
8a9d7f3063: s390x/ipl: The s390-ipl device is not hot-pluggable (Thomas Huth)
d3f05848fc: watchdog/wdt_diag288: Mark diag288 watchdog as non-hotpluggable (Thomas Huth)
fca5f37fe9: multiboot: validate multiboot header address values (Prasad J Pandit)
2965be1f00: vga: stop passing pointers to vga_draw_line* functions (Gerd Hoffmann)
d6f7f3b0cf: vga: fix display update region calculation (split screen) (Gerd Hoffmann)
2a2eab6660: vhost-user-bridge: fix resume regression (since 2.9) (Marc-André Lureau)
48f65ce837: libvhost-user: support resuming vq->last_avail_idx based on used_idx (Marc-André Lureau)
b95fbe6f12: scsi-bus: correct responses for INQUIRY and REQUEST SENSE (Hannes Reinecke)
b8cd978919: mps2-an511: Fix wiring of UART overflow interrupt lines (Peter Maydell)
b24304ca13: vhost: Release memory references on cleanup (Alex Williamson)
c6841b112e: qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing (Pavel Butsykin)
65a24b5c44: hw/arm/allwinner-a10: Mark the allwinner-a10 device with user_creatable = false (Thomas Huth)
85cdc23e75: arm_gicv3_kvm: Fix compile warning (Pranith Kumar)
168ff32c5d: virtfs: error out gracefully when mandatory suboptions are missing (Greg Kurz)
728bfa3273: target/arm: Fix aa64 ldp register writeback (Richard Henderson)
e1b4750f06: s390-ccw: Fix alignment for CCW1 (Farhan Ali)
53d421dd9c: slirp: fix clearing ifq_so from pending packets (Samuel Thibault)

[Qemu-devel] [PATCH] Revert: checkpatch: check trace-events code style

2017-10-03 Thread Alex Williamson

Commit c3e5875afc0f ("checkpatch: check trace-events code style")
introduces a regression as reported:

https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg05820.html

Bareword found where operator expected at ./scripts/checkpatch.pl line 1350, near "s/($hex[.:\/ ])+$hex//gr"
syntax error at ./scripts/checkpatch.pl line 1350, near "s/($hex[.:\/ ])+$hex//gr"
Execution of ./scripts/checkpatch.pl aborted due to compilation errors.

$ perl -v

This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

As no fix or discussion has resulted, revert the original patch.

Cc: Vladimir Sementsov-Ogievskiy 
Cc: Stefan Hajnoczi 
Fixes: c3e5875afc0f ("checkpatch: check trace-events code style")
Signed-off-by: Alex Williamson 
---
 scripts/checkpatch.pl |   19 ---
 1 file changed, 19 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3c0a28e644aa..f7e785d12a49 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1422,25 +1422,6 @@ sub process {
$rpt_cleaners = 1;
}
 
-# checks for trace-events files
-   if ($realfile =~ /trace-events$/ && $line =~ /^\+/) {
-   if ($rawline =~ /%[-+ 0]*#/) {
-   ERROR("Don't use '#' flag of printf format 
('%#') in " .
- "trace-events, use '0x' prefix instead\n" 
. $herecurr);
-   } else {
-   my $hex =
-   qr/%[-+ 
*.0-9]*([hljztL]|ll|hh)?(x|X|"\s*PRI[xX][^"]*"?)/;
-
-   # don't consider groups splitted by [.:/ ], 
like 2A.20:12ab
-   my $tmpline = $rawline =~ s/($hex[.:\/ 
])+$hex//gr;
-
-   if ($tmpline =~ /(?

Re: [Qemu-devel] [PATCH v1 3/5] xlnx-zcu102: Specify the valid CPUs

2017-10-03 Thread Alistair Francis

On Tue, Oct 3, 2017 at 1:36 PM, Eduardo Habkost  wrote:
> On Tue, Oct 03, 2017 at 01:05:13PM -0700, Alistair Francis wrote:
>> List all possible valid CPU options.
>>
>> Signed-off-by: Alistair Francis 
>> ---
>>
>>  hw/arm/xlnx-zcu102.c | 10 ++
>>  hw/arm/xlnx-zynqmp.c | 16 +---
>>  include/hw/arm/xlnx-zynqmp.h |  1 +
>>  3 files changed, 20 insertions(+), 7 deletions(-)
>>
>> diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c
>> index 519a16ed98..039649e522 100644
>> --- a/hw/arm/xlnx-zcu102.c
>> +++ b/hw/arm/xlnx-zcu102.c
>> @@ -98,6 +98,8 @@ static void xlnx_zynqmp_init(XlnxZCU102 *s, MachineState *machine)
>>  object_property_add_child(OBJECT(machine), "soc", OBJECT(&s->soc),
>>  &error_abort);
>>
>> +object_property_set_str(OBJECT(&s->soc), machine->cpu_type, "cpu-type",
>> +&error_fatal);
>
> Do you have plans to support other CPU types to xlnx_zynqmp in
> the future?  If not, I wouldn't bother adding the cpu-type
> property and the extra boilerplate code if it's always going to
> be set to cortex-a53.

No, it'll always be A53.

I did think of that, but I also wanted to use the new option! I also
think there is an advantage in sanely handling the user's '-cpu' option;
until now we just ignored it, so I think it still gives a benefit.
That'll be especially important on the Xilinx tree (sometimes people use
our machines with a different CPU to 'benchmark' or test other CPUs with
our CoSimulation setup). So I think it makes sense to keep it in.

Thanks,
Alistair
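
The patch quoted in this message moves CPU creation from the instance init
function to realize, because the "cpu-type" property is only set after init
has run. As a rough illustration of that ordering, here is a toy C model. It
does not use the QOM API; struct toy_soc, NUM_CPUS and both functions are
hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_CPUS 4   /* hypothetical, stands in for XLNX_ZYNQMP_NUM_APU_CPUS */

/* Toy model of a device with one string property; not the QOM API. */
struct toy_soc {
    const char *cpu_type;        /* property, set by the board after init */
    const char *cpus[NUM_CPUS];  /* "created" CPUs, recorded by type name */
};

/* instance_init analogue: runs before any property has been set. */
static void toy_soc_init(struct toy_soc *s)
{
    size_t i;

    s->cpu_type = NULL;
    for (i = 0; i < NUM_CPUS; i++) {
        s->cpus[i] = NULL;       /* the CPU type is not known yet */
    }
}

/* realize analogue: properties are final, so the CPUs can be created. */
static bool toy_soc_realize(struct toy_soc *s)
{
    size_t i;

    if (!s->cpu_type) {
        return false;            /* property was never set */
    }
    for (i = 0; i < NUM_CPUS; i++) {
        s->cpus[i] = s->cpu_type;
    }
    return true;
}
```

The board code would call the init analogue first, then set cpu_type, and
only then realize the device, which mirrors the order the patch relies on.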

>
>
>>  object_property_set_link(OBJECT(&s->soc), OBJECT(&s->ddr_ram),
>>   "ddr-ram", &error_abort);
>>  object_property_set_bool(OBJECT(&s->soc), s->secure, "secure",
>> @@ -160,6 +162,10 @@ static void xlnx_zynqmp_init(XlnxZCU102 *s, MachineState *machine)
>>  arm_load_kernel(s->soc.boot_cpu_ptr, &xlnx_zcu102_binfo);
>>  }
>>
>> +const char *xlnx_zynqmp_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a53"),
>> + NULL
>> +   };
>> +
>>  static void xlnx_ep108_init(MachineState *machine)
>>  {
>>  XlnxZCU102 *s = EP108_MACHINE(machine);
>> @@ -185,6 +191,8 @@ static void xlnx_ep108_machine_class_init(ObjectClass *oc, void *data)
>>  mc->block_default_type = IF_IDE;
>>  mc->units_per_default_bus = 1;
>>  mc->ignore_memory_transaction_failures = true;
>> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a53");
>> +mc->valid_cpu_types = xlnx_zynqmp_valid_cpus;
>>  }
>>
>>  static const TypeInfo xlnx_ep108_machine_init_typeinfo = {
>> @@ -240,6 +248,8 @@ static void xlnx_zcu102_machine_class_init(ObjectClass *oc, void *data)
>>  mc->block_default_type = IF_IDE;
>>  mc->units_per_default_bus = 1;
>>  mc->ignore_memory_transaction_failures = true;
>> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a53");
>> +mc->valid_cpu_types = xlnx_zynqmp_valid_cpus;
>>  }
>>
>>  static const TypeInfo xlnx_zcu102_machine_init_typeinfo = {
>> diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
>> index 2b27daf51d..1bff099ec1 100644
>> --- a/hw/arm/xlnx-zynqmp.c
>> +++ b/hw/arm/xlnx-zynqmp.c
>> @@ -133,13 +133,6 @@ static void xlnx_zynqmp_init(Object *obj)
>>  XlnxZynqMPState *s = XLNX_ZYNQMP(obj);
>>  int i;
>>
>> -for (i = 0; i < XLNX_ZYNQMP_NUM_APU_CPUS; i++) {
>> -object_initialize(&s->apu_cpu[i], sizeof(s->apu_cpu[i]),
>> -  "cortex-a53-" TYPE_ARM_CPU);
>> -object_property_add_child(obj, "apu-cpu[*]", OBJECT(&s->apu_cpu[i]),
>> -  &error_abort);
>> -}
>> -
>>  object_initialize(&s->gic, sizeof(s->gic), gic_class_name());
>>  qdev_set_parent_bus(DEVICE(&s->gic), sysbus_get_default());
>>
>> @@ -187,6 +180,14 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
>>  qemu_irq gic_spi[GIC_NUM_SPI_INTR];
>>  Error *err = NULL;
>>
>> +/* We need to do this here to ensure the cpu_type property is set. */
>> +for (i = 0; i < XLNX_ZYNQMP_NUM_APU_CPUS; i++) {
>> +object_initialize(&s->apu_cpu[i], sizeof(s->apu_cpu[i]),
>> +  s->cpu_type);
>> +object_property_add_child(OBJECT(dev), "apu-cpu[*]", OBJECT(&s->apu_cpu[i]),
>> +  &error_abort);
>> +}
>> +
>>  ram_size = memory_region_size(s->ddr_ram);
>>
>>  /* Create the DDR Memory Regions. User friendly checks should happen at
>> @@ -425,6 +426,7 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
>>  }
>>
>>  static Property xlnx_zynqmp_props[] = {
>> +DEFINE_PROP_STRING("cpu-type", XlnxZynqMPState, cpu_type),
>>  DEFINE_PROP_STRING("boot-cpu", XlnxZynqMPState, boot_cpu),
>>  DEFINE_PROP_BOOL("secure", XlnxZynqMPState, secure, false),
>>  DEFINE_PROP_BOOL("virtualization", XlnxZynqMPState, virt, false),
>> diff

Re: [Qemu-devel] [PATCH v1 1/5] machine: Add a valid_cpu_types property

2017-10-03 Thread Alistair Francis

On Tue, Oct 3, 2017 at 1:33 PM, Eduardo Habkost  wrote:
> On Tue, Oct 03, 2017 at 01:26:53PM -0700, Alistair Francis wrote:
>> On Tue, Oct 3, 2017 at 1:23 PM, Eduardo Habkost  wrote:
>> > On Tue, Oct 03, 2017 at 01:05:09PM -0700, Alistair Francis wrote:
>> >> This patch adds a MachineClass element that can be set in the machine C
>> >> code to specify a list of supported CPU types. If the supported CPU
>> >> types are specified, the CPU entered by the user (with -cpu at runtime)
>> >> is checked against the supported types and QEMU exits if it isn't
>> >> supported.
>> >>
>> >> Signed-off-by: Alistair Francis 
>> >> ---
>> >
>> > Thanks!
>> >
>> > Reviewed-by: Eduardo Habkost 
>> >
>> > However, I will squash the following changes before queueing,
>> > because:
>> >
>> > * object_class_dynamic_cast() is safe even if class is NULL,
>> >   so there's no need to validate cpu_type here.
>> > * "must not be valid" sounds like the CPU is not allowed to be a
>> >   valid CPU, so I rewrote the comment.
>> >
>> >
>> > diff --git a/hw/core/machine.c b/hw/core/machine.c
>> > index 3afc6a7b5b..36c2fb069c 100644
>> > --- a/hw/core/machine.c
>> > +++ b/hw/core/machine.c
>> > @@ -766,9 +766,6 @@ void machine_run_board_init(MachineState *machine)
>> >  ObjectClass *class = object_class_by_name(machine->cpu_type);
>> >  int i;
>> >
>> > -/* machine->cpu_type is supposed to be always a valid QOM type */
>> > -assert(class);
>> > -
>> >  for (i = 0; machine_class->valid_cpu_types[i]; i++) {
>> >  if (object_class_dynamic_cast(class,
>> >
>> > machine_class->valid_cpu_types[i])) {
>> > @@ -780,7 +777,7 @@ void machine_run_board_init(MachineState *machine)
>> >  }
>> >
>> >  if (!machine_class->valid_cpu_types[i]) {
>> > -/* The user specified CPU must not be a valid CPU */
>> > +/* The user specified CPU is not valid */
>> >  error_report("Invalid CPU type: %s", machine->cpu_type);
>> >  error_printf("The valid types are: %s",
>> >   machine_class->valid_cpu_types[0]);
>>
>> Looks good to me.
>>
>> Does that mean you are taking the whole series now?
>
> I was planning to take only patch 1/5, but I can take the whole
> series if I get an Acked-by from the corresponding maintainers.

Most of them are maintained by me. Just getting patch 1 in is the most
important part though (then others can use it). So if you want to just
take it by itself that's fine with me.

Thanks,
Alistair
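
For readers following along, the check discussed in this thread walks a
NULL-terminated list of type names. Below is a simplified C sketch of that
idea. It is an assumption-laden stand-in, not the code in hw/core/machine.c:
plain strcmp() replaces object_class_dynamic_cast(), so subclass
relationships between QOM types are not modelled, treating a missing list as
"no restriction" is an assumption, and the function name and example strings
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for the valid_cpu_types check.
 * Assumption: exact string match instead of a QOM dynamic cast.
 */
static bool cpu_type_is_valid(const char *const *valid_cpu_types,
                              const char *cpu_type)
{
    size_t i;

    if (!valid_cpu_types) {
        return true;    /* assumed: board did not restrict the CPU type */
    }
    for (i = 0; valid_cpu_types[i]; i++) {
        if (strcmp(valid_cpu_types[i], cpu_type) == 0) {
            return true;
        }
    }
    return false;       /* reached the NULL terminator without a match */
}

/* Example list in the style of the board patches; strings are illustrative. */
static const char *const example_valid_cpus[] = {
    "cortex-m3-arm-cpu",
    "cortex-m4-arm-cpu",
    NULL
};
```

A caller would report the invalid type and list the accepted ones when the
function returns false, as the error_report()/error_printf() calls in the
quoted hunk do.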

>
> --
> Eduardo
>

Re: [Qemu-devel] [PATCH v1 5/5] raspi: : Specify the valid CPUs

2017-10-03 Thread Alistair Francis

On Tue, Oct 3, 2017 at 1:39 PM, Eduardo Habkost  wrote:
> On Tue, Oct 03, 2017 at 01:05:18PM -0700, Alistair Francis wrote:
>> List all possible valid CPU options.
>>
>> Signed-off-by: Alistair Francis 
>> ---
>>
>>  hw/arm/raspi.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
>> index 5941c9f751..555db0f258 100644
>> --- a/hw/arm/raspi.c
>> +++ b/hw/arm/raspi.c
>> @@ -158,6 +158,10 @@ static void raspi2_init(MachineState *machine)
>>  setup_boot(machine, 2, machine->ram_size - vcram_size);
>>  }
>>
>> +const char *raspi2_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a7"),
>> +NULL
>> +  };
>> +
>>  static void raspi2_machine_init(MachineClass *mc)
>>  {
>>  mc->desc = "Raspberry Pi 2";
>> @@ -169,5 +173,7 @@ static void raspi2_machine_init(MachineClass *mc)
>>  mc->max_cpus = BCM2836_NCPUS;
>>  mc->default_ram_size = 1024 * 1024 * 1024;
>>  mc->ignore_memory_transaction_failures = true;
>> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a7");
>> +mc->valid_cpu_types = raspi2_valid_cpus;
>
> I'm confused: bcm2836_init() is hardcoded to cortex-a15, not
> cortex-a7.

Odd. I just looked up the Raspberry Pi 2 and it says a Cortex-A7:
https://www.raspberrypi.org/products/raspberry-pi-2-model-b/

Thanks,
Alistair

>
>>  };
>>  DEFINE_MACHINE("raspi2", raspi2_machine_init)
>> --
>> 2.11.0
>>
>
> --
> Eduardo

Re: [Qemu-devel] [PATCH v2 2/2] qemu-options: Deprecate -nodefconfig

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 06:17:30PM -0300, Eduardo Habkost wrote:
> Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
> have no default config files that would be disabled using
> -nodefconfig.  Update documentation and document -nodefconfig as
> deprecated.
> 
> Cc: Markus Armbruster 
> Acked-by: Alistair Francis 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> * Document at "Deprecated features" section in qemu-doc.texi
>   (Daniel)
> * Remove documentation for the option from qemu-options.hx
>   (Markus)
> ---
>  qemu-doc.texi   |  4 
>  qemu-options.hx | 17 -
>  2 files changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index ecd186a159..039abe0cfe 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -2533,6 +2533,10 @@ or ``ivshmem-doorbell`` device types.
>  The ``spapr-pci-vfio-host-bridge'' device type is replaced by
>  the ``spapr-pci-host-bridge'' device type.
>  
> +@subsection -nodefconfig (since 2.11.0)
> +
> +The ``-nodefconfig`` argument is a synonym for ``-no-user-config``.
> +

This is in the wrong section.  I will submit v3 fixing that.
Sorry for the noise.

>  @node License
>  @appendix License
>  

-- 
Eduardo

Re: [Qemu-devel] [PATCH v10 8/9] tpm: Added support for TPM emulator

2017-10-03 Thread Stefan Berger


On 09/29/2017 07:10 AM, Amarnath Valluri wrote:

This change introduces a new TPM backend driver that can communicate with
swtpm (software TPM emulator) using a Unix domain socket interface. QEMU talks
to the TPM emulator using QEMU's socket-based chardev backend device.

Swtpm uses two Unix sockets for communication, one for plain TPM commands and
responses, and one for out-of-band control messages. QEMU passes the data
socket to be used over the control channel.

The swtpm and associated tools can be found here:
 https://github.com/stefanberger/swtpm

The swtpm's control channel protocol specification can be found here:
 https://github.com/stefanberger/swtpm/wiki/Control-Channel-Specification

Usage:
 # setup TPM state directory
 mkdir /tmp/mytpm
 chown -R tss:root /tmp/mytpm
 /usr/bin/swtpm_setup --tpm-state /tmp/mytpm --createek


To run this, one needs the latest version of swtpm that supports the 
file descriptor passing.

Then one can start the swtpm like this:

swtpm socket --tpmstate dir=/tmp/mytpm --ctrl type=unixio,path=/tmp/swtpm-sock --log level=20


I tested the SeaBIOS menu items so far and that works fine, also for 
TPM2 (--tpm2).


Cheers!
 Stefan




 # Ask qemu to use TPM emulator with given tpm state directory
 qemu-system-x86_64 \
 [...] \
 -chardev socket,id=chrtpm,path=/tmp/swtpm-sock \
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device tpm-tis,tpmdev=tpm0 \
 [...]

Signed-off-by: Amarnath Valluri 
---
  configure |  13 +-
  hmp.c |   5 +
  hw/tpm/Makefile.objs  |   1 +
  hw/tpm/tpm_emulator.c | 587 ++
  hw/tpm/tpm_ioctl.h| 246 +
  qapi/tpm.json |  21 +-
  qemu-options.hx   |  22 +-
  7 files changed, 888 insertions(+), 7 deletions(-)
  create mode 100644 hw/tpm/tpm_emulator.c
  create mode 100644 hw/tpm/tpm_ioctl.h

diff --git a/configure b/configure
index cb0f7ed..a1b956e 100755
--- a/configure
+++ b/configure
@@ -3467,6 +3467,12 @@ else
tpm_passthrough=no
  fi

+# TPM emulator is for all posix systems
+if test "$mingw32" != "yes"; then
+  tpm_emulator=$tpm
+else
+  tpm_emulator=no
+fi
  ##
  # attr probe

@@ -5359,6 +5365,7 @@ echo "gcov enabled  $gcov"
  echo "TPM support   $tpm"
  echo "libssh2 support   $libssh2"
  echo "TPM passthrough   $tpm_passthrough"
+echo "TPM emulator  $tpm_emulator"
  echo "QOM debugging $qom_cast_debug"
  echo "Live block migration $live_block_migration"
  echo "lzo support   $lzo"
@@ -5937,12 +5944,16 @@ if test "$live_block_migration" = "yes" ; then
echo "CONFIG_LIVE_BLOCK_MIGRATION=y" >> $config_host_mak
  fi

-# TPM passthrough support?
  if test "$tpm" = "yes"; then
echo 'CONFIG_TPM=$(CONFIG_SOFTMMU)' >> $config_host_mak
+  # TPM passthrough support?
if test "$tpm_passthrough" = "yes"; then
  echo "CONFIG_TPM_PASSTHROUGH=y" >> $config_host_mak
fi
+  # TPM emulator support?
+  if test "$tpm_emulator" = "yes"; then
+echo "CONFIG_TPM_EMULATOR=y" >> $config_host_mak
+  fi
  fi

  echo "TRACE_BACKENDS=$trace_backends" >> $config_host_mak
diff --git a/hmp.c b/hmp.c
index 0fb2bc7..9cd8179 100644
--- a/hmp.c
+++ b/hmp.c
@@ -994,6 +994,7 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
  Error *err = NULL;
  unsigned int c = 0;
  TPMPassthroughOptions *tpo;
+TPMEmulatorOptions *teo;

  info_list = qmp_query_tpm();
  if (err) {
@@ -1023,6 +1024,10 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
 tpo->has_cancel_path ? ",cancel-path=" : "",
 tpo->has_cancel_path ? tpo->cancel_path : "");
  break;
+case TPM_TYPE_OPTIONS_KIND_EMULATOR:
+teo = ti->options->u.emulator.data;
+monitor_printf(mon, ",chardev=%s", teo->chardev);
+break;
  case TPM_TYPE_OPTIONS_KIND__MAX:
  break;
  }
diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index 64cecc3..41f0b7a 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -1,2 +1,3 @@
  common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
  common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o tpm_util.o
+common-obj-$(CONFIG_TPM_EMULATOR) += tpm_emulator.o tpm_util.o
diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
new file mode 100644
index 000..5ddd723
--- /dev/null
+++ b/hw/tpm/tpm_emulator.c
@@ -0,0 +1,587 @@
+/*
+ *  Emulator TPM driver
+ *
+ *  Copyright (c) 2017 Intel Corporation
+ *  Author: Amarnath Valluri 
+ *
+ *  Copyright (c) 2010 - 2013 IBM Corporation
+ *  Authors:
+ *Stefan Berger 
+ *
+ *  Copyright (C) 2011 IAIK, Graz University of Technology
+ *Author: Andreas Niederl
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms

Re: [Qemu-devel] [PATCH v5 6/6] tests: Add check-qobject for equality tests

2017-10-03 Thread Max Reitz

On 2017-10-02 15:34, Markus Armbruster wrote:
> Max Reitz  writes:
> 
>> Add a new test file (check-qobject.c) for unit tests that concern
>> QObjects as a whole.
>>
>> Its only purpose for now is to test the qobject_is_equal() function.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/Makefile.include |   4 +-
>>  tests/check-qobject.c  | 315 
>> +
>>  2 files changed, 318 insertions(+), 1 deletion(-)
>>  create mode 100644 tests/check-qobject.c
>>
>> diff --git a/tests/Makefile.include b/tests/Makefile.include
>> index fae5715e9c..40b8b4e98f 100644
>> --- a/tests/Makefile.include
>> +++ b/tests/Makefile.include
>> @@ -41,6 +41,7 @@ check-unit-y += tests/check-qlist$(EXESUF)
>>  gcov-files-check-qlist-y = qobject/qlist.c
>>  check-unit-y += tests/check-qnull$(EXESUF)
>>  gcov-files-check-qnull-y = qobject/qnull.c
>> +check-unit-y += tests/check-qobject$(EXESUF)
>>  check-unit-y += tests/check-qjson$(EXESUF)
>>  gcov-files-check-qjson-y = qobject/qjson.c
>>  check-unit-y += tests/check-qlit$(EXESUF)
>> @@ -542,7 +543,7 @@ GENERATED_FILES += tests/test-qapi-types.h 
>> tests/test-qapi-visit.h \
>>  tests/test-qmp-introspect.h
>>  
>>  test-obj-y = tests/check-qnum.o tests/check-qstring.o tests/check-qdict.o \
>> -tests/check-qlist.o tests/check-qnull.o \
>> +tests/check-qlist.o tests/check-qnull.o tests/check-qobject.o \
>>  tests/check-qjson.o tests/check-qlit.o \
>>  tests/test-coroutine.o tests/test-string-output-visitor.o \
>>  tests/test-string-input-visitor.o tests/test-qobject-output-visitor.o \
>> @@ -576,6 +577,7 @@ tests/check-qstring$(EXESUF): tests/check-qstring.o 
>> $(test-util-obj-y)
>>  tests/check-qdict$(EXESUF): tests/check-qdict.o $(test-util-obj-y)
>>  tests/check-qlist$(EXESUF): tests/check-qlist.o $(test-util-obj-y)
>>  tests/check-qnull$(EXESUF): tests/check-qnull.o $(test-util-obj-y)
>> +tests/check-qobject$(EXESUF): tests/check-qobject.o $(test-util-obj-y)
>>  tests/check-qjson$(EXESUF): tests/check-qjson.o $(test-util-obj-y)
>>  tests/check-qlit$(EXESUF): tests/check-qlit.o $(test-util-obj-y)
>>  tests/check-qom-interface$(EXESUF): tests/check-qom-interface.o 
>> $(test-qom-obj-y)
>> diff --git a/tests/check-qobject.c b/tests/check-qobject.c
>> new file mode 100644
>> index 00..8f1b5550c2
>> --- /dev/null
>> +++ b/tests/check-qobject.c
>> @@ -0,0 +1,315 @@
>> +/*
>> + * Generic QObject unit-tests.
>> + *
>> + * Copyright (C) 2017 Red Hat Inc.
>> + *
>> + * This work is licensed under the terms of the GNU LGPL, version 2.1 or 
>> later.
>> + * See the COPYING.LIB file in the top-level directory.
>> + */
>> +#include "qemu/osdep.h"
>> +
>> +#include "qapi/qmp/types.h"
>> +#include "qemu-common.h"
>> +
>> +#include 
>> +
>> +/* Marks the end of the test_equality() argument list.
>> + * We cannot use NULL there because that is a valid argument. */
>> +static QObject _test_equality_end_of_arguments;
> 
> Reserved identifier.  Please scratch the leading underscore.

OK.

> Also: ugh!  I would've tried arrays just to avoid this ugliness.  But
> since you've written it, and it works...

*cough*
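
For comparison, the array-based alternative mentioned above could look
roughly like the sketch below. It is not the code from the patch: it uses
generic pointers and a caller-supplied comparison instead of QObject and
qobject_is_equal(), and the names check_equality, equal_fn and int_equal are
hypothetical. An explicit element count replaces the sentinel argument.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*equal_fn)(const void *a, const void *b);

/*
 * Compare every element with itself (reflexivity) and with every other
 * element in both directions (symmetry). Returns true only if each
 * self-comparison is true and each cross-comparison equals expected.
 */
static bool check_equality(bool expected, const void *const args[],
                           size_t n, equal_fn eq)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        if (!eq(args[i], args[i])) {
            return false;
        }
        for (j = 0; j < n; j++) {
            if (j != i && eq(args[i], args[j]) != expected) {
                return false;
            }
        }
    }
    return true;
}

/* Toy comparison for illustration only: compares two ints by value. */
static bool int_equal(const void *a, const void *b)
{
    return *(const int *)a == *(const int *)b;
}
```

Because the length is passed explicitly, a NULL element remains a valid
argument and no reserved sentinel object is needed.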

>> +
>> +/**
>> + * Test whether all variadic QObject *arguments are equal (@expected
>> + * is true) or whether they are all not equal (@expected is false).
>> + * Every QObject is tested to be equal to itself (to test
>> + * reflexivity), all tests are done both ways (to test symmetry), and
>> + * transitivity is not assumed but checked (each object is compared to
>> + * every other one).
>> + *
>> + * Note that qobject_is_equal() is not really an equivalence relation,
>> + * so this function may not be used for all objects (reflexivity is
>> + * not guaranteed, e.g. in the case of a QNum containing NaN).
>> + */
>> +static void do_test_equality(bool expected, ...)
>> +{
>> +va_list ap_count, ap_extract;
>> +QObject **args;
>> +int arg_count = 0;
>> +int i, j;
>> +
>> +va_start(ap_count, expected);
>> +va_copy(ap_extract, ap_count);
>> +while (va_arg(ap_count, QObject *) != &_test_equality_end_of_arguments) 
>> {
>> +arg_count++;
>> +}
>> +va_end(ap_count);
>> +
>> +args = g_new(QObject *, arg_count);
>> +for (i = 0; i < arg_count; i++) {
>> +args[i] = va_arg(ap_extract, QObject *);
>> +}
>> +va_end(ap_extract);
>> +
>> +for (i = 0; i < arg_count; i++) {
>> +g_assert(qobject_is_equal(args[i], args[i]) == true);
>> +
>> +for (j = i + 1; j < arg_count; j++) {
>> +g_assert(qobject_is_equal(args[i], args[j]) == expected);
>> +}
>> +}
>> +}
>> +
>> +#define test_equality(expected, ...) \
>> +do_test_equality(expected, __VA_ARGS__, 
>> &_test_equality_end_of_arguments)
>> +
>> +static void do_free_all(int _, ...)
>> +{
>> +va_list ap;
>> +QObject *obj;
>> +
>> +va_start(ap, _);
>> +while ((obj = va_arg(ap, QObject *)) != NULL) {
>> +
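
The sentinel trick in the quoted patch can be shown in isolation. What follows is only a minimal sketch, not the code from the series: Item, end_of_arguments, do_count_items() and count_items() are hypothetical names, and Item merely stands in for QObject. The list is terminated by the address of a private static object, so NULL remains usable as an ordinary argument.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for QObject; only its address matters here. */
typedef struct Item {
    int value;
} Item;

/* Unique address marking the end of the argument list.
 * NULL cannot be the terminator because NULL is a valid argument. */
static Item end_of_arguments;

/* Count the arguments up to, but not including, the sentinel. */
static int do_count_items(int unused, ...)
{
    va_list ap;
    int count = 0;

    (void)unused;
    va_start(ap, unused);
    while (va_arg(ap, Item *) != &end_of_arguments) {
        count++;
    }
    va_end(ap);
    return count;
}

/* The macro appends the sentinel so callers cannot forget it. */
#define count_items(...) \
    do_count_items(0, __VA_ARGS__, &end_of_arguments)
```

A caller would write count_items(&a, (Item *)NULL, &b); the explicit cast keeps the NULL argument a pointer of the type that va_arg() reads back.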

[Qemu-devel] [PATCH v2 2/2] qemu-options: Deprecate -nodefconfig

2017-10-03 Thread Eduardo Habkost

Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
have no default config files that would be disabled using
-nodefconfig.  Update documentation and document -nodefconfig as
deprecated.

Cc: Markus Armbruster 
Acked-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Document at "Deprecated features" section in qemu-doc.texi
  (Daniel)
* Remove documentation for the option from qemu-options.hx
  (Markus)
---
 qemu-doc.texi   |  4 
 qemu-options.hx | 17 -
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index ecd186a159..039abe0cfe 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2533,6 +2533,10 @@ or ``ivshmem-doorbell`` device types.
 The ``spapr-pci-vfio-host-bridge'' device type is replaced by
 the ``spapr-pci-host-bridge'' device type.
 
+@subsection -nodefconfig (since 2.11.0)
+
+The ``-nodefconfig`` argument is a synonym for ``-no-user-config``.
+
 @node License
 @appendix License
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 39225ae6c3..981742d191 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4067,26 +4067,17 @@ Write device configuration to @var{file}. The @var{file} can be either filename
 command line and device configuration into file or dash @code{-}) character to print the
 output to stdout. This can be later used as input file for @code{-readconfig} option.
 ETEXI
-DEF("nodefconfig", 0, QEMU_OPTION_nodefconfig,
-"-nodefconfig\n"
-"do not load default config files at startup\n",
-QEMU_ARCH_ALL)
-STEXI
-@item -nodefconfig
-@findex -nodefconfig
-Normally QEMU loads configuration files from @var{sysconfdir} and @var{datadir} at startup.
-The @code{-nodefconfig} option will prevent QEMU from loading any of those config files.
-ETEXI
+HXCOMM Deprecated, same as -no-user-config
+DEF("nodefconfig", 0, QEMU_OPTION_nodefconfig, "", QEMU_ARCH_ALL)
 DEF("no-user-config", 0, QEMU_OPTION_nouserconfig,
 "-no-user-config\n"
-"do not load user-provided config files at startup\n",
+"do not load default user-provided config files at startup\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -no-user-config
 @findex -no-user-config
 The @code{-no-user-config} option makes QEMU not load any of the user-provided
-config files on @var{sysconfdir}, but won't make it skip the QEMU-provided config
-files from @var{datadir}.
+config files on @var{sysconfdir}.
 ETEXI
 DEF("trace", HAS_ARG, QEMU_OPTION_trace,
 "-trace [[enable=]<pattern>][,events=<file>][,file=<file>]\n"
-- 
2.13.5

[Qemu-devel] [PATCH v2 0/2] Deprecate -nodefconfig

2017-10-03 Thread Eduardo Habkost

Changes v1 -> v2:
* Document at "Deprecated features" section in qemu-doc.texi
  (Daniel)
* Remove documentation for the option from qemu-options.hx
  (Markus)

Since 2012 (commit ba6212d8 "Eliminate cpus-x86_64.conf file") we
have no default config files that would be disabled using
-nodefconfig.  This series cleans up the code, updates
documentation, and documents -nodefconfig as deprecated.

Eduardo Habkost (2):
  vl: Eliminate defconfig variable
  qemu-options: Deprecate -nodefconfig

 vl.c|  5 +
 qemu-doc.texi   |  4 
 qemu-options.hx | 17 -
 3 files changed, 9 insertions(+), 17 deletions(-)

-- 
2.13.5

[Qemu-devel] [PATCH v2 1/2] vl: Eliminate defconfig variable

2017-10-03 Thread Eduardo Habkost

Both -nodefconfig and -no-user-config options do the same thing
today, so we only need one variable to keep track of them.

Suggested-by: Markus Armbruster 
Acked-by: Alistair Francis 
Reviewed-by: Markus Armbruster 
Signed-off-by: Eduardo Habkost 
---
 vl.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index 3fed457921..ebea42e0ea 100644
--- a/vl.c
+++ b/vl.c
@@ -3111,7 +3111,6 @@ int main(int argc, char **argv, char **envp)
 const char *qtest_log = NULL;
 const char *pid_file = NULL;
 const char *incoming = NULL;
-bool defconfig = true;
 bool userconfig = true;
 bool nographic = false;
 DisplayType display_type = DT_DEFAULT;
@@ -3213,8 +3212,6 @@ int main(int argc, char **argv, char **envp)
 popt = lookup_opt(argc, argv, &optarg, &optind);
 switch (popt->index) {
 case QEMU_OPTION_nodefconfig:
-defconfig = false;
-break;
 case QEMU_OPTION_nouserconfig:
 userconfig = false;
 break;
@@ -3222,7 +3219,7 @@ int main(int argc, char **argv, char **envp)
 }
 }
 
-if (defconfig && userconfig) {
+if (userconfig) {
 if (qemu_read_default_config_file() < 0) {
 exit(1);
 }
-- 
2.13.5
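
The effect of the vl.c hunk above is that the deprecated option falls through to the same case body as -no-user-config. A minimal standalone sketch of that pattern, assuming hypothetical names (enum option_index and update_userconfig() are not QEMU identifiers):

```c
#include <stdbool.h>

/* Hypothetical option identifiers, loosely modelled on QEMU_OPTION_*. */
enum option_index {
    OPTION_NODEFCONFIG,
    OPTION_NOUSERCONFIG,
    OPTION_OTHER
};

/* Return the new value of the userconfig flag after seeing option opt. */
static bool update_userconfig(bool userconfig, enum option_index opt)
{
    switch (opt) {
    case OPTION_NODEFCONFIG:    /* deprecated synonym, shares the body below */
    case OPTION_NOUSERCONFIG:
        userconfig = false;
        break;
    default:
        break;                  /* unrelated options leave the flag alone */
    }
    return userconfig;
}
```

With both labels on one body, a single flag is enough to decide whether the default config file gets read.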

Re: [Qemu-devel] [PATCH v1 5/5] raspi: : Specify the valid CPUs

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 01:05:18PM -0700, Alistair Francis wrote:
> List all possible valid CPU options.
> 
> Signed-off-by: Alistair Francis 
> ---
> 
>  hw/arm/raspi.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
> index 5941c9f751..555db0f258 100644
> --- a/hw/arm/raspi.c
> +++ b/hw/arm/raspi.c
> @@ -158,6 +158,10 @@ static void raspi2_init(MachineState *machine)
>  setup_boot(machine, 2, machine->ram_size - vcram_size);
>  }
>  
> +const char *raspi2_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a7"),
> +NULL
> +  };
> +
>  static void raspi2_machine_init(MachineClass *mc)
>  {
>  mc->desc = "Raspberry Pi 2";
> @@ -169,5 +173,7 @@ static void raspi2_machine_init(MachineClass *mc)
>  mc->max_cpus = BCM2836_NCPUS;
>  mc->default_ram_size = 1024 * 1024 * 1024;
>  mc->ignore_memory_transaction_failures = true;
> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a7");
> +mc->valid_cpu_types = raspi2_valid_cpus;

I'm confused: bcm2836_init() is hardcoded to cortex-a15, not
cortex-a7.

>  };
>  DEFINE_MACHINE("raspi2", raspi2_machine_init)
> -- 
> 2.11.0
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v1 3/5] xlnx-zcu102: Specify the valid CPUs

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 01:05:13PM -0700, Alistair Francis wrote:
> List all possible valid CPU options.
> 
> Signed-off-by: Alistair Francis 
> ---
> 
>  hw/arm/xlnx-zcu102.c | 10 ++
>  hw/arm/xlnx-zynqmp.c | 16 +---
>  include/hw/arm/xlnx-zynqmp.h |  1 +
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c
> index 519a16ed98..039649e522 100644
> --- a/hw/arm/xlnx-zcu102.c
> +++ b/hw/arm/xlnx-zcu102.c
> @@ -98,6 +98,8 @@ static void xlnx_zynqmp_init(XlnxZCU102 *s, MachineState 
> *machine)
>  object_property_add_child(OBJECT(machine), "soc", OBJECT(&s->soc),
>  &error_abort);
>  
> +object_property_set_str(OBJECT(&s->soc), machine->cpu_type, "cpu-type",
> +&error_fatal);

Do you have plans to support other CPU types to xlnx_zynqmp in
the future?  If not, I wouldn't bother adding the cpu-type
property and the extra boilerplate code if it's always going to
be set to cortex-a53.


>  object_property_set_link(OBJECT(&s->soc), OBJECT(&s->ddr_ram),
>   "ddr-ram", &error_abort);
>  object_property_set_bool(OBJECT(&s->soc), s->secure, "secure",
> @@ -160,6 +162,10 @@ static void xlnx_zynqmp_init(XlnxZCU102 *s, MachineState 
> *machine)
>  arm_load_kernel(s->soc.boot_cpu_ptr, &xlnx_zcu102_binfo);
>  }
>  
> +const char *xlnx_zynqmp_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a53"),
> + NULL
> +   };
> +
>  static void xlnx_ep108_init(MachineState *machine)
>  {
>  XlnxZCU102 *s = EP108_MACHINE(machine);
> @@ -185,6 +191,8 @@ static void xlnx_ep108_machine_class_init(ObjectClass 
> *oc, void *data)
>  mc->block_default_type = IF_IDE;
>  mc->units_per_default_bus = 1;
>  mc->ignore_memory_transaction_failures = true;
> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a53");
> +mc->valid_cpu_types = xlnx_zynqmp_valid_cpus;
>  }
>  
>  static const TypeInfo xlnx_ep108_machine_init_typeinfo = {
> @@ -240,6 +248,8 @@ static void xlnx_zcu102_machine_class_init(ObjectClass 
> *oc, void *data)
>  mc->block_default_type = IF_IDE;
>  mc->units_per_default_bus = 1;
>  mc->ignore_memory_transaction_failures = true;
> +mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a53");
> +mc->valid_cpu_types = xlnx_zynqmp_valid_cpus;
>  }
>  
>  static const TypeInfo xlnx_zcu102_machine_init_typeinfo = {
> diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
> index 2b27daf51d..1bff099ec1 100644
> --- a/hw/arm/xlnx-zynqmp.c
> +++ b/hw/arm/xlnx-zynqmp.c
> @@ -133,13 +133,6 @@ static void xlnx_zynqmp_init(Object *obj)
>  XlnxZynqMPState *s = XLNX_ZYNQMP(obj);
>  int i;
>  
> -for (i = 0; i < XLNX_ZYNQMP_NUM_APU_CPUS; i++) {
> -object_initialize(&s->apu_cpu[i], sizeof(s->apu_cpu[i]),
> -  "cortex-a53-" TYPE_ARM_CPU);
> -object_property_add_child(obj, "apu-cpu[*]", OBJECT(&s->apu_cpu[i]),
> -  &error_abort);
> -}
> -
>  object_initialize(&s->gic, sizeof(s->gic), gic_class_name());
>  qdev_set_parent_bus(DEVICE(&s->gic), sysbus_get_default());
>  
> @@ -187,6 +180,14 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error 
> **errp)
>  qemu_irq gic_spi[GIC_NUM_SPI_INTR];
>  Error *err = NULL;
>  
> +/* We need to do this here to ensure the cpu_type property is set. */
> +for (i = 0; i < XLNX_ZYNQMP_NUM_APU_CPUS; i++) {
> +object_initialize(&s->apu_cpu[i], sizeof(s->apu_cpu[i]),
> +  s->cpu_type);
> +object_property_add_child(OBJECT(dev), "apu-cpu[*]", OBJECT(&s->apu_cpu[i]),
> +  &error_abort);
> +}
> +
>  ram_size = memory_region_size(s->ddr_ram);
>  
>  /* Create the DDR Memory Regions. User friendly checks should happen at
> @@ -425,6 +426,7 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error 
> **errp)
>  }
>  
>  static Property xlnx_zynqmp_props[] = {
> +DEFINE_PROP_STRING("cpu-type", XlnxZynqMPState, cpu_type),
>  DEFINE_PROP_STRING("boot-cpu", XlnxZynqMPState, boot_cpu),
>  DEFINE_PROP_BOOL("secure", XlnxZynqMPState, secure, false),
>  DEFINE_PROP_BOOL("virtualization", XlnxZynqMPState, virt, false),
> diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
> index 6eff81a995..5afb8de11e 100644
> --- a/include/hw/arm/xlnx-zynqmp.h
> +++ b/include/hw/arm/xlnx-zynqmp.h
> @@ -86,6 +86,7 @@ typedef struct XlnxZynqMPState {
>  XlnxDPState dp;
>  XlnxDPDMAState dpdma;
>  
> +char *cpu_type;
>  char *boot_cpu;
>  ARMCPU *boot_cpu_ptr;
>  
> -- 
> 2.11.0
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v1 1/5] machine: Add a valid_cpu_types property

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 01:26:53PM -0700, Alistair Francis wrote:
> On Tue, Oct 3, 2017 at 1:23 PM, Eduardo Habkost  wrote:
> > On Tue, Oct 03, 2017 at 01:05:09PM -0700, Alistair Francis wrote:
> > >> This patch adds a MachineClass element that can be set in the machine C
> > >> code to specify a list of supported CPU types. If the supported CPU
> > >> types are specified, the CPU entered by the user (with -cpu at runtime)
> > >> is checked against the supported types, and QEMU exits if it isn't
> > >> supported.
> >>
> >> Signed-off-by: Alistair Francis 
> >> ---
> >
> > Thanks!
> >
> > Reviewed-by: Eduardo Habkost 
> >
> > However, I will squash the following changes before queueing,
> > because:
> >
> > * object_class_dynamic_cast() is safe even if class is NULL,
> >   so there's no need to validate cpu_type here.
> > * "must not be valid" sounds like the CPU is not allowed to be a
> >   valid CPU, so I rewrote the comment.
> >
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 3afc6a7b5b..36c2fb069c 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -766,9 +766,6 @@ void machine_run_board_init(MachineState *machine)
> >  ObjectClass *class = object_class_by_name(machine->cpu_type);
> >  int i;
> >
> > -/* machine->cpu_type is supposed to be always a valid QOM type */
> > -assert(class);
> > -
> >  for (i = 0; machine_class->valid_cpu_types[i]; i++) {
> >  if (object_class_dynamic_cast(class,
> >
> > machine_class->valid_cpu_types[i])) {
> > @@ -780,7 +777,7 @@ void machine_run_board_init(MachineState *machine)
> >  }
> >
> >  if (!machine_class->valid_cpu_types[i]) {
> > -/* The user specified CPU must not be a valid CPU */
> > +/* The user specified CPU is not valid */
> >  error_report("Invalid CPU type: %s", machine->cpu_type);
> >  error_printf("The valid types are: %s",
> >   machine_class->valid_cpu_types[0]);
> 
> Looks good to me.
> 
> Does that mean you are taking the whole series now?

I was planning to take only patch 1/5, but I can take the whole
series if I get an Acked-by from the corresponding maintainers.

-- 
Eduardo

Re: [Qemu-devel] [PATCH v1 2/5] netduino2: Specify the valid CPUs

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 01:05:11PM -0700, Alistair Francis wrote:
> List all possible valid CPU options.
> 
> Although the board only ever has a Cortex-M3 we mark the Cortex-M4 as
> supported because the Netduino2 Plus supports the Cortex-M4 and the
> Netduino2 Plus is similar to the Netduino2.
> 
> Signed-off-by: Alistair Francis 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v1 1/5] machine: Add a valid_cpu_types property

2017-10-03 Thread Alistair Francis

On Tue, Oct 3, 2017 at 1:23 PM, Eduardo Habkost  wrote:
> On Tue, Oct 03, 2017 at 01:05:09PM -0700, Alistair Francis wrote:
>> This patch adds a MachineClass element that can be set in the machine C
>> code to specify a list of supported CPU types. If the supported CPU
>> types are specified, the CPU entered by the user (with -cpu at runtime)
>> is checked against the supported types, and QEMU exits if it isn't
>> supported.
>>
>> Signed-off-by: Alistair Francis 
>> ---
>
> Thanks!
>
> Reviewed-by: Eduardo Habkost 
>
> However, I will squash the following changes before queueing,
> because:
>
> * object_class_dynamic_cast() is safe even if class is NULL,
>   so there's no need to validate cpu_type here.
> * "must not be valid" sounds like the CPU is not allowed to be a
>   valid CPU, so I rewrote the comment.
>
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 3afc6a7b5b..36c2fb069c 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -766,9 +766,6 @@ void machine_run_board_init(MachineState *machine)
>  ObjectClass *class = object_class_by_name(machine->cpu_type);
>  int i;
>
> -/* machine->cpu_type is supposed to be always a valid QOM type */
> -assert(class);
> -
>  for (i = 0; machine_class->valid_cpu_types[i]; i++) {
>  if (object_class_dynamic_cast(class,
>
> machine_class->valid_cpu_types[i])) {
> @@ -780,7 +777,7 @@ void machine_run_board_init(MachineState *machine)
>  }
>
>  if (!machine_class->valid_cpu_types[i]) {
> -/* The user specified CPU must not be a valid CPU */
> +/* The user specified CPU is not valid */
>  error_report("Invalid CPU type: %s", machine->cpu_type);
>  error_printf("The valid types are: %s",
>   machine_class->valid_cpu_types[0]);

Looks good to me.

Does that mean you are taking the whole series now?

Thanks,
Alistair

>>
>> V1:
>>  - Use 'type' in the error message
>>  - Move an unneeded operation out of the for loop
>>  - Remove the goto
>> RFC v2:
>>  - Rebase on Igor's cpu_type work
>>  - Use object_class_dynamic_cast()
>>  - Use a NULL terminated char** list
>>  - Do the check before the machine_class init() is called
>>
>>
>>  hw/core/machine.c   | 35 +++
>>  include/hw/boards.h |  1 +
>>  2 files changed, 36 insertions(+)
>>
>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>> index 80647edc2a..3afc6a7b5b 100644
>> --- a/hw/core/machine.c
>> +++ b/hw/core/machine.c
>> @@ -758,6 +758,41 @@ void machine_run_board_init(MachineState *machine)
>>  if (nb_numa_nodes) {
>>  machine_numa_finish_init(machine);
>>  }
>> +
>> +/* If the machine supports the valid_cpu_types check and the user
>> + * specified a CPU with -cpu check here that the user CPU is supported.
>> + */
>> +if (machine_class->valid_cpu_types && machine->cpu_type) {
>> +ObjectClass *class = object_class_by_name(machine->cpu_type);
>> +int i;
>> +
>> +/* machine->cpu_type is supposed to be always a valid QOM type */
>> +assert(class);
>> +
>> +for (i = 0; machine_class->valid_cpu_types[i]; i++) {
>> +if (object_class_dynamic_cast(class,
>> +  
>> machine_class->valid_cpu_types[i])) {
>> +/* The user specificed CPU is in the valid field, we are
>> + * good to go.
>> + */
>> +break;
>> +}
>> +}
>> +
>> +if (!machine_class->valid_cpu_types[i]) {
>> +/* The user specified CPU must not be a valid CPU */
>> +error_report("Invalid CPU type: %s", machine->cpu_type);
>> +error_printf("The valid types are: %s",
>> + machine_class->valid_cpu_types[0]);
>> +for (i = 1; machine_class->valid_cpu_types[i]; i++) {
>> +error_printf(", %s", machine_class->valid_cpu_types[i]);
>> +}
>> +error_printf("\n");
>> +
>> +exit(1);
>> +}
>> +}
>> +
>>  machine_class->init(machine);
>>  }
>>
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index 156e0a5701..191a5b3cd8 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -191,6 +191,7 @@ struct MachineClass {
>>  bool has_hotpluggable_cpus;
>>  bool ignore_memory_transaction_failures;
>>  int numa_mem_align_shift;
>> +const char **valid_cpu_types;
>>  void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
>>   int nb_nodes, ram_addr_t size);
>>
>> --
>> 2.11.0
>>
>
> --
> Eduardo

Re: [Qemu-devel] [PATCH v1 1/5] machine: Add a valid_cpu_types property

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 01:05:09PM -0700, Alistair Francis wrote:
> This patch adds a MachineClass element that can be set in the machine C
> code to specify a list of supported CPU types. If the supported CPU
> types are specified, the CPU entered by the user (with -cpu at runtime)
> is checked against the supported types, and QEMU exits if it isn't
> supported.
> 
> Signed-off-by: Alistair Francis 
> ---

Thanks!

Reviewed-by: Eduardo Habkost 

However, I will squash the following changes before queueing,
because:

* object_class_dynamic_cast() is safe even if class is NULL,
  so there's no need to validate cpu_type here.
* "must not be valid" sounds like the CPU is not allowed to be a
  valid CPU, so I rewrote the comment.


diff --git a/hw/core/machine.c b/hw/core/machine.c
index 3afc6a7b5b..36c2fb069c 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -766,9 +766,6 @@ void machine_run_board_init(MachineState *machine)
 ObjectClass *class = object_class_by_name(machine->cpu_type);
 int i;
 
-/* machine->cpu_type is supposed to be always a valid QOM type */
-assert(class);
-
 for (i = 0; machine_class->valid_cpu_types[i]; i++) {
 if (object_class_dynamic_cast(class,
   machine_class->valid_cpu_types[i])) {
@@ -780,7 +777,7 @@ void machine_run_board_init(MachineState *machine)
 }
 
 if (!machine_class->valid_cpu_types[i]) {
-/* The user specified CPU must not be a valid CPU */
+/* The user specified CPU is not valid */
 error_report("Invalid CPU type: %s", machine->cpu_type);
 error_printf("The valid types are: %s",
  machine_class->valid_cpu_types[0]);
> 
> V1:
>  - Use 'type' in the error message
>  - Move an unneeded operation out of the for loop
>  - Remove the goto
> RFC v2:
>  - Rebase on Igor's cpu_type work
>  - Use object_class_dynamic_cast()
>  - Use a NULL terminated char** list
>  - Do the check before the machine_class init() is called
> 
> 
>  hw/core/machine.c   | 35 +++
>  include/hw/boards.h |  1 +
>  2 files changed, 36 insertions(+)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 80647edc2a..3afc6a7b5b 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -758,6 +758,41 @@ void machine_run_board_init(MachineState *machine)
>  if (nb_numa_nodes) {
>  machine_numa_finish_init(machine);
>  }
> +
> +/* If the machine supports the valid_cpu_types check and the user
> + * specified a CPU with -cpu check here that the user CPU is supported.
> + */
> +if (machine_class->valid_cpu_types && machine->cpu_type) {
> +ObjectClass *class = object_class_by_name(machine->cpu_type);
> +int i;
> +
> +/* machine->cpu_type is supposed to be always a valid QOM type */
> +assert(class);
> +
> +for (i = 0; machine_class->valid_cpu_types[i]; i++) {
> +if (object_class_dynamic_cast(class,
> +  
> machine_class->valid_cpu_types[i])) {
> +/* The user specificed CPU is in the valid field, we are
> + * good to go.
> + */
> +break;
> +}
> +}
> +
> +if (!machine_class->valid_cpu_types[i]) {
> +/* The user specified CPU must not be a valid CPU */
> +error_report("Invalid CPU type: %s", machine->cpu_type);
> +error_printf("The valid types are: %s",
> + machine_class->valid_cpu_types[0]);
> +for (i = 1; machine_class->valid_cpu_types[i]; i++) {
> +error_printf(", %s", machine_class->valid_cpu_types[i]);
> +}
> +error_printf("\n");
> +
> +exit(1);
> +}
> +}
> +
>  machine_class->init(machine);
>  }
>  
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 156e0a5701..191a5b3cd8 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -191,6 +191,7 @@ struct MachineClass {
>  bool has_hotpluggable_cpus;
>  bool ignore_memory_transaction_failures;
>  int numa_mem_align_shift;
> +const char **valid_cpu_types;
>  void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
>   int nb_nodes, ram_addr_t size);
>  
> -- 
> 2.11.0
> 

-- 
Eduardo
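
The first squashed change relies on the cast helper tolerating a NULL class. The idea can be sketched with a toy class record; this is only an illustration under assumptions, not QEMU's implementation, and ToyClass and toy_class_dynamic_cast() are hypothetical names. Because the loop condition already rejects NULL, a caller needs no separate assertion before calling it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy class record: a name plus a link to the parent class. */
typedef struct ToyClass {
    const char *name;
    const struct ToyClass *parent;
} ToyClass;

/* Walk the parent chain looking for type_name.
 * A NULL klass simply yields NULL, so callers need no prior check. */
static const ToyClass *toy_class_dynamic_cast(const ToyClass *klass,
                                              const char *type_name)
{
    for (; klass; klass = klass->parent) {
        if (strcmp(klass->name, type_name) == 0) {
            return klass;
        }
    }
    return NULL;
}
```

An unknown CPU name then behaves like any other non-matching type: the lookup returns NULL for every list entry and the normal "invalid CPU type" path is taken.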

[Qemu-devel] [PATCH v1 5/5] raspi: : Specify the valid CPUs

2017-10-03 Thread Alistair Francis

List all possible valid CPU options.

Signed-off-by: Alistair Francis 
---

 hw/arm/raspi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
index 5941c9f751..555db0f258 100644
--- a/hw/arm/raspi.c
+++ b/hw/arm/raspi.c
@@ -158,6 +158,10 @@ static void raspi2_init(MachineState *machine)
 setup_boot(machine, 2, machine->ram_size - vcram_size);
 }
 
+const char *raspi2_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a7"),
+NULL
+  };
+
 static void raspi2_machine_init(MachineClass *mc)
 {
 mc->desc = "Raspberry Pi 2";
@@ -169,5 +173,7 @@ static void raspi2_machine_init(MachineClass *mc)
 mc->max_cpus = BCM2836_NCPUS;
 mc->default_ram_size = 1024 * 1024 * 1024;
 mc->ignore_memory_transaction_failures = true;
+mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a7");
+mc->valid_cpu_types = raspi2_valid_cpus;
 };
 DEFINE_MACHINE("raspi2", raspi2_machine_init)
-- 
2.11.0

[Qemu-devel] [PATCH v1 2/5] netduino2: Specify the valid CPUs

2017-10-03 Thread Alistair Francis

List all possible valid CPU options.

Although the board only ever has a Cortex-M3 we mark the Cortex-M4 as
supported because the Netduino2 Plus supports the Cortex-M4 and the
Netduino2 Plus is similar to the Netduino2.

Signed-off-by: Alistair Francis 
---

RFC v2:
 - Use a NULL terminated list
 - Add the Cortex-M4 for testing


 hw/arm/netduino2.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/arm/netduino2.c b/hw/arm/netduino2.c
index f936017d4a..b68ecf2c08 100644
--- a/hw/arm/netduino2.c
+++ b/hw/arm/netduino2.c
@@ -34,18 +34,25 @@ static void netduino2_init(MachineState *machine)
 DeviceState *dev;
 
 dev = qdev_create(NULL, TYPE_STM32F205_SOC);
-qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
+qdev_prop_set_string(dev, "cpu-type", machine->cpu_type);
 object_property_set_bool(OBJECT(dev), true, "realized", &error_fatal);
 
 armv7m_load_kernel(ARM_CPU(first_cpu), machine->kernel_filename,
FLASH_SIZE);
 }
 
+const char *netduino_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-m3"),
+  ARM_CPU_TYPE_NAME("cortex-m4"),
+  NULL
+};
+
 static void netduino2_machine_init(MachineClass *mc)
 {
 mc->desc = "Netduino 2 Machine";
 mc->init = netduino2_init;
 mc->ignore_memory_transaction_failures = true;
+mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m3");
+mc->valid_cpu_types = netduino_valid_cpus;
 }
 
 DEFINE_MACHINE("netduino2", netduino2_machine_init)
-- 
2.11.0

[Qemu-devel] [PATCH v1 4/5] xilinx_zynq: : Specify the valid CPUs

2017-10-03 Thread Alistair Francis

List all possible valid CPU options.

Signed-off-by: Alistair Francis 
---

 hw/arm/xilinx_zynq.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 1836a4ed45..de1e0bbce1 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -313,6 +313,10 @@ static void zynq_init(MachineState *machine)
 arm_load_kernel(ARM_CPU(first_cpu), &zynq_binfo);
 }
 
+const char *xlnx_zynq_7000_valid_cpus[] = { ARM_CPU_TYPE_NAME("cortex-a9"),
+NULL
+  };
+
 static void zynq_machine_init(MachineClass *mc)
 {
 mc->desc = "Xilinx Zynq Platform Baseboard for Cortex-A9";
@@ -321,6 +325,7 @@ static void zynq_machine_init(MachineClass *mc)
 mc->no_sdcard = 1;
 mc->ignore_memory_transaction_failures = true;
 mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a9");
+mc->valid_cpu_types = xlnx_zynq_7000_valid_cpus;
 }
 
 DEFINE_MACHINE("xilinx-zynq-a9", zynq_machine_init)
-- 
2.11.0

[Qemu-devel] [PATCH v1 1/5] machine: Add a valid_cpu_types property

2017-10-03 Thread Alistair Francis

This patch adds a MachineClass element that can be set in the machine C
code to specify a list of supported CPU types. If the supported CPU
types are specified, the CPU entered by the user (with -cpu at runtime)
is checked against the supported types, and QEMU exits if it isn't
supported.

Signed-off-by: Alistair Francis 
---

V1:
 - Use 'type' in the error message
 - Move an unneeded operation out of the for loop
 - Remove the goto
RFC v2:
 - Rebase on Igor's cpu_type work
 - Use object_class_dynamic_cast()
 - Use a NULL terminated char** list
 - Do the check before the machine_class init() is called


 hw/core/machine.c   | 35 +++
 include/hw/boards.h |  1 +
 2 files changed, 36 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 80647edc2a..3afc6a7b5b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -758,6 +758,41 @@ void machine_run_board_init(MachineState *machine)
 if (nb_numa_nodes) {
 machine_numa_finish_init(machine);
 }
+
+/* If the machine supports the valid_cpu_types check and the user
+ * specified a CPU with -cpu check here that the user CPU is supported.
+ */
+if (machine_class->valid_cpu_types && machine->cpu_type) {
+ObjectClass *class = object_class_by_name(machine->cpu_type);
+int i;
+
+/* machine->cpu_type is supposed to be always a valid QOM type */
+assert(class);
+
+for (i = 0; machine_class->valid_cpu_types[i]; i++) {
+if (object_class_dynamic_cast(class,
+  machine_class->valid_cpu_types[i])) {
+/* The user specified CPU is in the valid field, we are
+ * good to go.
+ */
+break;
+}
+}
+
+if (!machine_class->valid_cpu_types[i]) {
+/* The user specified CPU must not be a valid CPU */
+error_report("Invalid CPU type: %s", machine->cpu_type);
+error_printf("The valid types are: %s",
+ machine_class->valid_cpu_types[0]);
+for (i = 1; machine_class->valid_cpu_types[i]; i++) {
+error_printf(", %s", machine_class->valid_cpu_types[i]);
+}
+error_printf("\n");
+
+exit(1);
+}
+}
+
 machine_class->init(machine);
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 156e0a5701..191a5b3cd8 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -191,6 +191,7 @@ struct MachineClass {
 bool has_hotpluggable_cpus;
 bool ignore_memory_transaction_failures;
 int numa_mem_align_shift;
+const char **valid_cpu_types;
 void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
  int nb_nodes, ram_addr_t size);
 
-- 
2.11.0

[Qemu-devel] [PATCH v1 0/5] Add a valid_cpu_types property

2017-10-03 Thread Alistair Francis

There are numerous QEMU machines that only have a single or a handful of
valid CPU options. To simplify the management of specifying which CPU
is/isn't valid, let's create a property that can be set in the machine
init. We can then check to see if the user supplied CPU is in that list
or not.

I have added the valid_cpu_types for some ARM machines only at the
moment.

Here is what specifying the CPUs looks like now:

$ aarch64-softmmu/qemu-system-aarch64 -M netduino2 -kernel ./u-boot.elf -nographic -cpu "cortex-m3" -S
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) info cpus
* CPU #0: thread_id=24175
(qemu) q

$ aarch64-softmmu/qemu-system-aarch64 -M netduino2 -kernel ./u-boot.elf -nographic -cpu "cortex-m4" -S
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) q

$ aarch64-softmmu/qemu-system-aarch64 -M netduino2 -kernel ./u-boot.elf -nographic -cpu "cortex-m5" -S
qemu-system-aarch64: unable to find CPU model 'cortex-m5'

$ aarch64-softmmu/qemu-system-aarch64 -M netduino2 -kernel ./u-boot.elf -nographic -cpu "cortex-a9" -S
qemu-system-aarch64: Invalid CPU type: cortex-a9-arm-cpu
The valid types are: cortex-m3-arm-cpu, cortex-m4-arm-cpu

V1:
 - Small fixes to prepare a series instead of RFC
 - Add commit messages for the commits
 - Expand the machine support to ARM machines
RFC v2:
 - Rebase on Igor's work
 - Use more QEMUisms inside the code
 - List the supported machines in a NULL terminated array


Alistair Francis (5):
  machine: Add a valid_cpu_types property
  netduino2: Specify the valid CPUs
  xlnx-zcu102: Specify the valid CPUs
  xilinx_zynq: : Specify the valid CPUs
  raspi: : Specify the valid CPUs

 hw/arm/netduino2.c   |  9 -
 hw/arm/raspi.c   |  6 ++
 hw/arm/xilinx_zynq.c |  5 +
 hw/arm/xlnx-zcu102.c | 10 ++
 hw/arm/xlnx-zynqmp.c | 16 +---
 hw/core/machine.c| 35 +++
 include/hw/arm/xlnx-zynqmp.h |  1 +
 include/hw/boards.h  |  1 +
 8 files changed, 75 insertions(+), 8 deletions(-)

-- 
2.11.0
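
The check described in this cover letter boils down to scanning a NULL-terminated string list. A rough standalone sketch follows, with one loudly stated assumption: plain strcmp() stands in for object_class_dynamic_cast(), so subclass matching is not modelled. find_valid_cpu_type() and example_valid_cpus are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of cpu_type in the NULL-terminated list valid_types,
 * or -1 if it is not listed. strcmp() is an assumption that replaces
 * the QOM class cast used by the real patch. */
static int find_valid_cpu_type(const char *const *valid_types,
                               const char *cpu_type)
{
    int i;

    for (i = 0; valid_types[i]; i++) {
        if (strcmp(valid_types[i], cpu_type) == 0) {
            return i;
        }
    }
    return -1;
}

/* Hypothetical list mirroring the netduino2 output shown above. */
static const char *const example_valid_cpus[] = {
    "cortex-m3-arm-cpu",
    "cortex-m4-arm-cpu",
    NULL
};
```

A board would report the error and exit when the function returns -1, which corresponds to the loop in the patch running off the end of the list.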

[Qemu-devel] [PULL 1/5] qom: provide root container for internal objs

2017-10-03 Thread Stefan Hajnoczi

From: Peter Xu 

We have object_get_objects_root() to keep user created objects, however
there is no place for objects that will be used internally.  Create such
a container for internal objects.

CC: Andreas Färber 
CC: Markus Armbruster 
CC: Paolo Bonzini 
Suggested-by: Daniel P. Berrange 
Signed-off-by: Peter Xu 
Reviewed-by: Fam Zheng 
Message-id: 20170928025958.1420-2-pet...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 include/qom/object.h | 11 +++
 qom/object.c | 11 +++
 2 files changed, 22 insertions(+)

diff --git a/include/qom/object.h b/include/qom/object.h
index f3e5cff37a..e0d9824415 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -1214,6 +1214,17 @@ Object *object_get_root(void);
 Object *object_get_objects_root(void);
 
 /**
+ * object_get_internal_root:
+ *
+ * Get the container object that holds internally used object
+ * instances.  Any object which is put into this container must not be
+ * user visible, and it will not be exposed in the QOM tree.
+ *
+ * Returns: the internal object container
+ */
+Object *object_get_internal_root(void);
+
+/**
  * object_get_canonical_path_component:
  *
  * Returns: The final component in the object's canonical path.  The canonical
diff --git a/qom/object.c b/qom/object.c
index 3e18537e9b..6a7bd9257b 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -1370,6 +1370,17 @@ Object *object_get_objects_root(void)
 return container_get(object_get_root(), "/objects");
 }
 
+Object *object_get_internal_root(void)
+{
+static Object *internal_root;
+
+if (!internal_root) {
+internal_root = object_new("container");
+}
+
+return internal_root;
+}
+
 static void object_get_child_property(Object *obj, Visitor *v,
   const char *name, void *opaque,
   Error **errp)
-- 
2.13.6
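
The new accessor in this patch creates its container lazily and keeps it in a function-local static. The same pattern in isolation, as a sketch only: Container and get_internal_root() are hypothetical stand-ins, calloc() replaces object_new("container"), and the code is not thread-safe as written.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a QOM container object. */
typedef struct Container {
    int nchildren;
} Container;

/* Create the container on first use and hand back the same
 * instance on every later call. Not thread-safe as written. */
static Container *get_internal_root(void)
{
    static Container *internal_root;

    if (!internal_root) {
        internal_root = calloc(1, sizeof(*internal_root));
    }
    return internal_root;
}
```

Because the pointer lives only inside the function, nothing else can reach the container except through the accessor, which matches the intent that internal objects stay out of the user-visible tree.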

[Qemu-devel] [PULL 3/5] iothread: export iothread_stop()

2017-10-03 Thread Stefan Hajnoczi

From: Peter Xu 

So that internal iothread users can explicitly stop one iothread without
destroying it.

While at it, fix iothread_stop() to allow it to be called multiple
times.  Before this patch we may call iothread_stop() more than once on
a single iothread, which may not be correct since qemu_thread_join()
is not allowed to run twice.  From the manual of pthread_join():

  Joining with a thread that has previously been joined results in
  undefined behavior.

Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Peter Xu 
Message-id: 20170928025958.1420-4-pet...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 include/sysemu/iothread.h |  1 +
 iothread.c| 24 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
index b07663f0a1..110329b2b4 100644
--- a/include/sysemu/iothread.h
+++ b/include/sysemu/iothread.h
@@ -52,6 +52,7 @@ GMainContext *iothread_get_g_main_context(IOThread *iothread);
  * "query-iothreads".
  */
 IOThread *iothread_create(const char *id, Error **errp);
+void iothread_stop(IOThread *iothread);
 void iothread_destroy(IOThread *iothread);
 
 #endif /* IOTHREAD_H */
diff --git a/iothread.c b/iothread.c
index 0672a9196f..b3c092b2d7 100644
--- a/iothread.c
+++ b/iothread.c
@@ -80,13 +80,10 @@ static void *iothread_run(void *opaque)
 return NULL;
 }
 
-static int iothread_stop(Object *object, void *opaque)
+void iothread_stop(IOThread *iothread)
 {
-IOThread *iothread;
-
-iothread = (IOThread *)object_dynamic_cast(object, TYPE_IOTHREAD);
-if (!iothread || !iothread->ctx || iothread->stopping) {
-return 0;
+if (!iothread->ctx || iothread->stopping) {
+return;
 }
 iothread->stopping = true;
 aio_notify(iothread->ctx);
@@ -94,6 +91,17 @@ static int iothread_stop(Object *object, void *opaque)
 g_main_loop_quit(iothread->main_loop);
 }
 qemu_thread_join(&iothread->thread);
+}
+
+static int iothread_stop_iter(Object *object, void *opaque)
+{
+IOThread *iothread;
+
+iothread = (IOThread *)object_dynamic_cast(object, TYPE_IOTHREAD);
+if (!iothread) {
+return 0;
+}
+iothread_stop(iothread);
 return 0;
 }
 
@@ -108,7 +116,7 @@ static void iothread_instance_finalize(Object *obj)
 {
 IOThread *iothread = IOTHREAD(obj);
 
-iothread_stop(obj, NULL);
+iothread_stop(iothread);
 qemu_cond_destroy(&iothread->init_done_cond);
 qemu_mutex_destroy(&iothread->init_done_lock);
 if (!iothread->ctx) {
@@ -328,7 +336,7 @@ void iothread_stop_all(void)
 aio_context_release(ctx);
 }
 
-object_child_foreach(container, iothread_stop, NULL);
+object_child_foreach(container, iothread_stop_iter, NULL);
 }
 
 static gpointer iothread_g_main_context_init(gpointer opaque)
-- 
2.13.6

[Qemu-devel] [PULL 0/5] Block patches

2017-10-03 Thread Stefan Hajnoczi

The following changes since commit d147f7e815f97cb477e223586bcb80c316ae10ea:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2017-10-03 16:27:24 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to f708a5e71cba0d784e307334c07ade5f56f827ab:

  aio: fix assert when remove poll during destroy (2017-10-03 14:36:19 -0400)





Peter Xu (4):
  qom: provide root container for internal objs
  iothread: provide helpers for internal use
  iothread: export iothread_stop()
  iothread: delay the context release to finalize

Stefan Hajnoczi (1):
  aio: fix assert when remove poll during destroy

 include/qom/object.h      | 11 +++++++++++
 include/sysemu/iothread.h |  9 +++++++++
 iothread.c                | 46 ++++++++++++++++++++++++++++++++++++----------
 qom/object.c              | 11 +++++++++++
 util/aio-posix.c          |  9 ++++++++-
 5 files changed, 75 insertions(+), 11 deletions(-)

-- 
2.13.6

[Qemu-devel] [PULL 4/5] iothread: delay the context release to finalize

2017-10-03 Thread Stefan Hajnoczi

From: Peter Xu 

When gcontext is used with iothread, the context will be destroyed
during iothread_stop().  That's not good since sometimes we would like
to keep the resources until iothread is destroyed, but we may want to
stop the thread before that point.

Delay the destruction of gcontext to iothread finalize.  Then we can do:

  iothread_stop(thread);
  some_cleanup_on_resources();
  iothread_destroy(thread);

We may need this patch if we want to run chardev IOs in iothreads and
hopefully clean them up correctly.  For more specific information,
please see 2b316774f6 ("qemu-char: do not operate on sources from
finalize callbacks").
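
The stop / clean up / destroy ordering above depends on stop leaving the
context alive and on finalize being the only place that drops it.  A minimal
stand-alone sketch of that ownership rule follows; Context, Worker and the
helper functions are hypothetical stand-ins, not GLib or QEMU APIs, and no
real thread is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted resource standing in for the GMainContext. */
typedef struct Context {
    int refs;
} Context;

typedef struct Worker {
    Context *ctx;   /* owned reference, released only in worker_finalize() */
    bool stopped;
} Worker;

static int contexts_freed;  /* bookkeeping for this example only */

static void context_unref(Context *c)
{
    assert(c != NULL && c->refs > 0);
    if (--c->refs == 0) {
        free(c);
        contexts_freed++;
    }
}

static bool worker_init(Worker *w)
{
    w->ctx = malloc(sizeof(*w->ctx));
    if (!w->ctx) {
        return false;
    }
    w->ctx->refs = 1;
    w->stopped = false;
    return true;
}

/* Stopping ends the worker's activity but keeps the context usable. */
static void worker_stop(Worker *w)
{
    w->stopped = true;
}

/* Finalize stops first (harmless if already stopped), then drops ctx. */
static void worker_finalize(Worker *w)
{
    worker_stop(w);
    if (w->ctx) {
        context_unref(w->ctx);
        w->ctx = NULL;
    }
}
```

Between worker_stop() and worker_finalize() the context is still valid, so
cleanup code that needs it can run in that window.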

Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Peter Xu 
Message-id: 20170928025958.1420-5-pet...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 iothread.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/iothread.c b/iothread.c
index b3c092b2d7..27a4288578 100644
--- a/iothread.c
+++ b/iothread.c
@@ -71,8 +71,6 @@ static void *iothread_run(void *opaque)
 g_main_loop_unref(loop);
 
 g_main_context_pop_thread_default(iothread->worker_context);
-g_main_context_unref(iothread->worker_context);
-iothread->worker_context = NULL;
 }
 }
 
@@ -117,6 +115,10 @@ static void iothread_instance_finalize(Object *obj)
 IOThread *iothread = IOTHREAD(obj);
 
 iothread_stop(iothread);
+if (iothread->worker_context) {
+g_main_context_unref(iothread->worker_context);
+iothread->worker_context = NULL;
+}
 qemu_cond_destroy(&iothread->init_done_cond);
 qemu_mutex_destroy(&iothread->init_done_lock);
 if (!iothread->ctx) {
-- 
2.13.6

[Qemu-devel] [PULL 5/5] aio: fix assert when remove poll during destroy

2017-10-03 Thread Stefan Hajnoczi

After iothread is enabled internally inside QEMU with GMainContext, we
may encounter this warning when destroying the iothread:

(qemu-system-x86_64:19925): GLib-CRITICAL **: g_source_remove_poll:
 assertion '!SOURCE_DESTROYED (source)' failed

The problem is that g_source_remove_poll() does not allow removing a
source from the array if the source has been detached from its owner
context. (peterx: which IMHO does not make much sense)

Fix it on the QEMU side by not calling g_source_remove_poll() when we
know the object is being destroyed.  Nothing is leaked, since the array
will soon be freed cleanly even with that fd still in it.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Fam Zheng 
Signed-off-by: Peter Xu 
Message-id: 20170928025958.1420-6-pet...@redhat.com
[peterx: write the commit message]
Signed-off-by: Peter Xu 
Signed-off-by: Stefan Hajnoczi 
---
 util/aio-posix.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 2d51239ec6..5946ac09f0 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -223,7 +223,14 @@ void aio_set_fd_handler(AioContext *ctx,
 return;
 }
 
-g_source_remove_poll(&ctx->source, &node->pfd);
+/* If the GSource is in the process of being destroyed then
+ * g_source_remove_poll() causes an assertion failure.  Skip
+ * removal in that case, because glib cleans up its state during
+ * destruction anyway.
+ */
+if (!g_source_is_destroyed(&ctx->source)) {
+g_source_remove_poll(&ctx->source, &node->pfd);
+}
 
 /* If the lock is held, just mark the node as deleted */
 if (qemu_lockcnt_count(&ctx->list_lock)) {
-- 
2.13.6

[Qemu-devel] [PULL 2/5] iothread: provide helpers for internal use

2017-10-03 Thread Stefan Hajnoczi

From: Peter Xu 

IOThread is a general framework that provides an IO loop environment
backed by a real thread.  It is also useful internally inside QEMU.
Provide some helpers to create iothreads for internal use.

Put all the internally used iothreads into the internal object container.

Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Peter Xu 
Message-id: 20170928025958.1420-3-pet...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 include/sysemu/iothread.h |  8 ++++++++
 iothread.c                | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
index d2985b30ba..b07663f0a1 100644
--- a/include/sysemu/iothread.h
+++ b/include/sysemu/iothread.h
@@ -46,4 +46,12 @@ AioContext *iothread_get_aio_context(IOThread *iothread);
 void iothread_stop_all(void);
 GMainContext *iothread_get_g_main_context(IOThread *iothread);
 
+/*
+ * Helpers used to allocate iothreads for internal use.  These
+ * iothreads will not be seen by monitor clients when query using
+ * "query-iothreads".
+ */
+IOThread *iothread_create(const char *id, Error **errp);
+void iothread_destroy(IOThread *iothread);
+
 #endif /* IOTHREAD_H */
diff --git a/iothread.c b/iothread.c
index 59d0850988..0672a9196f 100644
--- a/iothread.c
+++ b/iothread.c
@@ -354,3 +354,19 @@ GMainContext *iothread_get_g_main_context(IOThread *iothread)
 
 return iothread->worker_context;
 }
+
+IOThread *iothread_create(const char *id, Error **errp)
+{
+Object *obj;
+
+obj = object_new_with_props(TYPE_IOTHREAD,
+object_get_internal_root(),
+id, errp, NULL);
+
+return IOTHREAD(obj);
+}
+
+void iothread_destroy(IOThread *iothread)
+{
+object_unparent(OBJECT(iothread));
+}
-- 
2.13.6

Re: [Qemu-devel] [PATCH] qdev: Check for the availability of a hotplug controller before adding a device

2017-10-03 Thread Eduardo Habkost

On Tue, Oct 03, 2017 at 06:46:02PM +0200, Thomas Huth wrote:
> The qdev_unplug() function contains a g_assert(hotplug_ctrl) statement,
> so QEMU crashes when the user tries to device_add + device_del a device
> that does not have a corresponding hotplug controller. This could be
> provoked for a couple of devices in the past (see commit 4c93950659487c7ad
> or 84ebd3e8c7d4fe955 for example). So devices clearly need a hotplug
> controller when they are suitable for device_add.
> The code in qdev_device_add() already checks whether the bus has a proper
> hotplug controller, but for devices that do not have a corresponding bus,
> there is no appropriate check available. In that case we should check
> whether the machine itself provides a suitable hotplug controller and
> refuse to plug the device if none is available.
> 
> Signed-off-by: Thomas Huth 
> ---
>  This is the follow-up patch from my earlier try "hw/core/qdev: Do not
>  allow hot-plugging without hotplug controller" ... AFAICS the function
>  qdev_device_add() is now the right spot to do the check.
> 
>  hw/core/qdev.c | 28 
>  include/hw/qdev-core.h |  1 +
>  qdev-monitor.c |  9 +
>  3 files changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 606ab53..a953ec9 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -253,19 +253,31 @@ void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
>  dev->alias_required_for_version = required_for_version;
>  }
>  
> +HotplugHandler *qdev_get_machine_hotplug_handler(DeviceState *dev)
> +{
> +MachineState *machine;
> +MachineClass *mc;
> +Object *m_obj = qdev_get_machine();
> +
> +if (object_dynamic_cast(m_obj, TYPE_MACHINE)) {
> +machine = MACHINE(m_obj);
> +mc = MACHINE_GET_CLASS(machine);
> +if (mc->get_hotplug_handler) {
> +return mc->get_hotplug_handler(machine, dev);
> +}
> +}
> +
> +return NULL;
> +}
> +
>  HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev)
>  {
> -HotplugHandler *hotplug_ctrl = NULL;
> +HotplugHandler *hotplug_ctrl;
>  
>  if (dev->parent_bus && dev->parent_bus->hotplug_handler) {
>  hotplug_ctrl = dev->parent_bus->hotplug_handler;
> -} else if (object_dynamic_cast(qdev_get_machine(), TYPE_MACHINE)) {
> -MachineState *machine = MACHINE(qdev_get_machine());
> -MachineClass *mc = MACHINE_GET_CLASS(machine);
> -
> -if (mc->get_hotplug_handler) {
> -hotplug_ctrl = mc->get_hotplug_handler(machine, dev);
> -}
> +} else {
> +hotplug_ctrl = qdev_get_machine_hotplug_handler(dev);
>  }
>  return hotplug_ctrl;
>  }
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 0891461..5aa536d 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -285,6 +285,7 @@ DeviceState *qdev_try_create(BusState *bus, const char *name);
>  void qdev_init_nofail(DeviceState *dev);
>  void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
>   int required_for_version);
> +HotplugHandler *qdev_get_machine_hotplug_handler(DeviceState *dev);
>  HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev);
>  void qdev_unplug(DeviceState *dev, Error **errp);
>  void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
> diff --git a/qdev-monitor.c b/qdev-monitor.c
> index 8fd6df9..2891dde 100644
> --- a/qdev-monitor.c
> +++ b/qdev-monitor.c
> @@ -626,6 +626,15 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
>  return NULL;
>  }
>  
> +/* In case we don't have a bus, there must be a machine hotplug handler */
> +if (qdev_hotplug && !bus && !qdev_get_machine_hotplug_handler(dev)) {
> +error_setg(errp, "Device '%s' can not be hotplugged on this machine",
> +   driver);
> +object_unparent(OBJECT(dev));

Isn't it better to check qdev_get_machine_hotplug_handler()
earlier (before the qdev_set_parent_bus() and qdev_set_id()
lines), so object_unparent() isn't necessary?

(We probably don't need to call object_unparent() here, already,
because bus is NULL.  But moving the check before the "if (bus)
qdev_set_parent_bus()" statement would make this more obvious).
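
The reordering suggested above (check first, modify afterwards) can be
sketched stand-alone as follows.  Everything here is hypothetical and heavily
simplified: Device, machine_can_hotplug and device_add() are illustrative
names only and do not reflect the real qdev_device_add() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in; not QEMU's DeviceState. */
typedef struct Device {
    const char *id;
    bool on_bus;
    bool has_id;
} Device;

/* Pretend machine capability; a real lookup would ask the machine class. */
static bool machine_can_hotplug;

/* Returns true on success.  On failure, *dev is left exactly as it was. */
static bool device_add(Device *dev, bool have_bus, const char *id)
{
    assert(dev != NULL);

    /* Validate first: nothing is modified yet, so failing needs no undo. */
    if (!have_bus && !machine_can_hotplug) {
        return false;
    }

    /* Only now start changing state. */
    dev->on_bus = have_bus;
    if (id) {
        dev->id = id;
        dev->has_id = true;
    }
    return true;
}
```

Because the check runs before any state is touched, the error path is a plain
return with no unparent-style cleanup.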

I would prefer to eventually make
MachineClass::get_hotplug_handler() get a typename or
DeviceClass* argument instead of DeviceState*, so we don't even
create the device object.  But I don't think it's a requirement
for this bug fix.


> +object_unref(OBJECT(dev));
> +return NULL;
> +}
> +
>  dev->opts = opts;
>  object_property_set_bool(OBJECT(dev), true, "realized", &err);
>  if (err != NULL) {
> -- 
> 1.8.3.1
> 

-- 
Eduardo

[Qemu-devel] [PATCH] usb: fix typo CONFIG_LIBUSB -> CONFIG_USB_LIBUSB

2017-10-03 Thread Dr. Thomas Jansen

A previous commit (4e5ee5b21c) broke USB pass-through by introducing
a typo in a configuration variable.

Signed-off-by: Thomas Jansen
