date:20151103

Re: [Qemu-devel] [Qemu-arm] [PATCH] ARM: ACPI: Fix MPIDR value in ACPI table

2015-11-03 Thread Peter Maydell

On 3 November 2015 at 04:33, Peter Crosthwaite
 wrote:
> So, I think this is just another case of the MPIDR information flow
> going the wrong way. It should go from board to all of CPU, DT and now
> this. I guess we can just fix this incrementally when we fix the
> implicit setting of MPIDR in mach-virt.

The difficulty with that is that to support KVM we need
to let KVM (ie the CPU object) override the board's ideas
about mpidr, because the kernel doesn't yet support letting
the board model inform it about what mpidr values to use.
So we probably need to have 'board model sets cpu property,
everything else reads cpu property which might or might not
be what the board hoped for'.

thanks
-- PMM

Re: [Qemu-devel] [PATCH] target-arm: Fix arm_debug_excp_handler() for singlestep enabled

2015-11-03 Thread Sergey Fedorov

On 02.11.2015 21:28, Peter Maydell wrote:
> On 2 November 2015 at 17:51, Sergey Fedorov  wrote:
>> CPU singlestep is done by generating a debug internal exception. Do not
>> raise a real CPU exception in case of singlestepping.
>>
>> Signed-off-by: Sergey Fedorov 
>> ---
>>  target-arm/op_helper.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
>> index 7929c71..67d9ffb 100644
>> --- a/target-arm/op_helper.c
>> +++ b/target-arm/op_helper.c
>> @@ -909,7 +909,7 @@ void arm_debug_excp_handler(CPUState *cs)
>>  uint64_t pc = is_a64(env) ? env->pc : env->regs[15];
>>  bool same_el = (arm_debug_target_el(env) == arm_current_el(env));
>>
>> -if (cpu_breakpoint_test(cs, pc, BP_GDB)) {
>> +if (cs->singlestep_enabled || cpu_breakpoint_test(cs, pc, BP_GDB)) {
>>  return;
>>  }
> So I think this will mean that if we're gdbstub-single-stepping then
> an architectural breakpoint on the insn we're stepping won't fire.
>
> Does using a test
>
> if (!cpu_breakpoint_test(cs, pc, BP_CPU)) {
> return;
> }
>
> fix the singlestep bug too? If so I think it would probably be
> preferable.

Actually, it is supposed that gdbstub breakpoints should be handled
before CPU breakpoints. So I think we should rather do this way:

if (cpu_breakpoint_test(cs, pc, BP_GDB) || !cpu_breakpoint_test(cs, pc, 
BP_CPU)) {
return;
}


Thanks,
Sergey

Re: [Qemu-devel] [PATCH v2 2/2] block: test 'blockdev-snapshot' using a file BDS as the overlay

2015-11-03 Thread Alberto Garcia

On Mon 02 Nov 2015 06:29:14 PM CET, Eric Blake  wrote:
>>> @@ -103,7 +103,8 @@ function add_snapshot_image()
>>> { 'options':
>>>   { 'driver': 'qcow2', 'node-name': 'snap_"${1}"', 
>>> "${extra_params}"
>>> 'file':
>>> -   { 'driver': 'file', 'filename': '"${snapshot_file}"' } } } 
>>> }"
>>> +   { 'driver': 'file', 'filename': '"${snapshot_file}"',
>>> + 'node-name': 'file_"${1}"' } } } }"
>> 
>> Pre-existing, but do those "" actually do anything?
>> 
>
> Actually, the "" are wrong.  Look at the full context: we have:
>
> cmd="..."${snapshot_file}"..."
>
> which means the expansion of $snapshot_file is _unquoted_.

Not really, it's quoted in all cases:

   'node-name': 'snap_"${1}"'
   'filename':  '"${snapshot_file}"'
   'node-name': 'file_"${1}"'

But it's true that the double quotes don't do anything so I'll remove
them.

Berto

Re: [Qemu-devel] qemu-2.2 using trace event

2015-11-03 Thread Stefan Hajnoczi

On Tue, Nov 3, 2015 at 9:13 AM, 浩樊啊  wrote:

Please keep qemu-devel@nongnu.org on CC so the discussion stays on the
mailing list where others can participate.

> you mean that qemu-system-x86_64 .   is a command to start a vm?
> can I just start a vm by a xml file and then use qemu-system_x86_64 -trace
> events=/tmp/events trace the vm?

No, each QEMU process is a separate VM.  Launching a new QEMU does not
attach to an existing QEMU process.

If you want to "attach" to a running QEMU process, you can use the
SystemTap tracing backend.  Make sure that your QEMU binary was built
with ./configure --enable-trace-backend=dtrace (that is the case in
Fedora and Red Hat-based distributions).  Take a look at the
/usr/share/systemtap/tapset/qemu-system-x86_64.stp tapset and
SystemTap documentation here:
https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
https://sourceware.org/systemtap/SystemTap_Beginners_Guide/

Stefan

Re: [Qemu-devel] SMM error in 2.4 changelog

2015-11-03 Thread Paolo Bonzini



On 02/11/2015 23:26, William Dauchy wrote:
> Hello,
> 
> I think there might be a mistake in the 2.4 changelog:
> http://wiki.qemu.org/ChangeLog/2.4
> "Support for system management mode (requires Linux 4.1)."
> 
> But I believe it's included in Linux v4.2 see
> https://lwn.net/Articles/648995/ for the merge window.
> 
> Sorry for the noise if it is not the right way to report a
> wiki/changelog mistake.

Yes, this is correct.  Thanks!

Paolo

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: _CCA attribute is compulsary

2015-11-03 Thread Shannon Zhao



On 2015/11/3 16:31, Graeme Gregory wrote:
> 
> 
> On Tue, 3 Nov 2015, at 02:25 AM, Shannon Zhao wrote:
>> Hi Graeme,
>>
>> On 2015/11/2 18:39, Graeme Gregory wrote:
>>> According to ACPI specification 6.2.17 _CCA (Cache Coherency Attribute)
>>> this attribute is compulsary on ARM systems. Add this attribute to
>>> the PCI host bridges as required.
>>>
>>
>> To ACPI 5.1 this object is not compulsory and if not supplied it has
>> default value for it. But to ACPI 6.0 it must be supplied on ARM systems.
>> Regarding this change, ACPI 6.0 fixes 5.1 for this object, right?
>>
> 
> Hi Shannon, the wording in ACPI 5.1 is "On ARM based systems, the _CCA
> object must be supplied all such devices."
> 
> So is not functionally different from 6.0.
> 
Oh, I see. It's updated by 5.1 Errata 1189.

Reviewed-by: Shannon Zhao 

> Graeme
> 
>>> Without this the kernel will produce the error
>>> [Firmware Bug]: PCI device :00:00.0 fail to setup DMA.
>>>
>>> Signed-off-by: Graeme Gregory 
>>> ---
>>>  hw/arm/virt-acpi-build.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>> index 1aaff1f..1430125 100644
>>> --- a/hw/arm/virt-acpi-build.c
>>> +++ b/hw/arm/virt-acpi-build.c
>>> @@ -180,6 +180,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
>>> MemMapEntry *memmap, int irq,
>>>  aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>>>  aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
>>>  aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>>> +aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>>>  
>>>  /* Declare the PCI Routing Table. */
>>>  Aml *rt_pkg = aml_package(nr_pcie_buses * PCI_NUM_PINS);
>>>
>>
>> -- 
>> Shannon
>>
>>

-- 
Shannon

Re: [Qemu-devel] [PATCH] target-arm: Clean up DISAS_UPDATE usage in AArch32 translation code

2015-11-03 Thread Sergey Fedorov

On 02.11.2015 21:29, Peter Maydell wrote:
> On 2 November 2015 at 18:16, Sergey Fedorov  wrote:
>> AArch32 translation code does not distinguish between DISAS_UPDATE and
>> DISAS_JUMP. Thus, we cannot use any of them without first updating PC in
>> CPU state. Furthermore, it is too complicated to update PC in CPU state
>> before PC gets updated in disas context. So it is hardly possible to
>> correctly end TB early if is is not likely to be executed before calling
>> disas_*_insn(), e.g. just after calling breakpoint check helper.
>>
>> Modify DISAS_UPDATE and DISAS_JUMP usage in AArch32 translation and
>> apply to them the same semantic as AArch64 translation does:
>>  - DISAS_UPDATE: update PC in CPU state when finishing translation
>>  - DISAS_JUMP:   preserve current PC value in CPU state when finishing
>>  translation
> Is this fixing the breakpoint related bug? If so the commit message
> should say so. Otherwise it just looks like cleanup...
>
> (I'll review the patch tomorrow.)

Yes it's fixing a bug in breakpoint handling. I'll update the commit
message.

Best,
Sergey

Re: [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e)

2015-11-03 Thread Dmitry Fleytman


> On 3 Nov 2015, at 07:44 AM, Jason Wang  wrote:
> 
> 
> 
> On 11/02/2015 03:49 PM, Dmitry Fleytman wrote:
>> 
>>> On 2 Nov 2015, at 05:35 AM, Jason Wang >> 
>>> >> wrote:
>>> 
>>> 
>>> 
>>> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote:
 Hello Jason,
 
 Thanks for reviewing. See my answers inline.
 
 
> On 30 Oct 2015, at 07:28 AM, Jason Wang  
> >
> > wrote:
> 
> 
> 
> On 10/28/2015 01:44 PM, Jason Wang wrote:
>> 
>> On 10/26/2015 01:00 AM, Leonid Bloch wrote:
>>> Hello qemu-devel,
>>> 
>>> This patch series is an RFC for the new networking device emulation
>>> we're developing for QEMU.
>>> 
>>> This new device emulates the Intel 82574 GbE Controller and works
>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
>>> 
>>> The status of the current series is "Functional Device Ready, work
>>> on Extended Features in Progress".
>>> 
>>> More precisely, these patches represent a functional device, which
>>> is recognized by the standard Intel drivers, and is able to transfer
>>> TX/RX packets with CSO/TSO offloads, according to the spec.
>>> 
>>> Extended features not supported yet (work in progress):
>>> 1. TX/RX Interrupt moderation mechanisms
>>> 2. RSS
>>> 3. Full-featured multi-queue (use of multiqueued network backend)
>>> 
>>> Also, there will be some code refactoring and performance
>>> optimization efforts.
>>> 
>>> This series was tested on Linux (Fedora 22) and Windows (2012R2)
>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various
>>> packet sizes.
>>> 
>>> More thorough testing, including data streams with different MTU
>>> sizes, and Microsoft Certification (HLK) tests, are pending missing
>>> features' development.
>>> 
>>> See commit messages (esp. "net: Introduce e1000e device emulation")
>>> for more information about the development approaches and the
>>> architecture options chosen for this device.
>>> 
>>> This series is based upon v2.3.0 tag of the upstream QEMU repository,
>>> and it will be rebased to latest before the final submission.
>>> 
>>> Please share your thoughts - any feedback is highly welcomed :)
>>> 
>>> Best Regards,
>>> Dmitry Fleytman.
>> Thanks for the series. Will go through this in next few days.
> 
> Have a quick glance at the series, got the following questions:
> 
> - Though e1000e differs from e1000 in many places, I still see lots of
> code duplications. We need consider to reuse e1000.c (or at least part
> of). I believe we don't want to fix a bug twice in two places in the
> future and I expect hundreds of lines could be saved through this way.
 
 That’s a good question :)
 
 This is how we started, we had a common “core” code base meant to
 implement all common logic (this split is still present in the patches
 - there are e1000e_core.c and e1000e.c files).
 Unfortunately at some point it turned out that there are more
 differences that commons. We noticed that the code becomes filled with
 many minor differences handling.
 This also made the code base more complicated and harder to follow.
 
 So at some point of time it was decided to split the code base and
 revert all changes done to the e1000 device (except a few
 fixes/improvements Leonid submitted a few days ago).
 
 Although there was common code between devices, total SLOC of e1000
 and e1000e devices became smaller after the split.
 
 Amount of code that may be shared between devices will be even smaller
 after we complete the implementation which still misses a few features
 (see cover letter) that will change many things.
 
 Still after the device implementation is done, we plan to review code
 similarities again to see if there are possibilities for code sharing.
>>> 
>>> I see, but if we can try to re-use or unify the codes from beginning, it
>>> would be a little bit easier. Looks like the differences were mainly:
>>> 
>>> 1) MSI-X support
>>> 2) offloading support through virtio-net header
>>> 3) trace points
>>> 4) other new functions through e1000e specific registers
>>> 
>>> So we could first unify the code through implementing the support of 2
>>> and 3 for e1000. For MSI-X and other e1000e specific new functions, it
>>> could be done through:
>>> 
>>> 1) model specific callbacks, e.g realize, transmission and reception
>>> 2) A new register flags e.g PHY_RW_E1000E which means the register is
>>> for e1000e only. Or even model specific wirteops

Re: [Qemu-devel] [PATCH v2] taget-ppc: Fix read access to IBAT registers higher than IBAT3

2015-11-03 Thread Julio Guerra

Ping :)

Le mer. 14 oct. 2015 19:43, Julio Guerra  a écrit :

> Fix the index used to read the IBAT's vector which results in IBAT0..3
> instead
> of IBAT4..N.
>
> The bug appeared by saving/restoring contexts including IBATs values.
>
> Signed-off-by: Julio Guerra 
> ---
>  target-ppc/translate_init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index b541473..76d9a02 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -305,7 +305,7 @@ static void spr_read_ibat (DisasContext *ctx, int
> gprn, int sprn)
>
>  static void spr_read_ibat_h (DisasContext *ctx, int gprn, int sprn)
>  {
> -tcg_gen_ld_tl(cpu_gpr[gprn], cpu_env, offsetof(CPUPPCState, IBAT[sprn
> & 1][(sprn - SPR_IBAT4U) / 2]));
> +tcg_gen_ld_tl(cpu_gpr[gprn], cpu_env, offsetof(CPUPPCState, IBAT[sprn
> & 1][((sprn - SPR_IBAT4U) / 2) + 4]));
>  }
>
>  static void spr_write_ibatu (DisasContext *ctx, int sprn, int gprn)
> --
> 2.5.2
>
>

Re: [Qemu-devel] [PATCH 00/19] buffer/vnc: improve vnc buffer hsndling

2015-11-03 Thread Gerd Hoffmann

> >buffer: allow a buffer to shrink gracefully
> 
> The last Patch isn't the latest version. I have one with improved comments 
> here:
> 
> https://github.com/plieven/qemu/commit/e599748ab1ef381d4b1c88bf1ea1454dd89353fb
> 
> I also had another improvement:
> 
> https://github.com/plieven/qemu/commit/2b4180a5f4ec29a59de692e9aa512b7b4d8023e7
> 
> which limits the number of memmove operation in qio_buffer_advance.

Can you git-send-email them to the list for review?

thanks,
  Gerd

Re: [Qemu-devel] [PATCH v3 00/11] vl.c: Error message rework

2015-11-03 Thread Markus Armbruster

Eduardo Habkost  writes:

> On Fri, Oct 30, 2015 at 05:23:27PM +0100, Markus Armbruster wrote:
>> Eduardo Habkost  writes:
>> 
>> > Changes v2 -> v3:
>> > * Removed patch: "vl.c: Convert error sentences to simpler phrases"
>> > * Removed patch: "vl.c: Reword -machine help error messages"
>> > * Removed patch: "vl.c: Reword fw_cfg name prefix warning"
>> > * Removed patch: "vl.c: Use US spelling for 'unrecognized'"
>> > * New patch: "vl.c: Change 'fail to parse' error message to 'failed to 
>> > parse'"
>> > * Squashed "vl.c: trivial: Don't wrap lines unnecessarily"
>> >   into "vl.c: Replace fprintf(stderr) with error_report()"
>> >
>> > Changes v1 -> v2:
>> > * Extra patches for many suggestions I got when changing vl.c to use
>> >   error_report()
>> >
>> > Eduardo Habkost (11):
>> >   vl.c: Replace fprintf(stderr) with error_report()
>> >   vl.c: Use error_report() when reporting shutdown signal
>> >   vl.c: Remove periods and exclamation points from error messages
>> >   vl.c: Use "warning:" prefix consistently on warnings
>> >   vl.c: Use "cannot" instead of "can not" in error messages
>> >   vl.c: Use 'quotes' instead of `quotes' in messages
>> >   vl.c: Remove unnecessary uppercase in error messages
>> >   vl.c: Change "fail to parse" error message to "failed to parse"
>> >   vl.c: Simplify "ignoring deprecated option" warnings
>> >   vl.c: Reword -no-kvm-pit-reinjection deprecation warning
>> >   vl.c: Use "%s support is disabled" error messages consistently
>> >
>> >  vl.c | 256 
>> > +--
>> >  1 file changed, 125 insertions(+), 131 deletions(-)
>> 
>> I guess I would've squashed some of these together, and perhaps touched
>> up the rest of the files listed in MAINTAINERS under "Main loop", too
>> (cpus.c main-loop.c qemu-timer.c).  Regardless, this looks ready to go
>> through my tree.
>
> Thanks! Feel free to squash as many patches as you want together. I
> split the changes just to make them easier to review and discuss.

Please have a look at pull-error-2015-11-03 in my tree at
.  Only squashing, no code change.
If you like the result, I'll post a pull request.

Re: [Qemu-devel] [PATCH 00/19] buffer/vnc: improve vnc buffer hsndling

2015-11-03 Thread Peter Lieven


Am 03.11.2015 um 09:23 schrieb Gerd Hoffmann:

buffer: allow a buffer to shrink gracefully

The last Patch isn't the latest version. I have one with improved comments here:

https://github.com/plieven/qemu/commit/e599748ab1ef381d4b1c88bf1ea1454dd89353fb

I also had another improvement:

https://github.com/plieven/qemu/commit/2b4180a5f4ec29a59de692e9aa512b7b4d8023e7

which limits the number of memmove operation in qio_buffer_advance.

Can you git-send-email them to the list for review?


done

Peter

[Qemu-devel] [PATCH 1/2] io/buffer: allow a buffer to shrink gracefully

2015-11-03 Thread Peter Lieven

the idea behind this patch is to allow the buffer to shrink, but
make this a seldom operation. The buffers average size is measured
exponentionally smoothed with am alpha of 1/128.

Signed-off-by: Peter Lieven 
---
 include/io/buffer.h |  1 +
 io/buffer.c | 42 +-
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/io/buffer.h b/include/io/buffer.h
index f6668cb..f63869e 100644
--- a/include/io/buffer.h
+++ b/include/io/buffer.h
@@ -37,6 +37,7 @@ struct QIOBuffer {
 char *name;
 size_t capacity;
 size_t offset;
+uint64_t avg_size;
 uint8_t *buffer;
 };
 
diff --git a/io/buffer.c b/io/buffer.c
index 0fd3cea..d2a6043 100644
--- a/io/buffer.c
+++ b/io/buffer.c
@@ -23,6 +23,10 @@
 
 #define QIO_BUFFER_MIN_INIT_SIZE 4096
 #define QIO_BUFFER_MIN_SHRINK_SIZE  65536
+/* define the factor alpha for the expentional smoothing
+ * that is used in the average size calculation. a shift
+ * of 7 results in an alpha of 1/2^7. */
+#define QIO_BUFFER_AVG_SIZE_SHIFT   7
 
 static size_t buf_req_size(QIOBuffer *buffer, size_t len)
 {
@@ -37,6 +41,11 @@ static void buf_adj_size(QIOBuffer *buffer, size_t len)
 buffer->buffer = g_realloc(buffer->buffer, buffer->capacity);
 trace_qio_buffer_resize(buffer->name ?: "unnamed",
 old, buffer->capacity);
+
+/* make it even harder for the buffer to shrink, reset average size
+ * to currenty capacity if it is larger than the average. */
+buffer->avg_size = MAX(buffer->avg_size,
+   buffer->capacity << QIO_BUFFER_AVG_SIZE_SHIFT);
 }
 
 void qio_buffer_init(QIOBuffer *buffer, const char *name, ...)
@@ -48,21 +57,34 @@ void qio_buffer_init(QIOBuffer *buffer, const char *name, 
...)
 va_end(ap);
 }
 
-void qio_buffer_shrink(QIOBuffer *buffer)
+static uint64_t get_buf_avg_size(QIOBuffer *buffer)
 {
-/*
- * Only shrink in case the used size is *much* smaller than the
- * capacity, to avoid bumping up & down the buffers all the time.
+return buffer->avg_size >> QIO_BUFFER_AVG_SIZE_SHIFT;
+}
+
+void qio_buffer_shrink(QIOBuffer *buffer)
+ {
+size_t new;
+
+/* Calculate the average size of the buffer as
+ * avg_size = avg_size * ( 1 - a ) + required_size * a
+ * where a is 1 / 2 ^ QIO_BUFFER_AVG_SIZE_SHIFT. */
+buffer->avg_size *= (1 << QIO_BUFFER_AVG_SIZE_SHIFT) - 1;
+buffer->avg_size >>= QIO_BUFFER_AVG_SIZE_SHIFT;
+buffer->avg_size += buf_req_size(buffer, 0);
+
+/* And then only shrink if the average size of the buffer is much
+ * too big, to avoid bumping up & down the buffers all the time.
  * realloc() isn't exactly cheap ...
  */
-if (buffer->offset < (buffer->capacity >> 3) &&
-buffer->capacity > QIO_BUFFER_MIN_SHRINK_SIZE) {
-return;
+new = buf_req_size(buffer, get_buf_avg_size(buffer));
+if (new < buffer->capacity >> 3 &&
+new >= QIO_BUFFER_MIN_SHRINK_SIZE) {
+buf_adj_size(buffer, get_buf_avg_size(buffer));
 }
-
-buf_adj_size(buffer, 0);
 }
 
+
 void qio_buffer_reserve(QIOBuffer *buffer, size_t len)
 {
 if ((buffer->capacity - buffer->offset) < len) {
@@ -83,6 +105,7 @@ uint8_t *qio_buffer_end(QIOBuffer *buffer)
 void qio_buffer_reset(QIOBuffer *buffer)
 {
 buffer->offset = 0;
+qio_buffer_shrink(buffer);
 }
 
 void qio_buffer_free(QIOBuffer *buffer)
@@ -107,6 +130,7 @@ void qio_buffer_advance(QIOBuffer *buffer, size_t len)
 memmove(buffer->buffer, buffer->buffer + len,
 (buffer->offset - len));
 buffer->offset -= len;
+qio_buffer_shrink(buffer);
 }
 
 void qio_buffer_move_empty(QIOBuffer *to, QIOBuffer *from)
-- 
1.9.1

[Qemu-devel] [PATCH 2/2] io/buffer: avoid memmove at each qio_buffer_advance

2015-11-03 Thread Peter Lieven

memmove isn't exactly cheap at it involves temporary buffers
if the memory areas are overlapping. So make qio_buffer_advance
basically a pointer adjustment, but still keep the wasted memory
in reasonable limits.

Signed-off-by: Peter Lieven 
---
 include/io/buffer.h |  2 ++
 io/buffer.c | 43 ---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/include/io/buffer.h b/include/io/buffer.h
index f63869e..43688cc 100644
--- a/include/io/buffer.h
+++ b/include/io/buffer.h
@@ -39,6 +39,8 @@ struct QIOBuffer {
 size_t offset;
 uint64_t avg_size;
 uint8_t *buffer;
+size_t base_offs;
+uint8_t *base_ptr;
 };
 
 /**
diff --git a/io/buffer.c b/io/buffer.c
index d2a6043..f1e4570 100644
--- a/io/buffer.c
+++ b/io/buffer.c
@@ -21,24 +21,27 @@
 #include "io/buffer.h"
 #include "trace.h"
 
-#define QIO_BUFFER_MIN_INIT_SIZE 4096
-#define QIO_BUFFER_MIN_SHRINK_SIZE  65536
+#define QIO_BUFFER_MIN_INIT_SIZE4096
+#define QIO_BUFFER_MIN_SHRINK_SIZE 65536
+#define QIO_BUFFER_MAX_WASTED_SIZE   1048576
 /* define the factor alpha for the expentional smoothing
  * that is used in the average size calculation. a shift
  * of 7 results in an alpha of 1/2^7. */
-#define QIO_BUFFER_AVG_SIZE_SHIFT   7
+#define QIO_BUFFER_AVG_SIZE_SHIFT  7
 
 static size_t buf_req_size(QIOBuffer *buffer, size_t len)
 {
 return MAX(QIO_BUFFER_MIN_INIT_SIZE,
-   pow2ceil(buffer->offset + len));
+   pow2ceil(buffer->base_offs + buffer->offset + len));
 }
 
 static void buf_adj_size(QIOBuffer *buffer, size_t len)
 {
 size_t old = buffer->capacity;
 buffer->capacity = buf_req_size(buffer, len);
-buffer->buffer = g_realloc(buffer->buffer, buffer->capacity);
+buffer->base_ptr = g_realloc(buffer->base_ptr, buffer->capacity);
+buffer->buffer = buffer->base_ptr + buffer->base_offs;
+
 trace_qio_buffer_resize(buffer->name ?: "unnamed",
 old, buffer->capacity);
 
@@ -105,17 +108,21 @@ uint8_t *qio_buffer_end(QIOBuffer *buffer)
 void qio_buffer_reset(QIOBuffer *buffer)
 {
 buffer->offset = 0;
+buffer->base_offs = 0;
+buffer->buffer = buffer->base_ptr;
 qio_buffer_shrink(buffer);
 }
 
 void qio_buffer_free(QIOBuffer *buffer)
 {
 trace_qio_buffer_free(buffer->name ?: "unnamed", buffer->capacity);
-g_free(buffer->buffer);
+g_free(buffer->base_ptr);
 g_free(buffer->name);
 buffer->offset = 0;
+buffer->base_offs = 0;
 buffer->capacity = 0;
 buffer->buffer = NULL;
+buffer->base_ptr = NULL;
 buffer->name = NULL;
 }
 
@@ -127,10 +134,18 @@ void qio_buffer_append(QIOBuffer *buffer, const void 
*data, size_t len)
 
 void qio_buffer_advance(QIOBuffer *buffer, size_t len)
 {
-memmove(buffer->buffer, buffer->buffer + len,
-(buffer->offset - len));
+if (buffer->offset - len == 0) {
+return qio_buffer_reset(buffer);
+}
+buffer->buffer += len;
+buffer->base_offs += len;
 buffer->offset -= len;
-qio_buffer_shrink(buffer);
+if (buffer->base_offs > QIO_BUFFER_MAX_WASTED_SIZE) {
+memmove(buffer->base_ptr, buffer->buffer, buffer->offset);
+buffer->buffer = buffer->base_ptr;
+buffer->base_offs = 0;
+qio_buffer_shrink(buffer);
+}
 }
 
 void qio_buffer_move_empty(QIOBuffer *to, QIOBuffer *from)
@@ -140,14 +155,18 @@ void qio_buffer_move_empty(QIOBuffer *to, QIOBuffer *from)
 from->name ?: "unnamed");
 assert(to->offset == 0);
 
-g_free(to->buffer);
+g_free(to->base_ptr);
 to->offset = from->offset;
 to->capacity = from->capacity;
 to->buffer = from->buffer;
+to->base_offs = from->base_offs;
+to->base_ptr = from->base_ptr;
 
 from->offset = 0;
 from->capacity = 0;
 from->buffer = NULL;
+from->base_offs = 0;
+from->base_ptr = NULL;
 }
 
 void qio_buffer_move(QIOBuffer *to, QIOBuffer *from)
@@ -164,8 +183,10 @@ void qio_buffer_move(QIOBuffer *to, QIOBuffer *from)
 qio_buffer_reserve(to, from->offset);
 qio_buffer_append(to, from->buffer, from->offset);
 
-g_free(from->buffer);
+g_free(from->base_ptr);
 from->offset = 0;
 from->capacity = 0;
 from->buffer = NULL;
+from->base_offs = 0;
+from->base_ptr = NULL;
 }
-- 
1.9.1

[Qemu-devel] [PATCH 0/2] vnc buffer enhancements for 2.5

2015-11-03 Thread Peter Lieven

These are 2 patches on top of Gerds improve buffer handling series.

Patch 1 is an updated version of the patch included in Gerds series and
Patch 2 is a further optimization open for discussion.

Peter

Peter Lieven (2):
  io/buffer: allow a buffer to shrink gracefully
  io/buffer: avoid memmove at each qio_buffer_advance

 include/io/buffer.h |  3 ++
 io/buffer.c | 81 +
 2 files changed, 66 insertions(+), 18 deletions(-)

-- 
1.9.1

Re: [Qemu-devel] [PATCH] target-arm: Fix arm_debug_excp_handler() for singlestep enabled

2015-11-03 Thread Peter Maydell

On 3 November 2015 at 09:02, Sergey Fedorov  wrote:
> On 02.11.2015 21:28, Peter Maydell wrote:
>> On 2 November 2015 at 17:51, Sergey Fedorov  wrote:
>>> CPU singlestep is done by generating a debug internal exception. Do not
>>> raise a real CPU exception in case of singlestepping.
>>>
>>> Signed-off-by: Sergey Fedorov 
>>> ---
>>>  target-arm/op_helper.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
>>> index 7929c71..67d9ffb 100644
>>> --- a/target-arm/op_helper.c
>>> +++ b/target-arm/op_helper.c
>>> @@ -909,7 +909,7 @@ void arm_debug_excp_handler(CPUState *cs)
>>>  uint64_t pc = is_a64(env) ? env->pc : env->regs[15];
>>>  bool same_el = (arm_debug_target_el(env) == arm_current_el(env));
>>>
>>> -if (cpu_breakpoint_test(cs, pc, BP_GDB)) {
>>> +if (cs->singlestep_enabled || cpu_breakpoint_test(cs, pc, BP_GDB)) 
>>> {
>>>  return;
>>>  }
>> So I think this will mean that if we're gdbstub-single-stepping then
>> an architectural breakpoint on the insn we're stepping won't fire.
>>
>> Does using a test
>>
>> if (!cpu_breakpoint_test(cs, pc, BP_CPU)) {
>> return;
>> }
>>
>> fix the singlestep bug too? If so I think it would probably be
>> preferable.
>
> Actually, it is supposed that gdbstub breakpoints should be handled
> before CPU breakpoints. So I think we should rather do this way:
>
> if (cpu_breakpoint_test(cs, pc, BP_GDB) || !cpu_breakpoint_test(cs, pc, 
> BP_CPU)) {
> return;
> }

Yes, that sounds like the right logic. I think a comment will be
helpful to explain what's going on for future readers :-)

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select

2015-11-03 Thread Laszlo Ersek

On 11/03/15 01:35, Gabriel L. Somlo wrote:
> Currently, the fw_cfg internal API specifies that if an item was set up
> with a read callback, the callback must be run each time a byte is read
> from the item. This behavior is both wasteful (most items do not have a
> read callback set), and impractical for bulk transfers (e.g., DMA read).
> 
> At the time of this writing, the only items configured with a callback
> are "/etc/table-loader", "/etc/acpi/tables", and "/etc/acpi/rsdp". They
> all share the same callback functions: virt_acpi_build_update() on ARM
> (in hw/arm/virt-acpi-build.c), and acpi_build_update() on i386 (in
> hw/i386/acpi.c). Both of these callbacks are one-shot (i.e. they return
> without doing anything at all after the first time they are invoked with
> a given build_state; since build_state is also shared across all three
> items mentioned above, the callback only ever runs *once*, the first
> time either of the listed items is read).
> 
> This patch amends the specification for fw_cfg_add_file_callback() to
> state that any available read callback will only be invoked once each
> time the item is selected. This change has no practical effect on the
> current behavior of QEMU, and it enables us to significantly optimize
> the behavior of fw_cfg reads during guest firmware setup, eliminating
> a large amount of redundant callback checks and invocations.
> 
> Cc: Laszlo Ersek 
> Cc: Gerd Hoffmann 
> Cc: Marc Marí 
> Signed-off-by: Gabriel Somlo 
> ---
>  hw/nvram/fw_cfg.c | 19 ++-
>  include/hw/nvram/fw_cfg.h | 10 +++---
>  2 files changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 73b0a81..6e6414b 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -252,7 +252,8 @@ static void fw_cfg_write(FWCfgState *s, uint8_t value)
>  
>  static int fw_cfg_select(FWCfgState *s, uint16_t key)
>  {
> -int ret;
> +int arch, ret;
> +FWCfgEntry *e;
>  
>  s->cur_offset = 0;
>  if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) {
> @@ -261,6 +262,12 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>  } else {
>  s->cur_entry = key;
>  ret = 1;
> +/* entry successfully selected, now run callback if present */
> +arch = !!(key & FW_CFG_ARCH_LOCAL);
> +e = >entries[arch][key & FW_CFG_ENTRY_MASK];
> +if (e->read_callback) {
> +e->read_callback(e->callback_opaque, s->cur_offset);
> +}
>  }
>  
>  trace_fw_cfg_select(s, key, ret);
> @@ -276,9 +283,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>  if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= 
> e->len)
>  ret = 0;
>  else {
> -if (e->read_callback) {
> -e->read_callback(e->callback_opaque, s->cur_offset);
> -}
>  ret = e->data[s->cur_offset++];
>  }
>  
> @@ -371,10 +375,6 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
>  len = (e->len - s->cur_offset);
>  }
>  
> -if (e->read_callback) {
> -e->read_callback(e->callback_opaque, s->cur_offset);
> -}
> -
>  /* If the access is not a read access, it will be a skip access,
>   * tested before.
>   */
> @@ -513,7 +513,8 @@ static void fw_cfg_reset(DeviceState *d)
>  {
>  FWCfgState *s = FW_CFG(d);
>  
> -fw_cfg_select(s, 0);
> +/* we never register a read callback for FW_CFG_SIGNATURE */
> +fw_cfg_select(s, FW_CFG_SIGNATURE);
>  }
>  
>  /* Save restore 32 bit int as uint16_t
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index 4b5e196..a1cfaa4 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -183,13 +183,9 @@ void fw_cfg_add_file(FWCfgState *s, const char 
> *filename, void *data,
>   * structure residing at key value FW_CFG_FILE_DIR, containing the item name,
>   * data size, and assigned selector key value.
>   * Additionally, set a callback function (and argument) to be called each
> - * time a byte is read by the guest from this particular item, or, in the
> - * case of DMA, each time a read or skip request overlaps with the valid
> - * offset range of the item.
> - * NOTE: In addition to the opaque argument set here, the callback function
> - * takes the current data offset as an additional argument, allowing it the
> - * option of only acting upon specific offset values (e.g., 0, before the
> - * first data byte of the selected item is returned to the guest).
> + * time this item is selected (by having its selector key either written to
> + * the fw_cfg control register, or passed to QEMU in FWCfgDmaAccess.control
> + * with FW_CFG_DMA_CTL_SELECT).
>   */
>  void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
>FWCfgReadCallback callback, void

Re: [Qemu-devel] [PATCH 01/13] PPC: Allow Rc bit to be set on mtspr

2015-11-03 Thread Thomas Huth

On 23/10/15 15:56, Mark Cave-Ayland wrote:
> From: Alexander Graf 
> 
> According to the ISA setting the Rc bit on mtspr is undefined behavior.
> Real 750 hardware simply ignores the bit and doesn't touch cr0 though.

According to PowerISA 2.07, chapter 1.1.3:

"Reserved fields in instructions are ignored by the pro-
cessor."

And the lowest bit of the mtspr opcode is marked as reserved. So I think
this is not just a hack, but even a proper fix.

> Unfortunately, Mac OS 9 relies on this fact and executes a few mtspr
> instructions (to set XER for example) with Rc set.
> 
> So let's handle the bit the same way hardware does and ignore it.
> 
> Signed-off-by: Alexander Graf 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  target-ppc/translate.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index c2bc1a7..d1f0f13 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -9884,7 +9884,7 @@ GEN_HANDLER(mtcrf, 0x1F, 0x10, 0x04, 0x0801, 
> PPC_MISC),
>  GEN_HANDLER(mtmsrd, 0x1F, 0x12, 0x05, 0x001EF801, PPC_64B),
>  #endif
>  GEN_HANDLER(mtmsr, 0x1F, 0x12, 0x04, 0x001FF801, PPC_MISC),
> -GEN_HANDLER(mtspr, 0x1F, 0x13, 0x0E, 0x0001, PPC_MISC),
> +GEN_HANDLER(mtspr, 0x1F, 0x13, 0x0E, 0x, PPC_MISC),
>  GEN_HANDLER(dcbf, 0x1F, 0x16, 0x02, 0x03C1, PPC_CACHE),
>  GEN_HANDLER(dcbi, 0x1F, 0x16, 0x0E, 0x03E1, PPC_CACHE),
>  GEN_HANDLER(dcbst, 0x1F, 0x16, 0x01, 0x03E1, PPC_CACHE),

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH v7 03/35] acpi: add aml_create_field

2015-11-03 Thread Xiao Guangrong




On 11/03/2015 02:14 PM, Shannon Zhao wrote:



On 2015/11/2 17:13, Xiao Guangrong wrote:

Implement CreateField term which is used by NVDIMM _DSM method in later patch

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/aml-build.c | 13 +
  include/hw/acpi/aml-build.h |  1 +
  2 files changed, 14 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a72214d..9fe5e7b 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1151,6 +1151,19 @@ Aml *aml_sizeof(Aml *arg)
  return var;
  }

+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateField */
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
+{
+Aml *var = aml_alloc();
+build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+build_append_byte(var->buf, 0x13); /* CreateFieldOp */
+aml_append(var, srcbuf);
+aml_append(var, index);
+aml_append(var, len);
+build_append_namestring(var->buf, "%s", name);
+return var;
+}
+
  void
  build_header(GArray *linker, GArray *table_data,
   AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7296efb..7e1c43b 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -276,6 +276,7 @@ Aml *aml_touuid(const char *uuid);
  Aml *aml_unicode(const char *str);
  Aml *aml_derefof(Aml *arg);
  Aml *aml_sizeof(Aml *arg);
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);


Maybe this could be moved together with existing aml_create_dword_field.


Not bad, will do. :)




  void
  build_header(GArray *linker, GArray *table_data,

Re: [Qemu-devel] [PATCH v3 2/3] block: Remove inner quotation marks in iotest 085

2015-11-03 Thread Eric Blake

On 11/03/2015 03:32 AM, Alberto Garcia wrote:
> This patch removes the inner quotation marks in all cases like this:
> 
>cmd=" ... "${variable}" ... "
> 
> Signed-off-by: Alberto Garcia 
> ---
>  tests/qemu-iotests/085 | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Eric Blake 

I might have worded the commit message differently, though:

block: Remove incorrect "" in iotest 085

We had the patterns:
   cmd="..."${variable}"..."
   _send_qemu_cmd ... "..."${variable}"..."

which is equivalent to using ${variable} unquoted.  In the cmd= usage,
that happened to be okay even though it is unusual (because no word
splitting occurs on variable assignment); but where the usage appeared
as an argument to _send_qemu_cmd, it was actively wrong (any whitespace
in ${variable} would have caused word splitting).

Fix it by removing the inner "", leaving ${variable} to be expanded
inside the outer "" as desired.

> @@ -152,7 +152,7 @@ echo === Invalid command - missing device and nodename ===
>  echo
>  
>  _send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot-sync',
> - 'arguments': { 
> 'snapshot-file':'"${TEST_DIR}/1-${snapshot_virt0}"',
> + 'arguments': { 
> 'snapshot-file':'${TEST_DIR}/1-${snapshot_virt0}',
>   'format': 'qcow2' } }" "error"
>  

Here's an example of the actual bug being fixed.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] RFC: libyajl for JSON

2015-11-03 Thread Markus Armbruster

Luiz Capitulino  writes:

> On Tue, 3 Nov 2015 14:53:59 +0100
> Paolo Bonzini  wrote:
>
>> On 03/11/2015 14:46, Luiz Capitulino wrote:
>> >> > Can you explain why that would make sense? :)  (Especially since there
>> >> > is another extension---JSON5---that does exactly what we're doing, so it
>> >> > probably wasn't that stupid an idea).
>> > Let's be pragmatic. *If* this is the only issue stopping us from
>> > dropping our own parser in favor of something more widely used and
>> > *if* libvirt doesn't make use of the feature, it's something we
>> > should strongly consider.
>> 
>> I'm not sure what's so bad about our parser that makes it worthwhile to:
>
> It's not that it's bad. It's about the advantages of dropping hundreds of
> lines of NIH code and switching to something more widely used.

I wish we would've / could've avoided NIH back then, but I'm not sure
getting rid of this piece of NIH now is worthwhile.

json-{lexer,parser,streamer}.[ch] together have 949 SLOC.  I'm not
counting tests, because we'd most likely keep them anyway.  This is less
than 0.1% of the QEMU source code :)

Maintenance hasn't been costing us a fortune exactly: 40 commits in six
years.

I'm annoyed by its relative shoddiness, but the prospect of having to
fix it up isn't something that keeps me up at night.

>Also,
> any maintenance time we spend on libyajl will also be automatically
> enjoyed by libvirt which is excellent.

Excellent indeed *if* upstream is responsive.

> On the other hand, I don't want to push too hard for it because I do
> recognize that switching has a cost and I won't be able to help with
> that myself.
>
>> 1) uglify all tests and make them inconsistent with the QAPI schemas,
>> which also uses single-quoted strings
>
> This doesn't seem hard to fix, we could pre-process the test files,
> say in Python, to add the needed escaping.

The test files are actually C code... sure you want to mangle C code in
Python?

>> 2) waste time finding a replacement for % interpolation (the best
>> replacement here would be to rewrite the tests in Python IMHO, but
>> that's not a small ask)
>
> Is this only used by tests?

No.

> Can you give an example of this feature?

Yes:

static QDict *build_qmp_error_dict(Error *err)
{
QObject *obj;

obj = qobject_from_jsonf("{ 'error': { 'class': %s, 'desc': %s } }",
 ErrorClass_lookup[error_get_class(err)],
 error_get_pretty(err));

return qobject_to_qdict(obj);
}

Builds a QDict with a single key "error".  Its value is a QDict with key
"class", value ErrorClass_lookup[error_get_class(err)], and key "desc",
value error_get_pretty(err), where "value" really means the C string
quoted for JSON and wrapped in a QString.

As I wrote elsewhere in the thread, we could build this by hand.  Much
less readable, but that might be tolerable, as the feature isn't widely
used.  I'm not thrilled about it, though.  It's too easy to forget the
quoting.

To find more examples, try "git-grep _json[fv]".

>> Just let's remove the weird (to not say worse) usage of QDict/QList to
>> store tokens...

Any replacement effort will have to compete on merits with fixing up
what we have.

Re: [Qemu-devel] [PATCH v3 3/3] block: test 'blockdev-snapshot' using a file BDS as the overlay

2015-11-03 Thread Eric Blake

On 11/03/2015 03:32 AM, Alberto Garcia wrote:
> This test checks that it is not possible to create a snapshot if the
> requested overlay node is a BDS which does not support backing images.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  tests/qemu-iotests/085 | 12 +++-
>  tests/qemu-iotests/085.out |  4 
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 

> +++ b/tests/qemu-iotests/085.out
> @@ -62,6 +62,10 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
> backing_file=TEST_DIR/
>  {"return": {}}
>  {"return": {}}
>  
> +=== Invalid command - cannot create a snapshot using a file BDS ===
> +
> +{"error": {"class": "GenericError", "desc": "The snapshot does not support 
> backing images"}}

This message could use improvement; more on that in 1/3.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 1/3] block: Disallow snapshots if the overlay doesn't support backing files

2015-11-03 Thread Eric Blake

On 11/03/2015 03:32 AM, Alberto Garcia wrote:
> This addresses scenarios like this one:
> 
>   { 'execute': 'blockdev-add', 'arguments':
> { 'options': { 'driver': 'qcow2',
>'node-name': 'new0',
>'file': { 'driver': 'file',
>  'filename': 'new.qcow2',
>  'node-name': 'file0' } } } }
> 
>   { 'execute': 'blockdev-snapshot', 'arguments':
> { 'node': 'virtio0',
>   'overlay': 'file0' } }
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Eric Blake 
> Reviewed-by: Max Reitz 
> ---
>  blockdev.c | 5 +
>  1 file changed, 5 insertions(+)
> 

> +++ b/blockdev.c
> @@ -1667,6 +1667,11 @@ static void 
> external_snapshot_prepare(BlkTransactionState *common,
>  
>  if (state->new_bs->backing != NULL) {
>  error_setg(errp, "The snapshot already has a backing image");
> +return;
> +}
> +
> +if (!state->new_bs->drv->supports_backing) {
> +error_setg(errp, "The snapshot does not support backing images");

If we do s/snapshot/overlay/ here, the error message will make more
sense (I noticed it in 3/3).

My R-b stands either way, though.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties

2015-11-03 Thread Stefan Hajnoczi

On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> @@ -1732,6 +1757,10 @@ static void 
> block_dirty_bitmap_add_prepare(BlkActionState *common,
>  BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>   common, common);
>  
> +if (action_check_cancel_mode(common, errp) < 0) {
> +return;
> +}
> +
>  action = common->action->block_dirty_bitmap_add;
>  /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>  qmp_block_dirty_bitmap_add(action->node, action->name,
> @@ -1767,6 +1796,10 @@ static void 
> block_dirty_bitmap_clear_prepare(BlkActionState *common,
>   common, common);
>  BlockDirtyBitmap *action;
>  
> +if (action_check_cancel_mode(common, errp) < 0) {
> +return;
> +}
> +
>  action = common->action->block_dirty_bitmap_clear;
>  state->bitmap = block_dirty_bitmap_lookup(action->node,
>action->name,

Why do the bitmap add/clear actions not support err-cancel=all?

I understand why other block jobs don't support it, but it's not clear
why these non-block job actions cannot.


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH v2] pc: allow raising low memory via max-ram-below-4g option

2015-11-03 Thread Gerd Hoffmann

This patch extends the functionality of the max-ram-below-4g option
to also allow increasing lowmem.  While being at it also rework the
lowmem calculation logic and add a longish comment describing how it
works and what the compatibility constrains are.

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc.c  |  2 +-
 hw/i386/pc_piix.c | 61 +++
 2 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0cb8afd..55d6ca3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1885,7 +1885,7 @@ static void pc_machine_initfn(Object *obj)
 pc_machine_get_hotplug_memory_region_size,
 NULL, NULL, NULL, _abort);
 
-pcms->max_ram_below_4g = 1ULL << 32; /* 4G */
+pcms->max_ram_below_4g = 0xe000; /* 3.5G */
 object_property_add(obj, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
 pc_machine_get_max_ram_below_4g,
 pc_machine_set_max_ram_below_4g,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 393dcc4..3cc0a72 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -100,29 +100,46 @@ static void pc_init1(MachineState *machine,
 PcGuestInfo *guest_info;
 ram_addr_t lowmem;
 
-/* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory).
- * If it doesn't, we need to split it in chunks below and above 4G.
- * In any case, try to make sure that guest addresses aligned at
- * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
- * For old machine types, use whatever split we used historically to avoid
- * breaking migration.
+/*
+ * Calculate ram split, for memory below and above 4G.  It's a bit
+ * complicated for backward compatibility reasons ...
+ *
+ *  - Traditional split is 3.5G (lowmem = 0xe000).  This is the
+ *default value for max_ram_below_4g now.
+ *
+ *  - Then, to gigabyte align the memory, we move the split to 3G
+ *(lowmem = 0xc000).  But only in case we have to split in
+ *the first place, i.e. ram_size is larger than (traditional)
+ *lowmem.  And for new machine types (gigabyte_align = true)
+ *only, for live migration compatibility reasons.
+ *
+ *  - Next the max-ram-below-4g option was added, which allowed to
+ *reduce lowmem to a smaller value, to allow a larger PCI I/O
+ *window below 4G.  qemu doesn't enforce gigabyte alignment here,
+ *but prints a warning.
+ *
+ *  - Finally max-ram-below-4g got updated to also allow raising lowmem,
+ *so legacy non-PAE guests can get as much memory as possible in
+ *the 32bit address space below 4G.
+ *
+ * Examples:
+ *qemu -M pc-1.7 -m 4G(old default)-> 3584M low,  512M high
+ *qemu -M pc -m 4G(new default)-> 3072M low, 1024M high
+ *qemu -M pc,max-ram-below-4g=2G -m 4G -> 2048M low, 2048M high
+ *qemu -M pc,max-ram-below-4g=4G -m 3968M  -> 3968M low (=4G-128M)
  */
-if (machine->ram_size >= 0xe000) {
-lowmem = gigabyte_align ? 0xc000 : 0xe000;
-} else {
-lowmem = 0xe000;
-}
-
-/* Handle the machine opt max-ram-below-4g.  It is basically doing
- * min(qemu limit, user limit).
- */
-if (lowmem > pcms->max_ram_below_4g) {
-lowmem = pcms->max_ram_below_4g;
-if (machine->ram_size - lowmem > lowmem &&
-lowmem & ((1ULL << 30) - 1)) {
-error_report("Warning: Large machine and max_ram_below_4g(%"PRIu64
- ") not a multiple of 1G; possible bad performance.",
- pcms->max_ram_below_4g);
+lowmem = pcms->max_ram_below_4g;
+if (machine->ram_size >= pcms->max_ram_below_4g) {
+if (gigabyte_align) {
+if (lowmem > 0xc000) {
+lowmem = 0xc000;
+}
+if (lowmem & ((1ULL << 30) - 1)) {
+error_report("Warning: Large machine and max_ram_below_4g "
+ "(%" PRIu64 ") not a multiple of 1G; "
+ "possible bad performance.",
+ pcms->max_ram_below_4g);
+}
 }
 }
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v8.5 1/4] qapi: Drop all_members parameter from check()

2015-11-03 Thread Markus Armbruster

Eric Blake  writes:

> The implementation of QAPISchemaObjectTypeMember.check() always
> adds the member currently being checked to both the all_members
> and seen parameters.

QAPISchemaObjectTypeMember.check() does four things:

1. Compute self.type

   Precondition: all types are defined.

2. Accumulate members

   all_members serves as accumulator.

   We'll see that its only actual use is the owning object type's
   check(), which uses it to compute self.members.

3. Check for collisions

   This works by accumulating names in seen.  Precondition: seen
   contains the names seen so far.

   Note that this part uses seen like a set.  See 4.

4. Accumulate a map from names to members

   seen serves as accumulator.

   We'll see that its only actual user is the owning object type's
   variants.check(), which uses it to compute variants.tag_member from
   variants.tag_name.

>  However, the three callers of this method
> pass in the following parameters:
>
> QAPISchemaObjectType.check():
>   - all_members contains all non-variant members seen to date,
>   for use in populating self.members
>   - seen contains all non-variant members seen to date, for
>   use in checking for collisions

Yes, and:

- we're calling it for m in self.local_members
- before the loop, all_members and seen are initialized to the inherited
  non-variant members
- after the loop, they therefore contain all non-variant members

This caller uses all four things done by QAPISchemaObjectType.check():

1. Compute m.type
2. Accumulate non-variant members
3. Check for collisions among non-variant members
   Before the loop, seen contains the inherited members, which don't
   collide (self.base.check() ensures that).  The loop adds the local
   members one by one, checking for collisions.
4. Accumulate a map from names to non-variant members
   Similar argument to 3.

> QAPISchemaObjectTypeVariant.check():

Do you mean QAPISchemaObjectVariants.check()?

>   - all_members is a throwaway empty list
>   - seen is a throwaway dictionary created as a copy by
>   QAPISchemaObjectVariants.check() (since the members of
>   one variant cannot collide with those from another), for
>   use in checking for collisions (technically, we no longer
>   need to check for collisions between tag values and QMP
>   key names, but that's a cleanup for another patch)
>
> QAPISchemaAlternateType.check():
>   - all_members is a throwaway empty list
>   - seen is a throwaway empty dict

I'm afraid you're omitting a few steps here, and I think you missed
QAPISchemaObjectVariants.check()'s self.tag_member.check().  Let me go
through the remaining callers of QAPISchemaObjectTypeMember.check() real
slow.

* QAPISchemaObjectType.check() calls it via v.check(), which is a thin
  wrapper that throws away 2. and asserts something of no interest here.
  seen is a map from names to non-variant members.  Therefore,
  QAPISchemaObjectTypeMember.check() gets used here as follows:

  1. Compute v.type
  2. Thrown away
  3. Check for collision of tag value with non-variant members
 This is obsolete now.
  4. Thrown away

* QAPISchemaAlternateType.check() calls it via v.check(), which is the
  same thin wrapper.  seen is {} here.  Therefore:

  1. Compute v.type
  2. Thrown away
  3. No-op
  4. Thrown away

* QAPISchemaObjectTypeVariants.check() calls
  self.tag_member.check(schema, members, seen)
  where members and seen contain the non-variant members.
  However, it gets called only when the owning type is an alternate, and
  then members and seen are both empty.  Therefore:

  1. Compute v.type
  2. Thrown away
  3. No-op
  4. Thrown away

> Therefore, in the one case where we care about all_members
> after seen has been populated, we know that it contains the
> same members as seen.values(); changing seen to be an
> OrderedDict() is sufficient to pick up this information with
> one less parameter being passed around.

I believe the first step should be dropping the obsolete check for
collision of tag value with non-variant members.  I believe this should
do:

@@ -1059,8 +1059,7 @@ class QAPISchemaObjectTypeVariants(object):
 self.tag_member.check(schema, members, seen)
 assert isinstance(self.tag_member.type, QAPISchemaEnumType)
 for v in self.variants:
-vseen = dict(seen)
-v.check(schema, self.tag_member.type, vseen)
+v.check(schema, self.tag_member.type, {})

 class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):

Then only one caller about 2-4., namely QAPISchemaObjectType.check().
Simplify radically: move 2-4. to the caller that cares, drop parameters
all_members and seen.

Still to do then: non-variant member collision checking.  Factor out
3. into a helper function, use it for non-variant members.

What do you think?

[Qemu-devel] [PATCH v11 12/12] Add a new API to start/stop replication, do checkpoint to all BDSes

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
---
 block.c   | 83 +++
 include/block/block.h |  4 +++
 2 files changed, 87 insertions(+)

diff --git a/block.c b/block.c
index 04b928c..517fa4b 100644
--- a/block.c
+++ b/block.c
@@ -4208,3 +4208,86 @@ void bdrv_stop_replication(BlockDriverState *bs, bool 
failover, Error **errp)
" replication", bs->filename);
 }
 }
+
+void bdrv_start_replication_all(ReplicationMode mode, Error **errp)
+{
+BlockDriverState *bs = NULL, *temp = NULL;
+Error *local_err = NULL;
+
+while ((bs = bdrv_next(bs))) {
+if (!QLIST_EMPTY(>parents)) {
+/* It is not top BDS */
+continue;
+}
+
+if (bdrv_is_read_only(bs) || !bdrv_is_inserted(bs)) {
+continue;
+}
+
+bdrv_start_replication(bs, mode, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto fail;
+}
+}
+
+return;
+
+fail:
+while ((temp = bdrv_next(temp)) && bs != temp) {
+bdrv_stop_replication(temp, false, NULL);
+}
+}
+
+void bdrv_do_checkpoint_all(Error **errp)
+{
+BlockDriverState *bs = NULL;
+Error *local_err = NULL;
+
+while ((bs = bdrv_next(bs))) {
+if (!QLIST_EMPTY(>parents)) {
+/* It is not top BDS */
+continue;
+}
+
+if (bdrv_is_read_only(bs) || !bdrv_is_inserted(bs)) {
+continue;
+}
+
+bdrv_do_checkpoint(bs, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
+
+void bdrv_stop_replication_all(bool failover, Error **errp)
+{
+BlockDriverState *bs = NULL;
+Error *local_err = NULL;
+
+while ((bs = bdrv_next(bs))) {
+if (!QLIST_EMPTY(>parents)) {
+/* It is not top BDS */
+continue;
+}
+
+if (bdrv_is_read_only(bs) || !bdrv_is_inserted(bs)) {
+continue;
+}
+
+bdrv_stop_replication(bs, failover, _err);
+if (!errp) {
+/*
+ * The caller doesn't care the result, they just
+ * want to stop all block's replication.
+ */
+continue;
+}
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
diff --git a/include/block/block.h b/include/block/block.h
index 288e14e..8427969 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -643,4 +643,8 @@ void bdrv_start_replication(BlockDriverState *bs, 
ReplicationMode mode,
 void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp);
 void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp);
 
+void bdrv_start_replication_all(ReplicationMode mode, Error **errp);
+void bdrv_do_checkpoint_all(Error **errp);
+void bdrv_stop_replication_all(bool failover, Error **errp);
+
 #endif
-- 
2.4.3

[Qemu-devel] [PATCH v4 0/7] e1000: Various fixes and registers' implementation

2015-11-03 Thread Leonid Bloch

This series fixes issues with packet/octet counting in e1000's Statistic
registers, fixes a bug in the packet address filtering procedure, and
implements many MAC registers that were absent before, some Statistic
counters among them.

Besides this, the series introduces a parameter which, if set to "on"
(default), will cause the entire MAC registers' array to migrate during
live migration (please see patch #2 for details). The rational behind
this is the ability to implement additional MAC registers in the future,
without worrying about migration compatibility between future versions.
For compatibility with previous versions, the above mentioned parameter
can be set to "off".

Additionally, several cosmetic changes are made.

Differences v1-2:

* Wording of several commit messages corrected.
* For trivially implemented Diagnostic registers, a debug message is
  added on read/write attempts, alerting of incomplete implementation.
* Following testing on a physical device, only the lower 16 bits can now
  be read from AIT, and only the lower 4 - from FFMT*.
* The grow_8reg_if_not_full function is rewritten.
* inc_tx_bcast_or_mcast_count and increase_size_stats are now called
  from within e1000_send_packet, to avoid code duplication.

Differences v2-3:

* Minor rewordings of some commit messages (0002, 0003).
* Live migration capability is added to the newly implemented registers.

Differences v3-4:

* Introduction of the "full_mac_registers" parameter (see above).
* Reversion of the live migration handling introduced in v3.
* Small alignment changes in patch #1 to correspond with the following
  patches.

The majority of these changes result from Jason Wang's review - thank
you, Jason!

Leonid Bloch (7):
  e1000: Cosmetic and alignment fixes
  e1000: Add support for migrating the entire MAC registers' array
  e1000: Trivial implementation of various MAC registers
  e1000: Fixing the received/transmitted packets' counters
  e1000: Fixing the received/transmitted octets' counters
  e1000: Fixing the packet address filtering procedure
  e1000: Implementing various counters

 hw/net/e1000.c  | 471 ++--
 hw/net/e1000_regs.h |   8 +-
 2 files changed, 355 insertions(+), 124 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v4 5/7] e1000: Fixing the received/transmitted octets' counters

2015-11-03 Thread Leonid Bloch

Previously, these 64-bit registers did not stick at their maximal
values when (and if) they reached them, as they should do, according to
the specs.

This patch introduces a function that takes care of such registers,
avoiding code duplication, making the relevant parts more compatible
with the QEMU coding style, while ensuring that in the unlikely case
of reaching the maximal value, the counter will stick there, as it
supposed to.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
---
 hw/net/e1000.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 871f1b5..d176a8a 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -585,6 +585,20 @@ inc_reg_if_not_full(E1000State *s, int index)
 }
 }
 
+static void
+grow_8reg_if_not_full(E1000State *s, int index, int size)
+{
+uint64_t sum = s->mac_reg[index] | (uint64_t)s->mac_reg[index+1] << 32;
+
+if (sum + size < sum) {
+sum = ~0ULL;
+} else {
+sum += size;
+}
+s->mac_reg[index] = sum;
+s->mac_reg[index+1] = sum >> 32;
+}
+
 static inline int
 vlan_enabled(E1000State *s)
 {
@@ -634,7 +648,7 @@ static void
 xmit_seg(E1000State *s)
 {
 uint16_t len, *sp;
-unsigned int frames = s->tx.tso_frames, css, sofar, n;
+unsigned int frames = s->tx.tso_frames, css, sofar;
 struct e1000_tx *tp = >tx;
 
 if (tp->tse && tp->cptse) {
@@ -682,10 +696,8 @@ xmit_seg(E1000State *s)
 }
 
 inc_reg_if_not_full(s, TPT);
+grow_8reg_if_not_full(s, TOTL, s->tx.size);
 s->mac_reg[GPTC] = s->mac_reg[TPT];
-n = s->mac_reg[TOTL];
-if ((s->mac_reg[TOTL] += s->tx.size) < n)
-s->mac_reg[TOTH]++;
 }
 
 static void
@@ -1100,11 +1112,9 @@ e1000_receive_iov(NetClientState *nc, const struct iovec 
*iov, int iovcnt)
 /* TOR - Total Octets Received:
  * This register includes bytes received in a packet from the  field through the  field, inclusively.
+ * Always include FCS length (4) in size.
  */
-n = s->mac_reg[TORL] + size + /* Always include FCS length. */ 4;
-if (n < s->mac_reg[TORL])
-s->mac_reg[TORH]++;
-s->mac_reg[TORL] = n;
+grow_8reg_if_not_full(s, TORL, size+4);
 
 n = E1000_ICS_RXT0;
 if ((rdt = s->mac_reg[RDT]) < s->mac_reg[RDH])
-- 
2.4.3

[Qemu-devel] [PATCH v4 3/7] e1000: Trivial implementation of various MAC registers

2015-11-03 Thread Leonid Bloch

These registers appear in Intel's specs, but were not implemented.
These registers are now implemented trivially, i.e. they are initiated
with zero values, and if they are RW, they can be written or read by the
driver, or read only if they are R (essentially retaining their zero
values). For these registers no other procedures are performed.

For the trivially implemented Diagnostic registers, a debug warning is
produced on read/write attempts.

The registers implemented here are:

Transmit:
RW: AIT

Management:
RW: WUC WUS IPAVIP6AT*  IP4AT*  FFLT*   WUPM*   FFMT*   FFVT*

Diagnostic:
RW: RDFHRDFTRDFHS   RDFTS   RDFPC   PBM*TDFHTDFTTDFHS
TDFTS   TDFPC

Statistic:
RW: FCRUC
R:  RNBCTSCTFC  MGTPRC  MGTPDC  MGTPTC  RFC RJC SCC ECOL
LATECOL MCC COLCDC  TNCRS   SEC CEXTERR RLECXONRXC
XONTXC  XOFFRXC XOFFTXC

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
---
 hw/net/e1000.c  | 95 +++--
 hw/net/e1000_regs.h |  6 
 2 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 1190bbe..7db6614 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -170,7 +170,17 @@ enum {
 defreg(TPR), defreg(TPT), defreg(TXDCTL),  defreg(WUFC),
 defreg(RA),  defreg(MTA), defreg(CRCERRS), defreg(VFTA),
 defreg(VET), defreg(RDTR),defreg(RADV),defreg(TADV),
-defreg(ITR),
+defreg(ITR), defreg(FCRUC),   defreg(TDFH),defreg(TDFT),
+defreg(TDFHS),   defreg(TDFTS),   defreg(TDFPC),   defreg(RDFH),
+defreg(RDFT),defreg(RDFHS),   defreg(RDFTS),   defreg(RDFPC),
+defreg(IPAV),defreg(WUC), defreg(WUS), defreg(AIT),
+defreg(IP6AT),   defreg(IP4AT),   defreg(FFLT),defreg(FFMT),
+defreg(FFVT),defreg(WUPM),defreg(PBM), defreg(SCC),
+defreg(ECOL),defreg(MCC), defreg(LATECOL), defreg(COLC),
+defreg(DC),  defreg(TNCRS),   defreg(SEC), defreg(CEXTERR),
+defreg(RLEC),defreg(XONRXC),  defreg(XONTXC),  defreg(XOFFRXC),
+defreg(XOFFTXC), defreg(RFC), defreg(RJC), defreg(RNBC),
+defreg(TSCTFC),  defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC)
 };
 
 static void
@@ -1118,6 +1128,48 @@ mac_readreg(E1000State *s, int index)
 }
 
 static uint32_t
+mac_readreg_prt(E1000State *s, int index)
+{
+DBGOUT(GENERAL, "Reading register at offset: 0x%08x. "
+   "It is not fully implemented.\n", index<<2);
+return s->mac_reg[index];
+}
+
+static uint32_t
+mac_low4_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0xf;
+}
+
+static uint32_t
+mac_low11_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0x7ff;
+}
+
+static uint32_t
+mac_low11_read_prt(E1000State *s, int index)
+{
+DBGOUT(GENERAL, "Reading register at offset: 0x%08x. "
+   "It is not fully implemented.\n", index<<2);
+return s->mac_reg[index] & 0x7ff;
+}
+
+static uint32_t
+mac_low13_read_prt(E1000State *s, int index)
+{
+DBGOUT(GENERAL, "Reading register at offset: 0x%08x. "
+   "It is not fully implemented.\n", index<<2);
+return s->mac_reg[index] & 0x1fff;
+}
+
+static uint32_t
+mac_low16_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0x;
+}
+
+static uint32_t
 mac_icr_read(E1000State *s, int index)
 {
 uint32_t ret = s->mac_reg[ICR];
@@ -1161,6 +1213,14 @@ mac_writereg(E1000State *s, int index, uint32_t val)
 }
 
 static void
+mac_writereg_prt(E1000State *s, int index, uint32_t val)
+{
+DBGOUT(GENERAL, "Writing to register at offset: 0x%08x. "
+   "It is not fully implemented.\n", index<<2);
+s->mac_reg[index] = val;
+}
+
+static void
 set_rdt(E1000State *s, int index, uint32_t val)
 {
 s->mac_reg[index] = val & 0x;
@@ -1219,26 +1279,50 @@ static uint32_t (*macreg_readops[])(E1000State *, int) 
= {
 getreg(RDH),  getreg(RDT),  getreg(VET),  getreg(ICS),
 getreg(TDBAL),getreg(TDBAH),getreg(RDBAH),getreg(RDBAL),
 getreg(TDLEN),getreg(RDLEN),getreg(RDTR), getreg(RADV),
-getreg(TADV), getreg(ITR),
+getreg(TADV), getreg(ITR),  getreg(FCRUC),getreg(IPAV),
+getreg(WUC),  getreg(WUS),  getreg(SCC),  getreg(ECOL),
+getreg(MCC),  getreg(LATECOL),  getreg(COLC), getreg(DC),
+getreg(TNCRS),getreg(SEC),  getreg(CEXTERR),  getreg(RLEC),
+getreg(XONRXC),   getreg(XONTXC),   getreg(XOFFRXC),  getreg(XOFFTXC),
+getreg(RFC),  getreg(RJC),  getreg(RNBC), getreg(TSCTFC),
+getreg(MGTPRC),   getreg(MGTPDC),   getreg(MGTPTC),
 
 [TOTH]= mac_read_clr8,  [TORH]= mac_read_clr8,
 [GPRC]= mac_read_clr4,  [GPTC]= mac_read_clr4,
 [TPT] = mac_read_clr4,  [TPR] = mac_read_clr4,
 [ICR] = mac_icr_read,

[Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM

2015-11-03 Thread zhanghailiang

For PVM, if there is failover request from users.
The colo thread will exit the loop while the failover BH does the
cleanup work and resumes VM.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
---
v10: Call migration_end() in primary_vm_do_failover()
---
 include/migration/colo.h  |  4 +++
 include/migration/failover.h  |  1 +
 include/migration/migration.h |  1 +
 migration/colo-failover.c |  7 -
 migration/colo.c  | 59 +--
 migration/ram.c   |  2 +-
 6 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 34e8127..3e375c1 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,8 @@ int get_colo_mode(void);
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_flush_ram_cache(void);
+
+/* failover */
+void colo_do_failover(MigrationState *s);
+
 #endif
diff --git a/include/migration/failover.h b/include/migration/failover.h
index 882c625..fba3931 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -26,5 +26,6 @@ void failover_init_state(void);
 int failover_set_state(int old_state, int new_state);
 int failover_get_state(void);
 void failover_request_active(Error **errp);
+bool failover_request_is_active(void);
 
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 0c0309d..406874f 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -210,6 +210,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
block_offset,
  uint64_t *bytes_sent);
 
 void ram_mig_init(void);
+void migration_end(void);
 void savevm_skip_section_footers(void);
 void register_global_state(void);
 void global_state_set_optional(void);
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index ae06c16..33c82c1 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -32,7 +32,7 @@ static void colo_failover_bh(void *opaque)
 error_report(" Unkown error for failover, old_state=%d", old_state);
 return;
 }
-/*TODO: Do failover work */
+colo_do_failover(NULL);
 }
 
 void failover_request_active(Error **errp)
@@ -67,6 +67,11 @@ int failover_get_state(void)
 return atomic_read(_state);
 }
 
+bool failover_request_is_active(void)
+{
+return ((failover_get_state() != FAILOVER_STATUS_NONE));
+}
+
 void qmp_x_colo_lost_heartbeat(Error **errp)
 {
 if (get_colo_mode() == COLO_MODE_UNKNOWN) {
diff --git a/migration/colo.c b/migration/colo.c
index 7732f60..95f1405 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -47,6 +47,45 @@ bool migration_incoming_in_colo_state(void)
 return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static bool colo_runstate_is_stopped(void)
+{
+return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
+}
+
+static void primary_vm_do_failover(void)
+{
+MigrationState *s = migrate_get_current();
+int old_state;
+
+if (s->state != MIGRATION_STATUS_FAILED) {
+migrate_set_state(>state, MIGRATION_STATUS_COLO,
+  MIGRATION_STATUS_COMPLETED);
+}
+migration_end();
+
+vm_start();
+
+old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+   FAILOVER_STATUS_COMPLETED);
+if (old_state != FAILOVER_STATUS_HANDLING) {
+error_report("Serious error while do failover for Primary VM,"
+ "old_state: %d", old_state);
+return;
+}
+}
+
+void colo_do_failover(MigrationState *s)
+{
+/* Make sure vm stopped while failover */
+if (!colo_runstate_is_stopped()) {
+vm_stop_force_state(RUN_STATE_COLO);
+}
+
+if (get_colo_mode() == COLO_MODE_PRIMARY) {
+primary_vm_do_failover();
+}
+}
+
 /* colo checkpoint control helper */
 static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
 {
@@ -130,9 +169,22 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 }
 
 qemu_mutex_lock_iothread();
+if (failover_request_is_active()) {
+qemu_mutex_unlock_iothread();
+ret = -1;
+goto out;
+}
 vm_stop_force_state(RUN_STATE_COLO);
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("run", "stop");
+/*
+ * failover request bh could be called after
+ * vm_stop_force_state so we check failover_request_is_active() again.
+ */
+if (failover_request_is_active()) {
+ret = -1;
+goto out;
+}
 
 /* Disable block migration */
 s->params.blk = 0;
@@ -231,6 +283,11 @@ static void colo_process_checkpoint(MigrationState *s)
 trace_colo_vm_state_change("stop", "run");
 
 while (s->state == MIGRATION_STATUS_COLO) {
+if (failover_request_is_active()) {
+error_report("failover request");
+

Re: [Qemu-devel] [PATCH v2 1/4] ui: Use g_new() & friends where that makes obvious sense

2015-11-03 Thread Michael Tokarev

29.10.2015 18:55, Markus Armbruster wrote:
> g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
> for two reasons.  One, it catches multiplication overflowing size_t.
> Two, it returns T * rather than void *, which lets the compiler catch
> more type errors.
> 
> This commit only touches allocations with size arguments of the form
> sizeof(T).  Same Coccinelle semantic patch as in commit b45c03f.
> 
> Signed-off-by: Markus Armbruster 
> Reviewed-by: Eric Blake 
> ---
>  ui/console.c  | 2 +-
>  ui/curses.c   | 2 +-
>  ui/input-legacy.c | 4 ++--
>  ui/keymaps.c  | 2 +-
>  ui/sdl.c  | 2 +-
>  ui/vnc-jobs.c | 6 +++---
>  ui/vnc.c  | 6 +++---
>  7 files changed, 12 insertions(+), 12 deletions(-)

ui/vnc.c code has been modified by Eric Blake meanwhile,
in 2d32addae70987521578d8bb27c6b3f52cdcbdcb "sockets:
Convert to new qapi union layout".

The patch applies for other files however.

Thanks,

/mjt

[Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters

2015-11-03 Thread zhanghailiang

We add a new property 'auto' for netfilter to distinguish if netfilter is
added by user or automatically added.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
 include/net/filter.h |  2 ++
 net/filter-buffer.c  | 17 +
 net/filter.c | 15 +++
 3 files changed, 34 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index b0954ba..46d3ef9 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -55,6 +55,7 @@ struct NetFilterState {
 char *netdev_id;
 NetClientState *netdev;
 NetFilterDirection direction;
+bool auto_add;
 char info_str[256];
 QTAILQ_ENTRY(NetFilterState) next;
 };
@@ -76,5 +77,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
 void filter_buffer_release_all(void);
 void  filter_buffer_del_all_timers(void);
 void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
+void qemu_auto_del_filter_buffer(Error **errp);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 0dc1efb..ea4481c 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -19,6 +19,7 @@
 #include "qapi/qmp-output-visitor.h"
 #include "qapi/qmp-input-visitor.h"
 #include "monitor/monitor.h"
+#include "qmp-commands.h"
 
 
 #define TYPE_FILTER_BUFFER "filter-buffer"
@@ -269,6 +270,22 @@ void qemu_auto_add_filter_buffer(NetFilterDirection 
direction, Error **errp)
 g_free(queue);
 }
 
+static void netdev_del_filter_buffer(NetFilterState *nf, void *opaque,
+ Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER) &&
+nf->auto_add) {
+char *id = object_get_canonical_path_component(OBJECT(nf));
+
+qmp_object_del(id, errp);
+}
+}
+
+void qemu_auto_del_filter_buffer(Error **errp)
+{
+qemu_foreach_netfilter(netdev_del_filter_buffer, NULL, errp);
+}
+
 static void filter_buffer_init(Object *obj)
 {
 object_property_add(obj, "interval", "int",
diff --git a/net/filter.c b/net/filter.c
index 326f2b5..dcbcb80 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -117,6 +117,18 @@ static void netfilter_set_direction(Object *obj, int 
direction, Error **errp)
 nf->direction = direction;
 }
 
+static bool netfilter_get_auto_flag(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+return nf->auto_add;
+}
+
+static void netfilter_set_auto_flag(Object *obj, bool flag, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+nf->auto_add = flag;
+}
+
 static void netfilter_init(Object *obj)
 {
 object_property_add_str(obj, "netdev",
@@ -126,6 +138,9 @@ static void netfilter_init(Object *obj)
  NetFilterDirection_lookup,
  netfilter_get_direction, netfilter_set_direction,
  NULL);
+object_property_add_bool(obj, "auto",
+ netfilter_get_auto_flag, netfilter_set_auto_flag,
+ NULL);
 }
 
 static void netfilter_complete(UserCreatable *uc, Error **errp)
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH] target-alpha: fix uninitialized variable

2015-11-03 Thread Michael Tokarev

22.10.2015 02:15, Richard Henderson wrote:
> On 10/19/2015 04:08 AM, Paolo Bonzini wrote:
>> I am not sure why the compiler does not catch it.  There is no
>> semantic change since gen_excp returns EXIT_NORETURN, but the
>> old code is wrong.
>>
>> Reported by Coverity.
>>
>> Signed-off-by: Paolo Bonzini
>> ---
>>   target-alpha/translate.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Richard Henderson 
> 
> I assume this will go in via trivial.

Richard, after your patch 522a0d4e3c0d397ffb45ec400d8cbd426dad9d17
"target-*: Advance pc after recognizing a breakpoint", this code
needs another review I think, as you modified the subsequent line ;)
Please take a look.

Thanks,

/mjt

Re: [Qemu-devel] [PATCH] configure: remove help string for 'vnc-tls' option

2015-11-03 Thread Michael Tokarev

03.11.2015 14:34, Daniel P. Berrange wrote:
> The '--enable-vnc-tls' option to configure was removed in

Applies to -trivial, thank you!

/mjt

[Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover

2015-11-03 Thread zhanghailiang

If the net connection between COLO's two sides is broken while colo/colo 
incoming
thread is blocked in 'read'/'write' socket fd. It will not detect this error 
until
connect timeout. It will be a long time.

Here we shutdown all the related socket file descriptors to wake up the blocking
operation in failover BH. Besides, we should close the corresponding file 
descriptors
after failvoer BH shutdown them, or there will be an error.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
---
 migration/colo.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 247b40f..240ccda 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -74,6 +74,13 @@ static void secondary_vm_do_failover(void)
 /* recover runstate to normal migration finish state */
 autostart = true;
 }
+/* Make sure colo incoming thread not block in recv */
+if (mis->from_src_file) {
+qemu_file_shutdown(mis->from_src_file);
+}
+if (mis->to_src_file) {
+qemu_file_shutdown(mis->to_src_file);
+}
 
 old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
@@ -99,6 +106,13 @@ static void primary_vm_do_failover(void)
 }
 migration_end();
 
+if (s->from_dst_file) { /* Make sure colo thread no block in recv */
+qemu_file_shutdown(s->from_dst_file);
+}
+if (s->to_dst_file) {
+qemu_file_shutdown(s->to_dst_file);
+}
+
 vm_start();
 
 old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
@@ -342,7 +356,7 @@ static void colo_process_checkpoint(MigrationState *s)
 
 out:
 current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-if (ret < 0) {
+if (ret < 0 || (!ret && !failover_request_is_active())) {
 error_report("%s: %s", __func__, strerror(-ret));
 qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
   true, strerror(-ret), NULL);
@@ -371,6 +385,11 @@ out:
 qsb_free(buffer);
 buffer = NULL;
 
+/* Hope this not to be too long to loop here */
+while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+;
+}
+/* Must be called after failover BH is completed */
 if (s->from_dst_file) {
 qemu_fclose(s->from_dst_file);
 }
@@ -534,7 +553,7 @@ void *colo_process_incoming_thread(void *opaque)
 
 out:
 current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-if (ret < 0) {
+if (ret < 0 || (!ret && !failover_request_is_active())) {
 error_report("colo incoming thread will exit, detect error: %s",
  strerror(-ret));
 qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
@@ -573,6 +592,11 @@ out:
 */
 colo_release_ram_cache();
 
+/* Hope this not to be too long to loop here */
+while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+;
+}
+/* Must be called after failover BH is completed */
 if (mis->to_src_file) {
 qemu_fclose(mis->to_src_file);
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters

2015-11-03 Thread zhanghailiang

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
 include/net/filter.h |  1 +
 net/filter-buffer.c  | 17 +
 2 files changed, 18 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5a09607..4499d60 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
 int iovcnt,
 void *opaque);
 void filter_buffer_release_all(void);
+void  filter_buffer_del_all_timers(void);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index b344901..5f0ea70 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
 qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
 }
 
+static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
+Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (s->interval) {
+timer_del(>release_timer);
+}
+}
+}
+
+void filter_buffer_del_all_timers(void)
+{
+qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
+}
+
 static void filter_buffer_init(Object *obj)
 {
 object_property_add(obj, "interval", "int",
-- 
1.8.3.1

Re: [Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration

2015-11-03 Thread Amit Shah

On (Mon) 02 Nov 2015 [15:36:59], Liang Li wrote:
> The patch 3ea3b7fa9af067982f34b of kvm introduces a lazy collapsing
> of small sptes into large sptes mechanism, which intend to solve the
> performance drop issue if live migration fails or is canceled. The
> rmap will be scanned in the KVM_SET_USER_MEMORY_REGION ioctl context
> when dirty logging is stopped so as to drop the small sptes, scanning
> the rmap and drop the small sptes is a time consuming operation which
> will take dozens of milliseconds, the actual time depends on VM's
> memory size. For a VM with 8GB RAM, it will take about 30ms.
> 
> The current QEMU code stop the dirty logging during the pause and
> copy stage by calling the migration_end() function. Now migration_end()
> is a time consuming operation because it calls
> memroy_global_dirty_log_stop(), which will trigger the scanning of rmap
> and dropping small sptes operation. So call migration_end() before all
> the vmsate data has already been transferred to the destination will
> prolong VM downtime.
> 
> migration_end() should be deferred after all the data has been
> transferred to the destination. blk_mig_cleanup() can be deferred too.

Reviewed-by: Amit Shah 

Thanks for adding to the commit message, that helped.


Amit

Re: [Qemu-devel] [PATCH v2] taget-ppc: Fix read access to IBAT registers higher than IBAT3

2015-11-03 Thread Michael Tokarev

03.11.2015 11:00, Julio Guerra wrote:
> Ping :)

Well, I'm not sure what can I do with this.  I've no idea what is IBAT to start
with, so while technically the patch is a one-liner, I've no idea what it does
and how trivial it is :)

Maybe you can include some context which teaches me what it is all about, and in
that case it becomes really trivial, or.. I dunno :)

Thanks,

/mjt

> Le mer. 14 oct. 2015 19:43, Julio Guerra  > a écrit :
> 
> Fix the index used to read the IBAT's vector which results in IBAT0..3 
> instead
> of IBAT4..N.
> 
> The bug appeared by saving/restoring contexts including IBATs values.
> 
> Signed-off-by: Julio Guerra >
> ---
>  target-ppc/translate_init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index b541473..76d9a02 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -305,7 +305,7 @@ static void spr_read_ibat (DisasContext *ctx, int 
> gprn, int sprn)
> 
>  static void spr_read_ibat_h (DisasContext *ctx, int gprn, int sprn)
>  {
> -tcg_gen_ld_tl(cpu_gpr[gprn], cpu_env, offsetof(CPUPPCState, 
> IBAT[sprn & 1][(sprn - SPR_IBAT4U) / 2]));
> +tcg_gen_ld_tl(cpu_gpr[gprn], cpu_env, offsetof(CPUPPCState, 
> IBAT[sprn & 1][((sprn - SPR_IBAT4U) / 2) + 4]));
>  }
> 
>  static void spr_write_ibatu (DisasContext *ctx, int sprn, int gprn)
> --
> 2.5.2
>

[Qemu-devel] safety of migration_bitmap_extend

2015-11-03 Thread Dr. David Alan Gilbert

Hi,
  I'm trying to understand why migration_bitmap_extend is correct/safe;
If I understand correctly, you're arguing that:

  1) the migration_bitmap_mutex around the extend, stops any sync's happening
 and so no new bits will be set during the extend.

  2) If migration sends a page and clears a bitmap entry, it doesn't
 matter if we lose the 'clear' because we're copying it as
 we extend it, because losing the clear just means the page
 gets resent, and so the data is OK.

However, doesn't (2) mean that migration_dirty_pages might be wrong?
If a page was sent, the bit cleared, and migration_dirty_pages decremented,
then if we copy over that bitmap and 'set' that bit again then 
migration_dirty_pages
is too small; that means that either migration would finish too early,
or more likely, migration_dirty_pages would wrap-around -ve and
never finish.

Is there a reason it's really safe?

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] Safety of killing qemu when it is doing an fstrim

2015-11-03 Thread Richard W.M. Jones


I wrote a tool called virt-sparsify which runs fstrim on disks via
qemu.  My colleague asked me a good question: Is this safe if qemu is
killed (^C)?  Could it corrupt the guest?

Using 'virt-sparsify --inplace disk.img' is essentially equivalent to
doing:

  qemu-kvm \
-kernel  \
-drive file=disk.img,discard=unmap,[virtio-scsi] \
-drive file=appliance

And in the appliance doing:

  foreach fs in filesystems:
  mount -o discard fs /sysroot
  fstrim /sysroot
  umount /sysroot
  sync
  poweroff

I think the answer is "safe", as long as the Linux kernel and qemu are
written carefully, but it would be good to get an expert opinion.

It looks like fstrim just sends discard requests.  And mount/umount
should be safe by the usual rules of journalling.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

Re: [Qemu-devel] [PATCH v7 06/35] acpi: add aml_method_serialized

2015-11-03 Thread Igor Mammedov

On Mon,  2 Nov 2015 17:13:08 +0800
Xiao Guangrong  wrote:

> It avoid explicit Mutex and will be used by NVDIMM ACPI
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/aml-build.c | 26 --
>  include/hw/acpi/aml-build.h |  1 +
>  2 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 9f792ab..8bee8b2 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -696,14 +696,36 @@ Aml *aml_while(Aml *predicate)
>  }
>  
>  /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */
> -Aml *aml_method(const char *name, int arg_count)
> +static Aml *__aml_method(const char *name, int arg_count, bool serialized)
We don't have many users of aml_method() yet, so I'd prefer to have a single
vs multiple function call:

I suggest to do something like:
typedef enum {
AML_NONSERIALIZED = 0,
AML_SERIALIZED = 1,
} AmlSerializeRule;

aml_method(const char *name, AmlSerializeRule rule, int synclevel);

with current users fixed up with AML_NONSERIALIZED argument. 

>  {
>  Aml *var = aml_bundle(0x14 /* MethodOp */, AML_PACKAGE);
> +int methodflags;
> +
> +/*
> + * MethodFlags:
> + *   bit 0-2: ArgCount (0-7)
> + *   bit 3: SerializeFlag
> + * 0: NotSerialized
> + * 1: Serialized
> + *   bit 4-7: reserved (must be 0)
> + */
> +assert(!(arg_count & ~7));
> +methodflags = arg_count | (serialized << 3);
>  build_append_namestring(var->buf, "%s", name);
> -build_append_byte(var->buf, arg_count); /* MethodFlags: ArgCount */
> +build_append_byte(var->buf, methodflags);
>  return var;
>  }
>  
> +Aml *aml_method(const char *name, int arg_count)
> +{
> +return __aml_method(name, arg_count, false);
> +}
> +
> +Aml *aml_method_serialized(const char *name, int arg_count)
> +{
> +return __aml_method(name, arg_count, true);
> +}
> +
>  /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefDevice */
>  Aml *aml_device(const char *name_format, ...)
>  {
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 5b8a118..00cf40e 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -263,6 +263,7 @@ Aml *aml_qword_memory(AmlDecode dec, AmlMinFixed 
> min_fixed,
>  Aml *aml_scope(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
>  Aml *aml_device(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
>  Aml *aml_method(const char *name, int arg_count);
> +Aml *aml_method_serialized(const char *name, int arg_count);
>  Aml *aml_if(Aml *predicate);
>  Aml *aml_else(void);
>  Aml *aml_while(Aml *predicate);

Re: [Qemu-devel] [v2 RESEND 4/4] migration: code clean up

2015-11-03 Thread Juan Quintela

Liang Li  wrote:
> Just clean up code, no behavior change.
>
> Signed-off-by: Liang Li 

Reviewed-by: Juan Quintela 


Applied the whole series

Re: [Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration

2015-11-03 Thread Juan Quintela

Paolo Bonzini  wrote:
> On 02/11/2015 08:36, Liang Li wrote:
>> The patch 3ea3b7fa9af067982f34b of kvm introduces a lazy collapsing
>> of small sptes into large sptes mechanism, which intend to solve the
>> performance drop issue if live migration fails or is canceled. The
>> rmap will be scanned in the KVM_SET_USER_MEMORY_REGION ioctl context
>> when dirty logging is stopped so as to drop the small sptes, scanning
>> the rmap and drop the small sptes is a time consuming operation which
>> will take dozens of milliseconds, the actual time depends on VM's
>> memory size. For a VM with 8GB RAM, it will take about 30ms.
>
> I'm okay with these patches.  Juan, can they be included in 2.5?

Applied, thanks.

>
> However, the KVM patch is a regression too.  Wanpeng, can you look into
> doing the collapsing from a work item?
>
> Thanks,
>
> Paolo

Re: [Qemu-devel] safety of migration_bitmap_extend

2015-11-03 Thread Juan Quintela

"Dr. David Alan Gilbert"  wrote:
> Hi,
>   I'm trying to understand why migration_bitmap_extend is correct/safe;
> If I understand correctly, you're arguing that:
>
>   1) the migration_bitmap_mutex around the extend, stops any sync's happening
>  and so no new bits will be set during the extend.
>
>   2) If migration sends a page and clears a bitmap entry, it doesn't
>  matter if we lose the 'clear' because we're copying it as
>  we extend it, because losing the clear just means the page
>  gets resent, and so the data is OK.
>
> However, doesn't (2) mean that migration_dirty_pages might be wrong?
> If a page was sent, the bit cleared, and migration_dirty_pages decremented,
> then if we copy over that bitmap and 'set' that bit again then 
> migration_dirty_pages
> is too small; that means that either migration would finish too early,
> or more likely, migration_dirty_pages would wrap-around -ve and
> never finish.
>
> Is there a reason it's really safe?

No.  It is reasonably safe.  Various values of reasonably.

migration_dirty_pages should never arrive at values near zero.  Because
we move to the completion stage way before it gets a value near zero.
(We could have very, very bad luck, as in it is not safe).

Now, do we really care if migration_dirty_pages is exact?  Not really,
we just use it to calculate if we should start the throotle or not.
That only test that each 1 second, so if we have written a couple of
pages that we are not accounting for, things should be reasonably safe.

Once told that, I don't know why we didn't catch that problem during
review (yes, I am guilty here).  Not sure how to really fix it,
thought.  I think that the problem is more theoretical than real, but

Thanks, Juan.

>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH] Fixed KVM problems with old DOS programs. Compatibility can be forced by module parameter.

2015-11-03 Thread Gerhard Wiesinger


Signed-off-by: Gerhard Wiesinger 
---
 arch/x86/kvm/svm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2f9ed1f..e0b00fc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -198,6 +198,10 @@ static bool npt_enabled;
 static int npt = true;
 module_param(npt, int, S_IRUGO);
 +/* allow backward compatibility with e.g. old DOS application */
+static int npt_task_switch_emulation = true;
+module_param(npt_task_switch_emulation, int, S_IRUGO);
+
 /* allow nested virtualization in KVM/SVM */
 static int nested = true;
 module_param(nested, int, S_IRUGO);
@@ -1177,6 +1181,9 @@ static void init_vmcb(struct vcpu_svm *svm, bool 
init_event)

if (npt_enabled) {
/* Setup VMCB for Nested Paging */
control->nested_ctl = 1;
+   if (!npt_task_switch_emulation) {
+   clr_intercept(svm, INTERCEPT_TASK_SWITCH);
+   }
clr_intercept(svm, INTERCEPT_INVLPG);
clr_exception_intercept(svm, PF_VECTOR);
clr_cr_intercept(svm, INTERCEPT_CR3_READ);
--
2.4.3

Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters

2015-11-03 Thread zhanghailiang


Hi,

On 2015/11/3 20:41, Yang Hongyang wrote:

Can you explain why this is needed? Seems that this api hasn't
been used in this series.



We will call it in colo_init_filter_buffers() which is introduced in patch 37,
We should remove the timers of filter-buffers which are configured by users.
Or there will be two places to release packets when we enable colo ft, one in 
timer callback,
the other one in COLO when we do checkpoint.


Thanks,
zhanghailiang


On 2015年11月03日 19:56, zhanghailiang wrote:

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  net/filter-buffer.c  | 17 +
  2 files changed, 18 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5a09607..4499d60 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  int iovcnt,
  void *opaque);
  void filter_buffer_release_all(void);
+void  filter_buffer_del_all_timers(void);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index b344901..5f0ea70 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
  qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
  }

+static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
+Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (s->interval) {
+timer_del(>release_timer);
+}
+}
+}
+
+void filter_buffer_del_all_timers(void)
+{
+qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",

[Qemu-devel] [PATCH] log disasm insns when nochain + in_asm enabled

2015-11-03 Thread Sergey Smolov

When 'nochain' and 'in_asm' debug options are enabled, QEMU
does not print records about every executed translation block
 (TB). For loop-containing programs it could be suitable to log
 every executed TB. This patch includes a mapping between TBs and
 disassembled instructions for this task to be implemented.

Sergey Smolov (1):
  log disasm insns when nochain + in_asm enabled

 cpu-exec.c|   20 
 disas.c   |   18 +-
 include/disas/disas.h |   14 ++
 qemu-log.c|2 +-
 4 files changed, 52 insertions(+), 2 deletions(-)

--
1.7.10.4

[Qemu-devel] [PATCH] log disasm insns when nochain + in_asm enabled

2015-11-03 Thread Sergey Smolov

When 'nochain' and 'in_asm' debug options are enabled,
disassembled forms of all executed translation blocks (TB)
 are printed to log. For this task a mapping between
disassembled instructions and executed TBs is created
and used.

Signed-off-by: Sergey Smolov 
---
 cpu-exec.c|   20 
 disas.c   |   18 +-
 include/disas/disas.h |   14 ++
 qemu-log.c|2 +-
 4 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 7eef083..b9385f9 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -345,6 +345,9 @@ int cpu_exec(CPUState *cpu)
 uintptr_t next_tb;
 SyncClocks sc;
 
+hwaddr pc_prev;
+bool pc_prev_valid = false;
+
 if (cpu->halted) {
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 if (cpu->interrupt_request & CPU_INTERRUPT_POLL) {
@@ -474,6 +477,23 @@ int cpu_exec(CPUState *cpu)
 qemu_log("Trace %p [" TARGET_FMT_lx "] %s\n",
  tb->tc_ptr, tb->pc, lookup_symbol(tb->pc));
 }
+if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
+&& qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+struct insninfo *s;
+for (s = insninfos; s != NULL; s = s->next) {
+if (s->insn_addr == tb->pc
+&& pc_prev_valid
+&& s->insn_addr != pc_prev) {
+qemu_log("%s\n", s->insn);
+pc_prev = s->insn_addr;
+if (!pc_prev_valid) {
+pc_prev_valid = true;
+}
+} else if (s->insn_addr == pc_prev) {
+pc_prev_valid = false;
+}
+}
+}
 /* see if we can patch the calling TB. When the TB
spans two pages, we cannot safely do a direct
jump. */
diff --git a/disas.c b/disas.c
index 4e11944..51bf68f 100644
--- a/disas.c
+++ b/disas.c
@@ -16,6 +16,9 @@ typedef struct CPUDebug {
 /* Filled in by elfload.c.  Simplistic, but will do for now. */
 struct syminfo *syminfos = NULL;
 
+/* Filled in here. */
+struct insninfo *insninfos;
+
 /* Get LENGTH bytes from info's buffer, at target address memaddr.
Transfer them to myaddr.  */
 int
@@ -236,7 +239,20 @@ void target_disas(FILE *out, CPUState *cpu, target_ulong 
code,
 }
 
 for (pc = code; size > 0; pc += count, size -= count) {
-   fprintf(out, "0x" TARGET_FMT_lx ":  ", pc);
+
+if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)
+&& qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
+struct insninfo *s =
+(struct insninfo *)malloc(
+sizeof(struct insninfo));
+s->insn_addr = pc;
+
+sprintf(s->insn, " " TARGET_FMT_lx " ", pc);
+s->next = insninfos;
+insninfos = s;
+} else {
+fprintf(out, "0x" TARGET_FMT_lx ":  ", pc);
+}
count = s.info.print_insn(pc, );
 #if 0
 {
diff --git a/include/disas/disas.h b/include/disas/disas.h
index 2b9293b..75a4c73 100644
--- a/include/disas/disas.h
+++ b/include/disas/disas.h
@@ -2,6 +2,7 @@
 #define _QEMU_DISAS_H
 
 #include "qemu-common.h"
+#include "exec/hwaddr.h"
 
 #ifdef NEED_CPU_H
 /* Disassemble this for me please... (debugging). */
@@ -40,4 +41,17 @@ struct syminfo {
 /* Filled in by elfload.c.  Simplistic, but will do for now. */
 extern struct syminfo *syminfos;
 
+struct insninfo {
+
+/* Instruction address. */
+hwaddr insn_addr;
+
+/* Instruction string representation. */
+char insn[256];
+struct insninfo *next;
+};
+
+/* Filled in by disas.c - Information about instructions. */
+extern struct insninfo *insninfos;
+
 #endif /* _QEMU_DISAS_H */
diff --git a/qemu-log.c b/qemu-log.c
index efd07c8..bbc10e3 100644
--- a/qemu-log.c
+++ b/qemu-log.c
@@ -120,7 +120,7 @@ const QEMULogItem qemu_log_items[] = {
   "log when the guest OS does something invalid (eg accessing a\n"
   "non-existent register)" },
 { CPU_LOG_TB_NOCHAIN, "nochain",
-  "do not chain compiled TBs so that \"exec\" and \"cpu\" show\n"
+  "do not chain compiled TBs so that \"exec\", \"in_asm\" and \"cpu\" 
show\n"
   "complete traces" },
 { 0, NULL, NULL },
 };
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters

2015-11-03 Thread Yang Hongyang


Can you explain why this is needed? Seems that this api hasn't
been used in this series.

On 2015年11月03日 19:56, zhanghailiang wrote:

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  net/filter-buffer.c  | 17 +
  2 files changed, 18 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5a09607..4499d60 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  int iovcnt,
  void *opaque);
  void filter_buffer_release_all(void);
+void  filter_buffer_del_all_timers(void);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index b344901..5f0ea70 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
  qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
  }

+static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
+Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (s->interval) {
+timer_del(>release_timer);
+}
+}
+}
+
+void filter_buffer_del_all_timers(void)
+{
+qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",



--
Thanks,
Yang

Re: [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets

2015-11-03 Thread Yang Hongyang


On 2015年11月03日 19:56, zhanghailiang wrote:

For COLO or MC FT, We need a function to release all the buffered packets
actively.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  include/net/net.h|  4 
  net/filter-buffer.c  | 15 +++
  net/net.c| 24 
  4 files changed, 44 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 2deda36..5a09607 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -73,5 +73,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  const struct iovec *iov,
  int iovcnt,
  void *opaque);
+void filter_buffer_release_all(void);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 7af3e15..5c65c45 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor 
*mon, int vlan_id,
const char *client_str);
  typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
  void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
+typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
+   Error **errp);
+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp);
  int qemu_can_send_packet(NetClientState *nc);
  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 57be149..b344901 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -14,6 +14,7 @@
  #include "qapi/qmp/qerror.h"
  #include "qapi-visit.h"
  #include "qom/object.h"
+#include "net/net.h"

  #define TYPE_FILTER_BUFFER "filter-buffer"

@@ -163,6 +164,20 @@ out:
  error_propagate(errp, local_err);
  }

+static void filter_buffer_release_packets(NetFilterState *nf, void *opaque,
+  Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+filter_buffer_flush(nf);
+}
+}
+
+/* public APIs */
+void filter_buffer_release_all(void)
+{
+qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a3e9d1a..a333b01 100644
--- a/net/net.c
+++ b/net/net.c
@@ -259,6 +259,30 @@ static char *assign_name(NetClientState *nc1, const char 
*model)
  return g_strdup_printf("%s.%d", model, id);
  }

+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp)
+{
+NetClientState *nc;
+NetFilterState *nf;
+
+QTAILQ_FOREACH(nc, _clients, next) {


Going through every filters this way might cause problem under
multiqueue case. IIRC, Jason suggested that we implement multiqueue
by this way: attach the same filter to all queues instead of
attach the clone of the filter obj to other queues. So if we
attach the same filter to all queues, going through filters
this way will cause the func been called multiple(=num of queues) times.


+if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+continue;
+}
+QTAILQ_FOREACH(nf, >filters, next) {
+if (func) {
+Error *local_err = NULL;
+
+func(nf, opaque, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
+}
+}
+
  static void qemu_net_client_destructor(NetClientState *nc)
  {
  g_free(nc);



--
Thanks,
Yang

Re: [Qemu-devel] RFC: libyajl for JSON

2015-11-03 Thread Luiz Capitulino

On Tue, 03 Nov 2015 08:17:58 +0100
Markus Armbruster  wrote:

> > So at this point, I want to see if lloyd makes any progress towards an
> > actual yajl release and/or adding a co-maintainer, before even trying to
> > get formal upstream support for single quoting.  We could always create
> > a git submodule with our own choice of fork (since there are already
> > forks that do single-quote parsing) - but the mantra of 'upstream first'
> > has a lot of merit (I'm reluctant to fork without good reason).
> 
> The value proposition of replacing our flawed JSON parser isn't in
> saving big on maintenance, it's in not having to find and fix its flaws.
> 
> If the replacement needs a lot of work to fit our needs, the value
> proposition becomes negative.
> 
> A JSON parser shouldn't require much maintenance, as JSON is simple,
> doesn't change, and parsing has few system dependencies.

Let me suggest this crazy idea: have you guys considered breaking
compatibility?

Re: [Qemu-devel] [PATCH v8.5 1/4] qapi: Drop all_members parameter from check()

2015-11-03 Thread Eric Blake

On 11/03/2015 04:06 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> The implementation of QAPISchemaObjectTypeMember.check() always
>> adds the member currently being checked to both the all_members
>> and seen parameters.
> 
> QAPISchemaObjectTypeMember.check() does four things:
> 
> 1. Compute self.type
> 
>Precondition: all types are defined.

Correct, unchanged by this patch.

> 
> 2. Accumulate members
> 
>all_members serves as accumulator.
> 
>We'll see that its only actual use is the owning object type's
>check(), which uses it to compute self.members.

This patch changes it to use seen.values(), which (once you use an
OrderedDict() instead of plain {}) is identical to all_members.

> 
> 3. Check for collisions
> 
>This works by accumulating names in seen.  Precondition: seen
>contains the names seen so far.
> 
>Note that this part uses seen like a set.  See 4.

Unchanged by this patch; but see also 2/4 and 3/4.

> 
> 4. Accumulate a map from names to members
> 
>seen serves as accumulator.
> 

Unchanged by this patch.

>We'll see that its only actual user is the owning object type's
>variants.check(), which uses it to compute variants.tag_member from
>variants.tag_name.
> 
>>  However, the three callers of this method
>> pass in the following parameters:
>>
>> QAPISchemaObjectType.check():
>>   - all_members contains all non-variant members seen to date,
>>   for use in populating self.members
>>   - seen contains all non-variant members seen to date, for
>>   use in checking for collisions
> 
> Yes, and:
> 
> - we're calling it for m in self.local_members
> - before the loop, all_members and seen are initialized to the inherited
>   non-variant members
> - after the loop, they therefore contain all non-variant members
> 
> This caller uses all four things done by QAPISchemaObjectType.check():
> 
> 1. Compute m.type

Unchanged by this patch.

> 2. Accumulate non-variant members

Whether the accumulation is done via all_members (pre-patch) or by
seen.values() (post-patch), this step is still done.

> 3. Check for collisions among non-variant members
>Before the loop, seen contains the inherited members, which don't
>collide (self.base.check() ensures that).  The loop adds the local
>members one by one, checking for collisions.

Unchanged by this patch.

> 4. Accumulate a map from names to non-variant members
>Similar argument to 3.

Unchanged by this patch.

> 
>> QAPISchemaObjectTypeVariant.check():
> 
> Do you mean QAPISchemaObjectVariants.check()?

QAPISchemaObjectTypeVariants.check() calls
QAPISchemaObjectTypeVariant.check() for each variant, but with a fresh
copy of seen.  We'll later need to expand this copy of seen (patch 2/4),
but for this patch its use is unchanged - we are appending a single
value (the tag value) which is wrong, but no one cares that we appended
it because it was a copy. Patch 3/4 fixes to not append to it.

> 
>>   - all_members is a throwaway empty list
>>   - seen is a throwaway dictionary created as a copy by
>>   QAPISchemaObjectVariants.check() (since the members of
>>   one variant cannot collide with those from another), for
>>   use in checking for collisions (technically, we no longer
>>   need to check for collisions between tag values and QMP
>>   key names, but that's a cleanup for another patch)
>>
>> QAPISchemaAlternateType.check():
>>   - all_members is a throwaway empty list
>>   - seen is a throwaway empty dict
> 
> I'm afraid you're omitting a few steps here, and I think you missed
> QAPISchemaObjectVariants.check()'s self.tag_member.check().

There is no self.tag_member.check(), any more; rather, my earlier
patches have reworked things so that tag_member is checked by the owner
(a flat union's base type, a simple union's local_members, or directly
in QAPISchemaAlternateType prior to calling Variants.check()).  I guess
that's a pitfall of seeing this patch without my rework of 5/17 to
address your comments there.


>> Therefore, in the one case where we care about all_members
>> after seen has been populated, we know that it contains the
>> same members as seen.values(); changing seen to be an
>> OrderedDict() is sufficient to pick up this information with
>> one less parameter being passed around.
> 
> I believe the first step should be dropping the obsolete check for
> collision of tag value with non-variant members.  I believe this should
> do:
> 
> @@ -1059,8 +1059,7 @@ class QAPISchemaObjectTypeVariants(object):
>  self.tag_member.check(schema, members, seen)
>  assert isinstance(self.tag_member.type, QAPISchemaEnumType)
>  for v in self.variants:
> -vseen = dict(seen)
> -v.check(schema, self.tag_member.type, vseen)
> +v.check(schema, self.tag_member.type, {})

Close, but not quite.  It should do:

+  cases = {}
   for v in self.variants:
   vseen =

[Qemu-devel] [PATCH v11 05/12] Allow creating backup jobs when opening BDS

2015-11-03 Thread Wen Congyang

When opening BDS, we need to create backup jobs for
image-fleecing.

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
---
 block/Makefile.objs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 58ef2ef..fa05f37 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -22,10 +22,10 @@ block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o
 block-obj-y += write-threshold.o
+block-obj-y += backup.o
 
 common-obj-y += stream.o
 common-obj-y += commit.o
-common-obj-y += backup.o
 
 iscsi.o-cflags := $(LIBISCSI_CFLAGS)
 iscsi.o-libs   := $(LIBISCSI_LIBS)
-- 
2.4.3

[Qemu-devel] [PATCH v11 11/12] support replication driver in blockdev-add

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Reviewed-by: Eric Blake 
---
 qapi/block-core.json | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0539dfa..acc85ba 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -219,7 +219,7 @@
 #   'qcow2', 'raw', 'tftp', 'vdi', 'vmdk', 'vpc', 'vvfat'
 #   2.2: 'archipelago' added, 'cow' dropped
 #   2.3: 'host_floppy' deprecated
-#   2.5: 'host_floppy' dropped
+#   2.5: 'host_floppy' dropped, 'replication' added
 #
 # @backing_file: #optional the name of the backing file (for copy-on-write)
 #
@@ -1375,6 +1375,7 @@
 # Drivers that are supported in block device operations.
 #
 # @host_device, @host_cdrom: Since 2.1
+# @replication: Since 2.5
 #
 # Since: 2.0
 ##
@@ -1382,8 +1383,8 @@
   'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
 'dmg', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
 'http', 'https', 'null-aio', 'null-co', 'parallels',
-'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'tftp', 'vdi', 'vhdx',
-'vmdk', 'vpc', 'vvfat' ] }
+'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'replication',
+'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
 
 ##
 # @BlockdevOptionsBase
@@ -1810,6 +1811,19 @@
 { 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] }
 
 ##
+# @BlockdevOptionsReplication
+#
+# Driver specific block device options for replication
+#
+# @mode: the replication mode
+#
+# Since: 2.5
+##
+{ 'struct': 'BlockdevOptionsReplication',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { 'mode': 'ReplicationMode'  } }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.
@@ -1846,6 +1860,7 @@
   'quorum': 'BlockdevOptionsQuorum',
   'raw':'BlockdevOptionsGenericFormat',
 # TODO rbd: Wait for structured options
+  'replication':'BlockdevOptionsReplication',
 # TODO sheepdog: Wait for structured options
 # TODO ssh: Should take InetSocketAddress for 'host'?
   'tftp':   'BlockdevOptionsFile',
-- 
2.4.3

[Qemu-devel] [PATCH v11 04/12] Backup: clear all bitmap when doing block checkpoint

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Reviewed-by: Jeff Cody 
---
 block/backup.c   | 14 ++
 blockjob.c   | 11 +++
 include/block/blockjob.h | 12 
 3 files changed, 37 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index ec01db8..4232962 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -221,11 +221,25 @@ static void backup_iostatus_reset(BlockJob *job)
 }
 }
 
+static void backup_do_checkpoint(BlockJob *job, Error **errp)
+{
+BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
+
+if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
+error_setg(errp, "The backup job only supports block checkpoint in"
+   " sync=none mode");
+return;
+}
+
+hbitmap_reset_all(backup_job->bitmap);
+}
+
 static const BlockJobDriver backup_job_driver = {
 .instance_size  = sizeof(BackupBlockJob),
 .job_type   = BLOCK_JOB_TYPE_BACKUP,
 .set_speed  = backup_set_speed,
 .iostatus_reset = backup_iostatus_reset,
+.do_checkpoint  = backup_do_checkpoint,
 };
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
diff --git a/blockjob.c b/blockjob.c
index c02fe59..0bd2656 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -406,3 +406,14 @@ void block_job_defer_to_main_loop(BlockJob *job,
 
 qemu_bh_schedule(data->bh);
 }
+
+void block_job_do_checkpoint(BlockJob *job, Error **errp)
+{
+if (!job->driver->do_checkpoint) {
+error_setg(errp, "The job %s doesn't support block checkpoint",
+   BlockJobType_lookup[job->driver->job_type]);
+return;
+}
+
+job->driver->do_checkpoint(job, errp);
+}
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 289b13f..ae9e01c 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -50,6 +50,9 @@ typedef struct BlockJobDriver {
  * manually.
  */
 void (*complete)(BlockJob *job, Error **errp);
+
+/** Optional callback for job types that support checkpoint. */
+void (*do_checkpoint)(BlockJob *job, Error **errp);
 } BlockJobDriver;
 
 /**
@@ -364,4 +367,13 @@ void block_job_defer_to_main_loop(BlockJob *job,
   BlockJobDeferToMainLoopFn *fn,
   void *opaque);
 
+/**
+ * block_job_do_checkpoint:
+ * @job: The job.
+ * @errp: Error object.
+ *
+ * Do block checkpoint on the specified job.
+ */
+void block_job_do_checkpoint(BlockJob *job, Error **errp);
+
 #endif
-- 
2.4.3

[Qemu-devel] [PATCH v11 06/12] block: make bdrv_put_ref_bh_schedule() as a public API

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
---
 block.c   | 25 +
 blockdev.c| 37 ++---
 include/block/block.h |  1 +
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/block.c b/block.c
index 32ed776..9a1c20e 100644
--- a/block.c
+++ b/block.c
@@ -3508,6 +3508,31 @@ void bdrv_unref(BlockDriverState *bs)
 }
 }
 
+typedef struct {
+QEMUBH *bh;
+BlockDriverState *bs;
+} BDRVPutRefBH;
+
+static void bdrv_put_ref_bh(void *opaque)
+{
+BDRVPutRefBH *s = opaque;
+
+bdrv_unref(s->bs);
+qemu_bh_delete(s->bh);
+g_free(s);
+}
+
+/* Release a BDS reference in a BH */
+void bdrv_put_ref_bh_schedule(BlockDriverState *bs)
+{
+BDRVPutRefBH *s;
+
+s = g_new(BDRVPutRefBH, 1);
+s->bh = qemu_bh_new(bdrv_put_ref_bh, s);
+s->bs = bs;
+qemu_bh_schedule(s->bh);
+}
+
 struct BdrvOpBlocker {
 Error *reason;
 QLIST_ENTRY(BdrvOpBlocker) list;
diff --git a/blockdev.c b/blockdev.c
index bd13669..9d0b3ea 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -278,37 +278,6 @@ static void bdrv_format_print(void *opaque, const char 
*name)
 error_printf(" %s", name);
 }
 
-typedef struct {
-QEMUBH *bh;
-BlockDriverState *bs;
-} BDRVPutRefBH;
-
-static void bdrv_put_ref_bh(void *opaque)
-{
-BDRVPutRefBH *s = opaque;
-
-bdrv_unref(s->bs);
-qemu_bh_delete(s->bh);
-g_free(s);
-}
-
-/*
- * Release a BDS reference in a BH
- *
- * It is not safe to use bdrv_unref() from a callback function when the callers
- * still need the BlockDriverState.  In such cases we schedule a BH to release
- * the reference.
- */
-static void bdrv_put_ref_bh_schedule(BlockDriverState *bs)
-{
-BDRVPutRefBH *s;
-
-s = g_new(BDRVPutRefBH, 1);
-s->bh = qemu_bh_new(bdrv_put_ref_bh, s);
-s->bs = bs;
-qemu_bh_schedule(s->bh);
-}
-
 static int parse_block_error_action(const char *buf, bool is_read, Error 
**errp)
 {
 if (!strcmp(buf, "ignore")) {
@@ -2557,6 +2526,12 @@ static void block_job_cb(void *opaque, int ret)
 block_job_event_completed(bs->job, msg);
 }
 
+
+/*
+ * It is not safe to use bdrv_unref() from a callback function when the
+ * callers still need the BlockDriverState. In such cases we schedule
+ * a BH to release the reference.
+ */
 bdrv_put_ref_bh_schedule(bs);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 601a5de..cccda1d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -507,6 +507,7 @@ void bdrv_unref_child(BlockDriverState *parent, BdrvChild 
*child);
 BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
  BlockDriverState *child_bs,
  const BdrvChildRole *child_role);
+void bdrv_put_ref_bh_schedule(BlockDriverState *bs);
 
 bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
 void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
-- 
2.4.3

[Qemu-devel] [PATCH v11 01/12] unblock backup operations in backing file

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
---
 block.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/block.c b/block.c
index 9a2ab68..1d6c115 100644
--- a/block.c
+++ b/block.c
@@ -1161,6 +1161,24 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 /* Otherwise we won't be able to commit due to check in bdrv_commit */
 bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
 bs->backing_blocker);
+/*
+ * We do backup in 3 ways:
+ * 1. drive backup
+ *The target bs is new opened, and the source is top BDS
+ * 2. blockdev backup
+ *Both the source and the target are top BDSes.
+ * 3. internal backup(used for block replication)
+ *Both the source and the target are backing file
+ *
+ * In case 1, and 2, the backing file is neither the source nor
+ * the target.
+ * In case 3, we will block the top BDS, so there is only one block
+ * job for the top BDS and its backing chain.
+ */
+bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
+bs->backing_blocker);
+bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_TARGET,
+bs->backing_blocker);
 out:
 bdrv_refresh_limits(bs, NULL);
 }
-- 
2.4.3

[Qemu-devel] [PATCH v4 4/7] e1000: Fixing the received/transmitted packets' counters

2015-11-03 Thread Leonid Bloch

According to Intel's specs, these counters (as the other Statistic
registers) stick at 0x when this maximal value is reached.
Previously, they would reset after the max. value.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
---
 hw/net/e1000.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 7db6614..871f1b5 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -577,6 +577,14 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t 
css, uint32_t cse)
 }
 }
 
+static inline void
+inc_reg_if_not_full(E1000State *s, int index)
+{
+if (s->mac_reg[index] != 0x) {
+s->mac_reg[index]++;
+}
+}
+
 static inline int
 vlan_enabled(E1000State *s)
 {
@@ -673,8 +681,8 @@ xmit_seg(E1000State *s)
 e1000_send_packet(s, tp->data, tp->size);
 }
 
-s->mac_reg[TPT]++;
-s->mac_reg[GPTC]++;
+inc_reg_if_not_full(s, TPT);
+s->mac_reg[GPTC] = s->mac_reg[TPT];
 n = s->mac_reg[TOTL];
 if ((s->mac_reg[TOTL] += s->tx.size) < n)
 s->mac_reg[TOTH]++;
@@ -1087,8 +1095,8 @@ e1000_receive_iov(NetClientState *nc, const struct iovec 
*iov, int iovcnt)
 }
 } while (desc_offset < total_size);
 
-s->mac_reg[GPRC]++;
-s->mac_reg[TPR]++;
+inc_reg_if_not_full(s, TPR);
+s->mac_reg[GPRC] = s->mac_reg[TPR];
 /* TOR - Total Octets Received:
  * This register includes bytes received in a packet from the  field through the  field, inclusively.
-- 
2.4.3

[Qemu-devel] [PATCH v4 1/7] e1000: Cosmetic and alignment fixes

2015-11-03 Thread Leonid Bloch

This fixes some alignment and cosmetic issues. The changes are made
in order that the following patches in this series will look like
integral parts of the code surrounding them, while conforming to the
coding style. Although some changes in unrelated areas are also made.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
---
 hw/net/e1000.c  | 149 +++-
 hw/net/e1000_regs.h |   2 +-
 2 files changed, 79 insertions(+), 72 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 09c9e9d..7036842 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -41,20 +41,20 @@
 
 #ifdef E1000_DEBUG
 enum {
-DEBUG_GENERAL, DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
-DEBUG_RX,  DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
-DEBUG_UNKNOWN, DEBUG_TXSUM,DEBUG_TXERR,DEBUG_RXERR,
+DEBUG_GENERAL,  DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
+DEBUG_RX,   DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
+DEBUG_UNKNOWN,  DEBUG_TXSUM,DEBUG_TXERR,DEBUG_RXERR,
 DEBUG_RXFILTER, DEBUG_PHY,  DEBUG_NOTYET,
 };
-#define DBGBIT(x)  (1<>2)
+#define defreg(x)x = (E1000_##x>>2)
 enum {
-defreg(CTRL),  defreg(EECD),   defreg(EERD),   defreg(GPRC),
-defreg(GPTC),  defreg(ICR),defreg(ICS),defreg(IMC),
-defreg(IMS),   defreg(LEDCTL), defreg(MANC),   defreg(MDIC),
-defreg(MPC),   defreg(PBA),defreg(RCTL),   defreg(RDBAH),
-defreg(RDBAL), defreg(RDH),defreg(RDLEN),  defreg(RDT),
-defreg(STATUS),defreg(SWSM),   defreg(TCTL),   defreg(TDBAH),
-defreg(TDBAL), defreg(TDH),defreg(TDLEN),  defreg(TDT),
-defreg(TORH),  defreg(TORL),   defreg(TOTH),   defreg(TOTL),
-defreg(TPR),   defreg(TPT),defreg(TXDCTL), defreg(WUFC),
-defreg(RA),defreg(MTA),defreg(CRCERRS),defreg(VFTA),
-defreg(VET),defreg(RDTR),   defreg(RADV),   defreg(TADV),
+defreg(CTRL),defreg(EECD),defreg(EERD),defreg(GPRC),
+defreg(GPTC),defreg(ICR), defreg(ICS), defreg(IMC),
+defreg(IMS), defreg(LEDCTL),  defreg(MANC),defreg(MDIC),
+defreg(MPC), defreg(PBA), defreg(RCTL),defreg(RDBAH),
+defreg(RDBAL),   defreg(RDH), defreg(RDLEN),   defreg(RDT),
+defreg(STATUS),  defreg(SWSM),defreg(TCTL),defreg(TDBAH),
+defreg(TDBAL),   defreg(TDH), defreg(TDLEN),   defreg(TDT),
+defreg(TORH),defreg(TORL),defreg(TOTH),defreg(TOTL),
+defreg(TPR), defreg(TPT), defreg(TXDCTL),  defreg(WUFC),
+defreg(RA),  defreg(MTA), defreg(CRCERRS), defreg(VFTA),
+defreg(VET), defreg(RDTR),defreg(RADV),defreg(TADV),
 defreg(ITR),
 };
 
@@ -226,12 +226,12 @@ enum { NPHYWRITEOPS = ARRAY_SIZE(phyreg_writeops) };
 
 enum { PHY_R = 1, PHY_W = 2, PHY_RW = PHY_R | PHY_W };
 static const char phy_regcap[0x20] = {
-[PHY_STATUS] = PHY_R,  [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_ID1] = PHY_R, [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_CTRL] = PHY_RW,   [PHY_1000T_CTRL] = PHY_RW,
-[PHY_LP_ABILITY] = PHY_R,  [PHY_1000T_STATUS] = PHY_R,
-[PHY_AUTONEG_ADV] = PHY_RW,[M88E1000_RX_ERR_CNTR] = PHY_R,
-[PHY_ID2] = PHY_R, [M88E1000_PHY_SPEC_STATUS] = PHY_R,
+[PHY_STATUS] = PHY_R,   [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
+[PHY_ID1] = PHY_R,  [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
+[PHY_CTRL] = PHY_RW,[PHY_1000T_CTRL] = PHY_RW,
+[PHY_LP_ABILITY] = PHY_R,   [PHY_1000T_STATUS] = PHY_R,
+[PHY_AUTONEG_ADV] = PHY_RW, [M88E1000_RX_ERR_CNTR] = PHY_R,
+[PHY_ID2] = PHY_R,  [M88E1000_PHY_SPEC_STATUS] = PHY_R,
 [PHY_AUTONEG_EXP] = PHY_R,
 };
 
@@ -510,17 +510,17 @@ set_eecd(E1000State *s, int index, uint32_t val)
 
 s->eecd_state.old_eecd = val & (E1000_EECD_SK | E1000_EECD_CS |

Re: [Qemu-devel] [PULL 0/6] ui patch queue

2015-11-03 Thread Peter Maydell

On 3 November 2015 at 10:01, Gerd Hoffmann <kra...@redhat.com> wrote:
>   Hi,
>
> Trying something new -- a unified patch queue for ui bugfix patches,
> featuring fixes for vnc, opengl and curses.  That way I don't have
> to do three tiny pull requests.
>
> Anything larger (like the vnc buffering series in the works atm) will
> continue to come as separate per-ui pull request.
>
> please pull,
>   Gerd
>
> The following changes since commit 3d861a01093f8eedfac9889746ccafcfd32039b7:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2015-11-02' 
> into staging (2015-11-02 11:11:39 +)
>
> are available in the git repository at:
>
>
>   git://git.kraxel.org/qemu tags/pull-ui-20151103-1
>
> for you to fetch changes up to 4d77b1f23877b579b94421d0cab2bebc79f4e171:
>
>   vnc: fix bug: vnc server can't start when 'to' is specified (2015-11-03 
> 10:21:49 +0100)
>
> 
> ui: fixes for vnc, opengl and curses.
>

Applied, thanks.

-- PMM

[Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process

2015-11-03 Thread zhanghailiang

Make sure master start block replication after slave's block replication 
started.

Signed-off-by: zhanghailiang 
Signed-off-by: Wen Congyang 
Signed-off-by: Li Zhijian 
---
 migration/colo.c  | 62 ++-
 migration/migration.c | 10 -
 trace-events  |  2 ++
 3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 25335db..cb9c6db 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -23,6 +23,7 @@
 #include "qapi-types.h"
 #include "net/filter.h"
 #include "net/net.h"
+#include "block/block_int.h"
 
 /*
  * The delay time before qemu begin the procedure of default failover 
treatment.
@@ -83,6 +84,7 @@ static void secondary_vm_do_failover(void)
 {
 int old_state;
 MigrationIncomingState *mis = migration_incoming_get_current();
+Error *local_err = NULL;
 
 /* Can not do failover during the process of VM's loading VMstate, Or
   * it will break the secondary VM.
@@ -100,6 +102,12 @@ static void secondary_vm_do_failover(void)
 migrate_set_state(>state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
 
+bdrv_stop_replication_all(true, _err);
+if (local_err) {
+error_report_err(local_err);
+}
+trace_colo_stop_block_replication("failover");
+
 if (!autostart) {
 error_report("\"-S\" qemu option will be ignored in secondary side");
 /* recover runstate to normal migration finish state */
@@ -130,6 +138,7 @@ static void primary_vm_do_failover(void)
 {
 MigrationState *s = migrate_get_current();
 int old_state;
+Error *local_err = NULL;
 
 if (s->state != MIGRATION_STATUS_FAILED) {
 migrate_set_state(>state, MIGRATION_STATUS_COLO,
@@ -145,6 +154,12 @@ static void primary_vm_do_failover(void)
 }
 colo_cleanup_filter_buffers();
 
+bdrv_stop_replication_all(true, _err);
+if (local_err) {
+error_report_err(local_err);
+}
+trace_colo_stop_block_replication("failover");
+
 vm_start();
 
 old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
@@ -234,6 +249,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 int colo_shutdown, ret;
 size_t size;
 QEMUFile *trans = NULL;
+Error *local_err = NULL;
 
 ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
 if (ret < 0) {
@@ -271,6 +287,16 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 goto out;
 }
 
+/* we call this api although this may do nothing on primary side */
+qemu_mutex_lock_iothread();
+bdrv_do_checkpoint_all(_err);
+qemu_mutex_unlock_iothread();
+if (local_err) {
+error_report_err(local_err);
+ret = -1;
+goto out;
+}
+
 ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
 if (ret < 0) {
 goto out;
@@ -315,6 +341,10 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 filter_buffer_release_all();
 
 if (colo_shutdown) {
+qemu_mutex_lock_iothread();
+bdrv_stop_replication_all(false, NULL);
+trace_colo_stop_block_replication("shutdown");
+qemu_mutex_unlock_iothread();
 colo_ctl_put(s->to_dst_file, COLO_COMMAND_GUEST_SHUTDOWN, 0);
 qemu_fflush(s->to_dst_file);
 colo_shutdown_requested = 0;
@@ -359,6 +389,7 @@ static void colo_process_checkpoint(MigrationState *s)
 int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 int64_t error_time;
 int fd, ret = 0;
+Error *local_err = NULL;
 
 failover_init_state();
 
@@ -403,6 +434,15 @@ static void colo_process_checkpoint(MigrationState *s)
 }
 
 qemu_mutex_lock_iothread();
+/* start block replication */
+bdrv_start_replication_all(REPLICATION_MODE_PRIMARY, _err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+error_report_err(local_err);
+ret = -EINVAL;
+goto out;
+}
+trace_colo_start_block_replication();
 vm_start();
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("stop", "run");
@@ -514,6 +554,8 @@ static int colo_wait_handle_cmd(QEMUFile *f, int 
*checkpoint_request)
 case COLO_COMMAND_GUEST_SHUTDOWN:
 qemu_mutex_lock_iothread();
 vm_stop_force_state(RUN_STATE_COLO);
+bdrv_stop_replication_all(false, NULL);
+trace_colo_stop_block_replication("shutdown");
 qemu_system_shutdown_request_core();
 qemu_mutex_unlock_iothread();
 /* the main thread will exit and termiante the whole
@@ -545,6 +587,7 @@ void *colo_process_incoming_thread(void *opaque)
 int  total_size;
 int64_t error_time, current_time;
 int fd, ret = 0;
+Error *local_err = NULL;
 
 migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,

[Qemu-devel] [PATCH COLO-Frame v10 29/38] savevm: Split load vm state function qemu_loadvm_state

2015-11-03 Thread zhanghailiang

qemu_loadvm_state is too long, and we can simplify it by splitting up
with three helper functions.

Signed-off-by: zhanghailiang 
---
 migration/savevm.c | 165 +++--
 1 file changed, 96 insertions(+), 69 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 0faf12b..1296cc3 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1053,6 +1053,100 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
 }
 }
 
+static int
+qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
+{
+uint32_t instance_id, version_id, section_id;
+SaveStateEntry *se;
+LoadStateEntry *le;
+char idstr[256];
+int ret;
+
+/* Read section start */
+section_id = qemu_get_be32(f);
+if (!qemu_get_counted_string(f, idstr)) {
+error_report("Unable to read ID string for section %u",
+ section_id);
+return -EINVAL;
+}
+instance_id = qemu_get_be32(f);
+version_id = qemu_get_be32(f);
+
+trace_qemu_loadvm_state_section_startfull(section_id, idstr,
+instance_id, version_id);
+/* Find savevm section */
+se = find_se(idstr, instance_id);
+if (se == NULL) {
+error_report("Unknown savevm section or instance '%s' %d",
+ idstr, instance_id);
+ret = -EINVAL;
+return ret;
+}
+
+/* Validate version */
+if (version_id > se->version_id) {
+error_report("savevm: unsupported version %d for '%s' v%d",
+ version_id, idstr, se->version_id);
+ret = -EINVAL;
+return ret;
+}
+
+/* Add entry */
+le = g_malloc0(sizeof(*le));
+
+le->se = se;
+le->section_id = section_id;
+le->version_id = version_id;
+QLIST_INSERT_HEAD(>loadvm_handlers, le, entry);
+
+ret = vmstate_load(f, le->se, le->version_id);
+if (ret < 0) {
+error_report("error while loading state for instance 0x%x of"
+ " device '%s'", instance_id, idstr);
+return ret;
+}
+if (!check_section_footer(f, le)) {
+ret = -EINVAL;
+return ret;
+}
+
+return 0;
+}
+
+static int
+qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
+{
+uint32_t section_id;
+LoadStateEntry *le;
+int ret;
+
+section_id = qemu_get_be32(f);
+
+trace_qemu_loadvm_state_section_partend(section_id);
+QLIST_FOREACH(le, >loadvm_handlers, entry) {
+if (le->section_id == section_id) {
+break;
+}
+}
+if (le == NULL) {
+error_report("Unknown savevm section %d", section_id);
+ret = -EINVAL;
+return ret;
+}
+
+ret = vmstate_load(f, le->se, le->version_id);
+if (ret < 0) {
+error_report("error while loading state section id %d(%s)",
+ section_id, le->se->idstr);
+return ret;
+}
+if (!check_section_footer(f, le)) {
+ret = -EINVAL;
+return ret;
+}
+
+return 0;
+}
 int qemu_loadvm_state(QEMUFile *f)
 {
 MigrationIncomingState *mis = migration_incoming_get_current();
@@ -1096,87 +1190,20 @@ int qemu_loadvm_state(QEMUFile *f)
 }
 
 while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
-uint32_t instance_id, version_id, section_id;
-SaveStateEntry *se;
-LoadStateEntry *le;
-char idstr[256];
 
 trace_qemu_loadvm_state_section(section_type);
 switch (section_type) {
 case QEMU_VM_SECTION_START:
 case QEMU_VM_SECTION_FULL:
-/* Read section start */
-section_id = qemu_get_be32(f);
-if (!qemu_get_counted_string(f, idstr)) {
-error_report("Unable to read ID string for section %u",
-section_id);
-return -EINVAL;
-}
-instance_id = qemu_get_be32(f);
-version_id = qemu_get_be32(f);
-
-trace_qemu_loadvm_state_section_startfull(section_id, idstr,
-  instance_id, version_id);
-/* Find savevm section */
-se = find_se(idstr, instance_id);
-if (se == NULL) {
-error_report("Unknown savevm section or instance '%s' %d",
- idstr, instance_id);
-ret = -EINVAL;
-goto out;
-}
-
-/* Validate version */
-if (version_id > se->version_id) {
-error_report("savevm: unsupported version %d for '%s' v%d",
- version_id, idstr, se->version_id);
-ret = -EINVAL;
-goto out;
-}
-
-/* Add entry */
-le = g_malloc0(sizeof(*le));
-
-le->se = se;
-le->section_id = section_id;
-le->version_id = version_id;
-

Re: [Qemu-devel] [PATCH] target-alpha: fix uninitialized variable

2015-11-03 Thread Paolo Bonzini



On 03/11/2015 13:01, Michael Tokarev wrote:
> 22.10.2015 02:15, Richard Henderson wrote:
>> On 10/19/2015 04:08 AM, Paolo Bonzini wrote:
>>> I am not sure why the compiler does not catch it.  There is no
>>> semantic change since gen_excp returns EXIT_NORETURN, but the
>>> old code is wrong.
>>>
>>> Reported by Coverity.
>>>
>>> Signed-off-by: Paolo Bonzini
>>> ---
>>>   target-alpha/translate.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> Reviewed-by: Richard Henderson 
>>
>> I assume this will go in via trivial.
> 
> Richard, after your patch 522a0d4e3c0d397ffb45ec400d8cbd426dad9d17
> "target-*: Advance pc after recognizing a breakpoint", this code
> needs another review I think, as you modified the subsequent line ;)
> Please take a look.

It's okay, I noticed the conflict.  Assigning the return value of
gen_excp is still the right thing to do.

Paolo

Re: [Qemu-devel] [PATCH v2 0/9] block: Fixes for bdrv_drain

2015-11-03 Thread Kevin Wolf

Am 29.10.2015 um 03:14 hat Fam Zheng geschrieben:
> v2: Add Kevin's reviewed-by in patches 1, 2, 5-7, 9.
> Address Kevin's reviewing comments which are:
> - Explicit "ret = 0" before out label in patch 3.
> - Add missing qemu_aio_unref() in patch 4.
> - Recurse into all children in bdrv_drain in patch 8.
> 
> Previously bdrv_drain and bdrv_drain_all don't handle ioctl, flush and discard
> requests (which are fundamentally the same as read and write requests that
> change disk state).  Forgetting such requests leaves us in risk of violating
> the invariant that bdrv_drain() callers rely on - all asynchronous requests
> must have completed after bdrv_drain returns.
> 
> Enrich the tracked request types, and add tracked_request_begin/end pairs to
> all three code paths. As a prerequisite, ioctl code is moved into coroutine
> too.
> 
> The last two patches take care of QED's "need check" timer, so that after
> bdrv_drain returns, the driver is in a consistent state.

Reviewed-by: Kevin Wolf

Re: [Qemu-devel] [PATCH v2 0/4] ui audio qxl usb: Use g_new() & friends where that makes obvious sense

2015-11-03 Thread Michael Tokarev

29.10.2015 18:55, Markus Armbruster wrote:
> v2:
> * Trivially rebased
> 
> Markus Armbruster (4):
>   ui: Use g_new() & friends where that makes obvious sense
>   audio: Use g_new() & friends where that makes obvious sense
>   qxl: Use g_new() & friends where that makes obvious sense
>   usb: Use g_new() & friends where that makes obvious sense

Applied to -trivial except the audio bits and ui/vnc.c bits.

Thanks,

/mjt

Re: [Qemu-devel] [PATCH] pci-assign: do not test path with access() before opening

2015-11-03 Thread Michael Tokarev

02.11.2015 17:17, Paolo Bonzini wrote:
> Using access() is a time-of-check/time-of-use race condition.  It is
> okay to use them to provide better error messages, but that is pretty
> much it.
> 
> In this case we can get the same error from fopen(), so just use
> strerror and errno there---which actually improves the error
> message most of the time.

Applied to -trivial, thank you!

/mjt

Re: [Qemu-devel] [v2 RESEND 1/4] migration: defer migration_end & blk_mig_cleanup

2015-11-03 Thread Juan Quintela

Liang Li  wrote:
> Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
> lazy collapsing of small sptes into large sptes mechanism, now
> migration_end() is a time consuming operation because it calls
> memroy_global_dirty_log_stop(), which will trigger the dropping of small
> sptes operation and takes about dozens of milliseconds, so call
> migration_end() before all the vmsate data has already been transferred
> to the destination will prolong VM downtime. This operation should be
> deferred after all the data has been transferred to the destination.
>
> blk_mig_cleanup() can be deferred too.
>
> For a VM with 8G RAM, this patch can reduce the VM downtime about 30 ms.
>
> Signed-off-by: Liang Li 
> Reviewed-by: Paolo Bonzini 

Reviewed-by: Juan Quintela 

And the naming makes more sense even, thanks

Re: [Qemu-devel] [v2 RESEND 2/4] migration: rename qemu_savevm_state_cancel

2015-11-03 Thread Juan Quintela

Liang Li  wrote:
> The function qemu_savevm_state_cancel is called after the migration
> in migration_thread, it seems strange to 'cancel' it after completion,
> rename it to qemu_savevm_state_cleanup looks better.
>
> Signed-off-by: Liang Li 

Reviewed-by: Juan Quintela

Re: [Qemu-devel] [v2 RESEND 3/4] migration: rename cancel to cleanup in SaveVMHandles

2015-11-03 Thread Juan Quintela

Liang Li  wrote:
> 'cleanup' seems more appropriate than 'cancel'.
>
> Signed-off-by: Liang Li 

Reviewed-by: Juan Quintela

Re: [Qemu-devel] [PATCH] exec: avoid unnecessary cacheline bounce on ram_list.mru_block

2015-11-03 Thread Michael Tokarev

22.10.2015 14:51, Paolo Bonzini write:
> Whenever the MRU cache hits for the list of RAM blocks, qemu_get_ram_block
> does an unnecessary write that causes a processor cache line to bounce
> from one core to another.  This causes a performance hit.

Applied to -trivial.  A good one! :)

/mjt

[Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user

2015-11-03 Thread zhanghailiang


Hi,

We catch a segfault in our project.

Qemu version is 2.3.0

The Stack backtrace is:
(gdb) bt
#0  0x in ?? ()
#1  0x7f7ad9280b2f in qemu_deliver_packet (sender=, flags=, data=, size=100, opaque=
0x7f7ad9d6db10) at net/net.c:510
#2  0x7f7ad92831fa in qemu_net_queue_deliver (size=, data=, flags=,
sender=, queue=) at net/queue.c:157
#3  qemu_net_queue_flush (queue=0x7f7ad9d39630) at net/queue.c:254
#4  0x7f7ad9280dac in qemu_flush_or_purge_queued_packets 
(nc=0x7f7ad9d6db10, purge=true) at net/net.c:539
#5  0x7f7ad9280e76 in net_vm_change_state_handler (opaque=, 
running=, state=100) at net/net.c:1214
#6  0x7f7ad915612f in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) 
at vl.c:1820
#7  0x7f7ad906db1a in do_vm_stop (state=) at 
/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:631
#8  vm_stop (state=RUN_STATE_SHUTDOWN) at 
/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:1325
#9  0x7f7ad915e4a2 in main_loop_should_exit () at vl.c:2080
#10 main_loop () at vl.c:2131
#11 main (argc=, argv=, envp=) at 
vl.c:4721
(gdb) p *(NetClientState *)0x7f7ad9d6db10
$1 = {info = 0x7f7ad9824520, link_down = 0, next = {tqe_next = 0x7f7ad0f06d10, 
tqe_prev = 0x7f7ad98b1cf0}, peer = 0x7f7ad0f06d10,
  incoming_queue = 0x7f7ad9d39630, model = 0x7f7ad9d39590 "vhost_user", name = 
0x7f7ad9d39570 "hostnet0", info_str =
"vhost-user to charnet0", '\000' , receive_disabled = 0, 
destructor =
0x7f7ad92821f0 , queue_index = 0, 
rxfilter_notify_enabled = 0}
(gdb) p *(NetClientInfo *)0x7f7ad9824520
$2 = {type = NET_CLIENT_OPTIONS_KIND_VHOST_USER, size = 360, receive = 0, 
receive_raw = 0, receive_iov = 0, can_receive = 0, cleanup =
0x7f7ad9288850 , link_status_changed = 0, 
query_rx_filter = 0, poll = 0, has_ufo =
0x7f7ad92886d0 , has_vnet_hdr = 0x7f7ad9288670 
, has_vnet_hdr_len = 0,
  using_vnet_hdr = 0, set_offload = 0, set_vnet_hdr_len = 0}
(gdb)

The corresponding codes where gdb reports error are: (We have added some codes 
in net.c)
ssize_t qemu_deliver_packet(NetClientState *sender,
unsigned flags,
const uint8_t *data,
size_t size,
void *opaque)
{
NetClientState *nc = opaque;
ssize_t ret;

if (nc->link_down) {
return size;
}

if (nc->receive_disabled) {
return 0;
}

if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
ret = nc->info->receive_raw(nc, data, size);
} else {
ret = nc->info->receive(nc, data, size);   > Here is 510 line
}

I'm not quite familiar with vhost-user, but for vhost-user, these two callback 
functions seem to be always NULL,
Why we can come here ?
Is it an error to add VM state change handler for vhost-user ?

Thanks,
zhanghailiang

Re: [Qemu-devel] [PATCH v7 27/35] nvdimm acpi: build ACPI nvdimm devices

2015-11-03 Thread Igor Mammedov

On Mon,  2 Nov 2015 17:13:29 +0800
Xiao Guangrong  wrote:

> NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices
> 
> There is a root device under \_SB and specified NVDIMM devices are under the
> root device. Each NVDIMM device has _ADR which returns its handle used to
> associate MEMDEV structure in NFIT
> 
> We reserve handle 0 for root device. In this patch, we save handle, handle,
> arg1 and arg2 to dsm memory. Arg3 is conditionally saved in later patch
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 184 
> +++
>  1 file changed, 184 insertions(+)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index dd84e5f..53ed675 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -368,6 +368,15 @@ static void nvdimm_build_nfit(GSList *device_list, 
> GArray *table_offsets,
>  g_array_free(structures, true);
>  }
>  
> +struct NvdimmDsmIn {
> +uint32_t handle;
> +uint32_t revision;
> +uint32_t function;
> +   /* the remaining size in the page is used by arg3. */
> +uint8_t arg3[0];
> +} QEMU_PACKED;
> +typedef struct NvdimmDsmIn NvdimmDsmIn;
> +
>  static uint64_t
>  nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  {
> @@ -377,6 +386,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  static void
>  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>  {
> +fprintf(stderr, "BUG: we never write DSM notification IO Port.\n");
it doesn't seem like this hunk belongs here

>  }
>  
>  static const MemoryRegionOps nvdimm_dsm_ops = {
> @@ -402,6 +412,179 @@ void nvdimm_init_acpi_state(MemoryRegion *memory, 
> MemoryRegion *io,
>  memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, >io_mr);
>  }
>  
> +#define BUILD_STA_METHOD(_dev_, _method_)  \
> +do {   \
> +_method_ = aml_method("_STA", 0);  \
> +aml_append(_method_, aml_return(aml_int(0x0f)));   \
> +aml_append(_dev_, _method_);   \
> +} while (0)
_STA doesn't have any logic here so drop macro and just
replace its call sites with:

aml_append(foo_dev, aml_name_decl("_STA", aml_int(0xf));


> +
> +#define BUILD_DSM_METHOD(_dev_, _method_, _handle_, _uuid_)\
> +do {   \
> +Aml *ifctx, *uuid; \
> +_method_ = aml_method("_DSM", 4);  \
> +/* check UUID if it is we expect, return the errorcode if not.*/   \
> +uuid = aml_touuid(_uuid_); \
> +ifctx = aml_if(aml_lnot(aml_equal(aml_arg(0), uuid))); \
> +aml_append(ifctx, aml_return(aml_int(1 /* Not Supported */))); \
> +aml_append(method, ifctx); \
> +aml_append(method, aml_return(aml_call4("NCAL", aml_int(_handle_), \
> +   aml_arg(1), aml_arg(2), aml_arg(3;  \
> +aml_append(_dev_, _method_);   \
> +} while (0)
> +
> +#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_) \
> +aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
> +
> +#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_) \
> +BUILD_FIELD_UNIT_SIZE(_field_, sizeof(typeof_field(_s_, _f_)), _name_)
> +
> +static void build_nvdimm_devices(GSList *device_list, Aml *root_dev)
> +{
> +for (; device_list; device_list = device_list->next) {
> +NVDIMMDevice *nvdimm = device_list->data;
> +int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +   NULL);
> +uint32_t handle = nvdimm_slot_to_handle(slot);
> +Aml *dev, *method;
> +
> +dev = aml_device("NV%02X", slot);
> +aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> +
> +BUILD_STA_METHOD(dev, method);
> +
> +/*
> + * Chapter 4: _DSM Interface for NVDIMM Device (non-root) - Example
> + * in DSM Spec Rev1.
> + */
> +BUILD_DSM_METHOD(dev, method,
> + handle /* NVDIMM Device Handle */,
> + "4309AC30-0D11-11E4-9191-0800200C9A66"
> + /* UUID for NVDIMM Devices. */);
this will add N-bytes * #NVDIMMS in worst case.
Please drop macro and just consolidate this method into _DSM method of parent 
scope
and then call it from here like this:
   Method(_DSM, 4)
   Return(^_DSM(Arg[0-3]))

> +
> +aml_append(root_dev, dev);
> +}
> +}
> +
> +static void nvdimm_build_acpi_devices(GSList

Re: [Qemu-devel] QEMU+Aarch64: in_asm log skips instructions of loop-programs

2015-11-03 Thread Sergey Smolov


Hi Christopher,

I've send my patch to the mailing list. Sorry for the great delay in 
answers.



18.09.2015 18:26, Christopher Covington пишет:

On 09/18/2015 04:15 AM, Sergey Smolov wrote:

Hi Christopher,

18.09.2015 02:02, Christopher Covington пишет:

Hi Sergey,

On 09/04/2015 12:38 PM, Sergey Smolov wrote:

03.09.2015 19:35, Peter Maydell пишет:

On 3 September 2015 at 15:31, Sergey Smolov  wrote:

Do you think it is possible to implement another QEMU logger which will
make a record for every executed block,

Yes (this would just need to disable the TB linking optimisation,
which we've discussed providing a debug toggle for in another
thread).

Ok, I've implemented a mapping between disassembled forms of instructions and
executed TBs.
Now my logger does "loop unrolling" successfully.

This sounds like it solves the same issue as -d nochain but in what's probably
a more time efficient manner. Are you able to share your implementation?

In which way should I share it? Am I need to create a patch and send it to
mailing list?

That would be ideal. If you're not familiar with the process, just let me know
and I'd be happy to help.

Thanks,
Christopher Covington

Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-03 Thread zhanghailiang


On 2015/11/3 20:57, Yang Hongyang wrote:



On 2015年11月03日 19:56, zhanghailiang wrote:

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  include/net/net.h|  3 ++
  net/filter-buffer.c  | 84 
  net/net.c| 20 +
  4 files changed, 108 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 4499d60..b0954ba 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  void *opaque);
  void filter_buffer_release_all(void);
  void  filter_buffer_del_all_timers(void);
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 5c65c45..e32bd90 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, 
void *opaque,
 Error **errp);
  void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
  Error **errp);
+typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
+Error **errp);
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
  int qemu_can_send_packet(NetClientState *nc);
  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 05313de..0dc1efb 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -15,6 +15,11 @@
  #include "qapi-visit.h"
  #include "qom/object.h"
  #include "net/net.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "monitor/monitor.h"
+

  #define TYPE_FILTER_BUFFER "filter-buffer"

@@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
  qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
  }

+static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
+ Error **errp)
+{
+NetFilterState *nf;
+bool found = false;
+
+QTAILQ_FOREACH(nf, >filters, next) {
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+found = true;


What if a filter-buffer already attached to a netdev, but has interval
set?
Is this API really necessary?



We will jump this netdev, but remove its filter-buffer timer. Meanwhile, we will
release the packets all together in colo checkpoint process.
Besides, we should resume the timer after exit COLO. (We didn't do this in this 
version).

I don't know if it is a good idea to automatically add filter-buffer for the 
device
that not configured with it. But it is really reduce the complexity of testing.


+break;
+}
+}
+
+if (!found) {
+QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;
+Visitor *ov, *iv;
+QObject *obj = NULL;
+QDict *qdict;
+void *dummy = NULL;
+char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
+char *queue = (char *) opaque;
+bool auto_add = true;
+Error *err = NULL;
+
+qov = qmp_output_visitor_new();
+ov = qmp_output_get_visitor(qov);
+visit_start_struct(ov,  , NULL, NULL, 0, );
+if (err) {
+goto out;
+}
+visit_type_str(ov, >name, "netdev", );
+if (err) {
+goto out;
+}
+visit_type_str(ov, , "queue", );
+if (err) {
+goto out;
+}
+visit_type_bool(ov, _add, "auto", );
+if (err) {
+goto out;
+}
+visit_end_struct(ov, );
+if (err) {
+goto out;
+}
+obj = qmp_output_get_qobject(qov);
+g_assert(obj != NULL);
+qdict = qobject_to_qdict(obj);
+qmp_output_visitor_cleanup(qov);
+
+qiv = qmp_input_visitor_new(obj);
+iv = qmp_input_get_visitor(qiv);
+object_add(TYPE_FILTER_BUFFER, id, qdict, iv, );
+qmp_input_visitor_cleanup(qiv);
+qobject_decref(obj);
+out:
+g_free(id);
+if (err) {
+error_propagate(errp, err);
+}
+}
+}
+/*
+* This will be used by COLO or MC FT, for which they will need
+* to buffer all the packets of all VM's net devices, Here we check
+* and automatically add netfilter for netdev that doesn't attach any buffer
+* netfilter.
+*/
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
+{
+char *queue = g_strdup(NetFilterDirection_lookup[direction]);
+
+qemu_foreach_netdev(netdev_add_filter_buffer, queue,
+

Re: [Qemu-devel] [PATCH v4 0/3] aio: Use epoll in aio_poll()

2015-11-03 Thread Stefan Hajnoczi

On Fri, Oct 30, 2015 at 12:06:26PM +0800, Fam Zheng wrote:
> v4: Rebase onto master (with aio_disable_external):
> Don't use epoll if aio_external_disabled(ctx);
> Change assert on epoll_ctl return code to disable epoll;
> Rerun benchmark;
> 
> v3: Remove the redundant check in aio_epoll_try_enable. [Stefan]
> 
> v2: Merge aio-epoll.c into aio-posix.c. [Paolo]
> Capture some benchmark data in commit log.
> 
> This series adds the ability to use epoll in aio_poll() on Linux. It's 
> switched
> on in a dynamic way rather than static for two reasons: 1) when the number of
> fds is not high enough, using epoll has little advantage; 2) when an epoll
> incompatible fd needs to be handled, we need to fall back.  The epoll is
> enabled when a fd number threshold is met.
> 
> 
> Fam Zheng (3):
>   aio: Introduce aio_external_disabled
>   aio: Introduce aio_context_setup
>   aio: Introduce aio-epoll.c
> 
>  aio-posix.c | 188 
> +++-
>  aio-win32.c |   4 ++
>  async.c |  13 +++-
>  include/block/aio.h |  24 +++
>  4 files changed, 226 insertions(+), 3 deletions(-)
> 
> -- 
> 2.4.3
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets

2015-11-03 Thread zhanghailiang


On 2015/11/3 20:39, Yang Hongyang wrote:

On 2015年11月03日 19:56, zhanghailiang wrote:

For COLO or MC FT, We need a function to release all the buffered packets
actively.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  include/net/net.h|  4 
  net/filter-buffer.c  | 15 +++
  net/net.c| 24 
  4 files changed, 44 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 2deda36..5a09607 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -73,5 +73,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  const struct iovec *iov,
  int iovcnt,
  void *opaque);
+void filter_buffer_release_all(void);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 7af3e15..5c65c45 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor 
*mon, int vlan_id,
const char *client_str);
  typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
  void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
+typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
+   Error **errp);
+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp);
  int qemu_can_send_packet(NetClientState *nc);
  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 57be149..b344901 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -14,6 +14,7 @@
  #include "qapi/qmp/qerror.h"
  #include "qapi-visit.h"
  #include "qom/object.h"
+#include "net/net.h"

  #define TYPE_FILTER_BUFFER "filter-buffer"

@@ -163,6 +164,20 @@ out:
  error_propagate(errp, local_err);
  }

+static void filter_buffer_release_packets(NetFilterState *nf, void *opaque,
+  Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+filter_buffer_flush(nf);
+}
+}
+
+/* public APIs */
+void filter_buffer_release_all(void)
+{
+qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a3e9d1a..a333b01 100644
--- a/net/net.c
+++ b/net/net.c
@@ -259,6 +259,30 @@ static char *assign_name(NetClientState *nc1, const char 
*model)
  return g_strdup_printf("%s.%d", model, id);
  }

+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp)
+{
+NetClientState *nc;
+NetFilterState *nf;
+
+QTAILQ_FOREACH(nc, _clients, next) {


Going through every filters this way might cause problem under
multiqueue case. IIRC, Jason suggested that we implement multiqueue
by this way: attach the same filter to all queues instead of
attach the clone of the filter obj to other queues. So if we
attach the same filter to all queues, going through filters
this way will cause the func been called multiple(=num of queues) times.



Got it, i will investigate it.

Thanks.
zhanghailiang


+if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+continue;
+}
+QTAILQ_FOREACH(nf, >filters, next) {
+if (func) {
+Error *local_err = NULL;
+
+func(nf, opaque, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
+}
+}
+
  static void qemu_net_client_destructor(NetClientState *nc)
  {
  g_free(nc);

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: _CCA attribute is compulsary

2015-11-03 Thread Peter Maydell

On 3 November 2015 at 09:08, Shannon Zhao  wrote:
>
>
> On 2015/11/3 16:31, Graeme Gregory wrote:
>>
>>
>> On Tue, 3 Nov 2015, at 02:25 AM, Shannon Zhao wrote:
>>> Hi Graeme,
>>>
>>> On 2015/11/2 18:39, Graeme Gregory wrote:
 According to ACPI specification 6.2.17 _CCA (Cache Coherency Attribute)
 this attribute is compulsary on ARM systems. Add this attribute to
 the PCI host bridges as required.

>>>
>>> To ACPI 5.1 this object is not compulsory and if not supplied it has
>>> default value for it. But to ACPI 6.0 it must be supplied on ARM systems.
>>> Regarding this change, ACPI 6.0 fixes 5.1 for this object, right?
>>>
>>
>> Hi Shannon, the wording in ACPI 5.1 is "On ARM based systems, the _CCA
>> object must be supplied all such devices."
>>
>> So is not functionally different from 6.0.
>>
> Oh, I see. It's updated by 5.1 Errata 1189.
>
> Reviewed-by: Shannon Zhao 



Applied to target-arm.next, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v8 07/17] qapi: Rework collision assertions

2015-11-03 Thread Eric Blake

On 11/03/2015 12:56 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> On 11/02/2015 08:37 AM, Markus Armbruster wrote:
>>
>>>
>>> Not checked: variant's members don't collide with non-variant members.
>>> I think this check got lost when we simplified
>>> QAPISchemaObjectTypeVariants to hold a single member.
>>
>> Yep, I found the culprit: in your v2 proposal for QAPISchema, you had:
>>
>> +class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
>> +def __init__(self, name, typ, flat):
>> +QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
>> +assert isinstance(flat, bool)
>> +self.flat = flat
>> +def check(self, schema, tag_type, seen):
>> +QAPISchemaObjectTypeMember.check(self, schema, [], seen)
>> +assert self.name in tag_type.values
>> +if self.flat:
>> +self.type.check(schema)
>> +assert isinstance(self.type, QAPISchemaObjectType)
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg00394.html
>>
>> but the 'if self.flat' clause was lost in v3:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00450.html
> 
> Quoting v3's change log:
> 
>   - Lower simple variants to flat ones as described on
> qapi-code-gen.txt.  Replace QAPISchemaObjectType's .flat by
> .simple_union_type(), for use by pre-existing code-generation
> warts.
> 
>> I am in fact reinstating it here, but for v9, will do it in a separate
>> patch rather than blended in with the rest of the changes.
> 
> Any "is this union flat or simple" check signals a flaw.  It's either a
> pointless difference in generated code (these should all be marked TODO
> by now), or something's wrong with the desugaring of simple to flat
> unions.

Losing 'if self.flat' was correct, but we still need 'if union'; and
that's what I add in 2/4.

> 
> Therefore, the if self.flat is superfluous.  Good, because otherwise our
> desugaring must be flawed.

And things correctly work on simple unions due to our wrapper type, so
that 'if union' was sufficient.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-11-03 Thread zhanghailiang

We should not load PVM's state directly into SVM, because there maybe some
errors happen when SVM is receving data, which will break SVM.

We need to ensure receving all data before load the state into SVM. We use
an extra memory to cache these data (PVM's ram). The ram cache in secondary side
is initially the same as SVM/PVM's memory. And in the process of checkpoint,
we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
always the same as PVM's memory at every checkpoint, then we flush this cached 
ram
to SVM after we receive all PVM's state.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Gonglei 
---
v10: Split the process of dirty pages recording into a new patch
---
 include/exec/ram_addr.h  |  1 +
 include/migration/colo.h |  3 +++
 migration/colo.c | 14 +--
 migration/ram.c  | 61 ++--
 4 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 3360ac5..e7c4310 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -28,6 +28,7 @@ struct RAMBlock {
 struct rcu_head rcu;
 struct MemoryRegion *mr;
 uint8_t *host;
+uint8_t *host_cache; /* For colo, VM's ram cache */
 ram_addr_t offset;
 ram_addr_t used_length;
 ram_addr_t max_length;
diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2676c4a..8edd5f1 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -29,4 +29,7 @@ bool migration_incoming_enable_colo(void);
 void migration_incoming_exit_colo(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index b865513..25f85b2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -304,6 +304,12 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+ret = colo_init_ram_cache();
+if (ret < 0) {
+error_report("Failed to initialize ram cache");
+goto out;
+}
+
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
 if (ret < 0) {
 goto out;
@@ -331,14 +337,14 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
-/* TODO: read migration data into colo buffer */
+/* TODO Load VM state */
 
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
 if (ret < 0) {
 goto out;
 }
 
-/* TODO: load vm state */
+/* TODO: flush vm state */
 
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
 if (ret < 0) {
@@ -352,6 +358,10 @@ out:
  strerror(-ret));
 }
 
+qemu_mutex_lock_iothread();
+colo_release_ram_cache();
+qemu_mutex_unlock_iothread();
+
 if (mis->to_src_file) {
 qemu_fclose(mis->to_src_file);
 }
diff --git a/migration/ram.c b/migration/ram.c
index 5784c15..b094dc3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -222,6 +222,7 @@ static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
+static bool ram_cache_enable;
 static uint32_t last_version;
 static bool ram_bulk_stage;
 
@@ -1446,7 +1447,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
 return NULL;
 }
 
-return block->host + offset;
+if (ram_cache_enable) {
+return block->host_cache + offset;
+} else {
+return block->host + offset;
+}
 }
 
 len = qemu_get_byte(f);
@@ -1456,7 +1461,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
 QLIST_FOREACH_RCU(block, _list.blocks, next) {
 if (!strncmp(id, block->idstr, sizeof(id)) &&
 block->max_length > offset) {
-return block->host + offset;
+if (ram_cache_enable) {
+return block->host_cache + offset;
+} else {
+return block->host + offset;
+}
 }
 }
 
@@ -1707,6 +1716,54 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 return ret;
 }
 
+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM, it will be called after first migration.
+ */
+int colo_init_ram_cache(void)
+{
+RAMBlock *block;
+
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, _list.blocks, next) {
+block->host_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+if (!block->host_cache) {
+goto out_locked;
+}
+memcpy(block->host_cache, block->host, block->used_length);
+}
+rcu_read_unlock();
+

[Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval

2015-11-03 Thread zhanghailiang

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
 net/filter-buffer.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 5f0ea70..05313de 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -104,16 +104,6 @@ static void filter_buffer_setup(NetFilterState *nf, Error 
**errp)
 {
 FilterBufferState *s = FILTER_BUFFER(nf);
 
-/*
- * We may want to accept zero interval when VM FT solutions like MC
- * or COLO use this filter to release packets on demand.
- */
-if (!s->interval) {
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
-   "a non-zero interval");
-return;
-}
-
 s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
 if (s->interval) {
 timer_init_us(>release_timer, QEMU_CLOCK_VIRTUAL,
-- 
1.8.3.1

[Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM

2015-11-03 Thread zhanghailiang

If users require SVM to takeover work, colo incoming thread should
exit from loop while failover BH helps backing to migration incoming
coroutine.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
---
 migration/colo.c | 41 ++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 95f1405..925a694 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -52,6 +52,33 @@ static bool colo_runstate_is_stopped(void)
 return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static void secondary_vm_do_failover(void)
+{
+int old_state;
+MigrationIncomingState *mis = migration_incoming_get_current();
+
+migrate_set_state(>state, MIGRATION_STATUS_COLO,
+  MIGRATION_STATUS_COMPLETED);
+
+if (!autostart) {
+error_report("\"-S\" qemu option will be ignored in secondary side");
+/* recover runstate to normal migration finish state */
+autostart = true;
+}
+
+old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+   FAILOVER_STATUS_COMPLETED);
+if (old_state != FAILOVER_STATUS_HANDLING) {
+error_report("Serious error while do failover for secondary VM,"
+ "old_state: %d", old_state);
+return;
+}
+/* For Secondary VM, jump to incoming co */
+if (mis->migration_incoming_co) {
+qemu_coroutine_enter(mis->migration_incoming_co, NULL);
+}
+}
+
 static void primary_vm_do_failover(void)
 {
 MigrationState *s = migrate_get_current();
@@ -83,6 +110,8 @@ void colo_do_failover(MigrationState *s)
 
 if (get_colo_mode() == COLO_MODE_PRIMARY) {
 primary_vm_do_failover();
+} else {
+secondary_vm_do_failover();
 }
 }
 
@@ -410,6 +439,11 @@ void *colo_process_incoming_thread(void *opaque)
 }
 }
 
+if (failover_request_is_active()) {
+error_report("failover request");
+goto out;
+}
+
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
 if (ret < 0) {
 goto out;
@@ -474,10 +508,11 @@ out:
 qemu_fclose(fb);
 }
 qsb_free(buffer);
-
-qemu_mutex_lock_iothread();
+/* Here, we can ensure BH is hold the global lock, and will join colo
+* incoming thread, so here it is not necessary to lock here again,
+* or there will be a deadlock error.
+*/
 colo_release_ram_cache();
-qemu_mutex_unlock_iothread();
 
 if (mis->to_src_file) {
 qemu_fclose(mis->to_src_file);
-- 
1.8.3.1

[Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-03 Thread zhanghailiang

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
 include/net/filter.h |  1 +
 include/net/net.h|  3 ++
 net/filter-buffer.c  | 84 
 net/net.c| 20 +
 4 files changed, 108 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 4499d60..b0954ba 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
 void *opaque);
 void filter_buffer_release_all(void);
 void  filter_buffer_del_all_timers(void);
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 5c65c45..e32bd90 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, 
void *opaque,
Error **errp);
 void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
 Error **errp);
+typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
+Error **errp);
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
 int qemu_can_send_packet(NetClientState *nc);
 ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
   int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 05313de..0dc1efb 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -15,6 +15,11 @@
 #include "qapi-visit.h"
 #include "qom/object.h"
 #include "net/net.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "monitor/monitor.h"
+
 
 #define TYPE_FILTER_BUFFER "filter-buffer"
 
@@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
 qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
 }
 
+static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
+ Error **errp)
+{
+NetFilterState *nf;
+bool found = false;
+
+QTAILQ_FOREACH(nf, >filters, next) {
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+found = true;
+break;
+}
+}
+
+if (!found) {
+QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;
+Visitor *ov, *iv;
+QObject *obj = NULL;
+QDict *qdict;
+void *dummy = NULL;
+char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
+char *queue = (char *) opaque;
+bool auto_add = true;
+Error *err = NULL;
+
+qov = qmp_output_visitor_new();
+ov = qmp_output_get_visitor(qov);
+visit_start_struct(ov,  , NULL, NULL, 0, );
+if (err) {
+goto out;
+}
+visit_type_str(ov, >name, "netdev", );
+if (err) {
+goto out;
+}
+visit_type_str(ov, , "queue", );
+if (err) {
+goto out;
+}
+visit_type_bool(ov, _add, "auto", );
+if (err) {
+goto out;
+}
+visit_end_struct(ov, );
+if (err) {
+goto out;
+}
+obj = qmp_output_get_qobject(qov);
+g_assert(obj != NULL);
+qdict = qobject_to_qdict(obj);
+qmp_output_visitor_cleanup(qov);
+
+qiv = qmp_input_visitor_new(obj);
+iv = qmp_input_get_visitor(qiv);
+object_add(TYPE_FILTER_BUFFER, id, qdict, iv, );
+qmp_input_visitor_cleanup(qiv);
+qobject_decref(obj);
+out:
+g_free(id);
+if (err) {
+error_propagate(errp, err);
+}
+}
+}
+/*
+* This will be used by COLO or MC FT, for which they will need
+* to buffer all the packets of all VM's net devices, Here we check
+* and automatically add netfilter for netdev that doesn't attach any buffer
+* netfilter.
+*/
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
+{
+char *queue = g_strdup(NetFilterDirection_lookup[direction]);
+
+qemu_foreach_netdev(netdev_add_filter_buffer, queue,
+errp);
+g_free(queue);
+}
+
 static void filter_buffer_init(Object *obj)
 {
 object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a333b01..4fbe0af 100644
--- a/net/net.c
+++ b/net/net.c
@@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, 
void *opaque,
 }
 }
 
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
+{
+NetClientState *nc;
+
+QTAILQ_FOREACH(nc, _clients, next) {
+if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+continue;
+}
+if (func) {
+Error

Re: [Qemu-devel] [PATCH] ivshmem-server: fix possible OVERRUN

2015-11-03 Thread Michael Tokarev

Applied to -trivial, thanks!

/mjt

[Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it

2015-11-03 Thread zhanghailiang

We should not destroy the state of SVM (Secondary VM) until we receive the whole
state from the PVM (Primary VM), in case the primary fails in the middle of 
sending
the state, so, here we cache the device state in Secondary before restore it.

Besides, we should call qemu_system_reset() before load VM state,
which can ensure the data is intact.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Gonglei 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/colo.c | 47 ++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 25f85b2..1339774 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -287,6 +287,9 @@ static int colo_wait_handle_cmd(QEMUFile *f, int 
*checkpoint_request)
 void *colo_process_incoming_thread(void *opaque)
 {
 MigrationIncomingState *mis = opaque;
+QEMUFile *fb = NULL;
+QEMUSizedBuffer *buffer = NULL; /* Cache incoming device state */
+int  total_size;
 int fd, ret = 0;
 
 migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,
@@ -310,6 +313,12 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
+if (buffer == NULL) {
+error_report("Failed to allocate colo buffer!");
+goto out;
+}
+
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
 if (ret < 0) {
 goto out;
@@ -337,19 +346,50 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
-/* TODO Load VM state */
+/* read the VM state total size first */
+total_size = colo_ctl_get(mis->from_src_file,
+  COLO_COMMAND_VMSTATE_SIZE);
+if (total_size <= 0) {
+goto out;
+}
+
+/* read vm device state into colo buffer */
+ret = qsb_fill_buffer(buffer, mis->from_src_file, total_size);
+if (ret != total_size) {
+error_report("can't get all migration data");
+goto out;
+}
 
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
 if (ret < 0) {
 goto out;
 }
 
+/* open colo buffer for read */
+fb = qemu_bufopen("r", buffer);
+if (!fb) {
+error_report("can't open colo buffer for read");
+goto out;
+}
+
+qemu_mutex_lock_iothread();
+qemu_system_reset(VMRESET_SILENT);
+if (qemu_loadvm_state(fb) < 0) {
+error_report("COLO: loadvm failed");
+qemu_mutex_unlock_iothread();
+goto out;
+}
+qemu_mutex_unlock_iothread();
+
 /* TODO: flush vm state */
 
 ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
 if (ret < 0) {
 goto out;
 }
+
+qemu_fclose(fb);
+fb = NULL;
 }
 
 out:
@@ -358,6 +398,11 @@ out:
  strerror(-ret));
 }
 
+if (fb) {
+qemu_fclose(fb);
+}
+qsb_free(buffer);
+
 qemu_mutex_lock_iothread();
 colo_release_ram_cache();
 qemu_mutex_unlock_iothread();
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 2/2] io/buffer: avoid memmove at each qio_buffer_advance

2015-11-03 Thread Peter Lieven


Am 03.11.2015 um 11:52 schrieb Gerd Hoffmann:

diff --git a/include/io/buffer.h b/include/io/buffer.h
index f63869e..43688cc 100644
--- a/include/io/buffer.h
+++ b/include/io/buffer.h
@@ -39,6 +39,8 @@ struct QIOBuffer {
  size_t offset;
  uint64_t avg_size;
  uint8_t *buffer;
+size_t base_offs;
+uint8_t *base_ptr;

Why a separate base_ptr?

While being at it I'd much prefer to replace offset with start & end.
The buffer content is buffer[start] ... buffer[end-1] then.

We can allow the buffer to wrap around, i.e. end < start.  Buffer
content is buf[start] ... buffer[size-1] and buffer[0] .. buffer[end-1]
then.  Makes the buffer management a bit more complicated, but we never
have to memmove then (except when changing buffer size) and the
WASTED_SIZE logic isn't needed too ...


Then drop this patch and try to use your approach for 2.6+

Peter

[Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets

2015-11-03 Thread zhanghailiang

Signed-off-by: zhanghailiang 
---
v10: Use the new API
---
 migration/colo.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 36f737a..25335db 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -21,6 +21,8 @@
 #include "qapi-event.h"
 #include "qmp-commands.h"
 #include "qapi-types.h"
+#include "net/filter.h"
+#include "net/net.h"
 
 /*
  * The delay time before qemu begin the procedure of default failover 
treatment.
@@ -59,6 +61,24 @@ static bool colo_runstate_is_stopped(void)
 return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static int colo_init_filter_buffers(void)
+{
+Error *local_err = NULL;
+
+qemu_auto_add_filter_buffer(NET_FILTER_DIRECTION_RX, _err);
+if (local_err) {
+error_report_err(local_err);
+return -1;
+}
+filter_buffer_del_all_timers();
+return 0;
+}
+
+static void colo_cleanup_filter_buffers(void)
+{
+qemu_auto_del_filter_buffer(NULL);
+}
+
 static void secondary_vm_do_failover(void)
 {
 int old_state;
@@ -123,6 +143,7 @@ static void primary_vm_do_failover(void)
 if (s->to_dst_file) {
 qemu_file_shutdown(s->to_dst_file);
 }
+colo_cleanup_filter_buffers();
 
 vm_start();
 
@@ -291,6 +312,8 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 goto out;
 }
 
+filter_buffer_release_all();
+
 if (colo_shutdown) {
 colo_ctl_put(s->to_dst_file, COLO_COMMAND_GUEST_SHUTDOWN, 0);
 qemu_fflush(s->to_dst_file);
@@ -339,6 +362,12 @@ static void colo_process_checkpoint(MigrationState *s)
 
 failover_init_state();
 
+ret = colo_init_filter_buffers();
+if (ret < 0) {
+ret = -EINVAL;
+goto out;
+}
+
 /* Dup the fd of to_dst_file */
 fd = dup(qemu_get_fd(s->to_dst_file));
 if (fd == -1) {
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 0/2] Fix the reopening of images in 'block-commit'

2015-11-03 Thread Kevin Wolf

Am 28.10.2015 um 14:43 hat Alberto Garcia geschrieben:
> This series fixes a bug in the 'block-commit' operation under the
> following scenario:
> 
>[A] <- [B] <- [C] <- [D]
> 
> If we do block-commit top=B base=A, the contents of [B] will be
> written into [A] resulting in this chain:
> 
>[A] <- [C] <- [D]
> 
> In order to perform this operation, [A] must be reopened in read-write
> mode but so does [C] because its backing file string needs to be
> updated to point at [A].
> 
> There's a bug in the current code that makes [A] read-only again when
> [C] is reopened. This series includes a fix for that bug plus a test
> case for the scenario.
> 
> This affects both master and the 2.4 branch.

Thanks, applied to the block branch.

Kevin

[Qemu-devel] [PATCH] set_memory_options: remove code that make no sense

2015-11-03 Thread Cao jin

Signed-off-by: Cao jin 
---
 vl.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/vl.c b/vl.c
index f5f7c3f..13f2c8b 100644
--- a/vl.c
+++ b/vl.c
@@ -2860,11 +2860,6 @@ static void set_memory_options(uint64_t *ram_slots, 
ram_addr_t *maxram_size,
 sz = 0;
 mem_str = qemu_opt_get(opts, "size");
 if (mem_str) {
-if (!*mem_str) {
-error_report("missing 'size' option value");
-exit(EXIT_FAILURE);
-}
-
 sz = qemu_opt_get_size(opts, "size", ram_size);
 
 /* Fix up legacy suffix-less format */
@@ -2886,10 +2881,6 @@ static void set_memory_options(uint64_t *ram_slots, 
ram_addr_t *maxram_size,
 
 sz = QEMU_ALIGN_UP(sz, 8192);
 ram_size = sz;
-if (ram_size != sz) {
-error_report("ram size too large");
-exit(EXIT_FAILURE);
-}
 
 /* store value for the future use */
 qemu_opt_set_number(opts, "size", ram_size, _abort);
-- 
2.1.0

Re: [Qemu-devel] [PATCH v7 09/35] exec: allow file_ram_alloc to work on file

2015-11-03 Thread Igor Mammedov

On Mon,  2 Nov 2015 17:13:11 +0800
Xiao Guangrong  wrote:

> Currently, file_ram_alloc() only works on directory - it creates a file
> under @path and do mmap on it
> 
> This patch tries to allow it to work on file directly, if @path is a
> directory it works as before, otherwise it treats @path as the target
> file then directly allocate memory from it
Paolo has just queued
https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg06513.html
perhaps that's what you can reuse here.
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  exec.c | 80 
> ++
>  1 file changed, 51 insertions(+), 29 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 9075f4d..db0fdaf 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1174,14 +1174,60 @@ void qemu_mutex_unlock_ramlist(void)
>  }
>  
>  #ifdef __linux__
> +static bool path_is_dir(const char *path)
> +{
> +struct stat fs;
> +
> +return stat(path, ) == 0 && S_ISDIR(fs.st_mode);
> +}
> +
> +static int open_ram_file_path(RAMBlock *block, const char *path, size_t size)
> +{
> +char *filename;
> +char *sanitized_name;
> +char *c;
> +int fd;
> +
> +if (!path_is_dir(path)) {
> +int flags = (block->flags & RAM_SHARED) ? O_RDWR : O_RDONLY;
> +
> +flags |= O_EXCL;
> +return open(path, flags);
> +}
> +
> +/* Make name safe to use with mkstemp by replacing '/' with '_'. */
> +sanitized_name = g_strdup(memory_region_name(block->mr));
> +for (c = sanitized_name; *c != '\0'; c++) {
> +if (*c == '/') {
> +*c = '_';
> +}
> +}
> +filename = g_strdup_printf("%s/qemu_back_mem.%s.XX", path,
> +   sanitized_name);
> +g_free(sanitized_name);
> +fd = mkstemp(filename);
> +if (fd >= 0) {
> +unlink(filename);
> +/*
> + * ftruncate is not supported by hugetlbfs in older
> + * hosts, so don't bother bailing out on errors.
> + * If anything goes wrong with it under other filesystems,
> + * mmap will fail.
> + */
> +if (ftruncate(fd, size)) {
> +perror("ftruncate");
> +}
> +}
> +g_free(filename);
> +
> +return fd;
> +}
> +
>  static void *file_ram_alloc(RAMBlock *block,
>  ram_addr_t memory,
>  const char *path,
>  Error **errp)
>  {
> -char *filename;
> -char *sanitized_name;
> -char *c;
>  void *area;
>  int fd;
>  uint64_t pagesize;
> @@ -1212,38 +1258,14 @@ static void *file_ram_alloc(RAMBlock *block,
>  goto error;
>  }
>  
> -/* Make name safe to use with mkstemp by replacing '/' with '_'. */
> -sanitized_name = g_strdup(memory_region_name(block->mr));
> -for (c = sanitized_name; *c != '\0'; c++) {
> -if (*c == '/')
> -*c = '_';
> -}
> -
> -filename = g_strdup_printf("%s/qemu_back_mem.%s.XX", path,
> -   sanitized_name);
> -g_free(sanitized_name);
> +memory = ROUND_UP(memory, pagesize);
>  
> -fd = mkstemp(filename);
> +fd = open_ram_file_path(block, path, memory);
>  if (fd < 0) {
>  error_setg_errno(errp, errno,
>   "unable to create backing store for path %s", path);
> -g_free(filename);
>  goto error;
>  }
> -unlink(filename);
> -g_free(filename);
> -
> -memory = ROUND_UP(memory, pagesize);
> -
> -/*
> - * ftruncate is not supported by hugetlbfs in older
> - * hosts, so don't bother bailing out on errors.
> - * If anything goes wrong with it under other filesystems,
> - * mmap will fail.
> - */
> -if (ftruncate(fd, memory)) {
> -perror("ftruncate");
> -}
>  
>  area = qemu_ram_mmap(fd, memory, pagesize, block->flags & RAM_SHARED);
>  if (area == MAP_FAILED) {

Re: [Qemu-devel] [PATCH] set_memory_options: remove code that make no sense

2015-11-03 Thread Michael Tokarev

03.11.2015 15:30, Cao jin wrote:
> Signed-off-by: Cao jin 
> ---
>  vl.c | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index f5f7c3f..13f2c8b 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2860,11 +2860,6 @@ static void set_memory_options(uint64_t *ram_slots, 
> ram_addr_t *maxram_size,
>  sz = 0;
>  mem_str = qemu_opt_get(opts, "size");
>  if (mem_str) {
> -if (!*mem_str) {
> -error_report("missing 'size' option value");
> -exit(EXIT_FAILURE);
> -}

I'm not sure this one is bad or good, it is indeed possible
to specify no value for size=, but if we're to check that,
we'd have to add such checks everywhere.

But the next one...

>  sz = qemu_opt_get_size(opts, "size", ram_size);
>  
>  /* Fix up legacy suffix-less format */
> @@ -2886,10 +2881,6 @@ static void set_memory_options(uint64_t *ram_slots, 
> ram_addr_t *maxram_size,
>  
>  sz = QEMU_ALIGN_UP(sz, 8192);
>  ram_size = sz;
> -if (ram_size != sz) {
> -error_report("ram size too large");
> -exit(EXIT_FAILURE);
> -}

is definitely wrong.

sz is uint64_t, while ram_size is ram_addr_t which is
either uint64_t or uintptr_t.  Until it is fixed to
always be 64bits, the above code makes (some) sense.

Thanks,

/mjt

Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-03 Thread Yang Hongyang




On 2015年11月03日 19:56, zhanghailiang wrote:

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  1 +
  include/net/net.h|  3 ++
  net/filter-buffer.c  | 84 
  net/net.c| 20 +
  4 files changed, 108 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 4499d60..b0954ba 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  void *opaque);
  void filter_buffer_release_all(void);
  void  filter_buffer_del_all_timers(void);
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 5c65c45..e32bd90 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, 
void *opaque,
 Error **errp);
  void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
  Error **errp);
+typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
+Error **errp);
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
  int qemu_can_send_packet(NetClientState *nc);
  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 05313de..0dc1efb 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -15,6 +15,11 @@
  #include "qapi-visit.h"
  #include "qom/object.h"
  #include "net/net.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "monitor/monitor.h"
+

  #define TYPE_FILTER_BUFFER "filter-buffer"

@@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
  qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
  }

+static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
+ Error **errp)
+{
+NetFilterState *nf;
+bool found = false;
+
+QTAILQ_FOREACH(nf, >filters, next) {
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+found = true;


What if a filter-buffer already attached to a netdev, but has interval
set?
Is this API really necessary?


+break;
+}
+}
+
+if (!found) {
+QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;
+Visitor *ov, *iv;
+QObject *obj = NULL;
+QDict *qdict;
+void *dummy = NULL;
+char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
+char *queue = (char *) opaque;
+bool auto_add = true;
+Error *err = NULL;
+
+qov = qmp_output_visitor_new();
+ov = qmp_output_get_visitor(qov);
+visit_start_struct(ov,  , NULL, NULL, 0, );
+if (err) {
+goto out;
+}
+visit_type_str(ov, >name, "netdev", );
+if (err) {
+goto out;
+}
+visit_type_str(ov, , "queue", );
+if (err) {
+goto out;
+}
+visit_type_bool(ov, _add, "auto", );
+if (err) {
+goto out;
+}
+visit_end_struct(ov, );
+if (err) {
+goto out;
+}
+obj = qmp_output_get_qobject(qov);
+g_assert(obj != NULL);
+qdict = qobject_to_qdict(obj);
+qmp_output_visitor_cleanup(qov);
+
+qiv = qmp_input_visitor_new(obj);
+iv = qmp_input_get_visitor(qiv);
+object_add(TYPE_FILTER_BUFFER, id, qdict, iv, );
+qmp_input_visitor_cleanup(qiv);
+qobject_decref(obj);
+out:
+g_free(id);
+if (err) {
+error_propagate(errp, err);
+}
+}
+}
+/*
+* This will be used by COLO or MC FT, for which they will need
+* to buffer all the packets of all VM's net devices, Here we check
+* and automatically add netfilter for netdev that doesn't attach any buffer
+* netfilter.
+*/
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
+{
+char *queue = g_strdup(NetFilterDirection_lookup[direction]);
+
+qemu_foreach_netdev(netdev_add_filter_buffer, queue,
+errp);
+g_free(queue);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a333b01..4fbe0af 100644
--- a/net/net.c
+++ b/net/net.c
@@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, 
void *opaque,
  }
  }

+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
+{
+NetClientState *nc;
+

Re: [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters

2015-11-03 Thread Yang Hongyang


On 2015年11月03日 19:56, zhanghailiang wrote:

We add a new property 'auto' for netfilter to distinguish if netfilter is
added by user or automatically added.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
---
v10: new patch
---
  include/net/filter.h |  2 ++
  net/filter-buffer.c  | 17 +
  net/filter.c | 15 +++
  3 files changed, 34 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index b0954ba..46d3ef9 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -55,6 +55,7 @@ struct NetFilterState {
  char *netdev_id;
  NetClientState *netdev;
  NetFilterDirection direction;
+bool auto_add;
  char info_str[256];
  QTAILQ_ENTRY(NetFilterState) next;
  };
@@ -76,5 +77,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  void filter_buffer_release_all(void);
  void  filter_buffer_del_all_timers(void);
  void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
+void qemu_auto_del_filter_buffer(Error **errp);

  #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 0dc1efb..ea4481c 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -19,6 +19,7 @@
  #include "qapi/qmp-output-visitor.h"
  #include "qapi/qmp-input-visitor.h"
  #include "monitor/monitor.h"
+#include "qmp-commands.h"


  #define TYPE_FILTER_BUFFER "filter-buffer"
@@ -269,6 +270,22 @@ void qemu_auto_add_filter_buffer(NetFilterDirection 
direction, Error **errp)
  g_free(queue);
  }

+static void netdev_del_filter_buffer(NetFilterState *nf, void *opaque,
+ Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER) &&
+nf->auto_add) {
+char *id = object_get_canonical_path_component(OBJECT(nf));
+
+qmp_object_del(id, errp);
+}
+}
+
+void qemu_auto_del_filter_buffer(Error **errp)
+{
+qemu_foreach_netfilter(netdev_del_filter_buffer, NULL, errp);
+}
+
  static void filter_buffer_init(Object *obj)
  {
  object_property_add(obj, "interval", "int",
diff --git a/net/filter.c b/net/filter.c
index 326f2b5..dcbcb80 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -117,6 +117,18 @@ static void netfilter_set_direction(Object *obj, int 
direction, Error **errp)
  nf->direction = direction;
  }

+static bool netfilter_get_auto_flag(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+return nf->auto_add;
+}
+
+static void netfilter_set_auto_flag(Object *obj, bool flag, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+nf->auto_add = flag;
+}
+


This chunk of code should be in previous patch.


  static void netfilter_init(Object *obj)
  {
  object_property_add_str(obj, "netdev",
@@ -126,6 +138,9 @@ static void netfilter_init(Object *obj)
   NetFilterDirection_lookup,
   netfilter_get_direction, netfilter_set_direction,
   NULL);
+object_property_add_bool(obj, "auto",
+ netfilter_get_auto_flag, netfilter_set_auto_flag,
+ NULL);
  }


Ditto.



  static void netfilter_complete(UserCreatable *uc, Error **errp)



--
Thanks,
Yang

Re: [Qemu-devel] Safety of killing qemu when it is doing an fstrim

2015-11-03 Thread Paolo Bonzini



On 03/11/2015 13:12, Richard W.M. Jones wrote:
> 
> I wrote a tool called virt-sparsify which runs fstrim on disks via
> qemu.  My colleague asked me a good question: Is this safe if qemu is
> killed (^C)?  Could it corrupt the guest?
> 
> Using 'virt-sparsify --inplace disk.img' is essentially equivalent to
> doing:
> 
>   qemu-kvm \
> -kernel  \
> -drive file=disk.img,discard=unmap,[virtio-scsi] \
> -drive file=appliance
> 
> And in the appliance doing:
> 
>   foreach fs in filesystems:
>   mount -o discard fs /sysroot
>   fstrim /sysroot
>   umount /sysroot
>   sync
>   poweroff
> 
> I think the answer is "safe", as long as the Linux kernel and qemu are
> written carefully, but it would be good to get an expert opinion.
> 
> It looks like fstrim just sends discard requests.  And mount/umount
> should be safe by the usual rules of journalling.

Yes, this is correct.

Paolo

Re: [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval

2015-11-03 Thread Yang Hongyang


Some commit message would be better.

On 2015年11月03日 19:56, zhanghailiang wrote:

Signed-off-by: zhanghailiang 
Cc: Jason Wang 


Reviewed-by: Yang Hongyang 


---
v10: new patch
---
  net/filter-buffer.c | 10 --
  1 file changed, 10 deletions(-)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 5f0ea70..05313de 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -104,16 +104,6 @@ static void filter_buffer_setup(NetFilterState *nf, Error 
**errp)
  {
  FilterBufferState *s = FILTER_BUFFER(nf);

-/*
- * We may want to accept zero interval when VM FT solutions like MC
- * or COLO use this filter to release packets on demand.
- */
-if (!s->interval) {
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
-   "a non-zero interval");
-return;
-}
-
  s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
  if (s->interval) {
  timer_init_us(>release_timer, QEMU_CLOCK_VIRTUAL,



--
Thanks,
Yang

Re: [Qemu-devel] [PATCH v3 9/9] kvm/x86: Hyper-V kvm exit

2015-11-03 Thread Paolo Bonzini



On 22/10/2015 18:10, Andrey Smetanin wrote:
> A new vcpu exit is introduced to notify the userspace of the
> changes in Hyper-V SynIC configuration triggered by guest writing to the
> corresponding MSRs.
> 
> Changes v3:
> * added KVM_EXIT_HYPERV types and structs notes into docs
> 
> Signed-off-by: Andrey Smetanin 
> Reviewed-by: Roman Kagan 
> Signed-off-by: Denis V. Lunev 
> CC: Vitaly Kuznetsov 
> CC: "K. Y. Srinivasan" 
> CC: Gleb Natapov 
> CC: Paolo Bonzini 
> CC: Roman Kagan 
> 
> ---
>  Documentation/virtual/kvm/api.txt | 22 ++
>  arch/x86/include/asm/kvm_host.h   |  1 +
>  arch/x86/kvm/hyperv.c | 17 +
>  arch/x86/kvm/x86.c|  6 ++
>  include/linux/kvm_host.h  |  1 +
>  include/uapi/linux/kvm.h  | 17 +
>  6 files changed, 64 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index 8710418..a6858eb 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3337,6 +3337,28 @@ the userspace IOAPIC should process the EOI and 
> retrigger the interrupt if
>  it is still asserted.  Vector is the LAPIC interrupt vector for which the
>  EOI was received.
>  
> + struct kvm_hyperv_exit {
> +#define KVM_EXIT_HYPERV_SYNIC  1
> + __u32 type;
> + union {
> + struct {
> + __u32 msr;
> + __u64 control;
> + __u64 evt_page;
> + __u64 msg_page;
> + } synic;
> + } u;
> + };
> + /* KVM_EXIT_HYPERV */
> +struct kvm_hyperv_exit hyperv;
> +Indicates that the VCPU exits into userspace to process some tasks
> +related to Hyper-V emulation.
> +Valid values for 'type' are:
> + KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
> +Hyper-V SynIC state change. Notification is used to remap SynIC
> +event/message pages and to enable/disable SynIC messages/events processing
> +in userspace.
> +
>   /* Fix the size of the union. */
>   char padding[256];
>   };
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8434f88..54c90d3 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -392,6 +392,7 @@ struct kvm_vcpu_hv {
>   u64 hv_vapic;
>   s64 runtime_offset;
>   struct kvm_vcpu_hv_synic synic;
> + struct kvm_hyperv_exit exit;
>  };
>  
>  struct kvm_vcpu_arch {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 8ff71f3..9443920 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -129,6 +129,20 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu 
> *vcpu, u32 sint)
>   srcu_read_unlock(>irq_srcu, idx);
>  }
>  
> +static void synic_exit(struct kvm_vcpu_hv_synic *synic, u32 msr)
> +{
> + struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
> + struct kvm_vcpu_hv *hv_vcpu = >arch.hyperv;
> +
> + hv_vcpu->exit.type = KVM_EXIT_HYPERV_SYNIC;
> + hv_vcpu->exit.u.synic.msr = msr;
> + hv_vcpu->exit.u.synic.control = synic->control;
> + hv_vcpu->exit.u.synic.evt_page = synic->evt_page;
> + hv_vcpu->exit.u.synic.msg_page = synic->msg_page;
> +
> + kvm_make_request(KVM_REQ_HV_EXIT, vcpu);
> +}
> +
>  static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
>u32 msr, u64 data, bool host)
>  {
> @@ -141,6 +155,7 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
>   switch (msr) {
>   case HV_X64_MSR_SCONTROL:
>   synic->control = data;
> + synic_exit(synic, msr);

Another note.  I am getting:

EAX= EBX= ECX= EDX=0663
ESI= EDI= EBP= ESP=
EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=90 90 90 90 eb c3 90 90 90 90 90 90 00 00 00 00 56 54 46 00 <90> 90
eb ac 90 90 90 90 90 90 90 90 90 90 90 90 00 00 00 00 00 00 00 00 00 00
00 00 00 00

if I run a patched QEMU but I *do not* enable the synthetic

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: Add GICC ACPI subtable for GICv3

2015-11-03 Thread Peter Maydell

On 29 October 2015 at 15:16, Shannon Zhao  wrote:
> When booting VM with GICv3, the kernel needs GICC ACPI subtable to
> initialize the CPUs, e.g. MPIDR information. This adds GICC ACPI
> subtable for GICv3, but set GICC base address only when gic_version == 2
> since it donesn't need GICC base address for GICv3.
>
> Signed-off-by: Shannon Zhao 
> ---
>  hw/arm/virt-acpi-build.c | 30 --
>  1 file changed, 16 insertions(+), 14 deletions(-)




Applied to target-arm.next, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] ARM: ACPI: Fix MPIDR value in ACPI table

2015-11-03 Thread Peter Maydell

On 31 October 2015 at 09:50, Shannon Zhao  wrote:
> From: Shannon Zhao 
>
> Use mp_affinity of ARMCPU as the CPU MPIDR instead of the CPU index.
>
> Signed-off-by: Shannon Zhao 
> ---
> This patch is based on below patch.
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06919.html
>



Applied to target-arm.next, thanks.

-- PMM

Re: [Qemu-devel] [PATCH 2/2] io/buffer: avoid memmove at each qio_buffer_advance

2015-11-03 Thread Gerd Hoffmann

> diff --git a/include/io/buffer.h b/include/io/buffer.h
> index f63869e..43688cc 100644
> --- a/include/io/buffer.h
> +++ b/include/io/buffer.h
> @@ -39,6 +39,8 @@ struct QIOBuffer {
>  size_t offset;
>  uint64_t avg_size;
>  uint8_t *buffer;
> +size_t base_offs;
> +uint8_t *base_ptr;

Why a separate base_ptr?

While being at it I'd much prefer to replace offset with start & end.
The buffer content is buffer[start] ... buffer[end-1] then.

We can allow the buffer to wrap around, i.e. end < start.  Buffer
content is buf[start] ... buffer[size-1] and buffer[0] .. buffer[end-1]
then.  Makes the buffer management a bit more complicated, but we never
have to memmove then (except when changing buffer size) and the
WASTED_SIZE logic isn't needed too ...

cheers,
  Gerd

[Qemu-devel] [PATCH v11 08/12] Add new block driver interfaces to control block replication

2015-11-03 Thread Wen Congyang

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Cc: Luiz Capitulino 
Cc: Michael Roth 
Reviewed-by: Paolo Bonzini 
---
 block.c   | 43 +++
 include/block/block.h |  5 +
 include/block/block_int.h | 14 ++
 qapi/block-core.json  | 13 +
 4 files changed, 75 insertions(+)

diff --git a/block.c b/block.c
index 9a1c20e..04b928c 100644
--- a/block.c
+++ b/block.c
@@ -4165,3 +4165,46 @@ void bdrv_del_child(BlockDriverState *parent_bs, 
BlockDriverState *child_bs,
 
 parent_bs->drv->bdrv_del_child(parent_bs, child_bs, errp);
 }
+
+void bdrv_start_replication(BlockDriverState *bs, ReplicationMode mode,
+Error **errp)
+{
+BlockDriver *drv = bs->drv;
+
+if (drv && drv->bdrv_start_replication) {
+drv->bdrv_start_replication(bs, mode, errp);
+} else if (bs->file) {
+bdrv_start_replication(bs->file->bs, mode, errp);
+} else {
+error_setg(errp, "The BDS %s doesn't support starting block"
+   " replication", bs->filename);
+}
+}
+
+void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp)
+{
+BlockDriver *drv = bs->drv;
+
+if (drv && drv->bdrv_do_checkpoint) {
+drv->bdrv_do_checkpoint(bs, errp);
+} else if (bs->file) {
+bdrv_do_checkpoint(bs->file->bs, errp);
+} else {
+error_setg(errp, "The BDS %s doesn't support block checkpoint",
+   bs->filename);
+}
+}
+
+void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp)
+{
+BlockDriver *drv = bs->drv;
+
+if (drv && drv->bdrv_stop_replication) {
+drv->bdrv_stop_replication(bs, failover, errp);
+} else if (bs->file) {
+bdrv_stop_replication(bs->file->bs, failover, errp);
+} else {
+error_setg(errp, "The BDS %s doesn't support stopping block"
+   " replication", bs->filename);
+}
+}
diff --git a/include/block/block.h b/include/block/block.h
index cccda1d..288e14e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -638,4 +638,9 @@ void bdrv_add_child(BlockDriverState *parent, 
BlockDriverState *child,
 void bdrv_del_child(BlockDriverState *parent, BlockDriverState *child,
 Error **errp);
 
+void bdrv_start_replication(BlockDriverState *bs, ReplicationMode mode,
+Error **errp);
+void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp);
+void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3285739..eec2591 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -293,6 +293,20 @@ struct BlockDriver {
 void (*bdrv_del_child)(BlockDriverState *parent, BlockDriverState *child,
Error **errp);
 
+void (*bdrv_start_replication)(BlockDriverState *bs, ReplicationMode mode,
+   Error **errp);
+/* Drop Disk buffer when doing checkpoint. */
+void (*bdrv_do_checkpoint)(BlockDriverState *bs, Error **errp);
+/*
+ * After failover, we should flush Disk buffer into secondary disk
+ * and stop block replication.
+ *
+ * If the guest is shutdown, we should drop Disk buffer and stop
+ * block representation.
+ */
+void (*bdrv_stop_replication)(BlockDriverState *bs, bool failover,
+  Error **errp);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 86b62e4..0539dfa 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1797,6 +1797,19 @@
 '*read-pattern': 'QuorumReadPattern' } }
 
 ##
+# @ReplicationMode
+#
+# An enumeration of replication modes.
+#
+# @primary: Primary mode, the vm's state will be sent to secondary QEMU.
+#
+# @secondary: Secondary mode, receive the vm's state from primary QEMU.
+#
+# Since: 2.5
+##
+{ 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.
-- 
2.4.3

1 2 3 4 >

1 - 100 of 396 matches

Mail list logo