Re: [Qemu-devel] [PATCH 4/5] qemu-io: prompt for encryption keys when required

2015-05-12 Thread Eric Blake
On 05/12/2015 10:09 AM, Daniel P. Berrange wrote:
> The qemu-io tool does not check if the image is encrypted so
> historically would silently corrupt the sectors by writing
> plain text data into them instead of cipher text. The earlier
> commit turns this mistake into a fatal abort, so check for
> encryption and prompt for key when required.

Doesn't that mean that 'git bisect' gives a crashing qemu-io for 3
patches?  Should this be rearranged so that 1/5 comes after this to
avoid triggering the abort?

> 
> This enables us to add unit tests to ensure we don't break
> the ability of qemu-img to convert existing encrypted qcow2
> files into a non-encrypted format.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  qemu-io.c | 21 +
>  1 file changed, 21 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH 5/5] tests: add test case for encrypted qcow2 read/write

2015-05-12 Thread Eric Blake
On 05/12/2015 10:09 AM, Daniel P. Berrange wrote:
> Add a simple test case for qemu-iotests that covers read/write
> with encrypted qcow2 files.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  tests/qemu-iotests/131 | 69 ++
>  tests/qemu-iotests/131.out | 46 +++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 116 insertions(+)
>  create mode 100755 tests/qemu-iotests/131
>  create mode 100644 tests/qemu-iotests/131.out
> 
> diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
> new file mode 100755
> index 000..f44b0a0
> --- /dev/null
> +++ b/tests/qemu-iotests/131
> @@ -0,0 +1,69 @@
> +#!/bin/bash
> +#
> +# Test encrypted read/write using plain bdrv_read/bdrv_write
> +#
> +# Copyright (C) 2009 Red Hat, Inc.

Copy-and-paste strikes again; welcome to 2015.  With that fixed,
Reviewed-by: Eric Blake 



Re: [Qemu-devel] [PATCH 3/5] util: allow \n to terminate password input

2015-05-12 Thread Eric Blake
On 05/12/2015 10:09 AM, Daniel P. Berrange wrote:
> The qemu_read_password() method looks for \r to terminate the
> reading of a password. This is what will be seen when
> reading the password from a TTY. When scripting though, it is
> useful to be able to send the password via a pipe, in which
> case we must look for \n to terminate password input.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  util/oslib-posix.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 
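
For readers following along, a minimal sketch of the idea in this patch:
stop at either \r (TTY) or \n (pipe). This is not the QEMU code; the
helper name copy_password() and its buffer-based interface are made up
for illustration, and it scans bytes already in memory instead of
reading from a file descriptor.

```c
#include <stddef.h>

/* Hypothetical helper, not QEMU's qemu_read_password(): copy bytes from
 * src into buf until the first '\r' or '\n' (or the end of src), always
 * NUL-terminating.  Returns the number of password bytes stored. */
static size_t copy_password(const char *src, size_t src_len,
                            char *buf, size_t buf_size)
{
    size_t n = 0;

    if (buf_size == 0) {
        return 0;
    }
    for (size_t i = 0; i < src_len; i++) {
        if (src[i] == '\r' || src[i] == '\n') {
            break;              /* either terminator ends the password */
        }
        if (n + 1 < buf_size) { /* keep room for the trailing NUL */
            buf[n++] = src[i];
        }
    }
    buf[n] = '\0';
    return n;
}
```

With this shape, "secret\n" arriving from a pipe and "secret\r" typed on
a TTY both yield the same six-byte password.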



Re: [Qemu-devel] [PATCH 2/5] util: move read_password method out of qemu-img into osdep/oslib

2015-05-12 Thread Eric Blake
On 05/12/2015 10:09 AM, Daniel P. Berrange wrote:
> The qemu-img.c file has a read_password() method impl that is
> used to prompt for passwords on the console, with impls for
> POSIX and Windows. This will be needed by qemu-io.c too, so
> move it into the QEMU osdep/oslib files where it can be shared
> without code duplication
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  include/qemu/osdep.h |  2 ++
>  qemu-img.c   | 93 +---
>  util/oslib-posix.c   | 66 +
>  util/oslib-win32.c   | 24 ++
>  4 files changed, 93 insertions(+), 92 deletions(-)
> 
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index b3300cc..3247364 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -259,4 +259,6 @@ void qemu_set_tty_echo(int fd, bool echo);
>  
>  void os_mem_prealloc(int fd, char *area, size_t sz);
>  
> +int qemu_read_password(char *buf, int buf_size);

Should we fix it to use size_t buf_size while at it? (or as a followup,
to keep this one limited to code motion)

Reviewed-by: Eric Blake 



Re: [Qemu-devel] when does a target frontend need to use gen_io_start()/gen_io_end() ?

2015-05-12 Thread Paolo Bonzini


On 12/05/2015 17:32, Peter Maydell wrote:
> In order for -icount to work, it's important for the target
> translate.c code to correctly bracket any generated code which
> can "do I/O" with gen_io_start()/gen_io_end() calls. But
> does anybody know exactly what the criteria are here for this?
> It would be nice if we could document this in a comment in
> gen_icount.h -- I'm happy to write one up if somebody will just
> tell me what the right answer is :-)

It's any instruction that can cause an icount read, typically through
QEMU_CLOCK_VIRTUAL or cpu_get_ticks().

Paolo



Re: [Qemu-devel] [PATCH 1/5] qcow2/qcow: protect against uninitialized encryption key

2015-05-12 Thread Eric Blake
On 05/12/2015 10:09 AM, Daniel P. Berrange wrote:
> When a qcow[2] file is opened, if the header reports an
> encryption method, this is used to set the 'crypt_method_header'
> field on the BDRVQcow[2]State struct, and the 'encrypted' flag
> in the BDRVState struct.
> 
> When doing I/O operations, the 'crypt_method' field on the
> BDRVQcow[2]State struct is checked to determine if encryption
> needs to be applied.
> 
> The crypt_method_header value is copied into crypt_method when
> the bdrv_set_key() method is called.
> 
> The QEMU code which opens a block device is expected to always
> do a check
> 
>if (bdrv_is_encrypted(bs)) {
>bdrv_set_key(bs, key...);
>}
> 
> If code forgets todo this, then 'crypt_method' is never set

s/todo/to do/

> and so when I/O is performed, QEMU writes plain text data
> into a sector which is expected to contain cipher text, or
> when reading, will return cipher text instead of plain
> text.
> 
> Change the qcow[2] code to consult bs->encrypted when deciding
> whether encryption is required, and assert(s->crypt_method)
> to protect against cases where the caller forgets to set the
> encryption key.
> 
> Also put an assert in the set_key methods to protect against
> the case where the caller sets an encryption key on a block
> device that does not have encryption
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/qcow.c  | 10 +++---
>  block/qcow2-cluster.c |  3 ++-
>  block/qcow2.c | 18 --
>  3 files changed, 21 insertions(+), 10 deletions(-)
> 

Reviewed-by: Eric Blake 
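
The check described in the commit message can be sketched outside QEMU
roughly as follows. Everything here (struct toy_image, toy_open(),
toy_set_key(), toy_io_needs_crypto(), the CRYPT_* values) is a made-up
stand-in, not the BDRVQcow[2]State code; it only illustrates deciding on
the 'encrypted' flag and asserting that a key was installed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the block-layer state; not the QEMU structs. */
enum { CRYPT_NONE = 0, CRYPT_AES = 1 };

struct toy_image {
    bool encrypted;            /* set at open time from the header */
    int crypt_method_header;   /* method recorded in the image header */
    int crypt_method;          /* stays CRYPT_NONE until a key is set */
};

static void toy_open(struct toy_image *img, int header_method)
{
    img->crypt_method_header = header_method;
    img->encrypted = (header_method != CRYPT_NONE);
    img->crypt_method = CRYPT_NONE;
}

static void toy_set_key(struct toy_image *img)
{
    assert(img->encrypted);    /* setting a key on a plain image is a bug */
    img->crypt_method = img->crypt_method_header;
}

/* Returns true when the I/O path must encrypt/decrypt the sector. */
static bool toy_io_needs_crypto(const struct toy_image *img)
{
    if (img->encrypted) {
        /* A caller that forgot toy_set_key() aborts here instead of
         * silently reading or writing the wrong kind of data. */
        assert(img->crypt_method != CRYPT_NONE);
        return true;
    }
    return false;
}
```

The point of the design is that the decision is driven by the flag set at
open time, so a missing key becomes a loud failure, not silent corruption.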



[Qemu-devel] [PATCH v2 15/17] target-alpha: Suppress underflow from CVTTQ if DNZ

2015-05-12 Thread Richard Henderson
I.e. respect flush_inputs_to_zero.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index ea1f2e2..fa4401d 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -452,7 +452,7 @@ static uint64_t do_cvttq(CPUAlphaState *env, uint64_t a, int roundmode)
 frac = a & 0xfull;
 
 if (exp == 0) {
-if (unlikely(frac != 0)) {
+if (unlikely(frac != 0) && !env->fp_status.flush_inputs_to_zero) {
 goto do_underflow;
 }
 } else if (exp == 0x7ff) {
-- 
2.1.0




[Qemu-devel] [PATCH v2 13/17] target-alpha: Disallow literal operand to 1C.30 to 1C.37

2015-05-12 Thread Richard Henderson
Before 64f45e49 we used to have literal checks for 4 of these 8 opcodes.
Confirmed that real hardware doesn't allow them.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/translate.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 953d1ef..f0556b0 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -1342,6 +1342,13 @@ static ExitStatus gen_mtpr(DisasContext *ctx, TCGv vb, int regno)
 }
 #endif /* !USER_ONLY*/
 
+#define REQUIRE_NO_LIT  \
+do {\
+if (real_islit) {   \
+goto invalid_opc;   \
+}   \
+} while (0)
+
 #define REQUIRE_TB_FLAG(FLAG)   \
 do {\
 if ((ctx->tb->flags & (FLAG)) == 0) {   \
@@ -1361,7 +1368,7 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 int32_t disp21, disp16, disp12 __attribute__((unused));
 uint16_t fn11;
 uint8_t opc, ra, rb, rc, fpfn, fn7, lit;
-bool islit;
+bool islit, real_islit;
 TCGv va, vb, vc, tmp, tmp2;
 TCGv_i32 t32;
 ExitStatus ret;
@@ -1371,7 +1378,7 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 ra = extract32(insn, 21, 5);
 rb = extract32(insn, 16, 5);
 rc = extract32(insn, 0, 5);
-islit = extract32(insn, 12, 1);
+real_islit = islit = extract32(insn, 12, 1);
 lit = extract32(insn, 13, 8);
 
 disp21 = sextract32(insn, 0, 21);
@@ -2466,11 +2473,13 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 /* CTPOP */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_CIX);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_ctpop(vc, vb);
 break;
 case 0x31:
 /* PERR */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_MVI);
+REQUIRE_NO_LIT;
 va = load_gpr(ctx, ra);
 gen_helper_perr(vc, va, vb);
 break;
@@ -2478,36 +2487,42 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 /* CTLZ */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_CIX);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_ctlz(vc, vb);
 break;
 case 0x33:
 /* CTTZ */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_CIX);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_cttz(vc, vb);
 break;
 case 0x34:
 /* UNPKBW */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_MVI);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_unpkbw(vc, vb);
 break;
 case 0x35:
 /* UNPKBL */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_MVI);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_unpkbl(vc, vb);
 break;
 case 0x36:
 /* PKWB */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_MVI);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_pkwb(vc, vb);
 break;
 case 0x37:
 /* PKLB */
 REQUIRE_TB_FLAG(TB_FLAGS_AMASK_MVI);
 REQUIRE_REG_31(ra);
+REQUIRE_NO_LIT;
 gen_helper_pklb(vc, vb);
 break;
 case 0x38:
-- 
2.1.0
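
A standalone sketch of the decode rule this patch enforces, for readers
without the tree at hand. The literal flag at bit 12 matches the
extract32(insn, 12, 1) call in the diff; the opcode and function field
positions (bits 26..31 and 5..11) are taken from the Alpha operate
instruction format and are not shown in the diff. field32() and
insn_1c_3x_is_valid() are illustrative names, not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for a bitfield extractor: `length` bits starting at
 * bit `start` (0 < length <= 32, start + length <= 32). */
static uint32_t field32(uint32_t value, int start, int length)
{
    return (value >> start) & (~(uint32_t)0 >> (32 - length));
}

/* Hypothetical decode check: opcodes 1C.30 to 1C.37 accept only a
 * register in the Rb slot, so a set literal flag (bit 12) is invalid. */
static bool insn_1c_3x_is_valid(uint32_t insn)
{
    uint32_t opc = field32(insn, 26, 6);
    uint32_t fn7 = field32(insn, 5, 7);
    bool islit = field32(insn, 12, 1) != 0;

    if (opc != 0x1c || fn7 < 0x30 || fn7 > 0x37) {
        return true;   /* outside the range this sketch covers */
    }
    return !islit;
}
```

The patch keeps a separate real_islit because islit may be rewritten
later in translate_one(); the sketch sidesteps that by reading the bit
straight from the instruction word.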




[Qemu-devel] [PATCH v2 10/17] target-alpha: Fix cvttq vs inf

2015-05-12 Thread Richard Henderson
We should raise INV for infinities as well, not OVR+INE.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index 9449c57..db523fb 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -444,7 +444,7 @@ static uint64_t do_cvttq(CPUAlphaState *env, uint64_t a, int roundmode)
 goto do_underflow;
 }
 } else if (exp == 0x7ff) {
-exc = (frac ? FPCR_INV : FPCR_IOV | FPCR_INE);
+exc = FPCR_INV;
 } else {
 /* Restore implicit bit.  */
 frac |= 0x10ull;
-- 
2.1.0




[Qemu-devel] [PATCH v2 11/17] target-alpha: Fix integer overflow checking insns

2015-05-12 Thread Richard Henderson
We need to write the result to the destination register before
raising any exception.  Thus inline the code for each insn, and
check for any exception after we're done.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
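
(Illustration only, not part of the patch.) The "compute the result
first, compare afterwards" approach can be sketched in plain C for the
32-bit add case. addl_v() is a made-up name; the sketch assumes the
usual wrapping behaviour when converting an out-of-range value to
int32_t, which is implementation-defined in C but holds for GCC and
Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of ADDL/V-style semantics: always produce the 32-bit result,
 * sign-extended to 64 bits, and report overflow separately so the
 * caller can write the destination before raising anything. */
static int64_t addl_v(int64_t a, int64_t b, bool *overflow)
{
    /* Sign-extend both 32-bit inputs and add at 64-bit width, where
     * the sum of two 32-bit values cannot itself overflow. */
    int64_t wide = (int64_t)(int32_t)a + (int64_t)(int32_t)b;
    /* The architectural result is the low 32 bits, sign-extended. */
    int64_t result = (int64_t)(int32_t)wide;

    /* Overflow happened exactly when truncation changed the value. */
    *overflow = (result != wide);
    return result;
}
```

This mirrors the two operands handed to check_overflow in the diff: the
truncated value and the wide value, with a trap only if they differ.
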
 target-alpha/helper.h |  7 +-
 target-alpha/int_helper.c | 59 ++
 target-alpha/translate.c  | 60 +--
 3 files changed, 56 insertions(+), 70 deletions(-)

diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index 9e7b771..5b1a5d9 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -1,12 +1,7 @@
 DEF_HELPER_3(excp, noreturn, env, int, int)
 DEF_HELPER_FLAGS_1(load_pcc, TCG_CALL_NO_RWG_SE, i64, env)
 
-DEF_HELPER_FLAGS_3(addqv, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(addlv, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(subqv, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(sublv, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(mullv, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(mulqv, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(check_overflow, TCG_CALL_NO_WG, void, env, i64, i64)
 
 DEF_HELPER_FLAGS_1(ctpop, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_1(ctlz, TCG_CALL_NO_RWG_SE, i64, i64)
diff --git a/target-alpha/int_helper.c b/target-alpha/int_helper.c
index 7a205eb..8e4537f 100644
--- a/target-alpha/int_helper.c
+++ b/target-alpha/int_helper.c
@@ -249,64 +249,9 @@ uint64_t helper_unpkbw(uint64_t op1)
 | ((op1 & 0xff00) << 24));
 }
 
-uint64_t helper_addqv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
+void helper_check_overflow(CPUAlphaState *env, uint64_t op1, uint64_t op2)
 {
-uint64_t tmp = op1;
-op1 += op2;
-if (unlikely((tmp ^ op2 ^ (-1ULL)) & (tmp ^ op1) & (1ULL << 63))) {
+if (unlikely(op1 != op2)) {
 arith_excp(env, GETPC(), EXC_M_IOV, 0);
 }
-return op1;
-}
-
-uint64_t helper_addlv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
-{
-uint64_t tmp = op1;
-op1 = (uint32_t)(op1 + op2);
-if (unlikely((tmp ^ op2 ^ (-1UL)) & (tmp ^ op1) & (1UL << 31))) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
-}
-return op1;
-}
-
-uint64_t helper_subqv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
-{
-uint64_t res;
-res = op1 - op2;
-if (unlikely((op1 ^ op2) & (res ^ op1) & (1ULL << 63))) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
-}
-return res;
-}
-
-uint64_t helper_sublv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
-{
-uint32_t res;
-res = op1 - op2;
-if (unlikely((op1 ^ op2) & (res ^ op1) & (1UL << 31))) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
-}
-return res;
-}
-
-uint64_t helper_mullv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
-{
-int64_t res = (int64_t)op1 * (int64_t)op2;
-
-if (unlikely((int32_t)res != res)) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
-}
-return (int64_t)((int32_t)res);
-}
-
-uint64_t helper_mulqv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
-{
-uint64_t tl, th;
-
-muls64(&tl, &th, op1, op2);
-/* If th != 0 && th != -1, then we had an overflow */
-if (unlikely((th + 1) > 1)) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
-}
-return tl;
 }
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 7868cc4..74f5d07 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -1362,7 +1362,7 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 uint16_t fn11;
 uint8_t opc, ra, rb, rc, fpfn, fn7, lit;
 bool islit;
-TCGv va, vb, vc, tmp;
+TCGv va, vb, vc, tmp, tmp2;
 TCGv_i32 t32;
 ExitStatus ret;
 
@@ -1574,11 +1574,23 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 break;
 case 0x40:
 /* ADDL/V */
-gen_helper_addlv(vc, cpu_env, va, vb);
+tmp = tcg_temp_new();
+tcg_gen_ext32s_i64(tmp, va);
+tcg_gen_ext32s_i64(vc, vb);
+tcg_gen_add_i64(tmp, tmp, vc);
+tcg_gen_ext32s_i64(vc, tmp);
+gen_helper_check_overflow(cpu_env, vc, tmp);
+tcg_temp_free(tmp);
 break;
 case 0x49:
 /* SUBL/V */
-gen_helper_sublv(vc, cpu_env, va, vb);
+tmp = tcg_temp_new();
+tcg_gen_ext32s_i64(tmp, va);
+tcg_gen_ext32s_i64(vc, vb);
+tcg_gen_sub_i64(tmp, tmp, vc);
+tcg_gen_ext32s_i64(vc, tmp);
+gen_helper_check_overflow(cpu_env, vc, tmp);
+tcg_temp_free(tmp);
 break;
 case 0x4D:
 /* CMPLT */
@@ -1586,11 +1598,33 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 break;
 case 0x60:
 /* ADDQ/V */
-gen_helper_addqv(vc, cpu_env, va, vb);
+tmp = tcg_temp_new();
+tmp2 = tcg_t

Re: [Qemu-devel] [PATCH v2] qmp: Add qom_path field to query-cpus command

2015-05-12 Thread Markus Armbruster
Eduardo Habkost  writes:

> On Tue, May 12, 2015 at 05:38:37PM +0200, Markus Armbruster wrote:
> [...]
>> > @@ -699,8 +701,9 @@
>> >  #data is sent to the client, the guest may no longer be halted.
>> >  ##
>> >  { 'struct': 'CpuInfo',
>> > -  'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', '*pc': 'int',
>> > - '*nip': 'int', '*npc': 'int', '*PC': 'int', 'thread_id': 'int'} }
>> > + 'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', 'qom_path': 'str',
>> 
>> Long line.
>
> It has exactly 80 characters.

Several characters too wide for my taste.

I just realized checkpatch is fine with 80.  I'll revert my line wrap if
you feel strongly about it.



[Qemu-devel] [PATCH v2 09/17] target-alpha: Fix cvttq vs large integers

2015-05-12 Thread Richard Henderson
The range +- 2**63 - 2**64 was returning the wrong truncated
result.  We also incorrectly signaled overflow for -2**63.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index 132b5a2..9449c57 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -453,12 +453,12 @@ static uint64_t do_cvttq(CPUAlphaState *env, uint64_t a, int roundmode)
 if (shift >= 0) {
 /* In this case the number is so large that we must shift
the fraction left.  There is no rounding to do.  */
-exc = FPCR_IOV | FPCR_INE;
-if (shift < 63) {
+if (shift < 64) {
 ret = frac << shift;
-if ((ret >> shift) == frac) {
-exc = 0;
-}
+}
+/* Check for overflow.  Note the special case of -0x1p63.  */
+if (shift >= 11 && a != 0xC3E0ull) {
+exc = FPCR_IOV | FPCR_INE;
 }
 } else {
 uint64_t round;
-- 
2.1.0
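
As background for the "-0x1p63" special case mentioned in the comment: a
small sketch that shows what that value looks like as an IEEE 754
binary64 bit pattern. It assumes the host's double is binary64.
double_bits() and biased_exponent() are illustrative helpers. The
sketch does not attempt to restore the comparison constant in the hunk
above, which appears shortened in this archive copy.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of `d` (assumes double is binary64). */
static uint64_t double_bits(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof(bits));
    return bits;
}

/* Biased exponent field (11 bits) of a binary64 bit pattern. */
static unsigned biased_exponent(uint64_t bits)
{
    return (unsigned)((bits >> 52) & 0x7ff);
}
```

-0x1p63 is the one value of that magnitude that still fits in a signed
64-bit integer (it equals INT64_MIN), which is why it must not be
reported as an overflow.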




[Qemu-devel] [PATCH v2 16/17] target-alpha: Raise IOV from CVTQL

2015-05-12 Thread Richard Henderson
Even if an exception isn't taken, the status flags need updating
and the result should be written to the destination.  Move the body
of cvtql out of line, since we now always need a call.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c |  8 ++--
 target-alpha/helper.h |  3 ++-
 target-alpha/translate.c  | 34 +-
 3 files changed, 13 insertions(+), 32 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index fa4401d..b091aa8 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -539,9 +539,13 @@ uint64_t helper_cvtqt(CPUAlphaState *env, uint64_t a)
 return float64_to_t(fr);
 }
 
-void helper_cvtql_v_input(CPUAlphaState *env, uint64_t val)
+uint64_t helper_cvtql(CPUAlphaState *env, uint64_t val)
 {
+uint32_t exc = 0;
 if (val != (int32_t)val) {
-arith_excp(env, GETPC(), EXC_M_IOV, 0);
+exc = FPCR_IOV | FPCR_INE;
 }
+env->error_code = exc;
+
+return ((val & 0xc000) << 32) | ((val & 0x3fff) << 29);
 }
diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index 780b0dc..d221f0d 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -79,6 +79,8 @@ DEF_HELPER_FLAGS_2(cvtqg, TCG_CALL_NO_RWG, i64, env, i64)
 DEF_HELPER_FLAGS_2(cvttq, TCG_CALL_NO_RWG, i64, env, i64)
 DEF_HELPER_FLAGS_2(cvttq_c, TCG_CALL_NO_RWG, i64, env, i64)
 
+DEF_HELPER_FLAGS_2(cvtql, TCG_CALL_NO_RWG, i64, env, i64)
+
 DEF_HELPER_FLAGS_2(setroundmode, TCG_CALL_NO_RWG, void, env, i32)
 DEF_HELPER_FLAGS_2(setflushzero, TCG_CALL_NO_RWG, void, env, i32)
 DEF_HELPER_FLAGS_3(fp_exc_raise, TCG_CALL_NO_WG, void, env, i32, i32)
@@ -87,7 +89,6 @@ DEF_HELPER_FLAGS_3(fp_exc_raise_s, TCG_CALL_NO_WG, void, env, i32, i32)
 DEF_HELPER_FLAGS_2(ieee_input, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_2(ieee_input_cmp, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_2(ieee_input_s, TCG_CALL_NO_WG, void, env, i64)
-DEF_HELPER_FLAGS_2(cvtql_v_input, TCG_CALL_NO_WG, void, env, i64)
 
 #if !defined (CONFIG_USER_ONLY)
 DEF_HELPER_2(hw_ret, void, env, i64)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 4c441a9..e9927b5 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -720,19 +720,6 @@ static void gen_cvtlq(TCGv vc, TCGv vb)
 tcg_temp_free(tmp);
 }
 
-static void gen_cvtql(TCGv vc, TCGv vb)
-{
-TCGv tmp = tcg_temp_new();
-
-tcg_gen_andi_i64(tmp, vb, (int32_t)0xc000);
-tcg_gen_andi_i64(vc, vb, 0x3FFF);
-tcg_gen_shli_i64(tmp, tmp, 32);
-tcg_gen_shli_i64(vc, vc, 29);
-tcg_gen_or_i64(vc, vc, tmp);
-
-tcg_temp_free(tmp);
-}
-
 static void gen_ieee_arith2(DisasContext *ctx,
 void (*helper)(TCGv, TCGv_ptr, TCGv),
 int rb, int rc, int fn11)
@@ -2254,25 +2241,14 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 /* FCMOVGT */
 gen_fcmov(ctx, TCG_COND_GT, ra, rb, rc);
 break;
-case 0x030:
-/* CVTQL */
-REQUIRE_REG_31(ra);
-vc = dest_fpr(ctx, rc);
-vb = load_fpr(ctx, rb);
-gen_cvtql(vc, vb);
-break;
-case 0x130:
-/* CVTQL/V */
-case 0x530:
-/* CVTQL/SV */
+case 0x030: /* CVTQL */
+case 0x130: /* CVTQL/V */
+case 0x530: /* CVTQL/SV */
 REQUIRE_REG_31(ra);
-/* ??? I'm pretty sure there's nothing that /sv needs to do that
-   /v doesn't do.  The only thing I can think is that /sv is a
-   valid instruction merely for completeness in the ISA.  */
 vc = dest_fpr(ctx, rc);
 vb = load_fpr(ctx, rb);
-gen_helper_cvtql_v_input(cpu_env, vb);
-gen_cvtql(vc, vb);
+gen_helper_cvtql(vc, cpu_env, vb);
+gen_fp_exc_raise(rc, fn11);
 break;
 default:
 goto invalid_opc;
-- 
2.1.0
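
A standalone sketch of what the out-of-line helper computes, for
reference. cvtql_pack() is a made-up name. The mask constants are
written out in full on the assumption that the result uses the Alpha
longword-in-FP-register layout (bits 31:30 of the integer go to bits
63:62, bits 29:0 go to bits 58:29); the constants in the diff above
appear shortened in this archive copy, so treat these as a
reconstruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack the low 32 bits of `val` into the FP-register longword layout
 * and report whether `val` did not fit in a signed 32-bit integer.
 * The packed result is produced in either case. */
static uint64_t cvtql_pack(uint64_t val, bool *overflow)
{
    /* Same test as the patch: truncate to 32 bits, sign-extend back,
     * and see whether the value survived. */
    *overflow = (val != (uint64_t)(int64_t)(int32_t)val);

    return ((val & 0xc0000000ull) << 32) | ((val & 0x3fffffffull) << 29);
}
```

Returning the packed value regardless of overflow matches the commit
message: the destination is written and the flags updated even when no
trap is taken.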




[Qemu-devel] [PATCH v2 07/17] target-alpha: Set EXC_M_SWC for exceptions from /S insns

2015-05-12 Thread Richard Henderson
Previously forgotten, the kernel needs the software completion bit to
know that it needs to emulate software completion qualified insns.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index 6e84fd3..914c1d5 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -55,10 +55,8 @@ static uint32_t soft_to_fpcr_exc(CPUAlphaState *env)
 }
 
 static void fp_exc_raise1(CPUAlphaState *env, uintptr_t retaddr,
-  uint32_t exc, uint32_t regno)
+  uint32_t exc, uint32_t regno, uint32_t hw_exc)
 {
-uint32_t hw_exc = 0;
-
 hw_exc |= CONVERT_BIT(exc, FPCR_INV, EXC_M_INV);
 hw_exc |= CONVERT_BIT(exc, FPCR_DZE, EXC_M_DZE);
 hw_exc |= CONVERT_BIT(exc, FPCR_OVF, EXC_M_FOV);
@@ -79,7 +77,7 @@ void helper_fp_exc_raise(CPUAlphaState *env, uint32_t ignore, uint32_t regno)
 env->fpcr |= exc;
 exc &= ~ignore;
 if (exc) {
-fp_exc_raise1(env, GETPC(), exc, regno);
+fp_exc_raise1(env, GETPC(), exc, regno, 0);
 }
 }
 }
@@ -93,7 +91,7 @@ void helper_fp_exc_raise_s(CPUAlphaState *env, uint32_t ignore, uint32_t regno)
 exc &= ~ignore;
 if (exc) {
 exc &= env->fpcr_exc_enable;
-fp_exc_raise1(env, GETPC(), exc, regno);
+fp_exc_raise1(env, GETPC(), exc, regno, EXC_M_SWC);
 }
 }
 }
-- 
2.1.0




[Qemu-devel] [PATCH v2 17/17] target-alpha: Rewrite helper_zapnot

2015-05-12 Thread Richard Henderson
The extract signed single bitfield produces significantly
smaller code on x86_64.

Signed-off-by: Richard Henderson 
---
 target-alpha/int_helper.c | 30 --
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/target-alpha/int_helper.c b/target-alpha/int_helper.c
index 8e4537f..c023fa1 100644
--- a/target-alpha/int_helper.c
+++ b/target-alpha/int_helper.c
@@ -37,31 +37,25 @@ uint64_t helper_cttz(uint64_t arg)
 return ctz64(arg);
 }
 
-static inline uint64_t byte_zap(uint64_t op, uint8_t mskb)
+uint64_t helper_zapnot(uint64_t val, uint64_t mskb)
 {
 uint64_t mask;
 
-mask = 0;
-mask |= ((mskb >> 0) & 1) * 0x00FFULL;
-mask |= ((mskb >> 1) & 1) * 0xFF00ULL;
-mask |= ((mskb >> 2) & 1) * 0x00FFULL;
-mask |= ((mskb >> 3) & 1) * 0xFF00ULL;
-mask |= ((mskb >> 4) & 1) * 0x00FFULL;
-mask |= ((mskb >> 5) & 1) * 0xFF00ULL;
-mask |= ((mskb >> 6) & 1) * 0x00FFULL;
-mask |= ((mskb >> 7) & 1) * 0xFF00ULL;
-
-return op & ~mask;
-}
+mask  = sextract64(mskb, 0, 1) & 0x00fful;
+mask |= sextract64(mskb, 1, 1) & 0xff00ul;
+mask |= sextract64(mskb, 2, 1) & 0x00fful;
+mask |= sextract64(mskb, 3, 1) & 0xff00ul;
+mask |= sextract64(mskb, 4, 1) & 0x00fful;
+mask |= sextract64(mskb, 5, 1) & 0xff00ul;
+mask |= sextract64(mskb, 6, 1) & 0x00fful;
+mask |= sextract64(mskb, 7, 1) & 0xff00ul;
 
-uint64_t helper_zap(uint64_t val, uint64_t mask)
-{
-return byte_zap(val, mask);
+return val & mask;
 }
 
-uint64_t helper_zapnot(uint64_t val, uint64_t mask)
+uint64_t helper_zap(uint64_t val, uint64_t mask)
 {
-return byte_zap(val, ~mask);
+return helper_zapnot(val, ~mask);
 }
 
 uint64_t helper_cmpbge(uint64_t op1, uint64_t op2)
-- 
2.1.0
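
A self-contained sketch of the technique, for anyone who wants to try it
outside the tree. bit_to_mask() stands in for a one-bit signed extract
(it yields all-ones or zero); it is not QEMU's sextract64(). The
per-byte masks are generated in a loop here, whereas the patch spells
them out, and the names zapnot()/zap() are local to the sketch.

```c
#include <stdint.h>

/* All-ones if bit `n` of `bits` is set, else zero.  This is what a
 * signed extract of a single bit produces. */
static uint64_t bit_to_mask(uint64_t bits, int n)
{
    return -(uint64_t)((bits >> n) & 1);
}

/* Keep byte i of `val` when bit i of `mskb` is set; clear it otherwise. */
static uint64_t zapnot(uint64_t val, uint64_t mskb)
{
    uint64_t mask = 0;

    for (int i = 0; i < 8; i++) {
        mask |= bit_to_mask(mskb, i) & (0xffull << (8 * i));
    }
    return val & mask;
}

/* ZAP is ZAPNOT with the byte mask inverted. */
static uint64_t zap(uint64_t val, uint64_t mskb)
{
    return zapnot(val, ~mskb);
}
```

ANDing an all-ones-or-zero value with a byte mask replaces the
multiply-by-constant of the old byte_zap().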




[Qemu-devel] [PATCH v2 03/17] target-alpha: Forget installed round mode after MT_FPCR

2015-05-12 Thread Richard Henderson
When we use QUAL_RM_D, we copy fpcr_dyn_round to float_status.
When we install a new FPCR value, we update fpcr_dyn_round.
Reset the status of the cache so that we re-copy for the next
fp insn that requires dynamic rounding.

Signed-off-by: Richard Henderson 
---
 target-alpha/translate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index b3c5dca..94dab26 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2199,6 +2199,11 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 /* MT_FPCR */
 va = load_fpr(ctx, ra);
 gen_helper_store_fpcr(cpu_env, va);
+if (ctx->tb_rm == QUAL_RM_D) {
+/* Re-do the copy of the rounding mode to fp_status
+   the next time we use dynamic rounding.  */
+ctx->tb_rm = -1;
+}
 break;
 case 0x025:
 /* MF_FPCR */
-- 
2.1.0
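
The caching scheme this patch repairs can be modelled in a few lines.
This is a toy under stated assumptions: struct toy_ctx, the RM_*
values, and the 'installs' counter are invented for the sketch and only
mimic the role of ctx->tb_rm, with -1 meaning "nothing cached".

```c
/* Toy rounding-mode qualifiers; only RM_DYNAMIC matters here. */
enum { RM_CHOPPED = 0, RM_MINUS_INF = 1, RM_NORMAL = 2, RM_DYNAMIC = 3 };

struct toy_ctx {
    int cached_rm;   /* mode last installed, or -1 for "unknown" */
    int installs;    /* how many times a mode had to be installed */
};

/* Install `rm` unless the cache says it is already in effect. */
static void toy_use_roundmode(struct toy_ctx *ctx, int rm)
{
    if (ctx->cached_rm == rm) {
        return;
    }
    ctx->cached_rm = rm;
    ctx->installs++;
}

/* Writing the FPCR may change the dynamic mode, so a cached "dynamic"
 * entry is stale and must be forgotten. */
static void toy_write_fpcr(struct toy_ctx *ctx)
{
    if (ctx->cached_rm == RM_DYNAMIC) {
        ctx->cached_rm = -1;
    }
}
```

Static modes are untouched by an FPCR write, so only the dynamic entry
is invalidated, as in the hunk above.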




[Qemu-devel] [PATCH v2 14/17] target-alpha: Raise EXC_M_INV properly for fp inputs

2015-05-12 Thread Richard Henderson
Ignore DNZ if software completion isn't used.  Raise INV for
denormals in system mode so the OS completion handler sees them.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 32 ++--
 target-alpha/helper.h |  1 +
 target-alpha/translate.c  |  7 +++
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index db523fb..ea1f2e2 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -104,16 +104,14 @@ void helper_ieee_input(CPUAlphaState *env, uint64_t val)
 uint64_t frac = val & 0xfull;
 
 if (exp == 0) {
-/* Denormals without DNZ set raise an exception.  */
-if (frac != 0 && !env->fp_status.flush_inputs_to_zero) {
-arith_excp(env, GETPC(), EXC_M_UNF, 0);
+/* Denormals without /S raise an exception.  */
+if (frac != 0) {
+arith_excp(env, GETPC(), EXC_M_INV, 0);
 }
 } else if (exp == 0x7ff) {
 /* Infinity or NaN.  */
-/* ??? I'm not sure these exception bit flags are correct.  I do
-   know that the Linux kernel, at least, doesn't rely on them and
-   just emulates the insn to figure out what exception to use.  */
-arith_excp(env, GETPC(), frac ? EXC_M_INV : EXC_M_FOV, 0);
+env->fpcr |= FPCR_INV;
+arith_excp(env, GETPC(), EXC_M_INV, 0);
 }
 }
 
@@ -124,16 +122,30 @@ void helper_ieee_input_cmp(CPUAlphaState *env, uint64_t val)
 uint64_t frac = val & 0xfull;
 
 if (exp == 0) {
-/* Denormals without DNZ set raise an exception.  */
-if (frac != 0 && !env->fp_status.flush_inputs_to_zero) {
-arith_excp(env, GETPC(), EXC_M_UNF, 0);
+/* Denormals without /S raise an exception.  */
+if (frac != 0) {
+arith_excp(env, GETPC(), EXC_M_INV, 0);
 }
 } else if (exp == 0x7ff && frac) {
 /* NaN.  */
+env->fpcr |= FPCR_INV;
 arith_excp(env, GETPC(), EXC_M_INV, 0);
 }
 }
 
+/* Input handling with software completion.  Trap for denorms, unless DNZ
+   is set.  If we try to support DNOD (which none of the produced hardware
+   did, AFAICS), we'll need to suppress the trap when FPCR.DNOD is set;
+   then the code downstream of that will need to cope with denorms sans
+   flush_input_to_zero.  Most of it should work sanely, but there's
+   nothing to compare with.  */
+void helper_ieee_input_s(CPUAlphaState *env, uint64_t val)
+{
+if (unlikely(2 * val - 1 < 0x1full)
+&& !env->fp_status.flush_inputs_to_zero) {
+arith_excp(env, GETPC(), EXC_M_INV | EXC_M_SWC, 0);
+}
+}
 
 /* S floating (single) */
 
diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index 5b1a5d9..780b0dc 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -86,6 +86,7 @@ DEF_HELPER_FLAGS_3(fp_exc_raise_s, TCG_CALL_NO_WG, void, env, i32, i32)
 
 DEF_HELPER_FLAGS_2(ieee_input, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_2(ieee_input_cmp, TCG_CALL_NO_WG, void, env, i64)
+DEF_HELPER_FLAGS_2(ieee_input_s, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_2(cvtql_v_input, TCG_CALL_NO_WG, void, env, i64)
 
 #if !defined (CONFIG_USER_ONLY)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f0556b0..4c441a9 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -658,6 +658,13 @@ static TCGv gen_ieee_input(DisasContext *ctx, int reg, int fn11, int is_cmp)
 } else {
 gen_helper_ieee_input(cpu_env, val);
 }
+} else {
+#ifndef CONFIG_USER_ONLY
+/* In system mode, raise exceptions for denormals like real
+   hardware.  In user mode, proceed as if the OS completion
+   handler is handling the denormal as per spec.  */
+gen_helper_ieee_input_s(cpu_env, val);
+#endif
 }
 }
 return val;
-- 
2.1.0
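
The one-line denormal test in helper_ieee_input_s() is compact enough to
deserve a worked check. The sketch below compares a field-by-field test
with the "2 * val - 1" form. The full-width constants are my
reconstruction for IEEE 754 binary64 (52-bit fraction), since the
constants in the diff appear shortened in this archive copy; the
function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Straightforward test: zero exponent field, non-zero fraction. */
static bool is_denormal_fields(uint64_t val)
{
    uint64_t exp = (val >> 52) & 0x7ff;
    uint64_t frac = val & 0xfffffffffffffull;

    return exp == 0 && frac != 0;
}

/* Compact test: doubling shifts the sign bit out, subtracting one
 * wraps both zeros around to UINT64_MAX, and the unsigned compare
 * then accepts exactly the values below the smallest normal number. */
static bool is_denormal_compact(uint64_t val)
{
    return 2 * val - 1 < 0x1fffffffffffffull;
}
```

Both forms agree on zeros, denormals of either sign, and normal numbers,
so the compact form costs a single compare.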




[Qemu-devel] [PATCH v2 08/17] target-alpha: Raise IOV from CVTTQ

2015-05-12 Thread Richard Henderson
Floating-point overflow is a different bit from integer overflow.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 25 +
 target-alpha/helper.h |  1 -
 target-alpha/translate.c  | 17 -
 3 files changed, 13 insertions(+), 30 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index 914c1d5..132b5a2 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -427,12 +427,9 @@ uint64_t helper_cvtqs(CPUAlphaState *env, uint64_t a)
 
 /* Implement float64 to uint64 conversion without saturation -- we must
supply the truncated result.  This behaviour is used by the compiler
-   to get unsigned conversion for free with the same instruction.
+   to get unsigned conversion for free with the same instruction.  */
 
-   The VI flag is set when overflow or inexact exceptions should be raised.  */
-
-static inline uint64_t inline_cvttq(CPUAlphaState *env, uint64_t a,
-int roundmode, int VI)
+static uint64_t do_cvttq(CPUAlphaState *env, uint64_t a, int roundmode)
 {
 uint64_t frac, ret = 0;
 uint32_t exp, sign, exc = 0;
@@ -447,7 +444,7 @@ static inline uint64_t inline_cvttq(CPUAlphaState *env, uint64_t a,
 goto do_underflow;
 }
 } else if (exp == 0x7ff) {
-exc = (frac ? FPCR_INV : VI ? FPCR_OVF : 0);
+exc = (frac ? FPCR_INV : FPCR_IOV | FPCR_INE);
 } else {
 /* Restore implicit bit.  */
 frac |= 0x10ull;
@@ -456,10 +453,11 @@ static inline uint64_t inline_cvttq(CPUAlphaState *env, uint64_t a,
 if (shift >= 0) {
 /* In this case the number is so large that we must shift
the fraction left.  There is no rounding to do.  */
+exc = FPCR_IOV | FPCR_INE;
 if (shift < 63) {
 ret = frac << shift;
-if (VI && (ret >> shift) != frac) {
-exc = FPCR_OVF;
+if ((ret >> shift) == frac) {
+exc = 0;
 }
 }
 } else {
@@ -482,7 +480,7 @@ static inline uint64_t inline_cvttq(CPUAlphaState *env, uint64_t a,
 }
 
 if (round) {
-exc = (VI ? FPCR_INE : 0);
+exc = FPCR_INE;
 switch (roundmode) {
 case float_round_nearest_even:
 if (round == (1ull << 63)) {
@@ -514,17 +512,12 @@ static inline uint64_t inline_cvttq(CPUAlphaState *env, uint64_t a,
 
 uint64_t helper_cvttq(CPUAlphaState *env, uint64_t a)
 {
-return inline_cvttq(env, a, FP_STATUS.float_rounding_mode, 1);
+return do_cvttq(env, a, FP_STATUS.float_rounding_mode);
 }
 
 uint64_t helper_cvttq_c(CPUAlphaState *env, uint64_t a)
 {
-return inline_cvttq(env, a, float_round_to_zero, 0);
-}
-
-uint64_t helper_cvttq_svic(CPUAlphaState *env, uint64_t a)
-{
-return inline_cvttq(env, a, float_round_to_zero, 1);
+return do_cvttq(env, a, float_round_to_zero);
 }
 
 uint64_t helper_cvtqt(CPUAlphaState *env, uint64_t a)
diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index 67a6e32..9e7b771 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -83,7 +83,6 @@ DEF_HELPER_FLAGS_2(cvtqg, TCG_CALL_NO_RWG, i64, env, i64)
 
 DEF_HELPER_FLAGS_2(cvttq, TCG_CALL_NO_RWG, i64, env, i64)
 DEF_HELPER_FLAGS_2(cvttq_c, TCG_CALL_NO_RWG, i64, env, i64)
-DEF_HELPER_FLAGS_2(cvttq_svic, TCG_CALL_NO_RWG, i64, env, i64)
 
 DEF_HELPER_FLAGS_2(setroundmode, TCG_CALL_NO_RWG, void, env, i32)
 DEF_HELPER_FLAGS_2(setflushzero, TCG_CALL_NO_RWG, void, env, i32)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f121320..7868cc4 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -760,23 +760,14 @@ static void gen_cvttq(DisasContext *ctx, int rb, int rc, 
int fn11)
 vb = gen_ieee_input(ctx, rb, fn11, 0);
 vc = dest_fpr(ctx, rc);
 
-/* Almost all integer conversions use cropped rounding, and most
-   also do not have integer overflow enabled.  Special case that.  */
-switch (fn11) {
-case QUAL_RM_C:
+/* Almost all integer conversions use cropped rounding;
+   special case that.  */
+if ((fn11 & QUAL_RM_MASK) == QUAL_RM_C) {
 gen_helper_cvttq_c(vc, cpu_env, vb);
-break;
-case QUAL_V | QUAL_RM_C:
-case QUAL_S | QUAL_V | QUAL_RM_C:
-case QUAL_S | QUAL_V | QUAL_I | QUAL_RM_C:
-gen_helper_cvttq_svic(vc, cpu_env, vb);
-break;
-default:
+} else {
 gen_qual_roundmode(ctx, fn11);
 gen_helper_cvttq(vc, cpu_env, vb);
-break;
 }
-
 gen_fp_exc_raise(rc, fn11);
 }
 
-- 
2.1.0
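
The overflow test in the large-shift hunk above is a round trip: shift the fraction left, shift the result back, and compare with the input. Below is a minimal standalone sketch of that check. The helper name shl_overflows is invented for illustration and is not a function from target-alpha; it only reports bits lost from the 64-bit value and assumes the shift count is below 64, as the patch does with its "shift < 63" guard.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of QEMU: shift FRAC left by SHIFT bits
   and report whether any set bits were lost.  SHIFT must be in 0..63.  */
static bool shl_overflows(uint64_t frac, unsigned shift, uint64_t *result)
{
    uint64_t ret = frac << shift;

    *result = ret;
    /* If shifting back does not reproduce the input, high bits fell off.  */
    return (ret >> shift) != frac;
}
```

In the patch the same comparison clears the pessimistic FPCR_IOV | FPCR_INE default when the round trip succeeds, so the exceptional case is the fall-through.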




[Qemu-devel] [PATCH v2 05/17] target-alpha: Tidy FPCR representation

2015-05-12 Thread Richard Henderson
Store the fpcr as the hardware represents it.  Convert the softfpu
representation of exceptions into the fpcr representation.

Signed-off-by: Richard Henderson 
---
 target-alpha/cpu.h|  95 +
 target-alpha/fpu_helper.c | 130 +-
 target-alpha/helper.c | 130 --
 target-alpha/helper.h |   2 -
 target-alpha/translate.c  |  45 +++-
 5 files changed, 159 insertions(+), 243 deletions(-)

diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h
index 9538f19..2a4d5cb 100644
--- a/target-alpha/cpu.h
+++ b/target-alpha/cpu.h
@@ -150,54 +150,54 @@ enum {
 FP_ROUND_DYNAMIC = 0x3,
 };
 
-/* FPCR bits */
-#define FPCR_SUM   (1ULL << 63)
-#define FPCR_INED  (1ULL << 62)
-#define FPCR_UNFD  (1ULL << 61)
-#define FPCR_UNDZ  (1ULL << 60)
-#define FPCR_DYN_SHIFT 58
-#define FPCR_DYN_CHOPPED   (0ULL << FPCR_DYN_SHIFT)
-#define FPCR_DYN_MINUS (1ULL << FPCR_DYN_SHIFT)
-#define FPCR_DYN_NORMAL(2ULL << FPCR_DYN_SHIFT)
-#define FPCR_DYN_PLUS  (3ULL << FPCR_DYN_SHIFT)
-#define FPCR_DYN_MASK  (3ULL << FPCR_DYN_SHIFT)
-#define FPCR_IOV   (1ULL << 57)
-#define FPCR_INE   (1ULL << 56)
-#define FPCR_UNF   (1ULL << 55)
-#define FPCR_OVF   (1ULL << 54)
-#define FPCR_DZE   (1ULL << 53)
-#define FPCR_INV   (1ULL << 52)
-#define FPCR_OVFD  (1ULL << 51)
-#define FPCR_DZED  (1ULL << 50)
-#define FPCR_INVD  (1ULL << 49)
-#define FPCR_DNZ   (1ULL << 48)
-#define FPCR_DNOD  (1ULL << 47)
-#define FPCR_STATUS_MASK   (FPCR_IOV | FPCR_INE | FPCR_UNF \
-| FPCR_OVF | FPCR_DZE | FPCR_INV)
+/* FPCR bits -- right-shifted 32 so we can use a uint32_t.  */
+#define FPCR_SUM(1U << (63 - 32))
+#define FPCR_INED   (1U << (62 - 32))
+#define FPCR_UNFD   (1U << (61 - 32))
+#define FPCR_UNDZ   (1U << (60 - 32))
+#define FPCR_DYN_SHIFT  (58 - 32)
+#define FPCR_DYN_CHOPPED(0U << FPCR_DYN_SHIFT)
+#define FPCR_DYN_MINUS  (1U << FPCR_DYN_SHIFT)
+#define FPCR_DYN_NORMAL (2U << FPCR_DYN_SHIFT)
+#define FPCR_DYN_PLUS   (3U << FPCR_DYN_SHIFT)
+#define FPCR_DYN_MASK   (3U << FPCR_DYN_SHIFT)
+#define FPCR_IOV(1U << (57 - 32))
+#define FPCR_INE(1U << (56 - 32))
+#define FPCR_UNF(1U << (55 - 32))
+#define FPCR_OVF(1U << (54 - 32))
+#define FPCR_DZE(1U << (53 - 32))
+#define FPCR_INV(1U << (52 - 32))
+#define FPCR_OVFD   (1U << (51 - 32))
+#define FPCR_DZED   (1U << (50 - 32))
+#define FPCR_INVD   (1U << (49 - 32))
+#define FPCR_DNZ(1U << (48 - 32))
+#define FPCR_DNOD   (1U << (47 - 32))
+#define FPCR_STATUS_MASK(FPCR_IOV | FPCR_INE | FPCR_UNF \
+ | FPCR_OVF | FPCR_DZE | FPCR_INV)
 
 /* The silly software trap enables implemented by the kernel emulation.
These are more or less architecturally required, since the real hardware
has read-as-zero bits in the FPCR when the features aren't implemented.
For the purposes of QEMU, we pretend the FPCR can hold everything.  */
-#define SWCR_TRAP_ENABLE_INV   (1ULL << 1)
-#define SWCR_TRAP_ENABLE_DZE   (1ULL << 2)
-#define SWCR_TRAP_ENABLE_OVF   (1ULL << 3)
-#define SWCR_TRAP_ENABLE_UNF   (1ULL << 4)
-#define SWCR_TRAP_ENABLE_INE   (1ULL << 5)
-#define SWCR_TRAP_ENABLE_DNO   (1ULL << 6)
-#define SWCR_TRAP_ENABLE_MASK  ((1ULL << 7) - (1ULL << 1))
-
-#define SWCR_MAP_DMZ   (1ULL << 12)
-#define SWCR_MAP_UMZ   (1ULL << 13)
-#define SWCR_MAP_MASK  (SWCR_MAP_DMZ | SWCR_MAP_UMZ)
-
-#define SWCR_STATUS_INV(1ULL << 17)
-#define SWCR_STATUS_DZE(1ULL << 18)
-#define SWCR_STATUS_OVF(1ULL << 19)
-#define SWCR_STATUS_UNF(1ULL << 20)
-#define SWCR_STATUS_INE(1ULL << 21)
-#define SWCR_STATUS_DNO(1ULL << 22)
-#define SWCR_STATUS_MASK   ((1ULL << 23) - (1ULL << 17))
+#define SWCR_TRAP_ENABLE_INV(1U << 1)
+#define SWCR_TRAP_ENABLE_DZE(1U << 2)
+#define SWCR_TRAP_ENABLE_OVF(1U << 3)
+#define SWCR_TRAP_ENABLE_UNF(1U << 4)
+#define SWCR_TRAP_ENABLE_INE(1U << 5)
+#define SWCR_TRAP_ENABLE_DNO(1U << 6)
+#define SWCR_TRAP_ENABLE_MASK   ((1U << 7) - (1U << 1))
+
+#define SWCR_MAP_DMZ(1U << 12)
+#define SWCR_MAP_UMZ(1U << 13)
+#define SWCR_MAP_MASK   (SWCR_MAP_DMZ | SWCR_MAP_UMZ)
+
+#define SWCR_STATUS_INV (1U << 17)
+#define SWCR_STATUS_DZE (1U << 18)
+#define SWCR_STATUS_OVF (1U << 19)
+#define SWCR_STATUS_UNF (1U << 20)
+#define SWCR_STATUS_
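
The new FPCR_* definitions keep only the upper half of the 64-bit register, so architectural bit N is stored at bit N - 32 of a uint32_t. Here is a small sketch of the conversion in both directions. The two bit definitions are copied from the patch; the accessor names fpcr_to_reg and fpcr_from_reg are hypothetical, and the sketch assumes that every defined FPCR bit sits in the upper 32 bits, which holds for the bits listed above (47 to 63).

```c
#include <stdint.h>

/* Bit positions copied from the patch: architectural bit N lives at
   bit N - 32 of the 32-bit copy.  */
#define FPCR_INE  (1U << (56 - 32))
#define FPCR_IOV  (1U << (57 - 32))

/* Hypothetical accessor: widen the stored copy to the 64-bit
   register image.  */
static uint64_t fpcr_to_reg(uint32_t fpcr)
{
    return (uint64_t)fpcr << 32;
}

/* Hypothetical accessor: keep only the upper half of a 64-bit value;
   anything in the low 32 bits is discarded.  */
static uint32_t fpcr_from_reg(uint64_t val)
{
    return (uint32_t)(val >> 32);
}
```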

[Qemu-devel] [PATCH v2 12/17] target-alpha: Implement WH64EN

2015-05-12 Thread Richard Henderson
Backward compatible cache insn introduced for EV7.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/translate.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 74f5d07..953d1ef 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2318,6 +2318,10 @@ static ExitStatus translate_one(DisasContext *ctx, 
uint32_t insn)
 /* WH64 */
 /* No-op */
 break;
+case 0xFC00:
+/* WH64EN */
+/* No-op */
+break;
 default:
 goto invalid_opc;
 }
-- 
2.1.0




[Qemu-devel] [PATCH v2 06/17] target-alpha: Set fpcr_exc_status even for disabled exceptions

2015-05-12 Thread Richard Henderson
The qualifiers can suppress the raising of exceptions, but real
hardware still records that the exceptions occurred.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c | 35 +--
 target-alpha/translate.c  | 28 +---
 2 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index caf8317..6e84fd3 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -57,18 +57,16 @@ static uint32_t soft_to_fpcr_exc(CPUAlphaState *env)
 static void fp_exc_raise1(CPUAlphaState *env, uintptr_t retaddr,
   uint32_t exc, uint32_t regno)
 {
-if (exc) {
-uint32_t hw_exc = 0;
+uint32_t hw_exc = 0;
 
-hw_exc |= CONVERT_BIT(exc, FPCR_INV, EXC_M_INV);
-hw_exc |= CONVERT_BIT(exc, FPCR_DZE, EXC_M_DZE);
-hw_exc |= CONVERT_BIT(exc, FPCR_OVF, EXC_M_FOV);
-hw_exc |= CONVERT_BIT(exc, FPCR_UNF, EXC_M_UNF);
-hw_exc |= CONVERT_BIT(exc, FPCR_INE, EXC_M_INE);
-hw_exc |= CONVERT_BIT(exc, FPCR_IOV, EXC_M_IOV);
+hw_exc |= CONVERT_BIT(exc, FPCR_INV, EXC_M_INV);
+hw_exc |= CONVERT_BIT(exc, FPCR_DZE, EXC_M_DZE);
+hw_exc |= CONVERT_BIT(exc, FPCR_OVF, EXC_M_FOV);
+hw_exc |= CONVERT_BIT(exc, FPCR_UNF, EXC_M_UNF);
+hw_exc |= CONVERT_BIT(exc, FPCR_INE, EXC_M_INE);
+hw_exc |= CONVERT_BIT(exc, FPCR_IOV, EXC_M_IOV);
 
-arith_excp(env, retaddr, hw_exc, 1ull << regno);
-}
+arith_excp(env, retaddr, hw_exc, 1ull << regno);
 }
 
 /* Raise exceptions for ieee fp insns without software completion.
@@ -76,8 +74,14 @@ static void fp_exc_raise1(CPUAlphaState *env, uintptr_t 
retaddr,
doesn't apply.  */
 void helper_fp_exc_raise(CPUAlphaState *env, uint32_t ignore, uint32_t regno)
 {
-uint32_t exc = env->error_code & ~ignore;
-fp_exc_raise1(env, GETPC(), exc, regno);
+uint32_t exc = env->error_code;
+if (exc) {
+env->fpcr |= exc;
+exc &= ~ignore;
+if (exc) {
+fp_exc_raise1(env, GETPC(), exc, regno);
+}
+}
 }
 
 /* Raise exceptions for ieee fp insns with software completion.  */
@@ -86,8 +90,11 @@ void helper_fp_exc_raise_s(CPUAlphaState *env, uint32_t 
ignore, uint32_t regno)
 uint32_t exc = env->error_code & ~ignore;
 if (exc) {
 env->fpcr |= exc;
-exc &= env->fpcr_exc_enable;
-fp_exc_raise1(env, GETPC(), exc, regno);
+exc &= ~ignore;
+if (exc) {
+exc &= env->fpcr_exc_enable;
+fp_exc_raise1(env, GETPC(), exc, regno);
+}
 }
 }
 
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 25107f9..f121320 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -663,15 +663,24 @@ static TCGv gen_ieee_input(DisasContext *ctx, int reg, 
int fn11, int is_cmp)
 return val;
 }
 
-static void gen_fp_exc_raise_ignore(int rc, int fn11, int ignore)
+static void gen_fp_exc_raise(int rc, int fn11)
 {
 /* ??? We ought to be able to do something with imprecise exceptions.
E.g. notice we're still in the trap shadow of something within the
TB and do not generate the code to signal the exception; end the TB
when an exception is forced to arrive, either by consumption of a
register value or TRAPB or EXCB.  */
-TCGv_i32 ign = tcg_const_i32(ignore);
-TCGv_i32 reg;
+TCGv_i32 reg, ign;
+uint32_t ignore = 0;
+
+if (!(fn11 & QUAL_U)) {
+/* Note that QUAL_U == QUAL_V, so ignore either.  */
+ignore |= FPCR_UNF | FPCR_IOV;
+}
+if (!(fn11 & QUAL_I)) {
+ignore |= FPCR_INE;
+}
+ign = tcg_const_i32(ignore);
 
 /* ??? Pass in the regno of the destination so that the helper can
set EXC_MASK, which contains a bitmask of destination registers
@@ -679,7 +688,6 @@ static void gen_fp_exc_raise_ignore(int rc, int fn11, int 
ignore)
does not require this.  We do need it for a guest kernel's entArith,
or if we were to do something clever with imprecise exceptions.  */
 reg = tcg_const_i32(rc + 32);
-
 if (fn11 & QUAL_S) {
 gen_helper_fp_exc_raise_s(cpu_env, ign, reg);
 } else {
@@ -690,11 +698,6 @@ static void gen_fp_exc_raise_ignore(int rc, int fn11, int 
ignore)
 tcg_temp_free_i32(ign);
 }
 
-static inline void gen_fp_exc_raise(int rc, int fn11)
-{
-gen_fp_exc_raise_ignore(rc, fn11, fn11 & QUAL_I ? 0 : FPCR_INE);
-}
-
 static void gen_cvtlq(TCGv vc, TCGv vb)
 {
 TCGv tmp = tcg_temp_new();
@@ -752,7 +755,6 @@ IEEE_ARITH2(cvtts)
 static void gen_cvttq(DisasContext *ctx, int rb, int rc, int fn11)
 {
 TCGv vb, vc;
-int ignore = 0;
 
 /* No need to set flushzero, since we have an integer output.  */
 vb = gen_ieee_input(ctx, rb, fn11, 0);
@@ -766,20 +768,16 @@ static void gen_cvttq(DisasContext *ctx, int rb, int rc, 
int fn11)
 break;
 case QUAL_V | QUAL_RM_C:
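
The ordering that helper_fp_exc_raise adopts in this patch (record first, mask second, trap last) can be shown in isolation. This is only a sketch under assumed names: struct fp_state and fp_exc_note are invented for illustration and do not exist in target-alpha, and a plain "raised" field stands in for the call to fp_exc_raise1.

```c
#include <stdint.h>

/* Hypothetical state; the field names are illustrative only.  */
struct fp_state {
    uint32_t status;   /* sticky record of every exception seen */
    uint32_t raised;   /* exceptions that would be delivered as traps */
};

/* Record EXC in the status word first, then drop the bits listed in
   IGNORE before deciding whether anything is delivered.  */
static void fp_exc_note(struct fp_state *s, uint32_t exc, uint32_t ignore)
{
    if (exc) {
        s->status |= exc;
        exc &= ~ignore;
        if (exc) {
            s->raised |= exc;
        }
    }
}
```

The point of the ordering is that a suppressed exception still leaves a trace in the status word, which matches the commit message's statement about real hardware.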

[Qemu-devel] [PATCH v2 01/17] target-alpha: Move VAX helpers to a new file

2015-05-12 Thread Richard Henderson
Keep the IEEE and VAX floating point emulation separate.

Signed-off-by: Richard Henderson 
---
 target-alpha/Makefile.objs |   2 +-
 target-alpha/fpu_helper.c  | 328 -
 target-alpha/vax_helper.c  | 353 +
 3 files changed, 354 insertions(+), 329 deletions(-)
 create mode 100644 target-alpha/vax_helper.c

diff --git a/target-alpha/Makefile.objs b/target-alpha/Makefile.objs
index b96c5da..6366462 100644
--- a/target-alpha/Makefile.objs
+++ b/target-alpha/Makefile.objs
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SOFTMMU) += machine.o
 obj-y += translate.o helper.o cpu.o
-obj-y += int_helper.o fpu_helper.o sys_helper.o mem_helper.o
+obj-y += int_helper.o fpu_helper.o vax_helper.o sys_helper.o mem_helper.o
 obj-y += gdbstub.o
diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index d2d776c..8acd460 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -126,263 +126,6 @@ void helper_ieee_input_cmp(CPUAlphaState *env, uint64_t 
val)
 }
 }
 
-/* F floating (VAX) */
-static uint64_t float32_to_f(float32 fa)
-{
-uint64_t r, exp, mant, sig;
-CPU_FloatU a;
-
-a.f = fa;
-sig = ((uint64_t)a.l & 0x8000) << 32;
-exp = (a.l >> 23) & 0xff;
-mant = ((uint64_t)a.l & 0x007f) << 29;
-
-if (exp == 255) {
-/* NaN or infinity */
-r = 1; /* VAX dirty zero */
-} else if (exp == 0) {
-if (mant == 0) {
-/* Zero */
-r = 0;
-} else {
-/* Denormalized */
-r = sig | ((exp + 1) << 52) | mant;
-}
-} else {
-if (exp >= 253) {
-/* Overflow */
-r = 1; /* VAX dirty zero */
-} else {
-r = sig | ((exp + 2) << 52);
-}
-}
-
-return r;
-}
-
-static float32 f_to_float32(CPUAlphaState *env, uintptr_t retaddr, uint64_t a)
-{
-uint32_t exp, mant_sig;
-CPU_FloatU r;
-
-exp = ((a >> 55) & 0x80) | ((a >> 52) & 0x7f);
-mant_sig = ((a >> 32) & 0x8000) | ((a >> 29) & 0x007f);
-
-if (unlikely(!exp && mant_sig)) {
-/* Reserved operands / Dirty zero */
-dynamic_excp(env, retaddr, EXCP_OPCDEC, 0);
-}
-
-if (exp < 3) {
-/* Underflow */
-r.l = 0;
-} else {
-r.l = ((exp - 2) << 23) | mant_sig;
-}
-
-return r.f;
-}
-
-uint32_t helper_f_to_memory(uint64_t a)
-{
-uint32_t r;
-r =  (a & 0x1fffe000ull) >> 13;
-r |= (a & 0x07ffe000ull) >> 45;
-r |= (a & 0xc000ull) >> 48;
-return r;
-}
-
-uint64_t helper_memory_to_f(uint32_t a)
-{
-uint64_t r;
-r =  ((uint64_t)(a & 0xc000)) << 48;
-r |= ((uint64_t)(a & 0x003f)) << 45;
-r |= ((uint64_t)(a & 0x)) << 13;
-if (!(a & 0x4000)) {
-r |= 0x7ll << 59;
-}
-return r;
-}
-
-/* ??? Emulating VAX arithmetic with IEEE arithmetic is wrong.  We should
-   either implement VAX arithmetic properly or just signal invalid opcode.  */
-
-uint64_t helper_addf(CPUAlphaState *env, uint64_t a, uint64_t b)
-{
-float32 fa, fb, fr;
-
-fa = f_to_float32(env, GETPC(), a);
-fb = f_to_float32(env, GETPC(), b);
-fr = float32_add(fa, fb, &FP_STATUS);
-return float32_to_f(fr);
-}
-
-uint64_t helper_subf(CPUAlphaState *env, uint64_t a, uint64_t b)
-{
-float32 fa, fb, fr;
-
-fa = f_to_float32(env, GETPC(), a);
-fb = f_to_float32(env, GETPC(), b);
-fr = float32_sub(fa, fb, &FP_STATUS);
-return float32_to_f(fr);
-}
-
-uint64_t helper_mulf(CPUAlphaState *env, uint64_t a, uint64_t b)
-{
-float32 fa, fb, fr;
-
-fa = f_to_float32(env, GETPC(), a);
-fb = f_to_float32(env, GETPC(), b);
-fr = float32_mul(fa, fb, &FP_STATUS);
-return float32_to_f(fr);
-}
-
-uint64_t helper_divf(CPUAlphaState *env, uint64_t a, uint64_t b)
-{
-float32 fa, fb, fr;
-
-fa = f_to_float32(env, GETPC(), a);
-fb = f_to_float32(env, GETPC(), b);
-fr = float32_div(fa, fb, &FP_STATUS);
-return float32_to_f(fr);
-}
-
-uint64_t helper_sqrtf(CPUAlphaState *env, uint64_t t)
-{
-float32 ft, fr;
-
-ft = f_to_float32(env, GETPC(), t);
-fr = float32_sqrt(ft, &FP_STATUS);
-return float32_to_f(fr);
-}
-
-
-/* G floating (VAX) */
-static uint64_t float64_to_g(float64 fa)
-{
-uint64_t r, exp, mant, sig;
-CPU_DoubleU a;
-
-a.d = fa;
-sig = a.ll & 0x8000ull;
-exp = (a.ll >> 52) & 0x7ff;
-mant = a.ll & 0x000full;
-
-if (exp == 2047) {
-/* NaN or infinity */
-r = 1; /* VAX dirty zero */
-} else if (exp == 0) {
-if (mant == 0) {
-/* Zero */
-r = 0;
-} else {
-/* Denormalized */
-r = sig | ((exp + 1) << 52) | mant;
-}
-} else {
-if (exp >= 2045) {
-/* Overflow */
-r = 1; /* VAX dirty zero */
-} else {
-r = sig | ((exp + 2) << 52);
-  

[Qemu-devel] [PATCH v2 04/17] target-alpha: Set PC correctly for floating-point exceptions

2015-05-12 Thread Richard Henderson
PC should be one past the faulting insn.  Add better commentary
for the machine-check exception path.

Reported-by: Al Viro 
Signed-off-by: Richard Henderson 
---
 target-alpha/helper.c | 2 ++
 target-alpha/mem_helper.c | 9 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/target-alpha/helper.c b/target-alpha/helper.c
index a8aa782..e202fee 100644
--- a/target-alpha/helper.c
+++ b/target-alpha/helper.c
@@ -571,6 +571,8 @@ void QEMU_NORETURN dynamic_excp(CPUAlphaState *env, 
uintptr_t retaddr,
 env->error_code = error;
 if (retaddr) {
 cpu_restore_state(cs, retaddr);
+/* Floating-point exceptions (our only users) point to the next PC.  */
+env->pc += 4;
 }
 cpu_loop_exit(cs);
 }
diff --git a/target-alpha/mem_helper.c b/target-alpha/mem_helper.c
index fc4f57a..7b5e30d 100644
--- a/target-alpha/mem_helper.c
+++ b/target-alpha/mem_helper.c
@@ -128,7 +128,14 @@ void alpha_cpu_unassigned_access(CPUState *cs, hwaddr addr,
 
 env->trap_arg0 = addr;
 env->trap_arg1 = is_write ? 1 : 0;
-dynamic_excp(env, 0, EXCP_MCHK, 0);
+cs->exception_index = EXCP_MCHK;
+env->error_code = 0;
+
+/* ??? We should cpu_restore_state to the faulting insn, but this hook
+   does not have access to the retaddr value from the original helper.
+   It's all moot until the QEMU PALcode grows an MCHK handler.  */
+
+cpu_loop_exit(cs);
 }
 
 /* try to fill the TLB and return an exception if error. If retaddr is
-- 
2.1.0




[Qemu-devel] [PATCH v2 02/17] target-alpha: Rename floating-point subroutines

2015-05-12 Thread Richard Henderson
... to match the instructions, which have no leading "f".

Signed-off-by: Richard Henderson 
---
 target-alpha/fpu_helper.c |  2 +-
 target-alpha/helper.h |  2 +-
 target-alpha/translate.c  | 68 +++
 3 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index 8acd460..119559a 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -493,7 +493,7 @@ uint64_t helper_cvtqt(CPUAlphaState *env, uint64_t a)
 return float64_to_t(fr);
 }
 
-void helper_fcvtql_v_input(CPUAlphaState *env, uint64_t val)
+void helper_cvtql_v_input(CPUAlphaState *env, uint64_t val)
 {
 if (val != (int32_t)val) {
 arith_excp(env, GETPC(), EXC_M_IOV, 0);
diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index a451cfe..424ea49 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -94,7 +94,7 @@ DEF_HELPER_FLAGS_3(fp_exc_raise_s, TCG_CALL_NO_WG, void, env, 
i32, i32)
 
 DEF_HELPER_FLAGS_2(ieee_input, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_2(ieee_input_cmp, TCG_CALL_NO_WG, void, env, i64)
-DEF_HELPER_FLAGS_2(fcvtql_v_input, TCG_CALL_NO_WG, void, env, i64)
+DEF_HELPER_FLAGS_2(cvtql_v_input, TCG_CALL_NO_WG, void, env, i64)
 
 #if !defined (CONFIG_USER_ONLY)
 DEF_HELPER_2(hw_ret, void, env, i64)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index efeeb05..b3c5dca 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -718,7 +718,7 @@ static inline void gen_fp_exc_raise(int rc, int fn11)
 gen_fp_exc_raise_ignore(rc, fn11, fn11 & QUAL_I ? 0 : float_flag_inexact);
 }
 
-static void gen_fcvtlq(TCGv vc, TCGv vb)
+static void gen_cvtlq(TCGv vc, TCGv vb)
 {
 TCGv tmp = tcg_temp_new();
 
@@ -733,7 +733,7 @@ static void gen_fcvtlq(TCGv vc, TCGv vb)
 tcg_temp_free(tmp);
 }
 
-static void gen_fcvtql(TCGv vc, TCGv vb)
+static void gen_cvtql(TCGv vc, TCGv vb)
 {
 TCGv tmp = tcg_temp_new();
 
@@ -763,8 +763,8 @@ static void gen_ieee_arith2(DisasContext *ctx,
 }
 
 #define IEEE_ARITH2(name)   \
-static inline void glue(gen_f, name)(DisasContext *ctx, \
- int rb, int rc, int fn11)  \
+static inline void glue(gen_, name)(DisasContext *ctx,  \
+int rb, int rc, int fn11)   \
 {   \
 gen_ieee_arith2(ctx, gen_helper_##name, rb, rc, fn11);  \
 }
@@ -773,7 +773,7 @@ IEEE_ARITH2(sqrtt)
 IEEE_ARITH2(cvtst)
 IEEE_ARITH2(cvtts)
 
-static void gen_fcvttq(DisasContext *ctx, int rb, int rc, int fn11)
+static void gen_cvttq(DisasContext *ctx, int rb, int rc, int fn11)
 {
 TCGv vb, vc;
 int ignore = 0;
@@ -830,8 +830,8 @@ static void gen_ieee_intcvt(DisasContext *ctx,
 }
 
 #define IEEE_INTCVT(name)   \
-static inline void glue(gen_f, name)(DisasContext *ctx, \
- int rb, int rc, int fn11)  \
+static inline void glue(gen_, name)(DisasContext *ctx,  \
+int rb, int rc, int fn11)   \
 {   \
 gen_ieee_intcvt(ctx, gen_helper_##name, rb, rc, fn11);  \
 }
@@ -875,8 +875,8 @@ static void gen_ieee_arith3(DisasContext *ctx,
 }
 
 #define IEEE_ARITH3(name)   \
-static inline void glue(gen_f, name)(DisasContext *ctx, \
- int ra, int rb, int rc, int fn11)  \
+static inline void glue(gen_, name)(DisasContext *ctx,  \
+int ra, int rb, int rc, int fn11)   \
 {   \
 gen_ieee_arith3(ctx, gen_helper_##name, ra, rb, rc, fn11);  \
 }
@@ -906,8 +906,8 @@ static void gen_ieee_compare(DisasContext *ctx,
 }
 
 #define IEEE_CMP3(name) \
-static inline void glue(gen_f, name)(DisasContext *ctx, \
- int ra, int rb, int rc, int fn11)  \
+static inline void glue(gen_, name)(DisasContext *ctx,  \
+int ra, int rb, int rc, int fn11)   \
 {   \
 gen_ieee_compare(ctx, gen_helper_##name, ra, rb, rc, fn11); \
 }
@@ -1958,7 +1958,7 @@ static ExitStatus translate_one(DisasContext *ctx, 
uint32_t insn)
 case 0x0B:
 /* SQRTS */
 REQUIRE_REG_31(ra);
-gen_fsqrts(ctx, rb, rc, fn11);
+gen_sqrts(ctx, rb, rc, fn11);
 break;
 case 0x14:
 /* ITOFF */
@@ -1984,7 +1984,7 @@ static ExitStatus translate_one(DisasContext *ctx, 
uint32_t insn)
 case 0x02B:
 /* SQR
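
The rename works through the generator macros because the function name is assembled by token pasting: changing the prefix passed to glue() from gen_f to gen_ renames every generated function at once. Here is a reduced sketch of that mechanism. The two-level glue()/xglue() definition is reproduced here as an assumption about how QEMU's macro is written, and DEFINE_GEN is a hypothetical stand-in for IEEE_ARITH2 and friends.

```c
/* Two-level paste so that macro arguments are expanded before pasting.
   Assumed to mirror QEMU's glue(); defined locally for this sketch.  */
#define xglue(x, y) x ## y
#define glue(x, y) xglue(x, y)

/* Hypothetical generator macro: defines gen_<name>() returning VALUE.  */
#define DEFINE_GEN(name, value)            \
static inline int glue(gen_, name)(void)   \
{                                          \
    return (value);                        \
}

/* These expand to gen_sqrts() and gen_cvtst().  */
DEFINE_GEN(sqrts, 1)
DEFINE_GEN(cvtst, 2)
```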

[Qemu-devel] [PATCH v2 00/17] target-alpha fpu improvements

2015-05-12 Thread Richard Henderson
This is v2 of the work that Al Viro helped with nearly a year ago.
At the time, I absconded with an unused bit in the softfloat exception
flags, which was a bit of a wart and was rightly pointed out as such
by someone at the time.

After 11 months on the shelf, I've finally found enough time to work
out the bugs in the re-implementation of the fpcr.  This time all of
the bits are private to target-alpha, so no mucking about with the
generic softfloat code.


r~


Richard Henderson (17):
  target-alpha: Move VAX helpers to a new file
  target-alpha: Rename floating-point subroutines
  target-alpha: Forget installed round mode after MT_FPCR
  target-alpha: Set PC correctly for floating-point exceptions
  target-alpha: Tidy FPCR representation
  target-alpha: Set fpcr_exc_status even for disabled exceptions
  target-alpha: Set EXC_M_SWC for exceptions from /S insns
  target-alpha: Raise IOV from CVTTQ
  target-alpha: Fix cvttq vs large integers
  target-alpha: Fix cvttq vs inf
  target-alpha: Fix integer overflow checking insns
  target-alpha: Implement WH64EN
  target-alpha: Disallow literal operand to 1C.30 to 1C.37
  target-alpha: Raise EXC_M_INV properly for fp inputs
  target-alpha: Suppress underflow from CVTTQ if DNZ
  target-alpha: Raise IOV from CVTQL
  target-alpha: Rewrite helper_zapnot

 target-alpha/Makefile.objs |   2 +-
 target-alpha/cpu.h |  95 
 target-alpha/fpu_helper.c  | 530 +++--
 target-alpha/helper.c  | 132 ++-
 target-alpha/helper.h  |  14 +-
 target-alpha/int_helper.c  |  89 ++--
 target-alpha/mem_helper.c  |   9 +-
 target-alpha/translate.c   | 265 ---
 target-alpha/vax_helper.c  | 353 ++
 9 files changed, 715 insertions(+), 774 deletions(-)
 create mode 100644 target-alpha/vax_helper.c

-- 
2.1.0




Re: [Qemu-devel] [PULL 00/16] KVM, QOM, NBD, build fixes for 2015-05-08

2015-05-12 Thread Daniel P. Berrange
On Fri, May 08, 2015 at 02:29:01PM +0200, Andreas Färber wrote:
> Am 08.05.2015 um 14:07 schrieb Paolo Bonzini:
> > The following changes since commit 498147529d1f8e902e6528a0115143b53475791e:
> > 
> >   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150430' into 
> > staging (2015-04-30 14:15:56 +0100)
> > 
> > are available in the git repository at:
> > 
> >   git://github.com/bonzini/qemu.git tags/for-upstream
> > 
> > for you to fetch changes up to d51026b22b97332a95d91acfb6c23cd9b087955c:
> > 
> >   qemu-nbd: only send a limited number of errno codes on the wire 
> > (2015-05-08 13:14:54 +0200)
> > 
> > 
> > - Daniel's QOM improvements
> 
> Once again, objection.

Paolo, I'll re-send a v4 of these QOM related improvements that
incorporate Andreas' feedback.


> > - build bugfix from Fam and new configure check from Emilio
> > - two improvements to "info mtere" from Gerd
> > - KVM support for memory transaction attributes
> > - one more small step towards unlocked MMIO dispatch
> > - one piece of the qemu-nbd errno fixes
> > - trivial-ish patches from Denis and Thomas

> > 
> > 
> > Daniel P. Berrange (7):
> >   qom: fix typename of 'policy' enum property in hostmem obj
> >   qom: document user creatable object types in help text
> >   qom: create objects in two phases
> >   qom: add object_new_propv / object_new_proplist constructors
> >   qom: make enum string tables const-correct
> >   qom: add a object_property_add_enum helper method
> >   qom: don't pass string table to object_get_enum method

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v3 3/7] qom: create objects in two phases

2015-05-12 Thread Daniel P. Berrange
On Fri, May 08, 2015 at 04:40:51PM +0200, Paolo Bonzini wrote:
> 
> 
> On 08/05/2015 16:37, Andreas Färber wrote:
> > Hi,
> > 
> > Can we *please* find a better subject for this? To me, creating QOM
> > objects in two phases is about instance_init vs. realize, and thus I was
> > pretty upset that Paolo dared to apply this without asking me first.
> 
> Oops, sorry.  I very much understand where you came from, now.

Ok, I'll change this to say something like

  "create most objects before creating chardev backends"

to better describe what it's trying to achieve.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v3 7/7] qom: don't pass string table to object_get_enum method

2015-05-12 Thread Daniel P. Berrange
On Fri, May 08, 2015 at 07:54:48PM +0200, Andreas Färber wrote:
> Am 01.05.2015 um 12:30 schrieb Daniel P. Berrange:
> > Now that properties can be explicitly registered as an enum
> > type, there is no need to pass the string table to the
> > object_get_enum method. The object property registration
> > already has a pointer to the string table.
> > 
> > In changing this method signature, the hostmem backend object
> > has to be converted to use the new enum property registration
> > code, which simplifies it somewhat.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  backends/hostmem.c | 22 --
> >  include/qom/object.h   |  4 ++--
> >  numa.c |  2 +-
> >  qom/object.c   | 32 
> >  tests/check-qom-proplist.c | 46 
> > ++
> >  5 files changed, 81 insertions(+), 25 deletions(-)
> > diff --git a/qom/object.c b/qom/object.c
> > index ba0e4b8..6d2a2a9 100644
> > --- a/qom/object.c
> > +++ b/qom/object.c
> > @@ -1026,13 +1026,35 @@ int64_t object_property_get_int(Object *obj, const 
> > char *name,
> >  return retval;
> >  }
> >  
> > +typedef struct EnumProperty {
> > +const char * const *strings;
> > +int (*get)(Object *, Error **);
> > +void (*set)(Object *, int, Error **);
> 
> Since get and set are moved unchanged, I would prefer placing it in the
> final destination in the original patch to avoid churn.

Yep, easy to do.

> > diff --git a/tests/check-qom-proplist.c b/tests/check-qom-proplist.c
> > index de142e3..d5cd38b 100644
> > --- a/tests/check-qom-proplist.c
> > +++ b/tests/check-qom-proplist.c
> > @@ -249,6 +249,51 @@ static void test_dummy_badenum(void)
> >  }
> >  
> >  
> > +
> > +static void test_dummy_getenum(void)
> > +{
> > +Error *err = NULL;
> > +int val;
> > +Object *parent = container_get(object_get_root(),
> > +   "/objects");
> > +DummyObject *dobj = DUMMY_OBJECT(
> > +object_new_propv(TYPE_DUMMY,
> > + parent,
> > + "dummy0",
> > + &err,
> > + "av", "platypus",
> > + NULL));
> > +
> > +g_assert(dobj != NULL);
> > +g_assert(err == NULL);
> > +g_assert(dobj->av == DUMMY_PLATYPUS);
> > +
> > +val = object_property_get_enum(OBJECT(dobj),
> > +   "av",
> > +   "DummyAnimal",
> > +   &err);
> > +g_assert(err == NULL);
> 
> Is there any significant difference between g_assert()'ing on error and
> passing &error_abort?

I didn't know about &error_abort until now :-) I will use that.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v3 6/7] qom: add a object_property_add_enum helper method

2015-05-12 Thread Daniel P. Berrange
On Fri, May 08, 2015 at 07:45:10PM +0200, Andreas Färber wrote:
> Am 01.05.2015 um 12:30 schrieb Daniel P. Berrange:
> 
> Looks good in general. Some minor nits below, and one limitation
> possibly worth mentioning in the second paragraph of the commit message:
> It assumes a 1:1 mapping. I do guess most ones are, but I remember some
> CPUID bits having different names for the same values, for instance.

Worst case, in that edge case, we can simply not use this stricter
enum property type - just carry on with existing custom property

> > diff --git a/include/qom/object.h b/include/qom/object.h
> > index bf76f7a..f6a2a9d 100644
> > --- a/include/qom/object.h
> > +++ b/include/qom/object.h
> > @@ -1271,6 +1271,25 @@ void object_property_add_bool(Object *obj, const 
> > char *name,
> >Error **errp);
> >  
> >  /**
> > + * object_property_add_enum:
> > + * @obj: the object to add a property to
> > + * @name: the name of the property
> > + * @typename: the name of the enum data type
> > + * @get: the getter or NULL if the property is write-only.
> 
> %NULL
> 
> > + * @set: the setter or NULL if the property is read-only
> > + * @errp: if an error occurs, a pointer to an area to store the error
> > + *
> > + * Add an enum property using getters/setters.  This function will add a
> > + * property of type 'enum'.
> 
> This is slightly ambiguous, as I understand it the type we're actually
> using is the one in @typename, not "enum"?

Yeah, you are right - this doc mistake is left over from a previous
version before Paolo asked me to add the @typename parameter.

> > diff --git a/tests/check-qom-proplist.c b/tests/check-qom-proplist.c
> > index 9f16cdb..de142e3 100644
> > --- a/tests/check-qom-proplist.c
> > +++ b/tests/check-qom-proplist.c
> > @@ -32,10 +32,28 @@ typedef struct DummyObjectClass DummyObjectClass;
> >  #define DUMMY_OBJECT(obj)   \
> >  OBJECT_CHECK(DummyObject, (obj), TYPE_DUMMY)
> >  
> > +typedef enum DummyAnimal DummyAnimal;
> > +
> > +enum DummyAnimal {
> > +DUMMY_FROG,
> > +DUMMY_ALLIGATOR,
> > +DUMMY_PLATYPUS,
> > +
> > +DUMMY_LAST,
> > +};
> > +
> > +static const char *const dummyanimalmap[DUMMY_LAST + 1] = {
> 
> dummy_animal_map would be slightly easier to read.

Sure

> > +static void test_dummy_badenum(void)
> > +{
> > +Error *err = NULL;
> > +Object *parent = container_get(object_get_root(),
> > +   "/objects");
> > +DummyObject *dobj = DUMMY_OBJECT(
> > +object_new_propv(TYPE_DUMMY,
> > + parent,
> > + "dummy0",
> > + &err,
> > + "bv", "yes",
> > + "sv", "Hiss hiss hiss",
> > + "av", "yeti",
> > + NULL));
> > +
> > +g_assert(dobj == NULL);
> 
> Superfluous.
> 
> > +g_assert(err != NULL);
> > +g_assert(g_str_equal(error_get_pretty(err),
> > + "Invalid parameter 'yeti'"));
> 
> Same question as in previous test: alternatives?

Yep, will check

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
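
The 1:1 mapping discussed in this message comes from resolving a property string against a string table, with the table index serving as the enum value. Below is a minimal sketch of such a lookup. It is not the QEMU implementation: enum_from_string is a hypothetical helper, the table is assumed to be NULL-terminated, and the "frog" and "alligator" strings are guesses, since only "platypus" (valid) and "yeti" (rejected) appear in the quoted tests.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup helper: return the index of VALUE in a
   NULL-terminated string table, or -1 if it is not present.  */
static int enum_from_string(const char *const *table, const char *value)
{
    for (int i = 0; table[i] != NULL; i++) {
        if (strcmp(table[i], value) == 0) {
            return i;
        }
    }
    return -1;
}

/* Illustrative table; index order follows the DummyAnimal enum.  */
static const char *const dummy_animal_map[] = {
    "frog", "alligator", "platypus", NULL,
};
```

Because the index is the value, two names for the same value cannot be expressed in one table, which is the limitation Andreas raises for the CPUID case.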



Re: [Qemu-devel] [PATCH v3 4/7] qom: add object_new_propv / object_new_proplist constructors

2015-05-12 Thread Daniel P. Berrange
On Fri, May 08, 2015 at 07:10:49PM +0200, Andreas Färber wrote:
> Hi Daniel/Paolo,
> 
> Am 01.05.2015 um 12:30 schrieb Daniel P. Berrange:
> > It is reasonably common to want to create an object, set a
> > number of properties, register it in the hierarchy and then
> > mark it as complete (if a user creatable type). This requires
> > quite a lot of error prone, verbose, boilerplate code to achieve.
> > 
> > The object_new_propv / object_new_proplist constructors will
> > simplify this task by performing all required steps in one go,
> > accepting the property names/values as variadic args.
> 
> With this I disagree. I can see the virtue of adding properties in one
> go via some handy varargs function. But,
> 
> 1) The function does something different from what its name implies to
> me. It does not create a prop or proplist - instead of adding them it
> sets existing ones. Suggest object_new_with_props()?

Sure, with_props() looks fine.

> 2) You seem to mix up *v and non-v functions. v is with va_list usually,
> compare tests/libqtest.h.

Ok, I didn't see that qemu had a convention on that, so will change
to match.
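
For reference, the convention can be sketched roughly as below. This is
only an illustration: count_props() and count_propsv() are made-up names,
not QEMU functions. The plain variant takes '...' and forwards to a 'v'
variant that takes a va_list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical 'v' variant: consumes an already-started va_list of
 * name/value string pairs, terminated by a NULL name. */
static int count_propsv(const char *first_name, va_list ap)
{
    const char *name = first_name;
    int n = 0;

    while (name != NULL) {
        const char *value = va_arg(ap, const char *);

        assert(value != NULL);   /* every name must come with a value */
        n++;                     /* a real helper would apply name=value here */
        name = va_arg(ap, const char *);
    }
    return n;
}

/* Hypothetical non-'v' variant: takes '...' and forwards to the 'v' one. */
static int count_props(const char *first_name, ...)
{
    va_list ap;
    int n;

    va_start(ap, first_name);
    n = count_propsv(first_name, ap);
    va_end(ap);
    return n;
}
```

A caller would terminate the list explicitly, e.g.
count_props("share", "yes", "size", "1048576", (const char *)NULL).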

> 3) Object construction is a tricky thing to get right. Anthony chose to
> be stricter than C++ and not let object_new() fail, one of the reasons
> we have the distinct realize step. Can we keep the two separate? qdev
> with all its convenience helpers didn't mix those either.
> I.e., use object_new() without Error** followed by object_set_props() or
> anything with Error**. That tells you if there's an Error* you need to
> unref the object. Otherwise it's in an unknown state.

I don't really think that forcing the callers to call new + set_props
separately is really making it more reliable - in fact the contrary - it
means that the callers have more complex boilerplate code which they all
have to tediously duplicate in exactly the same way. With the single
object_new_with_props call, you know that if it returns NULL then it
failed and you have no cleanup that you need to do, which is about as
reliable as it gets.
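
To illustrate the "NULL means failed, nothing to clean up" contract, here
is a toy sketch. It does not use QOM at all: ToyObject, toy_set_prop() and
toy_new_with_props() are hypothetical stand-ins, and the two properties are
invented for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an object with two string-settable properties. */
typedef struct ToyObject {
    int share;   /* boolean property "share": "yes" or "no" */
    long size;   /* integer property "size": decimal string */
} ToyObject;

/* Parse and apply one property; 0 on success, -1 on a bad name or value. */
static int toy_set_prop(ToyObject *obj, const char *name, const char *value)
{
    if (strcmp(name, "share") == 0) {
        if (strcmp(value, "yes") == 0) {
            obj->share = 1;
            return 0;
        }
        if (strcmp(value, "no") == 0) {
            obj->share = 0;
            return 0;
        }
        return -1;
    }
    if (strcmp(name, "size") == 0) {
        char *end;
        long v = strtol(value, &end, 10);

        if (*value == '\0' || *end != '\0') {
            return -1;
        }
        obj->size = v;
        return 0;
    }
    return -1;
}

/* Allocate, apply all name/value pairs, and on any failure free the
 * object and return NULL so the caller has nothing left to clean up. */
static ToyObject *toy_new_with_props(const char *first_name, ...)
{
    ToyObject *obj = calloc(1, sizeof(*obj));
    const char *name;
    va_list ap;

    if (obj == NULL) {
        return NULL;
    }
    va_start(ap, first_name);
    for (name = first_name; name != NULL; name = va_arg(ap, const char *)) {
        const char *value = va_arg(ap, const char *);

        assert(value != NULL);
        if (toy_set_prop(obj, name, value) < 0) {
            free(obj);
            obj = NULL;
            break;
        }
    }
    va_end(ap);
    return obj;
}
```

The caller only has to test the return value; there is no half-built
object to unref on the error path.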

That said, I can see the value in having a standalone object_set_props()
method as a general feature. So I will add that, and simply make the
object_new_with_props method call object_new + object_set_props + 

> 4) What's the use case for this? I'm concerned about encouraging people
> to hardcode properties like this, when doing it in C can let the
> compiler detect any mismatches.

I use it in the VNC server when I convert it over to the generic TLS
encryption code, i.e. the QCryptoTLSCreds object - it reduced a 100+ line
method into just two calls to object_new_propv. See vnc_display_create_creds()
in this RFC patch:

  https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg02062.html


Then, I've got a bunch of unit tests related to that series which are
using it, again to reduce the amount of code it takes to create and
set props on this TLS creds object.

> > 
> > Usage would be:
> > 
> >Error *err = NULL;
> >Object *obj;
> >obj = object_new_propv(TYPE_MEMORY_BACKEND_FILE,
> >   "/objects",
> 
> This is not an Object*. ;) I like it better as it's implemented below,
> but cf. above for mixing this Error**-ing operation with object_new().

Yep, that's a docs mistake.

> 
> >   "hostmem0",
> >   &err,
> >   "share", "yes",
> >   "mem-path", "/dev/shm/somefile",
> >   "prealloc", "yes",
> >   "size", "1048576",
> >   NULL);
> > 
> > Note all property values are passed in string form and will
> > be parsed into their required data types.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  include/qom/object.h   |  67 
> >  qom/object.c   |  66 
> >  tests/.gitignore   |   1 +
> >  tests/Makefile |   5 +-
> >  tests/check-qom-proplist.c | 190 
> > +
> >  5 files changed, 328 insertions(+), 1 deletion(-)
> >  create mode 100644 tests/check-qom-proplist.c
> > 
> > diff --git a/include/qom/object.h b/include/qom/object.h
> > index d2d7748..15ac314 100644
> > --- a/include/qom/object.h
> > +++ b/include/qom/object.h
> > @@ -607,6 +607,73 @@ Object *object_new(const char *typename);
> >  Object *object_new_with_type(Type type);
> >  
> >  /**
> > + * object_new_propv:
> > + * @typename:  The name of the type of the object to instantiate.
> > + * @parent: the parent object
> > + * @id: The unique ID of the object
> > + * @errp: pointer to error object
> > + * @...: list of property names and values
> > + *
> > + * This function with initialize a new object using heap allocated memory.
> 
> Grammar. ("will"?)
> 
> > + * The returned object has a reference count of 1, and will be freed when
> > + * the last reference is dropped.
> > + *
> > + * The @id parameter will be used 

Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 17:15:53 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:08:02 +0200
> > > Greg Kurz  wrote:
> > > 
> > > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > > done through a vhost ring ioctl.
> > > > 
> > > > Signed-off-by: Greg Kurz 
> > > > ---
> > > >  hw/virtio/vhost.c |   50 
> > > > +-
> > > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > index 54851b7..1d7b939 100644
> > > > --- a/hw/virtio/vhost.c
> > > > +++ b/hw/virtio/vhost.c
> > > (...)
> > > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev 
> > > > *dev,
> > > >  return -errno;
> > > >  }
> > > > 
> > > > +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > > 
> > > I think this should either go in after the virtio-1 base support (more
> > > feature bits etc.) or get a big fat comment and be touched up later.
> > > I'd prefer the first solution so it does not get forgotten, but I'm not
> > > sure when Michael plans to proceed with the virtio-1 patches (I think
> > > they're mostly fine already).
> > 
> > There are three main issues with virtio 1 patches that I am aware of.
> > 
> > One issue with virtio 1 patches as they are is with how features are
> > handled ATM.  There are 3 types of features
> > 
> > a. virtio 1 only features
> > b. virtio 0 only features
> > c. shared features
> > 
> > and 3 types of devices
> > a. legacy device: has b+c features
> > b. modern device: has a+c features
> > c. transitional device: has a+c features but exposes
> >only c through the legacy interface
> 
> Wouldn't a transitional device be able to expose b as well?

No because the virtio 1 spec says it shouldn't.


> > 
> > 
> > So I think a callback that gets features depending on guest
> > version isn't a good way to model it because fundamentally device
> > has one set of features.
> > A better way to model this is really just a single
> > host_features bitmask, and for transitional devices, a mask
> > hiding a features - which are so far all bits > 31, so maybe
> > for now we can just have a global mask.
> 
> How would this work for transitional presenting a modern device - would
> you have a superset of bits and masks for legacy and modern?

Basically we expose through modern interface a superset
of bits exposed through legacy.
F_BAD for pci is probably the only exception.
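
A rough sketch of that model, assuming (as stated earlier in the thread)
that all virtio-1-only bits are currently > 31. The names below are made
up for illustration, and the F_BAD exception for pci is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: every virtio-1-only feature bit is above bit 31, so a
 * single global mask is enough to hide them from legacy guests. */
#define LEGACY_FEATURE_MASK UINT64_C(0x00000000FFFFFFFF)

/* One host_features bitmask per device; the legacy interface of a
 * transitional device sees it through the mask, the modern one sees
 * the full superset. */
static uint64_t features_for_interface(uint64_t host_features, bool modern)
{
    return modern ? host_features : (host_features & LEGACY_FEATURE_MASK);
}
```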


> > 
> > We need to validate features at initialization time and make
> > sure they make sense, fail if not (sometimes we need to mask
> > features if they don't make sense - this is unfortunate
> > but might be needed for compatibility).
> > 
> > Moving host_features to virtio core would make all of the above
> > easier.
> 
> I have started hacking up code that moves host_features, but I'm quite
> lost with all the different virtio versions floating around. Currently
> trying against master, but that of course ignores the virtio-1 issues.

Yes, I think we should focus on infrastructure cleanups in master first.

> > 
> > 
> > Second issue is migration, some of it is with migrating the new
> > features, so that's tied to the first one.
> 
> There's also the used and avail addresses, but that kind of follows
> from virtio-1 support.
> 
> > 
> > 
> > Third issue is fixing devices so they don't try to
> > access guest memory until DRIVER_OK is set.
> > This is surprisingly hard to do generally given need to support old
> > drivers which don't set DRIVER_OK or set it very late, and the fact that
> > > we tied work-arounds for even older drivers which don't set pci bus
> > master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> > just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> > if not there.
> 
> If legacy survived like it is until now, it might be best to focus on
> modern devices for this.

I'm kind of unhappy that it's up to guest though as that controls
whether we run in modern mode. But yea.

-- 
MST



Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Cornelia Huck
On Tue, 12 May 2015 17:30:21 +0200
"Michael S. Tsirkin"  wrote:

> On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> > On Tue, 12 May 2015 15:44:46 +0200
> > Cornelia Huck  wrote:
> > 
> > > On Tue, 12 May 2015 15:34:47 +0200
> > > "Michael S. Tsirkin"  wrote:
> > > 
> > > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > > Greg Kurz  wrote:
> > > > > 
> > > > > > Unlike with add and clear, there is no valid reason to abort when 
> > > > > > checking
> > > > > > for a feature. It makes more sense to return false (i.e. the 
> > > > > > feature bit
> > > > > > isn't set). This is exactly what __virtio_has_feature() does if 
> > > > > > fbit >= 32.
> > > > > > 
> > > > > > This allows to introduce code that is aware about new 64-bit 
> > > > > > features like
> > > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > > 
> > > > > > Signed-off-by: Greg Kurz 
> > > > > > ---
> > > > > >  include/hw/virtio/virtio.h |1 -
> > > > > >  1 file changed, 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > > index d95f8b6..6ef70f1 100644
> > > > > > --- a/include/hw/virtio/virtio.h
> > > > > > +++ b/include/hw/virtio/virtio.h
> > > > > > @@ -233,7 +233,6 @@ static inline void 
> > > > > > virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > > 
> > > > > >  static inline bool __virtio_has_feature(uint32_t features, 
> > > > > > unsigned int fbit)
> > > > > >  {
> > > > > > -assert(fbit < 32);
> > > > > >  return !!(features & (1 << fbit));
> > > > > >  }
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > I must say I'm not very comfortable with knowingly passing out-of-range
> > > > > values to this function.
> > > > > 
> > > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > > prior to your patchset, if the remainder of the virtio-1 patchset 
> > > > > still
> > > > > takes some time?
> > > > 
> > > > So the feature-bit-size extending patches currently don't support
> > > > migration correctly, that's why they are not merged.
> > > > 
> > > > What I think we need to do for this is move host_features out
> > > > from transports into core virtio device.
> > > > 
> > > > Then we can simply check host features >31 and skip
> > > > migrating low guest features if none set.
> > > > 
> > > > Thoughts? Any takers?
> > > > 
> > > 
> > > After we move host_features, put them into an optional vmstate
> > > subsection?
> > > 
> > > I think with the recent patchsets, most of the interesting stuff is
> > > already not handled by the transport anymore. There's only
> > > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > > ccw).
> 
> notify on empty is likely safe to set for everyone.
> 
> bad feature should be pci specific, it's a mistake that
> we have it in ccw. it's there to detect very old buggy guests.
> in fact ccw ignores this bit completely.
> 
> For PCI, I think VIRTIO_F_BAD_FEATURE is never
> actually set in guest features. If guest attempts to set it,
> it is immediately cleared.
> 
> So it can be handled in pci specific code, and won't
> affect migration.
> 
> 
> > Thinking a bit more, we probably don't need this move of host_features
> > to get migration right (although it might be a nice cleanup later).
> > 
> > Could we
> > - keep migration of bits 0..31 as-is
> > - add a vmstate subsection for bits 32..63 only included if one of
> >   those bits is set
> > - have a post handler that performs a validation of the full set of
> >   bits 0..63
> > ?
> > 
> > We could do a similar exercise with a subsection containing the
> > addresses for avail and used with a post handler overwriting any
> > addresses set by the old style migration code.
> > 
> > Does that make sense?
> 
> I don't see how it does: on the receive side you don't know
> whether guest acked bits 32..63 so you can't decide whether
> to parse bits 32..63.

But if it wasn't set, it obviously wasn't acked, I'd think?

> 
> The right thing to do IMHO is to migrate the high guest bits if and only
> if the *host* bits 32..63 are set.  And that needs the host features in
> core, or at least is easier if they are there.

Aren't the host bits a prereq? Confused. I'll think about that tomorrow
when it's hopefully a bit cooler around here :)
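
For what it's worth, the two points discussed above (not shifting by an
out-of-range bit number, and keying the extra migration data off the host
bits) could be sketched like this. Both helpers are hypothetical and assume
a 64-bit feature word, which is not what the current code has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Range-checked feature test on a 64-bit feature word. Out-of-range
 * bit numbers report "not set" instead of shifting by too much. */
static bool has_feature64(uint64_t features, unsigned int fbit)
{
    if (fbit >= 64) {
        return false;
    }
    return (features >> fbit) & 1;
}

/* Decide whether guest bits 32..63 need to be migrated. Assumption:
 * source and destination agree on host_features, so both sides reach
 * the same answer without knowing what the guest acked. */
static bool high_features_needed(uint64_t host_features)
{
    return (host_features >> 32) != 0;
}
```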




[Qemu-devel] [Bug 1297218] Re: guest hangs after live migration due to tsc jump

2015-05-12 Thread Serge Hallyn
I'm sorry, but I'm not clear at this point on the status of this bug.  I
never received an answer to comments #32 and #35, and don't know
what, if anything, to apply in an SRU.

-- 
https://bugs.launchpad.net/bugs/1297218

Title:
  guest hangs after live migration due to tsc jump

Status in QEMU:
  New
Status in glusterfs package in Ubuntu:
  Invalid
Status in qemu package in Ubuntu:
  Triaged
Status in glusterfs source package in Trusty:
  Confirmed
Status in qemu source package in Trusty:
  Confirmed

Bug description:
  We have two identical Ubuntu servers running libvirt/kvm/qemu, sharing
  a Gluster filesystem. Guests can be live migrated between them.
  However, live migration often leads to the guest being stuck at 100%
  for a while. In that case, the dmesg output for such a guest will show
  (once it recovers): Clocksource tsc unstable (delta = 662463064082
  ns). In this particular example, a guest was migrated and only after
  11 minutes (662 seconds) did it become responsive again.

  It seems that newly booted guests do not suffer from this problem;
  these can be migrated back and forth at will. After a day or so, the
  problem becomes apparent. It also seems that migrating from server A
  to server B causes much more problems than going from B back to A. If
  necessary, I can do more measurements to qualify these observations.

  The VM servers run Ubuntu 13.04 with these packages:
  Kernel: 3.8.0-35-generic x86_64
  Libvirt: 1.0.2
  Qemu: 1.4.0
  Gluster-fs: 3.4.2 (libvirt accesses the images via the filesystem, not using
  libgfapi yet as the Ubuntu libvirt is not linked against libgfapi).
  The interconnect between both machines (both for migration and gluster) is
  10GbE.
  Both servers are synced to NTP and well within 1ms from one another.

  Guests are either Ubuntu 13.04 or 13.10.

  On the guests, the current_clocksource is kvm-clock.
  The XML definition of the guests only contains:   

  Now as far as I've read in the documentation of kvm-clock, it specifically
  supports live migrations, so I'm a bit surprised at these problems. There
  isn't all that much information to find on these issues, although I have
  found postings by others that seem to have run into the same issues, but
  without a solution.
  --- 
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  DistroRelease: Ubuntu 14.04
  Package: libvirt (not installed)
  ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic 
root=UUID=1b0c3c6d-a9b8-4e84-b076-117ae267d178 ro console=ttyS1,115200n8 
BOOTIF=01-00-25-90-75-b5-c8
  ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
  Tags:  trusty apparmor apparmor apparmor apparmor apparmor
  Uname: Linux 3.13.0-24-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  modified.conffile..etc.default.libvirt.bin: [modified]
  modified.conffile..etc.libvirt.libvirtd.conf: [modified]
  modified.conffile..etc.libvirt.qemu.conf: [modified]
  modified.conffile..etc.libvirt.qemu.networks.default.xml: [deleted]
  mtime.conffile..etc.default.libvirt.bin: 2014-05-12T19:07:40.020662
  mtime.conffile..etc.libvirt.libvirtd.conf: 2014-05-13T14:40:25.894837
  mtime.conffile..etc.libvirt.qemu.conf: 2014-05-12T18:58:27.885506




Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Cornelia Huck
On Tue, 12 May 2015 17:15:53 +0200
"Michael S. Tsirkin"  wrote:

> On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:08:02 +0200
> > Greg Kurz  wrote:
> > 
> > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > done through a vhost ring ioctl.
> > > 
> > > Signed-off-by: Greg Kurz 
> > > ---
> > >  hw/virtio/vhost.c |   50 
> > > +-
> > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 54851b7..1d7b939 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > (...)
> > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev 
> > > *dev,
> > >  return -errno;
> > >  }
> > > 
> > > +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > 
> > I think this should either go in after the virtio-1 base support (more
> > feature bits etc.) or get a big fat comment and be touched up later.
> > I'd prefer the first solution so it does not get forgotten, but I'm not
> > sure when Michael plans to proceed with the virtio-1 patches (I think
> > they're mostly fine already).
> 
> There are three main issues with virtio 1 patches that I am aware of.
> 
> One issue with virtio 1 patches as they are is with how features are
> handled ATM.  There are 3 types of features
> 
>   a. virtio 1 only features
>   b. virtio 0 only features
>   c. shared features
> 
> and 3 types of devices
>   a. legacy device: has b+c features
>   b. modern device: has a+c features
>   c. transitional device: has a+c features but exposes
>  only c through the legacy interface

Wouldn't a transitional device be able to expose b as well?

> 
> 
> So I think a callback that gets features depending on guest
> version isn't a good way to model it because fundamentally device
> has one set of features.
> A better way to model this is really just a single
> host_features bitmask, and for transitional devices, a mask
> hiding a features - which are so far all bits > 31, so maybe
> for now we can just have a global mask.

How would this work for transitional presenting a modern device - would
you have a superset of bits and masks for legacy and modern?

> 
> We need to validate features at initialization time and make
> sure they make sense, fail if not (sometimes we need to mask
> features if they don't make sense - this is unfortunate
> but might be needed for compatibility).
> 
> Moving host_features to virtio core would make all of the above
> easier.

I have started hacking up code that moves host_features, but I'm quite
lost with all the different virtio versions floating around. Currently
trying against master, but that of course ignores the virtio-1 issues.

> 
> 
> Second issue is migration, some of it is with migrating the new
> features, so that's tied to the first one.

There's also the used and avail addresses, but that kind of follows
from virtio-1 support.

> 
> 
> Third issue is fixing devices so they don't try to
> access guest memory until DRIVER_OK is set.
> This is surprisingly hard to do generally given need to support old
> drivers which don't set DRIVER_OK or set it very late, and the fact that
> we tied work-arounds for even older drivers which don't set pci bus
> master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> if not there.

If legacy survived like it is until now, it might be best to focus on
modern devices for this.




Re: [Qemu-devel] Bug report - Windows XP guest failure

2015-05-12 Thread John Snow


On 05/12/2015 03:22 AM, Michael Tokarev wrote:
> 12.05.2015 04:05, Peter Crosthwaite wrote:
>> On Thu, May 7, 2015 at 2:34 AM, Michael Tokarev  wrote:
> ...
>>>> Ok, I can reproduce this, winXP BSODs on boot in tcg mode.
>>>> Git bisect points to this:
>>>>
>>>> commit 23820dbfc79d1c9dce090b4c555994f2bb6a69b3
>>>> Author: Peter Crosthwaite 
>>>> Date:   Mon Mar 16 22:35:54 2015 -0700
>>>>
>>>> exec: Respect as_translate_internal length clamp
>>>
>>> This winXP BSOD happens on x86_64 target too.  Reverting the
>>> above commit from git master fixes the BSOD.
>>
>> Any useful info about IO addresses on that BSOD? The last issue with
>> this patch was IOPort code relying on the bug that this patch fixed.
>> This could be similar and if we can track the failure to a particular
>> address we can fix properly rather than another revert of that patch.
> 
> Oh.  I didn't know this patch has been reverted before.  Anyway, I disabled
> auto-reboot on BSOD on my winXP (what a "useful" feature!) and here's what
> I see.
> 
>   IRQ_NOT_LESS_OR_EQUAL
>   STOP: 0x0A (0x16, 0x02, 0x00, 0x80500EFC)
> 
> (with some amount of leading zeros stripped).
> 
> When this happens, win does something for quite some time, the BSOD comes
> after quite significant delay.
> 
> Is there anything else I can look at, maybe some crash dump or something?
> I haven't done any windows debugging before.
> 
> Thanks,
> 
> /mjt
> 

https://support.microsoft.com/en-us/kb/315263

You can configure the type of dump it saves, then use various MS
utilities described here (briefly) to perform some basic analysis on the
dumps, which sometimes gives extra goodies.

I haven't done too much advanced windows debugging myself, but I do
generally try to run the !analyze command on any minidumps I create, at
least.

--js



[Qemu-devel] [PATCH 2/5] util: move read_password method out of qemu-img into osdep/oslib

2015-05-12 Thread Daniel P. Berrange
The qemu-img.c file has a read_password() method impl that is
used to prompt for passwords on the console, with impls for
POSIX and Windows. This will be needed by qemu-io.c too, so
move it into the QEMU osdep/oslib files where it can be shared
without code duplication

Signed-off-by: Daniel P. Berrange 
---
 include/qemu/osdep.h |  2 ++
 qemu-img.c   | 93 +---
 util/oslib-posix.c   | 66 +
 util/oslib-win32.c   | 24 ++
 4 files changed, 93 insertions(+), 92 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index b3300cc..3247364 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -259,4 +259,6 @@ void qemu_set_tty_echo(int fd, bool echo);
 
 void os_mem_prealloc(int fd, char *area, size_t sz);
 
+int qemu_read_password(char *buf, int buf_size);
+
 #endif
diff --git a/qemu-img.c b/qemu-img.c
index 8d30e43..60c820d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -165,97 +165,6 @@ static int GCC_FMT_ATTR(2, 3) qprintf(bool quiet, const 
char *fmt, ...)
 return ret;
 }
 
-#if defined(WIN32)
-/* XXX: put correct support for win32 */
-static int read_password(char *buf, int buf_size)
-{
-int c, i;
-
-printf("Password: ");
-fflush(stdout);
-i = 0;
-for(;;) {
-c = getchar();
-if (c < 0) {
-buf[i] = '\0';
-return -1;
-} else if (c == '\n') {
-break;
-} else if (i < (buf_size - 1)) {
-buf[i++] = c;
-}
-}
-buf[i] = '\0';
-return 0;
-}
-
-#else
-
-#include 
-
-static struct termios oldtty;
-
-static void term_exit(void)
-{
-tcsetattr (0, TCSANOW, &oldtty);
-}
-
-static void term_init(void)
-{
-struct termios tty;
-
-tcgetattr (0, &tty);
-oldtty = tty;
-
-tty.c_iflag &= ~(IGNBRK|BRKINT|PARMRK|ISTRIP
-  |INLCR|IGNCR|ICRNL|IXON);
-tty.c_oflag |= OPOST;
-tty.c_lflag &= ~(ECHO|ECHONL|ICANON|IEXTEN);
-tty.c_cflag &= ~(CSIZE|PARENB);
-tty.c_cflag |= CS8;
-tty.c_cc[VMIN] = 1;
-tty.c_cc[VTIME] = 0;
-
-tcsetattr (0, TCSANOW, &tty);
-
-atexit(term_exit);
-}
-
-static int read_password(char *buf, int buf_size)
-{
-uint8_t ch;
-int i, ret;
-
-printf("password: ");
-fflush(stdout);
-term_init();
-i = 0;
-for(;;) {
-ret = read(0, &ch, 1);
-if (ret == -1) {
-if (errno == EAGAIN || errno == EINTR) {
-continue;
-} else {
-break;
-}
-} else if (ret == 0) {
-ret = -1;
-break;
-} else {
-if (ch == '\r') {
-ret = 0;
-break;
-}
-if (i < (buf_size - 1))
-buf[i++] = ch;
-}
-}
-term_exit();
-buf[i] = '\0';
-printf("\n");
-return ret;
-}
-#endif
 
 static int print_block_option_help(const char *filename, const char *fmt)
 {
@@ -312,7 +221,7 @@ static BlockBackend *img_open(const char *id, const char 
*filename,
 bs = blk_bs(blk);
 if (bdrv_is_encrypted(bs) && require_io) {
 qprintf(quiet, "Disk image '%s' is encrypted.\n", filename);
-if (read_password(password, sizeof(password)) < 0) {
+if (qemu_read_password(password, sizeof(password)) < 0) {
 error_report("No password given");
 goto fail;
 }
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 37ffd96..1c23fd2 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -50,6 +50,7 @@ extern int daemon(int, int);
 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -415,3 +416,68 @@ void os_mem_prealloc(int fd, char *area, size_t memory)
 pthread_sigmask(SIG_SETMASK, &oldset, NULL);
 }
 }
+
+
+static struct termios oldtty;
+
+static void term_exit(void)
+{
+tcsetattr(0, TCSANOW, &oldtty);
+}
+
+static void term_init(void)
+{
+struct termios tty;
+
+tcgetattr(0, &tty);
+oldtty = tty;
+
+tty.c_iflag &= ~(IGNBRK|BRKINT|PARMRK|ISTRIP
+  |INLCR|IGNCR|ICRNL|IXON);
+tty.c_oflag |= OPOST;
+tty.c_lflag &= ~(ECHO|ECHONL|ICANON|IEXTEN);
+tty.c_cflag &= ~(CSIZE|PARENB);
+tty.c_cflag |= CS8;
+tty.c_cc[VMIN] = 1;
+tty.c_cc[VTIME] = 0;
+
+tcsetattr(0, TCSANOW, &tty);
+
+atexit(term_exit);
+}
+
+int qemu_read_password(char *buf, int buf_size)
+{
+uint8_t ch;
+int i, ret;
+
+printf("password: ");
+fflush(stdout);
+term_init();
+i = 0;
+for (;;) {
+ret = read(0, &ch, 1);
+if (ret == -1) {
+if (errno == EAGAIN || errno == EINTR) {
+continue;
+} else {
+break;
+}
+} else if (ret == 0) {
+ret = -1;
+break;
+} else {
+if (ch == '\r') {
+ret = 0;
+break;
+}
+

[Qemu-devel] [PATCH 3/5] util: allow \n to terminate password input

2015-05-12 Thread Daniel P. Berrange
The qemu_read_password() method looks for \r to terminate the
reading of a password. This is what will be seen when
reading the password from a TTY. When scripting though, it is
useful to be able to send the password via a pipe, in which
case we must look for \n to terminate password input.

Signed-off-by: Daniel P. Berrange 
---
 util/oslib-posix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 1c23fd2..3ae4987 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -467,7 +467,8 @@ int qemu_read_password(char *buf, int buf_size)
 ret = -1;
 break;
 } else {
-if (ch == '\r') {
+if (ch == '\r' ||
+ch == '\n') {
 ret = 0;
 break;
 }
-- 
2.1.0




[Qemu-devel] [PATCH 1/5] qcow2/qcow: protect against uninitialized encryption key

2015-05-12 Thread Daniel P. Berrange
When a qcow[2] file is opened, if the header reports an
encryption method, this is used to set the 'crypt_method_header'
field on the BDRVQcow[2]State struct, and the 'encrypted' flag
in the BDRVState struct.

When doing I/O operations, the 'crypt_method' field on the
BDRVQcow[2]State struct is checked to determine if encryption
needs to be applied.

The crypt_method_header value is copied into crypt_method when
the bdrv_set_key() method is called.

The QEMU code which opens a block device is expected to always
do a check

   if (bdrv_is_encrypted(bs)) {
   bdrv_set_key(bs, key...);
   }

If code forgets to do this, then 'crypt_method' is never set
and so when I/O is performed, QEMU writes plain text data
into a sector which is expected to contain cipher text, or
when reading, will return cipher text instead of plain
text.

Change the qcow[2] code to consult bs->encrypted when deciding
whether encryption is required, and assert(s->crypt_method)
to protect against cases where the caller forgets to set the
encryption key.

Also put an assert in the set_key methods to protect against
the case where the caller sets an encryption key on a block
device that does not have encryption
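
The intent of the two asserts can be sketched with a toy model. The
ToyImage struct and helpers below are hypothetical simplifications of the
BDRVState / BDRVQcow[2]State fields described above, not the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the relevant state. */
typedef struct ToyImage {
    bool encrypted;            /* set at open time from the header */
    int crypt_method_header;   /* method reported by the header, 0 = none */
    int crypt_method;          /* stays 0 until a key has been set */
} ToyImage;

/* Setting a key on an image without encryption is a caller bug. */
static void toy_set_key(ToyImage *img)
{
    assert(img->encrypted);
    img->crypt_method = img->crypt_method_header;
}

/* I/O path: decide from 'encrypted', then insist the key was set,
 * rather than silently treating a missing key as "no encryption". */
static bool toy_io_needs_crypto(const ToyImage *img)
{
    if (img->encrypted) {
        assert(img->crypt_method);
        return true;
    }
    return false;
}
```

With this shape, an encrypted image whose key was never set aborts in
the I/O path instead of reading or writing the wrong kind of data.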

Signed-off-by: Daniel P. Berrange 
---
 block/qcow.c  | 10 +++---
 block/qcow2-cluster.c |  3 ++-
 block/qcow2.c | 18 --
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index ab89328..911e59f 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -269,6 +269,7 @@ static int qcow_set_key(BlockDriverState *bs, const char 
*key)
 for(i = 0;i < len;i++) {
 keybuf[i] = key[i];
 }
+assert(bs->encrypted);
 s->crypt_method = s->crypt_method_header;
 
 if (AES_set_encrypt_key(keybuf, 128, &s->aes_encrypt_key) != 0)
@@ -411,9 +412,10 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
 bdrv_truncate(bs->file, cluster_offset + s->cluster_size);
 /* if encrypted, we must initialize the cluster
content which won't be written */
-if (s->crypt_method &&
+if (bs->encrypted &&
 (n_end - n_start) < s->cluster_sectors) {
 uint64_t start_sect;
+assert(s->crypt_method);
 start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
 memset(s->cluster_data + 512, 0x00, 512);
 for(i = 0; i < s->cluster_sectors; i++) {
@@ -590,7 +592,8 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 if (ret < 0) {
 break;
 }
-if (s->crypt_method) {
+if (bs->encrypted) {
+assert(s->crypt_method);
 encrypt_sectors(s, sector_num, buf, buf,
 n, 0,
 &s->aes_decrypt_key);
@@ -661,7 +664,8 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 ret = -EIO;
 break;
 }
-if (s->crypt_method) {
+if (bs->encrypted) {
+assert(s->crypt_method);
 if (!cluster_data) {
 cluster_data = g_malloc0(s->cluster_size);
 }
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index ed2b44d..2dd 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -403,7 +403,8 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
 goto out;
 }
 
-if (s->crypt_method) {
+if (bs->encrypted) {
+assert(s->crypt_method);
 qcow2_encrypt_sectors(s, start_sect + n_start,
 iov.iov_base, iov.iov_base, n, 1,
 &s->aes_encrypt_key);
diff --git a/block/qcow2.c b/block/qcow2.c
index b9a72e3..f7b4cc6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1037,6 +1037,7 @@ static int qcow2_set_key(BlockDriverState *bs, const char 
*key)
 for(i = 0;i < len;i++) {
 keybuf[i] = key[i];
 }
+assert(bs->encrypted);
 s->crypt_method = s->crypt_method_header;
 
 if (AES_set_encrypt_key(keybuf, 128, &s->aes_encrypt_key) != 0)
@@ -1224,7 +1225,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState 
*bs, int64_t sector_num,
 goto fail;
 }
 
-if (s->crypt_method) {
+if (bs->encrypted) {
+assert(s->crypt_method);
+
 /*
  * For encrypted images, read everything into a temporary
  * contiguous buffer on which the AES functions can work.
@@ -1255,7 +1258,8 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState 
*bs, int64_t sector_num,
 if (ret < 0) {
 goto fail;
 }
-if (s->crypt_method) {
+if (bs->encrypted) {
+assert(s->crypt_method);
 qcow2_encrypt_sectors(s, sect

[Qemu-devel] [PATCH 4/5] qemu-io: prompt for encryption keys when required

2015-05-12 Thread Daniel P. Berrange
The qemu-io tool does not check if the image is encrypted so
historically would silently corrupt the sectors by writing
plain text data into them instead of cipher text. The earlier
commit turns this mistake into a fatal abort, so check for
encryption and prompt for key when required.

This enables us to add unit tests to ensure we don't break
the ability of qemu-img to convert existing encrypted qcow2
files into a non-encrypted format.

Signed-off-by: Daniel P. Berrange 
---
 qemu-io.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/qemu-io.c b/qemu-io.c
index 8e41080..34ae933 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -52,6 +52,7 @@ static const cmdinfo_t close_cmd = {
 static int openfile(char *name, int flags, QDict *opts)
 {
 Error *local_err = NULL;
+BlockDriverState *bs;
 
 if (qemuio_blk) {
 fprintf(stderr, "file open already, try 'help close'\n");
@@ -68,7 +69,27 @@ static int openfile(char *name, int flags, QDict *opts)
 return 1;
 }
 
+bs = blk_bs(qemuio_blk);
+if (bdrv_is_encrypted(bs)) {
+char password[256];
+printf("Disk image '%s' is encrypted.\n", name);
+if (qemu_read_password(password, sizeof(password)) < 0) {
+error_report("No password given");
+goto error;
+}
+if (bdrv_set_key(bs, password) < 0) {
+error_report("invalid password");
+goto error;
+}
+}
+
+
 return 0;
+
+ error:
+blk_unref(qemuio_blk);
+qemuio_blk = NULL;
+return 1;
 }
 
 static void open_help(void)
-- 
2.1.0




[Qemu-devel] [PATCH 0/5] Misc fixes and testing of qcow[2] encryption

2015-05-12 Thread Daniel P. Berrange
I realize that qcow[2] encryption is a feature we have deprecated
and will remove support for running it with the QEMU system
emulators in this cycle. We do still need to make sure it continues
to work for the sake of letting people run qemu-img convert to
retrieve their data though.

Some of the other patches I'm working on which introduce a crypto
cipher API touch this qcow2 code, thus I wanted to be able to test
that it doesn't break anything.

I found that qemu-iotests didn't have any coverage of the qcow2
encryption code. For added fun, I then discovered that qemu-io
doesn't check if an encryption key is required, so ends up
writing plain text to the files instead of cipher, and returning
cipher text for reads, instead of plain text. IOW qemu-io will
corrupt encrypted qcow2 files on write.

This series adds some asserts that will protect against this kind
of mistake, adds support for getting passwords to qemu-io (in the
same manner that qemu-img supports), and finally adds a test case
for reading/writing encrypted qcow2.

Daniel P. Berrange (5):
  qcow2/qcow: protect against uninitialized encryption key
  util: move read_password method out of qemu-img into osdep/oslib
  util: allow \n to terminate password input
  qemu-io: prompt for encryption keys when required
  tests: add test case for encrypted qcow2 read/write

 block/qcow.c   | 10 +++--
 block/qcow2-cluster.c  |  3 +-
 block/qcow2.c  | 18 ++---
 include/qemu/osdep.h   |  2 +
 qemu-img.c | 93 +-
 qemu-io.c  | 21 +++
 tests/qemu-iotests/131 | 69 ++
 tests/qemu-iotests/131.out | 46 +++
 tests/qemu-iotests/group   |  1 +
 util/oslib-posix.c | 67 +
 util/oslib-win32.c | 24 
 11 files changed, 252 insertions(+), 102 deletions(-)
 create mode 100755 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

-- 
2.1.0




[Qemu-devel] [PATCH 5/5] tests: add test case for encrypted qcow2 read/write

2015-05-12 Thread Daniel P. Berrange
Add a simple test case for qemu-iotests that covers read/write
with encrypted qcow2 files.

Signed-off-by: Daniel P. Berrange 
---
 tests/qemu-iotests/131 | 69 ++
 tests/qemu-iotests/131.out | 46 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 116 insertions(+)
 create mode 100755 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
new file mode 100755
index 000..f44b0a0
--- /dev/null
+++ b/tests/qemu-iotests/131
@@ -0,0 +1,69 @@
+#!/bin/bash
+#
+# Test encrypted read/write using plain bdrv_read/bdrv_write
+#
+# Copyright (C) 2009 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=berra...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+
+size=128M
+IMGOPTS="encryption=on" _make_test_img $size
+
+echo
+echo "== reading whole image =="
+echo "astrochicken" | $QEMU_IO -c "read 0 $size" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "== rewriting whole image =="
+echo "astrochicken" | $QEMU_IO -c "write -P 0xa 0 $size" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "== verify pattern =="
+echo "astrochicken" | $QEMU_IO -c "read -P 0xa 0 $size" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "== verify pattern failure with wrong password =="
+echo "platypus" | $QEMU_IO -c "read -P 0xa 0 $size" "$TEST_IMG" | _filter_qemu_io
+
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
new file mode 100644
index 000..4eedb35
--- /dev/null
+++ b/tests/qemu-iotests/131.out
@@ -0,0 +1,46 @@
+QA output created by 131
+qemu-img: Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+qemu-img: Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on
+
+== reading whole image ==
+Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+Disk image '/home/berrange/src/virt/qemu/tests/qemu-iotests/scratch/t.qcow2' is encrypted.
+password:
+read 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== rewriting whole image ==
+Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+Disk image '/home/berrange/src/virt/qemu/tests/qemu-iotests/scratch/t.qcow2' is encrypted.
+password:
+wrote 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verify pattern ==
+Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+Disk image '/home/berrange/src/virt/qemu/tests/qemu-iotests/scratch/t.qcow2' is encrypted.
+password:
+read 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verify pattern failure with wrong password ==
+Encrypted images are deprecated
+Support for them will be removed in a future release.
+You can use 'qemu-img convert' to convert your image to an unencrypted one.
+Disk image '/home/berrange/src/virt/qemu/tests/qemu-iotests/scratch/t.qcow2' is encrypted.
+password:
+Pattern verification failed at offset 0, 134217728 bytes
+read 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 6ca3466..34b16cb 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -128,3 +128,4 @@
 128 rw auto quick
 129 rw auto quick
 130 rw auto quick
+131 rw aut

Re: [Qemu-devel] when does a target frontend need to use gen_io_start()/gen_io_end() ?

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 16:43, Richard Henderson  wrote:
> On 05/12/2015 08:32 AM, Peter Maydell wrote:
>> In order for -icount to work, it's important for the target
>> translate.c code to correctly bracket any generated code which
>> can "do I/O" with gen_io_start()/gen_io_end() calls. But
>> does anybody know exactly what the criteria are here for this?
>> It would be nice if we could document this in a comment in
>> gen_icount.h -- I'm happy to write one up if somebody will just
>> tell me what the right answer is :-)
>
> I'm really not sure.
>
> So far I've assumed "i/o"-like insns, and those that can read some sort of
> cycle counter.  So while that handles easy cases like "inb" and "rdcc", it
> certainly doesn't handle any target for which all i/o is memory mapped.

I think the "mmio access" case is already dealt with in the
softmmu_template.h handlers, isn't it? If the CPU isn't in a
"can do IO" state then the io_read/write handlers call
cpu_io_recompile(), which figures out how far through the TB
we were (using the machinery we already have for converting
host addresses of faults into guest PC values), and creates
a new TB which stops with the MMIO load/store. (I don't
entirely understand cpu_io_recompile(), though -- it looks
rather tricksy.)

-- PMM



Re: [Qemu-devel] [PATCH v2] qmp: Add qom_path field to query-cpus command

2015-05-12 Thread Eduardo Habkost
On Tue, May 12, 2015 at 05:38:37PM +0200, Markus Armbruster wrote:
[...]
> > @@ -699,8 +701,9 @@
> >  #data is sent to the client, the guest may no longer be halted.
> >  ##
> >  { 'struct': 'CpuInfo',
> > -  'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', '*pc': 'int',
> > -   '*nip': 'int', '*npc': 'int', '*PC': 'int', 'thread_id': 'int'} }
> > +  'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', 'qom_path': 'str',
> 
> Long line.

It has exactly 80 characters.

-- 
Eduardo



Re: [Qemu-devel] when does a target frontend need to use gen_io_start()/gen_io_end() ?

2015-05-12 Thread Richard Henderson
On 05/12/2015 08:32 AM, Peter Maydell wrote:
> In order for -icount to work, it's important for the target
> translate.c code to correctly bracket any generated code which
> can "do I/O" with gen_io_start()/gen_io_end() calls. But
> does anybody know exactly what the criteria are here for this?
> It would be nice if we could document this in a comment in
> gen_icount.h -- I'm happy to write one up if somebody will just
> tell me what the right answer is :-)

I'm really not sure.

So far I've assumed "i/o"-like insns, and those that can read some sort of
cycle counter.  So while that handles easy cases like "inb" and "rdcc", it
certainly doesn't handle any target for which all i/o is memory mapped.

Which is sorta most of them these days, so the utility seems to be low...


r~



Re: [Qemu-devel] [PATCH v2] qmp: Add qom_path field to query-cpus command

2015-05-12 Thread Markus Armbruster
Eduardo Habkost  writes:

> This will allow clients to query additional information directly using
> qom-get on the CPU objects.
>
> Reviewed-by: David Gibson 
> Reviewed-by: Andreas Färber 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> * Renamed field from "qom-path" to "qom_path", to keep consistency
>   with existing CpuInfo fields
> * Added "(since 2.4)" to QAPI schema documentation
> * Added the new field to example on qmp-commands.hx
>
> Reference to previous discussion:
>
>   Date: Mon, 4 May 2015 15:37:40 -0300
>   From: Eduardo Habkost 
>   Message-ID: <20150504183740.gm17...@thinpad.lan.raisama.net>
>   Subject: Re: [Qemu-devel] [PATCH] cpu: Register QOM links at /machine/cpus/
>
> The summary is: even if we provide predictable QOM paths for the CPU
> objects, the qom-path field will be useful to allow the QOM objects and
> query-cpu data to be matched correctly.
> ---
>  cpus.c   | 1 +
>  qapi-schema.json | 7 +--
>  qmp-commands.hx  | 7 +--
>  3 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index 62d157a..de6469f 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1435,6 +1435,7 @@ CpuInfoList *qmp_query_cpus(Error **errp)
>  info->value->CPU = cpu->cpu_index;
>  info->value->current = (cpu == first_cpu);
>  info->value->halted = cpu->halted;
> +info->value->qom_path = object_get_canonical_path(OBJECT(cpu));
>  info->value->thread_id = cpu->thread_id;
>  #if defined(TARGET_I386)
>  info->value->has_pc = true;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 9c92482..921ce70 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -679,6 +679,8 @@
>  # @halted: true if the virtual CPU is in the halt state.  Halt usually refers
>  #  to a processor specific low power mode.
>  #
> +# @qom_path: path to the CPU object in the QOM tree (since 2.4)
> +#
>  # @pc: #optional If the target is i386 or x86_64, this is the 64-bit instruction
>  #pointer.
>  #If the target is Sparc, this is the PC component of the
> @@ -699,8 +701,9 @@
>  #data is sent to the client, the guest may no longer be halted.
>  ##
>  { 'struct': 'CpuInfo',
> -  'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', '*pc': 'int',
> -   '*nip': 'int', '*npc': 'int', '*PC': 'int', 'thread_id': 'int'} }
> +  'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', 'qom_path': 'str',

Long line.

> +   '*pc': 'int', '*nip': 'int', '*npc': 'int', '*PC': 'int',
> +   'thread_id': 'int'} }
>  
>  ##
>  # @query-cpus:
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 7506774..14e109e 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -2569,6 +2569,7 @@ Return a json-array. Each CPU is represented by a json-object, which contains:
>  - "CPU": CPU index (json-int)
>  - "current": true if this is the current CPU, false otherwise (json-bool)
>  - "halted": true if the cpu is halted, false otherwise (json-bool)
> +- "qom_path": path to the CPU object in the QOM tree (json-str)
>  - Current program counter. The key's name depends on the architecture:
>   "pc": i386/x86_64 (json-int)
>   "nip": PPC (json-int)
> @@ -2585,14 +2586,16 @@ Example:
>  "CPU":0,
>  "current":true,
>  "halted":false,
> -"pc":3227107138
> +"qom_path":"/machine/unattached/device[0]",
> +"pc":3227107138,
>  "thread_id":3134
>   },
>   {
>  "CPU":1,
>  "current":false,
>  "halted":true,
> -"pc":7108165
> +"qom_path":"/machine/unattached/device[2]",
> +"pc":7108165,
>  "thread_id":3135
>   }
>]

Applied to my qapi-next branch with the long line wrapped, thanks!



Re: [Qemu-devel] [PATCH v2 2/2] target-mips: Misaligned memory accesses for MSA

2015-05-12 Thread Richard Henderson
On 05/12/2015 02:54 AM, Peter Maydell wrote:
> Ideally it would be nice to have support in TCG so that a frontend
> could output a TCG load/store op with a flag for "unaligned access
> OK" or not. ARM also has this issue of some load/stores wanting to
> do alignment traps and some not.

Yes, that would be ideal.

As I was looking at softmmu_template.h for Peter C this morning, I was
wondering about that possibility, since he would be needing to hook
cpu_unaligned_access and the #ifdef ALIGNED_ONLY would need to go away.

What we can't afford is yet another parameter to the helpers.  So I turn my eye
to the mmu_idx parameter, of which we're only using a couple of bits.

What if we merge mmu_idx with TCGMemOp as a parameter, at the tcg-op.h
interface?  Save a tiny bit of space within the tcg opcode buffer.  We'd have to
teach each backend to pull them apart when generating code, but that's trivial.
 But in the end, the helpers have all the info that the code generator did wrt
the access.

Then we add an "aligned" bit to TCGMemOp and use it instead of ifdef ALIGNED_ONLY.

Thoughts?


r~
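
To make the packing idea above concrete, here is a minimal sketch of
folding a memory-op value and an mmu index into one integer and pulling
them apart again.  Every name, the 4-bit index width and the position of
the "aligned" bit are assumptions made up for illustration; none of this
is taken from the TCG sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low 4 bits hold the mmu index, the rest the memop. */
enum {
    SKETCH_MMU_IDX_BITS  = 4,
    SKETCH_MEMOP_ALIGNED = 1u << 7   /* hypothetical "must be aligned" flag */
};

/* Combine a memop value and an mmu index into a single parameter. */
static inline uint32_t sketch_pack(uint32_t memop, unsigned mmu_idx)
{
    assert(mmu_idx < (1u << SKETCH_MMU_IDX_BITS));
    return (memop << SKETCH_MMU_IDX_BITS) | mmu_idx;
}

/* Recover the memop part, as a backend or helper would. */
static inline uint32_t sketch_memop(uint32_t packed)
{
    return packed >> SKETCH_MMU_IDX_BITS;
}

/* Recover the mmu index part. */
static inline unsigned sketch_mmu_idx(uint32_t packed)
{
    return packed & ((1u << SKETCH_MMU_IDX_BITS) - 1);
}

/* A helper can now decide at run time whether an alignment trap applies. */
static inline bool sketch_requires_alignment(uint32_t packed)
{
    return (sketch_memop(packed) & SKETCH_MEMOP_ALIGNED) != 0;
}
```

With a layout like this the helpers keep their current argument count,
and the alignment decision becomes a per-access flag instead of a
per-target #ifdef.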



Re: [Qemu-devel] [PATCH v3 00/14] Fix qapi mangling of downstream names

2015-05-12 Thread Markus Armbruster
Eric Blake  writes:

> This series makes it possible to use downstream extensions
> (such as __com.redhat_xyz) and temporary names (such as x-foo)
> in every position possible in QAPI schemes, with added tests
> that the generated code still compiles.
>
> There's still some things we could do to the qapi generator,
> such as normalizing struct member names and C manglings and
> creating named implicit types up front on the initial parse
> rather than multiple times in each backend.  But that should
> wait until existing pending patches have landed, to minimize
> rebase churn.
>
> v2 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01300.html
>
> v3 includes several more of Markus' original RFC series, splits
> up my work into smaller pieces, incorporates fixes suggested by
> Markus, and rebases on top of the pending v8 qapi drop nested
> structs series.  The series has changed enough from v2 that it
> is not worth showing git backport-diff statistics (as only patch
> 1 survived intact).

Applied to my qapi-next branch, thanks!



Re: [Qemu-devel] [PATCH] doc: fix qmp event type

2015-05-12 Thread Markus Armbruster
Eric Blake  writes:

> On 05/11/2015 09:17 AM, Michael S. Tsirkin wrote:
>> Even name for hot unplug errors was wrong.
>
> s/Even/Event/ ?
>
>> Make doc match code.
>> 
>> Cc: Zhu Guihua 
>> Reported-by: Eric Blake 
>> Signed-off-by: Michael S. Tsirkin 
>> ---
>>  docs/qmp/qmp-events.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Reviewed-by: Eric Blake 

Applied to my qapi-next branch with the commit message touched up,
thanks!



Re: [Qemu-devel] [Qemu-block] [PATCH] qemu-io: Use getopt() correctly

2015-05-12 Thread Alberto Garcia
On Tue 12 May 2015 05:10:56 PM CEST, Eric Blake  wrote:

> POSIX says getopt() returns -1 on completion.  While Linux happens
> to define EOF as -1, this definition is not required by POSIX, and
> there may be platforms where checking for EOF instead of -1 would
> lead to an infinite loop.
>
> Signed-off-by: Eric Blake 
Reviewed-by: Alberto Garcia 

Berto



[Qemu-devel] when does a target frontend need to use gen_io_start()/gen_io_end() ?

2015-05-12 Thread Peter Maydell
In order for -icount to work, it's important for the target
translate.c code to correctly bracket any generated code which
can "do I/O" with gen_io_start()/gen_io_end() calls. But
does anybody know exactly what the criteria are here for this?
It would be nice if we could document this in a comment in
gen_icount.h -- I'm happy to write one up if somebody will just
tell me what the right answer is :-)

thanks
-- PMM



Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 15:44:46 +0200
> Cornelia Huck  wrote:
> 
> > On Tue, 12 May 2015 15:34:47 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > Greg Kurz  wrote:
> > > > 
> > > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > > 
> > > > > This allows to introduce code that is aware about new 64-bit features like
> > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > 
> > > > > Signed-off-by: Greg Kurz 
> > > > > ---
> > > > >  include/hw/virtio/virtio.h |1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > index d95f8b6..6ef70f1 100644
> > > > > --- a/include/hw/virtio/virtio.h
> > > > > +++ b/include/hw/virtio/virtio.h
> > > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > 
> > > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > > >  {
> > > > > -assert(fbit < 32);
> > > > >  return !!(features & (1 << fbit));
> > > > >  }
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > I must say I'm not very comfortable with knowingly passing out-of-range
> > > > values to this function.
> > > > 
> > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > > takes some time?
> > > 
> > > So the feature-bit-size extending patches currently don't support
> > > migration correctly, that's why they are not merged.
> > > 
> > > What I think we need to do for this is move host_features out
> > > from transports into core virtio device.
> > > 
> > > Then we can simply check host features >31 and skip
> > > migrating low guest features if none set.
> > > 
> > > Thoughts? Any takers?
> > > 
> > 
> > After we move host_features, put them into an optional vmstate
> > subsection?
> > 
> > I think with the recent patchsets, most of the interesting stuff is
> > already not handled by the transport anymore. There's only
> > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > ccw).

notify on empty is likely safe to set for everyone.

bad feature should be pci specific, it's a mistake that
we have it in ccw. it's there to detect very old buggy guests.
in fact ccw ignores this bit completely.

For PCI, I think VIRTIO_F_BAD_FEATURE is never
actually set in guest features. If guest attempts to set it,
it is immediately cleared.

So it can be handled in pci specific code, and won't
affect migration.


> Thinking a bit more, we probably don't need this move of host_features
> to get migration right (although it might be a nice cleanup later).
> 
> Could we
> - keep migration of bits 0..31 as-is
> - add a vmstate subsection for bits 32..63 only included if one of
>   those bits is set
> - have a post handler that performs a validation of the full set of
>   bits 0..63
> ?
> 
> We could do a similar exercise with a subsection containing the
> addresses for avail and used with a post handler overwriting any
> addresses set by the old style migration code.
> 
> Does that make sense?

I don't see how it does: on the receive side you don't know
whether guest acked bits 32..63 so you can't decide whether
to parse bits 32..63.

The right thing to do IMHO is to migrate the high guest bits if and only
if the *host* bits 32..63 are set.  And that needs the host features in
core, or at least is easier if they are there.

-- 
MST
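
A side note on the quoted __virtio_has_feature() change: in C, shifting
a 32-bit value by 32 or more is undefined behaviour, so once the assert
is gone the expression is not guaranteed to evaluate to false for
fbit >= 32.  Below is a minimal sketch that makes the "return false"
behaviour explicit.  The sketch_* names are hypothetical and are not the
functions in include/hw/virtio/virtio.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Setting a bit that cannot be stored is a programming error: abort. */
static inline void sketch_set_feature(uint32_t *features, unsigned int fbit)
{
    assert(fbit < 32);
    *features |= UINT32_C(1) << fbit;
}

/* Querying a bit that cannot be stored simply reports "not set". */
static inline bool sketch_has_feature(uint32_t features, unsigned int fbit)
{
    if (fbit >= 32) {
        return false;   /* avoid an out-of-range shift */
    }
    return (features & (UINT32_C(1) << fbit)) != 0;
}
```

The unsigned constant also keeps the fbit == 31 case well defined, which
a plain (1 << fbit) on a signed int does not.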



Re: [Qemu-devel] [PULL 00/14] Ide patches

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 16:22, John Snow  wrote:
> On 05/12/2015 06:44 AM, Peter Maydell wrote:
>> Doesn't build on 32-bit:
>>
>> /root/qemu/qtest.c: In function ‘qtest_process_command’:
>> /root/qemu/qtest.c:519:28: error: format ‘%zu’ expects argument of
>> type ‘size_t’, but argument 2 has type ‘uint64_t’ [-Werror=format]

> More motivation for me to try another stab at fixing up the -m32
> support, I guess.

You could also build for Windows 32 bit, or on OSX (where it's
only a warning but does detect this even on 64-bit hosts).

-- PMM
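
For reference, the usual portable way to print a uint64_t is the PRIu64
macro from <inttypes.h>; '%zu' only matches size_t, which is 32 bits
wide on the failing hosts.  The offending line in qtest.c is not shown
in this thread, so the following is only a generic sketch with a
hypothetical helper name.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit length as decimal text.
 * Returns what snprintf() returns: the number of characters needed.
 */
static int sketch_format_len(char *buf, size_t buflen, uint64_t len)
{
    assert(buf != NULL);
    /* PRIu64 expands to the right conversion on 32- and 64-bit hosts. */
    return snprintf(buf, buflen, "%" PRIu64, len);
}
```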



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-12 Thread Richard Henderson
On 05/12/2015 12:23 AM, Peter Crosthwaite wrote:
> In my multi-compile approach helper_*[ld|st]* needs to be renamed
> per-arch for the multiple compiled cputlb.o. Hence I have no symbol
> with the unqualified name. But even if I do solve my namespacing
> problem, I still have an ambiguity of which cputlb.o provided
> helper_*[ld|st]* to use from the TCG backend. This would mean all
> those APIs would have to virtualised. The big question for Paolo, is
> what complete set of APIs defines the common-code/non-common-code
> boundary? tlb_fill does seem to do the job nicely and looking at the
> architecture implementations it's not a super fast path (falling back
> to a page table faulter).
> 
> Somewhere along the call path from the qemu_st_helpers uses
> (tcg/i386/tcg-target.c) through to tlb_fill there has to be a
> virtualised function unless I am missing something?

I think both cpu_unaligned_access and tlb_fill need to be hooked.

>> I think that this is a decent step forward, modulo the conditionals along the
>> use paths.  I think we ought to clean up all of the translators to the new QOM
>> hooks.
>>
> 
> So the conditional can be ditched by having the CPU base class
> defaulting the hook to the globally defined function. Then arches can
> be brought online one-by-one.

Yes, exactly.

> Ok so the solution to this is to opt-out of the hook via a re-#define
> when we have a target-specific cpu.h handy. This will actually mean no
> change to single-arch builds but multi-arch will use the hook from
> core code only.

Err... not via #defines, please.  Just use the _foo name all spelled out
from target-specific code.

> I don't know what this means exactly. tlb_fill is called by functions
> that are linked to common code (TCG backends) so I don't see a non
> virtualized solution. Is this refactoring to move tlb_fill?

It means if we do find a way to parameterize the tcg backend, e.g. by putting
the whole table of functions into the class, then we can revisit generating
cpu-specific versions of the memory helpers.


r~
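
A minimal sketch of the "base class defaults the hook to the global
function" idea discussed above, so that callers need no conditional and
targets can be converted one at a time.  The struct, the function names
and the one-argument hook signature are all hypothetical and far simpler
than the real CPUClass and tlb_fill.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class with a single virtual hook. */
typedef struct CPUClassSketch {
    int (*tlb_fill)(unsigned long addr);
} CPUClassSketch;

/* Stands in for the existing globally defined function. */
static int sketch_global_tlb_fill(unsigned long addr)
{
    (void)addr;
    return 0;
}

/* Stands in for a converted target's own implementation. */
static int sketch_arch_tlb_fill(unsigned long addr)
{
    (void)addr;
    return 1;
}

/* Base class init: every class starts out pointing at the global one. */
static void sketch_base_class_init(CPUClassSketch *cc)
{
    assert(cc != NULL);
    cc->tlb_fill = sketch_global_tlb_fill;
}

/* A converted target overrides the hook after the base init has run. */
static void sketch_arch_class_init(CPUClassSketch *cc)
{
    sketch_base_class_init(cc);
    cc->tlb_fill = sketch_arch_tlb_fill;
}

/* Common code calls through the hook unconditionally. */
static int sketch_core_tlb_fill(const CPUClassSketch *cc, unsigned long addr)
{
    return cc->tlb_fill(addr);
}
```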




Re: [Qemu-devel] [PULL 00/14] Ide patches

2015-05-12 Thread John Snow


On 05/12/2015 06:44 AM, Peter Maydell wrote:
> On 11 May 2015 at 19:12, John Snow  wrote:
>> The following changes since commit 9ad2c8cd41a086020e21aa6d616b73bd5e2a800b:
>>
>>   Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2015-05-09' into staging (2015-05-11 13:54:00 +0100)
>>
>> are available in the git repository at:
>>
>>   https://github.com/jnsnow/qemu.git tags/ide-pull-request
>>
>> for you to fetch changes up to 6e8d74ed17e0526c34386283df6b7935076d983a:
>>
>>   qtest: pre-buffer hex nibs (2015-05-11 11:00:05 -0400)
>>
> 
> Doesn't build on 32-bit:
> 
> /root/qemu/qtest.c: In function ‘qtest_process_command’:
> /root/qemu/qtest.c:519:28: error: format ‘%zu’ expects argument of
> type ‘size_t’, but argument 2 has type ‘uint64_t’ [-Werror=format]
> 
> -- PMM
> 

More motivation for me to try another stab at fixing up the -m32
support, I guess.

--js



Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:08:02 +0200
> Greg Kurz  wrote:
> 
> > Legacy virtio is native endian: if the guest and host endianness differ,
> > we have to tell vhost so it can swap bytes where appropriate. This is
> > done through a vhost ring ioctl.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >  hw/virtio/vhost.c |   50 +-
> >  1 file changed, 49 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 54851b7..1d7b939 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> (...)
> > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> >  return -errno;
> >  }
> > 
> > +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> 
> I think this should either go in after the virtio-1 base support (more
> feature bits etc.) or get a big fat comment and be touched up later.
> I'd prefer the first solution so it does not get forgotten, but I'm not
> sure when Michael plans to proceed with the virtio-1 patches (I think
> they're mostly fine already).

There are three main issues with virtio 1 patches that I am aware of.

One issue with virtio 1 patches as they are is with how features are
handled ATM.  There are 3 types of features

a. virtio 1 only features
b. virtio 0 only features
c. shared features

and 3 types of devices
a. legacy device: has b+c features
b. modern device: has a+c features
c. transitional device: has a+c features but exposes
   only c through the legacy interface


So I think a callback that gets features depending on guest
version isn't a good way to model it because fundamentally device
has one set of features.
A better way to model this is really just a single
host_features bitmask, and for transitional devices, a mask
hiding the "a" features - which are so far all bits > 31, so maybe
for now we can just have a global mask.

We need to validate features at initialization time and make
sure they make sense, fail if not (sometimes we need to mask
features if they don't make sense - this is unfortunate
but might be needed for compatibility).

Moving host_features to virtio core would make all of the above
easier.
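
As a rough illustration of the "one host_features bitmask plus a mask
for the legacy interface" model described above, here is a minimal
sketch.  The names are hypothetical, and the mask value only encodes the
observation above that the virtio 1 only features are so far all
bits > 31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: everything above bit 31 is hidden from the legacy interface. */
#define SKETCH_LEGACY_FEATURE_MASK UINT64_C(0x00000000ffffffff)

/*
 * The device has exactly one feature set; a transitional device merely
 * shows a masked view of it through the legacy interface.
 */
static uint64_t sketch_visible_features(uint64_t host_features,
                                        bool legacy_interface)
{
    uint64_t visible = host_features;

    if (legacy_interface) {
        visible &= SKETCH_LEGACY_FEATURE_MASK;
        assert((visible >> 32) == 0);   /* nothing above bit 31 leaks out */
    }
    return visible;
}
```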


Second issue is migration, some of it is with migrating the new
features, so that's tied to the first one.


Third issue is fixing devices so they don't try to
access guest memory until DRIVER_OK is set.
This is surprisingly hard to do generally given need to support old
drivers which don't set DRIVER_OK or set it very late, and the fact that
we tied work-arounds for even older drivers which don't set pci bus
master to the DRIVER_OK bit. I tried, and I'm close to giving up and
just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
if not there.



> > +virtio_legacy_is_cross_endian(vdev)) {
> > +r = vhost_virtqueue_set_vring_endian_legacy(dev,
> > +
> > virtio_is_big_endian(vdev),
> > +vhost_vq_index);
> > +if (r) {
> > +return -errno;
> > +}
> > +}
> > +
> >  s = l = virtio_queue_get_desc_size(vdev, idx);
> >  a = virtio_queue_get_desc_addr(vdev, idx);
> >  vq->desc = cpu_physical_memory_map(a, &l, 0);
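
To illustrate what "cross endian" means for the legacy, native-endian
rings in the quoted patch, here is a minimal sketch of the check and of
the byte swap applied to a 16-bit ring field.  All names are
hypothetical; this is not the vhost ioctl path, only the arithmetic
behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time by looking at the first byte. */
static bool sketch_host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Legacy rings use guest byte order; swapping is needed only on mismatch. */
static bool sketch_is_cross_endian(bool guest_big_endian)
{
    return guest_big_endian != sketch_host_is_big_endian();
}

/* Load a 16-bit ring field, swapping its two bytes when cross endian. */
static uint16_t sketch_vring_load16(const void *field, bool cross_endian)
{
    uint16_t v;

    assert(field != NULL);
    memcpy(&v, field, sizeof(v));
    if (cross_endian) {
        v = (uint16_t)((v << 8) | (v >> 8));
    }
    return v;
}
```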



[Qemu-devel] [PATCH] qemu-io: Use getopt() correctly

2015-05-12 Thread Eric Blake
POSIX says getopt() returns -1 on completion.  While Linux happens
to define EOF as -1, this definition is not required by POSIX, and
there may be platforms where checking for EOF instead of -1 would
lead to an infinite loop.

Signed-off-by: Eric Blake 
---
 qemu-io-cmds.c | 16 
 qemu-io.c  |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 1afcfc0..52dc611 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -646,7 +646,7 @@ static int read_f(BlockBackend *blk, int argc, char **argv)
 int total = 0;
 int pattern = 0, pattern_offset = 0, pattern_count = 0;

-while ((c = getopt(argc, argv, "bCl:pP:qs:v")) != EOF) {
+while ((c = getopt(argc, argv, "bCl:pP:qs:v")) != -1) {
 switch (c) {
 case 'b':
 bflag = 1;
@@ -830,7 +830,7 @@ static int readv_f(BlockBackend *blk, int argc, char **argv)
 int pattern = 0;
 int Pflag = 0;

-while ((c = getopt(argc, argv, "CP:qv")) != EOF) {
+while ((c = getopt(argc, argv, "CP:qv")) != -1) {
 switch (c) {
 case 'C':
 Cflag = 1;
@@ -961,7 +961,7 @@ static int write_f(BlockBackend *blk, int argc, char **argv)
 int total = 0;
 int pattern = 0xcd;

-while ((c = getopt(argc, argv, "bcCpP:qz")) != EOF) {
+while ((c = getopt(argc, argv, "bcCpP:qz")) != -1) {
 switch (c) {
 case 'b':
 bflag = 1;
@@ -1116,7 +1116,7 @@ static int writev_f(BlockBackend *blk, int argc, char **argv)
 int pattern = 0xcd;
 QEMUIOVector qiov;

-while ((c = getopt(argc, argv, "CqP:")) != EOF) {
+while ((c = getopt(argc, argv, "CqP:")) != -1) {
 switch (c) {
 case 'C':
 Cflag = 1;
@@ -1228,7 +1228,7 @@ static int multiwrite_f(BlockBackend *blk, int argc, char **argv)
 int i;
 BlockRequest *reqs;

-while ((c = getopt(argc, argv, "CqP:")) != EOF) {
+while ((c = getopt(argc, argv, "CqP:")) != -1) {
 switch (c) {
 case 'C':
 Cflag = 1;
@@ -1463,7 +1463,7 @@ static int aio_read_f(BlockBackend *blk, int argc, char **argv)
 struct aio_ctx *ctx = g_new0(struct aio_ctx, 1);

 ctx->blk = blk;
-while ((c = getopt(argc, argv, "CP:qv")) != EOF) {
+while ((c = getopt(argc, argv, "CP:qv")) != -1) {
 switch (c) {
 case 'C':
 ctx->Cflag = 1;
@@ -1562,7 +1562,7 @@ static int aio_write_f(BlockBackend *blk, int argc, char **argv)
 struct aio_ctx *ctx = g_new0(struct aio_ctx, 1);

 ctx->blk = blk;
-while ((c = getopt(argc, argv, "CqP:")) != EOF) {
+while ((c = getopt(argc, argv, "CqP:")) != -1) {
 switch (c) {
 case 'C':
 ctx->Cflag = 1;
@@ -1779,7 +1779,7 @@ static int discard_f(BlockBackend *blk, int argc, char **argv)
 int64_t offset;
 int count;

-while ((c = getopt(argc, argv, "Cq")) != EOF) {
+while ((c = getopt(argc, argv, "Cq")) != -1) {
 switch (c) {
 case 'C':
 Cflag = 1;
diff --git a/qemu-io.c b/qemu-io.c
index 8e41080..ae5e274 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -120,7 +120,7 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
 QemuOpts *qopts;
 QDict *opts;

-while ((c = getopt(argc, argv, "snrgo:")) != EOF) {
+while ((c = getopt(argc, argv, "snrgo:")) != -1) {
 switch (c) {
 case 's':
 flags |= BDRV_O_SNAPSHOT;
-- 
2.1.0




Re: [Qemu-devel] [PATCH 13/34] qemu-io: Add command 'reopen'

2015-05-12 Thread Eric Blake
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  qemu-io-cmds.c | 71 
> ++
>  1 file changed, 71 insertions(+)
> 

> +
> +while ((c = getopt(argc, argv, "c:o:r")) != EOF) {

POSIX says getopt() returns -1 at conclusion, and allows EOF to have a
value different than -1.  Thus, this could inf-loop on weird platforms
(does anyone know such a platform?)  But I see you are copying from
other bad examples in the file; so I'll post a trivial patch to fix all
those in one go.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html
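
For illustration, here is a minimal sketch of the loop shape POSIX describes, with the termination test written against -1 rather than EOF. The struct and the function name parse_reopen_args are hypothetical and not taken from the patch; only the option string "c:o:r" mirrors it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical container for the parsed options (not from the patch). */
struct reopen_args {
    bool read_only;
    const char *cache;
    const char *opts;
};

/* Returns the index of the first non-option argument, or -1 on a bad option. */
static int parse_reopen_args(int argc, char **argv, struct reopen_args *out)
{
    int c;

    assert(out != NULL);
    out->read_only = false;
    out->cache = NULL;
    out->opts = NULL;

    optind = 1; /* restart scanning; commands may be parsed repeatedly */
    /* getopt() is specified to return -1 when done, so compare against -1 */
    while ((c = getopt(argc, argv, "c:o:r")) != -1) {
        switch (c) {
        case 'c':
            out->cache = optarg;
            break;
        case 'o':
            out->opts = optarg;
            break;
        case 'r':
            out->read_only = true;
            break;
        default:
            return -1; /* unknown option or missing argument */
        }
    }
    return optind;
}
```

On a platform where EOF is -1 the two spellings behave the same; the -1 form simply does not depend on that.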

> +switch (c) {
> +case 'c':
> +if (bdrv_parse_cache_flags(optarg, &flags) < 0) {
> +error_report("Invalid cache option: %s", optarg);
> +return 0;
> +}
> +break;
> +case 'o':
> +if (!qemu_opts_parse(&reopen_opts, optarg, 0)) {
> +printf("could not parse option list -- %s\n", optarg);

Messages usually have ':', not ' --', when displaying details about the
message on the left.

We aren't very consistent on whether to start messages with lower or
upper case, so you added one of each :)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [RFC v4] monitor: add memory search commands s, sp

2015-05-12 Thread Claudio Fontana
On 11.05.2015 16:16, Luiz Capitulino wrote:
> On Fri, 24 Apr 2015 14:39:48 +0200
> hw.clau...@gmail.com wrote:
> 
>> From: Claudio Fontana 
>>
>> usage is similar to the commands x, xp.
>>
>> Example with string: looking for "ELF" header in memory:
>>
>> (qemu) s/100cb 0x40001000 "ELF"
>> searching memory area [40001000-400f5240]
>> 40090001
>> (qemu) x/20b 0x4009
>> 4009: '\x7f' 'E' 'L' 'F' '\x02' '\x01' '\x01' '\x03'
>> 40090008: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>> 40090010: '\x02' '\x00' '\xb7' '\x00'
>>
>> Example with value: looking for 64bit variable value 0x990088
>>
>> (qemu) s/100xg 0x90004200 0x990088
>> searching memory area [90004200-9000427a1200]
>> 9000424b3000
>> 9000424c1000
>>
>> Signed-off-by: Claudio Fontana 
> 
> I had to drop this patch because it doesn't build for w32. You can
> find instructions on how to build for w32 at:
> 
>  http://wiki.qemu.org/Hosts/W32
> 

I see. This will take me some time to figure out, I think, due to my lack of
familiarity with the Windows compilation environment.

Incidentally, if somebody knows of memmem equivalents in the Windows API, or
how function replacements are usually handled, please let me know.

I guess we could add a replacement function in util/ that is compiled under
#ifdef _WIN32? Basically it would duplicate the work already done in gnulib...
Are all other supported targets OK with using the GNU function memmem from
string.h?
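
To make the question concrete, a replacement could look roughly like the sketch below. This is a naive byte-wise search written only with standard C, not the gnulib implementation; the name fallback_memmem is made up here, and where it would live in util/ is an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical fallback for hosts without memmem(): returns a pointer to the
 * first occurrence of needle in hay, or NULL if there is none.
 * Naive O(hay_len * needle_len) search.
 */
static void *fallback_memmem(const void *hay, size_t hay_len,
                             const void *needle, size_t needle_len)
{
    const unsigned char *h = hay;
    const unsigned char *n = needle;
    size_t i;

    assert(hay != NULL && needle != NULL);

    if (needle_len == 0) {
        return (void *)hay; /* an empty needle matches at the start */
    }
    if (needle_len > hay_len) {
        return NULL;
    }
    for (i = 0; i + needle_len <= hay_len; i++) {
        /* cheap first-byte check before the full comparison */
        if (h[i] == n[0] && memcmp(h + i, n, needle_len) == 0) {
            return (void *)(h + i);
        }
    }
    return NULL;
}
```

This is slower than the gnulib version on long needles, but the needle here is at most one chunk and usually a few bytes.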

Thanks,

Claudio

>> ---
>>  hmp-commands.hx |  28 
>>  monitor.c   | 140 
>> 
>>  2 files changed, 168 insertions(+)
>>
>> changes from v3:
>> initialize pointer variable to NULL to finally get rid of spurious warning
>>
>> changes from v2:
>> move code to try to address spurious warning
>>
>> changes from v1:
>> make checkpatch happy by adding braces here and there.
>>
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index d5022d8..2bf5737 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -432,6 +432,34 @@ Start gdbserver session (default @var{port}=1234)
>>  ETEXI
>>  
>>  {
>> +.name   = "s",
>> +.args_type  = "fmt:/,addr:l,data:s",
>> +.params = "/fmt addr data",
>> +.help   = "search virtual memory starting at 'addr' for 'data'",
>> +.mhandler.cmd = hmp_memory_search,
>> +},
>> +
>> +STEXI
>> +@item s/fmt @var{addr} @var{data}
>> +@findex s
>> +Virtual memory search starting at @var{addr} for data described by 
>> @var{data}.
>> +ETEXI
>> +
>> +{
>> +.name   = "sp",
>> +.args_type  = "fmt:/,addr:l,data:s",
>> +.params = "/fmt addr data",
>> +.help   = "search physical memory starting at 'addr' for 
>> 'data'",
>> +.mhandler.cmd = hmp_physical_memory_search,
>> +},
>> +
>> +STEXI
>> +@item sp/fmt @var{addr} @var{data}
>> +@findex sp
>> +Physical memory search starting at @var{addr} for data described by 
>> @var{data}.
>> +ETEXI
>> +
>> +{
>>  .name   = "x",
>>  .args_type  = "fmt:/,addr:l",
>>  .params = "/fmt addr",
>> diff --git a/monitor.c b/monitor.c
>> index c86a89e..b648dd2 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -1208,6 +1208,124 @@ static void monitor_printc(Monitor *mon, int c)
>>  monitor_printf(mon, "'");
>>  }
>>  
>> +static void monitor_print_addr(Monitor *mon, hwaddr addr, bool is_physical)
>> +{
>> +if (is_physical) {
>> +monitor_printf(mon, TARGET_FMT_plx "\n", addr);
>> +} else {
>> +monitor_printf(mon, TARGET_FMT_lx "\n", (target_ulong)addr);
>> +}
>> +}
>> +
>> +/* simple memory search for a byte sequence. The sequence is generated from
>> + * a numeric value to look for in guest memory, or from a string.
>> + */
>> +static void memory_search(Monitor *mon, int count, int format, int wsize,
>> +  hwaddr addr, const char *data_str, bool 
>> is_physical)
>> +{
>> +int pos, len;  /* pos in the search area, len of area */
>> +char *hay; /* buffer for haystack */
>> +int hay_size;  /* haystack size. Needle size is wsize. */
>> +const char *needle = NULL; /* needle to search in the haystack */
>> +const char *format_str;/* numeric input format string */
>> +char value_raw[8]; /* numeric input converted to raw data */
>> +#define MONITOR_S_CHUNK_SIZE 16000
>> +
>> +len = wsize * count;
>> +if (len < 1) {
>> +monitor_printf(mon, "invalid search area length.\n");
>> +return;
>> +}
>> +switch (format) {
>> +case 'i':
>> +monitor_printf(mon, "format '%c' not supported.\n", format);
>> +return;
>> +case 'c':
>> +needle = data_str;
>> +wsize = strlen(data_str);
>> +if (wsize > MONITOR_S_CHUNK_SIZE) {
>> +monitor_printf(mon, "search 

Re: [Qemu-devel] [PATCH 12/34] block: Allow specifying driver-specific options to reopen

2015-05-12 Thread Eric Blake
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block.c   | 42 +++---
>  block/commit.c|  4 ++--
>  include/block/block.h |  4 +++-
>  3 files changed, 44 insertions(+), 6 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 95dc51e..561cefd 100644
> --- a/block.c
> +++ b/block.c
> @@ -1584,6 +1584,9 @@ typedef struct BlockReopenQueueEntry {
>   *
>   * bs is the BlockDriverState to add to the reopen queue.
>   *
> + * options contains the changed options for the associated bs
> + * (the BlockReopenQueue takes the ownership)

'takes ownership' reads a bit more idiomatically, but what you have is
not wrong.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes

2015-05-12 Thread Peter Maydell
Resurrecting a six month old thread (and starting with
a big long quote for context):

On 8 September 2014 at 12:53, Peter Maydell  wrote:
> On 7 September 2014 02:47, Edgar E. Iglesias  wrote:
>> On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
>>> tlb_set_page() takes an extra argument specifying the
>>> transaction attributes. For RAM accesses we can just
>>> immediately use this to get the AddressSpace to pass
>>> to address_space_translate_for_iotlb(). For IO accesses
>>> we need to stash the attributes in the iotlb[], which
>>> means extending that from an array of hwaddrs to an
>>> array of struct {hwaddr, attributes}, which is easy enough.
>>> Then the io_read/write glue functions in softmmu_template.h
>>> can fish the attributes out of the iotlb and use them to
>>> pick the AddressSpace to pass to iotlb_to_region().
>>> More importantly, we can arrange to pass them through
>>> to the device read/write callbacks (either directly,
>>> or indirectly by saving them in the CPU struct like we
>>> do for mem_io_vaddr; since changing the prototypes on
>>> every device read and write callback would be insane
>>> we probably want to have fields in MemoryRegionOps for
>>> read_with_attrs and write_with_attrs function pointers).
>>
>> I think this will mostly work but could become a bit hard
>> to deal with when IOMMUs come into the picture that may want
>> to modify the attributes and AS.
>>
>> Maybe we could consider having a pointer to a bundle of
>> AS and attributes stored in the iotlb? example:
>>
>> memory.h:
>> typedef struct BusAttrSomething
>> {
>> AddressSpace *as;
>> MemoryTransactionAttr *attr;
>> } BusAttrSomthing;
>>
>> So that the stuff stored in the IOTLB is not specific
>> to the CPU in question but can be created by any
>> IOMMU along the bus path. See below for more info.
>
> Mmm, we probably want to allow for IOTLBs, so more
> flexibility than a simple index into the CPU's list
> of address spaces does seem warranted.

Now that the tx-attributes patches are in master I'm
looking at the "multiple AddressSpaces per CPU" part. In
the intervening time, this code has been somewhat complicated
by Paolo's RCU patches. In particular having actual
AddressSpace pointers in the iotlb doesn't look like it
will work given the way we now cache the memory_dispatch
pointer.

So we could deal with this by just falling back to
"CPUs have N AddressSpaces and when the target code
calls tlb_set_page_with_attrs it passes in the index
of the AddressSpace as well as the paddr" (and we then
can stash the index in the iotlb for later use, as
well as handing it to address_space_translate_for_iotlb).
Internally exec.c would also maintain an array of
AddressSpaceDispatch pointers corresponding to the
AddressSpaces (so effectively cs->as and cs->memory_dispatch
become arrays, though likely with some syntactic sugar
so we don't have to change all the uses of cs->as to
cs->as[0] for CPUs which only have 1 AS).
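
As a rough sketch of that fallback (every type and function name below is invented for illustration and does not match the real exec.c structures), the idea is that the iotlb entry carries an index rather than a pointer, and the index is resolved against per-CPU arrays at use time:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_AS 2
#define SKETCH_IOTLB_SIZE 4

typedef uint64_t sketch_hwaddr;

/* Stand-ins for AddressSpace and AddressSpaceDispatch (hypothetical). */
typedef struct SketchAddressSpace { const char *name; } SketchAddressSpace;
typedef struct SketchDispatch { SketchAddressSpace *as; } SketchDispatch;

/* The iotlb stores an index into the CPU's AS array, not an AS pointer. */
typedef struct SketchIOTLBEntry {
    sketch_hwaddr addr;
    int asidx;
} SketchIOTLBEntry;

typedef struct SketchCPU {
    int num_ases;
    SketchAddressSpace *as[SKETCH_MAX_AS];
    SketchDispatch *memory_dispatch[SKETCH_MAX_AS]; /* parallel to as[] */
    SketchIOTLBEntry iotlb[SKETCH_IOTLB_SIZE];
} SketchCPU;

/* Target code passes the AS index along with the physical address. */
static void sketch_tlb_set_page(SketchCPU *cpu, int slot,
                                sketch_hwaddr paddr, int asidx)
{
    assert(slot >= 0 && slot < SKETCH_IOTLB_SIZE);
    assert(asidx >= 0 && asidx < cpu->num_ases);
    cpu->iotlb[slot].addr = paddr;
    cpu->iotlb[slot].asidx = asidx;
}

/* Later IO accesses look the dispatch up through the stashed index. */
static SketchDispatch *sketch_iotlb_dispatch(const SketchCPU *cpu, int slot)
{
    return cpu->memory_dispatch[cpu->iotlb[slot].asidx];
}
```

Because only the index is cached, swapping a dispatch pointer in the array would not leave stale pointers behind in the iotlb, which is the property the RCU changes seem to need.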

Or is there a better approach? Edgar, is your IOMMU
stuff sufficiently far advanced that you can see how
it ought to fit into the code at the moment?

thanks
-- PMM



Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Cornelia Huck
On Tue, 12 May 2015 15:44:46 +0200
Cornelia Huck  wrote:

> On Tue, 12 May 2015 15:34:47 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:07:37 +0200
> > > Greg Kurz  wrote:
> > > 
> > > > Unlike with add and clear, there is no valid reason to abort when 
> > > > checking
> > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 
> > > > 32.
> > > > 
> > > > This allows to introduce code that is aware about new 64-bit features 
> > > > like
> > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > 
> > > > Signed-off-by: Greg Kurz 
> > > > ---
> > > >  include/hw/virtio/virtio.h |1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > index d95f8b6..6ef70f1 100644
> > > > --- a/include/hw/virtio/virtio.h
> > > > +++ b/include/hw/virtio/virtio.h
> > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
> > > > *features, unsigned int fbit)
> > > > 
> > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned 
> > > > int fbit)
> > > >  {
> > > > -assert(fbit < 32);
> > > >  return !!(features & (1 << fbit));
> > > >  }
> > > > 
> > > > 
> > > > 
> > > 
> > > I must say I'm not very comfortable with knowingly passing out-of-range
> > > values to this function.
> > > 
> > > Can we perhaps apply at least the feature-bit-size extending patches
> > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > takes some time?
> > 
> > So the feature-bit-size extending patches currently don't support
> > migration correctly, that's why they are not merged.
> > 
> > What I think we need to do for this is move host_features out
> > from transports into core virtio device.
> > 
> > Then we can simply check host features >31 and skip
> > migrating low guest features if none set.
> > 
> > Thoughts? Any takers?
> > 
> 
> After we move host_features, put them into an optional vmstate
> subsection?
> 
> I think with the recent patchsets, most of the interesting stuff is
> already not handled by the transport anymore. There's only
> VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> ccw).

Thinking a bit more, we probably don't need this move of host_features
to get migration right (although it might be a nice cleanup later).

Could we
- keep migration of bits 0..31 as-is
- add a vmstate subsection for bits 32..63 only included if one of
  those bits is set
- have a post handler that performs a validation of the full set of
  bits 0..63
?

We could do a similar exercise with a subsection containing the
addresses for avail and used with a post handler overwriting any
addresses set by the old style migration code.

Does that make sense?




Re: [Qemu-devel] [PATCH 11/34] block: Allow references for backing files

2015-05-12 Thread Eric Blake
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> For bs->file, using references to existing BDSes has been possible for a
> while already. This patch enables the same for bs->backing_hd.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block.c   | 42 --
>  block/mirror.c|  2 +-
>  include/block/block.h |  3 ++-
>  3 files changed, 27 insertions(+), 20 deletions(-)
> 
> diff --git a/block.c b/block.c
> index e93bf63..95dc51e 100644
> --- a/block.c
> +++ b/block.c
> @@ -1109,30 +1109,41 @@ out:
>  /*
>   * Opens the backing file for a BlockDriverState if not yet open
>   *
> - * options is a QDict of options to pass to the block drivers, or NULL for an
> - * empty set of options. The reference to the QDict is transferred to this
> - * function (even on failure), so if the caller intends to reuse the 
> dictionary,
> - * it needs to use QINCREF() before calling bdrv_file_open.
> + * bdrev_key specifies the key for the image's BlockdevRef in the options 
> QDict.

s/bdrev/bdref/

> + * That QDict has to be flattened; therefore, if the BlockdevRef is a QDict
> + * itself, all options starting with "${bdref_key}." are considered part of 
> the
> + * BlockdevRef.
> + *

>  
>  bs->open_flags &= ~BDRV_O_NO_BACKING;
> -if (qdict_haskey(options, "file.filename")) {
> +
> +bdref_key_dot = g_strdup_printf("%s.", bdref_key);
> +qdict_extract_subqdict(parent_options, &options, bdref_key_dot);
> +g_free(bdref_key_dot);

I wonder if we have a pattern like this frequently enough to make a
wrapper that concatenates the argument for us, instead of having every
caller have to form a temporary concatenation string.  But not something
that affects this patch.
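
Roughly what I have in mind, sketched in plain C on a toy flat option list instead of a QDict (the function name and the list representation are made up for illustration): the caller passes just the key, and the helper builds the "key." prefix internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical wrapper: counts the options that belong to 'key', i.e. those
 * starting with "<key>.". The temporary prefix string is formed and freed
 * here, so callers no longer have to build it themselves.
 */
static size_t sketch_count_subkeys(const char *const *options, size_t n,
                                   const char *key)
{
    size_t prefix_len, i, count = 0;
    char *prefix;

    assert(options != NULL && key != NULL);

    prefix_len = strlen(key) + 1; /* key plus the trailing '.' */
    prefix = malloc(prefix_len + 1);
    if (!prefix) {
        return 0;
    }
    snprintf(prefix, prefix_len + 1, "%s.", key);

    for (i = 0; i < n; i++) {
        if (strncmp(options[i], prefix, prefix_len) == 0) {
            count++;
        }
    }
    free(prefix);
    return count;
}
```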

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Stefano Stabellini
On Tue, 12 May 2015, Stefano Stabellini wrote:
> On Tue, 12 May 2015, Fabio Fantoni wrote:
> > Il 12/05/2015 12:26, Fabio Fantoni ha scritto:
> > > Il 12/05/2015 11:23, Fabio Fantoni ha scritto:
> > > > Il 11/05/2015 17:04, Fabio Fantoni ha scritto:
> > > > > Il 21/04/2015 14:53, Stefano Stabellini ha scritto:
> > > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > > Il 21/04/2015 12:49, Stefano Stabellini ha scritto:
> > > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu
> > > > > > > > > included to
> > > > > > > > > xen
> > > > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed 
> > > > > > > > > Config.mk
> > > > > > > > > to use
> > > > > > > > > revision "master").
> > > > > > > > > After few minutes I booted windows 7 64 bit domU qemu crash,
> > > > > > > > > tried 2 times
> > > > > > > > > with same result.
> > > > > > > > > 
> > > > > > > > > In the domU's qemu log:
> > > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > > `(old_top ==
> > > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > > __builtin_offsetof
> > > > > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned
> > > > > > > > > > long)
> > > > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > > > > > > > malloc_chunk,
> > > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end &
> > > > > > > > > > pagemask)
> > > > > > > > > > ==
> > > > > > > > > > 0)' failed.
> > > > > > > > > > Killing all inferiors
> > > > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > > > > 
> > > > > > > > > With a fast search after I saw the backtrace I found a 
> > > > > > > > > probable
> > > > > > > > > cause of
> > > > > > > > > regression (I'm not sure):
> > > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > > >  
> > > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > > 
> > > > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > > > > 
> > > > > > > > > If you need more informations/tests tell me and I'll post 
> > > > > > > > > them.
> > > > > > > >Maybe you could try to revert the offending commit
> > > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better 
> > > > > > > > bisect
> > > > > > > > the
> > > > > > > > crash?
> > > > > > > Thanks for your reply.
> > > > > > > 
> > > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm 
> > > > > > > busy
> > > > > > > trying to
> > > > > > > found another problem that cause very bad performance without 
> > > > > > > errors
> > > > > > > or
> > > > > > > nothing in logs :( I don't know if if xen related, kernel related 
> > > > > > > or
> > > > > > > other for
> > > > > > > now.
> > > > > > > 
> > > > > > > About this regression with spice I'll do further tests in next 
> > > > > > > days
> > > > > > > (probably
> > > > > > > starting reverting the spice patch in qemu) but any help is
> > > > > > > appreciated.
> > > > > > > Based on data I have for now is possible that the problem is that
> > > > > > > qemu try to
> > > > > > > allocate other ram or videoram after domU create but with xen is 
> > > > > > > not
> > > > > > > possible?
> > > > > > > In the spice related patch I saw something about dynamic 
> > > > > > > allocation
> > > > > > > for
> > > > > > > example.
> > > > > > It is probably caused by a commit in the range:
> > > > > > 
> > > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > > >  
> > > > > > 
> > > > > > there are only 10 commits in that range. By using git bisect you
> > > > > > should
> > > > > > be able to narrow it down in just 3 tests.
> > > > > 
> > > > > Sorry for delay, I was busy with many things, today I retried with
> > > > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > > > overflow ssd->buf" (in a second test) but in both case regression 
> > > > > remain
> > > > > :(
> > > > > Tomorrow probably I'll do other tests.
> > > > 
> > > > I did another test, reverting this instead:
> > > > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > > >  
> > > > And now seems I'm unable to reproduce the regression, before happen 
> > > > after
> > > > few seconds up to 1-2 minutes, now I use the same domU 15-20 minutes
> > > > without problem.
> > > > Probably is the cause of regression even if seems strange that on 
> > > > unstable
> > > > with same patch on tests of some days ago didn't happen.
> > > > 
> > > > Any ideas?
> > > > 
> > > > Thanks for any reply and sorry for my bad english.
> > > 
> > 

Re: [Qemu-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Stefano Stabellini
On Tue, 12 May 2015, Fabio Fantoni wrote:
> Il 12/05/2015 12:26, Fabio Fantoni ha scritto:
> > Il 12/05/2015 11:23, Fabio Fantoni ha scritto:
> > > Il 11/05/2015 17:04, Fabio Fantoni ha scritto:
> > > > Il 21/04/2015 14:53, Stefano Stabellini ha scritto:
> > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > Il 21/04/2015 12:49, Stefano Stabellini ha scritto:
> > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu
> > > > > > > > included to
> > > > > > > > xen
> > > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk
> > > > > > > > to use
> > > > > > > > revision "master").
> > > > > > > > After few minutes I booted windows 7 64 bit domU qemu crash,
> > > > > > > > tried 2 times
> > > > > > > > with same result.
> > > > > > > > 
> > > > > > > > In the domU's qemu log:
> > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > `(old_top ==
> > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > __builtin_offsetof
> > > > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned
> > > > > > > > > long)
> > > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > > > > > > malloc_chunk,
> > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end &
> > > > > > > > > pagemask)
> > > > > > > > > ==
> > > > > > > > > 0)' failed.
> > > > > > > > > Killing all inferiors
> > > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > > > 
> > > > > > > > With a fast search after I saw the backtrace I found a probable
> > > > > > > > cause of
> > > > > > > > regression (I'm not sure):
> > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > >  
> > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > 
> > > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > > > 
> > > > > > > > If you need more informations/tests tell me and I'll post them.
> > > > > > >Maybe you could try to revert the offending commit
> > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect
> > > > > > > the
> > > > > > > crash?
> > > > > > Thanks for your reply.
> > > > > > 
> > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy
> > > > > > trying to
> > > > > > found another problem that cause very bad performance without errors
> > > > > > or
> > > > > > nothing in logs :( I don't know if if xen related, kernel related or
> > > > > > other for
> > > > > > now.
> > > > > > 
> > > > > > About this regression with spice I'll do further tests in next days
> > > > > > (probably
> > > > > > starting reverting the spice patch in qemu) but any help is
> > > > > > appreciated.
> > > > > > Based on data I have for now is possible that the problem is that
> > > > > > qemu try to
> > > > > > allocate other ram or videoram after domU create but with xen is not
> > > > > > possible?
> > > > > > In the spice related patch I saw something about dynamic allocation
> > > > > > for
> > > > > > example.
> > > > > It is probably caused by a commit in the range:
> > > > > 
> > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > >  
> > > > > 
> > > > > there are only 10 commits in that range. By using git bisect you
> > > > > should
> > > > > be able to narrow it down in just 3 tests.
> > > > 
> > > > Sorry for delay, I was busy with many things, today I retried with
> > > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > > overflow ssd->buf" (in a second test) but in both case regression remain
> > > > :(
> > > > Tomorrow probably I'll do other tests.
> > > 
> > > I did another test, reverting this instead:
> > > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > >  
> > > And now seems I'm unable to reproduce the regression, before happen after
> > > few seconds up to 1-2 minutes, now I use the same domU 15-20 minutes
> > > without problem.
> > > Probably is the cause of regression even if seems strange that on unstable
> > > with same patch on tests of some days ago didn't happen.
> > > 
> > > Any ideas?
> > > 
> > > Thanks for any reply and sorry for my bad english.
> > 
> > Bad news, qemu crash still happen even if this time in qemu log there is
> > another output, see attachment.
> > After take a look on the other patches I saw:
> > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
> >  
> > With "Conflicts: hw/display/vga.c" in description I'll try to revert it
> > instead.
> > 
> > Or someone can tell me another probable test I can t

Re: [Qemu-devel] [PATCH 10/34] block: Fix reopen flag inheritance

2015-05-12 Thread Eric Blake
On 05/08/2015 11:21 AM, Kevin Wolf wrote:
> When reopening an image, the block layer already takes care to reopen
> bs->file as well with recalculated inherited flags. The same must happen
> for any other child (most notably missing before this patch: backing
> files).
> 
> If bs->file (or any other child) didn't originally inherit from bs, e.g.
> because it was created separately and then only referenced, it must not
> inherit flags on reopen either, so check the inherited_from field before
> propagation the reopen down.
> 
> VMDK already reopened its extents manually; this code can now be
> dropped.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block.c  | 13 +++--
>  block/vmdk.c | 28 ++--
>  2 files changed, 13 insertions(+), 28 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





[Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Denis V. Lunev
The following sequence
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
for (i = 0; i < 10; i++)
write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.
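
As a minimal sketch of how such an aligned buffer can be obtained on a POSIX host (the helper names are hypothetical and this is not the QEMU allocator), the alignment is taken as the larger of 4096 and the page size:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* MAX(4096, page size); falls back to 4096 if the page size is unknown. */
static size_t sketch_bounce_alignment(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t align = 4096;

    if (page > 0 && (size_t)page > align) {
        align = (size_t)page;
    }
    return align;
}

/* True if p is a multiple of align. */
static bool sketch_is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Allocates a buffer suitable for O_DIRECT writes; release with free(). */
static void *sketch_alloc_bounce_buffer(size_t size)
{
    void *buf = NULL;

    assert(size > 0);
    if (posix_memalign(&buf, sketch_bounce_alignment(), size) != 0) {
        return NULL;
    }
    return buf;
}
```

A buffer from sketch_alloc_bounce_buffer(4096) can be passed to the write() loop above in place of buf.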

On the other hand, at the moment we do not want to enforce bounce
buffering if the guest request is aligned to 512 bytes.

The patch changes the default optimal bounce buffer alignment to
MAX(page size, 4k). 4k is chosen as the maximal known sector size on real
HDDs.

The explanation for the performance improvement is quite interesting.
From the kernel's point of view, each request to the disk was split
in two. This can be seen in blktrace output like this:
  9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
  9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the number of requests sent to disk (which can be calculated by counting
the lines in the blktrace output) is reduced by about a factor of 2.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does its job well and real requests come properly aligned (to page).

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block.c   |  8 
 block/io.c|  2 +-
 block/raw-posix.c | 13 +++--
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e293907..325f727 100644
--- a/block.c
+++ b/block.c
@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
 size_t bdrv_opt_mem_align(BlockDriverState *bs)
 {
 if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
 }
 
 return bs->bl.opt_mem_alignment;
@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
 size_t bdrv_min_mem_align(BlockDriverState *bs)
 {
 if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
 }
 
 return bs->bl.min_mem_alignment;
diff --git a/block/io.c b/block/io.c
index 908a3d1..071652c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
 } else {
 bs->bl.min_mem_alignment = 512;
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = getpagesize();
 }
 
 if (bs->backing_hd) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7083924..2990e95 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 char *buf;
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
 
 /* For /dev/sg devices the alignment is not really used.
With buffered I/O, we don't have any restrictions. */
@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 /* If we could not get the sizes so far, we can only guess them */
 if (!s->buf_align) {
 size_t align;
-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
+buf = qemu_memalign(max_align, 2 * max_align);
+for (align = 512; align <= max_align; align <<= 1) {
+if (raw_is_io_aligned(fd, buf + align, max_align)) {
 s->buf_align = align;
 break;
 }
@@ -342,8 +343,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 
 if (!bs->request_alignment) {
 size_t align;
-buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+buf = qemu_memalign(s->buf_align, max_align);
+for (align = 512; align <= max_align; align <<= 1) {
 if (raw_is_io_aligned(fd, buf, align)) {
 bs->request_alig

[Qemu-devel] [PATCH 1/2] block: minimal bounce buffer alignment

2015-05-12 Thread Denis V. Lunev
The patch introduces a new concept: minimal memory alignment for bounce
buffers. The original so-called "optimal" value is actually the minimal
required value for alignment. It should be used to validate that the IOVec
is properly aligned and a bounce buffer is not required.

However, from the performance point of view, it would be better if the
bounce buffer or IOVec allocated by QEMU were aligned more strictly.

The patch does not change any alignment value yet.
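
To illustrate the distinction, the validation side can be sketched as below. This is a standalone approximation using struct iovec rather than QEMUIOVector, and the function name is hypothetical; it checks each element against the minimal alignment only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Returns true if every element's base address and length are multiples of
 * min_align, i.e. the vector can be submitted without a bounce buffer.
 * The stricter "optimal" alignment matters only when QEMU allocates.
 */
static bool sketch_iov_is_aligned(const struct iovec *iov, int niov,
                                  size_t min_align)
{
    int i;

    assert(min_align != 0);
    for (i = 0; i < niov; i++) {
        if ((uintptr_t)iov[i].iov_base % min_align) {
            return false;
        }
        if (iov[i].iov_len % min_align) {
            return false;
        }
    }
    return true;
}
```

With this split, a guest buffer aligned to 512 bytes still passes the check, while buffers QEMU allocates itself can use the larger optimal value.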

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
Reviewed-by: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block.c   | 11 +++
 block/io.c|  7 ++-
 block/raw-posix.c |  1 +
 include/block/block.h |  2 ++
 include/block/block_int.h |  3 +++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 7904098..e293907 100644
--- a/block.c
+++ b/block.c
@@ -113,6 +113,16 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
 return bs->bl.opt_mem_alignment;
 }
 
+size_t bdrv_min_mem_align(BlockDriverState *bs)
+{
+if (!bs || !bs->drv) {
+/* 4k should be on the safe side */
+return 4096;
+}
+
+return bs->bl.min_mem_alignment;
+}
+
 /* check if the path starts with ":" */
 int path_has_protocol(const char *path)
 {
@@ -890,6 +900,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 }
 
 assert(bdrv_opt_mem_align(bs) != 0);
+assert(bdrv_min_mem_align(bs) != 0);
 assert((bs->request_alignment != 0) || bs->sg);
 return 0;
 
diff --git a/block/io.c b/block/io.c
index 1ce62c4..908a3d1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -201,8 +201,10 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
**errp)
 }
 bs->bl.opt_transfer_length = bs->file->bl.opt_transfer_length;
 bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
+bs->bl.min_mem_alignment = bs->file->bl.min_mem_alignment;
 bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
 } else {
+bs->bl.min_mem_alignment = 512;
 bs->bl.opt_mem_alignment = 512;
 }
 
@@ -221,6 +223,9 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 bs->bl.opt_mem_alignment =
 MAX(bs->bl.opt_mem_alignment,
 bs->backing_hd->bl.opt_mem_alignment);
+bs->bl.min_mem_alignment =
+MAX(bs->bl.min_mem_alignment,
+bs->backing_hd->bl.min_mem_alignment);
 }
 
 /* Then let the driver override it */
@@ -2489,7 +2494,7 @@ void *qemu_try_blockalign0(BlockDriverState *bs, size_t 
size)
 bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
 {
 int i;
-size_t alignment = bdrv_opt_mem_align(bs);
+size_t alignment = bdrv_min_mem_align(bs);
 
 for (i = 0; i < qiov->niov; i++) {
 if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 24d8582..7083924 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -725,6 +725,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error 
**errp)
 BDRVRawState *s = bs->opaque;
 
 raw_probe_alignment(bs, s->fd, errp);
+bs->bl.min_mem_alignment = s->buf_align;
 bs->bl.opt_mem_alignment = s->buf_align;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..c1c963e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -440,6 +440,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
 
 /* Returns the alignment in bytes that is required so that no bounce buffer
  * is required throughout the stack */
+size_t bdrv_min_mem_align(BlockDriverState *bs);
+/* Returns optimal alignment in bytes for bounce buffer */
 size_t bdrv_opt_mem_align(BlockDriverState *bs);
 void bdrv_set_guest_block_size(BlockDriverState *bs, int align);
 void *qemu_blockalign(BlockDriverState *bs, size_t size);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..f004378 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -313,6 +313,9 @@ typedef struct BlockLimits {
 int max_transfer_length;
 
 /* memory alignment so that no bounce buffer is needed */
+size_t min_mem_alignment;
+
+/* memory alignment for bounce buffer */
 size_t opt_mem_alignment;
 } BlockLimits;
 
-- 
1.9.1




[Qemu-devel] [PATCH v8 0/2] block: enforce minimal 4096 alignment in qemu_blockalign

2015-05-12 Thread Denis V. Lunev
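I have used the following program to test:

#define _GNU_SOURCE

#include <fcntl.h>
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    void *buf;
    int i = 0, align = atoi(argv[2]);

    /* For align < 4096, keep allocating until we get a buffer that is
     * NOT 4096-aligned; for align >= 4096 the first buffer is used. */
    do {
        buf = memalign(align, 4096);
        if (align >= 4096)
            break;
        if ((unsigned long)buf & 4095)
            break;
        i++;
    } while (1);
    printf("%d %p\n", i, buf);

    memset(buf, 0x11, 4096);

    for (i = 0; i < 10; i++) {
        /* NB: offset and whence are swapped here, so this call fails
         * with EINVAL and the writes stay sequential. */
        lseek(fd, SEEK_CUR, 4096);
        write(fd, buf, 4096);
    }

    close(fd);
    return 0;
}

for i in `seq 1 30` ; do a.out aa ; done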

The file was placed on an 8 GB partition on the HDD listed below, to
avoid speed variation due to different offsets on disk. The results are
reliable:
- 189 vs 180 seconds on Linux 3.16

The following setups have been tested:
1) ext4 with block size equal to 1024 over 512/512 physical/logical
   sector size SSD disk
2) ext4 with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk
3) ext4 with block size equal to 4096 over 512/4096 physical/logical
   sector size rotational disk (WDC WD20EZRX)
4) xfs with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk

The difference is quite reliable and is the same 5% in each setup.
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for an image in qcow2 format is 1% faster.

qemu-img is also affected. For
  qemu-img create -f qcow2 1.img 64G
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
  time for i in `seq 1 30` ; do qemu-img convert 1.img -t none -O raw 2.img ; rm -rf 2.img ; done
the difference is around 126 vs 119 seconds.

The justification of the performance improvement is quite interesting.
From the kernel's point of view, each request to the disk was split
in two. This can be seen with blktrace:
  9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
  9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the number of requests sent to disk (which can be calculated by
counting the lines in the blktrace output) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does its job well and real requests come properly aligned (to page).

Changes from v7:
- make assignment from v6 unconditional (Kevin)

Changes from v6:
- explicitly assign opt_mem_alignment in raw-posix.c with
  MAX(s->buf_align, getpagesize()) (Kevin)

Changes from v5:
- found justification from kernel point of view
- fixed checkpatch warnings in the patch 2

Changes from v4:
- patches reordered
- dropped conversion from 512 to BDRV_SECTOR_SIZE
- getpagesize() is replaced with MAX(4096, getpagesize()) as suggested by
  Kevin

Changes from v3:
- portable way to calculate system page size used
- 512/4096 values are replaced with proper macros/values

Changes from v2:
- opt_mem_alignment is split to opt_mem_alignment for bounce buffering
  and min_mem_alignment to check buffers coming from guest.

Changes from v1:
- enforces 4096 alignment in qemu_(try_)blockalign, avoid touching of
  bdrv_qiov_is_aligned path not to enforce additional bounce buffering
  as suggested by Paolo
- reduces 10% to 5% in patch description to better fit 180 vs 189
  difference

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Denis V. Lunev

On 12/05/15 17:26, Kevin Wolf wrote:

Am 12.05.2015 um 16:20 hat Denis V. Lunev geschrieben:

On 12/05/15 17:08, Kevin Wolf wrote:

Am 12.05.2015 um 15:41 hat Denis V. Lunev geschrieben:

The following sequence
 int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
 for (i = 0; i < 10; i++)
 write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.

On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.

The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.

The justification of the performance improve is quite interesting.
 From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
  block.c   |  8 
  block/io.c|  2 +-
  block/raw-posix.c | 15 +--
  3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e293907..325f727 100644
--- a/block.c
+++ b/block.c
@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
  size_t bdrv_opt_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }
  return bs->bl.opt_mem_alignment;
@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
  size_t bdrv_min_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }
  return bs->bl.min_mem_alignment;
diff --git a/block/io.c b/block/io.c
index 908a3d1..071652c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
  } else {
  bs->bl.min_mem_alignment = 512;
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = getpagesize();
  }
  if (bs->backing_hd) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7083924..4659552 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  {
  BDRVRawState *s = bs->opaque;
  char *buf;
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
  /* For /dev/sg devices the alignment is not really used.
 With buffered I/O, we don't have any restrictions. */
@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  /* If we could not get the sizes so far, we can only guess them */
  if (!s->buf_align) {
  size_t align;
-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
+buf = qemu_memalign(max_align, 2 * max_align);
+for (align = 512; align <= max_align; align <<= 1) {
+if (raw_is_io_aligned(fd, buf + align, max_align)) {
  s->buf_align = align;
  break;
  }
@@ -342,8 +343,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  if (!bs->request_alignment) {
  size_t align;
-buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BL

Re: [Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Kevin Wolf
Am 12.05.2015 um 16:20 hat Denis V. Lunev geschrieben:
> On 12/05/15 17:08, Kevin Wolf wrote:
> >Am 12.05.2015 um 15:41 hat Denis V. Lunev geschrieben:
> >>The following sequence
> >> int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
> >> for (i = 0; i < 10; i++)
> >> write(fd, buf, 4096);
> >>performs 5% better if buf is aligned to 4096 bytes.
> >>
> >>The difference is quite reliable.
> >>
> >>On the other hand we do not want at the moment to enforce bounce
> >>buffering if guest request is aligned to 512 bytes.
> >>
> >>The patch changes default bounce buffer optimal alignment to
> >>MAX(page size, 4k). 4k is chosen as maximal known sector size on real
> >>HDD.
> >>
> >>The justification of the performance improve is quite interesting.
> >> From the kernel point of view each request to the disk was split
> >>by two. This could be seen by blktrace like this:
> >>   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
> >>   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
> >>   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
> >>   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
> >>   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
> >>   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
> >>   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
> >>   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
> >>After the patch the pattern becomes normal:
> >>   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
> >>   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
> >>   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
> >>   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
> >>and the amount of requests sent to disk (could be calculated counting
> >>number of lines in the output of blktrace) is reduced about 2 times.
> >>
> >>Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
> >>does his job well and real requests comes properly aligned (to page).
> >>
> >>Signed-off-by: Denis V. Lunev 
> >>CC: Paolo Bonzini 
> >>CC: Kevin Wolf 
> >>CC: Stefan Hajnoczi 
> >>---
> >>  block.c   |  8 
> >>  block/io.c|  2 +-
> >>  block/raw-posix.c | 15 +--
> >>  3 files changed, 14 insertions(+), 11 deletions(-)
> >>
> >>diff --git a/block.c b/block.c
> >>index e293907..325f727 100644
> >>--- a/block.c
> >>+++ b/block.c
> >>@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
> >>  size_t bdrv_opt_mem_align(BlockDriverState *bs)
> >>  {
> >>  if (!bs || !bs->drv) {
> >>-/* 4k should be on the safe side */
> >>-return 4096;
> >>+/* page size or 4k (hdd sector size) should be on the safe side */
> >>+return MAX(4096, getpagesize());
> >>  }
> >>  return bs->bl.opt_mem_alignment;
> >>@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
> >>  size_t bdrv_min_mem_align(BlockDriverState *bs)
> >>  {
> >>  if (!bs || !bs->drv) {
> >>-/* 4k should be on the safe side */
> >>-return 4096;
> >>+/* page size or 4k (hdd sector size) should be on the safe side */
> >>+return MAX(4096, getpagesize());
> >>  }
> >>  return bs->bl.min_mem_alignment;
> >>diff --git a/block/io.c b/block/io.c
> >>index 908a3d1..071652c 100644
> >>--- a/block/io.c
> >>+++ b/block/io.c
> >>@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> >>**errp)
> >>  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
> >>  } else {
> >>  bs->bl.min_mem_alignment = 512;
> >>-bs->bl.opt_mem_alignment = 512;
> >>+bs->bl.opt_mem_alignment = getpagesize();
> >>  }
> >>  if (bs->backing_hd) {
> >>diff --git a/block/raw-posix.c b/block/raw-posix.c
> >>index 7083924..4659552 100644
> >>--- a/block/raw-posix.c
> >>+++ b/block/raw-posix.c
> >>@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, 
> >>int fd, Error **errp)
> >>  {
> >>  BDRVRawState *s = bs->opaque;
> >>  char *buf;
> >>+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
> >>  /* For /dev/sg devices the alignment is not really used.
> >> With buffered I/O, we don't have any restrictions. */
> >>@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, 
> >>int fd, Error **errp)
> >>  /* If we could not get the sizes so far, we can only guess them */
> >>  if (!s->buf_align) {
> >>  size_t align;
> >>-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
> >>-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
> >>-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
> >>+buf = qemu_memalign(max_align, 2 * max_align);
> >>+for (align = 512; align <= max_align; align <<= 1) {
> >>+

Re: [Qemu-devel] [PATCH 08/34] block: Add list of children to BlockDriverState

2015-05-12 Thread Kevin Wolf
Am 11.05.2015 um 17:45 hat Max Reitz geschrieben:
> On 08.05.2015 19:21, Kevin Wolf wrote:
> >This allows iterating over all children of a given BDS, not only
> >including bs->file and bs->backing_hd, but also driver-specific
> >ones like VMDK extents or Quorum children.
> >
> >Signed-off-by: Kevin Wolf 
> >---
> >  block.c   | 27 +++
> >  include/block/block_int.h |  8 
> >  2 files changed, 35 insertions(+)
> >
> >diff --git a/block.c b/block.c
> >index c4f0fb4..59f54ed 100644
> >--- a/block.c
> >+++ b/block.c
> >@@ -1301,6 +1301,19 @@ out:
> >  return ret;
> >  }
> >+static void bdrv_attach_child(BlockDriverState *parent_bs,
> >+  BlockDriverState *child_bs,
> >+  const BdrvChildRole *child_role)
> >+{
> >+BdrvChild *child = g_new(BdrvChild, 1);
> >+*child = (BdrvChild) {
> >+.bs = child_bs,
> >+.role   = child_role,
> >+};
> >+
> >+QLIST_INSERT_HEAD(&parent_bs->children, child, next);
> >+}
> >+
> >  /*
> >   * Opens a disk image (raw, qcow2, vmdk, ...)
> >   *
> >@@ -1353,6 +1366,9 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
> >const char *filename,
> >  return -ENODEV;
> >  }
> >  bdrv_ref(bs);
> >+if (child_role) {
> >+bdrv_attach_child(parent, bs, child_role);
> >+}
> >  *pbs = bs;
> >  return 0;
> >  }
> >@@ -1495,6 +1511,10 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
> >const char *filename,
> >  goto close_and_fail;
> >  }
> >+if (child_role) {
> >+bdrv_attach_child(parent, bs, child_role);
> >+}
> >+
> >  QDECREF(options);
> >  *pbs = bs;
> >  return 0;
> >@@ -1789,6 +1809,12 @@ void bdrv_close(BlockDriverState *bs)
> >  notifier_list_notify(&bs->close_notifiers, bs);
> >  if (bs->drv) {
> >+BdrvChild *child, *next;
> >+
> >+QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
> >+g_free(child);
> >+}
> >+
> 
> Not considering the case where the child is closed before the parent
> assumes all children are reference-counted from the parent and they
> won't be closed (and maybe replaced with another BDS) on purpose.
> The first seems reasonable, the second one I'm not so sure about. It
> works for now, but I could imagine that we want to modify children
> of a Quorum instance at runtime.
> 
> But I can't imagine any case where this would break right now, so I
> guess I'm fine with it.

We don't have that yet, but I suppose to remove a child you would modify
the Quorum node to drop the child reference, which should at the same
time remove it from the children list.

What we currently can do (I think) is replacing a node with another
node. In that case, I thought bdrv_swap() would do the right thing, but
maybe it doesn't.

> >  if (bs->backing_hd) {
> >  BlockDriverState *backing_hd = bs->backing_hd;
> >  bdrv_set_backing_hd(bs, NULL);
> >@@ -1999,6 +2025,7 @@ void bdrv_append(BlockDriverState *bs_new, 
> >BlockDriverState *bs_top)
> >  /* The contents of 'tmp' will become bs_top, as we are
> >   * swapping bs_new and bs_top contents. */
> >  bdrv_set_backing_hd(bs_top, bs_new);
> >+bdrv_attach_child(bs_top, bs_new, &child_backing);
> >  }
> >  static void bdrv_delete(BlockDriverState *bs)
> 
> Using a mirror block job, we can force bdrv_swap() on arbitrary
> nodes, right? What happens if you swap e.g. a VMDK and a quorum
> node? Well, maybe one simply cannot swap a quorum node due to
> blockers, but I guess one can swap a VMDK node with some non-VMDK
> node. It is actually correct to leave the extents behind; but the
> other node cannot do anything with them, so because they are part of
> the opaque VMDK structure, they will de-facto remain with VMDK,
> while being counted as children of the other node. But I try to keep
> so far away from bdrv_swap() that I don't even know whether this
> case is even possible.

I suspect that instead of doing bdrv_attach_child() here, the child list
must be handled in bdrv_move_feature_fields(), so that the swapped BDSes
effectively swap their roles.

bs->inherits_from (next patch) might be similar. I'm not completely sure
yet what the ideal behaviour would be there. Or perhaps just set it to
NULL for both swapped BDSes.

Kevin



Re: [Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Denis V. Lunev

On 12/05/15 17:08, Kevin Wolf wrote:

Am 12.05.2015 um 15:41 hat Denis V. Lunev geschrieben:

The following sequence
 int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
 for (i = 0; i < 10; i++)
 write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.

On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.

The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.

The justification of the performance improve is quite interesting.
 From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
  block.c   |  8 
  block/io.c|  2 +-
  block/raw-posix.c | 15 +--
  3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e293907..325f727 100644
--- a/block.c
+++ b/block.c
@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
  size_t bdrv_opt_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }
  
  return bs->bl.opt_mem_alignment;

@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
  size_t bdrv_min_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }
  
  return bs->bl.min_mem_alignment;

diff --git a/block/io.c b/block/io.c
index 908a3d1..071652c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
  } else {
  bs->bl.min_mem_alignment = 512;
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = getpagesize();
  }
  
  if (bs->backing_hd) {

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7083924..4659552 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  {
  BDRVRawState *s = bs->opaque;
  char *buf;
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
  
  /* For /dev/sg devices the alignment is not really used.

 With buffered I/O, we don't have any restrictions. */
@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  /* If we could not get the sizes so far, we can only guess them */
  if (!s->buf_align) {
  size_t align;
-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
+buf = qemu_memalign(max_align, 2 * max_align);
+for (align = 512; align <= max_align; align <<= 1) {
+if (raw_is_io_aligned(fd, buf + align, max_align)) {
  s->buf_align = align;
  break;
  }
@@ -342,8 +343,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  
  if (!bs->request_alignment) {

  size_t align;
-buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+buf = qemu_memalign(s->buf_align, max_ali

Re: [Qemu-devel] [Xen-devel] [PATCH v6 2/6] Qemu-Xen-vTPM: Xen frontend driver infrastructure

2015-05-12 Thread Stefano Stabellini
On Tue, 12 May 2015, Xu, Quan wrote:
> > -Original Message-
> > From: xen-devel-boun...@lists.xen.org
> > [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Stefano Stabellini
> > Sent: Friday, May 08, 2015 1:26 AM
> > To: Xu, Quan
> > Cc: wei.l...@citrix.com; stef...@linux.vnet.ibm.com;
> > stefano.stabell...@eu.citrix.com; qemu-devel@nongnu.org;
> > xen-de...@lists.xen.org; dgde...@tycho.nsa.gov; ebl...@redhat.com
> > Subject: Re: [Xen-devel] [PATCH v6 2/6] Qemu-Xen-vTPM: Xen frontend driver
> > infrastructure
> > 
> > On Mon, 4 May 2015, Quan Xu wrote:
> > > This patch adds infrastructure for xen front drivers living in qemu,
> > > so drivers don't need to implement common stuff on their own.  It's
> > > mostly xenbus management stuff: some functions to access XenStore,
> > > setting up XenStore watches, callbacks on device discovery and state
> > > changes, and handle event channel between the virtual machines.
> > >
> > > Call xen_fe_register() function to register XenDevOps, and make sure,
> > > XenDevOps's flags is DEVOPS_FLAG_FE, which is flag bit to point out
> > > the XenDevOps is Xen frontend.
> > >
> > > Create a new file xen_pvdev.c for some common part of xen frontend and
> > > backend, such as xendevs queue and xenstore update functions.
> > >
> > > Signed-off-by: Quan Xu 
> > 
> > Better than the early versions, thanks.
> > 
> > However the patch is too big and it is too difficult to read as is.
> > Could you please split it in two: a patch that creates xen_pvdev.c and 
> > moves a
> > few functions from xen_backend.c to it and a second patch that introduces
> > xen_frontend.c.
> > 
> 
> Stefano,
>I missed this comment. Sorry for that.
>Agreed, also I think it is too big. I will do it in v8. 

Thanks! If you could also try to address the other comments on this
patch in v8, that would be great.



Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 14:14, Cornelia Huck  wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz  wrote:
>> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
>> *features, unsigned int fbit)
>>
>>  static inline bool __virtio_has_feature(uint32_t features, unsigned int 
>> fbit)
>>  {
>> -assert(fbit < 32);
>>  return !!(features & (1 << fbit));
>>  }
>>
>>
>>
>
> I must say I'm not very comfortable with knowingly passing out-of-range
> values to this function.

It would invoke C undefined behaviour, so clearly a bug if we did
pass an out-of-range value here. You'd need to at least do
    if (fbit >= 32) {
        return false;
    }
if you want to make it valid.

-- PMM



Re: [Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Kevin Wolf
Am 12.05.2015 um 15:41 hat Denis V. Lunev geschrieben:
> The following sequence
> int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
> for (i = 0; i < 10; i++)
> write(fd, buf, 4096);
> performs 5% better if buf is aligned to 4096 bytes.
> 
> The difference is quite reliable.
> 
> On the other hand we do not want at the moment to enforce bounce
> buffering if guest request is aligned to 512 bytes.
> 
> The patch changes default bounce buffer optimal alignment to
> MAX(page size, 4k). 4k is chosen as maximal known sector size on real
> HDD.
> 
> The justification of the performance improve is quite interesting.
> From the kernel point of view each request to the disk was split
> by two. This could be seen by blktrace like this:
>   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
>   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
>   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
>   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
>   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
>   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
>   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
>   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
> After the patch the pattern becomes normal:
>   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
>   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
>   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
>   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
> and the amount of requests sent to disk (could be calculated counting
> number of lines in the output of blktrace) is reduced about 2 times.
> 
> Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
> does his job well and real requests comes properly aligned (to page).
> 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Kevin Wolf 
> CC: Stefan Hajnoczi 
> ---
>  block.c   |  8 
>  block/io.c|  2 +-
>  block/raw-posix.c | 15 +--
>  3 files changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/block.c b/block.c
> index e293907..325f727 100644
> --- a/block.c
> +++ b/block.c
> @@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
>  size_t bdrv_opt_mem_align(BlockDriverState *bs)
>  {
>  if (!bs || !bs->drv) {
> -/* 4k should be on the safe side */
> -return 4096;
> +/* page size or 4k (hdd sector size) should be on the safe side */
> +return MAX(4096, getpagesize());
>  }
>  
>  return bs->bl.opt_mem_alignment;
> @@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
>  size_t bdrv_min_mem_align(BlockDriverState *bs)
>  {
>  if (!bs || !bs->drv) {
> -/* 4k should be on the safe side */
> -return 4096;
> +/* page size or 4k (hdd sector size) should be on the safe side */
> +return MAX(4096, getpagesize());
>  }
>  
>  return bs->bl.min_mem_alignment;
> diff --git a/block/io.c b/block/io.c
> index 908a3d1..071652c 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> **errp)
>  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
>  } else {
>  bs->bl.min_mem_alignment = 512;
> -bs->bl.opt_mem_alignment = 512;
> +bs->bl.opt_mem_alignment = getpagesize();
>  }
>  
>  if (bs->backing_hd) {
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 7083924..4659552 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>  {
>  BDRVRawState *s = bs->opaque;
>  char *buf;
> +size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
>  
>  /* For /dev/sg devices the alignment is not really used.
> With buffered I/O, we don't have any restrictions. */
> @@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>  /* If we could not get the sizes so far, we can only guess them */
>  if (!s->buf_align) {
>  size_t align;
> -buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
> -for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
> -if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
> +buf = qemu_memalign(max_align, 2 * max_align);
> +for (align = 512; align <= max_align; align <<= 1) {
> +if (raw_is_io_aligned(fd, buf + align, max_align)) {
>  s->buf_align = align;
>  break;
>  }
> @@ -342,8 +343,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>  
>  if (!bs->request_alignment) {
>  size_t align;
> -buf = qemu_memalign(s->

Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Greg Kurz
On Tue, 12 May 2015 15:14:53 +0200
Cornelia Huck  wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz  wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >  include/hw/virtio/virtio.h |1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
> > *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int 
> > fbit)
> >  {
> > -assert(fbit < 32);
> >  return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-range
> values to this function.
> 

I take that as a valid reason then :)

> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

Hmm... if I remember well, it still lacks migration support.

--
Greg
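
A side note on the concern quoted above: in C, shifting a 32-bit int by 32 or more is undefined behaviour, so once the assert is gone, `1 << fbit` is not guaranteed to evaluate to false for fbit >= 32. Below is a minimal sketch of a check that returns false explicitly for out-of-range bits. The helper name `has_feature_checked` is hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: report "not set" for any bit
 * that does not fit in the 32-bit feature word, instead of performing
 * an out-of-range shift. */
static inline bool has_feature_checked(uint32_t features, unsigned int fbit)
{
    if (fbit >= 32) {
        return false;
    }
    /* Unsigned constant avoids shifting into the sign bit for fbit == 31. */
    return !!(features & (UINT32_C(1) << fbit));
}
```

With this shape, callers that probe a 64-bit feature such as VIRTIO_F_VERSION_1 on a 32-bit feature word simply get false, which seems to be the behaviour the patch description is after.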




Re: [Qemu-devel] [v8 13/14] migration: Add qmp commands to set and query parameters

2015-05-12 Thread Dr. David Alan Gilbert
* Li, Liang Z (liang.z...@intel.com) wrote:
> > 
> > * Liang Li (liang.z...@intel.com) wrote:
> > > Add the qmp commands to tune and query the parameters used in live
> > > migration.
> > 
> > Hi,
> >   Do you know if there's anyone working on libvirt code to drive this 
> > interface
> > and turn on your compression code?
> > 
> 
> Yes, I have confirmed that one person at Intel is working on this.

Great; I look forward to trying it.

Dave

> 
> Liang
> 
> > Dave
> > 
> > >
> > > Signed-off-by: Liang Li 
> > > Signed-off-by: Yang Zhang 
> > > ---
> > >  migration/migration.c | 56
> > ++
> > >  qapi-schema.json  | 45
> > 
> > >  qmp-commands.hx   | 57
> > +++
> > >  3 files changed, 158 insertions(+)
> > >
> > > diff --git a/migration/migration.c b/migration/migration.c index
> > > 533717c..8732803 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -188,6 +188,21 @@ MigrationCapabilityStatusList
> > *qmp_query_migrate_capabilities(Error **errp)
> > >  return head;
> > >  }
> > >
> > > +MigrationParameters *qmp_query_migrate_parameters(Error **errp) {
> > > +MigrationParameters *params;
> > > +MigrationState *s = migrate_get_current();
> > > +
> > > +params = g_malloc0(sizeof(*params));
> > > +params->compress_level = s-
> > >parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL];
> > > +params->compress_threads =
> > > +s->parameters[MIGRATION_PARAMETER_COMPRESS_THREADS];
> > > +params->decompress_threads =
> > > +s->parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS];
> > > +
> > > +return params;
> > > +}
> > > +
> > >  static void get_xbzrle_cache_stats(MigrationInfo *info)  {
> > >  if (migrate_use_xbzrle()) {
> > > @@ -301,6 +316,47 @@ void
> > qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
> > >  }
> > >  }
> > >
> > > +void qmp_migrate_set_parameters(bool has_compress_level,
> > > +int64_t compress_level,
> > > +bool has_compress_threads,
> > > +int64_t compress_threads,
> > > +bool has_decompress_threads,
> > > +int64_t decompress_threads, Error
> > > +**errp) {
> > > +MigrationState *s = migrate_get_current();
> > > +
> > > +if (has_compress_level && (compress_level < 0 || compress_level > 9))
> > {
> > > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > "compress_level",
> > > +  "a value in range [0, 9]");
> > > +return;
> > > +}
> > > +if (has_compress_threads &&
> > > +(compress_threads < 1 || compress_threads > 255)) {
> > > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > > +  "compress_threads",
> > > +  "a value in range [1, 255]");
> > > +return;
> > > +}
> > > +if (has_decompress_threads &&
> > > +(decompress_threads < 1 || decompress_threads > 255)) {
> > > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > > +  "decompress_threads",
> > > +  "a value in range [1, 255]");
> > > +return;
> > > +}
> > > +
> > > +if (has_compress_level) {
> > > +s->parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL] =
> > compress_level;
> > > +}
> > > +if (has_compress_threads) {
> > > +s->parameters[MIGRATION_PARAMETER_COMPRESS_THREADS] =
> > compress_threads;
> > > +}
> > > +if (has_decompress_threads) {
> > > +s->parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS] =
> > > +decompress_threads;
> > > +}
> > > +}
> > > +
> > >  /* shared migration helpers */
> > >
> > >  static void migrate_set_state(MigrationState *s, int old_state, int
> > > new_state) diff --git a/qapi-schema.json b/qapi-schema.json index
> > > 121fcc7..579801b 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -592,6 +592,51 @@
> > >  { 'enum': 'MigrationParameter',
> > >'data': ['compress-level', 'compress-threads',
> > > 'decompress-threads'] }
> > >
> > > +#
> > > +# @migrate-set-parameters
> > > +#
> > > +# Set the following migration parameters # # @compress-level:
> > > +compression level # # @compress-threads: compression thread count # #
> > > +@decompress-threads: decompression thread count # # Since: 2.3 ## {
> > > +'command': 'migrate-set-parameters',
> > > +  'data': { '*compress-level': 'int',
> > > +'*compress-threads': 'int',
> > > +'*decompress-threads': 'int'} }
> > > +
> > > +#
> > > +# @MigrationParameters
> > > +#
> > > +# @compress-level: compression level
> > > +#
> > > +# @compress-threads: compression thread count # #
> > > +@decompress-threads: decompression thread count # # Since: 2.3 ## 

Re: [Qemu-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Fabio Fantoni

On 12/05/2015 12:26, Fabio Fantoni wrote:

On 12/05/2015 11:23, Fabio Fantoni wrote:

On 11/05/2015 17:04, Fabio Fantoni wrote:

On 21/04/2015 14:53, Stefano Stabellini wrote:

On Tue, 21 Apr 2015, Fabio Fantoni wrote:

On 21/04/2015 12:49, Stefano Stabellini wrote:

On Mon, 20 Apr 2015, Fabio Fantoni wrote:
I updated xen and qemu from xen 4.5.0 with its upstream qemu included to
xen 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to
use revision "master").
A few minutes after I booted a Windows 7 64-bit domU, qemu crashed; I
tried 2 times with the same result.

In the domU's qemu log:

qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
(((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
__builtin_offsetof (struct malloc_chunk, fd && old_size == 0) ||
((unsigned long) (old_size) >= (unsigned long)__builtin_offsetof
(struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) &
~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) &&
((unsigned long)old_end & pagemask) == 0)' failed.
Killing all inferiors

In attachment the full backtrace of the qemu crash.

After a quick search following the backtrace I found a probable cause of
the regression (I'm not sure):
http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa

spice: make sure we don't overflow ssd->buf

I also added qemu-devel and spice-devel in cc.

If you need more information/tests, tell me and I'll post them.

   Maybe you could try to revert the offending commit
(5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
crash?

Thanks for your reply.

I reverted to 4.5.0 on dom0 for now on that system because I'm busy
trying to find another problem that causes very bad performance without
errors or anything in the logs :( I don't know yet if it is xen related,
kernel related or something else.

About this regression with spice I'll do further tests in the next days
(probably starting by reverting the spice patch in qemu) but any help is
appreciated.
Based on the data I have for now, is it possible that the problem is
that qemu tries to allocate other ram or videoram after domU creation,
which is not possible with xen?
In the spice related patch I saw something about dynamic allocation, for
example.

It is probably caused by a commit in the range:

1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4

there are only 10 commits in that range. By using git bisect you should
be able to narrow it down in just 3 tests.


Sorry for the delay, I was busy with many things. Today I retried with
updated stable-4.5 and also, in a second test, with "spice: make sure we
don't overflow ssd->buf" reverted, but in both cases the regression
remains :(

Tomorrow I'll probably do other tests.


I did another test, reverting this instead:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8

And now it seems I'm unable to reproduce the regression: before, it
happened after a few seconds up to 1-2 minutes; now I have used the same
domU for 15-20 minutes without problems.
This is probably the cause of the regression, even if it seems strange
that on unstable, with the same patch, it didn't happen in the tests of
some days ago.


Any ideas?

Thanks for any reply and sorry for my bad english.


Bad news, the qemu crash still happens, even if this time there is a
different output in the qemu log, see attachment.

After taking a look at the other patches I saw:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf

It has "Conflicts: hw/display/vga.c" in the description; I'll try to
revert it instead.


Or can someone tell me another probable test I can try?


I also tried to revert the patch above, with the same result, so I
retried with qemu from 4.5.0 and it seems the crash happens in this case
too... I'm going crazy :(


In attachment the full gdb log.

Any ideas on how to find the problem please?

Thanks for any reply and sorry for my bad english.
Full backtrace:
#0  0x736e8165 in *__GI_raise (sig=) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
pid = 
selftid = 
#1  0x736eb3e0 in *__GI_abort () at abort.c:92
act = {__sigaction_handler = {sa_handler = 0x58ddeba0, sa_sigaction 
= 0x58ddeba0}, sa_mask = {__val = {140737278660816, 140737014337136, 4, 
140737014337376, 140737277706678, 206158430256, 140737014337416, 
140737014337168, 87, 226653584, 140737351936019, 140737488348083, 
140737278647399, 140737278651152, 3096, 140737277299604}}, sa_flags = 
-474017696, sa_restorer = 0x736b9c60}
sigs = {__val = {32, 0 }}
#2  0x7372bdea in __malloc_assert (assertion=, 
file=, line=, function=) at 
malloc.c:351
No locals.
#3  0x7372ed13 in sYSMALLOc (av=, nb=) at 
malloc.c:3093
snd_brk = 
front_misalign = 
remainder = 
tried_mmap = false
old_size = 
size = 
   

Re: [Qemu-devel] [PATCH] ui: use libexpoxy

2015-05-12 Thread Gerd Hoffmann
On Tue, 2015-05-12 at 14:10 +0100, Peter Maydell wrote:
> On 12 May 2015 at 14:04, Gerd Hoffmann  wrote:
> > libepoxy does the opengl extension handling for us.
> >
> > It also is helpful for trouble-shooting as it prints nice error messages
> > instead of silently failing or segfaulting in case we do something
> > wrong, like using gl commands not supported by the current context.
> 
> How widely supported is this library? How long has it been around,
> is it carried by all the distros, does it work ok on OSX and Windows?
> 
> I'm a bit uncertain about adding dependencies that would limit the
> scope where we can provide important functionality like 3D
> acceleration.

https://github.com/anholt/libepoxy

It is relatively new (a bit more than a year old), supposed to work on
both osx and windows (didn't test myself though), and it seems to be
commonly included in distros (although newer versions only).

cheers,
  Gerd





Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Cornelia Huck
On Tue, 12 May 2015 15:34:47 +0200
"Michael S. Tsirkin"  wrote:

> On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:07:37 +0200
> > Greg Kurz  wrote:
> > 
> > > Unlike with add and clear, there is no valid reason to abort when checking
> > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 
> > > 32.
> > > 
> > > This allows to introduce code that is aware about new 64-bit features like
> > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > 
> > > Signed-off-by: Greg Kurz 
> > > ---
> > >  include/hw/virtio/virtio.h |1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index d95f8b6..6ef70f1 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
> > > *features, unsigned int fbit)
> > > 
> > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int 
> > > fbit)
> > >  {
> > > -assert(fbit < 32);
> > >  return !!(features & (1 << fbit));
> > >  }
> > > 
> > > 
> > > 
> > 
> > I must say I'm not very comfortable with knowingly passing out-of-range
> > values to this function.
> > 
> > Can we perhaps apply at least the feature-bit-size extending patches
> > prior to your patchset, if the remainder of the virtio-1 patchset still
> > takes some time?
> 
> So the feature-bit-size extending patches currently don't support
> migration correctly, that's why they are not merged.
> 
> What I think we need to do for this is move host_features out
> from transports into core virtio device.
> 
> Then we can simply check host features >31 and skip
> migrating low guest features if none set.
> 
> Thoughts? Any takers?
> 

After we move host_features, put them into an optional vmstate
subsection?

I think with the recent patchsets, most of the interesting stuff is
already not handled by the transport anymore. There's only
VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
ccw).
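
The "only send it when needed" idea behind an optional subsection could be sketched as below. This is a standalone illustration, not QEMU code: `FeatureState`, `high_features_needed` and `low_guest_features` are hypothetical names, and the 64-bit fields assume that the feature-bit-size extending patches have been applied and host_features has been moved into the core device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-device state; real QEMU structures differ. */
typedef struct FeatureState {
    uint64_t host_features;   /* features offered by the host */
    uint64_t guest_features;  /* features acknowledged by the guest */
} FeatureState;

/* True if any host feature bit above 31 is offered, i.e. the high half
 * would have to travel in the optional subsection. */
static bool high_features_needed(const FeatureState *s)
{
    return (s->host_features >> 32) != 0;
}

/* Low 32 bits, the part a 32-bit-only migration format can carry. */
static uint32_t low_guest_features(const FeatureState *s)
{
    return (uint32_t)(s->guest_features & UINT32_MAX);
}
```

A predicate of this kind would let a stream produced by a device with no high feature bits stay identical to the existing format, which is presumably what makes the subsection approach attractive here.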




Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz  wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >  include/hw/virtio/virtio.h |1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
> > *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int 
> > fbit)
> >  {
> > -assert(fbit < 32);
> >  return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-range
> values to this function.
> 
> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

So the feature-bit-size extending patches currently don't support
migration correctly, that's why they are not merged.

What I think we need to do for this is move host_features out
from transports into core virtio device.

Then we can simply check host features >31 and skip
migrating low guest features if none set.

Thoughts? Any takers?

-- 
MST



[Qemu-devel] [PATCH v7 0/2] block: enforce minimal 4096 alignment in qemu_blockalign

2015-05-12 Thread Denis V. Lunev
I have used the following program to test:
#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <malloc.h>

int main(int argc, char *argv[])
{
    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    void *buf;
    int i = 0, align = atoi(argv[2]);

    do {
        buf = memalign(align, 4096);
        if (align >= 4096)
            break;
        if ((unsigned long)buf & 4095)
            break;
        i++;
    } while (1);
    printf("%d %p\n", i, buf);

    memset(buf, 0x11, 4096);

    for (i = 0; i < 10; i++) {
        lseek(fd, SEEK_CUR, 4096);
        write(fd, buf, 4096);
    }

    close(fd);
    return 0;
}
for i in `seq 1 30` ; do a.out aa ; done

The file was placed on an 8 GB partition on the HDD below to avoid speed
changes due to different offsets on disk. Results are reliable:
- 189 vs 180 seconds on Linux 3.16

The following setups have been tested:
1) ext4 with block size equals to 1024 over 512/512 physical/logical
   sector size SSD disk
2) ext4 with block size equals to 4096 over 512/512 physical/logical
   sector size SSD disk
3) ext4 with block size equals to 4096 over 512/4096 physical/logical
   sector size rotational disk (WDC WD20EZRX)
4) xfs with block size equals to 4096 over 512/512 physical/logical
   sector size SSD disk

The difference is quite reliable and is the same 5% in all setups.
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for an image in qcow2 format is 1% faster.

qemu-img is also affected. The difference in between
  qemu-img create -f qcow2 1.img 64G
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
  time for i in `seq 1 30` ; do qemu-img convert 1.img -t none -O raw 2.img ; rm -rf 2.img ; done
is around 126 vs 119 seconds.

The justification of the performance improvement is quite interesting.
From the kernel point of view each request to the disk was split
in two. This could be seen by blktrace like this:
  9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
  9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does its job well and real requests come properly aligned (to page).
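
For readers who want to try the effect outside of QEMU, here is a minimal sketch of allocating a buffer aligned to MAX(4096, page size), the value this series settles on. The function names are hypothetical, and the sketch uses sysconf(_SC_PAGESIZE) and posix_memalign() rather than the getpagesize() and qemu_memalign() calls that appear in the patches.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Alignment used for bounce buffers in this sketch: the larger of 4k
 * and the system page size, mirroring MAX(4096, getpagesize()). */
static size_t bounce_alignment(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t align = 4096;

    if (page > 0 && (size_t)page > align) {
        align = (size_t)page;
    }
    return align;
}

/* Allocate 'size' bytes aligned to bounce_alignment(); returns NULL on
 * failure. The caller releases the buffer with free(). */
static void *alloc_bounce_buffer(size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, bounce_alignment(), size) != 0) {
        return NULL;
    }
    return buf;
}

/* True if 'p' is a multiple of 'align' (align must be non-zero). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

A buffer obtained this way can be passed to write() on an O_DIRECT file descriptor in place of the memalign() buffer from the test program above.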

Changes from v6:
- explicitly assign opt_mem_alignment in raw-posix.c with
  MAX(s->buf_align, getpagesize()) (Kevin)

Changes from v5:
- found justification from kernel point of view
- fixed checkpatch warnings in the patch 2

Changes from v4:
- patches reordered
- dropped conversion from 512 to BDRV_SECTOR_SIZE
- getpagesize() is replaced with MAX(4096, getpagesize()) as suggested by
  Kevin

Changes from v3:
- portable way to calculate system page size used
- 512/4096 values are replaced with proper macros/values

Changes from v2:
- opt_mem_alignment is split to opt_mem_alignment for bounce buffering
  and min_mem_alignment to check buffers coming from guest.

Changes from v1:
- enforces 4096 alignment in qemu_(try_)blockalign, avoid touching of
  bdrv_qiov_is_aligned path not to enforce additional bounce buffering
  as suggested by Paolo
- reduces 10% to 5% in patch description to better fit 180 vs 189
  difference

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 




[Qemu-devel] [PATCH 1/2] block: minimal bounce buffer alignment

2015-05-12 Thread Denis V. Lunev
The patch introduces a new concept: minimal memory alignment for bounce
buffers. The original so-called "optimal" value is actually the minimal
required value for alignment. It should be used to validate that the
IOVec is properly aligned and that a bounce buffer is not required.

However, from the performance point of view, it would be better if a
bounce buffer or IOVec allocated by QEMU were aligned more strictly.

The patch does not change any alignment value yet.

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
Reviewed-by: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block.c   | 11 +++
 block/io.c|  7 ++-
 block/raw-posix.c |  1 +
 include/block/block.h |  2 ++
 include/block/block_int.h |  3 +++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 7904098..e293907 100644
--- a/block.c
+++ b/block.c
@@ -113,6 +113,16 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
 return bs->bl.opt_mem_alignment;
 }
 
+size_t bdrv_min_mem_align(BlockDriverState *bs)
+{
+if (!bs || !bs->drv) {
+/* 4k should be on the safe side */
+return 4096;
+}
+
+return bs->bl.min_mem_alignment;
+}
+
 /* check if the path starts with ":" */
 int path_has_protocol(const char *path)
 {
@@ -890,6 +900,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 }
 
 assert(bdrv_opt_mem_align(bs) != 0);
+assert(bdrv_min_mem_align(bs) != 0);
 assert((bs->request_alignment != 0) || bs->sg);
 return 0;
 
diff --git a/block/io.c b/block/io.c
index 1ce62c4..908a3d1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -201,8 +201,10 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
**errp)
 }
 bs->bl.opt_transfer_length = bs->file->bl.opt_transfer_length;
 bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
+bs->bl.min_mem_alignment = bs->file->bl.min_mem_alignment;
 bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
 } else {
+bs->bl.min_mem_alignment = 512;
 bs->bl.opt_mem_alignment = 512;
 }
 
@@ -221,6 +223,9 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 bs->bl.opt_mem_alignment =
 MAX(bs->bl.opt_mem_alignment,
 bs->backing_hd->bl.opt_mem_alignment);
+bs->bl.min_mem_alignment =
+MAX(bs->bl.min_mem_alignment,
+bs->backing_hd->bl.min_mem_alignment);
 }
 
 /* Then let the driver override it */
@@ -2489,7 +2494,7 @@ void *qemu_try_blockalign0(BlockDriverState *bs, size_t 
size)
 bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
 {
 int i;
-size_t alignment = bdrv_opt_mem_align(bs);
+size_t alignment = bdrv_min_mem_align(bs);
 
 for (i = 0; i < qiov->niov; i++) {
 if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 24d8582..7083924 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -725,6 +725,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error 
**errp)
 BDRVRawState *s = bs->opaque;
 
 raw_probe_alignment(bs, s->fd, errp);
+bs->bl.min_mem_alignment = s->buf_align;
 bs->bl.opt_mem_alignment = s->buf_align;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..c1c963e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -440,6 +440,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
 
 /* Returns the alignment in bytes that is required so that no bounce buffer
  * is required throughout the stack */
+size_t bdrv_min_mem_align(BlockDriverState *bs);
+/* Returns optimal alignment in bytes for bounce buffer */
 size_t bdrv_opt_mem_align(BlockDriverState *bs);
 void bdrv_set_guest_block_size(BlockDriverState *bs, int align);
 void *qemu_blockalign(BlockDriverState *bs, size_t size);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..f004378 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -313,6 +313,9 @@ typedef struct BlockLimits {
 int max_transfer_length;
 
 /* memory alignment so that no bounce buffer is needed */
+size_t min_mem_alignment;
+
+/* memory alignment for bounce buffer */
 size_t opt_mem_alignment;
 } BlockLimits;
 
-- 
1.9.1




[Qemu-devel] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Denis V. Lunev
The following sequence
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
for (i = 0; i < 10; i++)
write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.

On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.

The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.

The justification of the performance improvement is quite interesting.
From the kernel point of view each request to the disk was split
in two. This could be seen by blktrace like this:
  9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
  9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does its job well and real requests come properly aligned (to page).

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block.c   |  8 
 block/io.c|  2 +-
 block/raw-posix.c | 15 +--
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e293907..325f727 100644
--- a/block.c
+++ b/block.c
@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
 size_t bdrv_opt_mem_align(BlockDriverState *bs)
 {
 if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
 }
 
 return bs->bl.opt_mem_alignment;
@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
 size_t bdrv_min_mem_align(BlockDriverState *bs)
 {
 if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
 }
 
 return bs->bl.min_mem_alignment;
diff --git a/block/io.c b/block/io.c
index 908a3d1..071652c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
 } else {
 bs->bl.min_mem_alignment = 512;
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = getpagesize();
 }
 
 if (bs->backing_hd) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7083924..4659552 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 char *buf;
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
 
 /* For /dev/sg devices the alignment is not really used.
With buffered I/O, we don't have any restrictions. */
@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 /* If we could not get the sizes so far, we can only guess them */
 if (!s->buf_align) {
 size_t align;
-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
+buf = qemu_memalign(max_align, 2 * max_align);
+for (align = 512; align <= max_align; align <<= 1) {
+if (raw_is_io_aligned(fd, buf + align, max_align)) {
 s->buf_align = align;
 break;
 }
@@ -342,8 +343,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 
 if (!bs->request_alignment) {
 size_t align;
-buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+buf = qemu_memalign(s->buf_align, max_align);
+for (align = 512; align <= max_align; align <<= 1) {
 if (raw_is_io_aligned(fd, buf, align)) {
 bs->request_al

Re: [Qemu-devel] [PATCH 07/34] block: Move flag inheritance to bdrv_open_inherited()

2015-05-12 Thread Kevin Wolf
On 11.05.2015 at 17:20, Max Reitz wrote:
> On 08.05.2015 19:21, Kevin Wolf wrote:
> >Instead of letting every caller of bdrv_open() determine the right flags
> >for its child node manually and pass them to the function, pass the
> >parent node and the role of the newly opened child (like backing file,
> >protocol layer, etc.).
> >
> >Signed-off-by: Kevin Wolf 
> >---
> >  block.c   | 74 
> > ++-
> >  block/blkdebug.c  |  2 +-
> >  block/blkverify.c |  4 +--
> >  block/quorum.c|  4 +--
> >  block/vmdk.c  |  5 ++--
> >  include/block/block.h |  4 ++-
> >  include/block/block_int.h |  7 +
> >  7 files changed, 78 insertions(+), 22 deletions(-)
> >
> >diff --git a/block.c b/block.c
> >index cea022f..c4f0fb4 100644
> >--- a/block.c
> >+++ b/block.c
> >@@ -79,6 +79,12 @@ static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
> >  static QLIST_HEAD(, BlockDriver) bdrv_drivers =
> >  QLIST_HEAD_INITIALIZER(bdrv_drivers);
> >+static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
> >+ const char *reference, QDict *options, int 
> >flags,
> >+ BlockDriverState* parent,
> 
> Stern zur Variable!

Okay, now that we're in nitpicking mode:

Zur Variablen, wenn schon.

(Sorry for starting a discussion about German grammar on qemu-devel, but
I just have to...)

> >+ const BdrvChildRole *child_role,
> >+ BlockDriver *drv, Error **errp);
> >+
> >  static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
> >  /* If non-zero, use only whitelisted block drivers */
> >  static int use_bdrv_whitelist;
> >@@ -672,8 +678,8 @@ static int bdrv_temp_snapshot_flags(int flags)
> >  }
> >  /*
> >- * Returns the flags that bs->file should get, based on the given flags for
> >- * the parent BDS
> >+ * Returns the flags that bs->file should get if a protocol driver is 
> >expected,
> >+ * based on the given flags for the parent BDS
> >   */
> >  static int bdrv_inherited_flags(int flags)
> >  {
> >@@ -690,6 +696,25 @@ static int bdrv_inherited_flags(int flags)
> >  return flags;
> >  }
> >+const BdrvChildRole child_file = {
> >+.inherit_flags = bdrv_inherited_flags,
> >+};
> >+
> >+/*
> >+ * Returns the flags that bs->file should get if the use of formats (and not
> >+ * only protocols) is permitted for it, based on the given flags for the 
> >parent
> >+ * BDS
> >+ */
> >+static int bdrv_inherited_fmt_flags(int parent_flags)
> >+{
> >+int flags = child_file.inherit_flags(parent_flags);
> >+return flags & ~BDRV_O_PROTOCOL;
> >+}
> >+
> >+const BdrvChildRole child_format = {
> >+.inherit_flags = bdrv_inherited_fmt_flags,
> >+};
> >+
> >  /*
> >   * Returns the flags that bs->backing_hd should get, based on the given 
> > flags
> >   * for the parent BDS
> >@@ -705,6 +730,10 @@ static int bdrv_backing_flags(int flags)
> >  return flags;
> >  }
> >+static const BdrvChildRole child_backing = {
> >+.inherit_flags = bdrv_backing_flags,
> >+};
> >+
> >  static int bdrv_open_flags(BlockDriverState *bs, int flags)
> >  {
> >  int open_flags = flags | BDRV_O_CACHE_WB;
> >@@ -827,7 +856,6 @@ static int bdrv_open_common(BlockDriverState *bs, 
> >BlockDriverState *file,
> >  return 0;
> >  }
> >-bs->open_flags = flags;
> >  bs->guest_block_size = 512;
> >  bs->request_alignment = 512;
> >  bs->zero_beyond_eof = true;
> >@@ -1134,9 +1162,10 @@ int bdrv_open_backing_file(BlockDriverState *bs, 
> >QDict *options, Error **errp)
> >  }
> >  assert(bs->backing_hd == NULL);
> >-ret = bdrv_open(&backing_hd,
> >-*backing_filename ? backing_filename : NULL, NULL, 
> >options,
> >-bdrv_backing_flags(bs->open_flags), NULL, &local_err);
> >+ret = bdrv_open_inherit(&backing_hd,
> >+*backing_filename ? backing_filename : NULL,
> >+NULL, options, 0, bs, &child_backing,
> >+NULL, &local_err);
> >  if (ret < 0) {
> >  bdrv_unref(backing_hd);
> >  backing_hd = NULL;
> >@@ -1170,7 +1199,8 @@ free_exit:
> >   * To conform with the behavior of bdrv_open(), *pbs has to be NULL.
> >   */
> >  int bdrv_open_image(BlockDriverState **pbs, const char *filename,
> >-QDict *options, const char *bdref_key, int flags,
> >+QDict *options, const char *bdref_key,
> >+BlockDriverState* parent, const BdrvChildRole 
> >*child_role,
> 
> !
> 
> >  bool allow_none, Error **errp)
> >  {
> >  QDict *image_options;
> >@@ -1198,7 +1228,8 @@ int bdrv_open_image(BlockDriverState **pbs, const char 
> >*filename,
> >  goto done;
> >  }
> >-ret = bdrv_open(pbs, filename, reference, image_options, flags, NULL, 
> >errp);
> >+re

Re: [Qemu-devel] [PATCHv3 1/2] Move parallel_hds_isa_init to hw/isa/isa-bus.c

2015-05-12 Thread Paolo Bonzini


On 12/05/2015 08:22, mreza...@redhat.com wrote:
> From: Miroslav Rezanina 
> 
> Disabling CONFIG_PARALLEL cause removing parallel_hds_isa_init defined in
> parallel.c. This function is called during initialization of some boards so
> disabling CONFIG_PARALLEL cause build failure.
> 
> This patch moves parallel_hds_isa_init to hw/isa/isa-bus.c so it is included
> in case of disabled CONFIG_PARALLEL. Build is successful but qemu will abort
> with "Unknown device" error when function is called.
> 
> Signed-off-by: Miroslav Rezanina 
> ---
>  hw/char/parallel.c | 25 -
>  hw/isa/isa-bus.c   | 29 +
>  2 files changed, 29 insertions(+), 25 deletions(-)
> 
> diff --git a/hw/char/parallel.c b/hw/char/parallel.c
> index 4079554..c2b553f 100644
> --- a/hw/char/parallel.c
> +++ b/hw/char/parallel.c
> @@ -641,28 +641,3 @@ static void parallel_register_types(void)
>  }
>  
>  type_init(parallel_register_types)
> -
> -static void parallel_init(ISABus *bus, int index, CharDriverState *chr)
> -{
> -DeviceState *dev;
> -ISADevice *isadev;
> -
> -isadev = isa_create(bus, "isa-parallel");
> -dev = DEVICE(isadev);
> -qdev_prop_set_uint32(dev, "index", index);
> -qdev_prop_set_chr(dev, "chardev", chr);
> -qdev_init_nofail(dev);
> -}
> -
> -void parallel_hds_isa_init(ISABus *bus, int n)
> -{
> -int i;
> -
> -assert(n <= MAX_PARALLEL_PORTS);
> -
> -for (i = 0; i < n; i++) {
> -if (parallel_hds[i]) {
> -parallel_init(bus, i, parallel_hds[i]);
> -}
> -}
> -}
> diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
> index 825aa62..94f645c 100644
> --- a/hw/isa/isa-bus.c
> +++ b/hw/isa/isa-bus.c
> @@ -21,6 +21,7 @@
>  #include "hw/sysbus.h"
>  #include "sysemu/sysemu.h"
>  #include "hw/isa/isa.h"
> +#include "hw/i386/pc.h"
>  
>  static ISABus *isabus;
>  
> @@ -267,3 +268,31 @@ MemoryRegion *isa_address_space_io(ISADevice *dev)
>  }
>  
>  type_init(isabus_register_types)
> +
> +static void parallel_init(ISABus *bus, int index, CharDriverState *chr)
> +{
> +DeviceState *dev;
> +ISADevice *isadev;
> +
> +isadev = isa_try_create(bus, "isa-parallel");
> +if (!isadev) {
> +   return;
> +}
> +dev = DEVICE(isadev);
> +qdev_prop_set_uint32(dev, "index", index);
> +qdev_prop_set_chr(dev, "chardev", chr);
> +qdev_init_nofail(dev);
> +}
> +
> +void parallel_hds_isa_init(ISABus *bus, int n)
> +{
> +int i;
> +
> +assert(n <= MAX_PARALLEL_PORTS);
> +
> +for (i = 0; i < n; i++) {
> +if (parallel_hds[i]) {
> +parallel_init(bus, i, parallel_hds[i]);
> +}
> +}
> +}
> 

ACK

Paolo



Re: [Qemu-devel] [Qemu-block] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Denis V. Lunev

On 12/05/15 16:08, Kevin Wolf wrote:

On 12.05.2015 at 12:36, Denis V. Lunev wrote:

On 12/05/15 13:27, Kevin Wolf wrote:

On 12.05.2015 at 07:47, Denis V. Lunev wrote:

The following sequence
 int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
 for (i = 0; i < 10; i++)
 write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.

On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.

The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.

The justification of the performance improve is quite interesting.
 From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
  block.c   |  8 
  block/io.c|  2 +-
  block/raw-posix.c | 14 --
  3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e293907..325f727 100644
--- a/block.c
+++ b/block.c
@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
  size_t bdrv_opt_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }

  return bs->bl.opt_mem_alignment;
@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
  size_t bdrv_min_mem_align(BlockDriverState *bs)
  {
  if (!bs || !bs->drv) {
-/* 4k should be on the safe side */
-return 4096;
+/* page size or 4k (hdd sector size) should be on the safe side */
+return MAX(4096, getpagesize());
  }

  return bs->bl.min_mem_alignment;
diff --git a/block/io.c b/block/io.c
index 908a3d1..071652c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
  } else {
  bs->bl.min_mem_alignment = 512;
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = getpagesize();
  }

  if (bs->backing_hd) {

I think it would make more sense to keep this specific to the raw-posix
driver. After all, it's only the kernel page cache that we optimise
here. Other backends probably don't take advantage of page alignment.
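
For anyone who wants to reproduce the alignment effect outside QEMU, here is a minimal sketch of picking MAX(4096, page size) and allocating a buffer at that alignment. It assumes a POSIX host; the helper names demo_opt_mem_align() and demo_alloc_bounce() are made up for this illustration and are not part of the patch or of QEMU.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: preferred alignment for an O_DIRECT bounce buffer,
 * i.e. the larger of 4 KiB and the host page size. */
static size_t demo_opt_mem_align(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t align = 4096;

    if (page > 0 && (size_t)page > align) {
        align = (size_t)page;
    }
    return align;
}

/* Hypothetical helper: allocate len bytes at that alignment.
 * Returns NULL on failure; release the buffer with free(). */
static void *demo_alloc_bounce(size_t len)
{
    size_t align = demo_opt_mem_align();
    void *buf = NULL;

    /* posix_memalign() needs a power-of-two alignment. */
    assert((align & (align - 1)) == 0);
    if (posix_memalign(&buf, align, len) != 0) {
        return NULL;
    }
    return buf;
}
```

Passing such a buffer to write() on an O_DIRECT file descriptor should correspond to the aligned case from the commit message; whether the split requests disappear on a given kernel is something to confirm with blktrace.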


diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7083924..04f3d4e 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  {
  BDRVRawState *s = bs->opaque;
  char *buf;
+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());

  /* For /dev/sg devices the alignment is not really used.
 With buffered I/O, we don't have any restrictions. */
@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  /* If we could not get the sizes so far, we can only guess them */
  if (!s->buf_align) {
  size_t align;
-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
+buf = qemu_memalign(max_align, 2 * max_align);
+for (align = 512; align <= max_align; align <<= 1) {
+if (raw_is_io_aligned(fd, buf + align, max_align)) {
  s->buf_align = align;
  break;
  }
@@ -342,8 +343,8 @@ static void raw_probe_

Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check

2015-05-12 Thread Cornelia Huck
On Wed, 06 May 2015 14:07:37 +0200
Greg Kurz  wrote:

> Unlike with add and clear, there is no valid reason to abort when checking
> for a feature. It makes more sense to return false (i.e. the feature bit
> isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> 
> This allows to introduce code that is aware about new 64-bit features like
> VIRTIO_F_VERSION_1, even if they are still not implemented.
> 
> Signed-off-by: Greg Kurz 
> ---
>  include/hw/virtio/virtio.h |1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index d95f8b6..6ef70f1 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t 
> *features, unsigned int fbit)
> 
>  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
>  {
> -assert(fbit < 32);
>  return !!(features & (1 << fbit));
>  }
> 
> 
> 

I must say I'm not very comfortable with knowingly passing out-of-range
values to this function.

Can we perhaps apply at least the feature-bit-size extending patches
prior to your patchset, if the remainder of the virtio-1 patchset still
takes some time?
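
To illustrate the concern: in C, shifting a 32-bit value by 32 or more is undefined, so a bare "1 << fbit" is not guaranteed to yield zero for such bits. A sketch of a check that makes the out-of-range case explicit could look like the following. The name demo_virtio_has_feature() is hypothetical; this is not the QEMU helper, only an illustration under the assumption of a 32-bit feature word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a check against a 32-bit feature word.
 * Bits at or beyond 32 cannot be stored in the word, so they are
 * reported as "not set" without ever performing the shift. */
static bool demo_virtio_has_feature(uint32_t features, unsigned int fbit)
{
    if (fbit >= 32) {
        return false;
    }
    return (features & (UINT32_C(1) << fbit)) != 0;
}
```

With this shape, a caller asking about bit 32 (the value VIRTIO_F_VERSION_1 has in the virtio specification) gets a well-defined "false" while the feature word is still 32 bits wide.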




Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Cornelia Huck
On Wed, 06 May 2015 14:08:02 +0200
Greg Kurz  wrote:

> Legacy virtio is native endian: if the guest and host endianness differ,
> we have to tell vhost so it can swap bytes where appropriate. This is
> done through a vhost ring ioctl.
> 
> Signed-off-by: Greg Kurz 
> ---
>  hw/virtio/vhost.c |   50 +-
>  1 file changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 54851b7..1d7b939 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
(...)
> @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>  return -errno;
>  }
> 
> +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&

I think this should either go in after the virtio-1 base support (more
feature bits etc.) or get a big fat comment and be touched up later.
I'd prefer the first solution so it does not get forgotten, but I'm not
sure when Michael plans to proceed with the virtio-1 patches (I think
they're mostly fine already).

> +virtio_legacy_is_cross_endian(vdev)) {
> +r = vhost_virtqueue_set_vring_endian_legacy(dev,
> +
> virtio_is_big_endian(vdev),
> +vhost_vq_index);
> +if (r) {
> +return -errno;
> +}
> +}
> +
>  s = l = virtio_queue_get_desc_size(vdev, idx);
>  a = virtio_queue_get_desc_addr(vdev, idx);
>  vq->desc = cpu_physical_memory_map(a, &l, 0);
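
The cross-endian condition from the commit message can be sketched in isolation as follows. This is only an illustration: all demo_* names are hypothetical, the host byte order is probed at run time, and nothing here calls into vhost or issues the ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Detect the host byte order at run time. */
static bool demo_host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const uint8_t *)&probe == 0x01;
}

/* Legacy virtio uses the guest's native byte order, so a swap is
 * needed exactly when guest and host disagree. */
static bool demo_legacy_is_cross_endian(bool guest_big_endian)
{
    return guest_big_endian != demo_host_is_big_endian();
}

static uint16_t demo_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a 16-bit vring field from guest order to host order. */
static uint16_t demo_vring16_to_host(uint16_t v, bool guest_big_endian)
{
    return demo_legacy_is_cross_endian(guest_big_endian) ? demo_bswap16(v) : v;
}
```

In the patch the swap itself is left to vhost; the sketch only shows why the decision depends on both the guest and the host byte order.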




Re: [Qemu-devel] [PATCHv3 2/2] stubs: Provide parallel_mm_init stub version

2015-05-12 Thread Paolo Bonzini


On 12/05/2015 08:22, mreza...@redhat.com wrote:
> From: Miroslav Rezanina 
> 
> mips build fail with link error in case PARALLEL_CONFIG is disabled as
> hw/mips/mips_jazz.c calls parallel_mm_init. Due to dependecies to content
> of parallel.c we can't simply move it to hw/isa/isa-devices.c.
> 
> This patch adds stubs/parallel.c file that contains stub version of
> parallel_mm_init. This ensure successful build with PARALLEL_CONFIG disabled.
> 
> Signed-off-by: Miroslav Rezanina 
> ---
>  stubs/Makefile.objs | 1 +
>  stubs/parallel.c| 8 
>  2 files changed, 9 insertions(+)
>  create mode 100644 stubs/parallel.c
> 
> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> index 8beff4c..ad4e110 100644
> --- a/stubs/Makefile.objs
> +++ b/stubs/Makefile.objs
> @@ -24,6 +24,7 @@ stub-obj-y += mon-printf.o
>  stub-obj-y += mon-set-error.o
>  stub-obj-y += monitor-init.o
>  stub-obj-y += notify-event.o
> +stub-obj-y += parallel.o
>  stub-obj-$(CONFIG_SPICE) += qemu-chr-open-spice.o
>  stub-obj-y += qtest.o
>  stub-obj-y += reset.o
> diff --git a/stubs/parallel.c b/stubs/parallel.c
> new file mode 100644
> index 000..8293d52
> --- /dev/null
> +++ b/stubs/parallel.c
> @@ -0,0 +1,8 @@
> +#include "hw/i386/pc.h"
> +
> +bool parallel_mm_init(MemoryRegion *address_space,
> +  hwaddr base, int it_shift, qemu_irq irq,
> +  CharDriverState *chr)
> +{
> +return false;
> +}
> 

I think removing CONFIG_PARALLEL from a board that hardcodes its
presence makes little sense, so I would just drop this patch.

Paolo



Re: [Qemu-devel] [PULL 00/19] target-arm queue

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 12:03, Peter Maydell  wrote:
>
> v2 of the pull, fixing a silly compile failure on ARM hosts.
> 
> target-arm queue:
>  * Support TZ and grouping in the GIC
>  * hw/sd: sd_reset cleanup
>  * armv7m_nvic: fix bug in systick device
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] ui: use libexpoxy

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 14:04, Gerd Hoffmann  wrote:
> libepoxy does the opengl extension handling for us.
>
> It also is helpful for trouble-shooting as it prints nice error messages
> instead of silently failing or segfaulting in case we do something
> wrong, like using gl commands not supported by the current context.

How widely supported is this library? How long has it been around,
is it carried by all the distros, does it work ok on OSX and Windows?

I'm a bit uncertain about adding dependencies that would limit the
scope where we can provide important functionality like 3D
acceleration.

thanks
-- PMM



Re: [Qemu-devel] [Qemu-block] [PATCH 2/2] block: align bounce buffers to page

2015-05-12 Thread Kevin Wolf
On 12.05.2015 at 12:36, Denis V. Lunev wrote:
> On 12/05/15 13:27, Kevin Wolf wrote:
> >On 12.05.2015 at 07:47, Denis V. Lunev wrote:
> >>The following sequence
> >> int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
> >> for (i = 0; i < 10; i++)
> >> write(fd, buf, 4096);
> >>performs 5% better if buf is aligned to 4096 bytes.
> >>
> >>The difference is quite reliable.
> >>
> >>On the other hand we do not want at the moment to enforce bounce
> >>buffering if guest request is aligned to 512 bytes.
> >>
> >>The patch changes default bounce buffer optimal alignment to
> >>MAX(page size, 4k). 4k is chosen as maximal known sector size on real
> >>HDD.
> >>
> >>The justification of the performance improve is quite interesting.
> >> From the kernel point of view each request to the disk was split
> >>by two. This could be seen by blktrace like this:
> >>   9,0   11  1 0.0 11151  Q  WS 312737792 + 1023 [qemu-img]
> >>   9,0   11  2 0.07938 11151  Q  WS 312738815 + 8 [qemu-img]
> >>   9,0   11  3 0.30735 11151  Q  WS 312738823 + 1016 [qemu-img]
> >>   9,0   11  4 0.32482 11151  Q  WS 312739839 + 8 [qemu-img]
> >>   9,0   11  5 0.41379 11151  Q  WS 312739847 + 1016 [qemu-img]
> >>   9,0   11  6 0.42818 11151  Q  WS 312740863 + 8 [qemu-img]
> >>   9,0   11  7 0.51236 11151  Q  WS 312740871 + 1017 [qemu-img]
> >>   9,05  1 0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
> >>After the patch the pattern becomes normal:
> >>   9,06  1 0.0 12422  Q  WS 314834944 + 1024 [qemu-img]
> >>   9,06  2 0.38527 12422  Q  WS 314835968 + 1024 [qemu-img]
> >>   9,06  3 0.72849 12422  Q  WS 314836992 + 1024 [qemu-img]
> >>   9,06  4 0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
> >>and the amount of requests sent to disk (could be calculated counting
> >>number of lines in the output of blktrace) is reduced about 2 times.
> >>
> >>Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
> >>does his job well and real requests comes properly aligned (to page).
> >>
> >>Signed-off-by: Denis V. Lunev 
> >>CC: Paolo Bonzini 
> >>CC: Kevin Wolf 
> >>CC: Stefan Hajnoczi 
> >>---
> >>  block.c   |  8 
> >>  block/io.c|  2 +-
> >>  block/raw-posix.c | 14 --
> >>  3 files changed, 13 insertions(+), 11 deletions(-)
> >>
> >>diff --git a/block.c b/block.c
> >>index e293907..325f727 100644
> >>--- a/block.c
> >>+++ b/block.c
> >>@@ -106,8 +106,8 @@ int is_windows_drive(const char *filename)
> >>  size_t bdrv_opt_mem_align(BlockDriverState *bs)
> >>  {
> >>  if (!bs || !bs->drv) {
> >>-/* 4k should be on the safe side */
> >>-return 4096;
> >>+/* page size or 4k (hdd sector size) should be on the safe side */
> >>+return MAX(4096, getpagesize());
> >>  }
> >>
> >>  return bs->bl.opt_mem_alignment;
> >>@@ -116,8 +116,8 @@ size_t bdrv_opt_mem_align(BlockDriverState *bs)
> >>  size_t bdrv_min_mem_align(BlockDriverState *bs)
> >>  {
> >>  if (!bs || !bs->drv) {
> >>-/* 4k should be on the safe side */
> >>-return 4096;
> >>+/* page size or 4k (hdd sector size) should be on the safe side */
> >>+return MAX(4096, getpagesize());
> >>  }
> >>
> >>  return bs->bl.min_mem_alignment;
> >>diff --git a/block/io.c b/block/io.c
> >>index 908a3d1..071652c 100644
> >>--- a/block/io.c
> >>+++ b/block/io.c
> >>@@ -205,7 +205,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> >>**errp)
> >>  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
> >>  } else {
> >>  bs->bl.min_mem_alignment = 512;
> >>-bs->bl.opt_mem_alignment = 512;
> >>+bs->bl.opt_mem_alignment = getpagesize();
> >>  }
> >>
> >>  if (bs->backing_hd) {
> >
> >I think it would make more sense to keep this specific to the raw-posix
> >driver. After all, it's only the kernel page cache that we optimise
> >here. Other backends probably don't take advantage of page alignment.
> >
> >>diff --git a/block/raw-posix.c b/block/raw-posix.c
> >>index 7083924..04f3d4e 100644
> >>--- a/block/raw-posix.c
> >>+++ b/block/raw-posix.c
> >>@@ -301,6 +301,7 @@ static void raw_probe_alignment(BlockDriverState *bs, 
> >>int fd, Error **errp)
> >>  {
> >>  BDRVRawState *s = bs->opaque;
> >>  char *buf;
> >>+size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
> >>
> >>  /* For /dev/sg devices the alignment is not really used.
> >> With buffered I/O, we don't have any restrictions. */
> >>@@ -330,9 +331,9 @@ static void raw_probe_alignment(BlockDriverState *bs, 
> >>int fd, Error **errp)
> >>  /* If we could not get the sizes so far, we can only guess them */
> >>  if (!s->buf_align) {
> >>  size_t align;
> >>-buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
> >>-for (align = 512; align <= M

Re: [Qemu-devel] [v8 13/14] migration: Add qmp commands to set and query parameters

2015-05-12 Thread Li, Liang Z
> 
> * Liang Li (liang.z...@intel.com) wrote:
> > Add the qmp commands to tune and query the parameters used in live
> > migration.
> 
> Hi,
>   Do you know if there's anyone working on libvirt code to drive this 
> interface
> and turn on your compression code?
> 

Yes, I have confirmed that one person at Intel is working on this.

Liang

> Dave
> 
> >
> > Signed-off-by: Liang Li 
> > Signed-off-by: Yang Zhang 
> > ---
> >  migration/migration.c | 56
> ++
> >  qapi-schema.json  | 45
> 
> >  qmp-commands.hx   | 57
> +++
> >  3 files changed, 158 insertions(+)
> >
> > diff --git a/migration/migration.c b/migration/migration.c index
> > 533717c..8732803 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -188,6 +188,21 @@ MigrationCapabilityStatusList
> *qmp_query_migrate_capabilities(Error **errp)
> >  return head;
> >  }
> >
> > +MigrationParameters *qmp_query_migrate_parameters(Error **errp) {
> > +MigrationParameters *params;
> > +MigrationState *s = migrate_get_current();
> > +
> > +params = g_malloc0(sizeof(*params));
> > +params->compress_level = s-
> >parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL];
> > +params->compress_threads =
> > +s->parameters[MIGRATION_PARAMETER_COMPRESS_THREADS];
> > +params->decompress_threads =
> > +s->parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS];
> > +
> > +return params;
> > +}
> > +
> >  static void get_xbzrle_cache_stats(MigrationInfo *info)  {
> >  if (migrate_use_xbzrle()) {
> > @@ -301,6 +316,47 @@ void
> qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
> >  }
> >  }
> >
> > +void qmp_migrate_set_parameters(bool has_compress_level,
> > +int64_t compress_level,
> > +bool has_compress_threads,
> > +int64_t compress_threads,
> > +bool has_decompress_threads,
> > +int64_t decompress_threads, Error
> > +**errp) {
> > +MigrationState *s = migrate_get_current();
> > +
> > +if (has_compress_level && (compress_level < 0 || compress_level > 9))
> {
> > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> "compress_level",
> > +  "a value in range [0, 9]");
> > +return;
> > +}
> > +if (has_compress_threads &&
> > +(compress_threads < 1 || compress_threads > 255)) {
> > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > +  "compress_threads",
> > +  "a value in range [1, 255]");
> > +return;
> > +}
> > +if (has_decompress_threads &&
> > +(decompress_threads < 1 || decompress_threads > 255)) {
> > +error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > +  "decompress_threads",
> > +  "a value in range [1, 255]");
> > +return;
> > +}
> > +
> > +if (has_compress_level) {
> > +s->parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL] =
> compress_level;
> > +}
> > +if (has_compress_threads) {
> > +s->parameters[MIGRATION_PARAMETER_COMPRESS_THREADS] =
> compress_threads;
> > +}
> > +if (has_decompress_threads) {
> > +s->parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS] =
> > +decompress_threads;
> > +}
> > +}
> > +
> >  /* shared migration helpers */
> >
> >  static void migrate_set_state(MigrationState *s, int old_state, int
> > new_state) diff --git a/qapi-schema.json b/qapi-schema.json index
> > 121fcc7..579801b 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -592,6 +592,51 @@
> >  { 'enum': 'MigrationParameter',
> >'data': ['compress-level', 'compress-threads',
> > 'decompress-threads'] }
> >
> > +#
> > +# @migrate-set-parameters
> > +#
> > +# Set the following migration parameters # # @compress-level:
> > +compression level # # @compress-threads: compression thread count # #
> > +@decompress-threads: decompression thread count # # Since: 2.3 ## {
> > +'command': 'migrate-set-parameters',
> > +  'data': { '*compress-level': 'int',
> > +'*compress-threads': 'int',
> > +'*decompress-threads': 'int'} }
> > +
> > +#
> > +# @MigrationParameters
> > +#
> > +# @compress-level: compression level
> > +#
> > +# @compress-threads: compression thread count # #
> > +@decompress-threads: decompression thread count # # Since: 2.3 ## {
> > +'type': 'MigrationParameters',
> > +  'data': { 'compress-level': 'int',
> > +'compress-threads': 'int',
> > +'decompress-threads': 'int'} } ## #
> > +@query-migrate-parameters # # Returns information about the current
> > +migration parameters # # Returns: @MigrationParameters # # Since: 2.3
> > +## { 'command': 'query-migra

[Qemu-devel] [PATCH] ui: use libexpoxy

2015-05-12 Thread Gerd Hoffmann
libepoxy does the opengl extension handling for us.

It also is helpful for trouble-shooting as it prints nice error messages
instead of silently failing or segfaulting in case we do something
wrong, like using gl commands not supported by the current context.

Signed-off-by: Gerd Hoffmann 
---
 configure| 2 +-
 include/ui/console.h | 3 +--
 include/ui/shader.h  | 5 +
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/configure b/configure
index 1f0f485..df1048a 100755
--- a/configure
+++ b/configure
@@ -3153,7 +3153,7 @@ else
 fi
 
 if test "$opengl" != "no" ; then
-  opengl_pkgs="gl glesv2"
+  opengl_pkgs="gl glesv2 epoxy"
   if $pkg_config $opengl_pkgs x11 && test "$have_glx" = "yes"; then
 opengl_cflags="$($pkg_config --cflags $opengl_pkgs) $x11_cflags"
 opengl_libs="$($pkg_config --libs $opengl_pkgs) $x11_libs"
diff --git a/include/ui/console.h b/include/ui/console.h
index e8b3a9e..383dec2 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -10,8 +10,7 @@
 #include "qapi/error.h"
 
 #ifdef CONFIG_OPENGL
-# include 
-# include 
+# include 
 #endif
 
 /* keyboard/mouse support */
diff --git a/include/ui/shader.h b/include/ui/shader.h
index 1ff926c..992cde6 100644
--- a/include/ui/shader.h
+++ b/include/ui/shader.h
@@ -1,7 +1,4 @@
-#ifdef CONFIG_OPENGL
-# include 
-# include 
-#endif
+#include 
 
 void qemu_gl_run_texture_blit(GLint texture_blit_prog);
 
-- 
1.8.3.1




Re: [Qemu-devel] [ARM]: Adding support for Cortex-M4

2015-05-12 Thread Peter Maydell
On 12 May 2015 at 13:46, aurelio remonda  wrote:
> Im using lm3s6965evb stellaris board, trying to make it "work as an M4", i
> would like to separate them, adding an dsp feature (i.e ARM_FEATURE_DSP)
> could work? the problem is if this feature is added it has to be set on all
> the cpus that use dsp instructions.

You need to create a new Cortex-M4 CPU model, which can then set
the correct feature switches for the instructions and functionality
that that CPU has. If there's something that only exists on a
subset of CPUs but which we're currently providing everywhere
then we need to add a new feature bit and set it on the CPUs
which have it but not the ones which don't.
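
As a rough, self-contained sketch of that approach (not QEMU code; every DEMO_/demo_ name below is invented for illustration), each CPU model sets its own feature bits at init time and the instruction decoder tests the bit instead of the model name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
enum demo_feature {
    DEMO_FEATURE_M,    /* M-profile core */
    DEMO_FEATURE_DSP,  /* DSP instructions */
};

typedef struct DemoCPU {
    uint64_t features;
} DemoCPU;

static void demo_cpu_set_feature(DemoCPU *cpu, enum demo_feature f)
{
    cpu->features |= UINT64_C(1) << f;
}

static bool demo_cpu_has_feature(const DemoCPU *cpu, enum demo_feature f)
{
    return (cpu->features & (UINT64_C(1) << f)) != 0;
}

/* Each model's init function lists exactly what that CPU has. */
static void demo_cortex_m3_init(DemoCPU *cpu)
{
    cpu->features = 0;
    demo_cpu_set_feature(cpu, DEMO_FEATURE_M);
}

static void demo_cortex_m4_init(DemoCPU *cpu)
{
    cpu->features = 0;
    demo_cpu_set_feature(cpu, DEMO_FEATURE_M);
    demo_cpu_set_feature(cpu, DEMO_FEATURE_DSP);
}

/* A decoder would gate DSP instructions on the bit, not on the model. */
static bool demo_can_decode_dsp_insn(const DemoCPU *cpu)
{
    return demo_cpu_has_feature(cpu, DEMO_FEATURE_DSP);
}
```

The design point is that adding another core with DSP instructions later only needs one more init function; the decoder check stays unchanged.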

-- PMM



Re: [Qemu-devel] [RFC v4] monitor: add memory search commands s, sp

2015-05-12 Thread Luiz Capitulino
On Fri, 24 Apr 2015 14:39:48 +0200
hw.clau...@gmail.com wrote:

> From: Claudio Fontana 
> 
> usage is similar to the commands x, xp.
> 
> Example with string: looking for "ELF" header in memory:
> 
> (qemu) s/100cb 0x40001000 "ELF"
> searching memory area [40001000-400f5240]
> 40090001
> (qemu) x/20b 0x4009
> 4009: '\x7f' 'E' 'L' 'F' '\x02' '\x01' '\x01' '\x03'
> 40090008: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
> 40090010: '\x02' '\x00' '\xb7' '\x00'
> 
> Example with value: looking for 64bit variable value 0x990088
> 
> (qemu) s/100xg 0x90004200 0x990088
> searching memory area [90004200-9000427a1200]
> 9000424b3000
> 9000424c1000
> 
> Signed-off-by: Claudio Fontana 

I had to drop this patch because it doesn't build for w32. You can
find instructions on how to build for w32 at:

 http://wiki.qemu.org/Hosts/W32
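
For reference, the core of such a search is a byte-sequence match over a buffer. Below is a sketch that sticks to standard C (memcmp only), which should build on any host; demo_find_bytes() is a hypothetical name and this is not the implementation from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first occurrence of needle in hay,
 * or (size_t)-1 if it does not occur. An empty needle is treated
 * as "not found". */
static size_t demo_find_bytes(const uint8_t *hay, size_t hay_len,
                              const uint8_t *needle, size_t needle_len)
{
    size_t pos;

    if (needle_len == 0 || needle_len > hay_len) {
        return (size_t)-1;
    }
    for (pos = 0; pos + needle_len <= hay_len; pos++) {
        if (memcmp(hay + pos, needle, needle_len) == 0) {
            return pos;
        }
    }
    return (size_t)-1;
}
```

Applied to the first eight bytes of the ELF header from the example quoted below, a search for "ELF" matches at offset 1, in the same way that the monitor reports an address one byte past the start of the header.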

> ---
>  hmp-commands.hx |  28 
>  monitor.c   | 140 
> 
>  2 files changed, 168 insertions(+)
> 
> changes from v3:
> initialize pointer variable to NULL to finally get rid of spurious warning
> 
> changes from v2:
> move code to try to address spurious warning
> 
> changes from v1:
> make checkpatch happy by adding braces here and there.
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index d5022d8..2bf5737 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -432,6 +432,34 @@ Start gdbserver session (default @var{port}=1234)
>  ETEXI
>  
>  {
> +.name   = "s",
> +.args_type  = "fmt:/,addr:l,data:s",
> +.params = "/fmt addr data",
> +.help   = "search virtual memory starting at 'addr' for 'data'",
> +.mhandler.cmd = hmp_memory_search,
> +},
> +
> +STEXI
> +@item s/fmt @var{addr} @var{data}
> +@findex s
> +Virtual memory search starting at @var{addr} for data described by 
> @var{data}.
> +ETEXI
> +
> +{
> +.name   = "sp",
> +.args_type  = "fmt:/,addr:l,data:s",
> +.params = "/fmt addr data",
> +.help   = "search physical memory starting at 'addr' for 'data'",
> +.mhandler.cmd = hmp_physical_memory_search,
> +},
> +
> +STEXI
> +@item sp/fmt @var{addr} @var{data}
> +@findex sp
> +Physical memory search starting at @var{addr} for data described by 
> @var{data}.
> +ETEXI
> +
> +{
>  .name   = "x",
>  .args_type  = "fmt:/,addr:l",
>  .params = "/fmt addr",
> diff --git a/monitor.c b/monitor.c
> index c86a89e..b648dd2 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -1208,6 +1208,124 @@ static void monitor_printc(Monitor *mon, int c)
>  monitor_printf(mon, "'");
>  }
>  
> +static void monitor_print_addr(Monitor *mon, hwaddr addr, bool is_physical)
> +{
> +if (is_physical) {
> +monitor_printf(mon, TARGET_FMT_plx "\n", addr);
> +} else {
> +monitor_printf(mon, TARGET_FMT_lx "\n", (target_ulong)addr);
> +}
> +}
> +
> +/* simple memory search for a byte sequence. The sequence is generated from
> + * a numeric value to look for in guest memory, or from a string.
> + */
> +static void memory_search(Monitor *mon, int count, int format, int wsize,
> +  hwaddr addr, const char *data_str, bool 
> is_physical)
> +{
> +int pos, len;  /* pos in the search area, len of area */
> +char *hay; /* buffer for haystack */
> +int hay_size;  /* haystack size. Needle size is wsize. */
> +const char *needle = NULL; /* needle to search in the haystack */
> +const char *format_str;/* numeric input format string */
> +char value_raw[8]; /* numeric input converted to raw data */
> +#define MONITOR_S_CHUNK_SIZE 16000
> +
> +len = wsize * count;
> +if (len < 1) {
> +monitor_printf(mon, "invalid search area length.\n");
> +return;
> +}
> +switch (format) {
> +case 'i':
> +monitor_printf(mon, "format '%c' not supported.\n", format);
> +return;
> +case 'c':
> +needle = data_str;
> +wsize = strlen(data_str);
> +if (wsize > MONITOR_S_CHUNK_SIZE) {
> +monitor_printf(mon, "search string too long [max %d].\n",
> +   MONITOR_S_CHUNK_SIZE);
> +return;
> +}
> +break;
> +case 'o':
> +format_str = "%" SCNo64;
> +break;
> +default:
> +case 'x':
> +format_str = "%" SCNx64;
> +break;
> +case 'u':
> +format_str = "%" SCNu64;
> +break;
> +case 'd':
> +format_str = "%" SCNd64;
> +break;
> +}
> +if (format != 'c') {
> +uint64_t value;  /* numeric input value */
> +void *from = &value;
> +if (sscanf(data_str, format_str, &value) != 1) {
> +monitor_printf(mon, "could not parse search string "
> +   

Re: [Qemu-devel] [ARM]: Adding support for Cortex-M4

2015-05-12 Thread aurelio remonda
I'm using the lm3s6965evb Stellaris board and trying to make it "work as an M4".
I would like to separate the two. Could adding a DSP feature (i.e.
ARM_FEATURE_DSP) work? The problem is that if this feature is added, it has to
be set on all the CPUs that use DSP instructions.


Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-12 Thread Pavel Fedin
 Hello!

> BTW did you try going beyond 16 cores I had problems with 32 and 64 cores.

 Just tried it. Works fine, except qemu takes an incredibly long time to start
up with so many cores. 64 cores took something like 2 minutes. Indeed, it looks
like a freeze, but if you're patient enough, you'll see it running. I believe
it's a flaw of the interpretation mode.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia





[Qemu-devel] [PATCH] linux-user: fix support for timerfd_create on arm

2015-05-12 Thread Andreas Schwab
On arm the original timerfd syscall was reused for the new timerfd_create
syscall.

Signed-off-by: Andreas Schwab 
---
 linux-user/arm/syscall_nr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/arm/syscall_nr.h b/linux-user/arm/syscall_nr.h
index 7d7be7c..53552be 100644
--- a/linux-user/arm/syscall_nr.h
+++ b/linux-user/arm/syscall_nr.h
@@ -354,7 +354,7 @@
 #define TARGET_NR_kexec_load   (347)
 #define TARGET_NR_utimensat(348)
 #define TARGET_NR_signalfd (349)
-#define TARGET_NR_timerfd  (350)
+#define TARGET_NR_timerfd_create   (350)
 #define TARGET_NR_eventfd  (351)
 #define TARGET_NR_fallocate(352)
 #define TARGET_NR_timerfd_settime  (353)
-- 
2.4.0

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


