date:20210120

Re: [PATCH v3 10/17] qapi/gen: Combine ._add_[user|system]_module

2021-01-20 Thread Markus Armbruster

John Snow  writes:

> On 1/20/21 9:20 AM, Markus Armbruster wrote:
>> John Snow  writes:
[...]
>>> diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
>>> index 55acd7e080d..b5505685e6e 100644
>>> --- a/scripts/qapi/gen.py
>>> +++ b/scripts/qapi/gen.py
[...]
>>> @@ -313,7 +306,8 @@ def visit_module(self, module: QAPISchemaModule) -> 
>>> None:
>>>   self._genc = None
>>>   self._genh = None
>>>   else:
>>> -self._add_user_module(module.name, self._user_blurb)
>>> +assert module.user_module, "Unexpected system module"
>> The string provides no value.
>> 
>
> That's just, like, your opinion, man. It has value to me.
>
>
> Compare:
>
> ```
> #!/usr/bin/env python3
>
> def main():
> assert False
>
> if __name__ == '__main__':
> main()
> ```
>
> ```
> # ./foo.py
>
> Traceback (most recent call last):
>   File "./foo.py", line 7, in 
> main()
>   File "./foo.py", line 4, in main
> assert False
> AssertionError
> ```
>
> With:
>
>
> ```
> #!/usr/bin/env python3
>
> def main():
> assert False, "Unexpected system module"
>
> if __name__ == '__main__':
> main()
> ```
>
> ```
> # ./foo.py
>
> Traceback (most recent call last):
>   File "./foo.py", line 7, in 
> main()
>   File "./foo.py", line 4, in main
> assert False, "Unexpected system module"
> AssertionError: Unexpected system module
> ```
>
> I like the extra string for giving some semantic context as to
> specifically what broke (We don't expect to see system modules here) 
> instead of just a stack trace.

Your test uses assert with an argument that tells us nothing.  But the
assert we're arguing about has a nice, expressive argument.

The string improves the report from

  File "/work/armbru/qemu/scripts/qapi/gen.py", line 325, in visit_module
assert module.user_module
AssertionError

to

  File "/work/armbru/qemu/scripts/qapi/gen.py", line 325, in visit_module
assert module.user_module, "Unexpected system module"
AssertionError: Unexpected system module

Even if you value the difference, I very much doubt it justifies the
clutter.  Also, slippery slope towards pigs wearing way too much
lipstick.

Tested with

diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
index f3f4d2b011..bbfb30dc5a 100644
--- a/scripts/qapi/gen.py
+++ b/scripts/qapi/gen.py
@@ -321,6 +321,7 @@ class QAPISchemaModularCVisitor(QAPISchemaVisitor):
 # be generated.
 self._current_module = None
 else:
+module.name = "./HACK"
 assert module.user_module, "Unexpected system module"
 self._add_module(module.name, self._user_blurb)
 self._begin_user_module(module.name)


>
>>> +self._add_module(module.name, self._user_blurb)
>>>   self._begin_user_module(module.name)
>>> def visit_include(self, name: str, info: QAPISourceInfo) ->
>>> None:

Re: [PATCH] tcg: Increase the static number of temporaries

2021-01-20 Thread Philippe Mathieu-Daudé

On 1/21/21 3:54 AM, Richard Henderson wrote:
> This isn't a total or permanent solution to the problem of running
> out of temporaries, but it puts off the issue for a bit.
> 
> Make the assert in tcg_temp_alloc unconditional.  If we do run out
> of temps, this can fail much later as a weird SIGSEGV, due to the
> buffer overrun of the temp array.
> 
> Remove the inlines from tcg_temp_alloc and tcg_global_alloc.
> 
> Buglink: https://bugs.launchpad.net/bugs/1912065
> Signed-off-by: Richard Henderson 
> ---
> 
> There are more bugs that need fixing in order to actually make
> the dynamic allocation scheme work.  Rather than keep this bug
> pending longer, hack around it and make the SEGV an ABRT.
> 
> r~
> 
> ---
>  include/tcg/tcg.h | 2 +-
>  tcg/tcg.c | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v6 02/11] hvf: x86: Remove unused definitions

2021-01-20 Thread Philippe Mathieu-Daudé

On 1/20/21 11:44 PM, Alexander Graf wrote:
> The hvf i386 has a few struct and cpp definitions that are never
> used. Remove them.
> 
> Suggested-by: Roman Bolshakov 
> Signed-off-by: Alexander Graf 
> Reviewed-by: Roman Bolshakov 
> Tested-by: Roman Bolshakov 
> ---
>  target/i386/hvf/hvf-i386.h | 16 
>  1 file changed, 16 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v6 03/11] hvf: Move common code out

2021-01-20 Thread Philippe Mathieu-Daudé

Hi Alexander,

On 1/20/21 11:44 PM, Alexander Graf wrote:
> Until now, Hypervisor.framework has only been available on x86_64 systems.
> With Apple Silicon shipping now, it extends its reach to aarch64. To
> prepare for support for multiple architectures, let's move common code out
> into its own accel directory.
> 
> Signed-off-by: Alexander Graf 
> Reviewed-by: Roman Bolshakov 
> Tested-by: Roman Bolshakov 
> 
> ---
> 
> v3 -> v4:
> 
>   - Use hv.h instead of Hypervisor.h for 10.15 compat
>   - Remove manual inclusion of Hypervisor.h in common .c files
> ---
>  MAINTAINERS |   8 +
>  accel/hvf/hvf-all.c |  54 +
>  accel/hvf/hvf-cpus.c| 462 
>  accel/hvf/meson.build   |   7 +
>  accel/meson.build   |   1 +
>  include/sysemu/hvf_int.h|  54 +
>  target/i386/hvf/hvf-cpus.c  | 131 --
>  target/i386/hvf/hvf-cpus.h  |  25 --
>  target/i386/hvf/hvf-i386.h  |  33 +--
>  target/i386/hvf/hvf.c   | 360 +---
>  target/i386/hvf/meson.build |   1 -
>  target/i386/hvf/x86hvf.c|  11 +-
>  target/i386/hvf/x86hvf.h|   2 -
>  13 files changed, 596 insertions(+), 553 deletions(-)
>  create mode 100644 accel/hvf/hvf-all.c
>  create mode 100644 accel/hvf/hvf-cpus.c
>  create mode 100644 accel/hvf/meson.build
>  create mode 100644 include/sysemu/hvf_int.h
>  delete mode 100644 target/i386/hvf/hvf-cpus.c
>  delete mode 100644 target/i386/hvf/hvf-cpus.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3216387521..e589ec02e0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -448,7 +448,15 @@ M: Roman Bolshakov 
>  W: https://wiki.qemu.org/Features/HVF
>  S: Maintained
>  F: target/i386/hvf/
> +
> +HVF
> +M: Cameron Esfahani 
> +M: Roman Bolshakov 
> +W: https://wiki.qemu.org/Features/HVF
> +S: Maintained
> +F: accel/hvf/
>  F: include/sysemu/hvf.h
> +F: include/sysemu/hvf_int.h
>  
>  WHPX CPUs
>  M: Sunil Muthuswamy 
> diff --git a/accel/hvf/hvf-all.c b/accel/hvf/hvf-all.c
> new file mode 100644
> index 00..5b415eb0ed
> --- /dev/null
> +++ b/accel/hvf/hvf-all.c
> @@ -0,0 +1,54 @@
> +/*
> + * QEMU Hypervisor.framework support
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.

Maybe start with GPLv2+ directly?

> diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
> new file mode 100644
> index 00..69de46db7d
> --- /dev/null
> +++ b/include/sysemu/hvf_int.h
> @@ -0,0 +1,54 @@
> +/*
> + * QEMU Hypervisor.framework (HVF) support
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +/* header to be included in HVF-specific code */

Can we have this header local to accel/hvf/ ?

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé 

> +
> +#ifndef HVF_INT_H
> +#define HVF_INT_H
> +
> +#include 
> +
> +/* hvf_slot flags */
> +#define HVF_SLOT_LOG (1 << 0)
> +
> +typedef struct hvf_slot {
> +uint64_t start;
> +uint64_t size;
> +uint8_t *mem;
> +int slot_id;
> +uint32_t flags;
> +MemoryRegion *region;
> +} hvf_slot;
> +
> +typedef struct hvf_vcpu_caps {
> +uint64_t vmx_cap_pinbased;
> +uint64_t vmx_cap_procbased;
> +uint64_t vmx_cap_procbased2;
> +uint64_t vmx_cap_entry;
> +uint64_t vmx_cap_exit;
> +uint64_t vmx_cap_preemption_timer;
> +} hvf_vcpu_caps;
> +
> +struct HVFState {
> +AccelState parent;
> +hvf_slot slots[32];
> +int num_slots;
> +
> +hvf_vcpu_caps *hvf_caps;
> +};
> +extern HVFState *hvf_state;
> +
> +void assert_hvf_ok(hv_return_t ret);
> +int hvf_get_registers(CPUState *cpu);
> +int hvf_put_registers(CPUState *cpu);
> +int hvf_arch_init_vcpu(CPUState *cpu);
> +void hvf_arch_vcpu_destroy(CPUState *cpu);
> +int hvf_vcpu_exec(CPUState *cpu);
> +hvf_slot *hvf_find_overlap_slot(uint64_t, uint64_t);
> +
> +#endif

Re: [PATCH v3 05/17] qapi: pass QAPISchemaModule to visit_module instead of str

2021-01-20 Thread Markus Armbruster

John Snow  writes:

> On 1/20/21 11:02 AM, Eric Blake wrote:
>> On 1/20/21 6:07 AM, Markus Armbruster wrote:
>>> John Snow  writes:
>>>
 Modify visit_module to pass the module itself instead of just its
 name. This allows for future patches to centralize some
 module-interrogation behavior within the QAPISchemaModule class itself,
 cutting down on duplication between gen.py and schema.py.
>>>
>>> We've been tempted to make similar changes before (don't worry, I'm not
>>> building a case for "no" here).
>>>
>>> When I wrote the initial version of QAPISchemaVisitor (commit 3f7dc21be,
>>> 2015), I aimed for a loose coupling of backends and the internal
>>> representation.  Instead of
>>>
>>>  def visit_foo(self, foo):
>>>  pass
>>>
>>> where @foo is a QAPISchemaFooBar, I wrote
>>>
>>>  def visit_foo_bar(self, name, info, [curated attributes of @foo]):
>>>  pass
>>>
>>> In theory, this is nice: the information exposed to the backends is
>>> obvious, and the backends can't accidentally mutate @foo.
>>>
>>> In practice, it kind of failed right then and there:
>>>
>>>  def visit_object_type(self, name, info, base, members, variants):
>>>  pass
>>>
>>> We avoid passing the QAPISchemaObjectType (loose coupling, cool!), only
>>> to pass member information as List[QAPISchemaObjectTypeMember].
>>>
>>> Morever, passing "curated atttibutes" has led to visit_commands() taking
>>> a dozen arguments.  Meh.
>>>
>>> This had made Eric and me wonder whether we should write off the
>>> decoupling idea as misguided, and just pass the object instead of
>>> "curated attributes", always.  Thoughts?
>> I'm open to the idea of passing just the larger object instead of
>> the
>> curated list of relevant attributes.  It's a bit more coupling, but I
>> don't see any of our qapi code being reused outside its current scope
>> where the extra coupling will bite us.  But I'm not volunteering for the
>> refactoring work, because I'm not an expert on python typing hints.  If
>> consolidating parameters into the larger object makes for fewer
>> parameters and easier typing hints, I'm assuming the work can be done as
>> part of static typing; but if leaving things as they currently are is
>> manageable, that's also fine by me.
>> 
>
> Yeah, it can definitely be left as-is for now. I've already gone
> through all the effort of typing out all of the interfaces, so it's
> not really a huge ordeal to just leave it as-is.
>
> Passing the objects might be nicer for the future, though, as routing
> new information or features will involve less churn. (And the
> signatures will be a lot smaller.)
>
> I suppose it does open us up to callers mutating the schema in the
> visitors, but they could already do that for the reasons that Markus 
> points out. It's possible that the visitor dispatch could be modified
> to make deep copies of schema objects, but that adds overhead.
>
> I can just revert this change for now and leave the topic for another day.

Works for me.  Thanks!

Re: [PATCH v3 05/17] qapi: pass QAPISchemaModule to visit_module instead of str

2021-01-20 Thread Markus Armbruster

John Snow  writes:

> On 1/20/21 7:07 AM, Markus Armbruster wrote:
>> John Snow  writes:
[...]
>>> diff --git a/docs/sphinx/qapidoc.py b/docs/sphinx/qapidoc.py
>>> index e03abcbb959..f754f675d66 100644
>>> --- a/docs/sphinx/qapidoc.py
>>> +++ b/docs/sphinx/qapidoc.py
>>> @@ -463,11 +463,11 @@ def __init__(self, env, qapidir):
>>>   self._env = env
>>>   self._qapidir = qapidir
>>>   -def visit_module(self, name):
>>> -if name is not None:
>>> -qapifile = self._qapidir + '/' + name
>>> +def visit_module(self, module):
>>> +if module.name:
>> Replacing the "is not None" test by (implicit) "is thruthy" changes
>> behavior for the empty string.  Intentional?
>> 
>
> Instinctively it was intentional, consciously it wasn't. I was worried
> about what "qapifile" would produce if the string happened to be
> empty.

It would produce a dependency on the directory.

>> I've had the "pleasure" of debugging empty strings getting interpreted
>> like None where they should be interpreted like any other string.
>> 
>
> assert module.name, then?

Let's avoid changing behavior in a refactoring patch.  If you want an
assertion to ease your worry, separate patch.

>>> +qapifile = self._qapidir + '/' + module.name
>>>   self._env.note_dependency(os.path.abspath(qapifile))
>>> -super().visit_module(name)
>>> +super().visit_module(module)
>>> 
>> [...]
>>

Re: [PATCH v2 2/2] net/eth: Fix stack-buffer-overflow in _eth_get_rss_ex_dst_addr()

2021-01-20 Thread Thomas Huth


On 15/01/2021 16.11, Philippe Mathieu-Daudé wrote:

QEMU fuzzer reported a buffer overflow in _eth_get_rss_ex_dst_addr()
reproducible as:

   $ cat << EOF | ./qemu-system-i386 -M pc-q35-5.0 \
 -accel qtest -monitor none \
 -serial none -nographic -qtest stdio
   outl 0xcf8 0x80001010
   outl 0xcfc 0xe102
   outl 0xcf8 0x80001004
   outw 0xcfc 0x7
   write 0x25 0x1 0x86
   write 0x26 0x1 0xdd
   write 0x4f 0x1 0x2b
   write 0xe1020030 0x4 0x190002e1
   write 0xe102003a 0x2 0x0807
   write 0xe1020048 0x4 0x12077cdd
   write 0xe1020400 0x4 0xba077cdd
   write 0xe1020420 0x4 0x190002e1
   write 0xe1020428 0x4 0x3509d807
   write 0xe1020438 0x1 0xe2
   EOF
   =
   ==2859770==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7ffdef904902 at pc 0x561ceefa78de bp 0x7ffdef904820 sp 0x7ffdef904818
   READ of size 1 at 0x7ffdef904902 thread T0
   #0 0x561ceefa78dd in _eth_get_rss_ex_dst_addr net/eth.c:410:17
   #1 0x561ceefa41fb in eth_parse_ipv6_hdr net/eth.c:532:17
   #2 0x561cef7de639 in net_tx_pkt_parse_headers hw/net/net_tx_pkt.c:228:14
   #3 0x561cef7dbef4 in net_tx_pkt_parse hw/net/net_tx_pkt.c:273:9
   #4 0x561ceec29f22 in e1000e_process_tx_desc hw/net/e1000e_core.c:730:29
   #5 0x561ceec28eac in e1000e_start_xmit hw/net/e1000e_core.c:927:9
   #6 0x561ceec1baab in e1000e_set_tdt hw/net/e1000e_core.c:2444:9
   #7 0x561ceebf300e in e1000e_core_write hw/net/e1000e_core.c:3256:9
   #8 0x561cef3cd4cd in e1000e_mmio_write hw/net/e1000e.c:110:5

   Address 0x7ffdef904902 is located in stack of thread T0 at offset 34 in frame
   #0 0x561ceefa320f in eth_parse_ipv6_hdr net/eth.c:486

 This frame has 1 object(s):
   [32, 34) 'ext_hdr' (line 487) <== Memory access at offset 34 overflows 
this variable
   HINT: this may be a false positive if your program uses some custom stack 
unwind mechanism, swapcontext or vfork
 (longjmp and C++ exceptions *are* supported)
   SUMMARY: AddressSanitizer: stack-buffer-overflow net/eth.c:410:17 in 
_eth_get_rss_ex_dst_addr
   Shadow bytes around the buggy address:
 0x10003df188d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df188e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df188f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18910: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
   =>0x10003df18920:[02]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x10003df18970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   Shadow byte legend (one shadow byte represents 8 application bytes):
 Addressable:   00
 Partially addressable: 01 02 03 04 05 06 07
 Stack left redzone:  f1
 Stack right redzone: f3
   ==2859770==ABORTING

Similarly GCC 11 reports:

   net/eth.c: In function 'eth_parse_ipv6_hdr':
   net/eth.c:410:15: error: array subscript 'struct ip6_ext_hdr_routing[0]' is 
partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
 410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
 |  ~^~~
   net/eth.c:485:24: note: while referencing 'ext_hdr'
 485 | struct ip6_ext_hdr ext_hdr;
 |^~~
   net/eth.c:410:38: error: array subscript 'struct ip6_ext_hdr_routing[0]' is 
partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
 410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
 | ~^
   net/eth.c:485:24: note: while referencing 'ext_hdr'
 485 | struct ip6_ext_hdr ext_hdr;
 |^~~

In eth_parse_ipv6_hdr() we called iov_to_buf() to fill the 2 bytes of
the 'ext_hdr' buffer, then _eth_get_rss_ex_dst_addr() tries to access
beside the 2 filled bytes.

Fix by reworking the function, filling the full rt_hdr buffer on the
stack calling iov_to_buf() again.

Add the corresponding qtest case with the fuzzer reproducer.

Cc: qemu-sta...@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1879531
Reported-by: Alexander Bulekov 
Reported-by: Miroslav Rezanina 
Fixes: eb700029c78 ("net_pkt: Extend packet abstraction as required by e1000e 
functionality")
Reviewed-by: Miroslav Rezanina 
Signed-off-by: Philippe Mathieu-Daudé 
---
v2: Do no run test if device not available
---
  net/eth.c  | 25 +++-
  tests/qtest/fuzz-e1000e-test.c | 53 ++
  MAINTAINERS|  1 +
  tests/qtest/meson.build|  1 +
  4 files changed, 66

Re: [RFC PATCH 6/6] softmmu: Restrict watchpoint handlers to TCG accelerator

2021-01-20 Thread Richard Henderson

On 1/17/21 6:48 AM, Philippe Mathieu-Daudé wrote:
> Watchpoint funtions use cpu_restore_state() which is only
> available when TCG accelerator is built. Restrict them
> to TCG.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because we could keep that code by adding an empty
> stub for cpu_restore_state(), but it is unclear as
> the function is named generically.
> ---
>  include/hw/core/cpu.h | 4 ++--
>  softmmu/physmem.c | 4 
>  2 files changed, 6 insertions(+), 2 deletions(-)

And all of the targets that use this are already conditionalized for this?  As
opposed to leaving the declarations and adding stubs?


r~

Re: [RFC PATCH 5/6] accel/tcg: Restrict cpu_io_recompile() from other accelerators

2021-01-20 Thread Richard Henderson

On 1/17/21 6:48 AM, Philippe Mathieu-Daudé wrote:
> As cpu_io_recompile() is only called within TCG accelerator
> in cputlb.c, declare it locally.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because not sure if other accelerator could implement this.
> ---
>  accel/tcg/internal.h| 2 ++
>  include/exec/exec-all.h | 1 -
>  2 files changed, 2 insertions(+), 1 deletion(-)

Queued to tcg-next.

r~

Re: [PATCH 4/6] accel/tcg: Declare missing cpu_loop_exit*() stubs

2021-01-20 Thread Richard Henderson

On 1/17/21 6:48 AM, Philippe Mathieu-Daudé wrote:
> cpu_loop_exit*() functions are declared in accel/tcg/cpu-exec-common.c,
> and are not available when TCG accelerator is not built. Add stubs so
> linking without TCG succeed.
> 
> Problematic files:
> 
> - hw/semihosting/console.c in qemu_semihosting_console_inc()
> - hw/ppc/spapr_hcall.c in h_confer()
> - hw/s390x/ipl.c in s390_ipl_reset_request()
> - hw/misc/mips_itu.c
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Queued to tcg-next.


r~

Re: [PATCH 3/6] accel/tcg: Restrict tb_gen_code() from other accelerators

2021-01-20 Thread Richard Henderson

On 1/17/21 11:12 PM, Claudio Fontana wrote:
> On 1/17/21 5:48 PM, Philippe Mathieu-Daudé wrote:
>> tb_gen_code() is only called within TCG accelerator,
>> declare it locally.
> 
> Is this used only in accel/tcg/cpu-exec.c ? Should it be a static function 
> there?

Possibly, but there's a *lot* of code that would have to be moved.  For now,
queuing a slightly modified version of the patch.

>> --- a/accel/tcg/user-exec.c
>> +++ b/accel/tcg/user-exec.c
>> @@ -28,6 +28,7 @@
>>  #include "qemu/atomic128.h"
>>  #include "trace/trace-root.h"
>>  #include "trace/mem.h"
>> +#include "internal.h"

Not needed by this patch.


r~

Re: [PATCH v2 0/4] nbd/server: Quiesce coroutines on context switch

2021-01-20 Thread Sergio Lopez

On Wed, Jan 20, 2021 at 02:49:14PM -0600, Eric Blake wrote:
> On 12/14/20 11:05 AM, Sergio Lopez wrote:
> > This series allows the NBD server to properly switch between AIO contexts,
> > having quiesced recv_coroutine and send_coroutine before doing the 
> > transition.
> > 
> > We need this because we send back devices running in IO Thread owned 
> > contexts
> > to the main context when stopping the data plane, something that can happen
> > multiple times during the lifetime of a VM (usually during the boot 
> > sequence or
> > on a reboot), and we drag the NBD server of the correspoing export with it.
> > 
> > While there, fix also a problem caused by a cross-dependency between
> > closing the export's client connections and draining the block
> > layer. The visible effect of this problem was QEMU getting hung when
> > the guest request a power off while there's an active NBD client.
> > 
> > v2:
> >  - Replace "virtio-blk: Acquire context while switching them on
> >  dataplane start" with "block: Honor blk_set_aio_context() context
> >  requirements" (Kevin Wolf)
> >  - Add "block: Avoid processing BDS twice in
> >  bdrv_set_aio_context_ignore()"
> >  - Add "block: Close block exports in two steps"
> >  - Rename nbd_read_eof() to nbd_server_read_eof() (Eric Blake)
> >  - Fix double space and typo in comment. (Eric Blake)
> 
> ping - where do we stand on this series?  Patches 1 and 3 have positive
> reviews, I'll queue them now; patches 2 and 4 had enough comments that
> I'm guessing I should wait for a v3?

Yes, I have almost ready a v3. Will send it out today. I think it'd be
better to pull all four patches at the same time, as "block: Honor
blk_set_aio_context() context requirements" may cause trouble without
the patch to avoid double processing in
"bdrv_set_aio_context_ignore()".

Thanks,
Sergio.
 
> > 
> > Sergio Lopez (4):
> >   block: Honor blk_set_aio_context() context requirements
> >   block: Avoid processing BDS twice in bdrv_set_aio_context_ignore()
> >   nbd/server: Quiesce coroutines on context switch
> >   block: Close block exports in two steps
> > 
> >  block.c |  27 ++-
> >  block/export/export.c   |  10 +--
> >  blockdev-nbd.c  |   2 +-
> >  hw/block/dataplane/virtio-blk.c |   4 ++
> >  hw/block/dataplane/xen-block.c  |   7 +-
> >  hw/scsi/virtio-scsi.c   |   6 +-
> >  include/block/export.h  |   4 +-
> >  nbd/server.c| 120 
> >  qemu-nbd.c  |   2 +-
> >  stubs/blk-exp-close-all.c   |   2 +-
> >  10 files changed, 156 insertions(+), 28 deletions(-)
> > 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
> 


signature.asc
Description: PGP signature

Re: [PATCH 2/6] accel/tcg: Restrict tb_flush_jmp_cache() from other accelerators

2021-01-20 Thread Richard Henderson

On 1/17/21 6:48 AM, Philippe Mathieu-Daudé wrote:
> tb_flush_jmp_cache() is only called within TCG accelerator,
> declare it locally.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> We could also inline it in cputlb.c, the single user.
> ---

Yes, I think we should move it to cputlb.c.
I have queued a patch to do so.


r~

Re: [PATCH 1/6] accel/tcg: Make cpu_gen_init() static

2021-01-20 Thread Richard Henderson

On 1/17/21 6:48 AM, Philippe Mathieu-Daudé wrote:
> cpu_gen_init() is TCG specific, only used in tcg/translate-all.c.
> No need to export it to other accelerators, declare it statically.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> We could also inline the 1-line call..
> ---
>  include/exec/exec-all.h   | 2 --
>  accel/tcg/translate-all.c | 2 +-
>  2 files changed, 1 insertion(+), 3 deletions(-)

Applied to tcg-next.

r~

Re: [PATCH 0/4] Introduce a battery, AC adapter, and lid button

2021-01-20 Thread Leonid Bloch

Hi Phil,

Thanks for your feedback! Please see below.

On Wed, Jan 20, 2021 at 11:52 PM Philippe Mathieu-Daudé  wrote:
>
> Hi Leonid, Marcel,
>
> On 1/20/21 9:54 PM, Leonid Bloch wrote:
> > This series introduces the following ACPI devices:
> >
> > * Battery
> > * AC adapter
> > * Laptop lid button
> >
> > When running QEMU on a laptop, these paravirtualized devices reflect the
> > state of these physical devices onto the guest. This functionality is
> > relevant not only for laptops, but also for any other device which has e.g.
> > a battery. This even allows to insert a ``fake'' battery to the
> > guest, in a form of a file which emulates the behavior of the actual
> > battery in sysfs. A possible use case for such a ``fake'' battery can be
> > limiting the budget of VM usage to a subscriber, in a naturally-visible way.
>
> Your series looks good. Now for this feature to be even more useful for
> the community, it would be better to
>
> 1/ Have a generic (kind of abstract QDev) battery model.
>Your model would be the ISA implementation. But we could add LPC,
>SPI or I2C implementations for example.

It definitely feels that it needs to be more generic, and I thought
how to make it so, but so far it is what I came up with. I'll think
some more, but any ideas are welcome, cause nowadays I'm doing this in
my free time only.

> 2/ Make it a model backend accepting various kind of frontends:
>- host Linux sysfs mirroring is a particular frontend implementation
>- mirroring on Windows would be another
>- any connection (TCP) to battery simulator (Octave, ...)

Well, it does accept an arbitrary file to represent a battery, so this
covers the battery simulator, does it? As for Windows - indeed, it
would be nice to have.

> Meanwhile 2/ is not available, it would be useful to have QMP commands
> to set the battery charge and state (also max capacity).

But the battery state is determined by the physical battery, or by an
externally provided file. Do you mean introducing another source for
battery information which will be controlled by QMP commands?
As for the max capacity, as with an actual battery, the "QEMU battery"
has it set "by the manufacturer". It is not passed through from the
host, for simplicity sake, and only the percentage is passed. How will
it help if we allow to set the max capacity? It's something pretty
much transparent to the user. (But if there is a use case, of course
it can be done.)

> Ditto QMP command to set the LID/AC adapter state.
>
> > But of course, the main purpose here is addressing the desktop users.
> >
> > This series was tested with Windows and (desktop) Linux guests, on which
> > indeed the battery icon appears in the corresponding state (full,
> > charging, discharging, time remaining to empty, etc.) and the AC adapter
> > plugging/unplugging behaves as expected. So is the laptop lid button.
> [...]
>
> In patch #2 you comment 'if a "fake" host battery is to be provided,
> a 'sysfs_path' property allows to override the default one.'.
>
> Eventually you'd provide a such fake file as example, ideally used
> by a QTest.

Sure! I will - it's definitely a good idea.

> Another question. If the battery is disconnected, is there an event
> propagated to the guest?

No. I definitely need to add! Thanks!

> Thanks for contributing these patches :)

Thank you!
Leonid.

> Phil.

[Bug 1756728] Re: virtio-scsi virtio-scsi-single and virtio-blk on raw image, games are not starting

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1756728

Title:
  virtio-scsi virtio-scsi-single and virtio-blk on raw image, games are
  not starting

Status in QEMU:
  Expired

Bug description:
  Using virtio-scsi / vitro-scsi-single or vitro-blk on a ZFS/raw image,
  most Assassin's Creed (Origin especially) games are not starting
  (uPlay), it seems related to some check or commands applications are
  doing on the disk drive that fails to respond.

  Workaround has been found by creating a VHD volume, mounting it and
  installing games on it.

  On a side note, application like HDDScan, CrystalDiskInfo returns
  nothing regarding disk drives.

  Guest:
  Windows 10 (build 1709)

  Host:
  $ kvm --version
  QEMU emulator version 2.9.1 pve-qemu-kvm_2.9.1-9
  $ uname -a
  Linux  4.13.13-5-pve #1 SMP PVE 4.13.13-36 (Mon, 15 Jan 2018 12:36:49 
+0100) x86_64 GNU/Linux

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1756728/+subscriptions

[Bug 1757323] Re: blue screen running windows 10 install DVD on qemu

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1757323

Title:
  blue screen running windows 10 install DVD on qemu

Status in QEMU:
  Expired

Bug description:
  i get a blue screen at the first screen of the windows 10 DVD setup
  (Win10_1709_English_x64.iso, available from MS).

  The DVD boots fine, and gets to the first dialog: 
http://codewithoutborders.com/posted/qemu1.png
  and then if i just wait a minute of so it blue screen's.
  either DRIVER IRQL NOT LESS OR EQUAL: 
http://codewithoutborders.com/posted/qemu2.png
  or KMODE EXCEPTION NOT HANDLED: http://codewithoutborders.com/posted/qemu3.png


  
  the qemu command-line is:

  /usr/bin/qemu-system-x87_64 \
   -boot strict=on \
   -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-generic/monitor.sock,server,nowait
 \
   -chardev spicevmc,id=charchannel0,name=vdagent \
   -cpu 
core2duo,+lahf_lm,+pdcm,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,kvm=off
 \
   -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 \
   -device 
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 \
   -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 \
   -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 \
   -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1 
\
   -device 
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2
 \
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 \
   -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
   -device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
 \
   -drive 
file=/mnt/ISOs/Win10_1709_English_x64.iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on
 \
   -global kvm-pit.lost_tick_policy=discard \
   -global PIIX4_PM.disable_s3=1 \
   -global PIIX4_PM.disable_s4=1 \
   -m 4096 \
   -machine pc-i440fx-xenial,accel=tcg,usb=off \
   -mon chardev=charmonitor,id=monitor,mode=control \
   -msg timestamp=on \
   -name generic \
   -nodefaults \
   -no-hpet \
   -no-shutdown \
   -no-user-config \
   -realtime mlock=off \
   -rtc base=utc,driftfix=slew \
   -S \
   -smp 2,sockets=2,cores=1,threads=1 \
   -spice 
port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on
 \
   -uuid 3902a801-42dd-4bf2-8f3a-cbc68f4f8564

  
  $ /usr/bin/qemu-system-x87_64 --version
  QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.24), Copyright (c) 
2003-2008 Fabrice Bellard

  $ uname -a
  Linux host 4.13.0-37-generic #42~16.04.1-Ubuntu SMP Wed Mar 7 16:03:28 UTC 
2018 x86_64 x86_64 x86_64 GNU/Linux

  $ cat /proc/cpuinfo 
  processor : 0
  vendor_id : GenuineIntel
  cpu family: 6
  model : 15
  model name: Intel(R) Core(TM)2 Quad CPU   @ 2.66GHz
  stepping  : 7
  microcode : 0x66
  cpu MHz   : 2671.406
  cache size: 4096 KB
  physical id   : 0
  siblings  : 4
  core id   : 0
  cpu cores : 4
  apicid: 0
  initial apicid: 0
  fpu   : yes
  fpu_exception : yes
  cpuid level   : 10
  wp: yes
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm 
constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti retpoline 
tpr_shadow dtherm
  bugs  : cpu_meltdown spectre_v1 spectre_v2
  bogomips  : 5342.81
  clflush size  : 64
  cache_alignment   : 64
  address sizes : 36 bits physical, 48 bits virtual
  power management:

  ... 3 more times

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1757323/+subscriptions

[Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843151

Title:
  Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

Status in QEMU:
  Expired

Bug description:
  Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.  virt-viewer 8.0.

  Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.

  Have replicated this both on a system using amdgpu and one using
  integrated ASPEED graphics.

  Downgrading from 4.1.0 to 4.0.0 works as usual, see:
  https://www.youtube.com/watch?v=NyMdcYwOCvY

  Going back to 4.1.0 reproduces, see:
  https://www.youtube.com/watch?v=H3nGG2Mk6i0

  4.1.0 displays fine until KMS kicks in.

  Using 4.1.0 with virtio-vga doesn't cause this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843151/+subscriptions

[Bug 1750899] Re: Mouse cursor sometimes can't pass the invisible border on the right side of the screen

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1750899

Title:
  Mouse cursor sometimes can't pass the invisible border on the right
  side of the screen

Status in QEMU:
  Expired

Bug description:
  I'm using qemu 2.11 on Gentoo Linux, with configured GPU passthrough (Radeon 
RX580) to the guest Windows 10.
  This configuration is alive for last 4 years, this time I changed a lot qemu, 
linux kernel and windows versions, changed GPU and always all was working as 
expected. I always used standard PS/2 mouse emulation and that was enough for 
me.

  Now, I bought two new monitors, instead of old one, and setup them as
  one logical monitor, using technology called Eyefinity - it's a part
  of standard Radeon software. Now Windows thinks, that I have one
  monitor with resolution 2160x1920 (I bought Dell monitors with a thin
  borders and use them in portrait mode).

  Windows uses it without any problems, but mouse become crazy - sometimes (~3 
times from each 5) I can't move cursor to the right border of the screen, it 
looks like the invisible vertical border. I spent really huge amount of time to 
understand, which component is the root of problem and found, that it's really 
a mouse. I tried all possible variants (standard, tablet, virtio-mouse-pci, 
virtio-tablet-pci), and found, that in both mouse variants bug is reproducing, 
and in both tablet variants - cursor stuck near all real borders and corners, 
so it's not a variant too.
  The only working variant becomes passing real USB port to my VM and insert 
second mouse to this port. So, now it's working, but I have two mice on my 
working place, which doesn't seems very useful.

  Here is my command line:

  QEMU_AUDIO_DRV=pa QEMU_PA_SAMPLES=4096 qemu-system-x86_64 -enable-kvm -M q35 
-m 12168 -cpu host,kvm=off -smp 4,sockets=1,cores=4 \
  -bios /usr/share/qemu/bios.bin -rtc base=localtime -vga none -device 
secondary-vga \
  -drive 
id=virtiocd,if=none,format=raw,file=/home/akushsky/virtio-win-0.1.141.iso \
  -device driver=ide-cd,bus=ide.1,drive=virtiocd \
  -device 
ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
  -device 
vfio-pci,host=05:00.0,bus=root.1,addr=00.0,multifunction=on,romfile=/opt/kvm/images/Sapphire.RX580.8192.170320_1.bin,x-vga=on
 \
  -device virtio-scsi-pci,id=scsi \
  -drive 
file=/dev/sdb,id=disk,format=raw,if=none,discard=on,cache=none,aio=native,detect-zeroes=unmap
 -device scsi-hd,drive=disk,id=scsi0 \
  -device ich9-intel-hda,bus=pcie.0,addr=1b.0,id=sound0 -device 
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
  -usb -usbdevice host:046d:c52b

  All in all, I checked on Windows 7 and Windows 10, and on qemu 2.10
  and 2.11 - bug is always reproducible.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1750899/+subscriptions

[Bug 1843073] Re: Network loose connection for long time under light load guest winxp64 with virtio on i5-8350U

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843073

Title:
  Network loose connection for long time under light load guest winxp64
  with virtio on i5-8350U

Status in QEMU:
  Expired

Bug description:
  I have issue with qemu and winxp guest on i5-8350U.

  First of all, if i run same vm with same config on i5 9660k i do not see such 
issue.
  Both pc have ubuntu 19.04 x86_64.

  Guest is winxp64, tried:
  1) stable guest drivers, latest drivers
  2) all virtio, only network r18169, bridge to host eth0, just default virbr0.
  3) qemu from ubuntu 19.04, and tried latest libvirt and qeumu compiled from 
sources.
  4) tried zram as block device

  I need really lightweight win to run only one app, so win7 and win10
  is not an option.

  
  

  
winxp
93921ab3-222a-4e5f-89c4-36703fc65cb4

  http://libosinfo.org/xmlns/libvirt/domain/1.0;>
http://microsoft.com/win/xp"/>
  

8388608
8388608
4

  
  
  
  


  hvm


  
  
  






  
  


  


  
  
  
  

destroy
restart
destroy

  
  


  /usr/local/bin/qemu-system-x86_64
  






  
  





  
  

  
  


  
  


  
  


  
  
  

  
  

  
  




  
  

  

  
  

  
  

  
  
  
  






  
  

  
  


  
  

  

  

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843073/+subscriptions

[Bug 1756538] Re: Minimal Ubuntu vs. Debian differences

2021-01-20 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1756538

Title:
  Minimal Ubuntu vs. Debian differences

Status in QEMU:
  Expired

Bug description:
  I'm using Qemu on Ubuntu (minimal) and Debian (minimal) on Android
  (Arch64) via Linux Deploy to run Windows guests. Here's a few issues I
  encountered:

  1) Qemu on (minimal) Debian 9 and Ubuntu cannot run Windows 7-10
  guests (only Windows XP and below) because there's a black screen
  after the boot menu. Qemu on Debian 10, however, can run Windows 7.
  Incidentally, these distros run on the host in bios compatibility mode
  instead of UEFI. Ubuntu Desktop (full distro) on other hosts does not
  display the black screen when running Qemu.

  2) Qemu on Debian 9-10 (minimal) does not display fullscreen - but
  Ubuntu minimal does display full-screen.

  3) Qemu on Limbo PC Emulator and on Debian 9-10 only run windows
  guests at 1 GHz using the default Qemu CPU, but Ubuntu runs windows
  guests at 2 GHz using the default Qemu CPU.

  4) Enable KVM doesn't work, and virtualization isn't detected through
  Limbo PC Emulator and minimal Linux distros running on Android -
  perhaps is a problem with running Linux distros via Linux Deploy using
  Chroot on Android (not so much a Qemu-KVM issue) and failing to detect
  ARMv8-A CPUs that are indeed capable of virtualization.

  Can anyone explain these differences? I believe they are all using the
  latest versions of Qemu.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1756538/+subscriptions

Re: [PATCH V4 4/4] bsd-user: space required after semicolon

2021-01-20 Thread shiliyang

Hi Warner:
   I might misunderstood it before.
   So, what should I do is to make a new pull request for bsd-user-rebase-3.1 
branch. Am I right?
   I have submitted a pull request: 
https://github.com/qemu-bsd-user/qemu-bsd-user/pull/4
   Please review it.

Thanks.
Best regards.

On 2021/1/18 10:35, Warner Losh wrote:
> Can you submit this to our current fork at http://github.com/qemu-bsd-user 
>  on the rebase-3.1 branch? Having churn like 
> this upstream just slows us down since we have extensive changes and these 
> will conflict.
> 
> Warner 
> 
> On Sun, Jan 17, 2021, 7:21 PM shiliyang  > wrote:
> 
> This patch fixes error style problems found by checkpatch.pl 
> :
> ERROR: space required after that ','
> 
> Signed-off-by: Liyang Shi  >
> ---
>  bsd-user/elfload.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
> index 2842dfe56b..c89c998c22 100644
> --- a/bsd-user/elfload.c
> +++ b/bsd-user/elfload.c
> @@ -636,8 +636,8 @@ int load_elf_binary(struct bsd_binprm *bprm, struct 
> target_pt_regs *regs,
> 
>  #ifndef __FreeBSD__
>      bprm->p = copy_elf_strings(1, >filename, bprm->page, bprm->p);
> -    bprm->p = copy_elf_strings(bprm->envc,bprm->envp,bprm->page,bprm->p);
> -    bprm->p = copy_elf_strings(bprm->argc,bprm->argv,bprm->page,bprm->p);
> +    bprm->p = copy_elf_strings(bprm->envc, bprm->envp, bprm->page, 
> bprm->p);
> +    bprm->p = copy_elf_strings(bprm->argc, bprm->argv, bprm->page, 
> bprm->p);
>      if (!bprm->p) {
>          retval = -E2BIG;
>      }
> @@ -739,7 +739,7 @@ int load_elf_binary(struct bsd_binprm *bprm, struct 
> target_pt_regs *regs,
>              if (retval >= 0) {
>                  retval = lseek(interpreter_fd, 0, SEEK_SET);
>                  if(retval >= 0) {
> -                    retval = read(interpreter_fd,bprm->buf,128);
> +                    retval = read(interpreter_fd, bprm->buf, 128);
>                  }
>              }
>              if (retval >= 0) {
> @@ -769,7 +769,7 @@ int load_elf_binary(struct bsd_binprm *bprm, struct 
> target_pt_regs *regs,
>          }
> 
>          if (interp_elf_ex.e_ident[0] != 0x7f ||
> -                strncmp((char *)_elf_ex.e_ident[1], "ELF",3) != 
> 0) {
> +                strncmp((char *)_elf_ex.e_ident[1], "ELF", 3) != 
> 0) {
>              interpreter_type &= ~INTERPRETER_ELF;
>          }
> 
> @@ -792,7 +792,7 @@ int load_elf_binary(struct bsd_binprm *bprm, struct 
> target_pt_regs *regs,
>              passed_p = passed_fileno;
> 
>              if (elf_interpreter) {
> -                bprm->p = 
> copy_elf_strings(1,_p,bprm->page,bprm->p);
> +                bprm->p = copy_elf_strings(1, _p, bprm->page, 
> bprm->p);
>                  bprm->argc++;
>              }
>          }
> -- 
> 2.29.1.59.gf9b6481aed
>

[PATCH] tcg: Increase the static number of temporaries

2021-01-20 Thread Richard Henderson

This isn't a total or permanent solution to the problem of running
out of temporaries, but it puts off the issue for a bit.

Make the assert in tcg_temp_alloc unconditional.  If we do run out
of temps, this can fail much later as a weird SIGSEGV, due to the
buffer overrun of the temp array.

Remove the inlines from tcg_temp_alloc and tcg_global_alloc.

Buglink: https://bugs.launchpad.net/bugs/1912065
Signed-off-by: Richard Henderson 
---

There are more bugs that need fixing in order to actually make
the dynamic allocation scheme work.  Rather than keep this bug
pending longer, hack around it and make the SEGV an ABRT.

r~

---
 include/tcg/tcg.h | 2 +-
 tcg/tcg.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index c5a9d65d5f..0187de1352 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -275,7 +275,7 @@ typedef struct TCGPool {
 
 #define TCG_POOL_CHUNK_SIZE 32768
 
-#define TCG_MAX_TEMPS 512
+#define TCG_MAX_TEMPS 1024
 #define TCG_MAX_INSNS 512
 
 /* when the size of the arguments of a called function is smaller than
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8f8badb61c..5110f6f39c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1204,14 +1204,14 @@ void tcg_func_start(TCGContext *s)
 QSIMPLEQ_INIT(>labels);
 }
 
-static inline TCGTemp *tcg_temp_alloc(TCGContext *s)
+static TCGTemp *tcg_temp_alloc(TCGContext *s)
 {
 int n = s->nb_temps++;
-tcg_debug_assert(n < TCG_MAX_TEMPS);
+g_assert(n < TCG_MAX_TEMPS);
 return memset(>temps[n], 0, sizeof(TCGTemp));
 }
 
-static inline TCGTemp *tcg_global_alloc(TCGContext *s)
+static TCGTemp *tcg_global_alloc(TCGContext *s)
 {
 TCGTemp *ts;
 
-- 
2.25.1

[PULL 10/13] iotests: define group in each iotest

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

We are going to drop group file. Define group in tests as a preparatory
step.

The patch is generated by

cd tests/qemu-iotests

grep '^[0-9]\{3\} ' group | while read line; do
file=$(awk '{print $1}' <<< "$line");
groups=$(sed -e 's/^... //' <<< "$line");
awk "NR==2{print \"# group: $groups\"}1" $file > tmp;
cat tmp > $file;
done

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-7-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/001 | 1 +
 tests/qemu-iotests/002 | 1 +
 tests/qemu-iotests/003 | 1 +
 tests/qemu-iotests/004 | 1 +
 tests/qemu-iotests/005 | 1 +
 tests/qemu-iotests/007 | 1 +
 tests/qemu-iotests/008 | 1 +
 tests/qemu-iotests/009 | 1 +
 tests/qemu-iotests/010 | 1 +
 tests/qemu-iotests/011 | 1 +
 tests/qemu-iotests/012 | 1 +
 tests/qemu-iotests/013 | 1 +
 tests/qemu-iotests/014 | 1 +
 tests/qemu-iotests/015 | 1 +
 tests/qemu-iotests/017 | 1 +
 tests/qemu-iotests/018 | 1 +
 tests/qemu-iotests/019 | 1 +
 tests/qemu-iotests/020 | 1 +
 tests/qemu-iotests/021 | 1 +
 tests/qemu-iotests/022 | 1 +
 tests/qemu-iotests/023 | 1 +
 tests/qemu-iotests/024 | 1 +
 tests/qemu-iotests/025 | 1 +
 tests/qemu-iotests/026 | 1 +
 tests/qemu-iotests/027 | 1 +
 tests/qemu-iotests/028 | 1 +
 tests/qemu-iotests/029 | 1 +
 tests/qemu-iotests/030 | 1 +
 tests/qemu-iotests/031 | 1 +
 tests/qemu-iotests/032 | 1 +
 tests/qemu-iotests/033 | 1 +
 tests/qemu-iotests/034 | 1 +
 tests/qemu-iotests/035 | 1 +
 tests/qemu-iotests/036 | 1 +
 tests/qemu-iotests/037 | 1 +
 tests/qemu-iotests/038 | 1 +
 tests/qemu-iotests/039 | 1 +
 tests/qemu-iotests/040 | 1 +
 tests/qemu-iotests/041 | 1 +
 tests/qemu-iotests/042 | 1 +
 tests/qemu-iotests/043 | 1 +
 tests/qemu-iotests/044 | 1 +
 tests/qemu-iotests/045 | 1 +
 tests/qemu-iotests/046 | 1 +
 tests/qemu-iotests/047 | 1 +
 tests/qemu-iotests/048 | 1 +
 tests/qemu-iotests/049 | 1 +
 tests/qemu-iotests/050 | 1 +
 tests/qemu-iotests/051 | 1 +
 tests/qemu-iotests/052 | 1 +
 tests/qemu-iotests/053 | 1 +
 tests/qemu-iotests/054 | 1 +
 tests/qemu-iotests/055 | 1 +
 tests/qemu-iotests/056 | 1 +
 tests/qemu-iotests/057 | 1 +
 tests/qemu-iotests/058 | 1 +
 tests/qemu-iotests/059 | 1 +
 tests/qemu-iotests/060 | 1 +
 tests/qemu-iotests/061 | 1 +
 tests/qemu-iotests/062 | 1 +
 tests/qemu-iotests/063 | 1 +
 tests/qemu-iotests/064 | 1 +
 tests/qemu-iotests/065 | 1 +
 tests/qemu-iotests/066 | 1 +
 tests/qemu-iotests/068 | 1 +
 tests/qemu-iotests/069 | 1 +
 tests/qemu-iotests/070 | 1 +
 tests/qemu-iotests/071 | 1 +
 tests/qemu-iotests/072 | 1 +
 tests/qemu-iotests/073 | 1 +
 tests/qemu-iotests/074 | 1 +
 tests/qemu-iotests/075 | 1 +
 tests/qemu-iotests/076 | 1 +
 tests/qemu-iotests/077 | 1 +
 tests/qemu-iotests/078 | 1 +
 tests/qemu-iotests/079 | 1 +
 tests/qemu-iotests/080 | 1 +
 tests/qemu-iotests/081 | 1 +
 tests/qemu-iotests/082 | 1 +
 tests/qemu-iotests/083 | 1 +
 tests/qemu-iotests/084 | 1 +
 tests/qemu-iotests/085 | 1 +
 tests/qemu-iotests/086 | 1 +
 tests/qemu-iotests/087 | 1 +
 tests/qemu-iotests/088 | 1 +
 tests/qemu-iotests/089 | 1 +
 tests/qemu-iotests/090 | 1 +
 tests/qemu-iotests/091 | 1 +
 tests/qemu-iotests/092 | 1 +
 tests/qemu-iotests/093 | 1 +
 tests/qemu-iotests/094 | 1 +
 tests/qemu-iotests/095 | 1 +
 tests/qemu-iotests/096 | 1 +
 tests/qemu-iotests/097 | 1 +
 tests/qemu-iotests/098 | 1 +
 tests/qemu-iotests/099 | 1 +
 tests/qemu-iotests/101 | 1 +
 tests/qemu-iotests/102 | 1 +
 tests/qemu-iotests/103 | 1 +
 tests/qemu-iotests/104 | 1 +
 tests/qemu-iotests/105 | 1 +
 tests/qemu-iotests/106 | 1 +
 tests/qemu-iotests/107 | 1 +
 tests/qemu-iotests/108 | 1 +
 tests/qemu-iotests/109 | 1 +
 tests/qemu-iotests/110 | 1 +
 tests/qemu-iotests/111 | 1 +
 tests/qemu-iotests/112 | 1 +
 tests/qemu-iotests/113 | 1 +
 tests/qemu-iotests/114 | 1 +
 tests/qemu-iotests/115 | 1 +
 tests/qemu-iotests/116 | 1 +
 tests/qemu-iotests/117 | 1 +
 tests/qemu-iotests/118 | 1 +
 tests/qemu-iotests/119 | 1 +
 tests/qemu-iotests/120 | 1 +
 tests/qemu-iotests/121 | 1 +
 tests/qemu-iotests/122 | 1 +
 tests/qemu-iotests/123 | 1 +
 tests/qemu-iotests/124 | 1 +
 tests/qemu-iotests/125 | 1 +
 tests/qemu-iotests/126 | 1 +
 tests/qemu-iotests/127 | 1 +
 tests/qemu-iotests/128 | 1 +
 tests/qemu-iotests/129 | 1 +
 tests/qemu-iotests/130 | 1 +
 tests/qemu-iotests/131 | 1 +
 tests/qemu-iotests/132 | 1 +
 tests/qemu-iotests/133 | 1 +
 tests/qemu-iotests/134 | 1 +
 tests/qemu-iotests/135 | 1 +
 tests/qemu-iotests/136 | 1 +
 tests/qemu-iotests/137 | 1 +
 tests/qemu-iotests/138 | 1 +
 tests/qemu-iotests/139 | 1 +
 tests/qemu-iotests/140 | 1 +
 tests/qemu-iotests/141 | 1 +
 tests/qemu-iotests/143 | 1 +
 tests/qemu-iotests/144 | 1 +
 tests/qemu-iotests/145 | 1 +
 tests/qemu-iotests/146 | 1 +
 tests/qemu-iotests/147 | 1 +
 tests/qemu-iotests/148 | 1 +
 tests/qemu-iotests/149 | 1 +
 tests/qemu-iotests/150 | 1 +
 tests/qemu-iotests/151 | 1 +
 tests/qemu-iotests/152 | 1 +

[PULL 08/13] iotests: make tests executable

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

All other test files are executable. Fix these.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-5-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/283 | 0
 tests/qemu-iotests/298 | 0
 tests/qemu-iotests/299 | 0
 3 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 tests/qemu-iotests/283
 mode change 100644 => 100755 tests/qemu-iotests/298
 mode change 100644 => 100755 tests/qemu-iotests/299

diff --git a/tests/qemu-iotests/283 b/tests/qemu-iotests/283
old mode 100644
new mode 100755
diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
old mode 100644
new mode 100755
diff --git a/tests/qemu-iotests/299 b/tests/qemu-iotests/299
old mode 100644
new mode 100755
-- 
2.30.0

[PULL 12/13] iotests.py: fix qemu_tool_pipe_and_status()

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

qemu_img_args variable is unrelated here. We should print just args.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20201130134024.19212-4-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index dcdcd0387f20..ea5c3c51624e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -101,9 +101,8 @@ def qemu_tool_pipe_and_status(tool: str, args: 
Sequence[str],
 universal_newlines=True)
 output = subp.communicate()[0]
 if subp.returncode < 0:
-sys.stderr.write('%s received signal %i: %s\n'
- % (tool, -subp.returncode,
-' '.join(qemu_img_args + list(args
+cmd = ' '.join(args)
+sys.stderr.write(f'{tool} received signal {-subp.returncode}: {cmd}\n')
 return (output, subp.returncode)

 def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
-- 
2.30.0

[PULL 13/13] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Just drop code duplication.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20201130134024.19212-5-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ea5c3c51624e..2e89c0ab1abc 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -188,14 +188,7 @@ def img_info_log(filename, filter_path=None, 
imgopts=False, extra_args=()):
 def qemu_io(*args):
 '''Run qemu-io and return the stdout data'''
 args = qemu_io_args + list(args)
-subp = subprocess.Popen(args, stdout=subprocess.PIPE,
-stderr=subprocess.STDOUT,
-universal_newlines=True)
-output = subp.communicate()[0]
-if subp.returncode < 0:
-sys.stderr.write('qemu-io received signal %i: %s\n'
- % (-subp.returncode, ' '.join(args)))
-return output
+return qemu_tool_pipe_and_status('qemu-io', args)[0]

 def qemu_io_log(*args):
 result = qemu_io(*args)
-- 
2.30.0

[PULL 09/13] iotests/294: add shebang line

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-6-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/294 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qemu-iotests/294 b/tests/qemu-iotests/294
index 87da35db496a..4c375ed609e4 100755
--- a/tests/qemu-iotests/294
+++ b/tests/qemu-iotests/294
@@ -1,3 +1,4 @@
+#!/usr/bin/env bash
 #
 # Copyright (C) 2019 Red Hat, Inc.
 #
-- 
2.30.0

[PULL 07/13] iotests: fix some whitespaces in test output files

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

We are going to be stricter about comparing test result with .out
files. So, fix some whitespaces now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-4-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/175.out |  2 +-
 tests/qemu-iotests/271.out | 12 ++--
 tests/qemu-iotests/287.out | 10 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qemu-iotests/175.out b/tests/qemu-iotests/175.out
index 39c2ee0f6250..40a5bd1ce6c2 100644
--- a/tests/qemu-iotests/175.out
+++ b/tests/qemu-iotests/175.out
@@ -23,4 +23,4 @@ size=4096, min allocation
 == resize empty image with block_resize ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=0
 size=1048576, min allocation
- *** done
+*** done
diff --git a/tests/qemu-iotests/271.out b/tests/qemu-iotests/271.out
index 92deb7ebb0ea..81043ba4d77e 100644
--- a/tests/qemu-iotests/271.out
+++ b/tests/qemu-iotests/271.out
@@ -500,7 +500,7 @@ L2 entry #0: 0x80050001 0001

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
 L2 entry #1: 0x8006 0001
-qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 
0x4, L2 index: 0x1); further corruption events will be suppressed
+qcow2: Marking image as corrupt: Invalid cluster entry found (L2 offset: 
0x4, L2 index: 0x1); further corruption events will be suppressed
 write failed: Input/output error

 ### Corrupted L2 entries - write test (unallocated) ###
@@ -515,14 +515,14 @@ L2 entry #0: 0x8006 0001

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
 L2 entry #0: 0x 0001
-qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 
0x4, L2 index: 0); further corruption events will be suppressed
+qcow2: Marking image as corrupt: Invalid cluster entry found (L2 offset: 
0x4, L2 index: 0); further corruption events will be suppressed
 write failed: Input/output error

 # Both 'subcluster is zero' and 'subcluster is allocated' bits set

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
 L2 entry #1: 0x 00010001
-qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 
0x4, L2 index: 0x1); further corruption events will be suppressed
+qcow2: Marking image as corrupt: Invalid cluster entry found (L2 offset: 
0x4, L2 index: 0x1); further corruption events will be suppressed
 write failed: Input/output error

 ### Compressed cluster with subcluster bitmap != 0 - write test ###
@@ -583,7 +583,7 @@ read 524288/524288 bytes at offset 0
 read 524288/524288 bytes at offset 524288
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 Offset  Length  Mapped to   File
-0   0x80   TEST_DIR/t.qcow2.base
+0   0x8 0   TEST_DIR/t.qcow2.base
 # backing file and preallocation=falloc
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=524288 
backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw preallocation=falloc
 Image resized.
@@ -592,7 +592,7 @@ read 524288/524288 bytes at offset 0
 read 524288/524288 bytes at offset 524288
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 Offset  Length  Mapped to   File
-0   0x80   TEST_DIR/t.qcow2.base
+0   0x8 0   TEST_DIR/t.qcow2.base
 # backing file and preallocation=full
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=524288 
backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw preallocation=full
 Image resized.
@@ -601,7 +601,7 @@ read 524288/524288 bytes at offset 0
 read 524288/524288 bytes at offset 524288
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 Offset  Length  Mapped to   File
-0   0x80   TEST_DIR/t.qcow2.base
+0   0x8 0   TEST_DIR/t.qcow2.base

 ### Image resizing with preallocation and backing files ###

diff --git a/tests/qemu-iotests/287.out b/tests/qemu-iotests/287.out
index 6b9dfb4af050..49ab6a27d5b3 100644
--- a/tests/qemu-iotests/287.out
+++ b/tests/qemu-iotests/287.out
@@ -10,22 +10,22 @@ incompatible_features []
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 incompatible_features [3]

-=== Testing zlib with incompatible bit set  ===
+=== Testing zlib with incompatible bit set ===

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 incompatible_features [3]

-=== Testing zstd with incompatible bit unset  ===
+=== Testing zstd with incompatible bit unset ===

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 incompatible_features []

-=== Testing compression type values  ===
+=== Testing compression type values ===

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
-   0
+0

[PULL 11/13] iotests/264: fix style

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Fix long line, extra import and one mypy complaint about incompatible
int and float.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20201118180433.11931-7-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/264 | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index 960f0449a4e4..e725cefd47b5 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -22,8 +22,7 @@
 import time

 import iotests
-from iotests import qemu_img_create, qemu_io_silent_check, file_path, \
-qemu_nbd_popen, log
+from iotests import qemu_img_create, file_path, qemu_nbd_popen, log

 iotests.script_initialize(
 supported_fmts=['qcow2'],
@@ -32,7 +31,7 @@ iotests.script_initialize(
 disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
 nbd_uri = 'nbd+unix:///?socket=' + nbd_sock
 size = 5 * 1024 * 1024
-wait_limit = 3
+wait_limit = 3.0
 wait_step = 0.2

 qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
@@ -49,11 +48,11 @@ with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, 
disk_b):
   'file': {'driver': 'nbd',
'server': {'type': 'unix', 'path': nbd_sock},
'reconnect-delay': 10}})
-vm.qmp_log('blockdev-backup', device='drive0', sync='full', 
target='backup0',
-   speed=(1 * 1024 * 1024))
+vm.qmp_log('blockdev-backup', device='drive0', sync='full',
+   target='backup0', speed=(1 * 1024 * 1024))

 # Wait for some progress
-t = 0
+t = 0.0
 while t < wait_limit:
 jobs = vm.qmp('query-block-jobs')['return']
 if jobs and jobs[0]['offset'] > 0:
-- 
2.30.0

[PULL 02/13] qemu-nbd: Fix a memleak in nbd_client_thread()

2021-01-20 Thread Eric Blake

From: Alex Chen 

When the qio_channel_socket_connect_sync() fails
we should goto 'out_socket' label to free the 'sioc' instead of
goto 'out' label.
In addition, there's a lot of redundant code in the successful branch
and the error branch, optimize it.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20201208134944.27962-1-alex.c...@huawei.com>
---
 qemu-nbd.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 47587a709e6b..0d513cb38c88 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -265,8 +265,8 @@ static void *nbd_client_thread(void *arg)
 char *device = arg;
 NBDExportInfo info = { .request_sizes = false, .name = g_strdup("") };
 QIOChannelSocket *sioc;
-int fd;
-int ret;
+int fd = -1;
+int ret = EXIT_FAILURE;
 pthread_t show_parts_thread;
 Error *local_error = NULL;

@@ -278,26 +278,24 @@ static void *nbd_client_thread(void *arg)
 goto out;
 }

-ret = nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
-NULL, NULL, NULL, , _error);
-if (ret < 0) {
+if (nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
+  NULL, NULL, NULL, , _error) < 0) {
 if (local_error) {
 error_report_err(local_error);
 }
-goto out_socket;
+goto out;
 }

 fd = open(device, O_RDWR);
 if (fd < 0) {
 /* Linux-only, we can use %m in printf.  */
 error_report("Failed to open %s: %m", device);
-goto out_socket;
+goto out;
 }

-ret = nbd_init(fd, sioc, , _error);
-if (ret < 0) {
+if (nbd_init(fd, sioc, , _error) < 0) {
 error_report_err(local_error);
-goto out_fd;
+goto out;
 }

 /* update partition table */
@@ -311,24 +309,20 @@ static void *nbd_client_thread(void *arg)
 dup2(STDOUT_FILENO, STDERR_FILENO);
 }

-ret = nbd_client(fd);
-if (ret) {
-goto out_fd;
+if (nbd_client(fd) < 0) {
+goto out;
 }
-close(fd);
-object_unref(OBJECT(sioc));
-g_free(info.name);
-kill(getpid(), SIGTERM);
-return (void *) EXIT_SUCCESS;

-out_fd:
-close(fd);
-out_socket:
+ret = EXIT_SUCCESS;
+
+ out:
+if (fd >= 0) {
+close(fd);
+}
 object_unref(OBJECT(sioc));
-out:
 g_free(info.name);
 kill(getpid(), SIGTERM);
-return (void *) EXIT_FAILURE;
+return (void *) (intptr_t) ret;
 }
 #endif /* HAVE_NBD_DEVICE */

-- 
2.30.0

[PULL 03/13] block: Honor blk_set_aio_context() context requirements

2021-01-20 Thread Eric Blake

From: Sergio Lopez 

The documentation for bdrv_set_aio_context_ignore() states this:

 * The caller must own the AioContext lock for the old AioContext of bs, but it
 * must not own the AioContext lock for new_context (unless new_context is the
 * same as the current context of bs).

As blk_set_aio_context() makes use of this function, this rule also
applies to it.

Fix all occurrences where this rule wasn't honored.

Suggested-by: Kevin Wolf 
Signed-off-by: Sergio Lopez 
Message-Id: <20201214170519.223781-2-...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Eric Blake 
---
 hw/block/dataplane/virtio-blk.c | 4 
 hw/block/dataplane/xen-block.c  | 7 ++-
 hw/scsi/virtio-scsi.c   | 6 --
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 37499c556482..e9050c8987e7 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -172,6 +172,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
 VirtIOBlockDataPlane *s = vblk->dataplane;
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vblk)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+AioContext *old_context;
 unsigned i;
 unsigned nvqs = s->conf->num_queues;
 Error *local_err = NULL;
@@ -214,7 +215,10 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
 vblk->dataplane_started = true;
 trace_virtio_blk_data_plane_start(s);

+old_context = blk_get_aio_context(s->conf->conf.blk);
+aio_context_acquire(old_context);
 r = blk_set_aio_context(s->conf->conf.blk, s->ctx, _err);
+aio_context_release(old_context);
 if (r < 0) {
 error_report_err(local_err);
 goto fail_guest_notifiers;
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 71c337c7b7e7..3675f8deaf9d 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -725,6 +725,7 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
 {
 ERRP_GUARD();
 XenDevice *xendev = dataplane->xendev;
+AioContext *old_context;
 unsigned int ring_size;
 unsigned int i;

@@ -808,10 +809,14 @@ void xen_block_dataplane_start(XenBlockDataPlane 
*dataplane,
 goto stop;
 }

-aio_context_acquire(dataplane->ctx);
+old_context = blk_get_aio_context(dataplane->blk);
+aio_context_acquire(old_context);
 /* If other users keep the BlockBackend in the iothread, that's ok */
 blk_set_aio_context(dataplane->blk, dataplane->ctx, NULL);
+aio_context_release(old_context);
+
 /* Only reason for failure is a NULL channel */
+aio_context_acquire(dataplane->ctx);
 xen_device_set_event_channel_context(xendev, dataplane->event_channel,
  dataplane->ctx, _abort);
 aio_context_release(dataplane->ctx);
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 9690bc63c860..99ff261cead1 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -849,15 +849,17 @@ static void virtio_scsi_hotplug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
 VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev);
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 SCSIDevice *sd = SCSI_DEVICE(dev);
+AioContext *old_context;
 int ret;

 if (s->ctx && !s->dataplane_fenced) {
 if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
 return;
 }
-virtio_scsi_acquire(s);
+old_context = blk_get_aio_context(sd->conf.blk);
+aio_context_acquire(old_context);
 ret = blk_set_aio_context(sd->conf.blk, s->ctx, errp);
-virtio_scsi_release(s);
+aio_context_release(old_context);
 if (ret < 0) {
 return;
 }
-- 
2.30.0

[PULL 04/13] nbd/server: Quiesce coroutines on context switch

2021-01-20 Thread Eric Blake

From: Sergio Lopez 

When switching between AIO contexts we need to me make sure that both
recv_coroutine and send_coroutine are not scheduled to run. Otherwise,
QEMU may crash while attaching the new context with an error like
this one:

aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule'

To achieve this we need a local implementation of
'qio_channel_readv_all_eof' named 'nbd_read_eof' (a trick already done
by 'nbd/client.c') that allows us to interrupt the operation and to
know when recv_coroutine is yielding.

With this in place, we delegate detaching the AIO context to the
owning context with a BH ('nbd_aio_detach_bh') scheduled using
'aio_wait_bh_oneshot'. This BH signals that we need to quiesce the
channel by setting 'client->quiescing' to 'true', and either waits for
the coroutine to finish using AIO_WAIT_WHILE or, if it's yielding in
'nbd_read_eof', actively enters the coroutine to interrupt it.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900326
Signed-off-by: Sergio Lopez 
Reviewed-by: Eric Blake 
Message-Id: <20201214170519.223781-4-...@redhat.com>
Signed-off-by: Eric Blake 
---
 nbd/server.c | 120 +--
 1 file changed, 106 insertions(+), 14 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 613ed2634ada..7229f487d296 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -132,6 +132,9 @@ struct NBDClient {
 CoMutex send_lock;
 Coroutine *send_coroutine;

+bool read_yielding;
+bool quiescing;
+
 QTAILQ_ENTRY(NBDClient) next;
 int nb_requests;
 bool closing;
@@ -1352,14 +1355,60 @@ static coroutine_fn int nbd_negotiate(NBDClient 
*client, Error **errp)
 return 0;
 }

-static int nbd_receive_request(QIOChannel *ioc, NBDRequest *request,
+/* nbd_read_eof
+ * Tries to read @size bytes from @ioc. This is a local implementation of
+ * qio_channel_readv_all_eof. We have it here because we need it to be
+ * interruptible and to know when the coroutine is yielding.
+ * Returns 1 on success
+ * 0 on eof, when no data was read (errp is not set)
+ * negative errno on failure (errp is set)
+ */
+static inline int coroutine_fn
+nbd_read_eof(NBDClient *client, void *buffer, size_t size, Error **errp)
+{
+bool partial = false;
+
+assert(size);
+while (size > 0) {
+struct iovec iov = { .iov_base = buffer, .iov_len = size };
+ssize_t len;
+
+len = qio_channel_readv(client->ioc, , 1, errp);
+if (len == QIO_CHANNEL_ERR_BLOCK) {
+client->read_yielding = true;
+qio_channel_yield(client->ioc, G_IO_IN);
+client->read_yielding = false;
+if (client->quiescing) {
+return -EAGAIN;
+}
+continue;
+} else if (len < 0) {
+return -EIO;
+} else if (len == 0) {
+if (partial) {
+error_setg(errp,
+   "Unexpected end-of-file before all bytes were 
read");
+return -EIO;
+} else {
+return 0;
+}
+}
+
+partial = true;
+size -= len;
+buffer = (uint8_t *) buffer + len;
+}
+return 1;
+}
+
+static int nbd_receive_request(NBDClient *client, NBDRequest *request,
Error **errp)
 {
 uint8_t buf[NBD_REQUEST_SIZE];
 uint32_t magic;
 int ret;

-ret = nbd_read(ioc, buf, sizeof(buf), "request", errp);
+ret = nbd_read_eof(client, buf, sizeof(buf), errp);
 if (ret < 0) {
 return ret;
 }
@@ -1480,11 +1529,37 @@ static void blk_aio_attached(AioContext *ctx, void 
*opaque)

 QTAILQ_FOREACH(client, >clients, next) {
 qio_channel_attach_aio_context(client->ioc, ctx);
+
+assert(client->recv_coroutine == NULL);
+assert(client->send_coroutine == NULL);
+
+if (client->quiescing) {
+client->quiescing = false;
+nbd_client_receive_next_request(client);
+}
+}
+}
+
+static void nbd_aio_detach_bh(void *opaque)
+{
+NBDExport *exp = opaque;
+NBDClient *client;
+
+QTAILQ_FOREACH(client, >clients, next) {
+qio_channel_detach_aio_context(client->ioc);
+client->quiescing = true;
+
 if (client->recv_coroutine) {
-aio_co_schedule(ctx, client->recv_coroutine);
+if (client->read_yielding) {
+qemu_aio_coroutine_enter(exp->common.ctx,
+ client->recv_coroutine);
+} else {
+AIO_WAIT_WHILE(exp->common.ctx, client->recv_coroutine != 
NULL);
+}
 }
+
 if (client->send_coroutine) {
-aio_co_schedule(ctx, client->send_coroutine);
+AIO_WAIT_WHILE(exp->common.ctx, client->send_coroutine != NULL);
 }
 }
 }
@@ -1492,13 +1567,10 @@ static void blk_aio_attached(AioContext *ctx, void 
*opaque)
 static void blk_aio_detach(void

[PULL 06/13] iotests/303: use dot slash for qcow2.py running

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

If you run './check 303', check includes common.config which adjusts
$PATH to include '.' first, and therefore finds qcow2.py on PATH.  But
if you run './303' directly, there is nothing to adjust PATH, and if
'.' is not already on your PATH by other means, the test fails because
the executable is not found.  Adjust how we invoke the helper
executable to avoid needing a PATH search in the first place.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-3-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/303 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/303 b/tests/qemu-iotests/303
index 6c2177448348..11cd9eeb266e 100755
--- a/tests/qemu-iotests/303
+++ b/tests/qemu-iotests/303
@@ -56,7 +56,7 @@ qemu_img_create('-f', iotests.imgfmt, disk, '10M')

 add_bitmap(1, 0, 6, False)
 add_bitmap(2, 6, 8, True)
-dump = ['qcow2.py', disk, 'dump-header']
+dump = ['./qcow2.py', disk, 'dump-header']
 subprocess.run(dump)
 # Dump the metadata in JSON format
 dump.append('-j')
-- 
2.30.0

[PULL 01/13] qemu-nbd: Fix a memleak in qemu_nbd_client_list()

2021-01-20 Thread Eric Blake

From: Alex Chen 

When the qio_channel_socket_connect_sync() fails
we should goto 'out' label to free the 'sioc' instead of return.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Message-Id: <20201130123651.17543-1-alex.c...@huawei.com>
Reviewed-by: Alberto Garcia 
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index a7075c5419d7..47587a709e6b 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -181,7 +181,7 @@ static int qemu_nbd_client_list(SocketAddress *saddr, 
QCryptoTLSCreds *tls,
 sioc = qio_channel_socket_new();
 if (qio_channel_socket_connect_sync(sioc, saddr, ) < 0) {
 error_report_err(err);
-return EXIT_FAILURE;
+goto out;
 }
 rc = nbd_receive_export_list(QIO_CHANNEL(sioc), tls, hostname, ,
  );
-- 
2.30.0

[PULL 05/13] iotests/277: use dot slash for nbd-fault-injector.py running

2021-01-20 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

If you run './check 277', check includes common.config which adjusts
$PATH to include '.' first, and therefore finds nbd-fault-injector.py
on PATH.  But if you run './277' directly, there is nothing to adjust
PATH, and if '.' is not already on your PATH by other means, the test
fails because the executable is not found.  Adjust how we invoke the
helper executable to avoid needing a PATH search in the first place.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20210116134424.82867-2-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/277 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/277 b/tests/qemu-iotests/277
index d34f87021f8e..a39ce2d8738c 100755
--- a/tests/qemu-iotests/277
+++ b/tests/qemu-iotests/277
@@ -42,7 +42,7 @@ def make_conf_file(event):
 def start_server_NBD(event):
 make_conf_file(event)

-srv = subprocess.Popen(['nbd-fault-injector.py', '--classic-negotiation',
+srv = subprocess.Popen(['./nbd-fault-injector.py', '--classic-negotiation',
nbd_sock, conf_file], stdout=subprocess.PIPE,
stderr=subprocess.STDOUT, universal_newlines=True)
 line = srv.stdout.readline()
-- 
2.30.0

[PULL 00/13] NBD patches through 2021-01-20

2021-01-20 Thread Eric Blake

The following changes since commit 48202c712412c803ddb56365c7bca322aa4e7506:

  Merge remote-tracking branch 
'remotes/pmaydell/tags/pull-target-arm-20210119-1' into staging (2021-01-19 
15:47:23 +)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2021-01-20

for you to fetch changes up to f874e7fa3b6583c79a74aea9e781af920ddd8091:

  iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() (2021-01-20 20:24:51 
-0600)


nbd patches for 2021-01-20

- minor resource leak fixes in qemu-nbd
- ensure proper aio context when nbd server uses iothreads
- iotest refactorings in preparation for rewriting ./check to be more
flexible, and preparing for more nbd server reconnect features


Alex Chen (2):
  qemu-nbd: Fix a memleak in qemu_nbd_client_list()
  qemu-nbd: Fix a memleak in nbd_client_thread()

Sergio Lopez (2):
  block: Honor blk_set_aio_context() context requirements
  nbd/server: Quiesce coroutines on context switch

Vladimir Sementsov-Ogievskiy (9):
  iotests/277: use dot slash for nbd-fault-injector.py running
  iotests/303: use dot slash for qcow2.py running
  iotests: fix some whitespaces in test output files
  iotests: make tests executable
  iotests/294: add shebang line
  iotests: define group in each iotest
  iotests/264: fix style
  iotests.py: fix qemu_tool_pipe_and_status()
  iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()

 hw/block/dataplane/virtio-blk.c |   4 ++
 hw/block/dataplane/xen-block.c  |   7 ++-
 hw/scsi/virtio-scsi.c   |   6 +-
 nbd/server.c| 120 +++-
 qemu-nbd.c  |  42 ++
 tests/qemu-iotests/001  |   1 +
 tests/qemu-iotests/002  |   1 +
 tests/qemu-iotests/003  |   1 +
 tests/qemu-iotests/004  |   1 +
 tests/qemu-iotests/005  |   1 +
 tests/qemu-iotests/007  |   1 +
 tests/qemu-iotests/008  |   1 +
 tests/qemu-iotests/009  |   1 +
 tests/qemu-iotests/010  |   1 +
 tests/qemu-iotests/011  |   1 +
 tests/qemu-iotests/012  |   1 +
 tests/qemu-iotests/013  |   1 +
 tests/qemu-iotests/014  |   1 +
 tests/qemu-iotests/015  |   1 +
 tests/qemu-iotests/017  |   1 +
 tests/qemu-iotests/018  |   1 +
 tests/qemu-iotests/019  |   1 +
 tests/qemu-iotests/020  |   1 +
 tests/qemu-iotests/021  |   1 +
 tests/qemu-iotests/022  |   1 +
 tests/qemu-iotests/023  |   1 +
 tests/qemu-iotests/024  |   1 +
 tests/qemu-iotests/025  |   1 +
 tests/qemu-iotests/026  |   1 +
 tests/qemu-iotests/027  |   1 +
 tests/qemu-iotests/028  |   1 +
 tests/qemu-iotests/029  |   1 +
 tests/qemu-iotests/030  |   1 +
 tests/qemu-iotests/031  |   1 +
 tests/qemu-iotests/032  |   1 +
 tests/qemu-iotests/033  |   1 +
 tests/qemu-iotests/034  |   1 +
 tests/qemu-iotests/035  |   1 +
 tests/qemu-iotests/036  |   1 +
 tests/qemu-iotests/037  |   1 +
 tests/qemu-iotests/038  |   1 +
 tests/qemu-iotests/039  |   1 +
 tests/qemu-iotests/040  |   1 +
 tests/qemu-iotests/041  |   1 +
 tests/qemu-iotests/042  |   1 +
 tests/qemu-iotests/043  |   1 +
 tests/qemu-iotests/044  |   1 +
 tests/qemu-iotests/045  |   1 +
 tests/qemu-iotests/046  |   1 +
 tests/qemu-iotests/047  |   1 +
 tests/qemu-iotests/048  |   1 +
 tests/qemu-iotests/049  |   1 +
 tests/qemu-iotests/050  |   1 +
 tests/qemu-iotests/051  |   1 +
 tests/qemu-iotests/052  |   1 +
 tests/qemu-iotests/053  |   1 +
 tests/qemu-iotests/054  |   1 +
 tests/qemu-iotests/055  |   1 +
 tests/qemu-iotests/056  |   1 +
 tests/qemu-iotests/057  |   1 +
 tests/qemu-iotests/058  |   1 +
 tests/qemu-iotests/059  |   1 +
 tests/qemu-iotests/060  |   1 +
 tests/qemu-iotests/061  |   1 +
 tests/qemu-iotests/062  |   1 +
 tests/qemu-iotests/063  |   1 +
 tests/qemu-iotests/064  |   1 +
 tests/qemu-iotests/065  |   1 +
 tests/qemu-iotests/066  |   1 +
 tests/qemu-iotests/068  |   1 +
 tests/qemu-iotests/069  |   1 +
 tests/qemu-iotests/070  |   1 +
 tests/qemu-iotests/071  |   1 +
 tests/qemu-iotests/072  |   1 +
 tests/qemu-iotests/073  |   1 +
 tests/qemu-iotests/074  |   1 +
 tests/qemu-iotests/075  |   1 +
 tests/qemu-iotests/076  |   1 +
 tests/qemu-iotests/077  |   1 +
 tests/qemu-iotests/078  |   1 +
 tests/qemu-iotests/079  |   1 +
 tests/qemu-iotests/080

Re: [PATCH 11/11] iotests/264: add backup-cancel test-case

2021-01-20 Thread Eric Blake

On 1/20/21 7:28 PM, Eric Blake wrote:
> On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Check that cancel doesn't wait for 10s of nbd reconnect timeout.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>  tests/qemu-iotests/264 | 21 ++---
>>  1 file changed, 14 insertions(+), 7 deletions(-)
>>
> 
> Reviewed-by: Eric Blake 
> 

With this patch applied as-is, the test fails for me:

--- /home/eblake/qemu/tests/qemu-iotests/264.out2021-01-20
20:18:54.741366521 -0600
+++ /home/eblake/qemu/build/tests/qemu-iotests/264.out.bad  2021-01-20
20:20:37.242451172 -0600
@@ -1,5 +1,5 @@
-..
+...
 --
-Ran 2 tests
+Ran 3 tests

but with no obvious visibility into why.  Did you forget to check in
changes to 264.out?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 for-6.0 0/8] nbd reconnect on open

2021-01-20 Thread Eric Blake

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all! There is a new feature: reconnect on open. It is useful when
> start of vm and start of nbd server are not simple to sync.
> 
> v2: rebase on master.
> Also, I've discovered that I've sent downstream version of test which
> doesn't work here. So, the test is rewritten (with appropriate
> improvements of iotests.py)

Because of my questions on 2, I haven't queued the series yet; but 3 and
4 were independent enough to take now.

> 
> Vladimir Sementsov-Ogievskiy (8):
>   block/nbd: move initial connect to coroutine
>   nbd: allow reconnect on open, with corresponding new options
>   iotests.py: fix qemu_tool_pipe_and_status()
>   iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
>   iotests.py: add qemu_tool_popen()
>   iotests.py: add and use qemu_io_wrap_args()
>   iotests.py: add qemu_io_popen()
>   iotests: add 306 to test reconnect on nbd open
> 
>  block/nbd.c   | 173 +-
>  tests/qemu-iotests/306|  71 ++
>  tests/qemu-iotests/306.out|  11 +++
>  tests/qemu-iotests/group  |   1 +
>  tests/qemu-iotests/iotests.py |  56 +++
>  5 files changed, 246 insertions(+), 66 deletions(-)
>  create mode 100755 tests/qemu-iotests/306
>  create mode 100644 tests/qemu-iotests/306.out
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 00/11] mirror: cancel nbd reconnect

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> The problem
> 
> Assume we have mirror job with nbd target node with enabled reconnect.
> Connection failed. So, all current requests to nbd node are waiting for
> nbd driver to reconnect. And they will wait for reconnect-delay time
> specified in nbd blockdev options. This timeout may be long enough, for
> example, we in Virtuozzo use 300 seconds by default.
> 
> So, if at this moment user tries to cancel the job, job will wait for
> its in-flight requests to finish up to 300 seconds. From the user point
> of view, cancelling the job takes a long time. Bad.
> 
> Solution
> 
> Let's just cancel "waiting for reconnect in in-flight request coroutines"
> on mirror (and backup) cancel. Welcome the series below.

Because of my question on 4/11, I did not queue the entire series yet.
But 6/11 was trivial enough to queue now.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()

2021-01-20 Thread Eric Blake

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Just drop code duplication.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/iotests.py | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 5ebe25e063..6c9bcf9042 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -188,14 +188,7 @@ def img_info_log(filename, filter_path=None, 
> imgopts=False, extra_args=()):
>  def qemu_io(*args):
>  '''Run qemu-io and return the stdout data'''
>  args = qemu_io_args + list(args)
> -subp = subprocess.Popen(args, stdout=subprocess.PIPE,
> -stderr=subprocess.STDOUT,
> -universal_newlines=True)
> -output = subp.communicate()[0]
> -if subp.returncode < 0:
> -sys.stderr.write('qemu-io received signal %i: %s\n'
> - % (-subp.returncode, ' '.join(args)))
> -return output
> +return qemu_tool_pipe_and_status('qemu-io', args)[0]
>  
>  def qemu_io_log(*args):
>  result = qemu_io(*args)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status()

2021-01-20 Thread Eric Blake

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> qemu_img_args variable is unrelated here. We should print just args.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/iotests.py | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index bcd4fe5b6f..5ebe25e063 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -101,9 +101,8 @@ def qemu_tool_pipe_and_status(tool: str, args: 
> Sequence[str],
>  universal_newlines=True)
>  output = subp.communicate()[0]
>  if subp.returncode < 0:
> -sys.stderr.write('%s received signal %i: %s\n'
> - % (tool, -subp.returncode,
> -' '.join(qemu_img_args + list(args
> +cmd = ' '.join(args)
> +sys.stderr.write(f'{tool} received signal {-subp.returncode}: 
> {cmd}\n')
>  return (output, subp.returncode)
>  
>  def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

RE: [PATCH 1/3] qapi/net: Add new QMP command for COLO passthrough

2021-01-20 Thread Zhang, Chen



> -Original Message-
> From: Eric Blake 
> Sent: Wednesday, January 20, 2021 3:33 AM
> To: Zhang, Chen ; Jason Wang
> ; qemu-dev ; Dr. David
> Alan Gilbert ; Markus Armbruster
> 
> Cc: Zhang Chen 
> Subject: Re: [PATCH 1/3] qapi/net: Add new QMP command for COLO
> passthrough
> 
> On 12/23/20 7:09 PM, Zhang Chen wrote:
> > From: Zhang Chen 
> >
> > Since the real user scenario does not need to monitor all traffic.
> > Add colo-passthrough-add and colo-passthrough-del to maintain a COLO
> > network passthrough list.
> >
> > Signed-off-by: Zhang Chen 
> 
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -714,3 +714,49 @@
> >  ##
> >  { 'event': 'FAILOVER_NEGOTIATED',
> >'data': {'device-id': 'str'} }
> > +
> > +##
> > +# @colo-passthrough-add:
> > +#
> > +# Add passthrough entry according to customer's needs in COLO-compare.
> > +#
> > +# @protocol: COLO passthrough just support TCP and UDP.
> > +#
> > +# @port: TCP or UDP port number.
> > +#
> > +# Returns: Nothing on success
> > +#
> > +# Since: 5.3
> 
> The next release is 6.0, not 5.3.

Missed the plan, I will fix it in next version.

> 
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "colo-passthrough-add",
> > +#  "arguments": { "protocol": "tcp", "port": 3389 } }
> > +# <- { "return": {} }
> > +#
> > +##
> > +{ 'command': 'colo-passthrough-add',
> > + 'data': {'protocol': 'str', 'port': 'uint32'} }
> 
> Should 'protocol' be an enum (finite set of values) rather than an open-
> coded string (infinite number of values, even though you mentioned in the
> docs above that only 'tcp' or 'udp' make sense)?  In fact, do we already have
> existing QAPI types representing tcp/udp and a port number that could be
> reused here, rather than open-coding yet another one?
> 

I checked current QAPI code, looks no existing enum defined.
I will add the new general QAPI types in next version.

Thanks
Chen

> > +
> > +##
> > +# @colo-passthrough-del:
> > +#
> > +# Delete passthrough entry according to customer's needs in COLO-
> compare.
> > +#
> > +# @protocol: COLO passthrough just support TCP and UDP.
> > +#
> > +# @port: TCP or UDP port number.
> > +#
> > +# Returns: Nothing on success
> > +#
> > +# Since: 5.3
> 
> another 6.0 spot
> 

OK.

> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "colo-passthrough-del",
> > +#  "arguments": { "protocol": "tcp", "port": 3389 } }
> > +# <- { "return": {} }
> > +#
> > +##
> > +{ 'command': 'colo-passthrough-del',
> > + 'data': {'protocol': 'str', 'port': 'uint32'} }
> >
> 
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org

RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-01-20 Thread Rao, Lei

The Primary VM can be shut down when it is in COLO state, which may trigger 
this bug.
About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the 
problems you said. For 'shutdown'->'colo', The fixed 
patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
Recently, I found another bug as follows in the test.
qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
Aborted (core dumped)
The gdb bt as following:
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7faa3d613859 in __GI_abort () at abort.c:79
#2  0x55c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at 
vl.c:723
#3  0x55c5a1f8cae4 in vm_prepare_start () at 
/home/workspace/colo-qemu/cpus.c:2206
#4  0x55c5a1f8cb1b in vm_start () at 
/home/workspace/colo-qemu/cpus.c:2213
#5  0x55c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at 
migration/migration.c:3376
#6  0x55c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at 
migration/migration.c:3527
#7  0x55c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at 
util/qemu-thread-posix.c:519
#8  0x7faa3d7e9609 in start_thread (arg=) at 
pthread_create.c:477
#9  0x7faa3d710293 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

For the bug, I made the following changes:
@@ -3379,7 +3379,9 @@ static void 
migration_iteration_finish(MigrationState *s)
 case MIGRATION_STATUS_CANCELLED:
 case MIGRATION_STATUS_CANCELLING:
 if (s->vm_was_running) {
-vm_start();
+if (!runstate_check(RUN_STATE_SHUTDOWN)) {
+vm_start();
+}
 } else {
 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
 runstate_set(RUN_STATE_POSTMIGRATE);

I will send the patch to community after more test.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Thursday, January 21, 2021 3:13 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during 
checkpoint

On Wed, 13 Jan 2021 10:46:27 +0800
leirao  wrote:

> From: "Rao, Lei" 
> 
> This patch fixes the following:
> qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> Aborted (core dumped)
> 
> Signed-off-by: Lei Rao 

I wonder how that is possible, since the VM is stopped during 'colo' state.

Unrelated to this patch, I think this area needs some work since the following 
unintended runstate transition is possible:
'shutdown' -> 'colo' -> 'running'.

> ---
>  softmmu/runstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> 636aab0..455ad0d 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -125,6 +125,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>  { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>  
>  { RUN_STATE_COLO, RUN_STATE_RUNNING },
> +{ RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
>  
>  { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>  { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },

--

Re: [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options

2021-01-20 Thread Eric Blake

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Note: currently, using new option with long timeout in qmp command
> blockdev-add is not good idea, as qmp interface is blocking, so,
> don't add it now, let's add it later after
> "monitor: Optionally run handlers in coroutines" series merged.

If I'm not mistaken, that landed as of eb94b81a94.  Is it just the
commit message that needs an update, or does this patch need a respin?

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 115 +---
>  1 file changed, 92 insertions(+), 23 deletions(-)
> 

> @@ -474,6 +484,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error 
> **errp)
>  s->wait_connect = true;
>  qemu_coroutine_yield();
>  
> +if (!s->connect_thread) {
> +error_setg(errp, "Connection attempt cancelled by other operation");
> +return NULL;
> +}

Does this need to use atomics for proper access to s->connect_thread
across threads?  Or are all the operations done by other coroutines but
within the same thread, so we are safe?


> @@ -624,10 +645,15 @@ static coroutine_fn void 
> nbd_reconnect_attempt(BDRVNBDState *s)
>  bdrv_inc_in_flight(s->bs);
>  
>  out:
> -s->connect_status = ret;
> -error_free(s->connect_err);
> -s->connect_err = NULL;
> -error_propagate(>connect_err, local_err);
> +if (s->connect_status == -ETIMEDOUT) {
> +/* Don't rewrite timeout error by following cancel-provoked error */

Maybe:

/* Don't propagate a timeout error caused by a job cancellation. */


> +static void open_timer_cb(void *opaque)
> +{
> +BDRVNBDState *s = opaque;
> +
> +if (!s->connect_status) {
> +/* First attempt was not finished. We should set an error */
> +s->connect_status = -ETIMEDOUT;
> +error_setg(>connect_err, "First connection attempt is cancelled 
> by "
> +   "timeout");
> +}
> +
> +nbd_teardown_connection_async(s->bs);
> +open_timer_del(s);
> +}
> +
> +static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
> +{
> +assert(!s->open_timer && s->state == NBD_CLIENT_OPENING);
> +s->open_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
> +  QEMU_CLOCK_REALTIME,
> +  SCALE_NS,
> +  open_timer_cb, s);
> +timer_mod(s->open_timer, expire_time_ns);
> +}
> +


> @@ -2180,6 +2235,14 @@ static QemuOptsList nbd_runtime_opts = {
>  "future requests before a successful reconnect will "
>  "immediately fail. Default 0",
>  },
> +{
> +.name = "open-timeout",
> +.type = QEMU_OPT_NUMBER,
> +.help = "In seconds. If zero, nbd driver tries to establish "
> +"connection only once, on fail open fails. If non-zero, "

If zero, the nbd driver tries the connection only once, and fails to
open if the connection fails.

> +"nbd driver may do several attempts until success or "
> +"@open-timeout seconds passed. Default 0",

If non-zero, the nbd driver will repeat connection attempts until
successful or until @open-timeout seconds have elapsed.

> +},

Where is the QMP counterpart for setting this option?

>  { /* end of list */ }
>  },
>  };
> @@ -2235,6 +2298,7 @@ static int nbd_process_options(BlockDriverState *bs, 
> QDict *options,
>  }
>  
>  s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
> +s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
>  
>  ret = 0;
>  
> @@ -2268,6 +2332,11 @@ static int nbd_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  bdrv_inc_in_flight(bs);
>  aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
>  
> +if (s->open_timeout) {
> +open_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
> +s->open_timeout * NANOSECONDS_PER_SECOND);
> +}
> +
>  if (qemu_in_coroutine()) {
>  s->open_co = qemu_coroutine_self();
>  qemu_coroutine_yield();
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 09/11] iotests/264: add mirror-cancel test-case

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Check that cancel doesn't wait for 10s of nbd reconnect timeout.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/264 | 38 ++
>  tests/qemu-iotests/264.out |  4 ++--
>  2 files changed, 32 insertions(+), 10 deletions(-)

>  
> +def test_mirror_cancel(self):
> +# Mirror speed limit doesn't work well enough, it seems that mirror
> +# will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
> +# MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
> +self.init_vm(20 * 1024 * 1024)
> +self.start_job('blockdev-mirror')

Is this comment still accurate given recent work on the mirror filter?
I'm fine taking the patch as-is and tweaking it with followups, though,
in order to make progress.

> +
> +result = self.vm.qmp('block-job-cancel', device='drive0')
> +self.assert_qmp(result, 'return', {})
> +
> +start_t = time.time()
> +self.vm.event_wait('BLOCK_JOB_CANCELLED')
> +delta_t = time.time() - start_t
> +self.assertTrue(delta_t < 2.0)

I hope this doesn't fail on CI platforms under heavy load.  It didn't
fail for me locally, but I hope we don't have to revisit it.  Is there
any way we can test this in a manner that is not as fragile?

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 11/11] iotests/264: add backup-cancel test-case

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Check that cancel doesn't wait for 10s of nbd reconnect timeout.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/264 | 21 ++---
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 10/11] block/backup: implement .cancel job handler

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Cancel in-flight io on target to not waste the time.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/backup.c | 10 ++
>  1 file changed, 10 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v7 02/13] confidential guest support: Introduce new confidential guest support class

2021-01-20 Thread David Gibson

On Mon, Jan 18, 2021 at 06:51:24PM +, Dr. David Alan Gilbert wrote:
> * David Gibson (da...@gibson.dropbear.id.au) wrote:
> > Several architectures have mechanisms which are designed to protect guest
> > memory from interference or eavesdropping by a compromised hypervisor.  AMD
> > SEV does this with in-chip memory encryption and Intel's MKTME can do
>^
> (and below) My understanding is that it's Intel TDX that's the VM
> equivalent.

I thought MKTME could already do memory encryption and TDX extended
that to... more?  I'll adjust the comment to say TDX anyway, since
that seems to be the newer name.

> 
> Dave
> 
> > similar things.  POWER's Protected Execution Framework (PEF) accomplishes a
> > similar goal using an ultravisor and new memory protection features,
> > instead of encryption.
> > 
> > To (partially) unify handling for these, this introduces a new
> > ConfidentialGuestSupport QOM base class.  "Confidential" is kind of vague,
> > but "confidential computing" seems to be the buzzword about these schemes,
> > and "secure" or "protected" are often used in connection to unrelated
> > things (such as hypervisor-from-guest or guest-from-guest security).
> > 
> > The "support" in the name is significant because in at least some of the
> > cases it requires the guest to take specific actions in order to protect
> > itself from hypervisor eavesdropping.
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  backends/confidential-guest-support.c | 33 
> >  backends/meson.build  |  1 +
> >  include/exec/confidential-guest-support.h | 38 +++
> >  include/qemu/typedefs.h   |  1 +
> >  target/i386/sev.c |  3 +-
> >  5 files changed, 75 insertions(+), 1 deletion(-)
> >  create mode 100644 backends/confidential-guest-support.c
> >  create mode 100644 include/exec/confidential-guest-support.h
> > 
> > diff --git a/backends/confidential-guest-support.c 
> > b/backends/confidential-guest-support.c
> > new file mode 100644
> > index 00..9b0ded0db4
> > --- /dev/null
> > +++ b/backends/confidential-guest-support.c
> > @@ -0,0 +1,33 @@
> > +/*
> > + * QEMU Confidential Guest support
> > + *
> > + * Copyright: David Gibson, Red Hat Inc. 2020
> > + *
> > + * Authors:
> > + *  David Gibson 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > + * later.  See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +
> > +#include "exec/confidential-guest-support.h"
> > +
> > +OBJECT_DEFINE_ABSTRACT_TYPE(ConfidentialGuestSupport,
> > +confidential_guest_support,
> > +CONFIDENTIAL_GUEST_SUPPORT,
> > +OBJECT)
> > +
> > +static void confidential_guest_support_class_init(ObjectClass *oc, void 
> > *data)
> > +{
> > +}
> > +
> > +static void confidential_guest_support_init(Object *obj)
> > +{
> > +}
> > +
> > +static void confidential_guest_support_finalize(Object *obj)
> > +{
> > +}
> > diff --git a/backends/meson.build b/backends/meson.build
> > index 484456ece7..d4221831fc 100644
> > --- a/backends/meson.build
> > +++ b/backends/meson.build
> > @@ -6,6 +6,7 @@ softmmu_ss.add([files(
> >'rng-builtin.c',
> >'rng-egd.c',
> >'rng.c',
> > +  'confidential-guest-support.c',
> >  ), numa])
> >  
> >  softmmu_ss.add(when: 'CONFIG_POSIX', if_true: files('rng-random.c'))
> > diff --git a/include/exec/confidential-guest-support.h 
> > b/include/exec/confidential-guest-support.h
> > new file mode 100644
> > index 00..5f131023ba
> > --- /dev/null
> > +++ b/include/exec/confidential-guest-support.h
> > @@ -0,0 +1,38 @@
> > +/*
> > + * QEMU Confidential Guest support
> > + *   This interface describes the common pieces between various
> > + *   schemes for protecting guest memory or other state against a
> > + *   compromised hypervisor.  This includes memory encryption (AMD's
> > + *   SEV and Intel's MKTME) or special protection modes (PEF on POWER,
> > + *   or PV on s390x).
> > + *
> > + * Copyright: David Gibson, Red Hat Inc. 2020
> > + *
> > + * Authors:
> > + *  David Gibson 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > + * later.  See the COPYING file in the top-level directory.
> > + *
> > + */
> > +#ifndef QEMU_CONFIDENTIAL_GUEST_SUPPORT_H
> > +#define QEMU_CONFIDENTIAL_GUEST_SUPPORT_H
> > +
> > +#ifndef CONFIG_USER_ONLY
> > +
> > +#include "qom/object.h"
> > +
> > +#define TYPE_CONFIDENTIAL_GUEST_SUPPORT "confidential-guest-support"
> > +OBJECT_DECLARE_SIMPLE_TYPE(ConfidentialGuestSupport, 
> > CONFIDENTIAL_GUEST_SUPPORT)
> > +
> > +struct ConfidentialGuestSupport {
> > +Object parent;
> > +};
> > +
> > +typedef struct ConfidentialGuestSupportClass {
> > +ObjectClass parent;
> > +} ConfidentialGuestSupportClass;
> > +
> >

Re: [PATCH 08/11] iotests.py: qemu_nbd_popen: remove pid file after use

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> To note interfere with other qemu_nbd_popen() calls in same test.

not

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/iotests.py | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 07/11] iotests/264: move to python unittest

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> We are going to add more test cases, so use the library supporting test
> cases.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/264 | 93 ++
>  tests/qemu-iotests/264.out | 20 ++--
>  2 files changed, 58 insertions(+), 55 deletions(-)
> 

> +++ b/tests/qemu-iotests/264.out
> @@ -1,15 +1,5 @@
> -Start NBD server
> -{"execute": "blockdev-add", "arguments": {"driver": "raw", "file": 
> {"driver": "nbd", "reconnect-delay": 10, "server": {"path": 
> "TEST_DIR/PID-nbd-sock", "type": "unix"}}, "node-name": "backup0"}}
> -{"return": {}}
> -{"execute": "blockdev-backup", "arguments": {"device": "drive0", "speed": 
> 1048576, "sync": "full", "target": "backup0"}}
> -{"return": {}}
> -Backup job is started
> -Kill NBD server
> -Backup job is still in progress
> -{"execute": "block-job-set-speed", "arguments": {"device": "drive0", 
> "speed": 0}}
> -{"return": {}}
> -Start NBD server
> -Backup completed: 5242880
> -{"execute": "blockdev-del", "arguments": {"node-name": "backup0"}}
> -{"return": {}}
> -Kill NBD server
> +.
> +--
> +Ran 1 tests
> +
> +OK

I find it a shame that the expected output no longer shows what was
executed.  But the test still passes, and if it makes it easier for you
to extend the test in a later patch, I won't stand in the way (this is
more an indication that by stripping the useful output, I'm no longer in
as decent a position to help debug if the test starts failing).

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 06/11] iotests/264: fix style

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Fix long line, extra import and one mypy complaint about incompatible
> int and float.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/264 | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 05/11] block/mirror: implement .cancel job handler

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Cancel in-flight io on target to not waste the time.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/mirror.c | 9 +
>  1 file changed, 9 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

RE: [PATCH 03/10] Optimize the function of filter_send

2021-01-20 Thread Rao, Lei

OK, you are right,  I will change it in V2.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Thursday, January 21, 2021 3:21 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/10] Optimize the function of filter_send

On Wed, 13 Jan 2021 10:46:28 +0800
leirao  wrote:

> From: "Rao, Lei" 
> 
> The iov_size has been calculated in filter_send(). we can directly 
> return the size.In this way, this is no need to repeat calculations in 
> filter_redirector_receive_iov();
> 
> Signed-off-by: Lei Rao 
> ---
>  net/filter-mirror.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/filter-mirror.c b/net/filter-mirror.c index 
> f8e6500..7fa2eb3 100644
> --- a/net/filter-mirror.c
> +++ b/net/filter-mirror.c
> @@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
>  goto err;
>  }
>  
> -return 0;
> +return size;
>  
>  err:
>  return ret < 0 ? ret : -EIO;
> @@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState 
> *nf,
>  int ret;
>  
>  ret = filter_send(s, iov, iovcnt);
> -if (ret) {
> +if (ret <= 0) {
>  error_report("filter mirror send failed(%s)", strerror(-ret));
>  }

0 is a valid return value if the data to send has size = 0.

> @@ -182,10 +182,10 @@ static ssize_t 
> filter_redirector_receive_iov(NetFilterState *nf,
>  
>  if (qemu_chr_fe_backend_connected(>chr_out)) {
>  ret = filter_send(s, iov, iovcnt);
> -if (ret) {
> +if (ret <= 0) {
>  error_report("filter redirector send failed(%s)", 
> strerror(-ret));
>  }

dito

> -return iov_size(iov, iovcnt);
> +return ret;
>  } else {
>  return 0;
>  }



--

Re: [PATCH v1 2/2] spapr.c: add 'name' property for hotplugged CPUs nodes

2021-01-20 Thread David Gibson

On Wed, Jan 20, 2021 at 08:23:05PM -0300, Daniel Henrique Barboza wrote:
> In the CPU hotunplug bug [1] the guest kernel throws a scary
> message in dmesg:
> 
> pseries-hotplug-cpu: Failed to offline CPU , rc: -16
> 
> The reason isn't related to the bug though. This happens because the
> kernel file arch/powerpc/platform/pseries/hotplug-cpu.c, function
> dlpar_cpu_remove(), is not finding the device_node.name of the offending
> CPU.
> 
> We're not populating the 'name' property for hotplugged CPUs. Since the
> kernel relies on device_node.name for identifying CPU nodes, and the
> CPUs that are coldplugged has the 'name' property filled by SLOF, this
> is creating an unneeded inconsistency between hotplug and coldplug CPUs
> in the kernel.
> 
> Let's fill the 'name' property for hotplugged CPUs as well. This will
> make the guest dmesg throws a less intimidating message when we try to
> unplug the last online CPU:
> 
> pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16
> 
> [1] https://bugzilla.redhat.com/1911414
> 
> Signed-off-by: Daniel Henrique Barboza 

Nice catch.  Because the PAPR code has an odd mix of flattened-tree
pieces (where 'name' is implicit) and real OF pieces (where we
definitely need it), getting this right is kind of fiddly.  Since this
bit of flat tree gets encoded into PAPR's CAS update format, which
does require the 'name' property, this is correct.

You could argue that it's more technically correct for the flattening
code to add the name property from the FDT node name, but this is
simpler and gets the job done.

Applied, thanks.

> ---
>  hw/ppc/spapr.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cc1b709615..6ab27ea269 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3750,6 +3750,19 @@ int spapr_core_dt_populate(SpaprDrc *drc, 
> SpaprMachineState *spapr,
>  
>  spapr_dt_cpu(cs, fdt, offset, spapr);
>  
> +/*
> + * spapr_dt_cpu() does not fill the 'name' property in the
> + * CPU node. The function is called during boot process, before
> + * and after CAS, and overwriting the 'name' property written
> + * by SLOF is not allowed.
> + *
> + * Write it manually after spapr_dt_cpu(). This makes the hotplug
> + * CPUs more compatible with the coldplugged ones, which have
> + * the 'name' property. Linux Kernel also relies on this
> + * property to identify CPU nodes.
> + */
> +_FDT((fdt_setprop_string(fdt, offset, "name", nodename)));
> +
>  *fdt_start_offset = offset;
>  return 0;
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH v1 1/2] spapr.c: use g_auto* with 'nodename' in CPU DT functions

2021-01-20 Thread David Gibson

On Wed, Jan 20, 2021 at 08:23:04PM -0300, Daniel Henrique Barboza wrote:
> Next patch will use the 'nodename' string in spapr_core_dt_populate()
> after the point it's being freed today.
> 
> Instead of moving 'g_free(nodename)' around, let's do a QoL change in
> both CPU DT functions where 'nodename' is being freed, and use
> g_autofree to avoid the 'g_free()' call altogether.
> 
> Signed-off-by: Daniel Henrique Barboza 

Applied, thanks.

> ---
>  hw/ppc/spapr.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6c47466fc2..cc1b709615 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -790,7 +790,6 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
> *spapr)
>  CPUState *cs;
>  int n_cpus;
>  int cpus_offset;
> -char *nodename;
>  int i;
>  
>  cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
> @@ -818,6 +817,7 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
> *spapr)
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  int index = spapr_get_vcpu_id(cpu);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +g_autofree char *nodename = NULL;
>  int offset;
>  
>  if (!spapr_is_thread0_in_vcore(spapr, cpu)) {
> @@ -826,7 +826,6 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
> *spapr)
>  
>  nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
>  offset = fdt_add_subnode(fdt, cpus_offset, nodename);
> -g_free(nodename);
>  _FDT(offset);
>  spapr_dt_cpu(cs, fdt, offset, spapr);
>  }
> @@ -3743,12 +3742,11 @@ int spapr_core_dt_populate(SpaprDrc *drc, 
> SpaprMachineState *spapr,
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
>  int id = spapr_get_vcpu_id(cpu);
> -char *nodename;
> +g_autofree char *nodename = NULL;
>  int offset;
>  
>  nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
>  offset = fdt_add_subnode(fdt, 0, nodename);
> -g_free(nodename);
>  
>  spapr_dt_cpu(cs, fdt, offset, spapr);
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[PATCH v1 2/2] spapr.c: add 'name' property for hotplugged CPUs nodes

2021-01-20 Thread Daniel Henrique Barboza

In the CPU hotunplug bug [1] the guest kernel throws a scary
message in dmesg:

pseries-hotplug-cpu: Failed to offline CPU , rc: -16

The reason isn't related to the bug though. This happens because the
kernel file arch/powerpc/platform/pseries/hotplug-cpu.c, function
dlpar_cpu_remove(), is not finding the device_node.name of the offending
CPU.

We're not populating the 'name' property for hotplugged CPUs. Since the
kernel relies on device_node.name for identifying CPU nodes, and the
CPUs that are coldplugged has the 'name' property filled by SLOF, this
is creating an unneeded inconsistency between hotplug and coldplug CPUs
in the kernel.

Let's fill the 'name' property for hotplugged CPUs as well. This will
make the guest dmesg throws a less intimidating message when we try to
unplug the last online CPU:

pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16

[1] https://bugzilla.redhat.com/1911414

Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cc1b709615..6ab27ea269 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3750,6 +3750,19 @@ int spapr_core_dt_populate(SpaprDrc *drc, 
SpaprMachineState *spapr,
 
 spapr_dt_cpu(cs, fdt, offset, spapr);
 
+/*
+ * spapr_dt_cpu() does not fill the 'name' property in the
+ * CPU node. The function is called during boot process, before
+ * and after CAS, and overwriting the 'name' property written
+ * by SLOF is not allowed.
+ *
+ * Write it manually after spapr_dt_cpu(). This makes the hotplug
+ * CPUs more compatible with the coldplugged ones, which have
+ * the 'name' property. Linux Kernel also relies on this
+ * property to identify CPU nodes.
+ */
+_FDT((fdt_setprop_string(fdt, offset, "name", nodename)));
+
 *fdt_start_offset = offset;
 return 0;
 }
-- 
2.26.2

[PATCH v1 1/2] spapr.c: use g_auto* with 'nodename' in CPU DT functions

2021-01-20 Thread Daniel Henrique Barboza

Next patch will use the 'nodename' string in spapr_core_dt_populate()
after the point it's being freed today.

Instead of moving 'g_free(nodename)' around, let's do a QoL change in
both CPU DT functions where 'nodename' is being freed, and use
g_autofree to avoid the 'g_free()' call altogether.

Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6c47466fc2..cc1b709615 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -790,7 +790,6 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
*spapr)
 CPUState *cs;
 int n_cpus;
 int cpus_offset;
-char *nodename;
 int i;
 
 cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
@@ -818,6 +817,7 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
*spapr)
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 int index = spapr_get_vcpu_id(cpu);
 DeviceClass *dc = DEVICE_GET_CLASS(cs);
+g_autofree char *nodename = NULL;
 int offset;
 
 if (!spapr_is_thread0_in_vcore(spapr, cpu)) {
@@ -826,7 +826,6 @@ static void spapr_dt_cpus(void *fdt, SpaprMachineState 
*spapr)
 
 nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
 offset = fdt_add_subnode(fdt, cpus_offset, nodename);
-g_free(nodename);
 _FDT(offset);
 spapr_dt_cpu(cs, fdt, offset, spapr);
 }
@@ -3743,12 +3742,11 @@ int spapr_core_dt_populate(SpaprDrc *drc, 
SpaprMachineState *spapr,
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 DeviceClass *dc = DEVICE_GET_CLASS(cs);
 int id = spapr_get_vcpu_id(cpu);
-char *nodename;
+g_autofree char *nodename = NULL;
 int offset;
 
 nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
 offset = fdt_add_subnode(fdt, 0, nodename);
-g_free(nodename);
 
 spapr_dt_cpu(cs, fdt, offset, spapr);
 
-- 
2.26.2

[PATCH v1 0/2] spapr.c: add 'name' property for hotplugged CPUs nodes

2021-01-20 Thread Daniel Henrique Barboza

Hi,

We're not adding the 'name' property in the hotplugged CPU nodes, and
this is being reflected in the kernel in a CPU hotunplug bug. See
patch 02 for more info. 


Daniel Henrique Barboza (2):
  spapr.c: use g_auto* with 'nodename' in CPU DT functions
  spapr.c: add 'name' property for hotplugged CPUs nodes

 hw/ppc/spapr.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

-- 
2.26.2

Re: [PATCH 04/11] job: add .cancel handler for the driver

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> To be used in mirror in the following commit to cancel in-flight io on
> target to not waste the time.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/qemu/job.h | 5 +
>  job.c  | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 32aabb1c60..efc6fa7544 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -251,6 +251,11 @@ struct JobDriver {
>   */
>  void (*clean)(Job *job);
>  
> +/**
> + * If the callback is not NULL, it will be invoked in job_cancel_async
> + */
> +void (*cancel)(Job *job);
> +

Does the call need to be re-entrant or even worry about being invoked
more than once on the same BDS?  Or worded differently,

> +++ b/job.c
> @@ -712,6 +712,9 @@ static int job_finalize_single(Job *job)
>  
>  static void job_cancel_async(Job *job, bool force)
>  {
> +if (job->driver->cancel) {
> +job->driver->cancel(job);
> +}
>  if (job->user_paused) {

can job_cancel_async be reached more than once on the same BDS?

>  /* Do not call job_enter here, the caller will handle it.  */
>  if (job->driver->user_resume) {
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 03/11] block/raw-format: implement .bdrv_cancel_in_flight handler

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> We are going to cancel in-flight requests on mirror nbd target on job
> cancel. Still nbd is often used not directly but as raw-format child.
> So, add pass-through handler here.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/raw-format.c | 6 ++
>  1 file changed, 6 insertions(+)

Should all filters do this automatically (or rather, should the block
layer do this automatically for all filters)?  But I understand that it
does NOT make sense for format drivers in general (cancelling a guest
request may still require metadata updates), where raw-format is the
exception, so doing it in raw-format instead of the block layer makes sense.

>  BlockDriver bdrv_raw = {
>  .format_name  = "raw",
>  .instance_size= sizeof(BDRVRawState),
> @@ -608,6 +613,7 @@ BlockDriver bdrv_raw = {
>  .bdrv_has_zero_init   = _has_zero_init,
>  .strong_runtime_opts  = raw_strong_runtime_opts,
>  .mutable_opts = mutable_opts,
> +.bdrv_cancel_in_flight = raw_cancel_in_flight,

A demonstration of why I don't like aligning the =.  But it's merely
cosmetic, so doesn't affect:

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 02/11] block/nbd: implement .bdrv_cancel_in_flight

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> Just stop waiting for connection in existing requests.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 15 +++
>  1 file changed, 15 insertions(+)
> 

That turned out to be rather easy ;)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v6 00/11] hvf: Implement Apple Silicon Support

2021-01-20 Thread no-reply

Patchew URL: https://patchew.org/QEMU/2021012022.71840-1-ag...@csgraf.de/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 2021012022.71840-1-ag...@csgraf.de
Subject: [PATCH v6 00/11] hvf: Implement Apple Silicon Support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  patchew/20201118180433.11931-1-vsement...@virtuozzo.com -> 
patchew/20201118180433.11931-1-vsement...@virtuozzo.com
 * [new tag] patchew/2021012022.71840-1-ag...@csgraf.de -> 
patchew/2021012022.71840-1-ag...@csgraf.de
Switched to a new branch 'test'
d212671 hvf: arm: Implement -cpu host
5421e5f hvf: arm: Add support for GICv3
611ed95 arm/hvf: Add a WFI handler
f186921 arm: Add Hypervisor.framework build target
f265d89 hvf: Add Apple Silicon support
8260850 hvf: Simplify post reset/init/loadvm hooks
4761f3e arm: Set PSCI to 0.2 for HVF
6626d7b hvf: Introduce hvf vcpu struct
39205aa hvf: Move common code out
950e618 hvf: x86: Remove unused definitions
3fa2296 hvf: Add hypervisor entitlement to output binaries

=== OUTPUT BEGIN ===
1/11 Checking commit 3fa229624043 (hvf: Add hypervisor entitlement to output 
binaries)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#23: 
new file mode 100644

total: 0 errors, 1 warnings, 62 lines checked

Patch 1/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/11 Checking commit 950e618cfb15 (hvf: x86: Remove unused definitions)
3/11 Checking commit 39205aaec1dd (hvf: Move common code out)
4/11 Checking commit 6626d7b844ca (hvf: Introduce hvf vcpu struct)
WARNING: line over 80 characters
#155: FILE: target/i386/hvf/hvf.c:213:
+wvmcs(cpu->hvf->fd, VMCS_ENTRY_CTLS, 
cap2ctrl(hvf_state->hvf_caps->vmx_cap_entry,

ERROR: "(foo*)" should be "(foo *)"
#763: FILE: target/i386/hvf/x86hvf.c:85:
+if (hv_vcpu_write_fpstate(cpu_state->hvf->fd, (void*)xsave, 4096)) {

ERROR: "(foo*)" should be "(foo *)"
#844: FILE: target/i386/hvf/x86hvf.c:167:
+if (hv_vcpu_read_fpstate(cpu_state->hvf->fd, (void*)xsave, 4096)) {

total: 2 errors, 1 warnings, 996 lines checked

Patch 4/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

5/11 Checking commit 4761f3e15d29 (arm: Set PSCI to 0.2 for HVF)
6/11 Checking commit 82608506deda (hvf: Simplify post reset/init/loadvm hooks)
7/11 Checking commit f265d897177f (hvf: Add Apple Silicon support)
WARNING: architecture specific defines should be avoided
#54: FILE: accel/hvf/hvf-cpus.c:61:
+#ifdef __aarch64__

WARNING: architecture specific defines should be avoided
#65: FILE: accel/hvf/hvf-cpus.c:335:
+#ifdef __aarch64__

WARNING: architecture specific defines should be avoided
#97: FILE: include/sysemu/hvf_int.h:15:
+#ifdef __aarch64__

WARNING: line over 80 characters
#585: FILE: target/arm/hvf/hvf.c:458:
+hv_vcpu_set_pending_interrupt(cpu->hvf->fd, HV_INTERRUPT_TYPE_FIQ, 
true);

WARNING: line over 80 characters
#590: FILE: target/arm/hvf/hvf.c:463:
+hv_vcpu_set_pending_interrupt(cpu->hvf->fd, HV_INTERRUPT_TYPE_IRQ, 
true);

total: 0 errors, 5 warnings, 691 lines checked

Patch 7/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/11 Checking commit f186921cf0db (arm: Add Hypervisor.framework build target)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#54: 
new file mode 100644

total: 0 errors, 1 warnings, 36 lines checked

Patch 8/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/11 Checking commit 611ed957b3e2 (arm/hvf: Add a WFI handler)
10/11 Checking commit 5421e5fb793a (hvf: arm: Add support for GICv3)
11/11 Checking commit d21267146369 (hvf: arm: Implement -cpu host)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/2021012022.71840-1-ag...@csgraf.de/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte

2021-01-20 Thread Peter Lieven


> Am 19.01.2021 um 15:20 schrieb Jason Dillaman :
> 
> On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven  wrote:
>>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
>>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven  wrote:
 Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven  wrote:
>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven  wrote:
 since we implement byte interfaces and librbd supports aio on byte 
 granularity we can lift
 the 512 byte alignment.
 Signed-off-by: Peter Lieven 
 ---
 block/rbd.c | 2 --
 1 file changed, 2 deletions(-)
 diff --git a/block/rbd.c b/block/rbd.c
 index 27b4404adf..8673e8f553 100644
 --- a/block/rbd.c
 +++ b/block/rbd.c
 @@ -223,8 +223,6 @@ done:
 static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
BDRVRBDState *s = bs->opaque;
 -/* XXX Does RBD support AIO on less than 512-byte alignment? */
 -bs->bl.request_alignment = 512;
>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>> optimal alignment (if that's something QEMU handles internally) if not
>>> overridden by the user.
>> Qemu supports max_discard and discard_alignment. Is there a call to get 
>> these limits
>> from librbd?
>> What do you mean by optimal_alignment? The object size?
> krbd does a good job of initializing defaults [1] where optimal and
> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> writes, discards, and write-zeroes is the object size * the stripe
> count.
 Okay, I will have a look at it. If qemu issues a write, discard, 
 write_zero greater than
 obj_size  * stripe count will librbd split it internally or will the 
 request fail?
>>> librbd will handle it as needed. My goal is really just to get the
>>> hints down the guest OS.
 Regarding the alignment it seems that rbd_dev->opts->alloc_size is 
 something that comes from the device
 configuration and not from rbd? I don't have that information inside the 
 Qemu RBD driver.
>>> librbd doesn't really have the information either. The 64KiB guess
>>> that krbd uses was a compromise since that was the default OSD
>>> allocation size for HDDs since Luminous. Starting with Pacific that
>>> default is going down to 4KiB.
>> I will try to adjust these values as far as it is possible and makes sense.
>> Is there a way to check the minimum supported OSD release in the backend 
>> from librbd / librados?
> 
> It's not a minimum -- RADOS will gladly access 1 byte writes as well.
> It's really just the optimal (performance and space-wise). Sadly,
> there is no realistic way to query this data from the backend.

So you would suggest to advertise an optimal transfer length of 64k and max 
transfer length of obj size * stripe count to the guest unless we have an API 
in the future to query these limits from the backend?

I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs for 
all write requests that do not align. Everything that comes from a guest OS is 
very likely 4k aligned anyway.

Peter

[PATCH v6 11/11] hvf: arm: Implement -cpu host

2021-01-20 Thread Alexander Graf

Now that we have working system register sync, we push more target CPU
properties into the virtual machine. That might be useful in some
situations, but is not the typical case that users want.

So let's add a -cpu host option that allows them to explicitly pass all
CPU capabilities of their host CPU into the guest.

Signed-off-by: Alexander Graf 
Acked-by: Roman Bolshakov 
---
 include/sysemu/hvf.h |  2 ++
 target/arm/cpu.c |  9 ++---
 target/arm/cpu.h |  2 ++
 target/arm/hvf/hvf.c | 41 +
 target/arm/kvm_arm.h |  2 --
 5 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h
index f893768df9..7eb61cf094 100644
--- a/include/sysemu/hvf.h
+++ b/include/sysemu/hvf.h
@@ -19,6 +19,8 @@
 #ifdef CONFIG_HVF
 uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
  int reg);
+struct ARMCPU;
+void hvf_arm_set_cpu_features_from_host(struct ARMCPU *cpu);
 extern bool hvf_allowed;
 #define hvf_enabled() (hvf_allowed)
 #else /* !CONFIG_HVF */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index f1929b5eba..abd129d23f 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2288,12 +2288,16 @@ static void arm_cpu_class_init(ObjectClass *oc, void 
*data)
 #endif
 }
 
-#ifdef CONFIG_KVM
+#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
 static void arm_host_initfn(Object *obj)
 {
 ARMCPU *cpu = ARM_CPU(obj);
 
+#ifdef CONFIG_KVM
 kvm_arm_set_cpu_features_from_host(cpu);
+#else
+hvf_arm_set_cpu_features_from_host(cpu);
+#endif
 if (arm_feature(>env, ARM_FEATURE_AARCH64)) {
 aarch64_add_sve_properties(obj);
 }
@@ -2305,7 +2309,6 @@ static const TypeInfo host_arm_cpu_type_info = {
 .parent = TYPE_AARCH64_CPU,
 .instance_init = arm_host_initfn,
 };
-
 #endif
 
 static void arm_cpu_instance_init(Object *obj)
@@ -2364,7 +2367,7 @@ static void arm_cpu_register_types(void)
 
 type_register_static(_cpu_type_info);
 
-#ifdef CONFIG_KVM
+#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
 type_register_static(_arm_cpu_type_info);
 #endif
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index df0d677833..5cc59df451 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2961,6 +2961,8 @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
 #define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
 #define CPU_RESOLVING_TYPE TYPE_ARM_CPU
 
+#define TYPE_ARM_HOST_CPU "host-" TYPE_ARM_CPU
+
 #define cpu_signal_handler cpu_arm_signal_handler
 #define cpu_list arm_cpu_list
 
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 98bd6712c0..42dcc23ba0 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -372,6 +372,47 @@ static uint64_t hvf_get_reg(CPUState *cpu, int rt)
 return val;
 }
 
+void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu)
+{
+ARMISARegisters host_isar;
+const struct isar_regs {
+int reg;
+uint64_t *val;
+} regs[] = {
+{ HV_SYS_REG_ID_AA64PFR0_EL1, _isar.id_aa64pfr0 },
+{ HV_SYS_REG_ID_AA64PFR1_EL1, _isar.id_aa64pfr1 },
+{ HV_SYS_REG_ID_AA64DFR0_EL1, _isar.id_aa64dfr0 },
+{ HV_SYS_REG_ID_AA64DFR1_EL1, _isar.id_aa64dfr1 },
+{ HV_SYS_REG_ID_AA64ISAR0_EL1, _isar.id_aa64isar0 },
+{ HV_SYS_REG_ID_AA64ISAR1_EL1, _isar.id_aa64isar1 },
+{ HV_SYS_REG_ID_AA64MMFR0_EL1, _isar.id_aa64mmfr0 },
+{ HV_SYS_REG_ID_AA64MMFR1_EL1, _isar.id_aa64mmfr1 },
+{ HV_SYS_REG_ID_AA64MMFR2_EL1, _isar.id_aa64mmfr2 },
+};
+hv_vcpu_t fd;
+hv_vcpu_exit_t *exit;
+int i;
+
+cpu->dtb_compatible = "arm,arm-v8";
+cpu->env.features = (1ULL << ARM_FEATURE_V8) |
+(1ULL << ARM_FEATURE_NEON) |
+(1ULL << ARM_FEATURE_AARCH64) |
+(1ULL << ARM_FEATURE_PMU) |
+(1ULL << ARM_FEATURE_GENERIC_TIMER);
+
+/* We set up a small vcpu to extract host registers */
+
+assert_hvf_ok(hv_vcpu_create(, , NULL));
+for (i = 0; i < ARRAY_SIZE(regs); i++) {
+assert_hvf_ok(hv_vcpu_get_sys_reg(fd, regs[i].reg, regs[i].val));
+}
+assert_hvf_ok(hv_vcpu_get_sys_reg(fd, HV_SYS_REG_MIDR_EL1, >midr));
+assert_hvf_ok(hv_vcpu_destroy(fd));
+
+cpu->isar = host_isar;
+cpu->reset_sctlr = 0x00c50078;
+}
+
 void hvf_arch_vcpu_destroy(CPUState *cpu)
 {
 }
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index eb81b7059e..081727a37e 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -214,8 +214,6 @@ bool kvm_arm_create_scratch_host_vcpu(const uint32_t 
*cpus_to_try,
  */
 void kvm_arm_destroy_scratch_host_vcpu(int *fdarray);
 
-#define TYPE_ARM_HOST_CPU "host-" TYPE_ARM_CPU
-
 /**
  * ARMHostCPUFeatures: information about the host CPU (identified
  * by asking the host kernel)
-- 
2.24.3 (Apple Git-128)

[PATCH v6 07/11] hvf: Add Apple Silicon support

2021-01-20 Thread Alexander Graf

With Apple Silicon available to the masses, it's a good time to add support
for driving its virtualization extensions from QEMU.

This patch adds all necessary architecture specific code to get basic VMs
working. It's still pretty raw, but definitely functional.

Known limitations:

  - Vtimer acknowledgement is hacky
  - Should implement more sysregs and fault on invalid ones then
  - WFI handling is missing, need to marry it with vtimer

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 

---

v1 -> v2:

  - Merge vcpu kick function patch
  - Implement WFI handling (allows vCPUs to sleep)
  - Synchronize system registers (fixes OVMF crashes and reboot)
  - Don't always call cpu_synchronize_state()
  - Use more fine grained iothread locking
  - Populate aa64mmfr0 from hardware

v2 -> v3:

  - Advance PC on SMC
  - Use cp list interface for sysreg syncs
  - Do not set current_cpu
  - Fix sysreg isread mask
  - Move sysreg handling to functions
  - Remove WFI logic again
  - Revert to global iothread locking
  - Use Hypervisor.h on arm, hv.h does not contain aarch64 definitions

v3 -> v4:

  - No longer include Hypervisor.h

v5 -> v6:

  - Swap sysreg definition order. This way we're in line with asm outputs.
---
 MAINTAINERS  |   5 +
 accel/hvf/hvf-cpus.c |  14 +
 include/sysemu/hvf_int.h |   9 +-
 target/arm/hvf/hvf.c | 618 +++
 4 files changed, 645 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/hvf/hvf.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e589ec02e0..8cbb3f37b9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -442,6 +442,11 @@ F: accel/accel.c
 F: accel/Makefile.objs
 F: accel/stubs/Makefile.objs
 
+Apple Silicon HVF CPUs
+M: Alexander Graf 
+S: Maintained
+F: target/arm/hvf/
+
 X86 HVF CPUs
 M: Cameron Esfahani 
 M: Roman Bolshakov 
diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
index a324da2757..6d70ee742e 100644
--- a/accel/hvf/hvf-cpus.c
+++ b/accel/hvf/hvf-cpus.c
@@ -58,6 +58,10 @@
 #include "sysemu/runstate.h"
 #include "qemu/guest-random.h"
 
+#ifdef __aarch64__
+#define HV_VM_DEFAULT NULL
+#endif
+
 /* Memory slots */
 
 struct mac_slot {
@@ -328,7 +332,11 @@ static int hvf_init_vcpu(CPUState *cpu)
 pthread_sigmask(SIG_BLOCK, NULL, );
 sigdelset(, SIG_IPI);
 
+#ifdef __aarch64__
+r = hv_vcpu_create(>hvf->fd, (hv_vcpu_exit_t **)>hvf->exit, 
NULL);
+#else
 r = hv_vcpu_create((hv_vcpuid_t *)>hvf->fd, HV_VCPU_DEFAULT);
+#endif
 cpu->vcpu_dirty = 1;
 assert_hvf_ok(r);
 
@@ -399,8 +407,14 @@ static void hvf_start_vcpu_thread(CPUState *cpu)
cpu, QEMU_THREAD_JOINABLE);
 }
 
+__attribute__((weak)) void hvf_kick_vcpu_thread(CPUState *cpu)
+{
+cpus_kick_thread(cpu);
+}
+
 static const CpusAccel hvf_cpus = {
 .create_vcpu_thread = hvf_start_vcpu_thread,
+.kick_vcpu_thread = hvf_kick_vcpu_thread,
 
 .synchronize_post_reset = hvf_cpu_synchronize_post_reset,
 .synchronize_post_init = hvf_cpu_synchronize_post_init,
diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
index 9d3cb53e47..c2ac6c8f97 100644
--- a/include/sysemu/hvf_int.h
+++ b/include/sysemu/hvf_int.h
@@ -11,7 +11,12 @@
 #ifndef HVF_INT_H
 #define HVF_INT_H
 
+#include "qemu/osdep.h"
+#ifdef __aarch64__
+#include 
+#else
 #include 
+#endif
 
 /* hvf_slot flags */
 #define HVF_SLOT_LOG (1 << 0)
@@ -44,7 +49,8 @@ struct HVFState {
 extern HVFState *hvf_state;
 
 struct hvf_vcpu_state {
-int fd;
+uint64_t fd;
+void *exit;
 };
 
 void assert_hvf_ok(hv_return_t ret);
@@ -54,5 +60,6 @@ int hvf_arch_init_vcpu(CPUState *cpu);
 void hvf_arch_vcpu_destroy(CPUState *cpu);
 int hvf_vcpu_exec(CPUState *cpu);
 hvf_slot *hvf_find_overlap_slot(uint64_t, uint64_t);
+void hvf_kick_vcpu_thread(CPUState *cpu);
 
 #endif
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
new file mode 100644
index 00..8f18efe856
--- /dev/null
+++ b/target/arm/hvf/hvf.c
@@ -0,0 +1,618 @@
+/*
+ * QEMU Hypervisor.framework support for Apple Silicon
+
+ * Copyright 2020 Alexander Graf 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+
+#include "sysemu/runstate.h"
+#include "sysemu/hvf.h"
+#include "sysemu/hvf_int.h"
+#include "sysemu/hw_accel.h"
+
+#include "exec/address-spaces.h"
+#include "hw/irq.h"
+#include "qemu/main-loop.h"
+#include "sysemu/accel.h"
+#include "sysemu/cpus.h"
+#include "target/arm/cpu.h"
+#include "target/arm/internals.h"
+
+#define HVF_DEBUG 0
+#define DPRINTF(...)\
+if (HVF_DEBUG) {\
+fprintf(stderr, "HVF %s:%d ", __func__, __LINE__);  \
+fprintf(stderr, __VA_ARGS__);   \
+fprintf(stderr, "\n");  \
+}
+
+#define HVF_SYSREG(crn,

[PATCH v6 10/11] hvf: arm: Add support for GICv3

2021-01-20 Thread Alexander Graf

We currently only support GICv2 emulation. To also support GICv3, we will
need to pass a few system registers into their respective handler functions.

This patch adds handling for all of the required system registers, so that
we can run with more than 8 vCPUs.

Signed-off-by: Alexander Graf 
Acked-by: Roman Bolshakov 

---

v5 -> v6:

  - Adapt to new SYSREG() ordering
---
 target/arm/hvf/hvf.c | 141 +++
 1 file changed, 141 insertions(+)

diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index f0850ab14a..98bd6712c0 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -22,6 +22,7 @@
 
 #include "exec/address-spaces.h"
 #include "hw/irq.h"
+#include "hw/intc/gicv3_internal.h"
 #include "qemu/main-loop.h"
 #include "sysemu/accel.h"
 #include "sysemu/cpus.h"
@@ -46,6 +47,33 @@
 #define SYSREG_CNTPCT_EL0 SYSREG(3, 3, 14, 0, 1)
 #define SYSREG_PMCCNTR_EL0SYSREG(3, 3, 9, 13, 0)
 
+#define SYSREG_ICC_AP0R0_EL1 SYSREG(3, 0, 12, 8, 4)
+#define SYSREG_ICC_AP0R1_EL1 SYSREG(3, 0, 12, 8, 5)
+#define SYSREG_ICC_AP0R2_EL1 SYSREG(3, 0, 12, 8, 6)
+#define SYSREG_ICC_AP0R3_EL1 SYSREG(3, 0, 12, 8, 7)
+#define SYSREG_ICC_AP1R0_EL1 SYSREG(3, 0, 12, 9, 0)
+#define SYSREG_ICC_AP1R1_EL1 SYSREG(3, 0, 12, 9, 1)
+#define SYSREG_ICC_AP1R2_EL1 SYSREG(3, 0, 12, 9, 2)
+#define SYSREG_ICC_AP1R3_EL1 SYSREG(3, 0, 12, 9, 3)
+#define SYSREG_ICC_ASGI1R_EL1SYSREG(3, 0, 12, 11, 6)
+#define SYSREG_ICC_BPR0_EL1  SYSREG(3, 0, 12, 8, 3)
+#define SYSREG_ICC_BPR1_EL1  SYSREG(3, 0, 12, 12, 3)
+#define SYSREG_ICC_CTLR_EL1  SYSREG(3, 0, 12, 12, 4)
+#define SYSREG_ICC_DIR_EL1   SYSREG(3, 0, 12, 11, 1)
+#define SYSREG_ICC_EOIR0_EL1 SYSREG(3, 0, 12, 8, 1)
+#define SYSREG_ICC_EOIR1_EL1 SYSREG(3, 0, 12, 12, 1)
+#define SYSREG_ICC_HPPIR0_EL1SYSREG(3, 0, 12, 8, 2)
+#define SYSREG_ICC_HPPIR1_EL1SYSREG(3, 0, 12, 12, 2)
+#define SYSREG_ICC_IAR0_EL1  SYSREG(3, 0, 12, 8, 0)
+#define SYSREG_ICC_IAR1_EL1  SYSREG(3, 0, 12, 12, 0)
+#define SYSREG_ICC_IGRPEN0_EL1   SYSREG(3, 0, 12, 12, 6)
+#define SYSREG_ICC_IGRPEN1_EL1   SYSREG(3, 0, 12, 12, 7)
+#define SYSREG_ICC_PMR_EL1   SYSREG(3, 0, 4, 6, 0)
+#define SYSREG_ICC_RPR_EL1   SYSREG(3, 0, 12, 11, 3)
+#define SYSREG_ICC_SGI0R_EL1 SYSREG(3, 0, 12, 11, 7)
+#define SYSREG_ICC_SGI1R_EL1 SYSREG(3, 0, 12, 11, 5)
+#define SYSREG_ICC_SRE_EL1   SYSREG(3, 0, 12, 12, 5)
+
 #define WFX_IS_WFE (1 << 0)
 
 struct hvf_reg_match {
@@ -418,6 +446,38 @@ void hvf_kick_vcpu_thread(CPUState *cpu)
 hv_vcpus_exit(>hvf->fd, 1);
 }
 
+static uint32_t hvf_reg2cp_reg(uint32_t reg)
+{
+return ENCODE_AA64_CP_REG(CP_REG_ARM64_SYSREG_CP,
+  (reg >> 10) & 0xf,
+  (reg >> 1) & 0xf,
+  (reg >> 20) & 0x3,
+  (reg >> 14) & 0x7,
+  (reg >> 17) & 0x7);
+}
+
+static uint64_t hvf_sysreg_read_cp(CPUState *cpu, uint32_t reg)
+{
+ARMCPU *arm_cpu = ARM_CPU(cpu);
+CPUARMState *env = _cpu->env;
+const ARMCPRegInfo *ri;
+uint64_t val = 0;
+
+ri = get_arm_cp_reginfo(arm_cpu->cp_regs, hvf_reg2cp_reg(reg));
+if (ri) {
+if (ri->type & ARM_CP_CONST) {
+val = ri->resetvalue;
+} else if (ri->readfn) {
+val = ri->readfn(env, ri);
+} else {
+val = CPREG_FIELD64(env, ri);
+}
+DPRINTF("vgic read from %s [val=%016llx]", ri->name, val);
+}
+
+return val;
+}
+
 static uint64_t hvf_sysreg_read(CPUState *cpu, uint32_t reg)
 {
 ARMCPU *arm_cpu = ARM_CPU(cpu);
@@ -431,6 +491,39 @@ static uint64_t hvf_sysreg_read(CPUState *cpu, uint32_t 
reg)
 case SYSREG_PMCCNTR_EL0:
 val = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 break;
+case SYSREG_ICC_AP0R0_EL1:
+case SYSREG_ICC_AP0R1_EL1:
+case SYSREG_ICC_AP0R2_EL1:
+case SYSREG_ICC_AP0R3_EL1:
+case SYSREG_ICC_AP1R0_EL1:
+case SYSREG_ICC_AP1R1_EL1:
+case SYSREG_ICC_AP1R2_EL1:
+case SYSREG_ICC_AP1R3_EL1:
+case SYSREG_ICC_ASGI1R_EL1:
+case SYSREG_ICC_BPR0_EL1:
+case SYSREG_ICC_BPR1_EL1:
+case SYSREG_ICC_DIR_EL1:
+case SYSREG_ICC_EOIR0_EL1:
+case SYSREG_ICC_EOIR1_EL1:
+case SYSREG_ICC_HPPIR0_EL1:
+case SYSREG_ICC_HPPIR1_EL1:
+case SYSREG_ICC_IAR0_EL1:
+case SYSREG_ICC_IAR1_EL1:
+case SYSREG_ICC_IGRPEN0_EL1:
+case SYSREG_ICC_IGRPEN1_EL1:
+case SYSREG_ICC_PMR_EL1:
+case SYSREG_ICC_SGI0R_EL1:
+case SYSREG_ICC_SGI1R_EL1:
+case SYSREG_ICC_SRE_EL1:
+val = hvf_sysreg_read_cp(cpu, reg);
+break;
+case SYSREG_ICC_CTLR_EL1:
+val = hvf_sysreg_read_cp(cpu, reg);
+
+/* AP0R registers above 0 don't trap, expose less PRIs to fit */
+val &= ~ICC_CTLR_EL1_PRIBITS_MASK;
+val |= 4 << ICC_CTLR_EL1_PRIBITS_SHIFT;
+break;
 default:
 DPRINTF("unhandled sysreg read %08x

[PATCH v6 08/11] arm: Add Hypervisor.framework build target

2021-01-20 Thread Alexander Graf

Now that we have all logic in place that we need to handle Hypervisor.framework
on Apple Silicon systems, let's add CONFIG_HVF for aarch64 as well so that we
can build it.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov  (x86 only)

---

v1 -> v2:

  - Fix build on 32bit arm

v3 -> v4:

  - Remove i386-softmmu target
---
 meson.build| 11 ++-
 target/arm/hvf/meson.build |  3 +++
 target/arm/meson.build |  2 ++
 3 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/hvf/meson.build

diff --git a/meson.build b/meson.build
index c667d64498..8302fcbd90 100644
--- a/meson.build
+++ b/meson.build
@@ -74,16 +74,25 @@ else
 endif
 
 accelerator_targets = { 'CONFIG_KVM': kvm_targets }
+
+if cpu in ['x86', 'x86_64']
+  hvf_targets = ['x86_64-softmmu']
+elif cpu in ['aarch64']
+  hvf_targets = ['aarch64-softmmu']
+else
+  hvf_targets = []
+endif
+
 if cpu in ['x86', 'x86_64', 'arm', 'aarch64']
   # i368 emulator provides xenpv machine type for multiple architectures
   accelerator_targets += {
 'CONFIG_XEN': ['i386-softmmu', 'x86_64-softmmu'],
+'CONFIG_HVF': hvf_targets,
   }
 endif
 if cpu in ['x86', 'x86_64']
   accelerator_targets += {
 'CONFIG_HAX': ['i386-softmmu', 'x86_64-softmmu'],
-'CONFIG_HVF': ['x86_64-softmmu'],
 'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
   }
 endif
diff --git a/target/arm/hvf/meson.build b/target/arm/hvf/meson.build
new file mode 100644
index 00..855e6cce5a
--- /dev/null
+++ b/target/arm/hvf/meson.build
@@ -0,0 +1,3 @@
+arm_softmmu_ss.add(when: [hvf, 'CONFIG_HVF'], if_true: files(
+  'hvf.c',
+))
diff --git a/target/arm/meson.build b/target/arm/meson.build
index 15b936c101..2efd6e672a 100644
--- a/target/arm/meson.build
+++ b/target/arm/meson.build
@@ -54,5 +54,7 @@ arm_softmmu_ss.add(files(
   'psci.c',
 ))
 
+subdir('hvf')
+
 target_arch += {'arm': arm_ss}
 target_softmmu_arch += {'arm': arm_softmmu_ss}
-- 
2.24.3 (Apple Git-128)

[PATCH v6 09/11] arm/hvf: Add a WFI handler

2021-01-20 Thread Alexander Graf

From: Peter Collingbourne 

Sleep on WFI until the VTIMER is due but allow ourselves to be woken
up on IPI.

In this implementation IPI is blocked on the CPU thread at startup and
pselect() is used to atomically unblock the signal and begin sleeping.
The signal is sent unconditionally so there's no need to worry about
races between actually sleeping and the "we think we're sleeping"
state. It may lead to an extra wakeup but that's better than missing
it entirely.

Signed-off-by: Peter Collingbourne 
[agraf: Remove unused 'set' variable, always advance PC on WFX trap]
Signed-off-by: Alexander Graf 
Acked-by: Roman Bolshakov 
---
 accel/hvf/hvf-cpus.c |  5 ++--
 include/sysemu/hvf_int.h |  1 +
 target/arm/hvf/hvf.c | 56 
 3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
index 6d70ee742e..abef6a58f7 100644
--- a/accel/hvf/hvf-cpus.c
+++ b/accel/hvf/hvf-cpus.c
@@ -322,15 +322,14 @@ static int hvf_init_vcpu(CPUState *cpu)
 cpu->hvf = g_malloc0(sizeof(*cpu->hvf));
 
 /* init cpu signals */
-sigset_t set;
 struct sigaction sigact;
 
 memset(, 0, sizeof(sigact));
 sigact.sa_handler = dummy_signal;
 sigaction(SIG_IPI, , NULL);
 
-pthread_sigmask(SIG_BLOCK, NULL, );
-sigdelset(, SIG_IPI);
+pthread_sigmask(SIG_BLOCK, NULL, >hvf->unblock_ipi_mask);
+sigdelset(>hvf->unblock_ipi_mask, SIG_IPI);
 
 #ifdef __aarch64__
 r = hv_vcpu_create(>hvf->fd, (hv_vcpu_exit_t **)>hvf->exit, 
NULL);
diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
index c2ac6c8f97..7a397fe85a 100644
--- a/include/sysemu/hvf_int.h
+++ b/include/sysemu/hvf_int.h
@@ -51,6 +51,7 @@ extern HVFState *hvf_state;
 struct hvf_vcpu_state {
 uint64_t fd;
 void *exit;
+sigset_t unblock_ipi_mask;
 };
 
 void assert_hvf_ok(hv_return_t ret);
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 8f18efe856..f0850ab14a 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -2,6 +2,7 @@
  * QEMU Hypervisor.framework support for Apple Silicon
 
  * Copyright 2020 Alexander Graf 
+ * Copyright 2020 Google LLC
  *
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
@@ -17,6 +18,8 @@
 #include "sysemu/hvf_int.h"
 #include "sysemu/hw_accel.h"
 
+#include 
+
 #include "exec/address-spaces.h"
 #include "hw/irq.h"
 #include "qemu/main-loop.h"
@@ -411,6 +414,7 @@ int hvf_arch_init_vcpu(CPUState *cpu)
 
 void hvf_kick_vcpu_thread(CPUState *cpu)
 {
+cpus_kick_thread(cpu);
 hv_vcpus_exit(>hvf->fd, 1);
 }
 
@@ -466,6 +470,18 @@ static int hvf_inject_interrupts(CPUState *cpu)
 return 0;
 }
 
+static void hvf_wait_for_ipi(CPUState *cpu, struct timespec *ts)
+{
+/*
+ * Use pselect to sleep so that other threads can IPI us while we're
+ * sleeping.
+ */
+qatomic_mb_set(>thread_kicked, false);
+qemu_mutex_unlock_iothread();
+pselect(0, 0, 0, 0, ts, >hvf->unblock_ipi_mask);
+qemu_mutex_lock_iothread();
+}
+
 int hvf_vcpu_exec(CPUState *cpu)
 {
 ARMCPU *arm_cpu = ARM_CPU(cpu);
@@ -577,6 +593,46 @@ int hvf_vcpu_exec(CPUState *cpu)
 }
 case EC_WFX_TRAP:
 advance_pc = true;
+if (!(syndrome & WFX_IS_WFE) && !(cpu->interrupt_request &
+(CPU_INTERRUPT_HARD | CPU_INTERRUPT_FIQ))) {
+
+uint64_t ctl;
+r = hv_vcpu_get_sys_reg(cpu->hvf->fd, HV_SYS_REG_CNTV_CTL_EL0,
+);
+assert_hvf_ok(r);
+
+if (!(ctl & 1) || (ctl & 2)) {
+/* Timer disabled or masked, just wait for an IPI. */
+hvf_wait_for_ipi(cpu, NULL);
+break;
+}
+
+uint64_t cval;
+r = hv_vcpu_get_sys_reg(cpu->hvf->fd, HV_SYS_REG_CNTV_CVAL_EL0,
+);
+assert_hvf_ok(r);
+
+int64_t ticks_to_sleep = cval - mach_absolute_time();
+if (ticks_to_sleep < 0) {
+break;
+}
+
+uint64_t seconds = ticks_to_sleep / arm_cpu->gt_cntfrq_hz;
+uint64_t nanos =
+(ticks_to_sleep - arm_cpu->gt_cntfrq_hz * seconds) *
+10 / arm_cpu->gt_cntfrq_hz;
+
+/*
+ * Don't sleep for less than 2ms. This is believed to improve
+ * latency of message passing workloads.
+ */
+if (!seconds && nanos < 200) {
+break;
+}
+
+struct timespec ts = { seconds, nanos };
+hvf_wait_for_ipi(cpu, );
+}
 break;
 case EC_AA64_HVC:
 cpu_synchronize_state(cpu);
-- 
2.24.3 (Apple Git-128)

[PATCH v6 06/11] hvf: Simplify post reset/init/loadvm hooks

2021-01-20 Thread Alexander Graf

The hooks we have that call us after reset, init and loadvm really all
just want to say "The reference of all register state is in the QEMU
vcpu struct, please push it".

We already have a working pushing mechanism though called cpu->vcpu_dirty,
so we can just reuse that for all of the above, syncing state properly the
next time we actually execute a vCPU.

This fixes PSCI resets on ARM, as they modify CPU state even after the
post init call has completed, but before we execute the vCPU again.

To also make the scheme work for x86, we have to make sure we don't
move stale eflags into our env when the vcpu state is dirty.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 
---
 accel/hvf/hvf-cpus.c | 27 +++
 target/i386/hvf/x86hvf.c |  5 -
 2 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
index 2c6796604a..a324da2757 100644
--- a/accel/hvf/hvf-cpus.c
+++ b/accel/hvf/hvf-cpus.c
@@ -275,39 +275,26 @@ static void hvf_cpu_synchronize_state(CPUState *cpu)
 }
 }
 
-static void do_hvf_cpu_synchronize_post_reset(CPUState *cpu,
-  run_on_cpu_data arg)
+static void do_hvf_cpu_synchronize_set_dirty(CPUState *cpu,
+ run_on_cpu_data arg)
 {
-hvf_put_registers(cpu);
-cpu->vcpu_dirty = false;
+/* QEMU state is the reference, push it to HVF now and on next entry */
+cpu->vcpu_dirty = true;
 }
 
 static void hvf_cpu_synchronize_post_reset(CPUState *cpu)
 {
-run_on_cpu(cpu, do_hvf_cpu_synchronize_post_reset, RUN_ON_CPU_NULL);
-}
-
-static void do_hvf_cpu_synchronize_post_init(CPUState *cpu,
- run_on_cpu_data arg)
-{
-hvf_put_registers(cpu);
-cpu->vcpu_dirty = false;
+run_on_cpu(cpu, do_hvf_cpu_synchronize_set_dirty, RUN_ON_CPU_NULL);
 }
 
 static void hvf_cpu_synchronize_post_init(CPUState *cpu)
 {
-run_on_cpu(cpu, do_hvf_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
-}
-
-static void do_hvf_cpu_synchronize_pre_loadvm(CPUState *cpu,
-  run_on_cpu_data arg)
-{
-cpu->vcpu_dirty = true;
+run_on_cpu(cpu, do_hvf_cpu_synchronize_set_dirty, RUN_ON_CPU_NULL);
 }
 
 static void hvf_cpu_synchronize_pre_loadvm(CPUState *cpu)
 {
-run_on_cpu(cpu, do_hvf_cpu_synchronize_pre_loadvm, RUN_ON_CPU_NULL);
+run_on_cpu(cpu, do_hvf_cpu_synchronize_set_dirty, RUN_ON_CPU_NULL);
 }
 
 static void hvf_vcpu_destroy(CPUState *cpu)
diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index 0f2aeb1cf8..3111c0be4c 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -435,7 +435,10 @@ int hvf_process_events(CPUState *cpu_state)
 X86CPU *cpu = X86_CPU(cpu_state);
 CPUX86State *env = >env;
 
-env->eflags = rreg(cpu_state->hvf->fd, HV_X86_RFLAGS);
+if (!cpu_state->vcpu_dirty) {
+/* light weight sync for CPU_INTERRUPT_HARD and IF_MASK */
+env->eflags = rreg(cpu_state->hvf->fd, HV_X86_RFLAGS);
+}
 
 if (cpu_state->interrupt_request & CPU_INTERRUPT_INIT) {
 cpu_synchronize_state(cpu_state);
-- 
2.24.3 (Apple Git-128)

[PATCH v6 03/11] hvf: Move common code out

2021-01-20 Thread Alexander Graf

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's move common code out
into its own accel directory.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 

---

v3 -> v4:

  - Use hv.h instead of Hypervisor.h for 10.15 compat
  - Remove manual inclusion of Hypervisor.h in common .c files
---
 MAINTAINERS |   8 +
 accel/hvf/hvf-all.c |  54 +
 accel/hvf/hvf-cpus.c| 462 
 accel/hvf/meson.build   |   7 +
 accel/meson.build   |   1 +
 include/sysemu/hvf_int.h|  54 +
 target/i386/hvf/hvf-cpus.c  | 131 --
 target/i386/hvf/hvf-cpus.h  |  25 --
 target/i386/hvf/hvf-i386.h  |  33 +--
 target/i386/hvf/hvf.c   | 360 +---
 target/i386/hvf/meson.build |   1 -
 target/i386/hvf/x86hvf.c|  11 +-
 target/i386/hvf/x86hvf.h|   2 -
 13 files changed, 596 insertions(+), 553 deletions(-)
 create mode 100644 accel/hvf/hvf-all.c
 create mode 100644 accel/hvf/hvf-cpus.c
 create mode 100644 accel/hvf/meson.build
 create mode 100644 include/sysemu/hvf_int.h
 delete mode 100644 target/i386/hvf/hvf-cpus.c
 delete mode 100644 target/i386/hvf/hvf-cpus.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3216387521..e589ec02e0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -448,7 +448,15 @@ M: Roman Bolshakov 
 W: https://wiki.qemu.org/Features/HVF
 S: Maintained
 F: target/i386/hvf/
+
+HVF
+M: Cameron Esfahani 
+M: Roman Bolshakov 
+W: https://wiki.qemu.org/Features/HVF
+S: Maintained
+F: accel/hvf/
 F: include/sysemu/hvf.h
+F: include/sysemu/hvf_int.h
 
 WHPX CPUs
 M: Sunil Muthuswamy 
diff --git a/accel/hvf/hvf-all.c b/accel/hvf/hvf-all.c
new file mode 100644
index 00..5b415eb0ed
--- /dev/null
+++ b/accel/hvf/hvf-all.c
@@ -0,0 +1,54 @@
+/*
+ * QEMU Hypervisor.framework support
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+#include "sysemu/hvf.h"
+#include "sysemu/hvf_int.h"
+#include "sysemu/runstate.h"
+
+#include "qemu/main-loop.h"
+#include "sysemu/accel.h"
+
+bool hvf_allowed;
+HVFState *hvf_state;
+
+void assert_hvf_ok(hv_return_t ret)
+{
+if (ret == HV_SUCCESS) {
+return;
+}
+
+switch (ret) {
+case HV_ERROR:
+error_report("Error: HV_ERROR");
+break;
+case HV_BUSY:
+error_report("Error: HV_BUSY");
+break;
+case HV_BAD_ARGUMENT:
+error_report("Error: HV_BAD_ARGUMENT");
+break;
+case HV_NO_RESOURCES:
+error_report("Error: HV_NO_RESOURCES");
+break;
+case HV_NO_DEVICE:
+error_report("Error: HV_NO_DEVICE");
+break;
+case HV_UNSUPPORTED:
+error_report("Error: HV_UNSUPPORTED");
+break;
+default:
+error_report("Unknown Error");
+}
+
+abort();
+}
diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
new file mode 100644
index 00..60f6d76bf3
--- /dev/null
+++ b/accel/hvf/hvf-cpus.c
@@ -0,0 +1,462 @@
+/*
+ * Copyright 2008 IBM Corporation
+ *   2008 Red Hat, Inc.
+ * Copyright 2011 Intel Corporation
+ * Copyright 2016 Veertu, Inc.
+ * Copyright 2017 The Android Open Source Project
+ *
+ * QEMU Hypervisor.framework support
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ *
+ * This file contain code under public domain from the hvdos project:
+ * https://github.com/mist64/hvdos
+ *
+ * Parts Copyright (c) 2011 NetApp, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY NETAPP,

[PATCH v6 04/11] hvf: Introduce hvf vcpu struct

2021-01-20 Thread Alexander Graf

We will need more than a single field for hvf going forward. To keep
the global vcpu struct uncluttered, let's allocate a special hvf vcpu
struct, similar to how hax does it.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 
Reviewed-by: Alex Bennée 

---

v4 -> v5:

  - Use g_free() on destroy
---
 accel/hvf/hvf-cpus.c|   8 +-
 include/hw/core/cpu.h   |   3 +-
 include/sysemu/hvf_int.h|   4 +
 target/i386/hvf/hvf.c   | 102 +-
 target/i386/hvf/vmx.h   |  24 +++--
 target/i386/hvf/x86.c   |  28 ++---
 target/i386/hvf/x86_descr.c |  26 ++---
 target/i386/hvf/x86_emu.c   |  62 +--
 target/i386/hvf/x86_mmu.c   |   4 +-
 target/i386/hvf/x86_task.c  |  12 +--
 target/i386/hvf/x86hvf.c| 210 ++--
 11 files changed, 247 insertions(+), 236 deletions(-)

diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
index 60f6d76bf3..2c6796604a 100644
--- a/accel/hvf/hvf-cpus.c
+++ b/accel/hvf/hvf-cpus.c
@@ -312,10 +312,12 @@ static void hvf_cpu_synchronize_pre_loadvm(CPUState *cpu)
 
 static void hvf_vcpu_destroy(CPUState *cpu)
 {
-hv_return_t ret = hv_vcpu_destroy(cpu->hvf_fd);
+hv_return_t ret = hv_vcpu_destroy(cpu->hvf->fd);
 assert_hvf_ok(ret);
 
 hvf_arch_vcpu_destroy(cpu);
+g_free(cpu->hvf);
+cpu->hvf = NULL;
 }
 
 static void dummy_signal(int sig)
@@ -326,6 +328,8 @@ static int hvf_init_vcpu(CPUState *cpu)
 {
 int r;
 
+cpu->hvf = g_malloc0(sizeof(*cpu->hvf));
+
 /* init cpu signals */
 sigset_t set;
 struct sigaction sigact;
@@ -337,7 +341,7 @@ static int hvf_init_vcpu(CPUState *cpu)
 pthread_sigmask(SIG_BLOCK, NULL, );
 sigdelset(, SIG_IPI);
 
-r = hv_vcpu_create((hv_vcpuid_t *)>hvf_fd, HV_VCPU_DEFAULT);
+r = hv_vcpu_create((hv_vcpuid_t *)>hvf->fd, HV_VCPU_DEFAULT);
 cpu->vcpu_dirty = 1;
 assert_hvf_ok(r);
 
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 140fa32a5e..9e1b61f63d 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -281,6 +281,7 @@ struct KVMState;
 struct kvm_run;
 
 struct hax_vcpu_state;
+struct hvf_vcpu_state;
 
 #define TB_JMP_CACHE_BITS 12
 #define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
@@ -464,7 +465,7 @@ struct CPUState {
 
 struct hax_vcpu_state *hax_vcpu;
 
-int hvf_fd;
+struct hvf_vcpu_state *hvf;
 
 /* track IOMMUs whose translations we've cached in the TCG TLB */
 GArray *iommu_notifiers;
diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
index 69de46db7d..9d3cb53e47 100644
--- a/include/sysemu/hvf_int.h
+++ b/include/sysemu/hvf_int.h
@@ -43,6 +43,10 @@ struct HVFState {
 };
 extern HVFState *hvf_state;
 
+struct hvf_vcpu_state {
+int fd;
+};
+
 void assert_hvf_ok(hv_return_t ret);
 int hvf_get_registers(CPUState *cpu);
 int hvf_put_registers(CPUState *cpu);
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 8b96ecd619..08b4adecd9 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -80,11 +80,11 @@ void vmx_update_tpr(CPUState *cpu)
 int tpr = cpu_get_apic_tpr(x86_cpu->apic_state) << 4;
 int irr = apic_get_highest_priority_irr(x86_cpu->apic_state);
 
-wreg(cpu->hvf_fd, HV_X86_TPR, tpr);
+wreg(cpu->hvf->fd, HV_X86_TPR, tpr);
 if (irr == -1) {
-wvmcs(cpu->hvf_fd, VMCS_TPR_THRESHOLD, 0);
+wvmcs(cpu->hvf->fd, VMCS_TPR_THRESHOLD, 0);
 } else {
-wvmcs(cpu->hvf_fd, VMCS_TPR_THRESHOLD, (irr > tpr) ? tpr >> 4 :
+wvmcs(cpu->hvf->fd, VMCS_TPR_THRESHOLD, (irr > tpr) ? tpr >> 4 :
   irr >> 4);
 }
 }
@@ -92,7 +92,7 @@ void vmx_update_tpr(CPUState *cpu)
 static void update_apic_tpr(CPUState *cpu)
 {
 X86CPU *x86_cpu = X86_CPU(cpu);
-int tpr = rreg(cpu->hvf_fd, HV_X86_TPR) >> 4;
+int tpr = rreg(cpu->hvf->fd, HV_X86_TPR) >> 4;
 cpu_set_apic_tpr(x86_cpu->apic_state, tpr);
 }
 
@@ -194,43 +194,43 @@ int hvf_arch_init_vcpu(CPUState *cpu)
 }
 
 /* set VMCS control fields */
-wvmcs(cpu->hvf_fd, VMCS_PIN_BASED_CTLS,
+wvmcs(cpu->hvf->fd, VMCS_PIN_BASED_CTLS,
   cap2ctrl(hvf_state->hvf_caps->vmx_cap_pinbased,
   VMCS_PIN_BASED_CTLS_EXTINT |
   VMCS_PIN_BASED_CTLS_NMI |
   VMCS_PIN_BASED_CTLS_VNMI));
-wvmcs(cpu->hvf_fd, VMCS_PRI_PROC_BASED_CTLS,
+wvmcs(cpu->hvf->fd, VMCS_PRI_PROC_BASED_CTLS,
   cap2ctrl(hvf_state->hvf_caps->vmx_cap_procbased,
   VMCS_PRI_PROC_BASED_CTLS_HLT |
   VMCS_PRI_PROC_BASED_CTLS_MWAIT |
   VMCS_PRI_PROC_BASED_CTLS_TSC_OFFSET |
   VMCS_PRI_PROC_BASED_CTLS_TPR_SHADOW) |
   VMCS_PRI_PROC_BASED_CTLS_SEC_CONTROL);
-wvmcs(cpu->hvf_fd, VMCS_SEC_PROC_BASED_CTLS,
+wvmcs(cpu->hvf->fd, VMCS_SEC_PROC_BASED_CTLS,
   cap2ctrl(hvf_state->hvf_caps->vmx_cap_procbased2,
VMCS_PRI_PROC_BASED2_CTLS_APIC_ACCESSES));
 
-wvmcs(cpu->hvf_fd, VMCS_ENTRY_CTLS,

[PATCH v6 01/11] hvf: Add hypervisor entitlement to output binaries

2021-01-20 Thread Alexander Graf

In macOS 11, QEMU only gets access to Hypervisor.framework if it has the
respective entitlement. Add an entitlement template and automatically self
sign and apply the entitlement in the build.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 

---

v1 -> v2:

  - Make safe to ctrl-C

v3 -> v4:

  - Remove unused exe_full variable
  - Reuse exe_name variable
---
 accel/hvf/entitlements.plist |  8 
 meson.build  | 29 +
 scripts/entitlement.sh   | 13 +
 3 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 accel/hvf/entitlements.plist
 create mode 100755 scripts/entitlement.sh

diff --git a/accel/hvf/entitlements.plist b/accel/hvf/entitlements.plist
new file mode 100644
index 00..154f3308ef
--- /dev/null
+++ b/accel/hvf/entitlements.plist
@@ -0,0 +1,8 @@
+
+http://www.apple.com/DTDs/PropertyList-1.0.dtd;>
+
+
+com.apple.security.hypervisor
+
+
+
diff --git a/meson.build b/meson.build
index 3d889857a0..c667d64498 100644
--- a/meson.build
+++ b/meson.build
@@ -2146,9 +2146,14 @@ foreach target : target_dirs
 }]
   endif
   foreach exe: execs
-emulators += {exe['name']:
- executable(exe['name'], exe['sources'],
-   install: true,
+exe_name = exe['name']
+exe_sign = 'CONFIG_HVF' in config_target
+if exe_sign
+  exe_name += '-unsigned'
+endif
+
+emulator = executable(exe_name, exe['sources'],
+   install: not exe_sign,
c_args: c_args,
dependencies: arch_deps + deps + exe['dependencies'],
objects: lib.extract_all_objects(recursive: true),
@@ -2156,7 +2161,23 @@ foreach target : target_dirs
link_depends: [block_syms, qemu_syms] + exe.get('link_depends', 
[]),
link_args: link_args,
gui_app: exe['gui'])
-}
+
+if exe_sign
+  emulators += {exe['name'] : custom_target(exe['name'],
+   install: true,
+   install_dir: get_option('bindir'),
+   depends: emulator,
+   output: exe['name'],
+   command: [
+ meson.current_source_dir() / 'scripts/entitlement.sh',
+ meson.current_build_dir() / exe_name,
+ meson.current_build_dir() / exe['name'],
+ meson.current_source_dir() / 
'accel/hvf/entitlements.plist'
+   ])
+  }
+else
+  emulators += {exe['name']: emulator}
+endif
 
 if 'CONFIG_TRACE_SYSTEMTAP' in config_host
   foreach stp: [
diff --git a/scripts/entitlement.sh b/scripts/entitlement.sh
new file mode 100755
index 00..c540fa6435
--- /dev/null
+++ b/scripts/entitlement.sh
@@ -0,0 +1,13 @@
+#!/bin/sh -e
+#
+# Helper script for the build process to apply entitlements
+
+SRC="$1"
+DST="$2"
+ENTITLEMENT="$3"
+
+trap 'rm "$DST.tmp"' exit
+cp -af "$SRC" "$DST.tmp"
+codesign --entitlements "$ENTITLEMENT" --force -s - "$DST.tmp"
+mv "$DST.tmp" "$DST"
+trap '' exit
-- 
2.24.3 (Apple Git-128)

[PATCH v6 05/11] arm: Set PSCI to 0.2 for HVF

2021-01-20 Thread Alexander Graf

In Hypervisor.framework, we just pass PSCI calls straight on to the QEMU 
emulation
of it. That means, if TCG is compatible with PSCI 0.2, so are we. Let's 
transpose
that fact in code too.

Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 

---

v3 -> v4:

  - Combine both if statements
---
 target/arm/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 40142ac141..f1929b5eba 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1063,8 +1063,8 @@ static void arm_cpu_initfn(Object *obj)
 cpu->psci_version = 1; /* By default assume PSCI v0.1 */
 cpu->kvm_target = QEMU_KVM_ARM_TARGET_NONE;
 
-if (tcg_enabled()) {
-cpu->psci_version = 2; /* TCG implements PSCI 0.2 */
+if (tcg_enabled() || hvf_enabled()) {
+cpu->psci_version = 2; /* TCG and HVF implement PSCI 0.2 */
 }
 }
 
-- 
2.24.3 (Apple Git-128)

[PATCH v6 00/11] hvf: Implement Apple Silicon Support

2021-01-20 Thread Alexander Graf

Now that Apple Silicon is widely available, people are obviously excited
to try and run virtualized workloads on them, such as Linux and Windows.

This patch set implements a fully functional version to get the ball
going on that. With this applied, I can successfully run both Linux and
Windows as guests. I am not aware of any limitations specific to
Hypervisor.framework apart from:

  - Live migration / savevm
  - gdbstub debugging (SP register)


Enjoy!

Alex

v1 -> v2:

  - New patch: hvf: Actually set SIG_IPI mask
  - New patch: hvf: Introduce hvf vcpu struct
  - New patch: hvf: arm: Mark CPU as dirty on reset
  - Removed patch: hw/arm/virt: Disable highmem when on hypervisor.framework
  - Removed patch: arm: Synchronize CPU on PSCI on
  - Fix build on 32bit arm
  - Merge vcpu kick function patch into ARM enablement
  - Implement WFI handling (allows vCPUs to sleep)
  - Synchronize system registers (fixes OVMF crashes and reboot)
  - Don't always call cpu_synchronize_state()
  - Use more fine grained iothread locking
  - Populate aa64mmfr0 from hardware
  - Make safe to ctrl-C entitlement application

v2 -> v3:

  - Removed patch: hvf: Actually set SIG_IPI mask
  - New patch: hvf: arm: Add support for GICv3
  - New patch: hvf: arm: Implement -cpu host
  - Advance PC on SMC
  - Use cp list interface for sysreg syncs
  - Do not set current_cpu
  - Fix sysreg isread mask
  - Move sysreg handling to functions
  - Remove WFI logic again
  - Revert to global iothread locking

v3 -> v4:

  - Removed patch: hvf: arm: Mark CPU as dirty on reset
  - New patch: hvf: Simplify post reset/init/loadvm hooks
  - Remove i386-softmmu target (meson.build for hvf target)
  - Combine both if statements (PSCI)
  - Use hv.h instead of Hypervisor.h for 10.15 compat
  - Remove manual inclusion of Hypervisor.h in common .c files
  - No longer include Hypervisor.h in arm hvf .c files
  - Remove unused exe_full variable
  - Reuse exe_name variable

v4 -> v5:

  - Use g_free() on destroy

v5 -> v6:

  - Switch SYSREG() macro order to the same as asm intrinsics

Alexander Graf (10):
  hvf: Add hypervisor entitlement to output binaries
  hvf: x86: Remove unused definitions
  hvf: Move common code out
  hvf: Introduce hvf vcpu struct
  arm: Set PSCI to 0.2 for HVF
  hvf: Simplify post reset/init/loadvm hooks
  hvf: Add Apple Silicon support
  arm: Add Hypervisor.framework build target
  hvf: arm: Add support for GICv3
  hvf: arm: Implement -cpu host

Peter Collingbourne (1):
  arm/hvf: Add a WFI handler

 MAINTAINERS  |  13 +
 accel/hvf/entitlements.plist |   8 +
 accel/hvf/hvf-all.c  |  54 +++
 accel/hvf/hvf-cpus.c | 466 +++
 accel/hvf/meson.build|   7 +
 accel/meson.build|   1 +
 include/hw/core/cpu.h|   3 +-
 include/sysemu/hvf.h |   2 +
 include/sysemu/hvf_int.h |  66 +++
 meson.build  |  40 +-
 scripts/entitlement.sh   |  13 +
 target/arm/cpu.c |  13 +-
 target/arm/cpu.h |   2 +
 target/arm/hvf/hvf.c | 856 +++
 target/arm/hvf/meson.build   |   3 +
 target/arm/kvm_arm.h |   2 -
 target/arm/meson.build   |   2 +
 target/i386/hvf/hvf-cpus.c   | 131 --
 target/i386/hvf/hvf-cpus.h   |  25 -
 target/i386/hvf/hvf-i386.h   |  49 +-
 target/i386/hvf/hvf.c| 462 +++
 target/i386/hvf/meson.build  |   1 -
 target/i386/hvf/vmx.h|  24 +-
 target/i386/hvf/x86.c|  28 +-
 target/i386/hvf/x86_descr.c  |  26 +-
 target/i386/hvf/x86_emu.c|  62 +--
 target/i386/hvf/x86_mmu.c|   4 +-
 target/i386/hvf/x86_task.c   |  12 +-
 target/i386/hvf/x86hvf.c | 224 -
 target/i386/hvf/x86hvf.h |   2 -
 30 files changed, 1786 insertions(+), 815 deletions(-)
 create mode 100644 accel/hvf/entitlements.plist
 create mode 100644 accel/hvf/hvf-all.c
 create mode 100644 accel/hvf/hvf-cpus.c
 create mode 100644 accel/hvf/meson.build
 create mode 100644 include/sysemu/hvf_int.h
 create mode 100755 scripts/entitlement.sh
 create mode 100644 target/arm/hvf/hvf.c
 create mode 100644 target/arm/hvf/meson.build
 delete mode 100644 target/i386/hvf/hvf-cpus.c
 delete mode 100644 target/i386/hvf/hvf-cpus.h

-- 
2.24.3 (Apple Git-128)

[PATCH v6 02/11] hvf: x86: Remove unused definitions

2021-01-20 Thread Alexander Graf

The hvf i386 has a few struct and cpp definitions that are never
used. Remove them.

Suggested-by: Roman Bolshakov 
Signed-off-by: Alexander Graf 
Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 
---
 target/i386/hvf/hvf-i386.h | 16 
 1 file changed, 16 deletions(-)

diff --git a/target/i386/hvf/hvf-i386.h b/target/i386/hvf/hvf-i386.h
index e0edffd077..e31938e5ff 100644
--- a/target/i386/hvf/hvf-i386.h
+++ b/target/i386/hvf/hvf-i386.h
@@ -21,21 +21,6 @@
 #include "cpu.h"
 #include "x86.h"
 
-#define HVF_MAX_VCPU 0x10
-
-extern struct hvf_state hvf_global;
-
-struct hvf_vm {
-int id;
-struct hvf_vcpu_state *vcpus[HVF_MAX_VCPU];
-};
-
-struct hvf_state {
-uint32_t version;
-struct hvf_vm *vm;
-uint64_t mem_quota;
-};
-
 /* hvf_slot flags */
 #define HVF_SLOT_LOG (1 << 0)
 
@@ -75,7 +60,6 @@ hvf_slot *hvf_find_overlap_slot(uint64_t, uint64_t);
 
 /* Host specific functions */
 int hvf_inject_interrupt(CPUArchState *env, int vector);
-int hvf_vcpu_run(struct hvf_vcpu_state *vcpu);
 #endif
 
 #endif
-- 
2.24.3 (Apple Git-128)

Re: [PATCH v3 00/12] hw/block/nvme: misc cmb/pmr patches and bump to v1.4

2021-01-20 Thread Klaus Jensen

On Jan 19 11:14, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> This is a resend of "hw/block/nvme: allow cmb and pmr to coexist" with
> some more PMR work added (PMR RDS/WDS support).
> 
> This includes a resurrection of Andrzej's series[1] from back July.
> 
> Andrzej's main patch basically moved the CMB from BAR 2 into an offset
> in BAR 4 (located after the MSI-X table and PBA). Having an offset on
> the CMB causes a bunch of calculations related to address mapping to
> change.
> 
> So, since I couldn't get the patch to apply cleanly I took a stab at
> implementing the suggestion I originally came up with: simply move the
> MSI-X table and PBA from BAR 4 into BAR 0 (up-aligned to a 4 KiB
> boundary after the main NVMe controller registers). This way we can keep
> the CMB at offset zero in its own BAR and free up BAR 4 for use by PMR.
> This makes the patch simpler and does not impact any of the existing
> address mapping code.
> 
>   [1]: 
> https://lore.kernel.org/qemu-devel/20200729220107.37758-1-andrzej.jakow...@linux.intel.com/
> 
> Changes for v3
> ~~
> 
>   - Fix a masking bug in "hw/block/nvme: fix 64 bit register hi/lo split
> writes" (Minwoo)
>   - Restore a deleted comment in "hw/block/nvme: remove redundant
> zeroing of PMR registers" (Minwoo)
>   - Restore the 'legacy-cmb' parameter from v1 to allow the device to
> exhibit the v1.3 CMB behavior.
> 
> Changes for v2
> ~~
> 
>   - Rebased on nvme-next
>   - Added a fix for 64 bit register hi/lo split writes
>   - Added the patches from "hw/block/nvme: cmb enhancements and bump to
> v1.4" to the back of this.
>   - As suggested by Keith, I removed "legacy CMB" support - the patch
> now exclusively bumps the support to the "v1.4 variant", so the
> linux kernel nvme gang have to get their game on ;)
> 
> Andrzej Jakowski (1):
>   hw/block/nvme: indicate CMB support through controller capabilities
> register
> 
> Klaus Jensen (9):
>   hw/block/nvme: add size to mmio read/write trace events
>   hw/block/nvme: fix 64 bit register hi/lo split writes
>   hw/block/nvme: move msix table and pba to BAR 0
>   hw/block/nvme: allow cmb and pmr to coexist
>   hw/block/nvme: rename PMR/CMB shift/mask fields
>   hw/block/nvme: remove redundant zeroing of PMR registers
>   hw/block/nvme: disable PMR at boot up
>   hw/block/nvme: bump to v1.4
>   hw/block/nvme: lift cmb restrictions
> 
> Naveen Nagar (1):
>   hw/block/nvme: add PMR RDS/WDS support
> 
> Padmakar Kalghatgi (1):
>   hw/block/nvme: move cmb logic to v1.4
> 
>  hw/block/nvme.h   |  17 +-
>  include/block/nvme.h  | 125 +--
>  hw/block/nvme.c   | 350 +++---
>  hw/block/trace-events |   6 +-
>  4 files changed, 351 insertions(+), 147 deletions(-)
> 

Thanks guys, your reviews are highly appreciated! :)

Applied to nvme-next.


signature.asc
Description: PGP signature

Re: [PATCH 01/11] block: add new BlockDriver handler: bdrv_cancel_in_flight

2021-01-20 Thread Eric Blake

On 11/18/20 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> It will be used to stop retrying NBD requests on mirror cancel.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block.h |  3 +++
>  include/block/block_int.h |  9 +
>  block/io.c| 11 +++
>  3 files changed, 23 insertions(+)
> 

How does this relate to the recent addition of the QMP yank command?


> +++ b/include/block/block_int.h
> @@ -344,6 +344,15 @@ struct BlockDriver {
>  bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum,
>  int64_t *map, BlockDriverState **file);
>  
> +/*
> + * This informs the driver that we are not more interested in in-flight

that we are no longer interested in the result of in-flight requests, so

> + * requests results, so don't waste the time if possible.
> + *
> + * The example usage is to not wait for nbd target nodedreconnect 
> timeout on
> + * job-cancel.

One example usage is to avoid waiting for an nbd target node reconnect
timeout during job-cancel.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 1/8] block/nbd: move initial connect to coroutine

2021-01-20 Thread Eric Blake

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> We are going to implement reconnect-on-open. Let's reuse existing
> reconnect loop. For this, do initial connect in connection coroutine.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 94 ++---
>  1 file changed, 53 insertions(+), 41 deletions(-)
> 

> @@ -2279,6 +2268,29 @@ static int nbd_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  bdrv_inc_in_flight(bs);
>  aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
>  
> +if (qemu_in_coroutine()) {
> +s->open_co = qemu_coroutine_self();
> +qemu_coroutine_yield();
> +} else {
> +BDRV_POLL_WHILE(bs, s->state == NBD_CLIENT_OPENING);
> +}
> +
> +if (s->state != NBD_CLIENT_CONNECTED && s->connect_status < 0) {
> +/*
> + * It's possible that state != NBD_CLIENT_CONNECTED, but 
> connect_status
> + * is 0. This means that initial connecting succeed, but failed later
> + * (during BDRV_POLL_WHILE). It's a rare case, but it happen in 
> iotest

This means that starting the initial connection succeeded, but we failed
later (during BDRV_POLL_WHILE).

happens

> + * 83. Let's don't care and just report success in this case: it not
> + * much differs from the case when connection failed immediately 
> after
> + * succeeded open.

We don't care, and just report success in this case, as it does not
change our behavior from the case when the connection fails right after
open succeeds.


> + */
> +assert(s->connect_err);
> +error_propagate(errp, s->connect_err);
> +s->connect_err = NULL;
> +nbd_clear_bdrvstate(s);
> +return s->connect_status;
> +}
> +
>  return 0;
>  }
>  
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 12/12] hw/block/nvme: lift cmb restrictions

2021-01-20 Thread Minwoo Im

Nice for codes much more clean.

Reviewed-by: Minwoo Im

Re: [PATCH v3 07/12] hw/block/nvme: remove redundant zeroing of PMR registers

2021-01-20 Thread Minwoo Im

Reviewed-by: Minwoo Im

Re: [PATCH v3 10/12] hw/block/nvme: move cmb logic to v1.4

2021-01-20 Thread Minwoo Im

Reviewed-by: Minwoo Im

Re: [PATCH v4 01/16] block: refactor bdrv_check_request: add errp

2021-01-20 Thread Eric Blake

On 12/11/20 12:39 PM, Vladimir Sementsov-Ogievskiy wrote:
> It's better to pass _abort than just assert that result is 0: on
> crash, we'll immediately see the reason in the backtrace.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block_int.h|  2 +-
>  block/file-posix.c   |  2 +-
>  block/io.c   | 29 ++---
>  tests/test-write-threshold.c |  5 +++--
>  4 files changed, 27 insertions(+), 11 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [RFC PATCH V3 8/8] hw/block/nvme: Add Identify Active Namespace ID List

2021-01-20 Thread Minwoo Im

> Hello Minwoo,
> 
> By introducing a detached parameter,
> you are also implicitly making the following
> NVMe commands no longer be spec compliant:
> 
> NVME_ID_CNS_NS, NVME_ID_CNS_CS_NS,
> NVME_ID_CNS_NS_ACTIVE_LIST, NVME_ID_CNS_CS_NS_ACTIVE_LIST
> 
> When these commands are called on a detached namespace,
> they should usually return a zero filled data struct.

Agreed.

> Dmitry and I had a patch on V8 on the ZNS series
> that tried to fix some the existing NVMe commands
> to be spec compliant, by handling detached namespaces
> properly. In the end, in order to make it easier to
> get the ZNS series accepted, we decided to drop the
> detached related stuff from the series.
> 
> Feel free to look at that patch for inspiration:
> https://github.com/dmitry-fomichev/qemu/commit/251c0ffee5149c739b1347811fa7e32a1c36bf7c

I've seen this patch and as Klaus said, only thing patches need go with
is to put some of nvme_ns_is_attached() helper among the Identify
handlers.

> I'm not sure if you want to modify all the functions that
> our patch modifies, but I think that you should at least
> modify the following nvme functions:
> 
> nvme_identify_ns()
> nvme_identify_ns_csi()
> nvme_identify_nslist()
> nvme_identify_nslist_csi()

Yes, pretty makes sense to update them.  But now it seems like
'attach/detach' scheme should have been separated out of this series
which just introduced the multi-path for controllers and namespace
sharing.  I will drop this 'detach' scheme out of this series and make
another series to support all of the Identify you mentioned above
cleanly.

> So they handle detached namespaces correctly for both:
> NVME_ID_CNS_NS, NVME_ID_CNS_CS_NS,
> NVME_ID_CNS_NS_ACTIVE_LIST, NVME_ID_CNS_CS_NS_ACTIVE_LIST
> as well as for:
> NVME_ID_CNS_NS_PRESENT, NVME_ID_CNS_CS_NS_PRESENT,
> NVME_ID_CNS_NS_PRESENT_LIST, NVME_ID_CNS_CS_NS_PRESENT_LIST
> 
> 
> Kind regards,
> Niklas

Re: [PATCH 0/4] Introduce a battery, AC adapter, and lid button

2021-01-20 Thread Philippe Mathieu-Daudé

Hi Leonid, Marcel,

On 1/20/21 9:54 PM, Leonid Bloch wrote:
> This series introduces the following ACPI devices:
> 
> * Battery
> * AC adapter
> * Laptop lid button
> 
> When running QEMU on a laptop, these paravirtualized devices reflect the
> state of these physical devices onto the guest. This functionality is
> relevant not only for laptops, but also for any other device which has e.g.
> a battery. This even allows to insert a ``fake'' battery to the
> guest, in a form of a file which emulates the behavior of the actual
> battery in sysfs. A possible use case for such a ``fake'' battery can be
> limiting the budget of VM usage to a subscriber, in a naturally-visible way.

Your series looks good. Now for this feature to be even more useful for
the community, it would be better to

1/ Have a generic (kind of abstract QDev) battery model.
   Your model would be the ISA implementation. But we could add LPC,
   SPI or I2C implementations for example.

2/ Make it a model backend accepting various kind of frontends:
   - host Linux sysfs mirroring is a particular frontend implementation
   - mirroring on Windows would be another
   - any connection (TCP) to battery simulator (Octave, ...)

Meanwhile 2/ is not available, it would be useful to have QMP commands
to set the battery charge and state (also max capacity).

Ditto QMP command to set the LID/AC adapter state.

> But of course, the main purpose here is addressing the desktop users.
> 
> This series was tested with Windows and (desktop) Linux guests, on which
> indeed the battery icon appears in the corresponding state (full,
> charging, discharging, time remaining to empty, etc.) and the AC adapter
> plugging/unplugging behaves as expected. So is the laptop lid button.
[...]

In patch #2 you comment 'if a "fake" host battery is to be provided,
a 'sysfs_path' property allows to override the default one.'.

Eventually you'd provide a such fake file as example, ideally used
by a QTest.

Another question. If the battery is disconnected, is there an event
propagated to the guest?

Thanks for contributing these patches :)

Phil.

Re: [PATCH] Deprecate pmem=on with non-DAX capable backend file

2021-01-20 Thread Philippe Mathieu-Daudé

Cc'ing MST.

On 1/20/21 8:31 PM, Igor Mammedov wrote:
> On Mon, 11 Jan 2021 15:33:32 -0500
> Igor Mammedov  wrote:
> 
>> It is not safe to pretend that emulated NVDIMM supports
>> persistence while backend actually failed to enable it
>> and used non-persistent mapping as fall back.
>> Instead of falling-back, QEMU should be more strict and
>> error out with clear message that it's not supported.
>> So if user asks for persistence (pmem=on), they should
>> store backing file on NVDIMM.
>>
>> Signed-off-by: Igor Mammedov 
>> Reviewed-by: Philippe Mathieu-Daudé 
>> ---
>> v2:
>>   rephrase deprecation comment andwarning message
>>   (Philippe Mathieu-Daudé )
> 
> I've posted as v1 though it's v2 and it looks like it fell through cracks,
> 
> can someone pick it up if it looks fine, please?
> 
>> ---
>>  docs/system/deprecated.rst | 17 +
>>  util/mmap-alloc.c  |  3 +++
>>  2 files changed, 20 insertions(+)
>>
>> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
>> index bacd76d7a5..e79fb02b3a 100644
>> --- a/docs/system/deprecated.rst
>> +++ b/docs/system/deprecated.rst
>> @@ -327,6 +327,23 @@ The Raspberry Pi machines come in various models (A, 
>> A+, B, B+). To be able
>>  to distinguish which model QEMU is implementing, the ``raspi2`` and 
>> ``raspi3``
>>  machines have been renamed ``raspi2b`` and ``raspi3b``.
>>  
>> +Backend options
>> +---
>> +
>> +Using non-persistent backing file with pmem=on (since 6.0)
>> +''
>> +
>> +This option is used when ``memory-backend-file`` is consumed by emulated 
>> NVDIMM
>> +device. However enabling ``memory-backend-file.pmem`` option, when backing 
>> file
>> +is (a) not DAX capable or (b) not on a filesystem that support direct 
>> mapping
>> +of persistent memory, is not safe and may lead to data loss or corruption 
>> in case
>> +of host crash.
>> +Options are:
>> +- modify VM configuration to set ``pmem=off`` to continue using fake 
>> NVDIMM
>> +  (without persistence guaranties) with backing file on non DAX storage
>> +- move backing file to NVDIMM storage and keep ``pmem=on``
>> +  (to have NVDIMM with persistence guaranties).
>> +
>>  Device options
>>  --
>>  
>> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
>> index 27dcccd8ec..0388cc3be2 100644
>> --- a/util/mmap-alloc.c
>> +++ b/util/mmap-alloc.c
>> @@ -20,6 +20,7 @@
>>  #include "qemu/osdep.h"
>>  #include "qemu/mmap-alloc.h"
>>  #include "qemu/host-utils.h"
>> +#include "qemu/error-report.h"
>>  
>>  #define HUGETLBFS_MAGIC   0x958458f6
>>  
>> @@ -166,6 +167,8 @@ void *qemu_ram_mmap(int fd,
>>  "crash.\n", file_name);
>>  g_free(proc_link);
>>  g_free(file_name);
>> +warn_report("Using non DAX backing file with 'pmem=on' option"
>> +" is deprecated");
>>  }
>>  /*
>>   * if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
>

[PATCH 3/4] hw/acpi: Introduce the QEMU AC adapter

2021-01-20 Thread Leonid Bloch

The AC adapter device communicates the host's AC adapter state to the
guest. The probing of the host's AC adapter state occurs on guest ACPI
requests, as well as on timed intervals. If a change of the host's AC
adapter state is detected on one of the timed probes (connected/disconnected)
an ACPI notification is sent to the guest, so it will be able to
update its AC adapter status accordingly.

The time interval between the periodic probes is 2 seconds by default.
A property 'probe_interval' allows to modify this value. The value
should be provided in milliseconds. A zero value disables the periodic
probes, and makes the AC adapter state updates occur on guest requests
only.

The host's AC adapter information is taken from the sysfs AC adapter
data, located in:

/sys/class/power_supply/[device of type "Mains"]

If the sysfs path differs, a different AC adapter needs to be probed,
or even if a "fake" host AC adapter is to be provided, a 'sysfs_path'
property allows to override the default one.

Signed-off-by: Leonid Bloch 
---
 MAINTAINERS  |   5 +
 docs/specs/acad.txt  |  24 ++
 hw/acpi/Kconfig  |   4 +
 hw/acpi/acad.c   | 318 +++
 hw/acpi/meson.build  |   1 +
 hw/acpi/trace-events |   5 +
 hw/i386/Kconfig  |   1 +
 hw/i386/acpi-build.c |  52 +
 include/hw/acpi/acad.h   |  37 
 include/hw/acpi/acpi_dev_interface.h |   1 +
 10 files changed, 448 insertions(+)
 create mode 100644 docs/specs/acad.txt
 create mode 100644 hw/acpi/acad.c
 create mode 100644 include/hw/acpi/acad.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 33eea28c22..1a3fc1dd56 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2146,6 +2146,11 @@ M: Leonid Bloch 
 S: Maintained
 F: hw/acpi/battery.*
 
+AC Adapter
+M: Leonid Bloch 
+S: Maintained
+F: hw/acpi/acad.*
+
 Subsystems
 --
 Audio
diff --git a/docs/specs/acad.txt b/docs/specs/acad.txt
new file mode 100644
index 00..0d563a7b50
--- /dev/null
+++ b/docs/specs/acad.txt
@@ -0,0 +1,24 @@
+AC ADAPTER DEVICE
+=
+
+The AC adapter device communicates the host's AC adapter state to the
+guest. The probing of the host's AC adapter state occurs on guest ACPI
+requests, as well as on timed intervals. If a change of the host's AC
+adapter state is detected on one of the timed probes (connected/disconnected)
+an ACPI notification is sent to the guest, so it will be able to
+update its AC adapter status accordingly.
+
+The time interval between the periodic probes is 2 s by default.
+A property 'probe_interval' allows to modify this value. The value
+should be provided in milliseconds. A zero value disables the periodic
+probes, and makes the AC adapter state updates occur on guest requests
+only.
+
+The host's AC adapter information is taken from the sysfs AC adapter
+data, located in:
+
+/sys/class/power_supply/[device of type "Mains"]
+
+If the sysfs path differs, a different AC adapter needs to be probed,
+or even if a "fake" host AC adapter is to be provided, a 'sysfs_path'
+property allows to override the default one.
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 6b4c41037a..d35331fbf7 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -45,4 +45,8 @@ config BATTERY
 bool
 depends on ACPI
 
+config AC_ADAPTER
+bool
+depends on ACPI
+
 config ACPI_HW_REDUCED
diff --git a/hw/acpi/acad.c b/hw/acpi/acad.c
new file mode 100644
index 00..279452e95f
--- /dev/null
+++ b/hw/acpi/acad.c
@@ -0,0 +1,318 @@
+/*
+ * QEMU emulated AC adapter device.
+ *
+ * Copyright (c) 2019 Janus Technologies, Inc. (http://janustech.com)
+ *
+ * Authors:
+ * Leonid Bloch 
+ * Marcel Apfelbaum 
+ * Dmitry Fleytman 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory for details.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "hw/isa/isa.h"
+#include "hw/acpi/acpi.h"
+#include "hw/nvram/fw_cfg.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+
+#include "hw/acpi/acad.h"
+
+#define AC_ADAPTER_DEVICE(obj) OBJECT_CHECK(ACADState, (obj), \
+TYPE_AC_ADAPTER)
+
+#define AC_STA_ADDR   0
+
+#define SYSFS_PATH"/sys/class/power_supply"
+#define AC_ADAPTER_TYPE   "Mains"
+#define MAX_ALLOWED_TYPE_LENGTH   16
+
+enum {
+AC_ADAPTER_OFFLINE = 0,
+AC_ADAPTER_ONLINE = 1,
+};
+
+typedef struct ACADState {
+ISADevice dev;
+MemoryRegion io;
+uint16_t ioport;
+uint8_t state;
+
+QEMUTimer *probe_state_timer;
+uint64_t probe_state_interval;
+
+char *acad_path;
+} ACADState;
+
+static const char *online_file = "online";
+static const char *type_file = "type";
+
+static inline bool

[PATCH 2/4] hw/acpi: Introduce the QEMU Battery

2021-01-20 Thread Leonid Bloch

The battery device communicates the host's battery state to the guest.
The probing of the host's battery state occurs on guest ACPI requests,
as well as on timed intervals. If a change of the host's battery state is
detected on one of the timed probes (charging/discharging, rate, charge)
an ACPI notification is sent to the guest, so it will be able to update
its battery status accordingly.

The time interval between periodic probes is 2 seconds by default.
A property 'probe_interval' allows to modify this value. The value should
be provided in milliseconds. A zero value disables the periodic probes,
and makes the battery state updates occur on guest requests only.

The host's battery information is taken from the sysfs battery data,
located in:

/sys/class/power_supply/[device of type "Battery"]

If the sysfs path differs, a different battery needs to be probed, or
even if a "fake" host battery is to be provided, a 'sysfs_path' property
allows to override the default one.

Signed-off-by: Leonid Bloch 
Signed-off-by: Marcel Apfelbaum 
---
 MAINTAINERS  |   5 +
 docs/specs/battery.txt   |  23 ++
 hw/acpi/Kconfig  |   4 +
 hw/acpi/battery.c| 512 +++
 hw/acpi/meson.build  |   1 +
 hw/acpi/trace-events |   5 +
 hw/i386/Kconfig  |   1 +
 hw/i386/acpi-build.c |  97 +
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/battery.h|  43 +++
 10 files changed, 692 insertions(+)
 create mode 100644 docs/specs/battery.txt
 create mode 100644 hw/acpi/battery.c
 create mode 100644 include/hw/acpi/battery.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3216387521..33eea28c22 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2141,6 +2141,11 @@ F: net/can/*
 F: hw/net/can/*
 F: include/net/can_*.h
 
+Battery
+M: Leonid Bloch 
+S: Maintained
+F: hw/acpi/battery.*
+
 Subsystems
 --
 Audio
diff --git a/docs/specs/battery.txt b/docs/specs/battery.txt
new file mode 100644
index 00..e90324ac03
--- /dev/null
+++ b/docs/specs/battery.txt
@@ -0,0 +1,23 @@
+BATTERY DEVICE
+==
+
+The battery device communicates the host's battery state to the guest.
+The probing of the host's battery state occurs on guest ACPI requests,
+as well as on timed intervals. If a change of the host's battery state is
+detected on one of the timed probes (charging/discharging, rate, charge)
+an ACPI notification is sent to the guest, so it will be able to
+update its battery status accordingly.
+
+The time interval between periodic probes is 2 s by default. A property
+'probe_interval' allows to modify this value. The value should be
+provided in milliseconds. A zero value disables the periodic probes,
+and makes the battery state updates occur on guest requests only.
+
+The host's battery information is taken from the sysfs battery data,
+located in:
+
+/sys/class/power_supply/[device of type "Battery"]
+
+If the sysfs path differs, a different battery needs to be probed,
+or even if a "fake" host battery is to be provided, a 'sysfs_path'
+property allows to override the default one.
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1932f66af8..6b4c41037a 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -41,4 +41,8 @@ config ACPI_VMGENID
 default y
 depends on PC
 
+config BATTERY
+bool
+depends on ACPI
+
 config ACPI_HW_REDUCED
diff --git a/hw/acpi/battery.c b/hw/acpi/battery.c
new file mode 100644
index 00..afd82594b1
--- /dev/null
+++ b/hw/acpi/battery.c
@@ -0,0 +1,512 @@
+/*
+ * QEMU emulated battery device.
+ *
+ * Copyright (c) 2019 Janus Technologies, Inc. (http://janustech.com)
+ *
+ * Authors:
+ * Leonid Bloch 
+ * Marcel Apfelbaum 
+ * Dmitry Fleytman 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory for details.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "hw/isa/isa.h"
+#include "hw/acpi/acpi.h"
+#include "hw/nvram/fw_cfg.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+
+#include "hw/acpi/battery.h"
+
+#define BATTERY_DEVICE(obj) OBJECT_CHECK(BatteryState, (obj), TYPE_BATTERY)
+
+#define BATTERY_DISCHARGING  1
+#define BATTERY_CHARGING 2
+
+#define SYSFS_PATH   "/sys/class/power_supply"
+#define BATTERY_TYPE "Battery"
+
+#define MAX_ALLOWED_STATE_LENGTH  32  /* For convinience when comparing */
+
+#define NORMALIZE_BY_FULL(val, full) \
+((full == 0) ? BATTERY_VAL_UNKNOWN \
+ : (uint32_t)(val * BATTERY_FULL_CAP / full))
+
+typedef union bat_metric {
+uint32_t val;
+uint8_t acc[4];
+} bat_metric;
+
+typedef struct BatteryState {
+ISADevice dev;
+MemoryRegion io;
+uint16_t ioport;
+bat_metric state;
+bat_metric rate;
+bat_metric charge;
+uint32_t

[PATCH 4/4] hw/acpi: Introduce the QEMU lid button

2021-01-20 Thread Leonid Bloch

The button device communicates the host's "lid button" state to the guest.
The probing of the host's lid button state occurs on timed intervals.
If a change of the host's lid button state is detected (open/closed)
an ACPI notification is sent to the guest, so it will be able to act
accordingly.

The time interval between the periodic probes is 2 s by default.
A property 'probe_interval' allows to modify this value. The value
is provided in milliseconds.

The host's lid button information is taken from:

/proc/acpi/button/lid/*/state

And this file is expected to be formatted as:
'state:  open' (if the lid is open)
or:
'state:  closed' (if the lid is closed)

These are based on the Linux 'button' driver.

If the above procfs path differs, or even if a "fake" host lid button
is to be provided, a 'procfs_path' property allows to override the
default path.

Usage example (default values):

'-device button'

Is the same as:

'-device button,procfs_path=/proc/acpi/button,probe_interval=2000'

Signed-off-by: Leonid Bloch 
---
 MAINTAINERS  |   5 +
 docs/specs/button.txt|  35 +++
 hw/acpi/Kconfig  |   4 +
 hw/acpi/button.c | 327 +++
 hw/acpi/meson.build  |   1 +
 hw/acpi/trace-events |   5 +
 hw/i386/Kconfig  |   1 +
 hw/i386/acpi-build.c |  29 +++
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/button.h |  35 +++
 10 files changed, 443 insertions(+)
 create mode 100644 docs/specs/button.txt
 create mode 100644 hw/acpi/button.c
 create mode 100644 include/hw/acpi/button.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 1a3fc1dd56..72bead7023 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2151,6 +2151,11 @@ M: Leonid Bloch 
 S: Maintained
 F: hw/acpi/acad.*
 
+Button
+M: Leonid Bloch 
+S: Maintained
+F: hw/acpi/button.*
+
 Subsystems
 --
 Audio
diff --git a/docs/specs/button.txt b/docs/specs/button.txt
new file mode 100644
index 00..48e9f3388d
--- /dev/null
+++ b/docs/specs/button.txt
@@ -0,0 +1,35 @@
+BUTTON DEVICE
+=
+
+The button device communicates the host's "lid button" state to the guest.
+The probing of the host's lid button state occurs on timed intervals.
+If a change of the host's lid button state is detected (open/closed)
+an ACPI notification is sent to the guest, so it will be able to act
+accordingly.
+
+The time interval between the periodic probes is 2 s by default.
+A property 'probe_interval' allows to modify this value. The value
+is provided in milliseconds.
+
+The host's lid button information is taken from:
+
+/proc/acpi/button/lid/*/state
+
+And this file is expected to be formatted as:
+'state:  open' (if the lid is open)
+or:
+'state:  closed' (if the lid is closed)
+
+These are based on the Linux 'button' driver.
+
+If the above procfs path differs, or even if a "fake" host lid button
+is to be provided, a 'procfs_path' property allows to override the
+default path.
+
+Usage example (default values):
+
+'-device button'
+
+Is the same as:
+
+'-device button,procfs_path=/proc/acpi/button,probe_interval=2000'
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index d35331fbf7..0e78560f0f 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -49,4 +49,8 @@ config AC_ADAPTER
 bool
 depends on ACPI
 
+config BUTTON
+bool
+depends on ACPI
+
 config ACPI_HW_REDUCED
diff --git a/hw/acpi/button.c b/hw/acpi/button.c
new file mode 100644
index 00..edca46ce6d
--- /dev/null
+++ b/hw/acpi/button.c
@@ -0,0 +1,327 @@
+/*
+ * QEMU emulated lid button device
+ *
+ * Copyright (c) 2019 Janus Technologies, Inc. (http://janustech.com)
+ *
+ * Authors:
+ * Leonid Bloch 
+ * Marcel Apfelbaum 
+ * Dmitry Fleytman 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory for details.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "hw/isa/isa.h"
+#include "hw/acpi/acpi.h"
+#include "hw/nvram/fw_cfg.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+
+#include "hw/acpi/button.h"
+
+#define BUTTON_DEVICE(obj) OBJECT_CHECK(BUTTONState, (obj), \
+TYPE_BUTTON)
+
+#define BUTTON_STA_ADDR0
+
+#define PROCFS_PATH"/proc/acpi/button"
+#define LID_DIR"lid"
+#define LID_STATE_FILE "state"
+#define MIN_BUTTON_PROBE_INTERVAL  10  /* ms */
+#define MAX_ALLOWED_LINE_LENGTH32  /* For convenience when comparing */
+
+enum {
+LID_CLOSED = 0,
+LID_OPEN = 1,
+};
+
+static const char *lid_state[] = { "closed", "open" };
+
+typedef struct BUTTONState {
+ISADevice dev;
+MemoryRegion io;
+uint16_t ioport;
+uint8_t lid_state;
+
+QEMUTimer *probe_state_timer;
+

[PATCH 1/4] hw/acpi: Increase the number of possible ACPI interrupts

2021-01-20 Thread Leonid Bloch

Increase the number of possible ACPI interrupts from 8, to the maximum
available: 64 by default.

Signed-off-by: Leonid Bloch 
---
 hw/acpi/core.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 7170bff657..71ba7c17b8 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -706,19 +706,32 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, uint32_t 
addr)
 void acpi_send_gpe_event(ACPIREGS *ar, qemu_irq irq,
  AcpiEventStatusBits status)
 {
-ar->gpe.sts[0] |= status;
+int i;
+
+AcpiEventStatusBits st = status;
+for (i = 0; i < ar->gpe.len / 2; i++) {
+ar->gpe.sts[i] |= st;
+st >>= (sizeof(ar->gpe.sts[0]) * CHAR_BIT);
+}
+
 acpi_update_sci(ar, irq);
 }
 
 void acpi_update_sci(ACPIREGS *regs, qemu_irq irq)
 {
 int sci_level, pm1a_sts;
+uint64_t gpe_sci = 0;
+int i;
 
 pm1a_sts = acpi_pm1_evt_get_sts(regs);
 
+for (i = 0; i < regs->gpe.len / 2; i++) {
+gpe_sci |= (regs->gpe.sts[i] & regs->gpe.en[i]);
+}
+
 sci_level = ((pm1a_sts &
   regs->pm1.evt.en & ACPI_BITMASK_PM1_COMMON_ENABLED) != 0) ||
-((regs->gpe.sts[0] & regs->gpe.en[0]) != 0);
+(gpe_sci != 0);
 
 qemu_set_irq(irq, sci_level);
 
-- 
2.30.0

[PATCH 0/4] Introduce a battery, AC adapter, and lid button

2021-01-20 Thread Leonid Bloch

This series introduces the following ACPI devices:

* Battery
* AC adapter
* Laptop lid button

When running QEMU on a laptop, these paravirtualized devices reflect the
state of these physical devices onto the guest. This functionality is
relevant not only for laptops, but also for any other device which has e.g.
a battery. This even allows to insert a ``fake'' battery to the
guest, in a form of a file which emulates the behavior of the actual
battery in sysfs. A possible use case for such a ``fake'' battery can be
limiting the budget of VM usage to a subscriber, in a naturally-visible way.
But of course, the main purpose here is addressing the desktop users.

This series was tested with Windows and (desktop) Linux guests, on which
indeed the battery icon appears in the corresponding state (full,
charging, discharging, time remaining to empty, etc.) and the AC adapter
plugging/unplugging behaves as expected. So is the laptop lid button.

For the ease of review, these commits are also available here:
https://github.com/blochl/qemu/pull/1/commits


Thanks,
Leonid.

Leonid Bloch (4):
  hw/acpi: Increase the number of possible ACPI interrupts
  hw/acpi: Introduce the QEMU Battery
  hw/acpi: Introduce the QEMU AC adapter
  hw/acpi: Introduce the QEMU lid button

 MAINTAINERS  |  15 +
 docs/specs/acad.txt  |  24 ++
 docs/specs/battery.txt   |  23 ++
 docs/specs/button.txt|  35 ++
 hw/acpi/Kconfig  |  12 +
 hw/acpi/acad.c   | 318 +
 hw/acpi/battery.c| 512 +++
 hw/acpi/button.c | 327 +
 hw/acpi/core.c   |  17 +-
 hw/acpi/meson.build  |   3 +
 hw/acpi/trace-events |  15 +
 hw/i386/Kconfig  |   3 +
 hw/i386/acpi-build.c | 178 ++
 include/hw/acpi/acad.h   |  37 ++
 include/hw/acpi/acpi_dev_interface.h |   3 +
 include/hw/acpi/battery.h|  43 +++
 include/hw/acpi/button.h |  35 ++
 17 files changed, 1598 insertions(+), 2 deletions(-)
 create mode 100644 docs/specs/acad.txt
 create mode 100644 docs/specs/battery.txt
 create mode 100644 docs/specs/button.txt
 create mode 100644 hw/acpi/acad.c
 create mode 100644 hw/acpi/battery.c
 create mode 100644 hw/acpi/button.c
 create mode 100644 include/hw/acpi/acad.h
 create mode 100644 include/hw/acpi/battery.h
 create mode 100644 include/hw/acpi/button.h

-- 
2.30.0

Re: [PATCH v7 00/11] Rework iotests/check

2021-01-20 Thread Eric Blake

On 1/16/21 7:44 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> These series has 3 goals:
> 
>  - get rid of group file (to forget about rebase and in-list conflicts)
>  - introduce human-readable names for tests
>  - rewrite check into python
> 
> v7:
>   - fix wording and grammar
>   - satisfy python linters
>   - move argv interfaces all into one in new check script
>   - support '-n' == '--dry-run' option
>   - update check-block to run check with correct PYTHON

I'd really like a second reviewer on 7-11, but I'm queueing 1-6 on my
NBD tree now.

> 
>  findtests: - stop parsing test file after first '# group: ' line
> - use parse_test_name in add_group_file()
> - make new list instead of modifying parameter exclude_groups
> 
>  testenv: - fix machine_map
>   - fix self.python
> 
>  testrunner: use env.python to run python tests
> 
> Vladimir Sementsov-Ogievskiy (11):
>   iotests/277: use dot slash for nbd-fault-injector.py running
>   iotests/303: use dot slash for qcow2.py running
>   iotests: fix some whitespaces in test output files
>   iotests: make tests executable
>   iotests/294: add shebang line
>   iotests: define group in each iotest
>   iotests: add findtests.py
>   iotests: add testenv.py
>   iotests: add testrunner.py
>   iotests: rewrite check into python
>   iotests: rename and move 169 and 199 tests
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 0/4] nbd/server: Quiesce coroutines on context switch

2021-01-20 Thread Eric Blake

On 12/14/20 11:05 AM, Sergio Lopez wrote:
> This series allows the NBD server to properly switch between AIO contexts,
> having quiesced recv_coroutine and send_coroutine before doing the transition.
> 
> We need this because we send back devices running in IO Thread owned contexts
> to the main context when stopping the data plane, something that can happen
> multiple times during the lifetime of a VM (usually during the boot sequence 
> or
> on a reboot), and we drag the NBD server of the correspoing export with it.
> 
> While there, fix also a problem caused by a cross-dependency between
> closing the export's client connections and draining the block
> layer. The visible effect of this problem was QEMU getting hung when
> the guest request a power off while there's an active NBD client.
> 
> v2:
>  - Replace "virtio-blk: Acquire context while switching them on
>  dataplane start" with "block: Honor blk_set_aio_context() context
>  requirements" (Kevin Wolf)
>  - Add "block: Avoid processing BDS twice in
>  bdrv_set_aio_context_ignore()"
>  - Add "block: Close block exports in two steps"
>  - Rename nbd_read_eof() to nbd_server_read_eof() (Eric Blake)
>  - Fix double space and typo in comment. (Eric Blake)

ping - where do we stand on this series?  Patches 1 and 3 have positive
reviews, I'll queue them now; patches 2 and 4 had enough comments that
I'm guessing I should wait for a v3?


> 
> Sergio Lopez (4):
>   block: Honor blk_set_aio_context() context requirements
>   block: Avoid processing BDS twice in bdrv_set_aio_context_ignore()
>   nbd/server: Quiesce coroutines on context switch
>   block: Close block exports in two steps
> 
>  block.c |  27 ++-
>  block/export/export.c   |  10 +--
>  blockdev-nbd.c  |   2 +-
>  hw/block/dataplane/virtio-blk.c |   4 ++
>  hw/block/dataplane/xen-block.c  |   7 +-
>  hw/scsi/virtio-scsi.c   |   6 +-
>  include/block/export.h  |   4 +-
>  nbd/server.c| 120 
>  qemu-nbd.c  |   2 +-
>  stubs/blk-exp-close-all.c   |   2 +-
>  10 files changed, 156 insertions(+), 28 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option

2021-01-20 Thread Eduardo Habkost

On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:
> On Wed, 20 Jan 2021 15:38:33 +0100
> Vitaly Kuznetsov  wrote:
> 
> > Igor Mammedov  writes:
> > 
> > > On Fri, 15 Jan 2021 10:20:23 +0100
> > > Vitaly Kuznetsov  wrote:
> > >  
> > >> Igor Mammedov  writes:
> > >>   
> > >> > On Thu,  7 Jan 2021 16:14:49 +0100
> > >> > Vitaly Kuznetsov  wrote:
> > >> >
> > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as 
> > >> >> it
> > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> > >> >> everything but it can't be used in production as it prevents 
> > >> >> migration.
> > >> >> 
> > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently 
> > >> >> supported
> > >> >> Hyper-V enlightenments. Later, when new enlightenments get 
> > >> >> implemented,
> > >> >> compat_props mechanism will be used to disable them for legacy 
> > >> >> machine types,
> > >> >> this will keep 'hv-default=on' configurations migratable.
> > >> >> 
> > >> >> Signed-off-by: Vitaly Kuznetsov 
> > >> >> ---
> > >> >>  docs/hyperv.txt   | 16 +---
> > >> >>  target/i386/cpu.c | 38 ++
> > >> >>  target/i386/cpu.h |  5 +
> > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> > >> >> 
> > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> > >> >> index 5df00da54fc4..a54c066cab09 100644
> > >> >> --- a/docs/hyperv.txt
> > >> >> +++ b/docs/hyperv.txt
> > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific 
> > >> >> features.
> > >> >>  
> > >> >>  2. Setup
> > >> >>  =
> > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or 
> > >> >> QEMU. In
> > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, 
> > >> >> e.g:
> > >> >> +All currently supported Hyper-V enlightenments can be enabled by 
> > >> >> specifying
> > >> >> +'hv-default=on' CPU flag:
> > >> >>  
> > >> >> -  qemu-system-x86_64 --enable-kvm --cpu 
> > >> >> host,hv_relaxed,hv_vpindex,hv_time, ...
> > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > >> >> +
> > >> >> +Alternatively, it is possible to do fine-grained enablement through 
> > >> >> CPU flags,
> > >> >> +e.g:
> > >> >> +
> > >> >> +  qemu-system-x86_64 --enable-kvm --cpu 
> > >> >> host,hv-relaxed,hv-vpindex,hv-time ...
> > >> >
> > >> > I'd put here not '...' but rather recommended list of flags, and update
> > >> > it every time when new feature added if necessary.
> > >> >
> > >
> > > 1)
> > >
> > >> This is an example of fine-grained enablement, there is no point to put
> > >> all the existing flags there (hv-default is the only recommended way
> > >> now, the rest is 'expert'/'debugging').  
> > > so users are kept in dark what hv-default disables/enables (and it might 
> > > depend
> > > on machine version on top that). Doesn't look like a good documentation 
> > > to me
> > > (sure everyone can go and read source code for it and try to figure out 
> > > how
> > > it's supposed to work)  
> > 
> > 'hv-default' enables *all* currently supported enlightenments. When
> > using with an old machine type, it will enable *all* Hyper-V
> > enlightenmnets which were supported when the corresponding machine type
> > was released. I don't think we document all other cases when a machine
> > type is modified (i.e. where can I read how pc-q35-5.1 is different from
> > pc-q35-5.0 if I refuse to read the source code?)
> > 
> > >  
> > >>  
> > >> > (not to mention that if we had it to begin with, then new 'hv-default' 
> > >> > won't
> > >> > be necessary, I still see it as functionality duplication but I will 
> > >> > not oppose it)
> > >> >
> > >> 
> > >> Unfortunately, upper layer tools don't read this doc and update
> > >> themselves to enable new features when they appear.  
> > > rant: (just merge all libvirt into QEMU, and make VM configuration less 
> > > low-level.
> > > why stop there, just merge with yet another upper layer, it would save us 
> > > a lot
> > > on communication protocols and simplify VM creation even more,
> > > and no one will have to read docs and write anything new on top.)
> > > There should be limit somewhere, where QEMU job ends and others pile hw 
> > > abstraction
> > > layers on top of it.  
> > 
> > We have '-machine q35' and we don't require to list all the devices from
> > it. We have '-cpu Skylake-Server' and we don't require to configure all
> > the features manually. Why can't we have similar enablement for Hyper-V
> > emulation where we can't even see a real need for anything but 'enable
> > everything' option?
> > 
> > There is no 'one libvirt to rule them all' (fortunately or
> > unfortunately). And sometimes QEMU is the uppermost layer and there's no
> > 'libvirt' on top of it, this is also a perfectly valid use-case.
> > 
> > >  
> > >> Similarly, if when

[PATCH v3] target/arm: Implement ID_PFR2

2021-01-20 Thread Richard Henderson

This was defined at some point before ARMv8.4, and will
shortly be used by new processor descriptions.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
v2: Update for isar changes
v3: Add kvm lookup
---
 target/arm/cpu.h| 1 +
 target/arm/helper.c | 4 ++--
 target/arm/kvm64.c  | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index df0d677833..d080239863 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -922,6 +922,7 @@ struct ARMCPU {
 uint32_t id_mmfr4;
 uint32_t id_pfr0;
 uint32_t id_pfr1;
+uint32_t id_pfr2;
 uint32_t mvfr0;
 uint32_t mvfr1;
 uint32_t mvfr2;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 10102aab3c..677a4aa79e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7671,11 +7671,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .access = PL1_R, .type = ARM_CP_CONST,
   .accessfn = access_aa64_tid3,
   .resetvalue = 0 },
-{ .name = "MVFR4_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+{ .name = "ID_PFR2", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 4,
   .access = PL1_R, .type = ARM_CP_CONST,
   .accessfn = access_aa64_tid3,
-  .resetvalue = 0 },
+  .resetvalue = cpu->isar.id_pfr2 },
 { .name = "MVFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 5,
   .access = PL1_R, .type = ARM_CP_CONST,
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index f74bac2457..3c37fc4fb6 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -578,6 +578,8 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
   ARM64_SYS_REG(3, 0, 0, 1, 0));
 err |= read_sys_reg32(fdarray[2], >isar.id_pfr1,
   ARM64_SYS_REG(3, 0, 0, 1, 1));
+err |= read_sys_reg32(fdarray[2], >isar.id_pfr2,
+  ARM64_SYS_REG(3, 0, 0, 3, 4));
 err |= read_sys_reg32(fdarray[2], >isar.id_dfr0,
   ARM64_SYS_REG(3, 0, 0, 1, 2));
 err |= read_sys_reg32(fdarray[2], >isar.id_mmfr0,
-- 
2.25.1

Re: [PATCH 0/8] s390x/pci: Fixing s390 vfio-pci ISM support

2021-01-20 Thread Matthew Rosato


On 1/20/21 2:18 PM, Pierre Morel wrote:



On 1/20/21 4:59 PM, Matthew Rosato wrote:

On 1/20/21 9:45 AM, Pierre Morel wrote:



On 1/20/21 3:03 PM, Matthew Rosato wrote:

On 1/20/21 4:12 AM, Pierre Morel wrote:



On 1/19/21 9:44 PM, Matthew Rosato wrote:
Today, ISM devices are completely disallowed for vfio-pci 
passthrough as
QEMU rejects the device due to an (inappropriate) MSI-X check. 
Removing

this fence, however, reveals additional deficiencies in the s390x PCI
interception layer that prevent ISM devices from working correctly.
Namely, ISM block write operations have particular requirements in 
regards

to the alignment, size and order of writes performed that cannot be
guaranteed when breaking up write operations through the typical
vfio_pci_bar_rw paths. Furthermore, ISM requires that legacy/non-MIO
s390 PCI instructions are used, which is also not guaranteed when 
the I/O

is passed through the typical userspace channels.

This patchset provides a set of fixes related to enabling ISM device
passthrough and includes patches to enable use of a new vfio 
region that
will allow s390x PCI pass-through devices to perform s390 PCI 
instructions
in such a way that the same instruction issued on the guest is 
re-issued

on the host.

Associated kernel patchset:
https://lkml.org/lkml/2021/1/19/874

Changes from RFC -> v1:
- Refresh the header sync (built using Eric's 'update-linux-headers:
Include const.h' + manually removed pvrdma_ring.h again)
- Remove s390x/pci: fix pcistb length (already merged)
- Remove s390x/pci: Fix memory_region_access_valid call (already 
merged)

- Fix bug: s390_pci_vfio_pcistb should use the pre-allocated PCISTB
buffer pcistb_buf rather than allocating/freeing its own.
- New patch: track the PFT (PCI Function Type) separately from 
guest CLP

response data -- we tell the guest '0' for now due to limitations in
measurement block support, but we can still use the real value 
provided via

the vfio CLP capabilities to make decisions.
- Use the PFT (pci function type) to determine when to use the region
for PCISTB/PCILG (only for ISM), rather than using the relaxed 
alignment

bit.
- As a result, the pcistb_default is now updated to also handle the
possibility of relaxed alignment via 2 new functions, 
pcistb_validate_write

and pcistb_write, which serve as wrappers to the memory_region calls.
- New patch, which partially restores the MSI-X fence for passthrough
devices...  Could potentially be squashed with 's390x/pci: MSI-X 
isn't
strictly required for passthrough' but left separately for now as 
I felt it
needed a clear commit description of why we should still fence 
this case.



Hi,

The choice of using the new VFIO region is made on the ISM PCI 
function type (PFT), which makes the patch ISM specific, why don't 
we use here the MIO bit common to any zPCI function and present in 
kernel to make the choice?




As discussed during the RFC (and see my reply also to the kernel 
set), the use of this region only works for devices that do not rely 
on MSI-X interrupts.  If we did as you suggest, other device types 
like mlx would not receive MSI-X interrupts in the guest (And I did 
indeed try variations where I used the special VFIO region for all 
PCISTG/PCILG/PCISTB for various device types)


So the idea for now was to solve the specific problem at hand 
(getting ISM devices working).





Sorry, if I missed or forgot some discussions, but I understood that 
we are using this region to handle PCISTB instructions when the 
device do not support MIO.

Don't we?


Sure thing - It's probably good to refresh the issue/rationale anyway 
as we've had the holidays in between.


You are correct, a primary reason we need to resort to a separate VFIO 
region for PCISTB (and PCILG) instructions for ISM devices is that 
they do not support the MIO instruction set, yet the host kernel will 
translate everything coming through the PCI I/O layer to MIO 
instructions whenever that facility is available to the host (and not 
purposely disabled).  This issue is unique to vfio-pci/passthrough - 
in the host, the ISM driver directly invokes functions in s390 pci 
code to ensure that MIO instructions are not used.



QEMU intercepts and differentiates PCISTG and PCISTB.
The new hardware support both MIO and legacy PCISTB/PCISTG.

QEMU does not support MIO

My first interrogation is why should we translate legacy to MIO?


This is existing behavior of the s390 kernel PCI layer, not something 
I'm introducing here.


So, as you say, QEMU does not support MIO, nor does it attempt to 
translate anything to MIO.  The thing is, to the host kernel, PCI I/O 
coming from QEMU through the standard vfio-pci codepath just looks like 
any other userspace PCI I/O coming in to the kernel.  So the 
memory_region operations against the vfio virtual address space bars in 
QEMU then turn into iowrite/ioread operations in vfio-pci in the kernel, 
which then subsequently end up in the s390 PCI kernel layer, and it's

Re: [PATCH] docs/system: Deprecate `-cpu ..., +feature, -feature` syntax

2021-01-20 Thread David Edmondson

On Wednesday, 2021-01-20 at 15:12:41 -05, Eduardo Habkost wrote:

> The ordering semantics of +feature/-feature is tricky and not
> obvious, and it requires a custom option parser.  Deprecate that
> syntax so we can eventually remove the custom -cpu option parser
> and plus_features/minus_features global variables in i386.
>
> Signed-off-by: Eduardo Habkost 

With very minor wording comment...

Reviewed-by: David Edmondson 

> ---
>  docs/system/deprecated.rst | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> index e20bfcb17a4..2c4b8d4b78b 100644
> --- a/docs/system/deprecated.rst
> +++ b/docs/system/deprecated.rst
> @@ -127,6 +127,20 @@ Drives with interface types other than ``if=none`` are 
> for onboard
>  devices.  It is possible to use drives the board doesn't pick up with
>  -device.  This usage is now deprecated.  Use ``if=none`` instead.
>  
> +``-cpu`` ``+feature`` and ``-feature`` syntax (since 6.0.0)
> +'''
> +
> +The ``-cpu ...,+feature`` and ``-cpu ...,-feature`` syntax for
> +enabling and disabling CPU features is deprecated.  The ``-cpu
> +...,feature=on`` or ``-cpu ...,feature=off`` should be used
> +instead.
> +
> +Note that the ordering semantics of ``-cpu ...,-feature,+feature``
> +is different from ``-cpu ...,feature=off,feature=on``.  With the
> +former, the feature got disabled because ``-feature`` had

s/got/was/

> +precedence, but with the latter the feature will be enabled
> +because options are applied in the order they appear.
> +
>  
>  QEMU Machine Protocol (QMP) commands
>  
> -- 
> 2.28.0

dme.
-- 
I got a girlfriend that's better than that.

[PATCH] docs/system: Deprecate `-cpu ...,+feature,-feature` syntax

2021-01-20 Thread Eduardo Habkost

The ordering semantics of +feature/-feature is tricky and not
obvious, and it requires a custom option parser.  Deprecate that
syntax so we can eventually remove the custom -cpu option parser
and plus_features/minus_features global variables in i386.

Signed-off-by: Eduardo Habkost 
---
 docs/system/deprecated.rst | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index e20bfcb17a4..2c4b8d4b78b 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -127,6 +127,20 @@ Drives with interface types other than ``if=none`` are for 
onboard
 devices.  It is possible to use drives the board doesn't pick up with
 -device.  This usage is now deprecated.  Use ``if=none`` instead.
 
+``-cpu`` ``+feature`` and ``-feature`` syntax (since 6.0.0)
+'''
+
+The ``-cpu ...,+feature`` and ``-cpu ...,-feature`` syntax for
+enabling and disabling CPU features is deprecated.  The ``-cpu
+...,feature=on`` or ``-cpu ...,feature=off`` should be used
+instead.
+
+Note that the ordering semantics of ``-cpu ...,-feature,+feature``
+is different from ``-cpu ...,feature=off,feature=on``.  With the
+former, the feature got disabled because ``-feature`` had
+precedence, but with the latter the feature will be enabled
+because options are applied in the order they appear.
+
 
 QEMU Machine Protocol (QMP) commands
 
-- 
2.28.0

Re: [PATCH v2 1/1] linux-user/signal: Decode waitid si_code

2021-01-20 Thread Andreas K . Hüttel

> 
> This patch just passes the waitid status directly back to the guest.
> 

This works at least as well as the previous versions, so ++ from me. 

Will do more testing over the next days to see if it maybe also improves the 
additional oddities I observed. 

Tested-by: Andreas K. Hüttel 

> Buglink: https://bugs.launchpad.net/qemu/+bug/1906193
> Signed-off-by: Alistair Francis 
> ---
> v2:
>  - Set tinfo->_sifields._sigchld._status directly from status
> 
>  linux-user/signal.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 73de934c65..7eecec46c4 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -349,8 +349,7 @@ static inline void
> host_to_target_siginfo_noswap(target_siginfo_t *tinfo, case TARGET_SIGCHLD:
>  tinfo->_sifields._sigchld._pid = info->si_pid;
>  tinfo->_sifields._sigchld._uid = info->si_uid;
> -tinfo->_sifields._sigchld._status
> -= host_to_target_waitstatus(info->si_status);
> +tinfo->_sifields._sigchld._status = info->si_status;
>  tinfo->_sifields._sigchld._utime = info->si_utime;
>  tinfo->_sifields._sigchld._stime = info->si_stime;
>  si_type = QEMU_SI_CHLD;


-- 
Andreas K. Hüttel
dilfri...@gentoo.org
Gentoo Linux developer 
(council, qa, toolchain, base-system, perl, libreoffice)

signature.asc
Description: This is a digitally signed message part.

Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option

2021-01-20 Thread Eduardo Habkost

On Wed, Jan 20, 2021 at 02:13:12PM +0100, Igor Mammedov wrote:
> On Fri, 15 Jan 2021 10:20:23 +0100
> Vitaly Kuznetsov  wrote:
> 
> > Igor Mammedov  writes:
> > 
> > > On Thu,  7 Jan 2021 16:14:49 +0100
> > > Vitaly Kuznetsov  wrote:
[...]
> > >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. 
> > >> In
> > >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > >> +All currently supported Hyper-V enlightenments can be enabled by 
> > >> specifying
> > >> +'hv-default=on' CPU flag:
> > >>  
> > >> -  qemu-system-x86_64 --enable-kvm --cpu 
> > >> host,hv_relaxed,hv_vpindex,hv_time, ...
> > >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > >> +
> > >> +Alternatively, it is possible to do fine-grained enablement through CPU 
> > >> flags,
> > >> +e.g:
> > >> +
> > >> +  qemu-system-x86_64 --enable-kvm --cpu 
> > >> host,hv-relaxed,hv-vpindex,hv-time ...  
> > >
> > > I'd put here not '...' but rather recommended list of flags, and update
> > > it every time when new feature added if necessary.
> > >  
> 
> 1)
>  
> > This is an example of fine-grained enablement, there is no point to put
> > all the existing flags there (hv-default is the only recommended way
> > now, the rest is 'expert'/'debugging').
> so users are kept in dark what hv-default disables/enables (and it might 
> depend
> on machine version on top that). Doesn't look like a good documentation to me
> (sure everyone can go and read source code for it and try to figure out how
> it's supposed to work)

Why is this a problem?  This is not different from CPU feature
flags hidden by CPU model names.  Or virtio feature bits hidden
by virtio devices and machine type compat code.


> > > (not to mention that if we had it to begin with, then new 'hv-default' 
> > > won't
> > > be necessary, I still see it as functionality duplication but I will not 
> > > oppose it)
> > >  
> > 
> > Unfortunately, upper layer tools don't read this doc and update
> > themselves to enable new features when they appear.
> rant: (just merge all libvirt into QEMU, and make VM configuration less 
> low-level.
> why stop there, just merge with yet another upper layer, it would save us a 
> lot
> on communication protocols and simplify VM creation even more,
> and no one will have to read docs and write anything new on top.)
> There should be limit somewhere, where QEMU job ends and others pile hw 
> abstraction
> layers on top of it.

If there should be a limit somewhere, I'd say low level hyperv
flags are a very long way from crossing that limit.  It is
completely reasonable to provide a higher level knob for them in
QEMU.  Vitaly is trying to solve a real problem here, because the
existing solution is not working.

We only need to offer more complex and lower level interfaces if
we have to.  I don't think we have real world examples where
libvirt or management software developers/users are asking for
low level hyperv feature flag knobs, do we?

> 
> > Similarly, if when these tools use '-machine q35' they get all the new 
> > features we add
> > automatically, right?
> it depends, in case of CPUs, new features usually 'off' by default
> for existing models. In case of bugs, features sometimes could be
> flipped and versioned machines were used to keep broken CPU models
> on old machine types.
> 
>
[...]

-- 
Eduardo

1 2 3 4 >

1 - 100 of 361 matches

Mail list logo