from:"Paul Durrant"

Re: [PATCH v2 0/2] qdev-monitor: avoid QemuOpts in QMP device_add

2024-08-13 Thread Paul Durrant


On 12/08/2024 19:15, Stefan Hajnoczi wrote:

On Fri, Aug 02, 2024 at 10:10:43AM +0200, Markus Armbruster wrote:

Can we additionally cut out the QemuOpts middleman in
usbback_portid_add()?

 qdict = qdict_new();
 qdict_put_str(qdict, "driver", "usb-host");
 tmp = g_strdup_printf("%s.0", usbif->xendev.qdev.id);
 qdict_put_str(qdict, "bus", tmp);
 g_free(tmp);
 tmp = g_strdup_printf("%s-%u", usbif->xendev.qdev.id, port);
 qdict_put_str(qdict, "id", tmp);
 g_free(tmp);
 qdict_put_int(qdict, "port", port);
 qdict_put_int(qdict, "hostbus", atoi(busid));
 qdict_put_str(qdict, "hostport", portname);
 opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict,
 &error_abort);
 usbif->ports[port - 1].dev = USB_DEVICE(qdev_device_add(opts, &local_err));

Trying this is up to you!


Paul or Anthony: Do you know how to run usbback_portid_add() for
testing? I would like to make sure that suggested the code change works
and don't have experience running the Xen code in QEMU.


Sorry, PV USB is not something I'm familiar with. 
https://wiki.xenproject.org/wiki/Xen_USB_Passthrough suggests that `xl 
usbdev-attach` might be the way to test... but you'd need a system with 
Xen installed and suitably configured guest, so not trivial to set up.

Re: [PATCH] MAINTAINERS: add Edgar as Xen maintainer

2024-07-11 Thread Paul Durrant


On 10/07/2024 22:28, Stefano Stabellini wrote:

Add Edgar as Xen subsystem maintainer in QEMU. Edgar has been a QEMU
maintainer for years, and has already made key changes to one of the
most difficult areas of the Xen subsystem (the mapcache).

Edgar volunteered helping us maintain the Xen subsystem in QEMU and we
are very happy to welcome him to the team. His knowledge and expertise
with QEMU internals will be of great help.

Signed-off-by: Stefano Stabellini 



Reviewed-by: Paul Durrant 


diff --git a/MAINTAINERS b/MAINTAINERS
index 6725913c8b..63e11095a2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -536,6 +536,7 @@ X86 Xen CPUs
  M: Stefano Stabellini 
  M: Anthony PERARD 
  M: Paul Durrant 
+M: Edgar E. Iglesias 
  L: xen-de...@lists.xenproject.org
  S: Supported
  F: */xen*

Re: [PATCH v3 3/3] ui+display: rename is_buffer_shared() -> surface_is_allocated()

2024-06-06 Thread Paul Durrant


On 05/06/2024 14:14, Gerd Hoffmann wrote:

Boolean return value is reversed, to align with QEMU_ALLOCATED_FLAG, so
all callers must be adapted.  Also rename share_surface variable in
vga_draw_graphic() to reduce confusion.

No functional change.

Suggested-by: Marc-André Lureau 
Signed-off-by: Gerd Hoffmann 
---
  include/ui/surface.h|  4 ++--
  hw/display/qxl-render.c |  2 +-
  hw/display/vga.c| 20 ++--
  hw/display/xenfb.c  |  5 +++--
  ui/console.c|  3 ++-
  5 files changed, 18 insertions(+), 16 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 7/7] hw/xen: Register framebuffer backend via xen_backend_init()

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

Align the framebuffer backend with the other legacy ones,
register it via xen_backend_init() when '-vga xenfb' is
used. It is safe because MODULE_INIT_XEN_BACKEND is called
in xen_bus_realize(), long after CLI processing initialized
the vga_interface_type variable.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen-legacy-backend.h | 3 ---
  hw/display/xenfb.c  | 9 +++--
  hw/xenpv/xen_machine_pv.c   | 2 --
  3 files changed, 7 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 6/7] hw/xen: register legacy backends via xen_backend_init

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

From: Paolo Bonzini 

It is okay to register legacy backends in the middle of xen_bus_init().
All that the registration does is record the existence of the backend
in xenstore.

This makes it possible to remove them from the build without introducing
undefined symbols in xen_be_init().  It also removes the need for the
backend_register callback, whose only purpose is to avoid registering
nonfunctional backends.

Signed-off-by: Paolo Bonzini 
Message-ID: <20240509170044.190795-8-pbonz...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen-legacy-backend.h | 11 ++-
  include/hw/xen/xen_pvdev.h  |  1 -
  hw/9pfs/xen-9p-backend.c|  8 +++-
  hw/display/xenfb.c  |  8 +++-
  hw/usb/xen-usb.c| 14 --
  hw/xen/xen-legacy-backend.c | 16 
  6 files changed, 20 insertions(+), 38 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 5/7] hw/xen: initialize legacy backends from xen_bus_init()

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

From: Paolo Bonzini 

Prepare for moving the calls to xen_be_register() under the
control of xen_bus_init(), using the normal xen_backend_init()
method that is used by the "modern" backends.

This requires the xenstore global variable to be initialized,
which is done by xen_be_init().  To ensure that everything is
ready at the time the xen_backend_init() functions are called,
remove the xen_be_init() function from all the boards and
place it directly in xen_bus_init().

Signed-off-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240509170044.190795-7-pbonz...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/i386/pc.c  | 1 -
  hw/xen/xen-bus.c  | 4 
  hw/xen/xen-hvm-common.c   | 2 --
  hw/xenpv/xen_machine_pv.c | 5 +
  4 files changed, 5 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 4/7] hw/xen: Make XenDevOps structures const

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

Keep XenDevOps structures in .rodata.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen-legacy-backend.h | 8 
  hw/9pfs/xen-9p-backend.c| 2 +-
  hw/display/xenfb.c  | 4 ++--
  hw/usb/xen-usb.c| 4 ++--
  4 files changed, 9 insertions(+), 9 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 3/7] hw/xen: Constify xenstore_be::XenDevOps

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

XenDevOps @ops is not updated, mark it const.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen-legacy-backend.h | 2 +-
  hw/xen/xen-legacy-backend.c | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 2/7] hw/xen: Constify XenLegacyDevice::XenDevOps

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

XenDevOps @ops is not updated, mark it const.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen_pvdev.h  | 2 +-
  hw/xen/xen-legacy-backend.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 1/7] hw/xen: Remove declarations left over in 'xen-legacy-backend.h'

2024-05-31 Thread Paul Durrant


On 10/05/2024 11:49, Philippe Mathieu-Daudé wrote:

'xen_blkdev_ops' was removed in commit 19f87870ba ("xen: remove
the legacy 'xen_disk' backend"), 'xen_netdev_ops' in commit
25967ff69f ("hw/xen: update Xen PV NIC to XenDevice model") and
'xen_console_ops' in commit 9b77374690 ("hw/xen: update Xen
console to XenDevice model"). Remove them.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/xen/xen-legacy-backend.h | 3 ---
  1 file changed, 3 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-08 Thread Paul Durrant


On 04/04/2024 15:08, Ross Lagerwall wrote:

A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.

Signed-off-by: Ross Lagerwall 
---
  hw/xen/xen-hvm-common.c | 26 +-
  1 file changed, 17 insertions(+), 9 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-08 Thread Paul Durrant


On 08/04/2024 14:00, Ross Lagerwall wrote:

On Sat, Apr 6, 2024 at 11:58 AM Durrant, Paul  wrote:


On 04/04/2024 15:08, Ross Lagerwall wrote:

A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.



Do PV backends potentially cause the same scheduling issue (if not using
io threads)?



 From what I can tell:

xen-block: It reads req_prod / req_cons once before entering the loop
so it should be fine, I think.

xen_console: Same as xen-block

xen_nic: It reads req_prod / req_cons once before entering the loop.
However, once the loop ends it checks for more requests and if there
are more requests it restarts from the beginning. It seems like this
could be susceptible to the same issue.

(These PV backends generally aren't used by XenServer's system QEMU
so I didn't spend too much time looking into it.)

Thanks,


Ok. Thanks for checking.

  Paul

Re: [PATCH v4 3/6] xen: decouple generic xen code from legacy backends codebase

2023-12-04 Thread Paul Durrant


On 02/12/2023 01:41, Volodymyr Babchuk wrote:

In xen-all.c there are unneeded dependencies on xen-legacy-backend.c:

  - xen_init() uses xen_pv_printf() to report errors, but it does not
  provide a pointer to the struct XenLegacyDevice, so it is kind of
  useless, we can use standard error_report() instead.

  - xen-all.c has function xenstore_record_dm_state() which uses global
  variable "xenstore" defined and initialized in xen-legacy-backend.c
  It is used exactly once, so we can just open a new connection to the
  xenstore, update DM state and close connection back.

Those two changes allows us to remove xen-legacy-backend.c at all,
what should be done in the future anyways. But right now this patch
moves us one step close to have QEMU build without legacy Xen
backends.

Signed-off-by: Volodymyr Babchuk 

---

In v4:

  - New in v4, previous was part of "xen: add option to disable legacy
  backends"
  - Do not move xenstore global variable from xen-legacy-backend.c,
instead use a local variable.
---
  accel/xen/xen-all.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 05/12] block: remove AioContext locking

2023-11-30 Thread Paul Durrant


On 29/11/2023 19:55, Stefan Hajnoczi wrote:

This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.

There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.

Signed-off-by: Stefan Hajnoczi 
---
  include/block/block-global-state.h |   9 +-
  include/block/block-io.h   |   3 +-
  include/block/snapshot.h   |   2 -
  block.c| 234 +-
  block/block-backend.c  |  14 --
  block/copy-before-write.c  |  22 +--
  block/export/export.c  |  22 +--
  block/io.c |  45 +
  block/mirror.c |  19 --
  block/monitor/bitmap-qmp-cmds.c|  20 +-
  block/monitor/block-hmp-cmds.c |  29 ---
  block/qapi-sysemu.c|  27 +--
  block/qapi.c   |  18 +-
  block/raw-format.c |   5 -
  block/replication.c|  58 +-
  block/snapshot.c   |  22 +--
  block/write-threshold.c|   6 -
  blockdev.c | 307 +
  blockjob.c |  18 --
  hw/block/dataplane/virtio-blk.c|  10 -
  hw/block/dataplane/xen-block.c |  17 +-
  hw/block/virtio-blk.c  |  45 +
  hw/core/qdev-properties-system.c   |   9 -
  job.c  |  16 --
  migration/block.c  |  33 +---
  migration/migration-hmp-cmds.c |   3 -
  migration/savevm.c |  22 ---
  net/colo-compare.c |   2 -
  qemu-img.c |   4 -
  qemu-io.c  |  10 +-
  qemu-nbd.c |   2 -
  replay/replay-debugging.c  |   4 -
  tests/unit/test-bdrv-drain.c   |  51 +
  tests/unit/test-bdrv-graph-mod.c   |   6 -
  tests/unit/test-block-iothread.c   |  31 ---
  tests/unit/test-blockjob.c | 137 -
  tests/unit/test-replication.c  |  11 --
  util/async.c   |   4 -
  util/vhost-user-server.c   |   3 -
  scripts/block-coroutine-wrapper.py |   3 -
  tests/tsan/suppressions.tsan   |   1 -
  41 files changed, 102 insertions(+), 1202 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 2/6] qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to QEMU_BQL_LOCK_GUARD

2023-11-30 Thread Paul Durrant


On 29/11/2023 21:26, Stefan Hajnoczi wrote:

The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead, it is already widely used and unambiguous.

Signed-off-by: Stefan Hajnoczi 
---
  include/qemu/main-loop.h  | 20 ++--
  hw/i386/kvm/xen_evtchn.c  | 14 +++---
  hw/i386/kvm/xen_gnttab.c  |  2 +-
  hw/mips/mips_int.c|  2 +-
  hw/ppc/ppc.c  |  2 +-
  target/i386/kvm/xen-emu.c |  2 +-
  target/ppc/excp_helper.c  |  2 +-
  target/ppc/helper_regs.c  |  2 +-
  target/riscv/cpu_helper.c |  4 ++--
  9 files changed, 25 insertions(+), 25 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 1/6] system/cpus: rename qemu_mutex_lock_iothread() to qemu_bql_lock()

2023-11-30 Thread Paul Durrant

m.c   |   4 +-
  target/s390x/tcg/misc_helper.c   | 118 +--
  target/sparc/int32_helper.c  |   2 +-
  target/sparc/int64_helper.c  |   6 +-
  target/sparc/win_helper.c|  20 ++---
  target/xtensa/exc_helper.c   |   8 +-
  ui/spice-core.c  |   4 +-
  util/async.c |   2 +-
  util/main-loop.c |   8 +-
  util/rcu.c   |  14 ++--
  audio/coreaudio.m|   4 +-
  memory_ldst.c.inc|  18 ++--
  target/i386/hvf/README.md|   2 +-
  ui/cocoa.m   |  50 ++--
  94 files changed, 502 insertions(+), 502 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 4/6] xen_pvdev: Do not assume Dom0 when creating a directory

2023-11-23 Thread Paul Durrant


On 23/11/2023 12:27, David Woodhouse wrote:

On 23 November 2023 12:17:57 GMT, Volodymyr Babchuk 
 wrote:


Hi David,

David Woodhouse  writes:

Which PV backends do you care about? We already have net, block and console 
converted.


Well, this is all what we need, actually. Even console only will be
sufficient, as we are using QEMU to provide virtio-pci backends, so both
storage and networking should be provided by virtio. Are you proposing
to just drop this patch at all? I believe we can live without it, yes.


Yeah, I think you can drop anything that's only needed for the legacy backends. 
I'm tempted to make a separate config option to compile those out.



I think that would be a good idea. The other legacy bacckend that we may 
need to care about is xenfb... not so much the framebuffer itself, but 
the mouse and keyboard aspects. The XENVKBD and XENHID drivers expose PV 
mouse and keyboard to Windows instances so it's be nice if we can avoid 
the backend withering away.


  Paul

Re: [PATCH v2 1/6] hw/xen: Set XenBackendInstance in the XenDevice before realizing it

2023-11-23 Thread Paul Durrant


On 22/11/2023 23:04, David Woodhouse wrote:

On Wed, 2023-11-22 at 22:56 +, Volodymyr Babchuk wrote:



Paul Durrant  writes:


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

From: David Woodhouse 
This allows a XenDevice implementation to know whether it was
created
by QEMU, or merely discovered in XenStore after the toolstack created
it. This will allow us to create frontend/backend nodes only when we
should, rather than unconditionally attempting to overwrite them from
a driver domain which doesn't have privileges to do so.
As an added benefit, it also means we no longer have to call the
xen_backend_set_device() function from the device models immediately
after calling qdev_realize_and_unref(). Even though we could make
the argument that it's safe to do so, and the pointer to the unreffed
device *will* actually still be valid, it still made my skin itch to
look at it.
Signed-off-by: David Woodhouse 
---
   hw/block/xen-block.c | 3 +--
   hw/char/xen_console.c    | 2 +-
   hw/net/xen_nic.c | 2 +-
   hw/xen/xen-bus.c | 4 
   include/hw/xen/xen-backend.h | 2 --
   include/hw/xen/xen-bus.h | 2 ++
   6 files changed, 9 insertions(+), 6 deletions(-)



Actually, I think you should probably update
xen_backend_try_device_destroy() in this patch too. It currently uses
xen_backend_list_find() to check whether the specified XenDevice has
an associated XenBackendInstance.


Sure. Looks like it is the only user of xen_backend_list_find(), so we
can get rid of it as well.

I'll drop your R-b tag, because of those additional changes in the new
version.


I think it's fine to keep. He told me to do it :)


I confirm that :-)

Re: [PATCH v2 4/6] xen_pvdev: Do not assume Dom0 when creating a directory

2023-11-23 Thread Paul Durrant


On 23/11/2023 00:07, Volodymyr Babchuk wrote:


Hi,

Volodymyr Babchuk  writes:


Hi Stefano,

Stefano Stabellini  writes:


On Wed, 22 Nov 2023, David Woodhouse wrote:

On Wed, 2023-11-22 at 15:09 -0800, Stefano Stabellini wrote:

On Wed, 22 Nov 2023, David Woodhouse wrote:

On Wed, 2023-11-22 at 14:29 -0800, Stefano Stabellini wrote:

On Wed, 22 Nov 2023, Paul Durrant wrote:

On 21/11/2023 22:10, Volodymyr Babchuk wrote:

From: Oleksandr Tyshchenko 

Instead of forcing the owner to domid 0, use XS_PRESERVE_OWNER to
inherit the owner of the directory.


Ah... so that's why the previous patch is there.

This is not the right way to fix it. The QEMU Xen support is *assuming* that
QEMU is either running in, or emulating, dom0. In the emulation case this is
probably fine, but the 'real Xen' case it should be using the correct domid
for node creation. I guess this could either be supplied on the command line
or discerned by reading the local domain 'domid' node.


yes, it should be passed as command line option to QEMU


I'm not sure I like the idea of a command line option for something
which QEMU could discover for itself.


That's fine too. I meant to say "yes, as far as I know the toolstack
passes the domid to QEMU as a command line option today".


The -xen-domid argument on the QEMU command line today is the *guest*
domain ID, not the domain ID in which QEMU itself is running.

Or were you thinking of something different?


Ops, you are right and I understand your comment better now. The backend
domid is not on the command line but it should be discoverable (on
xenstore if I remember right).


Yes, it is just "~/domid". I'll add a function that reads it.


Just a quick question to QEMU folks: is it better to add a global
variable where we will store own Domain ID or it will be okay to read
domid from Xenstore every time we need it?

If global variable variant is better, what is proffered place to define
this variable? system/globals.c ?



Actually... is it possible for QEMU just to use a relative path for the 
backend nodes? That way it won't need to know it's own domid, will it?


  Paul

Re: [PATCH v2 4/6] xen_pvdev: Do not assume Dom0 when creating a directory

2023-11-22 Thread Paul Durrant


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

From: Oleksandr Tyshchenko 

Instead of forcing the owner to domid 0, use XS_PRESERVE_OWNER to
inherit the owner of the directory.


Ah... so that's why the previous patch is there.

This is not the right way to fix it. The QEMU Xen support is *assuming* 
that QEMU is either running in, or emulating, dom0. In the emulation 
case this is probably fine, but the 'real Xen' case it should be using 
the correct domid for node creation. I guess this could either be 
supplied on the command line or discerned by reading the local domain 
'domid' node.




Note that for other than Dom0 domain (non toolstack domain) the
"driver_domain" property should be set in domain config file for the
toolstack to create required directories in advance.

Signed-off-by: Oleksandr Tyshchenko 
Signed-off-by: Volodymyr Babchuk 
---
  hw/xen/xen_pvdev.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pvdev.c b/hw/xen/xen_pvdev.c
index c5ad71e8dc..42bdd4f6c8 100644
--- a/hw/xen/xen_pvdev.c
+++ b/hw/xen/xen_pvdev.c
@@ -60,7 +60,8 @@ void xen_config_cleanup(void)
  
  int xenstore_mkdir(char *path, int p)

  {
-if (!qemu_xen_xs_create(xenstore, 0, 0, xen_domid, p, path)) {
+if (!qemu_xen_xs_create(xenstore, 0, XS_PRESERVE_OWNER,
+xen_domid, p, path)) {
  xen_pv_printf(NULL, 0, "xs_mkdir %s: failed\n", path);
  return -1;
  }

Re: [PATCH v2 3/6] xen: xenstore: add possibility to preserve owner

2023-11-22 Thread Paul Durrant


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

Add option to preserve owner when creating an entry in Xen Store. This
may be needed in cases when Qemu is working as device model in a


*may* be needed?

I don't undertstand why this patch is necessary and the commit comment 
isn't really helping me.



domain that is Domain-0, e.g. in driver domain.

"owner" parameter for qemu_xen_xs_create() function can have special
value XS_PRESERVE_OWNER, which will make specific implementation to
get original owner of an entry and pass it back to
set_permissions() call.

Please note, that XenStore inherits permissions, so even if entry is
newly created by, it already has the owner set to match owner of entry
at previous level.

Signed-off-by: Volodymyr Babchuk

Re: [PATCH v2 1/6] hw/xen: Set XenBackendInstance in the XenDevice before realizing it

2023-11-22 Thread Paul Durrant


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

From: David Woodhouse 

This allows a XenDevice implementation to know whether it was created
by QEMU, or merely discovered in XenStore after the toolstack created
it. This will allow us to create frontend/backend nodes only when we
should, rather than unconditionally attempting to overwrite them from
a driver domain which doesn't have privileges to do so.

As an added benefit, it also means we no longer have to call the
xen_backend_set_device() function from the device models immediately
after calling qdev_realize_and_unref(). Even though we could make
the argument that it's safe to do so, and the pointer to the unreffed
device *will* actually still be valid, it still made my skin itch to
look at it.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c | 3 +--
  hw/char/xen_console.c| 2 +-
  hw/net/xen_nic.c | 2 +-
  hw/xen/xen-bus.c | 4 
  include/hw/xen/xen-backend.h | 2 --
  include/hw/xen/xen-bus.h | 2 ++
  6 files changed, 9 insertions(+), 6 deletions(-)



Actually, I think you should probably update 
xen_backend_try_device_destroy() in this patch too. It currently uses 
xen_backend_list_find() to check whether the specified XenDevice has an 
associated XenBackendInstance.


  Paul

Re: [PATCH v2 2/6] xen: backends: touch some XenStore nodes only if device...

2023-11-22 Thread Paul Durrant


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

was created by QEMU

Xen PV devices in QEMU can be created in two ways: either by QEMU
itself, if they were passed via command line, or by Xen toolstack. In
the latter case, QEMU scans XenStore entries and configures devices
accordingly.

In the second case we don't want QEMU to write/delete front-end
entries for two reasons: it might have no access to those entries if
it is running in un-privileged domain and it is just incorrect to
overwrite entries already provided by Xen toolstack, because toolstack
manages those nodes. For example, it might read backend- or frontend-
state to be sure that they are both disconnected and it is safe to
destroy a domain.

This patch checks presence of xendev->backend to check if Xen PV
device is acting as a backend (i.e. it was configured by Xen


Technally *all* XenDevice objects are backends.


toolstack) to decide if it should touch frontend entries in XenStore.
Also, when we need to remove XenStore entries during device teardown
only if they weren't created by Xen toolstack. If they were created by
toolstack, then it is toolstack's job to do proper clean-up.

Re: [PATCH v2 1/6] hw/xen: Set XenBackendInstance in the XenDevice before realizing it

2023-11-22 Thread Paul Durrant


On 21/11/2023 22:10, Volodymyr Babchuk wrote:

From: David Woodhouse 

This allows a XenDevice implementation to know whether it was created
by QEMU, or merely discovered in XenStore after the toolstack created
it. This will allow us to create frontend/backend nodes only when we
should, rather than unconditionally attempting to overwrite them from
a driver domain which doesn't have privileges to do so.

As an added benefit, it also means we no longer have to call the
xen_backend_set_device() function from the device models immediately
after calling qdev_realize_and_unref(). Even though we could make
the argument that it's safe to do so, and the pointer to the unreffed
device *will* actually still be valid, it still made my skin itch to
look at it.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c | 3 +--
  hw/char/xen_console.c| 2 +-
  hw/net/xen_nic.c | 2 +-
  hw/xen/xen-bus.c | 4 
  include/hw/xen/xen-backend.h | 2 --
  include/hw/xen/xen-bus.h | 2 ++
  6 files changed, 9 insertions(+), 6 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 2/3] vl: disable default serial when xen-console is enabled

2023-11-21 Thread Paul Durrant


On 15/11/2023 17:24, David Woodhouse wrote:

From: David Woodhouse 

If a Xen console is configured on the command line, do not add a default
serial port.

Signed-off-by: David Woodhouse 
---
  system/vl.c | 1 +
  1 file changed, 1 insertion(+)



Reviewed-by: Paul Durrant

Re: [PATCH 3/3] hw/xen: clean up xen_block_find_free_vdev() to avoid Coverity false positive

2023-11-15 Thread Paul Durrant


On 15/11/2023 12:24, David Woodhouse wrote:

From: David Woodhouse 

Coverity couldn't see that nr_existing was always going to be zero when
qemu_xen_xs_directory() returned NULL in the ENOENT case (CID 1523906).

Perhaps more to the point, neither could Peter at first glance. Improve
the code to hopefully make it clearer to Coverity and human reviewers
alike.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v1 3/7] xen: xenstore: add possibility to preserve owner

2023-11-12 Thread Paul Durrant


On 10/11/2023 15:42, Volodymyr Babchuk wrote:

Add option to preserve owner when creating an entry in Xen Store. This
may be needed in cases when Qemu is working as device model in a
domain that is Domain-0, e.g. in driver domain.

"owner" parameter for qemu_xen_xs_create() function can have special
value XS_PRESERVE_OWNER, which will make specific implementation to
get original owner of an entry and pass it back to
set_permissions() call.



If QEMU is running in a driver domain then it should know whether the 
domid of the domain it is running in and use that. Yes, it is hardcoded 
to 0 at the moment but surely a backend domid (which defaults to 0) 
could be passed on the command line?


  Paul


Signed-off-by: Volodymyr Babchuk 
---
  hw/i386/kvm/xen_xenstore.c   | 18 ++
  hw/xen/xen-operations.c  | 12 
  include/hw/xen/xen_backend_ops.h |  2 ++
  3 files changed, 32 insertions(+)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 660d0b72f9..7b894a9884 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -1572,6 +1572,24 @@ static bool xs_be_create(struct qemu_xs_handle *h, 
xs_transaction_t t,
  return false;
  }
  
+if (owner == XS_PRESERVE_OWNER) {

+GList *perms;
+char letter;
+
+err = xs_impl_get_perms(h->impl, 0, t, path, &perms);
+if (err) {
+errno = err;
+return false;
+}
+
+if (sscanf(perms->data, "%c%u", &letter, &owner) != 2) {
+errno = EFAULT;
+g_list_free_full(perms, g_free);
+return false;
+}
+g_list_free_full(perms, g_free);
+}
+
  perms_list = g_list_append(perms_list,
 xs_perm_as_string(XS_PERM_NONE, owner));
  perms_list = g_list_append(perms_list,
diff --git a/hw/xen/xen-operations.c b/hw/xen/xen-operations.c
index e00983ec44..1df59b3c08 100644
--- a/hw/xen/xen-operations.c
+++ b/hw/xen/xen-operations.c
@@ -300,6 +300,18 @@ static bool libxenstore_create(struct qemu_xs_handle *h, 
xs_transaction_t t,
  return false;
  }
  
+if (owner == XS_PRESERVE_OWNER) {

+struct xs_permissions *tmp;
+unsigned int num;
+
+tmp = xs_get_permissions(h->xsh, 0, path, &num);
+if (tmp == NULL) {
+return false;
+}
+perms_list[0].id = tmp[0].id;
+free(tmp);
+}
+
  return xs_set_permissions(h->xsh, t, path, perms_list,
ARRAY_SIZE(perms_list));
  }
diff --git a/include/hw/xen/xen_backend_ops.h b/include/hw/xen/xen_backend_ops.h
index 90cca85f52..273e414559 100644
--- a/include/hw/xen/xen_backend_ops.h
+++ b/include/hw/xen/xen_backend_ops.h
@@ -266,6 +266,8 @@ typedef uint32_t xs_transaction_t;
  #define XS_PERM_READ  0x01
  #define XS_PERM_WRITE 0x02
  
+#define XS_PRESERVE_OWNER0xFFFE

+
  struct xenstore_backend_ops {
  struct qemu_xs_handle *(*open)(void);
  void (*close)(struct qemu_xs_handle *h);

Re: [PATCH v1 5/7] xen-bus: Set offline if backend's state is XenbusStateClosed

2023-11-12 Thread Paul Durrant


On 10/11/2023 15:42, Volodymyr Babchuk wrote:

From: Oleksandr Tyshchenko 

Both state (XenbusStateClosed) and online (0) are expected by
toolstack/xl devd to completely destroy the device. But "offline"
is never being set by the backend resulting in timeout during
domain destruction, garbage in Xestore and still running Qemu
instance.

Signed-off-by: Oleksandr Tyshchenko 
Signed-off-by: Volodymyr Babchuk 
---
  hw/xen/xen-bus.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 75474d4b43..6e7ec3af64 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -519,6 +519,10 @@ static void xen_device_backend_changed(void *opaque, const 
char *path)
  xen_device_backend_set_state(xendev, XenbusStateClosed);
  }
  
+if (xen_device_backend_get_state(xendev) == XenbusStateClosed) {

+xen_device_backend_set_online(xendev, false);
+}
+
  /*
   * If a backend is still 'online' then we should leave it alone but,
   * if a backend is not 'online', then the device is a candidate


I don't understand what you're trying to do here. Just a few lines up 
from this hunk there is:


 506if (xen_device_backend_scanf(xendev, "online", "%u", &online) 
!= 1) {

 507online = 0;
 508}
 509
 510xen_device_backend_set_online(xendev, !!online);

Why is this not sufficient? What happens if the frontend decides to stop 
and start (e.g. for a driver update)? I'm guessing the backend will be 
destroyed... which is not very friendly.


  Paul

Re: [PATCH v1 1/7] xen-block: Do not write frontend nodes

2023-11-12 Thread Paul Durrant


On 10/11/2023 15:42, Volodymyr Babchuk wrote:

From: Oleksandr Tyshchenko 

The PV backend running in other than Dom0 domain (non toolstack domain)
is not allowed to write frontend nodes. The more, the backend does not
need to do that at all, this is purely toolstack/xl devd business.

I do not know for what reason the backend does that here, this is not really
needed, probably it is just a leftover and all xen_device_frontend_printf()
instances should go away completely.



It is not a leftover and it is needed in the case that QEMU is 
instantiating the backend unilaterally... i.e. without the involvement 
of any Xen toolstack.
Agreed that, if QEMU, is running in a deprivileged context, that is not 
an option. The correct way to determined this though is whether the 
device is being created via the QEMU command line or whether is being 
created because XenStore nodes written by a toolstack were discovered.
In the latter case there should be a XenBackendInstance that corresponds 
to the XenDevice whereas in the former case there should not be. For 
example, see xen_backend_try_device_destroy()


  Paul



Signed-off-by: Oleksandr Tyshchenko 
Signed-off-by: Volodymyr Babchuk 
---
  hw/block/xen-block.c | 11 +++
  hw/xen/xen-bus.c |  2 +-
  2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index a07cd7eb5d..dc4d477c22 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -221,6 +221,7 @@ static void xen_block_realize(XenDevice *xendev, Error 
**errp)
  XenBlockVdev *vdev = &blockdev->props.vdev;
  BlockConf *conf = &blockdev->props.conf;
  BlockBackend *blk = conf->blk;
+XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
  
  if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {

  error_setg(errp, "vdev property not set");
@@ -280,10 +281,12 @@ static void xen_block_realize(XenDevice *xendev, Error 
**errp)
  
  xen_device_backend_printf(xendev, "info", "%u", blockdev->info);
  
-xen_device_frontend_printf(xendev, "virtual-device", "%lu",

-   vdev->number);
-xen_device_frontend_printf(xendev, "device-type", "%s",
-   blockdev->device_type);
+if (xenbus->backend_id == 0) {
+xen_device_frontend_printf(xendev, "virtual-device", "%lu",
+   vdev->number);
+xen_device_frontend_printf(xendev, "device-type", "%s",
+   blockdev->device_type);
+}
  
  xen_device_backend_printf(xendev, "sector-size", "%u",

conf->logical_block_size);
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index ece8ec40cd..06d5192aca 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -1048,7 +1048,7 @@ static void xen_device_realize(DeviceState *dev, Error 
**errp)
  xen_device_backend_set_online(xendev, true);
  xen_device_backend_set_state(xendev, XenbusStateInitWait);
  
-if (!xen_device_frontend_exists(xendev)) {

+if (!xen_device_frontend_exists(xendev) && xenbus->backend_id == 0) {
  xen_device_frontend_printf(xendev, "backend", "%s",
 xendev->backend_path);
  xen_device_frontend_printf(xendev, "backend-id", "%u",

Re: [PATCH] hw/xen: clean up xen_block_find_free_vdev() to avoid Coverity false positive

2023-11-09 Thread Paul Durrant


On 09/11/2023 15:30, David Woodhouse wrote:

From: David Woodhouse 

Coverity couldn't see that nr_existing was always going to be zero when
qemu_xen_xs_directory() returned NULL in the ENOENT case (CID 1523906).

Perhaps more to the point, neither could Peter at first glance. Improve
the code to hopefully make it clearer to Coverity and human reviewers
alike.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v4 17/17] docs: update Xen-on-KVM documentation

2023-11-06 Thread Paul Durrant


On 06/11/2023 14:35, David Woodhouse wrote:

From: David Woodhouse 

Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.

Update the -initrd option documentation to explain how to quote commas
in module command lines, and reference it when documenting PV guests.

Also update stale avocado test filename in MAINTAINERS.

Signed-off-by: David Woodhouse 
---
  MAINTAINERS  |   2 +-
  docs/system/i386/xen.rst | 107 +--
  qemu-options.hx  |  14 +++--
  3 files changed, 91 insertions(+), 32 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v4 16/17] doc/sphinx/hxtool.py: add optional label argument to SRST directive

2023-11-06 Thread Paul Durrant


On 06/11/2023 14:35, David Woodhouse wrote:

From: David Woodhouse 

We can't just embed labels directly into files like qemu-options.hx which
are included from multiple top-level RST files, because Sphinx sees the
labels as duplicate: https://github.com/sphinx-doc/sphinx/issues/9707

So add an 'emitrefs' option to the Sphinx hxtool-doc directive, which is
set only in invocation.rst and not from the HTML rendition of the man
page. Along with an argument to the SRST directive which causes a label
of the form '.. _LABEL-reference-label:' to be emitted when the emitrefs
option is set.

Signed-off-by: David Woodhouse 
---
  docs/sphinx/hxtool.py  | 18 +-
  docs/system/invocation.rst |  1 +
  2 files changed, 18 insertions(+), 1 deletion(-)



Reviewed-by: Paul Durrant

Re: [PATCH v4 13/17] hw/i386/pc: support '-nic' for xen-net-device

2023-11-06 Thread Paul Durrant


On 06/11/2023 14:35, David Woodhouse wrote:

From: David Woodhouse 

The default NIC creation seems a bit hackish to me. I don't understand
why each platform has to call pci_nic_init_nofail() from a point in the
code where it actually has a pointer to the PCI bus, and then we have
the special cases for things like ne2k_isa.

If qmp_device_add() can *find* the appropriate bus and instantiate
the device on it, why can't we just do that from generic code for
creating the default NICs too?

But that isn't a yak I want to shave today. Add a xenbus field to the
PCMachineState so that it can make its way from pc_basic_device_init()
to pc_nic_init() and be handled as a special case like ne2k_isa is.

Now we can launch emulated Xen guests with '-nic user'.

Signed-off-by: David Woodhouse 
---
  hw/i386/pc.c | 11 ---
  hw/i386/pc_piix.c|  2 +-
  hw/i386/pc_q35.c |  2 +-
  hw/xen/xen-bus.c |  4 +++-
  include/hw/i386/pc.h |  4 +++-
  include/hw/xen/xen-bus.h |  2 +-
  6 files changed, 17 insertions(+), 8 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v4 06/17] hw/xen: automatically assign device index to block devices

2023-11-06 Thread Paul Durrant


On 06/11/2023 14:34, David Woodhouse wrote:

From: David Woodhouse 

There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.

This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.

Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.

Signed-off-by: David Woodhouse 
Acked-by: Kevin Wolf 
---
  blockdev.c  |  15 +++-
  hw/block/xen-block.c| 118 ++--
  hw/xen/xen_devconfig.c  |  28 ---
  hw/xenpv/xen_machine_pv.c   |   9 ---
  include/hw/xen/xen-legacy-backend.h |   1 -
  5 files changed, 125 insertions(+), 46 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] xen/pt: Emulate multifunction bit in header type

2023-11-06 Thread Paul Durrant


On 03/11/2023 17:26, Ross Lagerwall wrote:

The intention of the code appears to have been to unconditionally set
the multifunction bit but since the emulation mask is 0x00 it has no
effect. Instead, emulate the bit and set it based on the multifunction
property of the PCIDevice (which can be set using QAPI).

This allows making passthrough devices appear as functions in a Xen
guest.

Signed-off-by: Ross Lagerwall 


Reviewed-by: Paul Durrant

Re: [PATCH 12/12] hw/xen: add support for Xen primary console in emulated mode

2023-10-25 Thread Paul Durrant


On 25/10/2023 10:00, David Woodhouse wrote:

On Wed, 2023-10-25 at 09:31 +0100, Paul Durrant wrote:

On 24/10/2023 17:34, David Woodhouse wrote:

On Tue, 2023-10-24 at 17:25 +0100, Paul Durrant wrote:

On 24/10/2023 16:49, David Woodhouse wrote:

On Tue, 2023-10-24 at 16:39 +0100, Paul Durrant wrote:

On 24/10/2023 16:37, David Woodhouse wrote:

On Tue, 2023-10-24 at 15:20 +0100, Paul Durrant wrote:

On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.


Why can't you map the console page via the grant table like the xenstore
page?


I suppose we could, but I didn't really want the generic xen-console
device code having any more of a special case for 'Xen emulation' than
it does already by having to call xen_primary_console_create().



But doesn't is save you the whole foreignmem thing? You can use the
grant table for primary and secondary consoles.


Yes. And I could leave the existing foreignmem thing just for the case
of primary console under true Xen. It's probably not that awful a
special case, in the end.

Then again, I was surprised I didn't *already* have a foreignmem ops
for the emulated case, and we're probably going to want to continue
fleshing it out later, so I don't really mind adding it.



True. We'll need it for some of the other more fun protocols like vkbd
or fb. Still, I think it'd be nicer to align the xenstore and primary
console code to look similar and punt the work until then :-)


I don't think it ends up looking like xenstore either way, does it?
Xenstore is special because it gets to use the original pointer to its
own page.



Not sure what you mean there? A guest can query the PFN for either
xenstore or console using HVM params, or it can find them in its own
grant table entries 0 or 1.


The code in our xen_xenstore.c uses its *own* pointer (s->xs) to the
MemoryRegion that it created (s->xenstore_page). It is its own backend,
as well as doing the "magic" to create the guest-side mapping and event
channel.

The difference for the console code is that we actually have a
*separation* between the standard backend code in xen_console.c, and
the magic frontend parts for the emulated mode.





I don't think I want to hack the xen_console code to explicitly call a
xen_console_give_me_your_page() function. If not foreignmem, I think
you were suggesting that we actually call the grant mapping code to get
a pointer to the underlying page, right?


I'm suggesting that the page be mapped in the same way that the xenstore
backend does:

1462    /*

1463 * We don't actually access the guest's page through the grant, because
1464 * this isn't real Xen, and we can just use the page we gave it in the
1465 * first place. Map the grant anyway, mostly for cosmetic purposes so
1466 * it *looks* like it's in use in the guest-visible grant table.
1467 */
1468    s->gt = qemu_xen_gnttab_open();
1469    uint32_t xs_gntref = GNTTAB_RESERVED_XENSTORE;
1470    s->granted_xs = qemu_xen_gnttab_map_refs(s->gt, 1, xen_domid, 
&xs_gntref,
1471 PROT_READ | PROT_WRITE);


It already *is*. But as with xen_xenstore.c, nothing ever *uses* the
s->granted_xs pointer. It's just cosmetic to make the grant table look
right.

But that doesn't help the *backend* code. The backend doesn't even know
the grant ref#, because the convention we inherited from Xen is that
the `ring-ref` in XenStore for the primary console is actually the MFN,
to be mapped as foreignmem.

Of course, we *do* know the grant-ref for the primary console, as it's
always GNTTAB_RESERVED_CONSOLE. So I suppose we could put a hack into
the xen_console backend to map *that* in the case of primary console
under emu? In fact that would probably do the right thing even under
Xen if we could persuade Xen to make an ioemu primary console?



That's exactly what I am getting at :-) I don't think we need care about 
the ring-ref in xenstore for the primary console.


  Paul








I could kind of live with that... except that Xen has this ugly
convention that the "ring-ref" frontend node for the primary console
actually has the *MFN* not a grant ref. Which I don't understand since
the toolstack *does* populate the grant table for it (just

Re: [PATCH 12/12] hw/xen: add support for Xen primary console in emulated mode

2023-10-25 Thread Paul Durrant


On 24/10/2023 17:34, David Woodhouse wrote:

On Tue, 2023-10-24 at 17:25 +0100, Paul Durrant wrote:

On 24/10/2023 16:49, David Woodhouse wrote:

On Tue, 2023-10-24 at 16:39 +0100, Paul Durrant wrote:

On 24/10/2023 16:37, David Woodhouse wrote:

On Tue, 2023-10-24 at 15:20 +0100, Paul Durrant wrote:

On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.


Why can't you map the console page via the grant table like the xenstore
page?


I suppose we could, but I didn't really want the generic xen-console
device code having any more of a special case for 'Xen emulation' than
it does already by having to call xen_primary_console_create().



But doesn't is save you the whole foreignmem thing? You can use the
grant table for primary and secondary consoles.


Yes. And I could leave the existing foreignmem thing just for the case
of primary console under true Xen. It's probably not that awful a
special case, in the end.

Then again, I was surprised I didn't *already* have a foreignmem ops
for the emulated case, and we're probably going to want to continue
fleshing it out later, so I don't really mind adding it.



True. We'll need it for some of the other more fun protocols like vkbd
or fb. Still, I think it'd be nicer to align the xenstore and primary
console code to look similar and punt the work until then :-)


I don't think it ends up looking like xenstore either way, does it?
Xenstore is special because it gets to use the original pointer to its
own page.



Not sure what you mean there? A guest can query the PFN for either 
xenstore or console using HVM params, or it can find them in its own 
grant table entries 0 or 1.



I don't think I want to hack the xen_console code to explicitly call a
xen_console_give_me_your_page() function. If not foreignmem, I think
you were suggesting that we actually call the grant mapping code to get
a pointer to the underlying page, right?


I'm suggesting that the page be mapped in the same way that the xenstore 
backend does:


1462/* 

1463 * We don't actually access the guest's page through the grant, 
because
1464 * this isn't real Xen, and we can just use the page we gave it 
in the
1465 * first place. Map the grant anyway, mostly for cosmetic 
purposes so
1466 * it *looks* like it's in use in the guest-visible grant table. 


1467 */
1468s->gt = qemu_xen_gnttab_open();
1469uint32_t xs_gntref = GNTTAB_RESERVED_XENSTORE;
1470s->granted_xs = qemu_xen_gnttab_map_refs(s->gt, 1, xen_domid, 
&xs_gntref,

1471 PROT_READ | PROT_WRITE);




I could kind of live with that... except that Xen has this ugly
convention that the "ring-ref" frontend node for the primary console
actually has the *MFN* not a grant ref. Which I don't understand since
the toolstack *does* populate the grant table for it (just as it does
for the xenstore page). But we'd have to add a special case exception
to that special case, so that in the emu case it's an actual grant ref
again. I think I prefer just having a stub of foreignmem, TBH.



You're worried about the guest changing the page it uses for the primary 
console and putting a new one in xenstore? I'd be amazed if that even 
works on Xen unless the guest is careful to write it into 
GNTTAB_RESERVED_CONSOLE.



(I didn't yet manage to get Xen to actually create a primary console of
type iomem, FWIW)



No, that doesn't entirely surprise me.

  Paul

Re: [PATCH 12/12] hw/xen: add support for Xen primary console in emulated mode

2023-10-24 Thread Paul Durrant


On 24/10/2023 16:49, David Woodhouse wrote:

On Tue, 2023-10-24 at 16:39 +0100, Paul Durrant wrote:

On 24/10/2023 16:37, David Woodhouse wrote:

On Tue, 2023-10-24 at 15:20 +0100, Paul Durrant wrote:

On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.


Why can't you map the console page via the grant table like the xenstore
page?


I suppose we could, but I didn't really want the generic xen-console
device code having any more of a special case for 'Xen emulation' than
it does already by having to call xen_primary_console_create().



But doesn't is save you the whole foreignmem thing? You can use the
grant table for primary and secondary consoles.


Yes. And I could leave the existing foreignmem thing just for the case
of primary console under true Xen. It's probably not that awful a
special case, in the end.

Then again, I was surprised I didn't *already* have a foreignmem ops
for the emulated case, and we're probably going to want to continue
fleshing it out later, so I don't really mind adding it.



True. We'll need it for some of the other more fun protocols like vkbd 
or fb. Still, I think it'd be nicer to align the xenstore and primary 
console code to look similar and punt the work until then :-)


  Paul

Re: [PATCH v2 16/24] hw/xen: handle soft reset for primary console

2023-10-24 Thread Paul Durrant


On 24/10/2023 16:48, David Woodhouse wrote:

On Tue, 2023-10-24 at 16:44 +0100, Paul Durrant wrote:

On 19/10/2023 16:40, David Woodhouse wrote:

From: David Woodhouse 

On soft reset, the prinary console event channel needs to be rebound to
the backend port (in the xen-console driver). We could put that into the
xen-console driver itself, but it's slightly less ugly to keep it within
the KVM/Xen code, by stashing the backend port# on event channel reset
and then rebinding in the primary console reset when it has to recreate
the guest port anyway.


Does Xen re-bind the primary console on EVTCHNOP_reset? That's news to
me. I go check.


I spent an unhapp hour trying to work out how Xen actually does any of
this :)

In the short term I'm more interested in having soft reset work, than
an explicit EVTCHNOP_reset. And I can't work out *how*, but we do seem
to have console again after a kexec in real Xen.


*Soft* reset may do it, but not the EVTCHNOP_reset hypercall itself, 
because there's a bunch of impenetrable toolstack magic involved the 
former. Perhaps you could just push the re-bind code up a layer into

kvm_xen_soft_reset().

  Paul

Re: [PATCH 2/4] hw/xen: update Xen PV NIC to XenDevice model

2023-10-24 Thread Paul Durrant


On 24/10/2023 16:17, David Woodhouse wrote:
[snip]


I don't think that's really a valid state for a network frontend. Linux
netback just ignores it.


Must we? I was thinking of making the ->frontend_changed() methods
optional and allowing backends to just provide ->connect() and
->disconnect() methods instead if they wanted to. Because we have three
identical ->frontend_changed() methods now...



Now maybe... Not sure things will look so common when other backends are 
converted. I'd prefer to maintain fidelity with Linux xen-netback as it 
is generally considered to be the canonical implementation.


  Paul

Re: [PATCH v2 16/24] hw/xen: handle soft reset for primary console

2023-10-24 Thread Paul Durrant


On 19/10/2023 16:40, David Woodhouse wrote:

From: David Woodhouse 

On soft reset, the prinary console event channel needs to be rebound to
the backend port (in the xen-console driver). We could put that into the
xen-console driver itself, but it's slightly less ugly to keep it within
the KVM/Xen code, by stashing the backend port# on event channel reset
and then rebinding in the primary console reset when it has to recreate
the guest port anyway.


Does Xen re-bind the primary console on EVTCHNOP_reset? That's news to 
me. I go check.


  Paul



Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c  |  9 +
  hw/i386/kvm/xen_primary_console.c | 29 -
  hw/i386/kvm/xen_primary_console.h |  1 +
  3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index d72dca6591..ce4da6d37a 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -40,6 +40,7 @@
  #include "xen_evtchn.h"
  #include "xen_overlay.h"
  #include "xen_xenstore.h"
+#include "xen_primary_console.h"
  
  #include "sysemu/kvm.h"

  #include "sysemu/kvm_xen.h"
@@ -1098,6 +1099,7 @@ int xen_evtchn_soft_reset(void)
  {
  XenEvtchnState *s = xen_evtchn_singleton;
  bool flush_kvm_routes;
+uint16_t con_port = xen_primary_console_get_port();
  int i;
  
  if (!s) {

@@ -1108,6 +1110,13 @@ int xen_evtchn_soft_reset(void)
  
  qemu_mutex_lock(&s->port_lock);
  
+if (con_port) {

+XenEvtchnPort *p = &s->port_table[con_port];
+if (p->type == EVTCHNSTAT_interdomain) {
+xen_primary_console_set_be_port(p->u.interdomain.port);
+}
+}
+
  for (i = 0; i < s->nr_ports; i++) {
  close_port(s, i, &flush_kvm_routes);
  }
diff --git a/hw/i386/kvm/xen_primary_console.c 
b/hw/i386/kvm/xen_primary_console.c
index 0aa1c16ad6..5e6e085ac7 100644
--- a/hw/i386/kvm/xen_primary_console.c
+++ b/hw/i386/kvm/xen_primary_console.c
@@ -112,6 +112,15 @@ uint16_t xen_primary_console_get_port(void)
  return s->guest_port;
  }
  
+void xen_primary_console_set_be_port(uint16_t port)

+{
+XenPrimaryConsoleState *s = xen_primary_console_singleton;
+if (s) {
+printf("be port set to %d\n", port);
+s->be_port = port;
+}
+}
+
  uint64_t xen_primary_console_get_pfn(void)
  {
  XenPrimaryConsoleState *s = xen_primary_console_singleton;
@@ -142,6 +151,20 @@ static void alloc_guest_port(XenPrimaryConsoleState *s)
  }
  }
  
+static void rebind_guest_port(XenPrimaryConsoleState *s)

+{
+struct evtchn_bind_interdomain inter = {
+.remote_dom = DOMID_QEMU,
+.remote_port = s->be_port,
+};
+
+if (!xen_evtchn_bind_interdomain_op(&inter)) {
+s->guest_port = inter.local_port;
+}
+
+s->be_port = 0;
+}
+
  int xen_primary_console_reset(void)
  {
  XenPrimaryConsoleState *s = xen_primary_console_singleton;
@@ -154,7 +177,11 @@ int xen_primary_console_reset(void)
  xen_overlay_do_map_page(&s->console_page, gpa);
  }
  
-alloc_guest_port(s);

+if (s->be_port) {
+rebind_guest_port(s);
+} else {
+alloc_guest_port(s);
+}
  
  trace_xen_primary_console_reset(s->guest_port);
  
diff --git a/hw/i386/kvm/xen_primary_console.h b/hw/i386/kvm/xen_primary_console.h

index dd4922f3f4..7e2989ea0d 100644
--- a/hw/i386/kvm/xen_primary_console.h
+++ b/hw/i386/kvm/xen_primary_console.h
@@ -16,6 +16,7 @@ void xen_primary_console_create(void);
  int xen_primary_console_reset(void);
  
  uint16_t xen_primary_console_get_port(void);

+void xen_primary_console_set_be_port(uint16_t port);
  uint64_t xen_primary_console_get_pfn(void);
  void *xen_primary_console_get_map(void);

Re: [PATCH 12/12] hw/xen: add support for Xen primary console in emulated mode

2023-10-24 Thread Paul Durrant


On 24/10/2023 16:37, David Woodhouse wrote:

On Tue, 2023-10-24 at 15:20 +0100, Paul Durrant wrote:

On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.


Why can't you map the console page via the grant table like the xenstore
page?


I suppose we could, but I didn't really want the generic xen-console
device code having any more of a special case for 'Xen emulation' than
it does already by having to call xen_primary_console_create().



But doesn't is save you the whole foreignmem thing? You can use the 
grant table for primary and secondary consoles.


  Paul

Re: [PATCH v2 10/24] hw/xen: populate store frontend nodes with XenStore PFN/port

2023-10-24 Thread Paul Durrant


On 19/10/2023 16:40, David Woodhouse wrote:

From: David Woodhouse 

This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them. But Xen populates them, and it does
allow guests to *rebind* to the event channel port after a reset.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 11 +++
  1 file changed, 11 insertions(+)



Reviewed-by: Paul Durrant

Re: [PATCH v2 05/24] hw/xen: fix XenStore watch delivery to guest

2023-10-24 Thread Paul Durrant


On 19/10/2023 16:40, David Woodhouse wrote:

From: David Woodhouse 

When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.

Add the missing call to put_rsp().

It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.

This fixes Xen device hot-unplug.

Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation 
stubs")
Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 660d0b72f9..82a215058a 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -1357,10 +1357,12 @@ static void fire_watch_cb(void *opaque, const char 
*path, const char *token)
  } else {
  deliver_watch(s, path, token);
  /*
- * If the message was queued because there was already ring activity,
- * no need to wake the guest. But if not, we need to send the evtchn.
+ * Attempt to queue the message into the actual ring, and send
+ * the event channel notification if any bytes are copied.
   */
-xen_be_evtchn_notify(s->eh, s->be_port);
+if (put_rsp(s) > 0) {
+xen_be_evtchn_notify(s->eh, s->be_port);
+}


There's actually the potential for an assertion failure there; if the 
guest has specified an oversize token then deliver_watch() will not set 
rsp_pending... then put_rsp() will fail its assertion that the flag is true.


  Paul


  }
  }

Re: [PATCH v2 04/24] hw/xen: don't clear map_track[] in xen_gnttab_reset()

2023-10-24 Thread Paul Durrant


On 19/10/2023 16:40, David Woodhouse wrote:

From: David Woodhouse 

The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).

If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().

So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.

Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab")
Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_gnttab.c | 2 --
  1 file changed, 2 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 03/24] hw/xen: select kernel mode for per-vCPU event channel upcall vector

2023-10-24 Thread Paul Durrant


On 19/10/2023 16:39, David Woodhouse wrote:

From: David Woodhouse 

A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.

For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:

/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);

That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.

Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.

Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)

Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c  | 6 ++
  include/sysemu/kvm_xen.h  | 1 +
  target/i386/kvm/xen-emu.c | 7 +++
  3 files changed, 14 insertions(+)



Reviewed-by: Paul Durrant

Re: [PATCH 2/4] hw/xen: update Xen PV NIC to XenDevice model

2023-10-24 Thread Paul Durrant


On 17/10/2023 19:25, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
  hw/net/meson.build|   2 +-
  hw/net/trace-events   |   9 +
  hw/net/xen_nic.c  | 426 +-
  hw/xenpv/xen_machine_pv.c |   1 -
  4 files changed, 342 insertions(+), 96 deletions(-)

diff --git a/hw/net/meson.build b/hw/net/meson.build
index 2632634df3..f64651c467 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -1,5 +1,5 @@
  system_ss.add(when: 'CONFIG_DP8393X', if_true: files('dp8393x.c'))
-system_ss.add(when: 'CONFIG_XEN', if_true: files('xen_nic.c'))
+system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen_nic.c'))
  system_ss.add(when: 'CONFIG_NE2000_COMMON', if_true: files('ne2000.c'))
  
  # PCI network cards

diff --git a/hw/net/trace-events b/hw/net/trace-events
index 3abfd65e5b..ee56acfbce 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -482,3 +482,12 @@ dp8393x_receive_oversize(int size) "oversize packet, pkt_size 
is %d"
  dp8393x_receive_not_netcard(void) "packet not for netcard"
  dp8393x_receive_packet(int crba) "Receive packet at 0x%"PRIx32
  dp8393x_receive_write_status(int crba) "Write status at 0x%"PRIx32
+
+# xen_nic.c
+xen_netdev_realize(int idx, const char *info) "idx %u info '%s'"
+xen_netdev_unrealize(int idx) "idx %u"
+xen_netdev_create(int idx) "idx %u"
+xen_netdev_destroy(int idx) "idx %u"
+xen_netdev_disconnect(int idx) "idx %u"
+xen_netdev_connect(int idx, unsigned int tx, unsigned int rx, int port) "idx %u tx 
%u rx %u port %u"
+xen_netdev_frontend_changed(const char *idx, int state) "idx %s state %d"
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
index 9bbf6599fc..84914c329c 100644
--- a/hw/net/xen_nic.c
+++ b/hw/net/xen_nic.c
@@ -20,6 +20,11 @@
   */
  
  #include "qemu/osdep.h"

+#include "qemu/cutils.h"
+#include "qemu/qemu-print.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/error.h"
+
  #include 
  #include 
  #include 
@@ -27,18 +32,26 @@
  #include "net/net.h"
  #include "net/checksum.h"
  #include "net/util.h"
-#include "hw/xen/xen-legacy-backend.h"
+
+#include "hw/xen/xen-backend.h"
+#include "hw/xen/xen-bus-helper.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
  
  #include "hw/xen/interface/io/netif.h"

+#include "hw/xen/interface/io/xs_wire.h"
+
+#include "trace.h"
  
  /* - */
  
  struct XenNetDev {

-struct XenLegacyDevice  xendev;  /* must be first */
-char  *mac;
+struct XenDevice  xendev;  /* must be first */
+XenEventChannel   *event_channel;
+int   dev;
  int   tx_work;
-int   tx_ring_ref;
-int   rx_ring_ref;
+unsigned int  tx_ring_ref;
+unsigned int  rx_ring_ref;
  struct netif_tx_sring *txs;
  struct netif_rx_sring *rxs;
  netif_tx_back_ring_t  tx_ring;
@@ -47,6 +60,13 @@ struct XenNetDev {
  NICState  *nic;
  };
  
+typedef struct XenNetDev XenNetDev;

+
+#define TYPE_XEN_NET_DEVICE "xen-net-device"
+OBJECT_DECLARE_SIMPLE_TYPE(XenNetDev, XEN_NET_DEVICE)
+
+#define xen_pv_printf(a, n, ...) qemu_printf(__VA_ARGS__)


Why define this...


+
  /* - */
  
  static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, int8_t st)

@@ -68,7 +88,8 @@ static void net_tx_response(struct XenNetDev *netdev, 
netif_tx_request_t *txp, i
  netdev->tx_ring.rsp_prod_pvt = ++i;
  RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->tx_ring, notify);
  if (notify) {
-xen_pv_send_notify(&netdev->xendev);
+xen_device_notify_event_channel(XEN_DEVICE(netdev),
+netdev->event_channel, NULL);
  }
  
  if (i == netdev->tx_ring.req_cons) {

@@ -104,8 +125,9 @@ static void net_tx_error(struct XenNetDev *netdev, 
netif_tx_request_t *txp, RING
  #endif
  }
  
-static void net_tx_packets(struct XenNetDev *netdev)

+static bool net_tx_packets(struct XenNetDev *netdev)
  {
+bool done_something = false;
  netif_tx_request_t txreq;
  RING_IDX rc, rp;
  void *page;
@@ -122,6 +144,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
  }
  memcpy(&txreq, RING_GET_REQUEST(&netdev->tx_ring, rc), 
sizeof(txreq));
  netdev->tx_ring.req_cons = ++rc;
+done_something = true;
  
  #if 1

  /* should not happen in theory, we don't announce the *
@@ -151,7 +174,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
  continue;
  }
  
-xen_pv_printf(&netdev->xendev, 3,

+if (0) qemu_printf(//&netdev->xendev, 3,


... and then not use it here? Perhaps the 'if (0)' ugliness can go in 
the macro too.



"tx packet ref %d, off %d, len %d, flags 
0x%x%s%s%

Re: [PATCH 1/4] hw/xen: only remove peers of PCI NICs on unplug

2023-10-24 Thread Paul Durrant


On 17/10/2023 19:25, David Woodhouse wrote:

From: David Woodhouse 

When the Xen guest asks to unplug *emulated* NICs, it's kind of unhelpful
also to unplug the peer of the *Xen* PV NIC.

Signed-off-by: David Woodhouse 
---
  hw/i386/xen/xen_platform.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index 17457ff3de..e2dd1b536a 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -140,9 +140,14 @@ static void unplug_nic(PCIBus *b, PCIDevice *d, void *o)
  /* Remove the peer of the NIC device. Normally, this would be a tap device. */
  static void del_nic_peer(NICState *nic, void *opaque)
  {
-NetClientState *nc;
+NetClientState *nc = qemu_get_queue(nic);
+ObjectClass *klass = module_object_class_by_name(nc->model);
+
+/* Only delete peers of PCI NICs that we're about to delete */
+if (!klass || !object_class_dynamic_cast(klass, TYPE_PCI_DEVICE)) {


Would it not be better to test for object_class_dynamic_cast(klass, 
TYPE_XEN_DEVICE)?


  Paul


+return;
+}
  
-nc = qemu_get_queue(nic);

  if (nc->peer)
  qemu_del_net_client(nc->peer);
  }

Re: [PATCH 12/12] hw/xen: add support for Xen primary console in emulated mode

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary console is special because the toolstack maps a page at a
fixed GFN and also allocates the guest-side event channel. Add support
for that in emulated mode, so that we can have a primary console.

Add a *very* rudimentary stub of foriegnmem ops for emulated mode, which
supports literally nothing except a single-page mapping of the console
page. This might as well have been a hack in the xen_console driver, but
this way at least the special-casing is kept within the Xen emulation
code, and it gives us a hook for a more complete implementation if/when
we ever do need one.

Why can't you map the console page via the grant table like the xenstore 
page?


  Paul

Re: [PATCH 09/12] hw/xen: prevent duplicate device registrations

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

Ensure that we have a XenBackendInstance for every device regardless
of whether it was "discovered" in XenStore or created directly in QEMU.

This allows the backend_list to be a source of truth about whether a
given backend exists, and allows us to reject duplicates.

This also cleans up the fact that backend drivers were calling
xen_backend_set_device() with a XenDevice immediately after calling
qdev_realize_and_unref() on it, when it wasn't theirs to play with any
more.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c |  1 -
  hw/char/xen_console.c|  2 +-
  hw/xen/xen-backend.c | 78 ++--
  hw/xen/xen-bus.c |  8 
  include/hw/xen/xen-backend.h |  3 ++
  5 files changed, 69 insertions(+), 23 deletions(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index a07cd7eb5d..9262338535 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -975,7 +975,6 @@ static void xen_block_device_create(XenBackendInstance 
*backend,
  goto fail;
  }
  
-xen_backend_set_device(backend, xendev);

  return;
  
  fail:

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index bd20be116c..2825b8c511 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -468,7 +468,7 @@ static void xen_console_device_create(XenBackendInstance 
*backend,
  Chardev *cd = NULL;
  struct qemu_xs_handle *xsh = xenbus->xsh;
  
-if (qemu_strtoul(name, NULL, 10, &number)) {

+if (qemu_strtoul(name, NULL, 10, &number) || number >= INT_MAX) {
  error_setg(errp, "failed to parse name '%s'", name);
  goto fail;
  }
I don't think this hunk belongs here, does it? Seems like it should be 
in patch 7.



diff --git a/hw/xen/xen-backend.c b/hw/xen/xen-backend.c
index b9bf70a9f5..dcb4329258 100644
--- a/hw/xen/xen-backend.c
+++ b/hw/xen/xen-backend.c
@@ -101,22 +101,28 @@ static XenBackendInstance 
*xen_backend_list_find(XenDevice *xendev)
  return NULL;
  }
  
-bool xen_backend_exists(const char *type, const char *name)

+static XenBackendInstance *xen_backend_lookup(const XenBackendImpl *impl, 
const char *name)


This name is a little close to xen_backend_table_lookup()... perhaps 
that one should be renamed xen_backend_impl_lookup() for clarity.



  {
-const XenBackendImpl *impl = xen_backend_table_lookup(type);
  XenBackendInstance *backend;
  
-if (!impl) {

-return false;
-}
-
  QLIST_FOREACH(backend, &backend_list, entry) {
  if (backend->impl == impl && !strcmp(backend->name, name)) {
-return true;
+return backend;
  }
  }
  
-return false;

+return NULL;
+}
+
+bool xen_backend_exists(const char *type, const char *name)
+{
+const XenBackendImpl *impl = xen_backend_table_lookup(type);
+
+if (!impl) {
+return false;
+}
+
+return !!xen_backend_lookup(impl, name);
  }
  
  static void xen_backend_list_remove(XenBackendInstance *backend)

@@ -138,11 +144,10 @@ void xen_backend_device_create(XenBus *xenbus, const char 
*type,
  backend = g_new0(XenBackendInstance, 1);
  backend->xenbus = xenbus;
  backend->name = g_strdup(name);
-
-impl->create(backend, opts, errp);
-
  backend->impl = impl;
  xen_backend_list_add(backend);
+
+impl->create(backend, opts, errp);
  }
  
  XenBus *xen_backend_get_bus(XenBackendInstance *backend)

@@ -155,13 +160,6 @@ const char *xen_backend_get_name(XenBackendInstance 
*backend)
  return backend->name;
  }
  
-void xen_backend_set_device(XenBackendInstance *backend,

-XenDevice *xendev)
-{
-g_assert(!backend->xendev);
-backend->xendev = xendev;
-}
-


The declaration also needs removing from xen-backend.h presumably.


  XenDevice *xen_backend_get_device(XenBackendInstance *backend)
  {
  return backend->xendev;
@@ -178,9 +176,7 @@ bool xen_backend_try_device_destroy(XenDevice *xendev, 
Error **errp)
  }
  
  impl = backend->impl;

-if (backend->xendev) {
-impl->destroy(backend, errp);
-}
+impl->destroy(backend, errp);
  
  xen_backend_list_remove(backend);

  g_free(backend->name);
@@ -188,3 +184,43 @@ bool xen_backend_try_device_destroy(XenDevice *xendev, 
Error **errp)
  
  return true;

  }
+
+bool xen_backend_device_realized(XenDevice *xendev, Error **errp)
+{
+XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
+const char *type = xendev_class->backend ? : 
object_get_typename(OBJECT(xendev));
+const XenBackendImpl *impl = xen_backend_table_lookup(type);
+XenBackendInstance *backend;
+
+if (!impl) {
+return false;
+}
+
+backend = xen_backend_lookup(impl, xendev->name);
+if (backend) {
+if (backend->xendev && backend->xendev != xendev) {
+error_setg(errp, "device %s/%s already exists", type, 
xend

Re: [PATCH 06/12] hw/xen: add get_frontend_path() method to XenDeviceClass

2023-10-24 Thread Paul Durrant


On 24/10/2023 14:29, David Woodhouse wrote:

On Tue, 2023-10-24 at 13:59 +0100, Paul Durrant wrote:

On 24/10/2023 13:56, David Woodhouse wrote:

On Tue, 2023-10-24 at 13:42 +0100, Paul Durrant wrote:



--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -711,8 +711,16 @@ static void xen_device_frontend_create(XenDevice *xendev, 
Error **errp)
     {
     ERRP_GUARD();
     XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+    XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
 
-    xendev->frontend_path = xen_device_get_frontend_path(xendev);

+    if (xendev_class->get_frontend_path) {
+    xendev->frontend_path = xendev_class->get_frontend_path(xendev, errp);
+    if (!xendev->frontend_path) {
+    return;


I think you need to update errp here to note that you are failing to
create the frontend.


If xendev_class->get_frontend_path returned NULL it will have filled in errp.



Ok, but a prepend to say that a lack of path there means we skip
frontend creation seems reasonable?


No, it *is* returning an error. Perhaps I can make it



I understand it is returning an error. I thought the point of the 
cascading error handling was to prepend text at each (meaningful) layer 
such that the eventual message conveyed what failed and also what the 
consequences of that failure were.


  Paul


 if (!xendev->frontend_path) {
 /*
  * If the ->get_frontend_path() method returned NULL, it will
  * already have set *errp accordingly. Return the error.
  */
 return /* false */;
 }



As a general rule (I'll be doing a bombing run on xen-bus once I get my
patch queue down into single digits) we should never check 'if (*errp)'
to check if a function had an error. It should *also* return a success
or failure indication, and we should cope with errp being NULL.



I'm pretty sure someone told me the exact opposite a few years back.


Then they were wrong :)

Re: [PATCH 08/12] hw/xen: do not repeatedly try to create a failing backend device

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.

The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.

My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.

So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.

Signed-off-by: David Woodhouse 
---
  hw/xen/xen-backend.c | 27 +--
  hw/xen/xen-bus.c |  3 ++-
  include/hw/xen/xen-backend.h |  1 +
  3 files changed, 24 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 07/12] hw/xen: update Xen console to XenDevice model

2023-10-24 Thread Paul Durrant

xen_device_notify_event_channel(XEN_DEVICE(con), con->event_channel, NULL);
  }
  
-static void xencons_send(struct XenConsole *con)

+static bool xencons_send(XenConsole *con)
  {
  ssize_t len, size;
  
@@ -159,174 +170,407 @@ static void xencons_send(struct XenConsole *con)

  if (len < 1) {
  if (!con->backlog) {
  con->backlog = 1;
-xen_pv_printf(&con->xendev, 1,
-  "backlog piling up, nobody listening?\n");
  }
  } else {
  buffer_advance(&con->buffer, len);
  if (con->backlog && len == size) {
  con->backlog = 0;
-xen_pv_printf(&con->xendev, 1, "backlog is gone\n");
  }
  }
+return len > 0;
  }
  
  /*  */
  
-static int store_con_info(struct XenConsole *con)

+static bool con_event(void *_xendev)
  {
-Chardev *cs = qemu_chr_fe_get_driver(&con->chr);
-char *pts = NULL;
-char *dom_path;
-g_autoptr(GString) path = NULL;
+XenConsole *con = XEN_CONSOLE_DEVICE(_xendev);
+bool done_something;
  
-/* Only continue if we're talking to a pty. */

-if (!CHARDEV_IS_PTY(cs)) {
-return 0;
-}
-pts = cs->filename + 4;
+done_something = buffer_append(con);
  
-dom_path = qemu_xen_xs_get_domain_path(xenstore, xen_domid);

-if (!dom_path) {
-return 0;
+if (con->buffer.size - con->buffer.consumed) {
+done_something |= xencons_send(con);
  }
+return done_something;
+}
  
-path = g_string_new(dom_path);

-free(dom_path);
+/*  */
  
-if (con->xendev.dev) {

-g_string_append_printf(path, "/device/console/%d", con->xendev.dev);
-} else {
-g_string_append(path, "/console");
+static void xen_console_disconnect(XenDevice *xendev, Error **errp)
+{
+XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+
+qemu_chr_fe_set_handlers(&con->chr, NULL, NULL, NULL, NULL,
+ con, NULL, true);
+


nit: extraneous blank line by the looks of it.

With that fixed...

Reviewed-by: Paul Durrant

Re: [PATCH 06/12] hw/xen: add get_frontend_path() method to XenDeviceClass

2023-10-24 Thread Paul Durrant


On 24/10/2023 13:56, David Woodhouse wrote:

On Tue, 2023-10-24 at 13:42 +0100, Paul Durrant wrote:



--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -711,8 +711,16 @@ static void xen_device_frontend_create(XenDevice *xendev, 
Error **errp)
    {
    ERRP_GUARD();
    XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+    XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);

-    xendev->frontend_path = xen_device_get_frontend_path(xendev);

+    if (xendev_class->get_frontend_path) {
+    xendev->frontend_path = xendev_class->get_frontend_path(xendev, errp);
+    if (!xendev->frontend_path) {
+    return;


I think you need to update errp here to note that you are failing to
create the frontend.


If xendev_class->get_frontend_path returned NULL it will have filled in errp.



Ok, but a prepend to say that a lack of path there means we skip 
frontend creation seems reasonable?



As a general rule (I'll be doing a bombing run on xen-bus once I get my
patch queue down into single digits) we should never check 'if (*errp)'
to check if a function had an error. It should *also* return a success
or failure indication, and we should cope with errp being NULL.



I'm pretty sure someone told me the exact opposite a few years back.

  Paul

Re: [PATCH 06/12] hw/xen: add get_frontend_path() method to XenDeviceClass

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.

Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.

To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path()

Signed-off-by: David Woodhouse 
---
  hw/xen/xen-bus.c | 10 +-
  include/hw/xen/xen-bus.h |  2 ++
  2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index ece8ec40cd..cc524ed92c 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -711,8 +711,16 @@ static void xen_device_frontend_create(XenDevice *xendev, 
Error **errp)
  {
  ERRP_GUARD();
  XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
  
-xendev->frontend_path = xen_device_get_frontend_path(xendev);

+if (xendev_class->get_frontend_path) {
+xendev->frontend_path = xendev_class->get_frontend_path(xendev, errp);
+if (!xendev->frontend_path) {
+return;


I think you need to update errp here to note that you are failing to 
create the frontend.


  Paul

Re: [PATCH 05/12] hw/xen: populate store frontend nodes with XenStore PFN/port

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

This is kind of redundant since without being able to get these through
ome other method (HVMOP_get_param) the guest wouldn't be able to access


^ typo


XenStore in order to find them. But Xen populates them, and it does
allow guests to *rebind* to the event channel port after a reset.



... although this can also be done by querying the remote end of the 
port before reset.



Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 10 ++
  1 file changed, 10 insertions(+)



Reviewed-by: Paul Durrant

Re: [PATCH 04/12] i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

This will allow Linux guests (since v6.0) to use the per-vCPU upcall
vector delivered as MSI through the local APIC.

Signed-off-by: David Woodhouse 
---
  target/i386/kvm/kvm.c | 4 
  1 file changed, 4 insertions(+)



Reviewed-by: Paul Durrant

Re: [PATCH 03/12] include: update Xen public headers to Xen 4.17.2 release

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:19, David Woodhouse wrote:

From: David Woodhouse 

... in order to advertise the XEN_HVM_CPUID_UPCALL_VECTOR feature,
which will come in a subsequent commit.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c|  2 +-
  include/hw/xen/interface/arch-arm.h   | 37 +++---
  include/hw/xen/interface/arch-x86/cpuid.h | 31 +---
  .../hw/xen/interface/arch-x86/xen-x86_32.h| 19 +--
  .../hw/xen/interface/arch-x86/xen-x86_64.h| 19 +--
  include/hw/xen/interface/arch-x86/xen.h   | 26 ++
  include/hw/xen/interface/event_channel.h  | 19 +--
  include/hw/xen/interface/features.h   | 19 +--
  include/hw/xen/interface/grant_table.h| 19 +--
  include/hw/xen/interface/hvm/hvm_op.h | 19 +--
  include/hw/xen/interface/hvm/params.h | 19 +--
  include/hw/xen/interface/io/blkif.h   | 27 --
  include/hw/xen/interface/io/console.h | 19 +--
  include/hw/xen/interface/io/fbif.h| 19 +--
  include/hw/xen/interface/io/kbdif.h   | 19 +--
  include/hw/xen/interface/io/netif.h   | 25 +++---
  include/hw/xen/interface/io/protocols.h   | 19 +--
  include/hw/xen/interface/io/ring.h| 49 ++-
  include/hw/xen/interface/io/usbif.h   | 19 +--
  include/hw/xen/interface/io/xenbus.h  | 19 +--
  include/hw/xen/interface/io/xs_wire.h | 36 ++
  include/hw/xen/interface/memory.h | 30 +---
  include/hw/xen/interface/physdev.h| 23 ++---
  include/hw/xen/interface/sched.h  | 19 +--
  include/hw/xen/interface/trace.h  | 19 +--
  include/hw/xen/interface/vcpu.h   | 19 +--
  include/hw/xen/interface/version.h| 19 +--
  include/hw/xen/interface/xen-compat.h | 19 +--
  include/hw/xen/interface/xen.h| 19 +--
  29 files changed, 124 insertions(+), 523 deletions(-)



Acked-by: Paul Durrant

Re: [PATCH 02/12] hw/xen: select kernel mode for per-vCPU event channel upcall vector

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:18, David Woodhouse wrote:

From: David Woodhouse 

A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.

For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:

/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);

That's explicitly setting the delivery to GSI#, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in QEMU
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.

Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.

Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)

Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c  | 6 ++
  include/sysemu/kvm_xen.h  | 1 +
  target/i386/kvm/xen-emu.c | 7 +++
  3 files changed, 14 insertions(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 4df973022c..d72dca6591 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -490,6 +490,12 @@ int xen_evtchn_set_callback_param(uint64_t param)
  break;
  }
  
+/* If the guest has set a per-vCPU callback vector, prefer that. */

+if (gsi && kvm_xen_has_vcpu_callback_vector()) {
+in_kernel = kvm_xen_has_cap(EVTCHN_SEND);
+gsi = 0;
+}
+


So this deals with setting the callback via after setting the upcall 
vector. What happens if the guest then disables the upcall vector (by 
setting it to zero)? Xen checks 'v->arch.hvm.evtchn_upcall_vector != 0' 
for every event delivery.


  Paul

Re: [PATCH 01/12] i386/xen: fix per-vCPU upcall vector for Xen emulation

2023-10-24 Thread Paul Durrant


On 16/10/2023 16:18, David Woodhouse wrote:

From: David Woodhouse 

The per-vCPU upcall vector support had two problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT.
And secondly it was using the wrong ioctl() to pass the vector to
the kernel and thus the *kernel* would always return -EINVAL.

Linux doesn't (yet) use this mode so it went without decent testing
for a while.

Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse 
---
  target/i386/kvm/xen-emu.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)


Reviewed-by: Paul Durrant

Re: [RFC PATCH v3 42/78] hw/i386: add fallthrough pseudo-keyword

2023-10-17 Thread Paul Durrant


On 13/10/2023 09:46, Emmanouil Pitsidianakis wrote:

In preparation of raising -Wimplicit-fallthrough to 5, replace all
fall-through comments with the fallthrough attribute pseudo-keyword.

Signed-off-by: Emmanouil Pitsidianakis 
---
  hw/i386/intel_iommu.c| 4 ++--
  hw/i386/kvm/xen_evtchn.c | 2 +-


Reviewed-by: Paul Durrant 


  hw/i386/x86.c| 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

Re: [PATCH] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()

2023-08-25 Thread Paul Durrant


On 23/08/2023 12:58, David Woodhouse wrote:

From: David Woodhouse 

Upstream Xen now ignores this flag¹, since the only guest kernel ever to
use it was buggy.

¹ https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=19c6cbd909

Signed-off-by: David Woodhouse 
---
We do take an argument to emulate a specific Xen version, and
*theoretically* we could ignore the VCPU_SSHOTTMR_future flag only if
we're emulating Xen 4.17 or newer? But I don't think it's worth doing
that (and QEMU won't be the only instance of Xen emulation or even real
older Xen versions patched to ignore the flag).

  target/i386/kvm/xen-emu.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 1/4] block: rename blk_io_plug_call() API to defer_call()

2023-08-18 Thread Paul Durrant


On 17/08/2023 16:58, Stefan Hajnoczi wrote:

Prepare to move the blk_io_plug_call() API out of the block layer so
that other subsystems call use this deferred call mechanism. Rename it
to defer_call() but leave the code in block/plug.c.

The next commit will move the code out of the block layer.

Suggested-by: Ilya Maximets 
Signed-off-by: Stefan Hajnoczi 
---
  include/sysemu/block-backend-io.h |   6 +-
  block/blkio.c |   8 +--
  block/io_uring.c  |   4 +-
  block/linux-aio.c |   4 +-
  block/nvme.c  |   4 +-
  block/plug.c  | 109 +++---
  hw/block/dataplane/xen-block.c|  10 +--
  hw/block/virtio-blk.c |   4 +-
  hw/scsi/virtio-scsi.c |   6 +-
  9 files changed, 76 insertions(+), 79 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel

2023-08-14 Thread Paul Durrant


On 08/08/2023 18:08, David Woodhouse wrote:

From: David Woodhouse 

XENFEAT_supervisor_mode_kernel shouldn't be set for HVM guests. It
confuses lscpu into thinking it's running in PVH mode.

No non-cosmetic effects have been observed so far.

Signed-off-by: David Woodhouse 
---
Only really cosmetic. Don't feel strongly about whether it makes 8.1.

  target/i386/kvm/xen-emu.c | 1 -
  1 file changed, 1 deletion(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH post-8.1] hw/xen: Clean up event channel 'type_val' handling to use union

2023-08-14 Thread Paul Durrant

in->remote_dom == xen_domid) {
-type_val = 0;
-} else {
+if (interdomain->remote_dom != DOMID_QEMU &&
+interdomain->remote_dom != DOMID_SELF &&
+interdomain->remote_dom != xen_domid) {
  return -ESRCH;
  }
  
@@ -1375,8 +1378,8 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)

  qemu_mutex_lock(&s->port_lock);
  
  /* The newly allocated port starts out as unbound */

-ret = allocate_port(s, 0, EVTCHNSTAT_unbound, type_val,
-&interdomain->local_port);
+ret = allocate_port(s, 0, EVTCHNSTAT_unbound, 0, &interdomain->local_port);
+
  if (ret) {
  goto out;
  }
@@ -1401,7 +1404,8 @@ int xen_evtchn_bind_interdomain_op(struct 
evtchn_bind_interdomain *interdomain)
  assign_kernel_eventfd(lp->type, xc->guest_port, xc->fd);
  }
  lp->type = EVTCHNSTAT_interdomain;
-lp->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU | 
interdomain->remote_port;
+lp->u.interdomain.to_qemu = 1;
+lp->u.interdomain.port = interdomain->remote_port;
  ret = 0;
  } else {
  /* Loopback */
@@ -1415,13 +1419,13 @@ int xen_evtchn_bind_interdomain_op(struct 
evtchn_bind_interdomain *interdomain)
   * the port that was just allocated for the local end.
   */
  if (interdomain->local_port != interdomain->remote_port &&
-rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
+rp->type == EVTCHNSTAT_unbound && !rp->u.interdomain.to_qemu) {
  
  rp->type = EVTCHNSTAT_interdomain;

-rp->type_val = interdomain->local_port;
+rp->u.interdomain.port = interdomain->local_port;
  
  lp->type = EVTCHNSTAT_interdomain;

-lp->type_val = interdomain->remote_port;
+lp->u.interdomain.port = interdomain->remote_port;
  } else {
  ret = -EINVAL;
  }
@@ -1440,7 +1444,6 @@ int xen_evtchn_bind_interdomain_op(struct 
evtchn_bind_interdomain *interdomain)
  int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc)
  {
  XenEvtchnState *s = xen_evtchn_singleton;
-uint16_t type_val;
  int ret;
  
  if (!s) {

@@ -1451,18 +1454,19 @@ int xen_evtchn_alloc_unbound_op(struct 
evtchn_alloc_unbound *alloc)
  return -ESRCH;
  }
  
-if (alloc->remote_dom == DOMID_QEMU) {

-type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU;
-} else if (alloc->remote_dom == DOMID_SELF ||
-   alloc->remote_dom == xen_domid) {
-type_val = 0;
-} else {
+if (alloc->remote_dom != DOMID_QEMU && alloc->remote_dom != DOMID_SELF &&
+alloc->remote_dom != xen_domid) {


Maybe vertically align the clauses here as in the 'if' a couple of hunks 
back?



  return -EPERM;
  }
  


Since all the above are nits...

Reviewed-by: Paul Durrant

Re: [PATCH v1] xen-platform: do full PCI reset during unplug of IDE devices

2023-07-27 Thread Paul Durrant


On 20/07/2023 08:29, Olaf Hering wrote:

The IDE unplug function needs to reset the entire PCI device, to make
sure all state is initialized to defaults. This is done by calling
pci_device_reset, which resets not only the chip specific registers, but
also all PCI state. This fixes "unplug" in a Xen HVM domU with the
modular legacy xenlinux PV drivers.

Commit ee358e919e38 ("hw/ide/piix: Convert reset handler to
DeviceReset") changed the way how the the disks are unplugged. Prior
this commit the PCI device remained unchanged. After this change,
piix_ide_reset is exercised after the "unplug" command, which was not
the case prior that commit. This function resets the command register.
As a result the ata_piix driver inside the domU will see a disabled PCI
device. The generic PCI code will reenable the PCI device. On the qemu
side, this runs pci_default_write_config/pci_update_mappings. Here a
changed address is returned by pci_bar_address, this is the address
which was truncated in piix_ide_reset. In case of a Xen HVM domU, the
address changes from 0xc120 to 0xc100. This truncation was a bug in
piix_ide_reset, which was fixed in commit 230dfd9257 ("hw/ide/piix:
properly initialize the BMIBA register"). If pci_xen_ide_unplug had used
pci_device_reset, the PCI registers would have been properly reset, and
commit ee358e919e38 would have not introduced a regression for this
specific domU environment.

While the unplug is supposed to hide the IDE disks, the changed BMIBA
address broke the UHCI device. In case the domU has an USB tablet
configured, to recive absolute pointer coordinates for the GUI, it will
cause a hang during device discovery of the partly discovered USB hid
device. Reading the USBSTS word size register will fail. The access ends
up in the QEMU piix-bmdma device, instead of the expected uhci device.
Here a byte size request is expected, and a value of ~0 is returned. As
a result the UCHI driver sees an error state in the register, and turns
off the UHCI controller.

Signed-off-by: Olaf Hering 
---
  hw/i386/xen/xen_platform.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)



Reviewed-by: Paul Durrant

Re: i386/xen: prevent guest from binding loopback event channel to itself

2023-07-26 Thread Paul Durrant


On 26/07/2023 10:07, David Woodhouse wrote:

On Wed, 2023-07-26 at 09:44 +0100, Paul Durrant wrote:

On 25/07/2023 11:05, David Woodhouse wrote:

From: David Woodhouse 

Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.

Signed-off-by: David Woodhouse 
---
   hw/i386/kvm/xen_evtchn.c | 11 +--
   1 file changed, 9 insertions(+), 2 deletions(-)



Reviewed-by: Paul Durrant 



Thanks. I'll change the title prefix to 'hw/xen' since it's in hw/ not
target/i386.


Yes, makes sense.


Please can I have also have a review for
https://lore.kernel.org/qemu-devel/20076888f6bdf06a65aafc5cf954260965d45b97.ca...@infradead.org/



Sorry I missed that. Done.

Cheers,

  Paul

Re: [PATCH] hw/xen: Clarify (lack of) error handling in transaction_commit()

2023-07-26 Thread Paul Durrant


On 20/06/2023 18:58, David Woodhouse wrote:

From: David Woodhouse 

Coverity was unhappy (CID 1508359) because we didn't check the return of
init_walk_op() in transaction_commit(), despite doing so at every other
call site.

Strictly speaking, this is a false positive since it can never fail. It
only fails for invalid user input (transaction ID or path), and both of
those are hard-coded to known sane values in this invocation.

But Coverity doesn't know that, and neither does the casual reader of the
code.

Returning an error here would be weird, since the transaction *is*
committed by this point; all the walk_op is doing is firing watches on
the newly-committed changed nodes. So make it a g_assert(!ret), since
it really should never happen.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xenstore_impl.c | 12 +++-
  1 file changed, 11 insertions(+), 1 deletion(-)



Reviewed-by: Paul Durrant

Re: i386/xen: prevent guest from binding loopback event channel to itself

2023-07-26 Thread Paul Durrant


On 25/07/2023 11:05, David Woodhouse wrote:

From: David Woodhouse 

Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] xen-block: Avoid leaks on new error path

2023-07-06 Thread Paul Durrant


On 04/07/2023 18:18, Anthony PERARD wrote:

From: Anthony PERARD 

Commit 189829399070 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.

So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return and thus taking
care of both success and error path.

Coverity only spotted the leak of qdicts (*_layer variables).

Reported-by: Peter Maydell 
Fixes: Coverity CID 1508722, 1398649
Fixes: 189829399070 ("xen-block: Use specific blockdev driver")
Signed-off-by: Anthony PERARD 
---
  hw/block/xen-block.c | 11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2] i386/xen: consistent locking around Xen singleshot timers

2023-07-06 Thread Paul Durrant


On 04/07/2023 16:51, David Woodhouse wrote:

From: David Woodhouse 

Coverity points out (CID 1507534, 1507968) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock and sometimes not.

This isn't always an issue. There are two modes for the timers; if the
kernel supports the EVTCHN_SEND capability then it handles all the timer
hypercalls and delivery internally, and all we use the field for is to
get/set the timer as part of the vCPU state via an ioctl(). If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.

But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.

Signed-off-by: David Woodhouse 
---
  target/i386/kvm/xen-emu.c | 36 ++--
  1 file changed, 26 insertions(+), 10 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] i386/xen: consistent locking around Xen singleshot timers

2023-05-23 Thread Paul Durrant


On 22/05/2023 19:52, David Woodhouse wrote:

From: David Woodhouse 

Coverity points out (CID 1507534) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock (eg in xen_vcpu_singleshot_timer_event()) and
sometimes not (the specific case Coverity complains about is in
do_vcpu_soft_reset()).

This isn't strictly an issue. There are two modes for the timers; if
the kernel supports the EVTCHN_SEND capability then it handles all the
timer hypercalls and delivery internally, and all we need to do is an
ioctl to get/set the next timer as part of the vCPU state. If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.

But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.

Signed-off-by: David Woodhouse 
---

On Tue, 2023-05-09 at 15:55 +0100, Peter Maydell wrote:

Hi; Coverity points out (CID 1507534) that we seem to sometimes
access env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock (eg in xen_vcpu_singleshot_timer_event())
and sometimes not (the specific case Coverity complains about is
in do_vcpu_soft_reset()). Is this a false positive, or is there


Thanks. As noted, I think it's harmless but it doesn't hurt to clean it
up a bit. I've pushed this to my tree at
https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv
on top of the other fixes that didn't make 8.0.

  target/i386/kvm/xen-emu.c | 15 +--
  1 file changed, 9 insertions(+), 6 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 RESEND] xen: Fix SEGV on domain disconnect

2023-04-24 Thread Paul Durrant


On 24/04/2023 14:17, Tim Smith wrote:

On Mon, Apr 24, 2023 at 1:08 PM Mark Syms  wrote:


Copying in Tim who did the final phase of the changes.

On Mon, 24 Apr 2023 at 11:32, Paul Durrant  wrote:


On 20/04/2023 12:02, mark.s...@citrix.com wrote:

From: Mark Syms 

Ensure the PV ring is drained on disconnect. Also ensure all pending
AIO is complete, otherwise AIO tries to complete into a mapping of the
ring which has been torn down.

Signed-off-by: Mark Syms 
---
CC: Stefano Stabellini 
CC: Anthony Perard 
CC: Paul Durrant 
CC: xen-de...@lists.xenproject.org

v2:
   * Ensure all inflight requests are completed before teardown
   * RESEND to fix formatting
---
   hw/block/dataplane/xen-block.c | 31 +--
   1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 734da42ea7..d9da4090bf 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -523,6 +523,10 @@ static bool xen_block_handle_requests(XenBlockDataPlane 
*dataplane)

   dataplane->more_work = 0;

+if (dataplane->sring == 0) {
+return done_something;
+}
+


I think you could just return false here... Nothing is ever going to be
done if there's no ring :-)


   rc = dataplane->rings.common.req_cons;
   rp = dataplane->rings.common.sring->req_prod;
   xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
@@ -666,14 +670,35 @@ void xen_block_dataplane_destroy(XenBlockDataPlane 
*dataplane >   void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
   {
   XenDevice *xendev;
+XenBlockRequest *request, *next;

   if (!dataplane) {
   return;
   }

+/* We're about to drain the ring. We can cancel the scheduling of any
+ * bottom half now */
+qemu_bh_cancel(dataplane->bh);
+
+/* Ensure we have drained the ring */
+aio_context_acquire(dataplane->ctx);
+do {
+xen_block_handle_requests(dataplane);
+} while (dataplane->more_work);
+aio_context_release(dataplane->ctx);
+


I don't think we want to be taking new requests, do we?



If we're in this situation and the guest has put something on the
ring, I think we should do our best with it.
We cannot just rely on the guest to be well-behaved, because they're
not :-( We're about to throw the
ring away, so whatever is there would otherwise be lost.


We only throw away our mapping. The memory belongs to the guest and it
should ensure it does not submit requests after the state has left
'connected'


This bit is
here to try to handle guests which are
less than diligent about their shutdown. We *should* always be past
this fast enough when the disconnect()/connect()
of XenbusStateConnected happens that all remains well (if not, we were
in a worse situation before).



What about a malicious guest that is piling requests into the ring. It 
could keep us in the loop forever, couldn't it?



+/* Now ensure that all inflight requests are complete */
+while (!QLIST_EMPTY(&dataplane->inflight)) {
+QLIST_FOREACH_SAFE(request, &dataplane->inflight, list, next) {
+blk_aio_flush(request->dataplane->blk, xen_block_complete_aio,
+request);
+}
+}
+


I think this could possibly be simplified by doing the drain after the
call to blk_set_aio_context(), as long as we set dataplane->ctx to
qemu_get_aio_context(). Alos, as long as more_work is not set then it
should still be safe to cancel the bh before the drain AFAICT.


I'm not sure what you mean by simpler? Possibly I'm not getting something.



Sorry, I was referring to the need to do aio_context_acquire() calls but 
they are only around the disputed xen_block_handle_requests() call 
anyway, so there's no simplification in this bit.



We have to make sure that any "aio_bh_schedule_oneshot_full()" which
happens as a result of
"blk_aio_flush()" has finished before any change of AIO context,
because it tries to use the one which
was current at the time of being called (I have the SEGVs to prove it
:-)).


Ok, I had assumed that the issue was the context being picked up inside 
the xen_block_complete_aio() call.



Whether that happens before or after
"blk_set_aio_context(qemu_get_aio_context())" doesn't seem to be a
change in complexity to me.

Motivation was to get as much as possible to happen in the way it
"normally" would, so that future changes
are less likely to regress, but as mentioned maybe I'm missing something.

The BH needs to be prevented from firing ASAP, otherwise the
disconnect()/connect() which happens when
XenbusStateConnected can have the bh fire from what the guest does
next right in the middle of juggling
contexts for the disconnect() (I have the SEGVs from that too...).



So if you drop the ring drain then this patch should still stop the 
SEGVs, right?


  Paul

Re: [PATCH v2 RESEND] xen: Fix SEGV on domain disconnect

2023-04-24 Thread Paul Durrant


On 20/04/2023 12:02, mark.s...@citrix.com wrote:

From: Mark Syms 

Ensure the PV ring is drained on disconnect. Also ensure all pending
AIO is complete, otherwise AIO tries to complete into a mapping of the
ring which has been torn down.

Signed-off-by: Mark Syms 
---
CC: Stefano Stabellini 
CC: Anthony Perard 
CC: Paul Durrant 
CC: xen-de...@lists.xenproject.org

v2:
  * Ensure all inflight requests are completed before teardown
  * RESEND to fix formatting
---
  hw/block/dataplane/xen-block.c | 31 +--
  1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 734da42ea7..d9da4090bf 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -523,6 +523,10 @@ static bool xen_block_handle_requests(XenBlockDataPlane 
*dataplane)
  
  dataplane->more_work = 0;
  
+if (dataplane->sring == 0) {

+return done_something;
+}
+


I think you could just return false here... Nothing is ever going to be 
done if there's no ring :-)



  rc = dataplane->rings.common.req_cons;
  rp = dataplane->rings.common.sring->req_prod;
  xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
@@ -666,14 +670,35 @@ void xen_block_dataplane_destroy(XenBlockDataPlane 
*dataplane >   void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
  {
  XenDevice *xendev;
+XenBlockRequest *request, *next;
  
  if (!dataplane) {

  return;
  }
  
+/* We're about to drain the ring. We can cancel the scheduling of any

+ * bottom half now */
+qemu_bh_cancel(dataplane->bh);
+
+/* Ensure we have drained the ring */
+aio_context_acquire(dataplane->ctx);
+do {
+xen_block_handle_requests(dataplane);
+} while (dataplane->more_work);
+aio_context_release(dataplane->ctx);
+


I don't think we want to be taking new requests, do we?


+/* Now ensure that all inflight requests are complete */
+while (!QLIST_EMPTY(&dataplane->inflight)) {
+QLIST_FOREACH_SAFE(request, &dataplane->inflight, list, next) {
+blk_aio_flush(request->dataplane->blk, xen_block_complete_aio,
+request);
+}
+}
+


I think this could possibly be simplified by doing the drain after the 
call to blk_set_aio_context(), as long as we set dataplane->ctx to 
qemu_get_aio_context(). Alos, as long as more_work is not set then it 
should still be safe to cancel the bh before the drain AFAICT.


  Paul


  xendev = dataplane->xendev;
  
  aio_context_acquire(dataplane->ctx);

+
  if (dataplane->event_channel) {
  /* Only reason for failure is a NULL channel */
  xen_device_set_event_channel_context(xendev, dataplane->event_channel,
@@ -684,12 +709,6 @@ void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
  blk_set_aio_context(dataplane->blk, qemu_get_aio_context(), &error_abort);
  aio_context_release(dataplane->ctx);
  
-/*

- * Now that the context has been moved onto the main thread, cancel
- * further processing.
- */
-qemu_bh_cancel(dataplane->bh);
-
  if (dataplane->event_channel) {
  Error *local_err = NULL;

Re: [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore

2023-04-20 Thread Paul Durrant


On 19/04/2023 18:28, Stefan Hajnoczi wrote:

There is no need to suspend activity between aio_disable_external() and
aio_enable_external(), which is mainly used for the block layer's drain
operation.

This is part of ongoing work to remove the aio_disable_external() API.

Reviewed-by: David Woodhouse 
Signed-off-by: Stefan Hajnoczi 
---
  hw/i386/kvm/xen_xenstore.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



Reviewed-by: Paul Durrant

Re: [PATCH 5/5] hw/xen: Fix broken check for invalid state in xs_be_open()

2023-04-17 Thread Paul Durrant


On 12/04/2023 19:51, David Woodhouse wrote:

From: David Woodhouse 

Coverity points out that if (!s && !s->impl) isn't really what we intended
to do here. CID 1508131.

Fixes: 032475127225 ("hw/xen: Add emulated implementation of XenStore 
operations")
Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



Reviewed-by: Paul Durrant

Re: [PATCH 4/5] hw/xen: Fix double-free in xen_console store_con_info()

2023-04-17 Thread Paul Durrant


On 12/04/2023 19:51, David Woodhouse wrote:

From: David Woodhouse 

Coverity spotted a double-free (CID 1508254); we g_string_free(path) and
then for some reason immediately call free(path) too.

We should just use g_autoptr() for it anyway, which simplifies the code
a bit.

Fixes: 7a8a749da7d3 ("hw/xen: Move xenstore_store_pv_console_info to 
xen_console.c")
Signed-off-by: David Woodhouse 
---
  hw/char/xen_console.c | 13 +++--
  1 file changed, 3 insertions(+), 10 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 3/5] xen: Drop support for Xen versions below 4.7.1

2023-04-17 Thread Paul Durrant


On 12/04/2023 19:51, David Woodhouse wrote:

From: David Woodhouse 

In restructuring to allow for internal emulation of Xen functionality,
I broke compatibility for Xen 4.6 and earlier. Fix this by explicitly
removing support for anything older than 4.7.1, which is also ancient
but it does still build, and the compatibility support for it is fairly
unintrusive.

Fixes: 15e283c5b684 ("hw/xen: Add foreignmem operations to allow redirection to 
internal emulation")
Signed-off-by: David Woodhouse 
---
  hw/xen/xen-operations.c |  57 +--
  include/hw/xen/xen_native.h | 107 +---
  meson.build |   5 +-
  scripts/xen-detect.c|  60 
  4 files changed, 3 insertions(+), 226 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH 2/5] hw/xen: Fix memory leak in libxenstore_open() for Xen

2023-04-17 Thread Paul Durrant


On 12/04/2023 19:50, David Woodhouse wrote:

From: David Woodhouse 

There was a superfluous allocation of the XS handle, leading to it
being leaked on both the error path and the success path (where it gets
allocated again).

Spotted by Coverity (CID 1508098).

Fixes: ba2a92db1ff6 ("hw/xen: Add xenstore operations to allow redirection to 
internal emulation")
Suggested-by: Peter Maydell 
Signed-off-by: David Woodhouse 
Reviewed-by: Peter Maydell 


Reviewed-by: Paul Durrant

Re: [PATCH] hw/xen: Simplify emulated Xen platform init

2023-03-24 Thread Paul Durrant


On 10/03/2023 10:54, David Woodhouse wrote:

From: David Woodhouse 

I initially put the basic platform init (overlay pages, grant tables,
event channels) into mc->kvm_type because that was the earliest place
that could sensibly test for xen_mode==XEN_EMULATE.

The intent was to do this early enough that we could then initialise the
XenBus and other parts which would have depended on them, from a generic
location for both Xen and KVM/Xen in the PC-specific code, as seen in
https://lore.kernel.org/qemu-devel/20230116221919.1124201-16-dw...@infradead.org/

However, then the Xen on Arm patches came along, and *they* wanted to
do the XenBus init from a 'generic' Xen-specific location instead:
https://lore.kernel.org/qemu-devel/20230210222729.957168-4-sstabell...@kernel.org/

Since there's no generic location that covers all three, I conceded to
do it for XEN_EMULATE mode in pc_basic_devices_init().

And now there's absolutely no point in having some of the platform init
done from pc_machine_kvm_type(); we can move it all up to live in a
single place in pc_basic_devices_init(). This has the added benefit that
we can drop the separate xen_evtchn_connect_gsis() function completely,
and pass just the system GSIs in directly to xen_evtchn_create().

While I'm at it, it does no harm to explicitly pass in the *number* of
said GSIs, because it does make me twitch a bit to pass an array of
impicit size. During the lifetime of the KVM/Xen patchset, that had
already changed (albeit just cosmetically) from GSI_NUM_PINS to
IOAPIC_NUM_PINS.

And document a bit better that this is for the *output* GSI for raising
CPU0's events when the per-CPU vector isn't available. The fact that
we create a whole set of them and then only waggle the one we're told
to, instead of having a single output and only *connecting* it to the
GSI that it should be connected to, is still non-intuitive for me.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c | 40 
  hw/i386/kvm/xen_evtchn.h |  3 +--
  hw/i386/pc.c | 13 -
  3 files changed, 25 insertions(+), 31 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH] hw/xenpv: Initialize Xen backend operations

2023-03-23 Thread Paul Durrant


On 23/03/2023 10:57, David Woodhouse wrote:

From: David Woodhouse 

As the Xen backend operations were abstracted out into a function table to
allow for internally emulated Xen support, we missed the xen_init_pv()
code path which also needs to install the operations for the true Xen
libraries. Add the missing call to setup_xen_backend_ops().

Fixes: b6cacfea0b38 ("hw/xen: Add evtchn operations to allow redirection to internal 
emulation")
Reported-by: Anthony PERARD 
Signed-off-by: David Woodhouse 
---
  hw/xenpv/xen_machine_pv.c | 2 ++
  1 file changed, 2 insertions(+)


Reviewed-by: Paul Durrant 



diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 2e759d0619..17cda5ec13 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -35,6 +35,8 @@ static void xen_init_pv(MachineState *machine)
  DriveInfo *dinfo;
  int i;
  
+setup_xen_backend_ops();

+
  /* Initialize backend core & drivers */
  xen_be_init();

Re: [PATCH] accel/xen: Fix DM state change notification in dm_restrict mode

2023-03-14 Thread Paul Durrant


On 14/03/2023 08:35, David Woodhouse wrote:

From: David Woodhouse 

When dm_restrict is set, QEMU isn't permitted to update the XenStore node
to indicate its running status. Previously, the xs_write() call would fail
but the failure was ignored.

However, in refactoring to allow for emulated XenStore operations, a new
call to xs_open() was added. That one didn't fail gracefully, causing a
fatal error when running in dm_restrict mode.

Partially revert the offending patch, removing the additional call to
xs_open() because the global 'xenstore' variable is still available; it
just needs to be used with qemu_xen_xs_write() now instead of directly
with the xs_write() libxenstore function.

Also make the whole thing conditional on !xen_domid_restrict. There's no
point even registering the state change handler to attempt to update the
XenStore node when we know it's destined to fail.

Fixes: ba2a92db1ff6 ("hw/xen: Add xenstore operations to allow redirection to 
internal emulation")
Reported-by: Jason Andryuk 
Co-developed-by: Jason Andryuk 
Not-Signed-off-by: Jason Andryuk 
Signed-off-by: David Woodhouse 
Will-be-Tested-by: Jason Andryuk 
---
  accel/xen/xen-all.c | 27 ++-
  1 file changed, 10 insertions(+), 17 deletions(-)



Reviewed-by: Paul Durrant

Re: [PATCH v2 00/27] Enable PV backends with Xen/KVM emulation

2023-03-07 Thread Paul Durrant


On 07/03/2023 17:17, David Woodhouse wrote:

Following on from the basic platform support which has already been
merged, here's phase 2 which wires up the XenBus and PV back ends.

It starts with a basic single-tenant internal implementation of a
XenStore, with a copy-on-write tree, watches, transactions, quotas.

Then we introduce operations tables for the grant table, event channel,
foreignmen and xenstore operations so that in addition to using the Xen
libraries for those, QEMU can use its internal emulated versions.

A little bit of cleaning up of header files, and we can enable the build
of xen-bus in the CONFIG_XEN_EMU build, and run a Xen guest with an
actual PV disk...

qemu-system-x86_64 -serial mon:stdio -M q35 -display none -m 1G -smp 2 \
   -accel kvm,xen-version=0x4000e,kernel-irqchip=split \
   -kernel bzImage -append "console=ttyS0 root=/dev/xvda1 selinux=0" \
   -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
   -device xen-disk,drive=disk,vdev=xvda

The main thing that isn't working here is migration. I've implemented it
for the internal xenstore and the unit tests exercise it, but the
existing PV back ends don't support it, perhaps partly because support
for guest transparent live migration support isn't upstream in Xen yet.
So the disk doesn't come back correctly after migration. I'm content
with that for 8.0 though, and we just mark the emulated XenStore device
as unmigratable to prevent users from trying.

The other pre-existing constraint is that only the block back end has
yet been ported to the "new" XenBus infrastructure, and is actually
capable of creating its own backend nodes. Again, I can live with
that for 8.0. Maybe this will motivate us to finally get round to
converting the rest off XenLegacyBackend and killing it.

We also don't have a simple way to perform grant mapping of multiple
guest pages to contiguous addresses, as we can under real Xen. So we
don't advertise max-ring-page-order for xen-disk in the emulated mode.
Fixing that — if we actually want to — would probably require mapping
RAM from an actual backing store object, so that it can be mapped again
at a different location for the PV back end to see.

v2: https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-2

  • Full set of reviewed-by tags from Paul (and associated minor fixes).

  • Disable migration for emulated XenStore device.

  • Update docs and add MAINTAINERS entry.

v1: 
https://lore.kernel.org/qemu-devel/20230302153435.1170111-1-dw...@infradead.org/
 https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-1

David Woodhouse (23):
   hw/xen: Add xenstore wire implementation and implementation stubs
   hw/xen: Add basic XenStore tree walk and write/read/directory support
   hw/xen: Implement XenStore watches
   hw/xen: Implement XenStore transactions
   hw/xen: Watches on XenStore transactions
   hw/xen: Implement core serialize/deserialize methods for xenstore_impl
   hw/xen: Add evtchn operations to allow redirection to internal emulation
   hw/xen: Add gnttab operations to allow redirection to internal emulation
   hw/xen: Pass grant ref to gnttab unmap operation
   hw/xen: Add foreignmem operations to allow redirection to internal 
emulation
   hw/xen: Move xenstore_store_pv_console_info to xen_console.c
   hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
   hw/xen: Rename xen_common.h to xen_native.h
   hw/xen: Build PV backend drivers for CONFIG_XEN_BUS
   hw/xen: Only advertise ring-page-order for xen-block if gnttab supports 
it
   hw/xen: Hook up emulated implementation for event channel operations
   hw/xen: Add emulated implementation of grant table operations
   hw/xen: Add emulated implementation of XenStore operations
   hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
   hw/xen: Implement soft reset for emulated gnttab
   i386/xen: Initialize Xen backends from pc_basic_device_init() for 
emulation
   MAINTAINERS: Add entry for Xen on KVM emulation
   docs: Update Xen-on-KVM documentation for PV disk support

Paul Durrant (4):
   hw/xen: Implement XenStore permissions
   hw/xen: Create initial XenStore nodes
   hw/xen: Add xenstore operations to allow redirection to internal 
emulation
   hw/xen: Avoid crash when backend watch fires too early

  MAINTAINERS   |9 +
  accel/xen/xen-all.c   |   69 +-
  docs/system/i386/xen.rst  |   30 +-
  hw/9pfs/meson.build   |2 +-
  hw/9pfs/xen-9p-backend.c  |   32 +-
  hw/block/dataplane/meson.build|2 +-
  hw/block/dataplane/xen-block.c|   12 +-
  hw/block/meson.build  |2 +-
  hw/block/xen

Re: [PATCH v2 25/27] i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation

2023-03-07 Thread Paul Durrant


On 07/03/2023 17:17, David Woodhouse wrote:

From: David Woodhouse 

Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
  -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
  -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
  -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
  -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
---
  hw/i386/pc.c | 7 +++
  1 file changed, 7 insertions(+)



Also...

Tested-by: Paul Durrant 

... on real Xen (master branch, 4.18) with a Debian guest.

Re: [RFC PATCH v1 07/25] hw/xen: Implement core serialize/deserialize methods for xenstore_impl

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:59, Paul Durrant wrote:

On 07/03/2023 16:52, David Woodhouse wrote:

On Tue, 2023-03-07 at 16:39 +, Paul Durrant wrote:

On 07/03/2023 16:33, David Woodhouse wrote:

On Thu, 2023-03-02 at 15:34 +, David Woodhouse wrote:

From: David Woodhouse 

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? 
Will

defer to Paul for that.

Signed-off-by: David Woodhouse 


Paul, your Reviewed-by: on this one is conspicuous in its absence. I
mentioned migration in the cover letter — this much is working fine,
but it's the PV back ends that don't yet work.

I'd quite like to merge the basic serialization/deserialization of
XenStore itself, with the unit tests.


The patch is basically ok, I just think the serialization should be
limited to the guest nodes... filtering out those not owned by xen_domid
would probably work for that.


Yeah, so let's just do this (as part of this patch #7) for now:

--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -235,6 +235,7 @@ static int xen_xenstore_post_load(void *opaque, int
ver)
  static const VMStateDescription xen_xenstore_vmstate = {
  .name = "xen_xenstore",
+    .unmigratable = 1, /* The PV back ends don't migrate yet */
  .version_id = 1,
  .minimum_version_id = 1,
  .needed = xen_xenstore_is_needed,


It means we can't migrate guests even if they're only using fully
emulated devices... but I think that's a reasonable limitation until we
implement it fully.



Ok. With that added...

Revieweed-by: Paul Durrant 


Typoed, sorry...

Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 07/25] hw/xen: Implement core serialize/deserialize methods for xenstore_impl

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:52, David Woodhouse wrote:

On Tue, 2023-03-07 at 16:39 +, Paul Durrant wrote:

On 07/03/2023 16:33, David Woodhouse wrote:

On Thu, 2023-03-02 at 15:34 +, David Woodhouse wrote:

From: David Woodhouse 

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? Will
defer to Paul for that.

Signed-off-by: David Woodhouse 


Paul, your Reviewed-by: on this one is conspicuous in its absence. I
mentioned migration in the cover letter — this much is working fine,
but it's the PV back ends that don't yet work.

I'd quite like to merge the basic serialization/deserialization of
XenStore itself, with the unit tests.


The patch is basically ok, I just think the serialization should be
limited to the guest nodes... filtering out those not owned by xen_domid
would probably work for that.


Yeah, so let's just do this (as part of this patch #7) for now:

--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -235,6 +235,7 @@ static int xen_xenstore_post_load(void *opaque, int
ver)
  
  static const VMStateDescription xen_xenstore_vmstate = {

  .name = "xen_xenstore",
+.unmigratable = 1, /* The PV back ends don't migrate yet */
  .version_id = 1,
  .minimum_version_id = 1,
  .needed = xen_xenstore_is_needed,


It means we can't migrate guests even if they're only using fully
emulated devices... but I think that's a reasonable limitation until we
implement it fully.



Ok. With that added...

Revieweed-by: Paul Durrant

Re: [RFC PATCH v1 07/25] hw/xen: Implement core serialize/deserialize methods for xenstore_impl

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:39, Paul Durrant wrote:

On 07/03/2023 16:33, David Woodhouse wrote:

On Thu, 2023-03-02 at 15:34 +, David Woodhouse wrote:

From: David Woodhouse 

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? Will
defer to Paul for that.

Signed-off-by: David Woodhouse 


Paul, your Reviewed-by: on this one is conspicuous in its absence. I
mentioned migration in the cover letter — this much is working fine,
but it's the PV back ends that don't yet work.

I'd quite like to merge the basic serialization/deserialization of
XenStore itself, with the unit tests.


The patch is basically ok, I just think the serialization should be 
limited to the guest nodes... filtering out those not owned by xen_domid 
would probably work for that.




Perhaps we can also set TYPE_XEN_DEVICE or TYPE_XEN_BLOCK_DEVICE to be
unmigratable? Ideally I think we want TYPE_XEN_DEVICE to be
unmigratable by default *unless* the specific device class (including
net and other as we port them from XenLegacyDevice) says otherwise.



Yes, that sounds like a good idea.


Is there a way to do that?


Not something I've looked into. I'll go look now.



Maybe calling migrate_add_blocker() in the realize method is the right 
way to achieve this?


  Paul

Re: [RFC PATCH v1 07/25] hw/xen: Implement core serialize/deserialize methods for xenstore_impl

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:33, David Woodhouse wrote:

On Thu, 2023-03-02 at 15:34 +, David Woodhouse wrote:

From: David Woodhouse 

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? Will
defer to Paul for that.

Signed-off-by: David Woodhouse 


Paul, your Reviewed-by: on this one is conspicuous in its absence. I
mentioned migration in the cover letter — this much is working fine,
but it's the PV back ends that don't yet work.

I'd quite like to merge the basic serialization/deserialization of
XenStore itself, with the unit tests.


The patch is basically ok, I just think the serialization should be 
limited to the guest nodes... filtering out those not owned by xen_domid 
would probably work for that.




Perhaps we can also set TYPE_XEN_DEVICE or TYPE_XEN_BLOCK_DEVICE to be
unmigratable? Ideally I think we want TYPE_XEN_DEVICE to be
unmigratable by default *unless* the specific device class (including
net and other as we port them from XenLegacyDevice) says otherwise.



Yes, that sounds like a good idea.


Is there a way to do that?


Not something I've looked into. I'll go look now.

  Paul

Re: [RFC PATCH v1 27/25] docs: Update Xen-on-KVM documentation for PV disk support

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:22, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
  docs/system/i386/xen.rst | 30 +++---
  1 file changed, 23 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 26/25] MAINTAINERS: Add entry for Xen on KVM emulation

2023-03-07 Thread Paul Durrant


On 07/03/2023 16:21, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
  MAINTAINERS | 9 +
  1 file changed, 9 insertions(+)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 25/25] i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
  -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
  -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
  -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
  -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.

Signed-off-by: David Woodhouse 
---
  hw/i386/pc.c | 7 +++
  1 file changed, 7 insertions(+)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 24/25] hw/xen: Implement soft reset for emulated gnttab

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_gnttab.c  | 26 --
  hw/i386/kvm/xen_gnttab.h  |  1 +
  target/i386/kvm/xen-emu.c |  5 +
  3 files changed, 30 insertions(+), 2 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 23/25] hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 028f80499e..f9b7387024 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -21,6 +21,7 @@
  
  #include "hw/sysbus.h"

  #include "hw/xen/xen.h"
+#include "hw/xen/xen_backend_ops.h"
  #include "xen_overlay.h"
  #include "xen_evtchn.h"
  #include "xen_xenstore.h"
@@ -34,6 +35,7 @@
  
  #include "hw/xen/interface/io/xs_wire.h"

  #include "hw/xen/interface/event_channel.h"
+#include "hw/xen/interface/grant_table.h"
  
  #define TYPE_XEN_XENSTORE "xen-xenstore"

  OBJECT_DECLARE_SIMPLE_TYPE(XenXenstoreState, XEN_XENSTORE)
@@ -66,6 +68,9 @@ struct XenXenstoreState {
  
  uint8_t *impl_state;

  uint32_t impl_state_size;
+
+struct xengntdev_handle *gt;
+void *granted_xs;
  };
  
  struct XenXenstoreState *xen_xenstore_singleton;

@@ -1452,6 +1457,17 @@ int xen_xenstore_reset(void)
  }
  s->be_port = err;
  
+/*

+ * We don't actually access the guest's page through the grant, because
+ * this isn't real Xen, and we can just use the page we gave it in the
+ * first place. Map the grant anyway, mostly for cosmetic purposes so
+ * it *looks* like it's in use in the guest-visible grant table.


Might be useful to stick this text in the commit comment too.

Reviewed-by: Paul Durrant 


+ */
+s->gt = qemu_xen_gnttab_open();
+uint32_t xs_gntref = GNTTAB_RESERVED_XENSTORE;
+s->granted_xs = qemu_xen_gnttab_map_refs(s->gt, 1, xen_domid, &xs_gntref,
+ PROT_READ | PROT_WRITE);
+
  return 0;
  }

Re: [RFC PATCH v1 22/25] hw/xen: Add emulated implementation of XenStore operations

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.


Yes, it's more consistent with watch events on real Xen this way.



Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_xenstore.c | 273 -
  1 file changed, 269 insertions(+), 4 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 21/25] hw/xen: Add emulated implementation of grant table operations

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.

Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_gnttab.c | 299 ++-
  1 file changed, 296 insertions(+), 3 deletions(-)


[snip]

+static uint64_t gnt_ref(XenGnttabState *s, grant_ref_t ref, int prot)
+{
+uint16_t mask = GTF_type_mask | GTF_sub_page;
+grant_entry_v1_t gnt, *gnt_p;
+int retries = 0;
+
+if (ref >= s->max_frames * ENTRIES_PER_FRAME_V1 ||
+s->map_track[ref] == UINT8_MAX) {
+return INVALID_GPA;
+}
+
+if (prot & PROT_WRITE) {
+mask |= GTF_readonly;
+}
+
+gnt_p = &s->entries.v1[ref];
+
+/*
+ * The guest can legitimately be changing the GTF_readonly flag. Allow


I'd call a guest playing with the ref after setting GTF_permit_access a 
buggy guest and not bother with the loop.



+ * that, but don't let a malicious guest cause a livelock.
+ */
+for (retries = 0; retries < 5; retries++) {
+uint16_t new_flags;
+
+/* Read the entry before an atomic operation on its flags */
+gnt = *(volatile grant_entry_v1_t *)gnt_p;
+
+if ((gnt.flags & mask) != GTF_permit_access ||
+gnt.domid != DOMID_QEMU) {
+return INVALID_GPA;
+}
+
+new_flags = gnt.flags | GTF_reading;
+if (prot & PROT_WRITE) {
+new_flags |= GTF_writing;
+}
+
+if (qatomic_cmpxchg(&gnt_p->flags, gnt.flags, new_flags) == gnt.flags) 
{


Xen actually does a cmpxchg on both the flags and the domid. We probably 
ought to fail to set the flags if the guest is playing with the domid 
but since we're single-tenant it doesn't *really* matter... just a 
nice-to-have. So...


Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 20/25] hw/xen: Hook up emulated implementation for event channel operations

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.


It is indeed an amazing coincidence :-)



Signed-off-by: David Woodhouse 
---
  hw/i386/kvm/xen_evtchn.c | 15 +++
  1 file changed, 15 insertions(+)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 19/25] hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

Whem emulating Xen, multi-page grants are distinctly non-trivial and we
have elected not to support them for the time being. Don't advertise
them to the guest.

Signed-off-by: David Woodhouse 
---
  hw/block/xen-block.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 18/25] hw/xen: Avoid crash when backend watch fires too early

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: Paul Durrant 

The xen-block code ends up calling aio_poll() through blkconf_geometry(),
which means we see watch events during the indirect call to
xendev_class->realize() in xen_device_realize(). Unfortunately this call
is made before populating the initial frontend and backend device nodes
in xenstore and hence xen_block_frontend_changed() (which is called from
a watch event) fails to read the frontend's 'state' node, and hence
believes the device is being torn down. This in-turn sets the backend
state to XenbusStateClosed and causes the device to be deleted before it
is fully set up, leading to the crash.
By simply moving the call to xendev_class->realize() after the initial
xenstore nodes are populated, this sorry state of affairs is avoided.

Reported-by: David Woodhouse 
Signed-off-by: Paul Durrant 
Signed-off-by: David Woodhouse 
---
  hw/xen/xen-bus.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 17/25] hw/xen: Build PV backend drivers for CONFIG_XEN_BUS

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

Now that we have the redirectable Xen backend operations we can build the
PV backends even without the Xen libraries.

Signed-off-by: David Woodhouse 
---
  hw/9pfs/meson.build| 2 +-
  hw/block/dataplane/meson.build | 2 +-
  hw/block/meson.build   | 2 +-
  hw/char/meson.build| 2 +-
  hw/display/meson.build | 2 +-
  hw/usb/meson.build | 2 +-
  hw/xen/meson.build | 5 -
  7 files changed, 10 insertions(+), 7 deletions(-)



Reviewed-by: Paul Durrant

Re: [RFC PATCH v1 16/25] hw/xen: Rename xen_common.h to xen_native.h

2023-03-07 Thread Paul Durrant


On 02/03/2023 15:34, David Woodhouse wrote:

From: David Woodhouse 

This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.

Signed-off-by: David Woodhouse 
---
  accel/xen/xen-all.c   |  1 +
  hw/9pfs/xen-9p-backend.c  |  1 +
  hw/block/dataplane/xen-block.c|  3 ++-
  hw/block/xen-block.c  |  1 -
  hw/i386/pc_piix.c |  4 ++--
  hw/i386/xen/xen-hvm.c | 11 +-
  hw/i386/xen/xen-mapcache.c|  2 +-
  hw/i386/xen/xen_platform.c|  7 +++---
  hw/xen/trace-events   |  2 +-
  hw/xen/xen-operations.c   |  2 +-
  hw/xen/xen_pt.c   |  2 +-
  hw/xen/xen_pt.h   |  2 +-
  hw/xen/xen_pt_config_init.c   |  2 +-
  hw/xen/xen_pt_msi.c   |  4 ++--
  include/hw/xen/xen.h  | 22 ---
  include/hw/xen/{xen_common.h => xen_native.h} | 10 ++---
  include/hw/xen/xen_pvdev.h|  3 ++-
  17 files changed, 47 insertions(+), 32 deletions(-)
  rename include/hw/xen/{xen_common.h => xen_native.h} (98%)



Reviewed-by: Paul Durrant

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1001 matches

Mail list logo