date:20240322

Re: [PATCH 1/3] ui/cocoa: Fix aspect ratio

2024-03-22 Thread Akihiko Odaki


On 2024/03/22 21:55, Peter Maydell wrote:

On Fri, 22 Mar 2024 at 12:25, Akihiko Odaki  wrote:


On 2024/03/22 21:22, Peter Maydell wrote:

On Mon, 18 Mar 2024 at 07:53, Akihiko Odaki  wrote:


[NSWindow setContentAspectRatio:] does not trigger window resize itself,
so the wrong aspect ratio will persist if nothing resizes the window.
Call [NSWindow setContentSize:] in such a case.

Fixes: 91aa508d0274 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki 
---
   ui/cocoa.m | 23 ++-
   1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index fa879d7dcd4b..d6a5b462f78b 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -508,6 +508,25 @@ - (void) drawRect:(NSRect) rect
   }
   }

+- (NSSize)fixAspectRatio:(NSSize)original
+{
+NSSize scaled;
+NSSize fixed;
+
+scaled.width = screen.width * original.height;
+scaled.height = screen.height * original.width;
+
+if (scaled.width < scaled.height) {


Is this a standard algorithm for scaling with a fixed
aspect ratio? It looks rather weird to be comparing
a width against a height here, and to be multiplying a
width by a height.


Not sure if it's a standard one, but it's the algorithm with the least error
that I came up with.


OK. Maybe a comment would help (at least it helps me in thinking
through the code :-))

  /*
   * Here screen is our guest's output size, and original is the
   * size of the largest possible area of the screen we can display on.
   * We want to scale up (screen.width x screen.height) by either:
   *   1) original.height / screen.height
   *   2) original.width / screen.width
   * With the first scale factor the scale will result in an output
   * height of original.height (i.e. we will fill the whole height
   * of the available screen space and have black bars left and right)
   * and with the second scale factor the scaling will result in an
   * output width of original.width (i.e. we fill the whole width of
   * the available screen space and have black bars top and bottom).
   * We need to pick whichever keeps the whole of the guest output
   * on the screen, which is to say the smaller of the two scale factors.
   * To avoid doing more division than strictly necessary, instead
   * of directly comparing scale factors 1 and 2 we instead
   * calculate and compare those two scale factors multiplied by
   * (screen.height * screen.width).
   */

Having written that out, it seems to me that the variable
names here could more clearly reflect what they're doing
(eg "screen" is not the size of the screen we're displaying
on, "original" is not the old displayable area size but the
new one we're trying to fit into, scaled doesn't actually
contain a (width, height) that go together with each other,
and it doesn't contain the actual scale factor we're going to
be using either).


With v2, I added the comment and renamed original to max as it's the 
largest area we can display on as described in the comment.


screen and scaled are not renamed. Renaming screen is a bit out of scope
for this patch as it's an existing variable. The variable is referenced
from several places, so a patch to rename it would be a bit large and not
suited for inclusion in a bug fix series. I just couldn't come up with a
good name for scaled.


Regards,
Akihiko Odaki




+fixed.width = scaled.width / screen.height;
+fixed.height = original.height;
+} else {
+fixed.width = original.width;
+fixed.height = scaled.height / screen.width;
+}
+
+return fixed;
+}
+


thanks
-- PMM

[PATCH v2 3/3] ui/cocoa: Use NSTrackingInVisibleRect

2024-03-22 Thread Akihiko Odaki

I observed [NSTrackingArea rect] becomes de-synchronized with the view
frame with some unknown condition, and fails to track mouse movement on
some area of the view. Specify NSTrackingInVisibleRect option to let
Cocoa automatically update NSTrackingArea, which also saves code for
synchronization.

Fixes: 91aa508d0274 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Peter Maydell 
---
 ui/cocoa.m | 48 ++--
 1 file changed, 14 insertions(+), 34 deletions(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index 3a1b899ba768..fb60debb9a8e 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -306,7 +306,6 @@ static void handleAnyDeviceErrors(Error * err)
 */
 @interface QemuCocoaView : NSView
 {
-NSTrackingArea *trackingArea;
 QEMUScreen screen;
 pixman_image_t *pixman_image;
 QKbdState *kbd;
@@ -359,6 +358,19 @@ - (id)initWithFrame:(NSRect)frameRect
 self = [super initWithFrame:frameRect];
 if (self) {
 
+NSTrackingAreaOptions options = NSTrackingActiveInKeyWindow |
+NSTrackingMouseEnteredAndExited |
+NSTrackingMouseMoved |
+NSTrackingInVisibleRect;
+
+NSTrackingArea *trackingArea =
+[[NSTrackingArea alloc] initWithRect:CGRectZero
+ options:options
+   owner:self
+userInfo:nil];
+
+[self addTrackingArea:trackingArea];
+[trackingArea release];
 screen.width = frameRect.size.width;
 screen.height = frameRect.size.height;
 kbd = qkbd_state_init(dcl.con);
@@ -392,41 +404,9 @@ - (BOOL) isOpaque
 return YES;
 }
 
-- (void) removeTrackingRect
-{
-if (trackingArea) {
-[self removeTrackingArea:trackingArea];
-[trackingArea release];
-trackingArea = nil;
-}
-}
-
-- (void) frameUpdated
-{
-[self removeTrackingRect];
-
-if ([self window]) {
-NSTrackingAreaOptions options = NSTrackingActiveInKeyWindow |
-NSTrackingMouseEnteredAndExited |
-NSTrackingMouseMoved;
-trackingArea = [[NSTrackingArea alloc] initWithRect:[self frame]
-options:options
-  owner:self
-   userInfo:nil];
-[self addTrackingArea:trackingArea];
-[self updateUIInfo];
-}
-}
-
 - (void) viewDidMoveToWindow
 {
 [self resizeWindow];
-[self frameUpdated];
-}
-
-- (void) viewWillMoveToWindow:(NSWindow *)newWindow
-{
-[self removeTrackingRect];
 }
 
 - (void) hideCursor
@@ -1302,7 +1282,7 @@ - (void)windowDidExitFullScreen:(NSNotification *)notification
 - (void)windowDidResize:(NSNotification *)notification
 {
 [cocoaView updateBounds];
-[cocoaView frameUpdated];
+[cocoaView updateUIInfo];
 }
 
 /* Called when the user clicks on a window's close button */

-- 
2.44.0

[PATCH v2 0/3] Fixes for "ui/cocoa: Let the platform toggle fullscreen"

2024-03-22 Thread Akihiko Odaki

This series contains patches for regressions caused by commit 91aa508d0274
("ui/cocoa: Let the platform toggle fullscreen").

Signed-off-by: Akihiko Odaki 
---
Changes in v2:
- Added a comment to [QemuCocoaView fixAspectRatio:]. (Peter Maydell)
- Renamed [QemuCocoaView fixAspectRatio:] parameter to match the
  comment. (Peter Maydell)
- Noted that "ui/cocoa: Use NSTrackingInVisibleRect" fixes mouse
  movement tracking. (Peter Maydell)
- Link to v1: 
https://lore.kernel.org/r/20240318-fixes-v1-0-34f1a849b...@daynix.com

---
Akihiko Odaki (3):
  ui/cocoa: Fix aspect ratio
  ui/cocoa: Resize window after toggling zoom-to-fit
  ui/cocoa: Use NSTrackingInVisibleRect

 ui/cocoa.m | 90 ++
 1 file changed, 55 insertions(+), 35 deletions(-)
---
base-commit: ba49d760eb04630e7b15f423ebecf6c871b8f77b
change-id: 20240318-fixes-7b187ec236a0

Best regards,
-- 
Akihiko Odaki

[PATCH v2 2/3] ui/cocoa: Resize window after toggling zoom-to-fit

2024-03-22 Thread Akihiko Odaki

Resize the window so that the content will fit without zooming.

Fixes: 91aa508d0274 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Peter Maydell 
---
 ui/cocoa.m | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index 834ebf5f6175..3a1b899ba768 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -1396,6 +1396,7 @@ - (void)zoomToFit:(id) sender
 
 [[cocoaView window] setStyleMask:styleMask];
 [sender setState:styleMask & NSWindowStyleMaskResizable ? NSControlStateValueOn : NSControlStateValueOff];
+[cocoaView resizeWindow];
 }
 
 - (void)toggleZoomInterpolation:(id) sender

-- 
2.44.0

[PATCH v2 1/3] ui/cocoa: Fix aspect ratio

2024-03-22 Thread Akihiko Odaki

[NSWindow setContentAspectRatio:] does not trigger window resize itself,
so the wrong aspect ratio will persist if nothing resizes the window.
Call [NSWindow setContentSize:] in such a case.

Fixes: 91aa508d0274 ("ui/cocoa: Let the platform toggle fullscreen")
Signed-off-by: Akihiko Odaki 
---
 ui/cocoa.m | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index fa879d7dcd4b..834ebf5f6175 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -508,6 +508,43 @@ - (void) drawRect:(NSRect) rect
 }
 }
 
+- (NSSize)fixAspectRatio:(NSSize)max
+{
+NSSize scaled;
+NSSize fixed;
+
+scaled.width = screen.width * max.height;
+scaled.height = screen.height * max.width;
+
+/*
+ * Here screen is our guest's output size, and max is the size of the
+ * largest possible area of the screen we can display on.
+ * We want to scale up (screen.width x screen.height) by either:
+ *   1) max.height / screen.height
+ *   2) max.width / screen.width
+ * With the first scale factor the scale will result in an output height of
+ * max.height (i.e. we will fill the whole height of the available screen
+ * space and have black bars left and right) and with the second scale
+ * factor the scaling will result in an output width of max.width (i.e. we
+ * fill the whole width of the available screen space and have black bars
+ * top and bottom). We need to pick whichever keeps the whole of the guest
+ * output on the screen, which is to say the smaller of the two scale
+ * factors.
+ * To avoid doing more division than strictly necessary, instead of directly
+ * comparing scale factors 1 and 2 we instead calculate and compare those
+ * two scale factors multiplied by (screen.height * screen.width).
+ */
+if (scaled.width < scaled.height) {
+fixed.width = scaled.width / screen.height;
+fixed.height = max.height;
+} else {
+fixed.width = max.width;
+fixed.height = scaled.height / screen.width;
+}
+
+return fixed;
+}
+
 - (NSSize) screenSafeAreaSize
 {
 NSSize size = [[[self window] screen] frame].size;
@@ -525,8 +562,10 @@ - (void) resizeWindow
 [[self window] setContentSize:NSMakeSize(screen.width, screen.height)];
 [[self window] center];
 } else if ([[self window] styleMask] & NSWindowStyleMaskFullScreen) {
-[[self window] setContentSize:[self screenSafeAreaSize]];
+[[self window] setContentSize:[self fixAspectRatio:[self screenSafeAreaSize]]];
 [[self window] center];
+} else {
+[[self window] setContentSize:[self fixAspectRatio:[self frame].size]];
 }
 }
 

-- 
2.44.0
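
For readers following the fixAspectRatio discussion, the cross-multiplication
trick from the comment in the patch can be sketched in plain C. This is only
an illustration under assumptions: the Size struct and the name
fit_aspect_ratio are hypothetical stand-ins for NSSize and
[QemuCocoaView fixAspectRatio:], and by_height/by_width are just illustrative
names for the two products the patch stores in scaled.

```c
/* Hypothetical stand-in for NSSize so the sketch builds without Cocoa. */
typedef struct {
    double width;
    double height;
} Size;

/*
 * Return the largest size that has the aspect ratio of `guest` and still
 * fits inside `max`. Both dimensions of `guest` are assumed to be positive.
 *
 * The two candidate scale factors are:
 *   1) max.height / guest.height   (fill the height)
 *   2) max.width  / guest.width    (fill the width)
 * Multiplying both by (guest.width * guest.height) keeps their order and
 * lets us compare them without dividing first.
 */
static Size fit_aspect_ratio(Size guest, Size max)
{
    double by_height = guest.width * max.height;  /* factor 1, multiplied out */
    double by_width = guest.height * max.width;   /* factor 2, multiplied out */
    Size fixed;

    if (by_height < by_width) {
        /* Factor 1 is smaller: fill the height, bars left and right. */
        fixed.width = by_height / guest.height;
        fixed.height = max.height;
    } else {
        /* Factor 2 is smaller (or equal): fill the width, bars top and bottom. */
        fixed.width = max.width;
        fixed.height = by_width / guest.width;
    }
    return fixed;
}
```

With a 640x480 guest and a 1920x1080 area this yields 1440x1080 (height
filled), and with a 1920x1080 guest and a 1024x1024 area it yields 1024x576
(width filled).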

Re: [PATCH v2 0/2] ARM Sbsa-ref: Enable CPU cluster topology

2024-03-22 Thread Marcin Juszkiewicz


On 22.03.2024 at 19:51, Peter Maydell wrote:

On Tue, 12 Mar 2024 at 08:32, Xiong Yining



xiongyining1480 (2):
   hw/arm/sbsa-ref:Enable CPU cluster on ARM sbsa machine
   hw/arm/sbsa-ref: Add cpu-map to device tree


Thanks for these patches. I think we should squash the two
patches together into one, because the first patch is only
a single line, and also because we shouldn't say that the
machine supports cluster topology until it actually does
by putting the information into the device tree.

There's no rush, because we're now in softfreeze for 9.0, so these
will have to wait until 9.0 is released (in about a month's time).



I'm also a bit confused by the Reviewed-by: tag from Marcin on patch 2,
because I can't see that in my mail archives of the discussion on version
1 of this patchset, only a Tested-by.
Marcin, are you OK with these patches?


I only tested them. They are fine, will check on Monday.


Also, is this change to the DTB something that would require an
increase in the sbsa-ref platform version number, or not?


TF-A will check for the "/cpus/cpu-map" node and, if it is missing, will
not provide it to EDK2. So far I have not seen patches for the firmware side.


I would add a bump of the platform version to 0.4. It is a cheap operation
and so far (from the firmware side) we check for >= 0.3 only.


> Should we adjust the documentation in docs/system/arm/sbsa.rst to
> mention that the DTB might have cluster topology information?

Yes. I will send an update to mention that NUMA configuration can be 
there too (we already export it from TF-A to EDK2 via SMC calls).

CommandNotFound on QMP Protocol

2024-03-22 Thread Mister Nether

Hi. I am trying to take advantage of QMP and automatically collect some
data about VMs. I found a nice list of QMP commands.
I tried to use them, but many of them just don't seem to work. For example:

Request: { "execute": "guest-get-cpustats" }
Response: {"error": {"class": "CommandNotFound", "desc": "The command
guest-get-cpustats has not been found"}}

or

Request: { "execute": "query-memory-devices" }
Response: {"return": []}

Even though the VM is running totally fine and it should return at least
some kind of memory information. I tried to compile the latest version
myself, but it didn't work either. Please help!

Version/Build info:
(Standard Debian 12 repo LTS version)

qemu-system-x86_64 -version
QEMU emulator version 7.2.9 (Debian 1:7.2+dfsg-7+deb12u5)
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

{"QMP": {"version": {"qemu": {"micro": 9, "minor": 2, "major": 7},
"package": "Debian 1:7.2+dfsg-7+deb12u5"}, "capabilities": ["oob"]}}


Thanks in advance!

Re: rutabaga 0.1.3

2024-03-22 Thread Alyssa Ross

On Mon, Mar 04, 2024 at 04:23:20PM -0800, Gurchetan Singh wrote:
> On Sat, Mar 2, 2024 at 6:38 AM Alyssa Ross  wrote:
>
> > Hi Gurchetan,
> >
> > > >> > Would this be a suitable commit for the 0.1.3 release of rutabaga?
> > > >> >
> > > >> > https://chromium.googlesource.com/crosvm/crosvm/+/5dfd74a0680d317c6edf44138def886f47cb1c7c
> > > >> >
> > > >> > The gfxstream/AEMU commits would remain unchanged.
> > > >>
> > > >> That combination works for me.
> > > >
> > > > Just FYI, still working on it.  Could take 1-2 more weeks.
> > >
> > > FYI:
> > >
> > > https://android.googlesource.com/platform/hardware/google/gfxstream/+/refs/tags/v0.1.2-gfxstream-release
> > >
> > > https://android.googlesource.com/platform/hardware/google/aemu/+/refs/tags/v0.1.2-aemu-release
> > >
> > >
> > https://chromium.googlesource.com/crosvm/crosvm/+/refs/tags/v0.1.3-rutabaga-release
> >
> > Unlike the commit I tested for you, the commit that ended up being
> > tagged as v0.1.3-rutabaga-release doesn't work for me:
> >
> > qemu: The errno is EBADF: Bad file number
> > qemu: CHECK failed in rutabaga_cmd_resource_map_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:655
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x208, error 0x1200
> > qemu: CHECK failed in rutabaga_cmd_resource_unmap_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:723
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x209, error 0x1200
> > qemu: The errno is EBADF: Bad file number
> > qemu: CHECK failed in rutabaga_cmd_resource_map_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:655
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x208, error 0x1200
> > qemu: CHECK failed in rutabaga_cmd_resource_unmap_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:723
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x209, error 0x1200
> > qemu: The errno is EBADF: Bad file number
> > qemu: CHECK failed in rutabaga_cmd_resource_map_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:655
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x208, error 0x1200
> > qemu: invalid resource id
> > qemu: CHECK failed in rutabaga_cmd_submit_3d()
> > ../hw/display/virtio-gpu-rutabaga.c:341
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x207, error 0x1200
> > qemu: CHECK failed in rutabaga_cmd_resource_unmap_blob()
> > ../hw/display/virtio-gpu-rutabaga.c:723
> > qemu: virtio_gpu_rutabaga_process_cmd: ctrl 0x209, error 0x1200
> >
>
> Thank you for the bug report .. does crrev.com/c/5342655 fix this for you?

Hi Gurchetan, thanks for looking into it, and sorry for the late reply.

Alas it doesn't seem to make a difference.

(The commit message is also incorrect.  AsFd is implemented for
SafeDescriptor in rutabaga_gfx/src/rutabaga_os/sys/linux/descriptor.rs.)

> I bisected it to:
> >
> > commit f3dbf20eedadb135e2fd813474fbb9731d465f3a
> > Author: Andrew Walbran 
> > Date:   Wed Nov 29 17:23:45 2023 +
> >
> > rutabaga_gfx: Uprev nix to 0.27.1
> >
> > The new version of nix uses OwnedFd in various places, which allows us
> > to have less unsafe code.
> >
> > TEST=CQ
> > BUG=b:293289578
> >
> > Change-Id: I61aa80c4105eaf1182c5c325109b5aba11cf60de
> > Reviewed-on: https://chromium-review.googlesource.com/c/crosvm/crosvm/+/5072293
> > Auto-Submit: Andrew Walbran 
> > Reviewed-by: Gurchetan Singh 
> > Reviewed-by: Frederick Mayle 
> > Commit-Queue: Frederick Mayle 
> >



Re: [PATCH v4 2/2] vhost: Perform memory section dirty scans once per iteration

2024-03-22 Thread Si-Wei Liu





On 3/21/2024 10:08 PM, Jason Wang wrote:

On Fri, Mar 22, 2024 at 5:43 AM Si-Wei Liu  wrote:



On 3/20/2024 8:56 PM, Jason Wang wrote:

On Thu, Mar 21, 2024 at 5:03 AM Si-Wei Liu  wrote:


On 3/19/2024 8:27 PM, Jason Wang wrote:

On Tue, Mar 19, 2024 at 6:16 AM Si-Wei Liu  wrote:

On 3/17/2024 8:22 PM, Jason Wang wrote:

On Sat, Mar 16, 2024 at 2:45 AM Si-Wei Liu  wrote:

On 3/14/2024 9:03 PM, Jason Wang wrote:

On Fri, Mar 15, 2024 at 5:39 AM Si-Wei Liu  wrote:

On setups with one or more virtio-net devices with vhost on,
the cost of each dirty tracking iteration increases with the number
of queues that are set up. For example, when migrating idle guests, the
following is observed with virtio-net with vhost=on:

48 queues -> 78.11%  [.] vhost_dev_sync_region.isra.13
8 queues -> 40.50%   [.] vhost_dev_sync_region.isra.13
1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
2 devices, 1 queue -> 18.60%  [.] vhost_dev_sync_region.isra.14

With high memory dirtying rates the symptom is lack of convergence as soon
as there is a vhost device with a sufficiently high number of queues,
or a sufficient number of vhost devices.

On every migration iteration (every 100msecs) it will redundantly
query the *shared log* once for each queue configured with vhost
that exists in the guest. For the virtqueue data, this is necessary,
but not for the memory sections which are the same. So essentially
we end up scanning the dirty log too often.

To fix that, select a vhost device responsible for scanning the
log with regards to memory sections dirty tracking. It is selected
when we enable the logger (during migration) and cleared when we
disable the logger. If the vhost logger device goes away for some
reason, the logger will be re-selected from the rest of vhost
devices.
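
The election scheme described above can be sketched in a few lines of
self-contained C. This is an illustration only, not the code from the patch:
struct dev, log_devs, dev_should_log() and dev_set_logging() are hypothetical,
simplified stand-ins (the patch itself uses a QLIST per backend type and
vhost_dev_should_log()), and appending new devices at the tail is a choice
made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device; the real struct vhost_dev is much larger. */
struct dev {
    int id;
    struct dev *next;   /* next device in the logger list */
};

/* Head of the logger list. The first entry is the elected logger. */
static struct dev *log_devs;

/* Only the elected device scans the memory sections of the dirty log. */
static bool dev_should_log(const struct dev *d)
{
    return d == log_devs;
}

/* Register (enable) or unregister (!enable) a device for logging. */
static void dev_set_logging(struct dev *d, bool enable)
{
    struct dev **pp;

    /* Unlink the device if it is already on the list. */
    for (pp = &log_devs; *pp; pp = &(*pp)->next) {
        if (*pp == d) {
            *pp = d->next;
            d->next = NULL;
            break;
        }
    }

    if (enable) {
        /* Append at the tail so the current logger stays elected. */
        for (pp = &log_devs; *pp; pp = &(*pp)->next) {
        }
        d->next = NULL;
        *pp = d;
    }
}
```

When the elected device is unregistered, the next device on the list becomes
the head and is therefore re-elected implicitly, without any extra state to
maintain.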

After making mem-section logger a singleton instance, constant cost
of 7%-9% (like the 1 queue report) will be seen, no matter how many
queues or how many vhost devices are configured:

48 queues -> 8.71%[.] vhost_dev_sync_region.isra.13
2 devices, 8 queues -> 7.97%   [.] vhost_dev_sync_region.isra.14

Co-developed-by: Joao Martins 
Signed-off-by: Joao Martins 
Signed-off-by: Si-Wei Liu 

---
v3 -> v4:
   - add comment to clarify effect on cache locality and
 performance

v2 -> v3:
   - add after-fix benchmark to commit log
   - rename vhost_log_dev_enabled to vhost_dev_should_log
   - remove unneeded comparisons for backend_type
   - use QLIST array instead of single flat list to store vhost
 logger devices
   - simplify logger election logic
---
  hw/virtio/vhost.c | 67 
++-
  include/hw/virtio/vhost.h |  1 +
  2 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 612f4db..58522f1 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -45,6 +45,7 @@

  static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
  static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
+static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];

  /* Memslots used by backends that support private memslots (without an fd). */
  static unsigned int used_memslots;
@@ -149,6 +150,47 @@ bool vhost_dev_has_iommu(struct vhost_dev *dev)
  }
  }

+static inline bool vhost_dev_should_log(struct vhost_dev *dev)
+{
+assert(dev->vhost_ops);
+assert(dev->vhost_ops->backend_type > VHOST_BACKEND_TYPE_NONE);
+assert(dev->vhost_ops->backend_type < VHOST_BACKEND_TYPE_MAX);
+
+return dev == QLIST_FIRST(&vhost_log_devs[dev->vhost_ops->backend_type]);

A dumb question, why not simple check

dev->log == vhost_log_shm[dev->vhost_ops->backend_type]

Because we are not sure if the logger comes from vhost_log_shm[] or
vhost_log[]. Don't want to complicate the check here by calling into
vhost_dev_log_is_shared() every time .log_sync() is called.

It has very low overhead, doesn't it?

Whether this has low overhead depends on the specific backend's
implementation of .vhost_requires_shm_log(). The common vhost layer
should not make assumptions about, or rely on, the current implementation.


static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
{
return dev->vhost_ops->vhost_requires_shm_log &&
   dev->vhost_ops->vhost_requires_shm_log(dev);
}

For example, if I understand the code correctly, the log type won't be
changed during runtime, so we can end up with a boolean to record that
instead of a query ops?

Right now the log type won't change during runtime, but I am not sure if
this may prohibit future revisit to allow change at the runtime,

We can be bothered when we have such a request then.


then
there'll be complex code involved to maintain the state.

Other than this, I think it's insufficient to just check the shm log
v.s. normal log. The logger device requires to identify a leading logger
device that gets elected in vhost_dev_elect_mem_logger(), as a

Re: [PATCH] build: Re-introduce an 'info' target to build a Texinfo manual.

2024-03-22 Thread Maxim Cournoyer

Hi Daniel,

Daniel P. Berrangé  writes:

> On Tue, Mar 19, 2024 at 11:47:59AM +, Peter Maydell wrote:
>> On Mon, 18 Mar 2024 at 03:05, Maxim Cournoyer  
>> wrote:
>> >
>> > This reinstates
>> > ,
>> > which was committed at some point but reverted many years later in
>> > cleanups that followed the migration from Texinfo sources to the
>> > ReStructuredText (RST) format.  It's still nice to leave the option for
>> > users to easily generate a QEMU manual in the Texinfo format, taking
>> > advantage of the Sphinx texinfo backend.
>> 
>> As far as I can tell, we never committed that patch, because
>> (as noted in the discussion there) we don't particularly want
>> to generate texinfo output (and also because it was missing a
>> signed-off-by line). So this isn't a regression: we've never
>> generated info docs since we switched away from texinfo to
>> rst as our source format.
>> 
>> I don't think my position personally has changed on this one
>> since your previous patch submission. Other QEMU developers
>> are welcome to weigh in and disagree with me.
>
> I tend to agree with your point in that thread above. It is already
> a big enough burden for maintainers to ensure that the HTML output
> for their docs is rendered effectively. Adding an 'info' docs output
> increases that burden for negligible benefit. HTML is the most
> broadly consumable docs format, so it makes sense to focus our effort
> on that.

For me, the value in Texinfo is that it can consistently be used from
both headless environments (pseudo-terminal) and graphical
environments with more capable viewers (Emacs, Yelp, ...).  Its viewers
typically offer better search capabilities than man pages viewers or
most HTML browsers (e.g. jumping to topics via their index or node
names, or searching for a regexp in the whole document), and its small
on-disk size means it can easily be shipped along with the program itself,
ensuring the documentation stays in sync with the documented program.

Using GNU Guix for years, where Texinfo is used as the standard
documentation system and where it's enabled in most packages that support
it, has allowed me to appreciate the qualities of Texinfo as a
documentation system that I don't find in either man pages or HTML
documentation.

-- 
Thanks,
Maxim

Re: [PATCH] build: Re-introduce an 'info' target to build a Texinfo manual.

2024-03-22 Thread Maxim Cournoyer

Hi Peter,

Peter Maydell  writes:

> On Mon, 18 Mar 2024 at 03:05, Maxim Cournoyer  
> wrote:
>>
>> This reinstates
>> ,
>> which was committed at some point but reverted many years later in
>> cleanups that followed the migration from Texinfo sources to the
>> ReStructuredText (RST) format.  It's still nice to leave the option for
>> users to easily generate a QEMU manual in the Texinfo format, taking
>> advantage of the Sphinx texinfo backend.
>
> As far as I can tell, we never committed that patch, because
> (as noted in the discussion there) we don't particularly want
> to generate texinfo output (and also because it was missing a
> signed-off-by line). So this isn't a regression: we've never
> generated info docs since we switched away from texinfo to
> rst as our source format.

I see.  For the record, very similar changes were contributed and
successfully merged into the Linux [0] and U-Boot [1] projects.  No
problems appear to have been caused by these in more than 2 years time.

[0]  
https://lwn.net/ml/linux-doc/20221116190210.28407-1-maxim.courno...@gmail.com/
 merged with commit 1f050e904d

[1]  https://lists.denx.de/pipermail/u-boot/2022-December/502355.html
 merged with commit 7fa4c27a2e

> I don't think my position personally has changed on this one
> since your previous patch submission. Other QEMU developers
> are welcome to weigh in and disagree with me.

My position is also unchanged after re-reading the past thread: the
maintenance burden would mostly be on the Sphinx project rather than
QEMU.  The info target doesn't even need to be tested by upstream; my
opinion is that there's value in making it easily available for
downstream users such as GNU Guix to use.  The target is not built
unless 'texinfo' is available in the environment, so your CI can
continue not producing it, if this is preferred.

Of course, GNU Guix can continue to maintain it as a custom patch
applied to QEMU, but I think it'd be nicer if it lived in the QEMU tree,
potentially benefiting others.

> (If we do enable this we might want to see whether we need to
> set texinfo_documents in conf.py or if the defaults are OK.)

The defaults appear to be fine.  The output file is 'qemu.info', and
'qemu' is a fine name for an Info manual.

-- 
Thanks,
Maxim

Re: [PULL 00/15] riscv-to-apply queue

2024-03-22 Thread Daniel Henrique Barboza





On 3/22/24 14:16, Michael Tokarev wrote:

22.03.2024 11:53, Alistair Francis :


RISC-V PR for 9.0

* Do not enable all named features by default
* A range of Vector fixes
* Update APLIC IDC after claiming iforce register
* Remove the dependency of Zvfbfmin to Zfbfmin
* Fix mode in riscv_tlb_fill
* Fix timebase-frequency when using KVM acceleration


Should something from there be picked up for stable (8.2 and probably 7.2)?


Ignore the "Do not enable all named features by default" patch, since it's
fixing something that was added in 9.0.

You can pick up the rest for 8.2 at least. Thanks,


Daniel





Thanks,

/mjt



Daniel Henrique Barboza (10):
   target/riscv: do not enable all named features by default
   target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
   trans_rvv.c.inc: set vstart = 0 in int scalar move insns
   target/riscv/vector_helper.c: fix 'vmvr_v' memcpy endianess
   target/riscv: always clear vstart in whole vec move insns
   target/riscv: always clear vstart for ldst_whole insns
   target/riscv/vector_helpers: do early exit when vstart >= vl
   target/riscv: remove 'over' brconds from vector trans
   trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
   target/riscv/vector_helper.c: optimize loops in ldst helpers

Frank Chang (1):
   hw/intc: Update APLIC IDC after claiming iforce register

Irina Ryapolova (1):
   target/riscv: Fix mode in riscv_tlb_fill

Ivan Klokov (1):
   target/riscv: enable 'vstart_eq_zero' in the end of insns

Max Chou (1):
   target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin

Yong-Xuan Wang (1):
   target/riscv/kvm: fix timebase-frequency when using KVM acceleration

Re: [PATCH v2] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2024-03-22 Thread Moger, Babu

Any feedback or concerns with this patch? Otherwise can this be merged?
Thanks
Babu

On 1/2/24 17:17, Babu Moger wrote:
> Observed the following failure while booting the SEV-SNP guest and the
> guest fails to boot with the smp parameters:
> "-smp 192,sockets=1,dies=12,cores=8,threads=2".
> 
> qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter'
> qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0.
> provided: eax:0x, ebx: 0x0100, ecx: 0x0b00, edx: 0x
> expected: eax:0x, ebx: 0x0100, ecx: 0x0300, edx: 0x
> qemu-system-x86_64: SEV-SNP: failed update CPUID page
> 
> The failure is caused by an overflow of the bits used for "Node per
> processor" in CPUID Fn8000001E_ECX. This field is 3 bits wide and
> can hold a maximum value of 0x7. With dies=12 (0xB), it overflows and spills
> over into the reserved bits. In the case of SEV-SNP, this causes CPUID
> enforcement failure and guest fails to boot.
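
The overflow can be reproduced with a little arithmetic. The following is a
minimal C sketch, assuming the legacy encoding places (nodes - 1) in bits
10:8 and the node ID in bits 7:0 as the quoted field layout suggests; the
macro and function names are hypothetical and are not taken from
target/i386/cpu.c.

```c
#include <stdint.h>

#define NODES_PER_PROC_SHIFT 8            /* field occupies bits 10:8 */
#define ECX_DEFINED_MASK     0x000007FFu  /* bits 10:0 carry defined fields */
#define ECX_RESERVED_MASK    0xFFFFF800u  /* bits 31:11 are reserved */

/*
 * Legacy-style encoding (assumed for illustration): (nodes - 1) goes into
 * bits 10:8 without being masked, the node ID into bits 7:0.
 */
static uint32_t encode_ecx_legacy(uint32_t nodes_per_processor, uint32_t node_id)
{
    return ((nodes_per_processor - 1) << NODES_PER_PROC_SHIFT) |
           (node_id & 0xFFu);
}

/* Non-zero if the value sets any of the reserved bits 31:11. */
static int touches_reserved_bits(uint32_t ecx)
{
    return (ecx & ECX_RESERVED_MASK) != 0;
}
```

With dies=12 the unmasked field value is 11 (0xB), so the result is 0x0b00
and bit 11, which is reserved, gets set. Clearing the reserved bits of that
value leaves 0x0300, which matches the low bits of the "expected" ecx in the
log above. With 4 nodes the result is 0x0300 and no reserved bit is touched.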
> 
> The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
> ==================================================================
> Bits    Description
> 31:11   Reserved.
> 
> 10:8    NodesPerProcessor: Node per processor. Read-only.
>         ValidValues:
>         Value   Description
>         0h      1 node per processor.
>         7h-1h   Reserved.
> 
> 7:0     NodeId: Node ID. Read-only. Reset: Fixed,XXh.
> ==================================================================
> 
> As in the spec, the only valid value for "node per processor" is 0 and
> the rest are reserved.
> 
> Looking back at the history of decoding of CPUID_Fn8000001E_ECX, noticed
> that there were cases where "node per processor" can be more than 1. It
> is valid only for pre-F17h (pre-EPYC) architectures. For EPYC or later
> CPUs, the linux kernel does not use this information to build the L3
> topology.
> 
> Also noted that the CPUID Function 0x8000001E_ECX is available only when
> TOPOEXT feature is enabled. This feature is enabled only for EPYC(F17h)
> or later processors. So, previous generations of processors do not
> enumerate the 0x8000001E_ECX leaf.
> 
> There could be some corner cases where the older guests could enable the
> TOPOEXT feature by running with -cpu host, in which case legacy guests
> might notice the topology change. To address those cases introduced a
> new CPU property "legacy-multi-node". It will be true for older machine
> types to maintain compatibility. By default, it will be false, so new
> decoding will be used going forward.
> 
> The documentation is taken from Preliminary Processor Programming
> Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
> Rev 0.25 - Oct 6, 2022.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger 
> Reviewed-by: Zhao Liu 
> ---
> v2: Rebased to the latest tree.
> Updated the pc_compat_8_2 for the new flag.
> Added the comment for new property legacy_multi_node.
> Added Reviewed-by from Zhao.
> ---
>  hw/i386/pc.c  |  4 +++-
>  target/i386/cpu.c | 18 ++
>  target/i386/cpu.h |  6 ++
>  3 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 496498df3a..a504e05e62 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -78,7 +78,9 @@
>  { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
>  { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
>  
> -GlobalProperty pc_compat_8_2[] = {};
> +GlobalProperty pc_compat_8_2[] = {
> +{ TYPE_X86_CPU, "legacy-multi-node", "on" },
> +};
>  const size_t pc_compat_8_2_len = G_N_ELEMENTS(pc_compat_8_2);
>  
>  GlobalProperty pc_compat_8_1[] = {};
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 95d5f16cd5..2cc84e8500 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -398,12 +398,9 @@ static void encode_topo_cpuid801e(X86CPU *cpu, 
> X86CPUTopoInfo *topo_info,
>   * 31:11 Reserved.
>   * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
>   *  ValidValues:
> - *  Value Description
> - *  000b  1 node per processor.
> - *  001b  2 nodes per processor.
> - *  010b Reserved.
> - *  011b 4 nodes per processor.
> - *  111b-100b Reserved.
> + *  Value   Description
> + *  0h  1 node per processor.
> + *  7h-1h   Reserved.
>   *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
>   *
>   * NOTE: Hardware reserves 3 bits for number of nodes per processor.
> @@ -412,8 +409,12 @@ static void encode_topo_cpuid801e(X86CPU *cpu, 
> X86CPUTopoInfo *topo_info,
>   * NodeId is combination of node and socket_id which is already decoded
>   * in apic_id. Just use it by shifting.
>   */
> -
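
The quoted register description can be sketched as a small encoder: NodeId occupies ECX[7:0] and NodesPerProcessor occupies ECX[10:8]. This is only an illustration under stated assumptions, not the code from the patch: the function name and parameters are hypothetical, and the legacy branch assumes the older "nodes minus one" encoding from the removed table (000b = 1 node, 001b = 2 nodes, 011b = 4 nodes).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build CPUID 0x8000001E ECX from a node id.
 * - bits 7:0  NodeId
 * - bits 10:8 NodesPerProcessor (only non-zero in the legacy encoding)
 */
static uint32_t encode_801e_ecx(bool legacy_multi_node,
                                uint32_t nodes_per_pkg, uint32_t node_id)
{
    uint32_t ecx = node_id & 0xFF;

    if (legacy_multi_node && nodes_per_pkg > 0) {
        /* Assumed legacy encoding: field holds (nodes per processor - 1). */
        ecx |= ((nodes_per_pkg - 1) & 0x7) << 8;
    }
    /* New decoding: field stays 0h, i.e. "1 node per processor". */
    return ecx;
}
```

With the property off, bits 10:8 always read as zero, which matches the "7h-1h Reserved" row in the updated comment.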

Re: [RFC v2 2/2] hw/riscv: Add server platform reference machine

2024-03-22 Thread Atish Kumar Patra

On Tue, Mar 12, 2024 at 6:53 AM Fei Wu  wrote:
>
> The RISC-V Server Platform specification[1] defines a standardized set
> of hardware and software capabilities, that portable system software,
> such as OS and hypervisors can rely on being present in a RISC-V server
> platform.
>
> A corresponding Qemu RISC-V server platform reference (rvsp-ref for
> short) machine type is added to provide an environment for firmware/OS
> development and testing. The main features included in rvsp-ref are:
>
>  - Based on riscv virt machine type
>  - A new memory map as close to the virt machine's as possible
>  - A new virt CPU type rvsp-ref-cpu for server platform compliance
>  - AIA
>  - PCIe AHCI
>  - PCIe NIC
>  - No virtio device
>  - No fw_cfg device
>  - No ACPI table provided
>  - Only minimal device tree nodes
>
> [1] https://github.com/riscv-non-isa/riscv-server-platform
>
> Signed-off-by: Fei Wu 
> ---
>  configs/devices/riscv64-softmmu/default.mak |1 +
>  hw/riscv/Kconfig|   12 +
>  hw/riscv/meson.build|1 +
>  hw/riscv/server_platform_ref.c  | 1276 +++
>  4 files changed, 1290 insertions(+)
>  create mode 100644 hw/riscv/server_platform_ref.c
>
> diff --git a/configs/devices/riscv64-softmmu/default.mak 
> b/configs/devices/riscv64-softmmu/default.mak
> index 3f68059448..a1d98e49ef 100644
> --- a/configs/devices/riscv64-softmmu/default.mak
> +++ b/configs/devices/riscv64-softmmu/default.mak
> @@ -10,5 +10,6 @@ CONFIG_SPIKE=y
>  CONFIG_SIFIVE_E=y
>  CONFIG_SIFIVE_U=y
>  CONFIG_RISCV_VIRT=y
> +CONFIG_SERVER_PLATFORM_REF=y
>  CONFIG_MICROCHIP_PFSOC=y
>  CONFIG_SHAKTI_C=y
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index 5d644eb7b1..5674589e66 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -48,6 +48,18 @@ config RISCV_VIRT
>  select ACPI
>  select ACPI_PCI
>
> +config SERVER_PLATFORM_REF
> +bool
> +select RISCV_NUMA
> +select GOLDFISH_RTC
> +select PCI
> +select PCI_EXPRESS_GENERIC_BRIDGE
> +select PFLASH_CFI01
> +select SERIAL
> +select RISCV_ACLINT
> +select RISCV_APLIC
> +select RISCV_IMSIC
> +
>  config SHAKTI_C
>  bool
>  select RISCV_ACLINT
> diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
> index 2f7ee81be3..bb3aff91ea 100644
> --- a/hw/riscv/meson.build
> +++ b/hw/riscv/meson.build
> @@ -4,6 +4,7 @@ riscv_ss.add(when: 'CONFIG_RISCV_NUMA', if_true: 
> files('numa.c'))
>  riscv_ss.add(files('riscv_hart.c'))
>  riscv_ss.add(when: 'CONFIG_OPENTITAN', if_true: files('opentitan.c'))
>  riscv_ss.add(when: 'CONFIG_RISCV_VIRT', if_true: files('virt.c'))
> +riscv_ss.add(when: 'CONFIG_SERVER_PLATFORM_REF', if_true: 
> files('server_platform_ref.c'))
>  riscv_ss.add(when: 'CONFIG_SHAKTI_C', if_true: files('shakti_c.c'))
>  riscv_ss.add(when: 'CONFIG_SIFIVE_E', if_true: files('sifive_e.c'))
>  riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
> diff --git a/hw/riscv/server_platform_ref.c b/hw/riscv/server_platform_ref.c
> new file mode 100644
> index 00..b552650265
> --- /dev/null
> +++ b/hw/riscv/server_platform_ref.c
> @@ -0,0 +1,1276 @@
> +/*
> + * QEMU RISC-V Server Platform (RVSP) Reference Board
> + *
> + * Copyright (c) 2024 Intel, Inc.
> + *
> + * This board is compliant RISC-V Server platform specification and 
> leveraging
> + * a lot of riscv virt code.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program.  If not, see .
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qemu/error-report.h"
> +#include "qemu/guest-random.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-visit-common.h"
> +#include "hw/boards.h"
> +#include "hw/loader.h"
> +#include "hw/sysbus.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/char/serial.h"
> +#include "hw/block/flash.h"
> +#include "hw/ide/pci.h"
> +#include "hw/ide/ahci-pci.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci-host/gpex.h"
> +#include "hw/core/sysbus-fdt.h"
> +#include "hw/riscv/riscv_hart.h"
> +#include "hw/riscv/boot.h"
> +#include "hw/riscv/numa.h"
> +#include "hw/intc/riscv_aclint.h"
> +#include "hw/intc/riscv_aplic.h"
> +#include "hw/intc/riscv_imsic.h"
> +#include "chardev/char.h"
> +#include "sysemu/device_tree.h"
> +#include "sysemu/runstate.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/tcg.h"
> +#include "target/riscv/cpu.h"
> +#include "target/riscv/pmu.

Re: [RFC v2 2/2] hw/riscv: Add server platform reference machine

2024-03-22 Thread Atish Kumar Patra

On Fri, Mar 22, 2024 at 2:20 AM Marcin Juszkiewicz
 wrote:
>
> On 22.03.2024 at 09:50, Heinrich Schuchardt wrote:
>  >>> I see no mention of device trees in the spec, but I do see ACPI. Do we
>  >>> really expect a server platform to use DTs?
>  >>
>  >> This platform "kind of" follows sbsa-ref where we have very
>  >> minimalistic device tree sharing information qemu->firmware.
>  >>
>  >> libfdt is small, format is known and describes hardware. Firmware is
>  >> free to make use of it in any way it wants.
>  >>
>  >> On sbsa-ref we parse DT in TF-A (base firmware) and provide hardware
>  >> information to higher level (edk2) via SMC mechanism. Then EDK2
>  >> creates ACPI tables and provide them to the Operating System.
>
>  > We should ensure that only either an ACPI table or a device-tree
>  > description is passed to the OS and not both, e.g. when using
>  >
>  >  qemu-system-riscv64 -kernel vmlinux -M sbsa-ref
>  >
>  > But that requirement is not machine specific.
>
> I would not call "qemu-system-* -M machinename -k kernel_image" a proper
> way to boot for several systems emulated by QEMU.
>
> DeviceTree is in rvsp-ref and sbsa-ref because it is easy to process in
> limited space 1st stage of firmware has.
>

OpenSBI also has DT support only. So a minimalistic DT generated by the machine
for the firmware is required for RISC-V as well.

> And if we knew how people will mention 'sbsa-ref uses DT' we would use
> something else instead. But that would require adding more code into
> existing firmware projects (libfdt is usually already there).
>
> I did not look at DT generated for rvsp-ref. I know that sbsa-ref one
> is too minimalistic for kernel use as we added only those fields/nodes
> we need to provide data for firmware.

Re: [PATCH 19/26] RAMBlock: Add support of KVM private guest memfd

2024-03-22 Thread Michael Roth

On Fri, Mar 22, 2024 at 07:11:09PM +0100, Paolo Bonzini wrote:
> From: Michael Roth 

This should be:

  From: Xiaoyao Li 

Looks like the author got reset in my tree for some reason and I failed to
notice it before posting. Sorry for the mix-up.

-Mike

> 
> Add KVM guest_memfd support to RAMBlock so both normal hva based memory
> and kvm guest memfd based private memory can be associated in one RAMBlock.
> 
> Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
> create private guest_memfd during RAMBlock setup.
> 
> Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
> is more flexible and extensible than simply relying on the VM type because
> in the future we may have the case that not all the memory of a VM needs
> guest memfd. As a benefit, it also avoids getting MachineState in the
> memory subsystem.
> 
> Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
> confidential guests, such as TDX VM. How and when to set it for memory
> backends will be implemented in the following patches.
> 
> Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
> KVM guest_memfd allocated.
> 
> Signed-off-by: Xiaoyao Li 
> Reviewed-by: David Hildenbrand 
> Message-ID: <20240320083945.991426-7-michael.r...@amd.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  include/exec/memory.h   | 20 +---
>  include/exec/ram_addr.h |  2 +-
>  include/exec/ramblock.h |  1 +
>  include/sysemu/kvm.h|  3 ++-
>  accel/kvm/kvm-all.c | 28 
>  accel/stubs/kvm-stub.c  |  5 +
>  system/memory.c |  5 +
>  system/physmem.c| 34 +++---
>  8 files changed, 90 insertions(+), 8 deletions(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 8626a355b31..679a8476852 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -243,6 +243,9 @@ typedef struct IOMMUTLBEvent {
>  /* RAM FD is opened read-only */
>  #define RAM_READONLY_FD (1 << 11)
>  
> +/* RAM can be private that has kvm guest memfd backend */
> +#define RAM_GUEST_MEMFD   (1 << 12)
> +
>  static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
> IOMMUNotifierFlag flags,
> hwaddr start, hwaddr end,
> @@ -1307,7 +1310,8 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
>   * @name: Region name, becomes part of RAMBlock name used in migration stream
>   *must be unique within any device
>   * @size: size of the region.
> - * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE.
> + * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
> + * RAM_GUEST_MEMFD.
>   * @errp: pointer to Error*, to store an error if it happens.
>   *
>   * Note that this function does not do anything to cause the data in the
> @@ -1369,7 +1373,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
>   * (getpagesize()) will be used.
>   * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - * RAM_READONLY_FD
> + * RAM_READONLY_FD, RAM_GUEST_MEMFD
>   * @path: the path in which to allocate the RAM.
>   * @offset: offset within the file referenced by path
>   * @errp: pointer to Error*, to store an error if it happens.
> @@ -1399,7 +1403,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
>   * @size: size of the region.
>   * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - * RAM_READONLY_FD
> + * RAM_READONLY_FD, RAM_GUEST_MEMFD
>   * @fd: the fd to mmap.
>   * @offset: offset within the file referenced by fd
>   * @errp: pointer to Error*, to store an error if it happens.
> @@ -1722,6 +1726,16 @@ static inline bool memory_region_is_romd(MemoryRegion 
> *mr)
>   */
>  bool memory_region_is_protected(MemoryRegion *mr);
>  
> +/**
> + * memory_region_has_guest_memfd: check whether a memory region has 
> guest_memfd
> + * associated
> + *
> + * Returns %true if a memory region's ram_block has valid guest_memfd 
> assigned.
> + *
> + * @mr: the memory region being queried
> + */
> +bool memory_region_has_guest_memfd(MemoryRegion *mr);
> +
>  /**
>   * memory_region_get_iommu: check whether a memory region is an iommu
>   *
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index de45ba7bc96..07c8f863750 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -110,7 +110,7 @@ long qemu_maxrampagesize(void);
>   *  @mr: the memory region where the ram block is
>   *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   *  RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - *  RAM_READONLY_FD
> + *

Re: [PATCH v2 0/2] ARM Sbsa-ref: Enable CPU cluster topology

2024-03-22 Thread Peter Maydell

On Tue, 12 Mar 2024 at 08:32, Xiong Yining
 wrote:
>
> Enable CPU cluster support on SbsaQemu platform, so that users can
> specify a 4-level CPU hierarchy sockets/clusters/cores/threads. And this
> topology can be passed to the firmware through DT cpu-map.
>
> Changes in v2:
> - put this code before sbsa_fdt_add_gic_node().
>
> xiongyining1480 (2):
>   hw/arm/sbsa-ref:Enable CPU cluster on ARM sbsa machine
>   hw/arm/sbsa-ref: Add cpu-map to device tree

Thanks for these patches. I think we should squash the two
patches together into one, because the first patch is only
a single line, and also because we shouldn't say that the
machine supports cluster topology until it actually does
by putting the information into the device tree.

There's no rush, because we're now in softfreeze for 9.0, so these
will have to wait until 9.0 is released (in about a month's time).

I'm also a bit confused by the Reviewed-by: tag from Marcin on patch 2,
because I can't see that in my mail archives of the discussion on version
1 of this patchset, only a Tested-by. Marcin, are you OK with these patches?

Also, is this change to the DTB something that would require an
increase in the sbsa-ref platform version number, or not?
Should we adjust the documentation in docs/system/arm/sbsa.rst to
mention that the DTB might have cluster topology information?

thanks
-- PMM

Re: [PATCH 4/7] hw/char/stm32l4x5_usart: Enable serial read and write

2024-03-22 Thread Peter Maydell

On Sun, 17 Mar 2024 at 10:41, Arnaud Minier
 wrote:
>
> Implement the ability to read and write characters to the
> usart using the serial port.
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 
> ---
>  hw/char/stm32l4x5_usart.c | 105 +-
>  hw/char/trace-events  |   1 +
>  2 files changed, 105 insertions(+), 1 deletion(-)
>
> diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
> index f58bd56875..958d05a56d 100644
> --- a/hw/char/stm32l4x5_usart.c
> +++ b/hw/char/stm32l4x5_usart.c
> @@ -154,6 +154,71 @@ REG32(RDR, 0x24)
>  REG32(TDR, 0x28)
>  FIELD(TDR, TDR, 0, 8)
>
> +static int stm32l4x5_usart_base_can_receive(void *opaque)
> +{
> +Stm32l4x5UsartBaseState *s = opaque;
> +
> +if (!(s->isr & R_ISR_RXNE_MASK)) {
> +return 1;
> +}
> +
> +return 0;
> +}
> +
> +static void stm32l4x5_update_irq(Stm32l4x5UsartBaseState *s)
> +{
> +if (((s->isr & R_ISR_WUF_MASK) && (s->cr3 & R_CR3_WUFIE_MASK))||
> +((s->isr & R_ISR_CMF_MASK) && (s->cr1 & R_CR1_CMIE_MASK)) ||
> +((s->isr & R_ISR_ABRF_MASK) && (s->cr1 & R_CR1_RXNEIE_MASK))  ||
> +((s->isr & R_ISR_EOBF_MASK) && (s->cr1 & R_CR1_EOBIE_MASK))   ||
> +((s->isr & R_ISR_RTOF_MASK) && (s->cr1 & R_CR1_RTOIE_MASK))   ||
> +((s->isr & R_ISR_CTSIF_MASK) && (s->cr3 & R_CR3_CTSIE_MASK))  ||
> +((s->isr & R_ISR_LBDF_MASK) && (s->cr2 & R_CR2_LBDIE_MASK))   ||
> +((s->isr & R_ISR_TXE_MASK) && (s->cr1 & R_CR1_TXEIE_MASK))||
> +((s->isr & R_ISR_TC_MASK) && (s->cr1 & R_CR1_TCIE_MASK))  ||
> +((s->isr & R_ISR_RXNE_MASK) && (s->cr1 & R_CR1_RXNEIE_MASK))  ||
> +((s->isr & R_ISR_IDLE_MASK) && (s->cr1 & R_CR1_IDLEIE_MASK))  ||
> +((s->isr & R_ISR_ORE_MASK) &&
> +((s->cr1 & R_CR1_RXNEIE_MASK) || (s->cr3 & R_CR3_EIE_MASK)))  ||
> +/* TODO: Handle NF ? */
> +((s->isr & R_ISR_FE_MASK) && (s->cr3 & R_CR3_EIE_MASK))   ||
> +((s->isr & R_ISR_PE_MASK) && (s->cr1 & R_CR1_PEIE_MASK))) {

It always makes me a bit sad when hardware designers don't neatly
line up the bits in the ISR register and the mask register so we
can check them all at once :-)
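
To illustrate the remark: if status and enable bits shared positions, one AND would decide whether an interrupt is pending; when they do not, every flag/enable pair has to be tested separately. The sketch below is a simplified assumption with hypothetical names and a single enable register, whereas the real device spreads its enables over CR1, CR2 and CR3.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Aligned layout: flag bit N is enabled by enable bit N. */
static bool irq_pending_aligned(uint32_t status, uint32_t enable)
{
    return (status & enable) != 0;
}

/* Misaligned layout: each flag has its own, unrelated enable bit. */
struct irq_pair {
    uint32_t status_mask;
    uint32_t enable_mask;
};

static bool irq_pending_pairs(uint32_t status, uint32_t enable,
                              const struct irq_pair *pairs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((status & pairs[i].status_mask) &&
            (enable & pairs[i].enable_mask)) {
            return true;
        }
    }
    return false;
}
```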

> +qemu_irq_raise(s->irq);
> +trace_stm32l4x5_usart_irq_raised(s->isr);
> +} else {
> +qemu_irq_lower(s->irq);
> +trace_stm32l4x5_usart_irq_lowered();
> +}
> +}
> +
> +static void stm32l4x5_usart_base_receive(void *opaque, const uint8_t *buf, 
> int size)
> +{
> +Stm32l4x5UsartBaseState *s = opaque;
> +
> +if (!((s->cr1 & R_CR1_UE_MASK) && (s->cr1 & R_CR1_RE_MASK))) {
> +/* USART not enabled - drop the chars */
> +trace_stm32l4x5_usart_error("Dropping the chars\n");

This shouldn't have the newline on the end. Also, it looks like
this is the only use of this trace event so (a) you could make it
more specific rather than passing in an arbitrary string, and
(b) it should be defined in this patch, not in patch 2.

> +return;
> +}
> +
> +/* Check if overrun detection is enabled and if there is an overrun */
> +if (!(s->cr3 & R_CR3_OVRDIS_MASK) && (s->isr & R_ISR_RXNE_MASK)) {
> +/*
> + * A character has been received while
> + * the previous has not been read = Overrun.
> + */
> +s->isr |= R_ISR_ORE_MASK;
> +trace_stm32l4x5_usart_overrun_detected(s->rdr, *buf);
> +} else {
> +/* No overrun */
> +s->rdr = *buf;
> +s->isr |= R_ISR_RXNE_MASK;
> +trace_stm32l4x5_usart_rx(s->rdr);
> +}
> +
> +stm32l4x5_update_irq(s);
> +}
> +
>  static void stm32l4x5_usart_base_reset_hold(Object *obj)
>  {
>  Stm32l4x5UsartBaseState *s = STM32L4X5_USART_BASE(obj);
> @@ -168,6 +233,21 @@ static void stm32l4x5_usart_base_reset_hold(Object *obj)
>  s->isr = 0x02C0 | R_ISR_TXE_MASK;
>  s->rdr = 0x;
>  s->tdr = 0x;
> +
> +stm32l4x5_update_irq(s);
> +}
> +
> +static void usart_update_rqr(Stm32l4x5UsartBaseState *s, uint32_t value)
> +{
> +/* TXFRQ */
> +/* Reset RXNE flag */
> +if (value & R_RQR_RXFRQ_MASK) {
> +s->isr &= ~R_ISR_RXNE_MASK;
> +}
> +/* MMRQ */
> +/* SBKRQ */
> +/* ABRRQ */
> +stm32l4x5_update_irq(s);
>  }
>
>  static uint64_t stm32l4x5_usart_base_read(void *opaque, hwaddr addr,
> @@ -209,7 +289,8 @@ static uint64_t stm32l4x5_usart_base_read(void *opaque, 
> hwaddr addr,
>  case A_RDR:
>  retvalue = FIELD_EX32(s->rdr, RDR, RDR);
>  /* Reset RXNE flag */
> -s->isr &= ~USART_ISR_RXNE;
> +s->isr &= ~R_ISR_RXNE_MASK;

This looks like another "should have used that name to start with" change?

> +stm32l4x5_update_irq(s);
>  break;
>  case A_TDR:
>  retvalue = FIELD_EX32(s->tdr, TDR, TDR);
> @@ -237,6 +318,7 @@ static void stm32l4x5_usart_base_write(void *opaque, 
> hwaddr addr

[PATCH-for-9.1] target/ppc: Unify TYPE_POWERPC_CPU definition for 32/64-bit

2024-03-22 Thread Philippe Mathieu-Daudé

Apparently there is no wordsize special use with the QOM
TYPE_POWERPC_CPU typename. Unify 32 and 64-bit with a single
common definition.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/ppc/cpu-qom.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index 8247fa2336..ed75f1b690 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -23,11 +23,7 @@
 #include "exec/gdbstub.h"
 #include "hw/core/cpu.h"
 
-#ifdef TARGET_PPC64
-#define TYPE_POWERPC_CPU "powerpc64-cpu"
-#else
 #define TYPE_POWERPC_CPU "powerpc-cpu"
-#endif
 
 OBJECT_DECLARE_CPU_TYPE(PowerPCCPU, PowerPCCPUClass, POWERPC_CPU)
 
-- 
2.41.0

[PATCH] hw/microblaze: Do not allow xlnx-zynqmp-pmu-soc to be created by the user

2024-03-22 Thread Thomas Huth

Using xlnx-zynqmp-pmu-soc on the command line causes QEMU to crash:

 ./qemu-system-microblazeel -M petalogix-ml605 -device xlnx-zynqmp-pmu-soc
 **
 ERROR:tcg/tcg.c:813:tcg_register_thread: assertion failed: (n < tcg_max_ctxs)
 Bail out!
 Aborted (core dumped)

Mark the device with "user_creatable = false" to prevent this.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2229
Signed-off-by: Thomas Huth 
---
 hw/microblaze/xlnx-zynqmp-pmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/microblaze/xlnx-zynqmp-pmu.c b/hw/microblaze/xlnx-zynqmp-pmu.c
index 5a2016672a..1bfc9641d2 100644
--- a/hw/microblaze/xlnx-zynqmp-pmu.c
+++ b/hw/microblaze/xlnx-zynqmp-pmu.c
@@ -125,6 +125,8 @@ static void xlnx_zynqmp_pmu_soc_class_init(ObjectClass *oc, 
void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 
+/* xlnx-zynqmp-pmu-soc causes crashes when cold-plugged twice */
+dc->user_creatable = false;
 dc->realize = xlnx_zynqmp_pmu_soc_realize;
 }
 
-- 
2.44.0

[PATCH] kvm: use configs/ definition to conditionalize debug support

2024-03-22 Thread Paolo Bonzini

If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not
have the necessary code, QEMU will fail to build after updating kernel headers.
Avoid this by using a #define in config-target.h instead of 
KVM_CAP_SET_GUEST_DEBUG.

Signed-off-by: Paolo Bonzini 
---
 configs/targets/aarch64-softmmu.mak |  1 +
 configs/targets/i386-softmmu.mak|  1 +
 configs/targets/ppc-softmmu.mak |  1 +
 configs/targets/ppc64-softmmu.mak   |  1 +
 configs/targets/s390x-softmmu.mak   |  1 +
 configs/targets/x86_64-softmmu.mak  |  1 +
 include/sysemu/kvm.h|  2 +-
 include/sysemu/kvm_int.h|  2 +-
 accel/kvm/kvm-accel-ops.c   |  4 ++--
 accel/kvm/kvm-all.c | 10 +-
 10 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/configs/targets/aarch64-softmmu.mak 
b/configs/targets/aarch64-softmmu.mak
index b4338e95680..83c22391a69 100644
--- a/configs/targets/aarch64-softmmu.mak
+++ b/configs/targets/aarch64-softmmu.mak
@@ -1,5 +1,6 @@
 TARGET_ARCH=aarch64
 TARGET_BASE_ARCH=arm
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/aarch64-core.xml gdb-xml/aarch64-fpu.xml 
gdb-xml/arm-core.xml gdb-xml/arm-vfp.xml gdb-xml/arm-vfp3.xml 
gdb-xml/arm-vfp-sysregs.xml gdb-xml/arm-neon.xml gdb-xml/arm-m-profile.xml 
gdb-xml/arm-m-profile-mve.xml gdb-xml/aarch64-pauth.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/i386-softmmu.mak b/configs/targets/i386-softmmu.mak
index 6b3c99fc86c..d61b5076134 100644
--- a/configs/targets/i386-softmmu.mak
+++ b/configs/targets/i386-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=i386
 TARGET_SUPPORTS_MTTCG=y
 TARGET_NEED_FDT=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/i386-32bit.xml
diff --git a/configs/targets/ppc-softmmu.mak b/configs/targets/ppc-softmmu.mak
index 774440108f7..f3ea9c98f75 100644
--- a/configs/targets/ppc-softmmu.mak
+++ b/configs/targets/ppc-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=ppc
 TARGET_BIG_ENDIAN=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/power-core.xml gdb-xml/power-fpu.xml 
gdb-xml/power-altivec.xml gdb-xml/power-spe.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/ppc64-softmmu.mak 
b/configs/targets/ppc64-softmmu.mak
index ddf0c39617f..1db8d8381d0 100644
--- a/configs/targets/ppc64-softmmu.mak
+++ b/configs/targets/ppc64-softmmu.mak
@@ -2,5 +2,6 @@ TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_BIG_ENDIAN=y
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/power64-core.xml gdb-xml/power-fpu.xml 
gdb-xml/power-altivec.xml gdb-xml/power-spe.xml gdb-xml/power-vsx.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/s390x-softmmu.mak 
b/configs/targets/s390x-softmmu.mak
index 70d2f9f0ba0..b22218aacc8 100644
--- a/configs/targets/s390x-softmmu.mak
+++ b/configs/targets/s390x-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=s390x
 TARGET_BIG_ENDIAN=y
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml 
gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml gdb-xml/s390-cr.xml 
gdb-xml/s390-virt.xml gdb-xml/s390-virt-kvm.xml gdb-xml/s390-gs.xml
diff --git a/configs/targets/x86_64-softmmu.mak 
b/configs/targets/x86_64-softmmu.mak
index 197817c9434..c5f882e5ba1 100644
--- a/configs/targets/x86_64-softmmu.mak
+++ b/configs/targets/x86_64-softmmu.mak
@@ -2,4 +2,5 @@ TARGET_ARCH=x86_64
 TARGET_BASE_ARCH=i386
 TARGET_SUPPORTS_MTTCG=y
 TARGET_NEED_FDT=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/i386-64bit.xml
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 698f1640fe2..e4bdc1ff914 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -224,7 +224,7 @@ void kvm_flush_coalesced_mmio_buffer(void);
  * calling down to kvm_arch_update_guest_debug after the generic
  * fields have been set.
  */
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap);
 #else
 static inline int kvm_update_guest_debug(CPUState *cpu, unsigned long 
reinject_trap)
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index a5a3fee4119..3f3d13f8166 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -80,7 +80,7 @@ struct KVMState
 struct kvm_coalesced_mmio_ring *coalesced_mmio_ring;
 bool coalesced_flush_in_progress;
 int vcpu_events;
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
 int max_nested_state_len;
diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index 74e3c5785b5..94c828ac8da 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -85,7 +85,7 @@ static bool kvm_cpus_are_resettable(void)
 return !kvm_enabled() || !kvm_state->guest_state_protected;
 }
 
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 static int kvm_update_guest_debug_ops(CPUState *cpu)
 {
 return k
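
The header pattern this patch relies on can be shown in isolation: a per-target define selects either the real prototype or an inline stub, so code that calls the function still builds on targets without the feature. This is a hedged sketch, not the QEMU header: demo_update_guest_debug is a hypothetical name, and returning -EINVAL from the stub is an assumption made for the example.

```c
#include <errno.h>

/*
 * TARGET_KVM_HAVE_GUEST_DEBUG would come from the per-target
 * configuration; it is deliberately left undefined here so that the
 * stub branch is the one being compiled.
 */
#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
int demo_update_guest_debug(unsigned long reinject_trap);
#else
static inline int demo_update_guest_debug(unsigned long reinject_trap)
{
    (void)reinject_trap;
    return -EINVAL;   /* feature not built for this target */
}
#endif
```

Because the switch no longer depends on what the kernel headers define, a header update that adds the capability for a new architecture cannot pull in code paths QEMU has not implemented.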

Re: [External] Re: [PATCH v4 2/2] memory tier: create CPUless memory tiers after obtaining HMAT info

2024-03-22 Thread Ho-Ren (Jack) Chuang

On Fri, Mar 22, 2024 at 1:41 AM Huang, Ying  wrote:
>
> "Ho-Ren (Jack) Chuang"  writes:
>
> > The current implementation treats emulated memory devices, such as
> > CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
> > (E820_TYPE_RAM). However, these emulated devices have different
> > characteristics than traditional DRAM, making it important to
> > distinguish them. Thus, we modify the tiered memory initialization process
> > to introduce a delay specifically for CPUless NUMA nodes. This delay
> > ensures that the memory tier initialization for these nodes is deferred
> > until HMAT information is obtained during the boot process. Finally,
> > demotion tables are recalculated at the end.
> >
> > * late_initcall(memory_tier_late_init);
> > Some device drivers may have initialized memory tiers between
> > `memory_tier_init()` and `memory_tier_late_init()`, potentially bringing
> > online memory nodes and configuring memory tiers. They should be excluded
> > in the late init.
> >
> > * Handle cases where there is no HMAT when creating memory tiers
> > There is a scenario where a CPUless node does not provide HMAT information.
> > If no HMAT is specified, it falls back to using the default DRAM tier.
> >
> > * Introduce another new lock `default_dram_perf_lock` for adist calculation
> > In the current implementation, iterating through CPUlist nodes requires
> > holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
> > trying to acquire the same lock, leading to a potential deadlock.
> > Therefore, we propose introducing a standalone `default_dram_perf_lock` to
> > protect `default_dram_perf_*`. This approach not only avoids deadlock
> > but also prevents holding a large lock simultaneously.
> >
> > * Upgrade `set_node_memory_tier` to support additional cases, including
> >   default DRAM, late CPUless, and hot-plugged initializations.
> > To cover hot-plugged memory nodes, `mt_calc_adistance()` and
> > `mt_find_alloc_memory_type()` are moved into `set_node_memory_tier()` to
> > handle cases where memtype is not initialized and where HMAT information is
> > available.
> >
> > * Introduce `default_memory_types` for those memory types that are not
> >   initialized by device drivers.
> > Because late initialized memory and default DRAM memory need to be managed,
> > a default memory type is created for storing all memory types that are
> > not initialized by device drivers and as a fallback.
> >
> > Signed-off-by: Ho-Ren (Jack) Chuang 
> > Signed-off-by: Hao Xiang 
> > ---
> >  mm/memory-tiers.c | 73 ---
> >  1 file changed, 63 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> > index 974af10cfdd8..9396330fa162 100644
> > --- a/mm/memory-tiers.c
> > +++ b/mm/memory-tiers.c
> > @@ -36,6 +36,11 @@ struct node_memory_type_map {
> >
> >  static DEFINE_MUTEX(memory_tier_lock);
> >  static LIST_HEAD(memory_tiers);
> > +/*
> > + * The list is used to store all memory types that are not created
> > + * by a device driver.
> > + */
> > +static LIST_HEAD(default_memory_types);
> >  static struct node_memory_type_map node_memory_types[MAX_NUMNODES];
> >  struct memory_dev_type *default_dram_type;
> >
> > @@ -108,6 +113,7 @@ static struct demotion_nodes *node_demotion 
> > __read_mostly;
> >
> >  static BLOCKING_NOTIFIER_HEAD(mt_adistance_algorithms);
> >
> > +static DEFINE_MUTEX(default_dram_perf_lock);
>
> Better to add comments about what is protected by this lock.
>

Thank you. I will add a comment like this:
+ /* The lock is used to protect `default_dram_perf*` info and nid. */
+static DEFINE_MUTEX(default_dram_perf_lock);

I also found an error path was not handled and
found the lock could be put closer to what it protects.
I will have them fixed in V5.

> >  static bool default_dram_perf_error;
> >  static struct access_coordinate default_dram_perf;
> >  static int default_dram_perf_ref_nid = NUMA_NO_NODE;
> > @@ -505,7 +511,8 @@ static inline void __init_node_memory_type(int node, 
> > struct memory_dev_type *mem
> >  static struct memory_tier *set_node_memory_tier(int node)
> >  {
> >   struct memory_tier *memtier;
> > - struct memory_dev_type *memtype;
> > + struct memory_dev_type *mtype;
>
> mtype may be referenced without initialization now below.
>

Good catch! Thank you.

Please check below.
I may have found a potential NULL pointer dereference.

> > + int adist = MEMTIER_ADISTANCE_DRAM;
> >   pg_data_t *pgdat = NODE_DATA(node);
> >
> >
> > @@ -514,11 +521,20 @@ static struct memory_tier *set_node_memory_tier(int 
> > node)
> >   if (!node_state(node, N_MEMORY))
> >   return ERR_PTR(-EINVAL);
> >
> > - __init_node_memory_type(node, default_dram_type);
> > + mt_calc_adistance(node, &adist);
> > + if (node_memory_types[node].memtype == NULL) {
> > + mtype = mt_find_alloc_memory_type(adist, 
> > &default_memory_types);
> > +

Re: [PATCH 5/7] hw/char/stm32l4x5_usart: Add options for serial parameters setting

2024-03-22 Thread Peter Maydell

On Sun, 17 Mar 2024 at 10:41, Arnaud Minier
 wrote:
>
> Add a function to change the settings of the
> serial connection.
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 
> ---
>  hw/char/stm32l4x5_usart.c | 97 +++
>  1 file changed, 97 insertions(+)
>
> diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
> index 958d05a56d..95e792d09d 100644
> --- a/hw/char/stm32l4x5_usart.c
> +++ b/hw/char/stm32l4x5_usart.c
> @@ -165,6 +165,91 @@ static int stm32l4x5_usart_base_can_receive(void *opaque)
>  return 0;
>  }
>
> +static void stm32l4x5_update_params(Stm32l4x5UsartBaseState *s)
> +{
> +int speed, parity, data_bits, stop_bits;
> +uint32_t value, usart_div;
> +QEMUSerialSetParams ssp;
> +
> +/* Select the parity type */
> +if (s->cr1 & R_CR1_PCE_MASK) {
> +if (s->cr1 & R_CR1_PS_MASK) {
> +parity = 'O';
> +} else {
> +parity = 'E';
> +}
> +} else {
> +parity = 'N';
> +}
> +
> +/* Select the number of stop bits */
> +value = FIELD_EX32(s->cr2, CR2, STOP);
> +if (value == 0b00) {
> +stop_bits = 1;
> +} else if (value == 0b10) {
> +stop_bits = 2;
> +} else {

I think this would read a little more clearly as

 switch (FIELD_EX32(s->cr2, CR2, STOP)) {
 case 0:
  stop_bits = 1;
  break;
 case 2:
  stop_bits = 2;
  break;
 default:
  [error case code]
 }

rather than using an if-else ladder. Similarly below.

> +/* TODO: raise an error here */
> +stop_bits = 1;
> +error_report(
> +"UNIMPLEMENTED: fractionnal stop bits; CR2[13:12] = %x",
> +value);

We generally use qemu_log_mask(LOG_UNIMP, ...) for
"this was a valid thing for the guest to do but our implementation
doesn't handle it", and qemu_log_mask(LOG_GUEST_ERROR, ...) for
"this was an invalid thing for the guest to do" (e.g. programming register
fields to reserved/undefined values), rather than using error_report().
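
A minimal, self-contained sketch of that distinction, applied to the CR2.STOP decode from the patch. The enum and function names are hypothetical stand-ins, not QEMU API; in the device model the SKETCH_LOG_UNIMP outcome would become a qemu_log_mask(LOG_UNIMP, ...) call and SKETCH_LOG_GUEST_ERROR a qemu_log_mask(LOG_GUEST_ERROR, ...) call. It assumes, as the patch's own message does, that the two remaining STOP encodings are fractional stop-bit settings, i.e. valid but unmodelled.

```c
/*
 * Hypothetical stand-ins for the two logging categories; in QEMU these
 * would correspond to LOG_UNIMP and LOG_GUEST_ERROR.
 */
enum sketch_log_kind {
    SKETCH_LOG_NONE,
    SKETCH_LOG_UNIMP,
    SKETCH_LOG_GUEST_ERROR,
};

/*
 * Decode a 2-bit CR2.STOP value. Returns the number of whole stop bits,
 * or 0 when the setting is not modelled; *kind says how to report it.
 */
static int sketch_decode_stop_bits(unsigned stop, enum sketch_log_kind *kind)
{
    *kind = SKETCH_LOG_NONE;
    switch (stop & 3) {
    case 0:
        return 1;
    case 2:
        return 2;
    default:
        /* Fractional stop bits: a legal request that is not emulated. */
        *kind = SKETCH_LOG_UNIMP;
        return 0;
    }
}
```

A reserved encoding in some other field would set SKETCH_LOG_GUEST_ERROR instead; the STOP field has no such case under the assumption above.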

> +return;
> +}
> +
> +/* Select the length of the word */
> +value = (FIELD_EX32(s->cr1, CR1, M1) << 1) | FIELD_EX32(s->cr1, CR1, M0);
> +if (value == 0b00) {
> +data_bits = 8;
> +} else if (value == 0b01) {
> +data_bits = 9;
> +} else if (value == 0b01) {
> +data_bits = 7;

These two arms both check against the same value, so one of them
must be wrong...
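
For illustration, a standalone decode of the two-bit field, assuming the usual STM32 encoding (M[1:0] = 00 for 8 data bits, 01 for 9, 10 for 7), under which the second arm is presumably the one meant to test 0b10. The function name is hypothetical and the reference manual remains the authority on the encoding.

```c
/*
 * Decode the word-length field M[1:0], where m1 is the high bit.
 * Returns the number of data bits, or -1 for the 0b11 encoding, which
 * this sketch treats as reserved (a device model would report it as a
 * guest error).
 */
static int sketch_decode_word_length(unsigned m1, unsigned m0)
{
    switch (((m1 & 1) << 1) | (m0 & 1)) {
    case 0:
        return 8;
    case 1:
        return 9;
    case 2:
        return 7;
    default:
        return -1;
    }
}
```

Written as a switch, a duplicated case label would be rejected by the compiler, which is one more argument for the switch form suggested earlier.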

> +} else {
> +/* TODO: Raise an error here */
> +data_bits = 8;
> +error_report("UNDEFINED: invalid word length, CR1.M = 0b11");
> +return;
> +}
> +
> +/* Select the baud rate */
> +value = FIELD_EX32(s->brr, BRR, BRR);
> +if (value < 16) {
> +/* TODO: Raise an error here */
> +error_report("UNDEFINED: BRR lesser than 16: %u", value);
> +return;
> +}
> +
> +if (FIELD_EX32(s->cr1, CR1, OVER8) == 0) {
> +/*
> + * Oversampling by 16
> + * BRR = USARTDIV
> + */
> +usart_div = value;
> +} else {
> +/*
> + * Oversampling by 8
> + * - BRR[2:0] = USARTDIV[3:0] shifted 1 bit to the right.
> + * - BRR[3] must be kept cleared.
> + * - BRR[15:4] = USARTDIV[15:4]
> + * - The frequency is multiplied by 2
> + */
> +usart_div = ((value & 0xFFF0) | ((value & 0x0007) << 1)) / 2;
> +}
> +
> +speed = clock_get_hz(s->clk) / usart_div;
> +
> +ssp.speed = speed;
> +ssp.parity= parity;
> +ssp.data_bits = data_bits;
> +ssp.stop_bits = stop_bits;
> +
> +qemu_chr_fe_ioctl(&s->chr, CHR_IOCTL_SERIAL_SET_PARAMS, &ssp);
> +
> +trace_stm32l4x5_usart_update_params(
> +speed, parity, data_bits, stop_bits, 0);

This is slightly weird indentation.

> +}
> +
>  static void stm32l4x5_update_irq(Stm32l4x5UsartBaseState *s)
>  {
>  if (((s->isr & R_ISR_WUF_MASK) && (s->cr3 & R_CR3_WUFIE_MASK))||
> @@ -318,16 +403,19 @@ static void stm32l4x5_usart_base_write(void *opaque, hwaddr addr,
>  switch (addr) {
>  case A_CR1:
>  s->cr1 = value;
> +stm32l4x5_update_params(s);
>  stm32l4x5_update_irq(s);
>  return;
>  case A_CR2:
>  s->cr2 = value;
> +stm32l4x5_update_params(s);
>  return;
>  case A_CR3:
>  s->cr3 = value;
>  return;
>  case A_BRR:
>  s->brr = value;
> +stm32l4x5_update_params(s);
>  return;
>  case A_GTPR:
>  s->gtpr = value;
> @@ -409,10 +497,19 @@ static void stm32l4x5_usart_base_init(Object *obj)
>  s->clk = qdev_init_clock_in(DEVICE(s), "clk", NULL, s, 0);
>  }
>
> +static int stm32l4x5_usart_base_post_load(void *opaque, int version_id)
> +{
> +Stm32l4x5UsartBaseState *s = (Stm32l4x5UsartBaseState *)opaque;
> +
> +stm32l4x5_update_params(s);
> +return 0;
> +}
> +
>  static const VMStateDesc

Re: [PATCH] virtio: move logging definitions to hw/virtio/virtio.h

2024-03-22 Thread Philippe Mathieu-Daudé


On 22/3/24 19:03, Paolo Bonzini wrote:

They are not included in upstream Linux, and therefore should not be
in standard-headers.  Otherwise, the next update to the headers would
eliminate them.

Cc: Michael S. Tsirkin 
Signed-off-by: Paolo Bonzini 
---
  include/hw/virtio/virtio.h  | 7 +++
  include/standard-headers/linux/virtio_pci.h | 7 ---
  2 files changed, 7 insertions(+), 7 deletions(-)


Fixes: cd341fd1ff ("hw/virtio: Add support for VDPA network simulation devices")

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 6/7] hw/arm: Add the USART to the stm32l4x5 SoC

2024-03-22 Thread Peter Maydell

On Sun, 17 Mar 2024 at 10:42, Arnaud Minier wrote:
>
> Add the USART to the SoC and connect it to the other implemented devices.
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 


> @@ -143,6 +172,7 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, Error **errp)
>  DeviceState *armv7m, *dev;
>  SysBusDevice *busdev;
>  uint32_t pin_index;
> +g_autofree char *name;

The idea with g_autofree is that you put it in a variable scope
which is the right size so that the automatic free on going out
of the scope is all that you need for freeing it. You've put this
one at the top function level, which then means you needed
to add extra manual g_free() calls.
>
>  if (!memory_region_init_rom(&s->flash, OBJECT(dev_soc), "flash",
>  sc->flash_size, errp)) {
> @@ -185,7 +215,7 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, Error **errp)
>
>  /* GPIOs */
>  for (unsigned i = 0; i < NUM_GPIOS; i++) {
> -g_autofree char *name = g_strdup_printf("%c", 'A' + i);
> +name = g_strdup_printf("%c", 'A' + i);
>  dev = DEVICE(&s->gpio[i]);
>  qdev_prop_set_string(dev, "name", name);
>  qdev_prop_set_uint32(dev, "mode-reset",
> @@ -199,6 +229,7 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, Error **errp)
>  name = g_strdup_printf("gpio%c-out", 'a' + i);
>  qdev_connect_clock_in(DEVICE(&s->gpio[i]), "clk",
>  qdev_get_clock_out(DEVICE(&(s->rcc)), name));
> +g_free(name);
>  if (!sysbus_realize(busdev, errp)) {
>  return;
>  }

For instance this code was correctly using the g_autofree, and
no longer is.

> @@ -279,6 +310,55 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, Error **errp)
>  sysbus_mmio_map(busdev, 0, RCC_BASE_ADDRESS);
>  sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(armv7m, RCC_IRQ));
>
> +/* USART devices */
> +for (int i = 0; i < STM_NUM_USARTS; i++) {
> +dev = DEVICE(&(s->usart[i]));
> +qdev_prop_set_chr(dev, "chardev", serial_hd(i));
> +name = g_strdup_printf("usart%d-out", i + 1);
> +qdev_connect_clock_in(dev, "clk",
> +qdev_get_clock_out(DEVICE(&(s->rcc)), name));
> +g_free(name);

You should declare a new local g_autofree char *name inside
this loop body, and then the g_free() isn't needed.
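
A self-contained sketch of the scoping point. GLib's g_autofree is built on the compiler's cleanup attribute, so the example uses that attribute directly with malloc/free to avoid depending on GLib; the sketch_ names and the free counter are hypothetical and exist only to make the behaviour observable. It needs GCC or Clang.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int sketch_free_count;   /* counts automatic frees, for illustration */

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void sketch_autofree_cleanup(void *p)
{
    void **pp = (void **)p;

    free(*pp);
    sketch_free_count++;
}

#define sketch_autofree __attribute__((cleanup(sketch_autofree_cleanup)))

/*
 * Builds one clock name per iteration. Because the variable is declared
 * inside the loop body, each string is released when that iteration
 * ends, with no explicit free() anywhere. Returns the summed name length.
 */
static int sketch_name_devices(int count)
{
    int total_len = 0;

    for (int i = 0; i < count; i++) {
        sketch_autofree char *name = malloc(32);

        if (!name) {
            return -1;   /* the cleanup handler still runs; free(NULL) is fine */
        }
        snprintf(name, 32, "usart%d-out", i + 1);
        total_len += (int)strlen(name);
    }
    return total_len;
}
```

Declaring the variable once at function scope instead would free only the last value automatically, which is why the patch had to add manual g_free() calls.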

> +busdev = SYS_BUS_DEVICE(dev);
> +if (!sysbus_realize(busdev, errp)) {
> +return;
> +}
> +sysbus_mmio_map(busdev, 0, usart_addr[i]);
> +sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(armv7m, usart_irq[i]));
> +}
> +
> +/*
> + * TODO: Connect the USARTs, UARTs and LPUART to the EXTI once the EXTI
> + * can handle other gpio-in than the gpios. (e.g. Direct Lines for the usarts)
> + */
> +
> +/* UART devices */
> +for (int i = 0; i < STM_NUM_UARTS; i++) {
> +dev = DEVICE(&(s->uart[i]));
> +qdev_prop_set_chr(dev, "chardev", serial_hd(STM_NUM_USARTS + i));
> +name = g_strdup_printf("uart%d-out", STM_NUM_USARTS + i + 1);
> +qdev_connect_clock_in(dev, "clk",
> +qdev_get_clock_out(DEVICE(&(s->rcc)), name));
> +g_free(name);

Similarly here.

> +busdev = SYS_BUS_DEVICE(dev);
> +if (!sysbus_realize(busdev, errp)) {
> +return;
> +}
> +sysbus_mmio_map(busdev, 0, uart_addr[i]);
> +sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(armv7m, uart_irq[i]));
> +}
> +
> +/* LPUART device*/
> +dev = DEVICE(&(s->lpuart));
> +qdev_prop_set_chr(dev, "chardev", serial_hd(STM_NUM_USARTS + STM_NUM_UARTS));
> +qdev_connect_clock_in(dev, "clk",
> +qdev_get_clock_out(DEVICE(&(s->rcc)), "lpuart1-out"));
> +busdev = SYS_BUS_DEVICE(dev);
> +if (!sysbus_realize(busdev, errp)) {
> +return;
> +}
> +sysbus_mmio_map(busdev, 0, LPUART_BASE_ADDRESS);
> +sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(armv7m, LPUART_IRQ));
> +
>  /* APB1 BUS */
>  create_unimplemented_device("TIM2",  0x4000, 0x400);
>  create_unimplemented_device("TIM3",  0x4400, 0x400);

thanks
-- PMM

Re: [PATCH 7/7] tests/qtest: Add tests for the STM32L4x5 USART

2024-03-22 Thread Peter Maydell

On Sun, 17 Mar 2024 at 10:42, Arnaud Minier wrote:
>
> Test:
> - read/write from/to the usart registers
> - send/receive a character/string over the serial port
>
> The test to detect overrun is implemented but disabled
> because overruns are currently impossible due to how we signal
> in the USART when we are ready to receive a new character.
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 
> ---



> --- /dev/null
> +++ b/tests/qtest/stm32l4x5_usart-test.c
> @@ -0,0 +1,399 @@
> +/*
> + * QTest testcase for STML4X5_USART
> + *
> + * Copyright (c) 2023 Arnaud Minier 
> + * Copyright (c) 2023 Inès Varhol 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "libqtest-single.h"
> +#include "hw/misc/stm32l4x5_rcc_internals.h"
> +#include "hw/registerfields.h"
> +
> +/*
> + * All page references in the following test
> + * refer to the ST RM0351 Reference Manuel.

"Manual". Also, what page references? I couldn't see any
from a quick scroll through.


> + */
> +
> +#define RCC_BASE_ADDR 0x40021000
> +/* Use USART 1 ADDR, assume the others work the same */
> +#define USART1_BASE_ADDR 0x40013800
> +
> +/* See stm32l4x5_usart for definitions */
> +REG32(CR1, 0x00)
> +FIELD(CR1, M1, 28, 1)
> +FIELD(CR1, OVER8, 15, 1)
> +FIELD(CR1, M0, 12, 1)
> +FIELD(CR1, PCE, 10, 1)
> +FIELD(CR1, TXEIE, 7, 1)
> +FIELD(CR1, RXNEIE, 5, 1)
> +FIELD(CR1, TE, 3, 1)
> +FIELD(CR1, RE, 2, 1)
> +FIELD(CR1, UE, 0, 1)
> +REG32(CR2, 0x04)
> +REG32(CR3, 0x08)
> +FIELD(CR3, OVRDIS, 12, 1)
> +REG32(BRR, 0x0C)
> +REG32(GTPR, 0x10)
> +REG32(RTOR, 0x14)
> +REG32(RQR, 0x18)
> +REG32(ISR, 0x1C)
> +FIELD(ISR, TXE, 7, 1)
> +FIELD(ISR, RXNE, 5, 1)
> +FIELD(ISR, ORE, 3, 1)
> +REG32(ICR, 0x20)
> +REG32(RDR, 0x24)
> +REG32(TDR, 0x28)
> +
> +#define NVIC_ISPR1 0XE000E204
> +#define NVIC_ICPR1 0xE000E284
> +#define USART1_IRQ 37
> +
> +static bool check_nvic_pending(QTestState *qts, unsigned int n)
> +{
> +/* No USART interrupts are less than 32 */
> +if (n < 32) {
> +return false;
> +}

I think this would be better as an assert() -- it would be
a bug in the test case if it called this with a bad interrupt number.
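
A small sketch of the assert-based form, reduced to the bit computation so that it stands alone. The helper name is hypothetical; it relies on NVIC_ISPR1 and NVIC_ICPR1 covering interrupts 32 to 63.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit mask for interrupt n within NVIC_ISPR1/NVIC_ICPR1, which cover
 * interrupts 32..63. Any other n is a bug in the test itself, so it
 * trips the assertion instead of silently returning false.
 */
static uint32_t sketch_nvic_bank1_mask(unsigned int n)
{
    assert(n >= 32 && n < 64);
    return UINT32_C(1) << (n - 32);
}
```

With USART1_IRQ at 37 this selects bit 5, and the callers no longer need a "could not check" return path.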

> +n -= 32;
> +return qtest_readl(qts, NVIC_ISPR1) & (1 << n);
> +}
> +
> +static bool clear_nvic_pending(QTestState *qts, unsigned int n)
> +{
> +/* No USART interrupts are less than 32 */
> +if (n < 32) {
> +return false;
> +}

Similarly here.

> +n -= 32;
> +qtest_writel(qts, NVIC_ICPR1, (1 << n));
> +return true;
> +}
> +
> +static void usart_writel(unsigned int offset, uint32_t value)
> +{
> +writel(USART1_BASE_ADDR + offset, value);
> +}
> +
> +static uint32_t usart_readl(unsigned int offset)
> +{
> +return readl(USART1_BASE_ADDR + offset);
> +}
> +
> +static bool usart_wait_for_flag(QTestState *qts, uint32_t event_addr, uint32_t flag)
> +{
> +/* Wait at most 5 seconds */
> +for (int i = 0; i < 5000; i++) {
> +if ((qtest_readl(qts, event_addr) & flag)) {
> +return true;
> +}
> +g_usleep(1000);
> +}

qtest tests should never need to sleep(). If you need to advance
the guest clock, use clock_step(). If this is because we've sent some
data to the UART over the socket and need to wait for it to appear
in the QEMU process, that might be trickier, but I would see if
advancing the guest clock works reliably for that.

In general having tests which encode "wait for some wallclock
time" is flaky, because while it might be plenty of time on your
fast development machine, it can cause intermittent failures due
to timeouts if the test is on some heavily-loaded slow CI runner.

I note that the microbit version of this "wait for the UART"
has a timeout of 10 minutes.
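
A rough sketch of a wait loop that advances virtual time instead of sleeping. Everything named sketch_ is hypothetical: the two callbacks stand in for a register read and a guest-clock step (in a real qtest these would be qtest_readl() and qtest_clock_step()), and the toy device exists only so the example is self-contained. Whether stepping the clock is sufficient for data arriving over the socket is the open question raised above; the sketch does not settle it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks standing in for a register read and a clock step. */
typedef uint32_t (*sketch_read_fn)(void *opaque, uint32_t addr);
typedef void (*sketch_step_fn)(void *opaque, int64_t ns);

/*
 * Poll addr until any bit in flag is set, advancing virtual time by
 * step_ns between reads. Gives up after max_steps reads, so the bound
 * is on guest time rather than on wall-clock time.
 */
static bool sketch_wait_for_flag(void *opaque, sketch_read_fn read_reg,
                                 sketch_step_fn step_clock,
                                 uint32_t addr, uint32_t flag,
                                 int64_t step_ns, int max_steps)
{
    for (int i = 0; i < max_steps; i++) {
        if (read_reg(opaque, addr) & flag) {
            return true;
        }
        step_clock(opaque, step_ns);
    }
    return false;
}

/* Toy device: bit 5 reads as set once 3000 ns of virtual time have passed. */
struct sketch_fake_dev {
    int64_t now_ns;
};

static uint32_t sketch_fake_read(void *opaque, uint32_t addr)
{
    struct sketch_fake_dev *dev = opaque;

    (void)addr;
    return dev->now_ns >= 3000 ? (UINT32_C(1) << 5) : 0;
}

static void sketch_fake_step(void *opaque, int64_t ns)
{
    struct sketch_fake_dev *dev = opaque;

    dev->now_ns += ns;
}
```

Because the loop never consults the host clock, its outcome does not depend on how loaded the machine running the test is.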

> +
> +return false;
> +}



> +int main(int argc, char **argv)
> +{
> +int ret;
> +
> +g_test_init(&argc, &argv, NULL);
> +g_test_set_nonfatal_assertions();
> +
> +qtest_add_func("stm32l4x5/usart/write_read", test_write_read);
> +qtest_add_func("stm32l4x5/usart/receive_char", test_receive_char);
> +qtest_add_func("stm32l4x5/usart/send_char", test_send_char);
> +qtest_add_func("stm32l4x5/usart/receive_str", test_receive_str);
> +qtest_add_func("stm32l4x5/usart/send_str", test_send_str);
> +/* Disabled tests */
> +if (false) {
> +qtest_add_func("stm32l4x5/usart/overrun", test_overrun);
> +}

If the test doesn't work because QEMU's implementation never
generates overruns, I would just not put it in the file, rather
than having dead code we never run.

> +qtest_start("-machine b-l475e-iot01a");
> +ret = g_test_run();
> +qtest_end();
> +
> +return ret;
> +}

thanks
-- PMM

[PATCH 10/26] [TO SQUASH] hw/i386: Remove redeclaration of struct setup_data

2024-03-22 Thread Paolo Bonzini

From: Michael Roth 

It is now provided by kernel headers.

This needs to be squashed with the header update to avoid temporary
build bisect breakage. Keeping it separate for reference.

Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-6-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 hw/i386/x86.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index ffbda48917f..84a48019770 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -679,14 +679,6 @@ DeviceState *ioapic_init_secondary(GSIState *gsi_state)
 return dev;
 }
 
-struct setup_data {
-uint64_t next;
-uint32_t type;
-uint32_t len;
-uint8_t data[];
-} __attribute__((packed));
-
-
 /*
  * The entry point into the kernel for PVH boot is different from
  * the native entry point.  The PVH entry is defined by the x86/HVM
-- 
2.44.0

[PATCH 13/26] KVM: remove kvm_arch_cpu_check_are_resettable

2024-03-22 Thread Paolo Bonzini

Board reset requires writing a fresh CPU state.  As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.

Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm.h   | 10 --
 accel/kvm/kvm-accel-ops.c  |  2 +-
 accel/kvm/kvm-all.c|  5 -
 target/arm/kvm.c   |  5 -
 target/i386/kvm/kvm.c  |  5 -
 target/loongarch/kvm/kvm.c |  5 -
 target/mips/kvm.c  |  5 -
 target/ppc/kvm.c   |  5 -
 target/riscv/kvm/kvm-cpu.c |  5 -
 target/s390x/kvm/kvm.c |  5 -
 10 files changed, 1 insertion(+), 51 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 302e8f6f1e5..54f4d83a370 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -525,16 +525,6 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
 /* Notify resamplefd for EOI of specific interrupts. */
 void kvm_resample_fd_notify(int gsi);
 
-/**
- * kvm_cpu_check_are_resettable - return whether CPUs can be reset
- *
- * Returns: true: CPUs are resettable
- *  false: CPUs are not resettable
- */
-bool kvm_cpu_check_are_resettable(void);
-
-bool kvm_arch_cpu_check_are_resettable(void);
-
 bool kvm_dirty_ring_enabled(void);
 
 uint32_t kvm_dirty_ring_size(void);
diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index b3c946dc4b4..74e3c5785b5 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -82,7 +82,7 @@ static bool kvm_vcpu_thread_is_idle(CPUState *cpu)
 
 static bool kvm_cpus_are_resettable(void)
 {
-return !kvm_enabled() || kvm_cpu_check_are_resettable();
+return !kvm_enabled() || !kvm_state->guest_state_protected;
 }
 
 #ifdef KVM_CAP_SET_GUEST_DEBUG
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 05fa3533c66..a05dea23133 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2691,11 +2691,6 @@ void kvm_flush_coalesced_mmio_buffer(void)
 s->coalesced_flush_in_progress = false;
 }
 
-bool kvm_cpu_check_are_resettable(void)
-{
-return kvm_arch_cpu_check_are_resettable();
-}
-
 static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 {
 if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index ab85d628a8b..21ebbf3b8f8 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1598,11 +1598,6 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 return (data - 32) & 0x;
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return true;
-}
-
 static void kvm_arch_get_eager_split_size(Object *obj, Visitor *v,
   const char *name, void *opaque,
   Error **errp)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index de10155b37a..0ec69109a2b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5614,11 +5614,6 @@ bool kvm_has_waitpkg(void)
 return has_msr_umwait;
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return !sev_es_enabled();
-}
-
 #define ARCH_REQ_XCOMP_GUEST_PERM   0x1025
 
 void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask)
diff --git a/target/loongarch/kvm/kvm.c b/target/loongarch/kvm/kvm.c
index d630cc39cb2..8224d943331 100644
--- a/target/loongarch/kvm/kvm.c
+++ b/target/loongarch/kvm/kvm.c
@@ -733,11 +733,6 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cs)
 return true;
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return true;
-}
-
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 {
 int ret = 0;
diff --git a/target/mips/kvm.c b/target/mips/kvm.c
index 6c52e59f55d..a631ab544f5 100644
--- a/target/mips/kvm.c
+++ b/target/mips/kvm.c
@@ -1273,11 +1273,6 @@ int kvm_arch_get_default_type(MachineState *machine)
 return -1;
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return true;
-}
-
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 8231feb2d45..63930d4a77d 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2956,11 +2956,6 @@ void kvmppc_set_reg_tb_offset(PowerPCCPU *cpu, int64_t tb_offset)
 }
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return true;
-}
-
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index cda7d78a778..135d87dc3f5 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1466,11 +1466,6 @@ void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level)
 }
 }
 
-bool kvm_arch_cpu_check_are_resettable(void)
-{
-return true;
-}
-
 static int aia_mode;
 
 static const char *kvm_aia_mode_str(uint64_t mode)
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 4ce809c5d46..4dcd757cdcc 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2622,11 +2622,6 @

[PATCH 06/26] s390: Switch to use confidential_guest_kvm_init()

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Use unified confidential_guest_kvm_init() for consistency with
other architectures.

Signed-off-by: Xiaoyao Li 
Message-Id: <20240229060038.606591-1-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 target/s390x/kvm/pv.h  | 14 --
 hw/s390x/s390-virtio-ccw.c |  5 -
 target/s390x/kvm/pv.c  |  8 
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/target/s390x/kvm/pv.h b/target/s390x/kvm/pv.h
index 5877d28ff10..4b408174391 100644
--- a/target/s390x/kvm/pv.h
+++ b/target/s390x/kvm/pv.h
@@ -80,18 +80,4 @@ static inline int kvm_s390_dump_mem_state(uint64_t addr, size_t len,
 static inline int kvm_s390_dump_completion_data(void *buff) { return 0; }
 #endif /* CONFIG_KVM */
 
-int s390_pv_kvm_init(ConfidentialGuestSupport *cgs, Error **errp);
-static inline int s390_pv_init(ConfidentialGuestSupport *cgs, Error **errp)
-{
-if (!cgs) {
-return 0;
-}
-if (kvm_enabled()) {
-return s390_pv_kvm_init(cgs, errp);
-}
-
-error_setg(errp, "Protected Virtualization requires KVM");
-return -1;
-}
-
 #endif /* HW_S390_PV_H */
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index b1dcb3857f0..e35b90ed83c 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -14,6 +14,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "exec/ram_addr.h"
+#include "exec/confidential-guest-support.h"
 #include "hw/s390x/s390-virtio-hcall.h"
 #include "hw/s390x/sclp.h"
 #include "hw/s390x/s390_flic.h"
@@ -260,7 +261,9 @@ static void ccw_init(MachineState *machine)
 s390_init_cpus(machine);
 
 /* Need CPU model to be determined before we can set up PV */
-s390_pv_init(machine->cgs, &error_fatal);
+if (machine->cgs) {
+confidential_guest_kvm_init(machine->cgs, &error_fatal);
+}
 
 s390_flic_init();
 
diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
index 7ca7faec73e..c04d53753bf 100644
--- a/target/s390x/kvm/pv.c
+++ b/target/s390x/kvm/pv.c
@@ -340,6 +340,11 @@ int s390_pv_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 return 0;
 }
 
+if (!kvm_enabled()) {
+error_setg(errp, "Protected Virtualization requires KVM");
+return -1;
+}
+
 if (!s390_has_feat(S390_FEAT_UNPACK)) {
 error_setg(errp,
"CPU model does not support Protected Virtualization");
@@ -364,6 +369,9 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(S390PVGuest,
 
 static void s390_pv_guest_class_init(ObjectClass *oc, void *data)
 {
+ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
+
+klass->kvm_init = s390_pv_kvm_init;
 }
 
 static void s390_pv_guest_init(Object *obj)
-- 
2.44.0

[PATCH 21/26] kvm/memory: Make memory type private by default if it has guest memfd backend

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

KVM leaves memory shared by default, which may incur the overhead of a
page conversion on the first access to each page, because pages are
expected to be private for VMs that require private memory (i.e. those
that have a guest memfd).

Explicitly set the memory to private when the memory region has a valid
guest memfd backend.

Signed-off-by: Xiaoyao Li 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-16-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 7fbaf31cbaf..56b17cbd8aa 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1430,6 +1430,16 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
 strerror(-err));
 abort();
 }
+
+if (memory_region_has_guest_memfd(mr)) {
+err = kvm_set_memory_attributes_private(start_addr, slot_size);
+if (err) {
+error_report("%s: failed to set memory attribute private: %s\n",
+ __func__, strerror(-err));
+exit(1);
+}
+}
+
 start_addr += slot_size;
 ram_start_offset += slot_size;
 ram += slot_size;
-- 
2.44.0

[PATCH 23/26] RAMBlock: make guest_memfd require uncoordinated discard

2024-03-22 Thread Paolo Bonzini

Some subsystems like VFIO might disable ram block discard, but guest_memfd
uses discard operations to implement conversions between private and
shared memory.  Because of this, sequences like the following can result
in stale IOMMU mappings:

1. allocate shared page
2. convert page shared->private
3. discard shared page
4. convert page private->shared
5. allocate shared page
6. issue DMA operations against that shared page

This is not a use-after-free, because after step 3 VFIO is still pinning
the page.  However, DMA operations in step 6 will hit the old mapping
that was allocated in step 1.

Address this by taking ram_block_discard_is_enabled() into account when
deciding whether or not to discard pages.

Since kvm_convert_memory()/guest_memfd doesn't implement a
RamDiscardManager handler to convey and replay discard operations,
this is a case of uncoordinated discard, which is blocked/released
by ram_block_discard_require().  Interestingly, this function had
no use so far.

Alternative approaches would be to block discard of shared pages, but
this would cause guests to consume twice the memory if they use VFIO;
or to implement a RamDiscardManager and only block uncoordinated
discard, i.e. use ram_block_coordinated_discard_require().

[Commit message mostly by Michael Roth ]

Signed-off-by: Paolo Bonzini 
---
 system/physmem.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/system/physmem.c b/system/physmem.c
index f5dfa20e57e..5ebcf5be116 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1846,6 +1846,13 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
 assert(kvm_enabled());
 assert(new_block->guest_memfd < 0);
 
+if (ram_block_discard_require(true) < 0) {
+error_setg_errno(errp, errno,
+ "cannot set up private guest memory: discard currently blocked");
+error_append_hint(errp, "Are you using assigned devices?\n");
+goto out_free;
+}
+
 new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
 0, errp);
 if (new_block->guest_memfd < 0) {
@@ -2109,6 +2116,7 @@ static void reclaim_ramblock(RAMBlock *block)
 
 if (block->guest_memfd >= 0) {
 close(block->guest_memfd);
+ram_block_discard_require(false);
 }
 
 g_free(block);
-- 
2.44.0

[PATCH 16/26] target/i386: SEV: use KVM_SEV_INIT2 if possible

2024-03-22 Thread Paolo Bonzini

Implement support for the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM virtual
machine types, and the KVM_SEV_INIT2 function of KVM_MEMORY_ENCRYPT_OP.

These replace the KVM_SEV_INIT and KVM_SEV_ES_INIT functions, and have
several advantages:

- sharing the initialization sequence with SEV-SNP and TDX

- allowing arguments including the set of desired VMSA features

- protection against invalid use of KVM_GET/SET_* ioctls for guests
  with encrypted state

If the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types are not supported,
fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT (which use the
default x86 VM type).

Signed-off-by: Paolo Bonzini 
---
 target/i386/kvm/kvm.c |  2 ++
 target/i386/sev.c | 41 +
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index b599a7fae36..2577e345502 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -164,6 +164,8 @@ static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
 
 static const char *vm_type_name[] = {
 [KVM_X86_DEFAULT_VM] = "default",
+[KVM_X86_SEV_VM] = "SEV",
+[KVM_X86_SEV_ES_VM] = "SEV-ES",
 };
 
 bool kvm_is_vm_type_supported(int type)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index ebe36d4c10c..9dab4060b84 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -26,6 +26,7 @@
 #include "qemu/error-report.h"
 #include "crypto/hash.h"
 #include "sysemu/kvm.h"
+#include "kvm/kvm_i386.h"
 #include "sev.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/runstate.h"
@@ -56,6 +57,8 @@ OBJECT_DECLARE_SIMPLE_TYPE(SevGuestState, SEV_GUEST)
 struct SevGuestState {
 X86ConfidentialGuest parent_obj;
 
+int kvm_type;
+
 /* configuration parameters */
 char *sev_device;
 uint32_t policy;
@@ -850,6 +853,26 @@ sev_vm_state_change(void *opaque, bool running, RunState state)
 }
 }
 
+static int sev_kvm_type(X86ConfidentialGuest *cg)
+{
+SevGuestState *sev = SEV_GUEST(cg);
+int kvm_type;
+
+if (sev->kvm_type != -1) {
+goto out;
+}
+
+kvm_type = (sev->policy & SEV_POLICY_ES) ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM;
+if (kvm_is_vm_type_supported(kvm_type)) {
+sev->kvm_type = kvm_type;
+} else {
+sev->kvm_type = KVM_X86_DEFAULT_VM;
+}
+
+out:
+return sev->kvm_type;
+}
+
 static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 {
 SevGuestState *sev = SEV_GUEST(cgs);
@@ -929,13 +952,19 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
  __func__);
 goto err;
 }
-cmd = KVM_SEV_ES_INIT;
-} else {
-cmd = KVM_SEV_INIT;
 }
 
 trace_kvm_sev_init();
-ret = sev_ioctl(sev->sev_fd, cmd, NULL, &fw_error);
+if (sev_kvm_type(X86_CONFIDENTIAL_GUEST(sev)) == KVM_X86_DEFAULT_VM) {
+cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
+
+ret = sev_ioctl(sev->sev_fd, cmd, NULL, &fw_error);
+} else {
+struct kvm_sev_init args = { 0 };
+
+ret = sev_ioctl(sev->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
+}
+
 if (ret) {
 error_setg(errp, "%s: failed to initialize ret=%d fw_error=%d '%s'",
__func__, ret, fw_error, fw_error_to_str(fw_error));
@@ -1327,8 +1356,10 @@ static void
 sev_guest_class_init(ObjectClass *oc, void *data)
 {
 ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
+X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
 
 klass->kvm_init = sev_kvm_init;
+x86_klass->kvm_type = sev_kvm_type;
 
 object_class_property_add_str(oc, "sev-device",
   sev_guest_get_sev_device,
@@ -1357,6 +1388,8 @@ sev_guest_instance_init(Object *obj)
 {
 SevGuestState *sev = SEV_GUEST(obj);
 
+sev->kvm_type = -1;
+
 sev->sev_device = g_strdup(DEFAULT_SEV_DEVICE);
 sev->policy = DEFAULT_GUEST_POLICY;
 object_property_add_uint32_ptr(obj, "policy", &sev->policy,
-- 
2.44.0

[PATCH 15/26] target/i386: Implement mc->kvm_type() to get VM type

2024-03-22 Thread Paolo Bonzini

KVM is introducing a new API to create confidential guests, which
will be used by TDX and SEV-SNP but is also available for SEV and
SEV-ES.  The API uses the VM type argument to KVM_CREATE_VM to
identify which confidential computing technology to use.

Since there are no other expected uses of VM types, delegate
mc->kvm_type() for x86 boards to the confidential-guest-support
object pointed to by ms->cgs.

For example, if a sev-guest object is passed as confidential-guest-support,
as in

  qemu -machine ...,confidential-guest-support=sev0 \
   -object sev-guest,id=sev0,...

it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
is supported, and if so use them together with the KVM_SEV_INIT2
function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to
KVM_SEV_INIT and KVM_SEV_ES_INIT.

This is preparatory work towards TDX and SEV-SNP support, but it
will also enable support for VMSA features such as DebugSwap, which
are only available via KVM_SEV_INIT2.

Co-developed-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 
Signed-off-by: Paolo Bonzini 
---
 target/i386/confidential-guest.h | 19 ++
 target/i386/kvm/kvm_i386.h   |  2 ++
 hw/i386/x86.c| 11 
 target/i386/kvm/kvm.c| 44 
 4 files changed, 76 insertions(+)

diff --git a/target/i386/confidential-guest.h b/target/i386/confidential-guest.h
index ca12d5a8fba..532e172a60b 100644
--- a/target/i386/confidential-guest.h
+++ b/target/i386/confidential-guest.h
@@ -36,5 +36,24 @@ struct X86ConfidentialGuest {
 struct X86ConfidentialGuestClass {
 /*  */
 ConfidentialGuestSupportClass parent;
+
+/*  */
+int (*kvm_type)(X86ConfidentialGuest *cg);
 };
+
+/**
+ * x86_confidential_guest_kvm_type:
+ *
+ * Calls #X86ConfidentialGuestClass.kvm_type() callback of @cg.
+ */
+static inline int x86_confidential_guest_kvm_type(X86ConfidentialGuest *cg)
+{
+X86ConfidentialGuestClass *klass = X86_CONFIDENTIAL_GUEST_GET_CLASS(cg);
+
+if (klass->kvm_type) {
+return klass->kvm_type(cg);
+} else {
+return 0;
+}
+}
 #endif
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 30fedcffea3..6b44844d95d 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -37,6 +37,7 @@ bool kvm_hv_vpindex_settable(void);
 bool kvm_enable_sgx_provisioning(KVMState *s);
 bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
 
+int kvm_get_vm_type(MachineState *ms);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_after_reset_vcpu(X86CPU *cpu);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
@@ -49,6 +50,7 @@ void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask);
 
 #ifdef CONFIG_KVM
 
+bool kvm_is_vm_type_supported(int type);
 bool kvm_has_adjust_clock_stable(void);
 bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 84a48019770..3d5b51e92db 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1381,6 +1381,16 @@ static void machine_set_sgx_epc(Object *obj, Visitor *v, const char *name,
 qapi_free_SgxEPCList(list);
 }
 
+static int x86_kvm_type(MachineState *ms, const char *vm_type)
+{
+/*
+ * No x86 machine has a kvm-type property.  If one is added that has
+ * it, it should call kvm_get_vm_type() directly or not use it at all.
+ */
+assert(vm_type == NULL);
+return kvm_enabled() ? kvm_get_vm_type(ms) : 0;
+}
+
 static void x86_machine_initfn(Object *obj)
 {
 X86MachineState *x86ms = X86_MACHINE(obj);
@@ -1405,6 +1415,7 @@ static void x86_machine_class_init(ObjectClass *oc, void *data)
 mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
 mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
 mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
+mc->kvm_type = x86_kvm_type;
 x86mc->save_tsc_khz = true;
 x86mc->fwcfg_dma_enabled = true;
 nc->nmi_monitor_handler = x86_nmi;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 0ec69109a2b..b599a7fae36 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -31,6 +31,7 @@
 #include "sysemu/kvm_int.h"
 #include "sysemu/runstate.h"
 #include "kvm_i386.h"
+#include "../confidential-guest.h"
 #include "sev.h"
 #include "xen-emu.h"
 #include "hyperv.h"
@@ -161,6 +162,49 @@ static KVMMSRHandlers msr_handlers[KVM_MSR_FILTER_MAX_RANGES];
 static RateLimit bus_lock_ratelimit_ctrl;
 static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
 
+static const char *vm_type_name[] = {
+[KVM_X86_DEFAULT_VM] = "default",
+};
+
+bool kvm_is_vm_type_supported(int type)
+{
+uint32_t machine_types;
+
+/*
+ * old KVM doesn't support KVM_CAP_VM_TYPES but KVM_X86_DEFAULT_VM
+ * is always supported
+ */
+if (type == KVM_X86_DEFAULT_VM) {
+return true;
+}
+
+machine_types = 
kvm_check_extension(KVM_STATE(current_machine->accelerator),
+

[PATCH 14/26] target/i386: introduce x86-confidential-guest

2024-03-22 Thread Paolo Bonzini

Introduce a common superclass for x86 confidential guest implementations.
It will extend ConfidentialGuestSupportClass with a method that provides
the VM type to be passed to KVM_CREATE_VM.

Signed-off-by: Paolo Bonzini 
---
 target/i386/confidential-guest.h | 40 
 target/i386/confidential-guest.c | 33 ++
 target/i386/sev.c|  6 ++---
 target/i386/meson.build  |  2 +-
 4 files changed, 77 insertions(+), 4 deletions(-)
 create mode 100644 target/i386/confidential-guest.h
 create mode 100644 target/i386/confidential-guest.c

diff --git a/target/i386/confidential-guest.h b/target/i386/confidential-guest.h
new file mode 100644
index 000..ca12d5a8fba
--- /dev/null
+++ b/target/i386/confidential-guest.h
@@ -0,0 +1,40 @@
+/*
+ * x86-specific confidential guest methods.
+ *
+ * Copyright (c) 2024 Red Hat Inc.
+ *
+ * Authors:
+ *  Paolo Bonzini 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef TARGET_I386_CG_H
+#define TARGET_I386_CG_H
+
+#include "qom/object.h"
+
+#include "exec/confidential-guest-support.h"
+
+#define TYPE_X86_CONFIDENTIAL_GUEST "x86-confidential-guest"
+
+OBJECT_DECLARE_TYPE(X86ConfidentialGuest,
+X86ConfidentialGuestClass,
+X86_CONFIDENTIAL_GUEST)
+
+struct X86ConfidentialGuest {
+/* <private> */
+ConfidentialGuestSupport parent_obj;
+};
+
+/**
+ * X86ConfidentialGuestClass:
+ *
+ * Class to be implemented by confidential-guest-support concrete objects
+ * for the x86 target.
+ */
+struct X86ConfidentialGuestClass {
+/* <private> */
+ConfidentialGuestSupportClass parent;
+};
+#endif
diff --git a/target/i386/confidential-guest.c b/target/i386/confidential-guest.c
new file mode 100644
index 000..b3727845adc
--- /dev/null
+++ b/target/i386/confidential-guest.c
@@ -0,0 +1,33 @@
+/*
+ * QEMU Confidential Guest support
+ *
+ * Copyright (C) 2024 Red Hat, Inc.
+ *
+ * Authors:
+ *  Paolo Bonzini 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "confidential-guest.h"
+
+OBJECT_DEFINE_ABSTRACT_TYPE(X86ConfidentialGuest,
+x86_confidential_guest,
+X86_CONFIDENTIAL_GUEST,
+CONFIDENTIAL_GUEST_SUPPORT)
+
+static void x86_confidential_guest_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void x86_confidential_guest_init(Object *obj)
+{
+}
+
+static void x86_confidential_guest_finalize(Object *obj)
+{
+}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index c49a8fd55eb..ebe36d4c10c 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -35,7 +35,7 @@
 #include "monitor/monitor.h"
 #include "monitor/hmp-target.h"
 #include "qapi/qapi-commands-misc-target.h"
-#include "exec/confidential-guest-support.h"
+#include "confidential-guest.h"
 #include "hw/i386/pc.h"
 #include "exec/address-spaces.h"
 
@@ -54,7 +54,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(SevGuestState, SEV_GUEST)
  * -machine ...,memory-encryption=sev0
  */
 struct SevGuestState {
-ConfidentialGuestSupport parent_obj;
+X86ConfidentialGuest parent_obj;
 
 /* configuration parameters */
 char *sev_device;
@@ -1372,7 +1372,7 @@ sev_guest_instance_init(Object *obj)
 
 /* sev guest info */
 static const TypeInfo sev_guest_info = {
-.parent = TYPE_CONFIDENTIAL_GUEST_SUPPORT,
+.parent = TYPE_X86_CONFIDENTIAL_GUEST,
 .name = TYPE_SEV_GUEST,
 .instance_size = sizeof(SevGuestState),
 .instance_finalize = sev_guest_finalize,
diff --git a/target/i386/meson.build b/target/i386/meson.build
index 7c74bfa8591..8abce725f86 100644
--- a/target/i386/meson.build
+++ b/target/i386/meson.build
@@ -6,7 +6,7 @@ i386_ss.add(files(
   'xsave_helper.c',
   'cpu-dump.c',
 ))
-i386_ss.add(when: 'CONFIG_SEV', if_true: files('host-cpu.c'))
+i386_ss.add(when: 'CONFIG_SEV', if_true: files('host-cpu.c', 'confidential-guest.c'))
 
 # x86 cpu type
 i386_ss.add(when: 'CONFIG_KVM', if_true: files('host-cpu.c'))
-- 
2.44.0

[PATCH 09/26] [HACK] linux-headers: Update headers for 6.8 + kvm-coco-queue + SNP

2024-03-22 Thread Paolo Bonzini

From: Michael Roth 

Pull in 6.8 kvm-next + kvm-coco-queue + SNP headers.

Signed-off-by: Michael Roth 
Signed-off-by: Paolo Bonzini 
---
 include/standard-headers/asm-x86/bootparam.h  |  17 +-
 include/standard-headers/asm-x86/kvm_para.h   |   3 +-
 include/standard-headers/asm-x86/setup_data.h |  83 ++
 include/standard-headers/linux/ethtool.h  |  48 ++
 include/standard-headers/linux/fuse.h |  39 +-
 .../linux/input-event-codes.h |   1 +
 include/standard-headers/linux/virtio_gpu.h   |   2 +
 include/standard-headers/linux/virtio_snd.h   | 154 
 linux-headers/asm-arm64/kvm.h |  15 +-
 linux-headers/asm-arm64/sve_context.h |  11 +
 linux-headers/asm-generic/bitsperlong.h   |   4 +
 linux-headers/asm-loongarch/kvm.h |   2 -
 linux-headers/asm-mips/kvm.h  |   2 -
 linux-headers/asm-powerpc/kvm.h   |  45 +-
 linux-headers/asm-riscv/kvm.h |   3 +-
 linux-headers/asm-s390/kvm.h  | 315 +++-
 linux-headers/asm-x86/kvm.h   | 364 -
 linux-headers/linux/bits.h|  15 +
 linux-headers/linux/kvm.h | 717 +-
 linux-headers/linux/psp-sev.h |  71 ++
 20 files changed, 1186 insertions(+), 725 deletions(-)
 create mode 100644 include/standard-headers/asm-x86/setup_data.h
 create mode 100644 linux-headers/linux/bits.h

diff --git a/include/standard-headers/asm-x86/bootparam.h b/include/standard-headers/asm-x86/bootparam.h
index 0b06d2bff1b..b582a105c08 100644
--- a/include/standard-headers/asm-x86/bootparam.h
+++ b/include/standard-headers/asm-x86/bootparam.h
@@ -2,21 +2,7 @@
 #ifndef _ASM_X86_BOOTPARAM_H
 #define _ASM_X86_BOOTPARAM_H
 
-/* setup_data/setup_indirect types */
-#define SETUP_NONE 0
-#define SETUP_E820_EXT 1
-#define SETUP_DTB  2
-#define SETUP_PCI  3
-#define SETUP_EFI  4
-#define SETUP_APPLE_PROPERTIES 5
-#define SETUP_JAILHOUSE6
-#define SETUP_CC_BLOB  7
-#define SETUP_IMA  8
-#define SETUP_RNG_SEED 9
-#define SETUP_ENUM_MAX SETUP_RNG_SEED
-
-#define SETUP_INDIRECT (1<<31)
-#define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT)
+#include "standard-headers/asm-x86/setup_data.h"
 
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK   0x07FF
@@ -38,6 +24,7 @@
 #define XLF_EFI_KEXEC  (1<<4)
 #define XLF_5LEVEL (1<<5)
 #define XLF_5LEVEL_ENABLED (1<<6)
+#define XLF_MEM_ENCRYPTION (1<<7)
 
 
 #endif /* _ASM_X86_BOOTPARAM_H */
diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index f0235e58a1d..9a011d20f01 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -92,7 +92,7 @@ struct kvm_clock_pairing {
 #define KVM_ASYNC_PF_DELIVERY_AS_INT   (1 << 3)
 
 /* MSR_KVM_ASYNC_PF_INT */
-#define KVM_ASYNC_PF_VEC_MASK  GENMASK(7, 0)
+#define KVM_ASYNC_PF_VEC_MASK  __GENMASK(7, 0)
 
 /* MSR_KVM_MIGRATION_CONTROL */
 #define KVM_MIGRATION_READY(1 << 0)
@@ -142,7 +142,6 @@ struct kvm_vcpu_pv_apf_data {
uint32_t token;
 
uint8_t pad[56];
-   uint32_t enabled;
 };
 
 #define KVM_PV_EOI_BIT 0
diff --git a/include/standard-headers/asm-x86/setup_data.h b/include/standard-headers/asm-x86/setup_data.h
new file mode 100644
index 000..09355f54c55
--- /dev/null
+++ b/include/standard-headers/asm-x86/setup_data.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_X86_SETUP_DATA_H
+#define _ASM_X86_SETUP_DATA_H
+
+/* setup_data/setup_indirect types */
+#define SETUP_NONE 0
+#define SETUP_E820_EXT 1
+#define SETUP_DTB  2
+#define SETUP_PCI  3
+#define SETUP_EFI  4
+#define SETUP_APPLE_PROPERTIES 5
+#define SETUP_JAILHOUSE6
+#define SETUP_CC_BLOB  7
+#define SETUP_IMA  8
+#define SETUP_RNG_SEED 9
+#define SETUP_ENUM_MAX SETUP_RNG_SEED
+
+#define SETUP_INDIRECT (1<<31)
+#define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT)
+
+#ifndef __ASSEMBLY__
+
+#include "standard-headers/linux/types.h"
+
+/* extensible setup data list node */
+struct setup_data {
+   uint64_t next;
+   uint32_t type;
+   uint32_t len;
+   uint8_t data[];
+};
+
+/* extensible setup indirect data node */
+struct setup_indirect {
+   uint32_t type;
+   uint32_t reserved;  /* Reserved, must be set to zero. */
+   uint64_t len;
+   uint64_t addr;
+};
+
+/*
+ * The E8

[PATCH 22/26] HostMem: Add mechanism to opt in kvm guest memfd via MachineState

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Add a new member "guest_memfd" to memory backends. When it is set
to true, it enables RAM_GUEST_MEMFD in ram_flags, so that a private KVM
guest_memfd is allocated during RAMBlock allocation.

The memory backend's @guest_memfd is wired to the @require_guest_memfd
field of MachineState. This avoids looking up the machine in physmem.c.

MachineState::require_guest_memfd is supposed to be set by any VM
that requires KVM guest_memfd as private memory, e.g., a TDX VM.

Signed-off-by: Xiaoyao Li 
Reviewed-by: David Hildenbrand 
Message-ID: <20240320083945.991426-8-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/hw/boards.h  | 2 ++
 include/sysemu/hostmem.h | 1 +
 backends/hostmem-file.c  | 1 +
 backends/hostmem-memfd.c | 1 +
 backends/hostmem-ram.c   | 1 +
 backends/hostmem.c   | 1 +
 hw/core/machine.c| 5 +
 7 files changed, 12 insertions(+)
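
Not part of the patch: a minimal, self-contained sketch of how the backends
fold the machine-wide setting into ram_flags. The DEMO_* names and the struct
are hypothetical stand-ins, not QEMU types. Only bit 12 mirrors the
RAM_GUEST_MEMFD value introduced in this series; the other flag values are
illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; only bit 12 mirrors RAM_GUEST_MEMFD from the series. */
#define DEMO_RAM_SHARED      (1u << 1)
#define DEMO_RAM_NORESERVE   (1u << 7)
#define DEMO_RAM_GUEST_MEMFD (1u << 12)

/* Hypothetical stand-in for the HostMemoryBackend fields used here. */
struct demo_backend {
    bool share;
    bool reserve;
    bool guest_memfd;   /* copied from the machine when the backend is created */
};

/* Same shape as the ram_flags computation in the hostmem-*.c hunks. */
static uint32_t demo_ram_flags(const struct demo_backend *b)
{
    uint32_t ram_flags = b->share ? DEMO_RAM_SHARED : 0;

    ram_flags |= b->reserve ? 0 : DEMO_RAM_NORESERVE;
    ram_flags |= b->guest_memfd ? DEMO_RAM_GUEST_MEMFD : 0;
    return ram_flags;
}
```

Because the flag is derived once from the machine at backend init time, the
allocation paths only need to look at the backend, which is the point the
commit message makes about not reaching for MachineState later.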

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 8b8f6d5c00d..44c2a4e1ec7 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -36,6 +36,7 @@ bool machine_usb(MachineState *machine);
 int machine_phandle_start(MachineState *machine);
 bool machine_dump_guest_core(MachineState *machine);
 bool machine_mem_merge(MachineState *machine);
+bool machine_require_guest_memfd(MachineState *machine);
 HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
 void machine_set_cpu_numa_node(MachineState *machine,
const CpuInstanceProperties *props,
@@ -370,6 +371,7 @@ struct MachineState {
 char *dt_compatible;
 bool dump_guest_core;
 bool mem_merge;
+bool require_guest_memfd;
 bool usb;
 bool usb_disabled;
 char *firmware;
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 0e411aaa29e..04b884bf42a 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -74,6 +74,7 @@ struct HostMemoryBackend {
 uint64_t size;
 bool merge, dump, use_canonical_path;
 bool prealloc, is_mapped, share, reserve;
+bool guest_memfd;
 uint32_t prealloc_threads;
 ThreadContext *prealloc_context;
 DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index ac3e433cbdd..3c69db79460 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -85,6 +85,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
 ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
 ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
 ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
 ram_flags |= RAM_NAMED_FILE;
 return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 3923ea9364d..745ead0034d 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -55,6 +55,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 name = host_memory_backend_get_name(backend);
 ram_flags = backend->share ? RAM_SHARED : 0;
 ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
 return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
    backend->size, ram_flags, fd, 0, errp);
 }
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index d121249f0f4..f7d81af783a 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,6 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 name = host_memory_backend_get_name(backend);
 ram_flags = backend->share ? RAM_SHARED : 0;
 ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
 return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
   name, backend->size,
   ram_flags, errp);
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 81a72ce40b7..eb9682b4a85 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -277,6 +277,7 @@ static void host_memory_backend_init(Object *obj)
 /* TODO: convert access to globals to compat properties */
 backend->merge = machine_mem_merge(machine);
 backend->dump = machine_dump_guest_core(machine);
+backend->guest_memfd = machine_require_guest_memfd(machine);
 backend->reserve = true;
 backend->prealloc_threads = machine->smp.cpus;
 }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 37ede0e7d4f..73ce9da835b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1198,6 +1198,11 @@ bool machine_mem_merge(MachineState *machine)
 return machine->mem_merge;
 }
 
+bool machine_require_guest_memfd(MachineState *machine)
+{
+return machi

[PATCH 25/26] kvm: handle KVM_EXIT_MEMORY_FAULT

2024-03-22 Thread Paolo Bonzini

From: Chao Peng 

A KVM_EXIT_MEMORY_FAULT exit indicates that userspace needs to convert
the memory on the RAMBlock to the desired attribute, i.e., private or
shared.

Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when
KVM_EXIT_MEMORY_FAULT happens.

Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has a
guest_memfd memory backend.

Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is
added.

When a page is converted from shared to private, the original shared
memory can be discarded via ram_block_discard_range(). Note, shared
memory can be discarded only when it is not backed by hugetlb, because
hugetlb is supposed to be pre-allocated and does not need discarding.

Signed-off-by: Chao Peng 
Co-developed-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 

Message-ID: <20240320083945.991426-13-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm.h   |  2 +
 accel/kvm/kvm-all.c| 99 +-
 accel/kvm/trace-events |  2 +
 3 files changed, 93 insertions(+), 10 deletions(-)
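
Not part of the patch: a small sketch of the decision the commit message
describes, namely which backing gets discarded for a given conversion. All
names (demo_conversion_plan, the enum) are hypothetical, and the sketch
assumes that a RAMBlock page size different from the host page size means
hugetlb backing. It only models the control flow and performs no discard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum demo_discard {
    DEMO_REJECT,           /* range is empty or not host-page aligned */
    DEMO_DISCARD_NOTHING,  /* to private, but the shared side is hugetlb-backed */
    DEMO_DISCARD_SHARED,   /* to private: drop the old shared pages */
    DEMO_DISCARD_PRIVATE,  /* to shared: drop the old guest_memfd pages */
};

/* host_page_size must be non-zero. */
static enum demo_discard demo_conversion_plan(uint64_t start, uint64_t size,
                                              uint64_t host_page_size,
                                              uint64_t block_page_size,
                                              bool to_private)
{
    if (size == 0 || start % host_page_size || size % host_page_size) {
        return DEMO_REJECT;
    }
    if (!to_private) {
        return DEMO_DISCARD_PRIVATE;
    }
    /* Assumption: a differing block page size indicates hugetlb backing. */
    if (block_page_size != host_page_size) {
        return DEMO_DISCARD_NOTHING;
    }
    return DEMO_DISCARD_SHARED;
}
```

For example, converting a 2 MiB hugetlb-backed range to private yields
DEMO_DISCARD_NOTHING, while the same request on a block with 4 KiB pages
yields DEMO_DISCARD_SHARED.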

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 2cb31925091..698f1640fe2 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -541,4 +541,6 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
 
 int kvm_set_memory_attributes_private(hwaddr start, hwaddr size);
 int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size);
+
+int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 56b17cbd8aa..afd7f992e39 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2893,6 +2893,70 @@ static void kvm_eat_signals(CPUState *cpu)
 } while (sigismember(&chkset, SIG_IPI));
 }
 
+int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
+{
+MemoryRegionSection section;
+ram_addr_t offset;
+MemoryRegion *mr;
+RAMBlock *rb;
+void *addr;
+int ret = -1;
+
+trace_kvm_convert_memory(start, size, to_private ? "shared_to_private" : "private_to_shared");
+
+if (!QEMU_IS_ALIGNED(start, qemu_real_host_page_size()) ||
+!QEMU_IS_ALIGNED(size, qemu_real_host_page_size())) {
+return -1;
+}
+
+if (!size) {
+return -1;
+}
+
+section = memory_region_find(get_system_memory(), start, size);
+mr = section.mr;
+if (!mr) {
+return -1;
+}
+
+if (!memory_region_has_guest_memfd(mr)) {
+error_report("Converting non guest_memfd backed memory region "
+ "(0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
+ start, size, to_private ? "private" : "shared");
+ret = -1;
+goto out_unref;
+}
+
+if (to_private) {
+ret = kvm_set_memory_attributes_private(start, size);
+} else {
+ret = kvm_set_memory_attributes_shared(start, size);
+}
+if (ret) {
+goto out_unref;
+}
+
+addr = memory_region_get_ram_ptr(mr) + section.offset_within_region;
+rb = qemu_ram_block_from_host(addr, false, &offset);
+
+if (to_private) {
+if (rb->page_size != qemu_real_host_page_size()) {
+/*
+* shared memory is backed by hugetlb, which is supposed to be
+* pre-allocated and doesn't need to be discarded
+*/
+goto out_unref;
+}
+ret = ram_block_discard_range(rb, offset, size);
+} else {
+ret = ram_block_discard_guest_memfd_range(rb, offset, size);
+}
+
+out_unref:
+memory_region_unref(section.mr);
+return ret;
+}
+
 int kvm_cpu_exec(CPUState *cpu)
 {
 struct kvm_run *run = cpu->kvm_run;
@@ -2960,18 +3024,20 @@ int kvm_cpu_exec(CPUState *cpu)
 ret = EXCP_INTERRUPT;
 break;
 }
-fprintf(stderr, "error: kvm run failed %s\n",
-strerror(-run_ret));
+if (!(run_ret == -EFAULT && run->exit_reason == KVM_EXIT_MEMORY_FAULT)) {
+fprintf(stderr, "error: kvm run failed %s\n",
+strerror(-run_ret));
 #ifdef TARGET_PPC
-if (run_ret == -EBUSY) {
-fprintf(stderr,
-"This is probably because your SMT is enabled.\n"
-"VCPU can only run on primary threads with all "
-"secondary threads offline.\n");
-}
+if (run_ret == -EBUSY) {
+fprintf(stderr,
+"This is probably because your SMT is enabled.\n"
+"VCPU can only run on primary threads with all "
+"secondary threads offline.\n");
+}
 #endif
-ret = -1;
-break;
+ret = -1;
+break;
+}
 }
 
 trace_kvm_run_exit(cpu->cpu_index, run->exit_rea

[PATCH 18/26] kvm: Introduce support for memory_attributes

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Introduce the helper functions to set the attributes of a range of
memory to private or shared.

This is necessary to notify KVM of the private/shared attribute of each GPA
range. KVM needs this information to decide whether a GPA should be mapped
to HVA-based shared memory or to guest_memfd-based private memory.

Signed-off-by: Xiaoyao Li 
Message-ID: <20240320083945.991426-11-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm.h |  4 
 accel/kvm/kvm-all.c  | 31 +++
 2 files changed, 35 insertions(+)
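
Not part of the patch: a sketch of the "is every requested attribute bit
supported" check that guards the ioctl. The struct, the function name and
the DEMO_ATTR_PRIVATE value are hypothetical stand-ins (the real constant
and struct come from linux/kvm.h). Unlike the patch, which asserts, this
sketch reports an unsupported request through its return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit; stands in for KVM_MEMORY_ATTRIBUTE_PRIVATE. */
#define DEMO_ATTR_PRIVATE (1ULL << 3)

/* Hypothetical stand-in for struct kvm_memory_attributes. */
struct demo_attrs {
    uint64_t address;
    uint64_t size;
    uint64_t attributes;
    uint64_t flags;
};

/* Fill *req and return true only if all requested bits are supported. */
static bool demo_prepare_attrs(uint64_t supported, uint64_t start,
                               uint64_t size, uint64_t attr,
                               struct demo_attrs *req)
{
    if ((attr & supported) != attr) {
        return false;
    }
    req->address = start;
    req->size = size;
    req->attributes = attr;
    req->flags = 0;
    return true;
}
```

Note that a "shared" request carries attr == 0, so it passes the mask check
even when no attributes are supported at all.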

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 54f4d83a370..bda309d5ffa 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -536,4 +536,8 @@ void kvm_mark_guest_state_protected(void);
  * reported for the VM.
  */
 bool kvm_hwpoisoned_mem(void);
+
+int kvm_set_memory_attributes_private(hwaddr start, hwaddr size);
+int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size);
+
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4ac3cf1c9ef..36e39fd6514 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -91,6 +91,7 @@ bool kvm_msi_use_devid;
 static bool kvm_has_guest_debug;
 static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
+static uint64_t kvm_supported_memory_attributes;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -1266,6 +1267,35 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
 kvm_max_slot_size = max_slot_size;
 }
 
+static int kvm_set_memory_attributes(hwaddr start, hwaddr size, uint64_t attr)
+{
+struct kvm_memory_attributes attrs;
+int r;
+
+assert((attr & kvm_supported_memory_attributes) == attr);
+attrs.attributes = attr;
+attrs.address = start;
+attrs.size = size;
+attrs.flags = 0;
+
+r = kvm_vm_ioctl(kvm_state, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
+if (r) {
+error_report("failed to set memory (0x%" HWADDR_PRIx "+0x%" HWADDR_PRIx ") "
+ "with attr 0x%" PRIx64 " error '%s'",
+ start, size, attr, strerror(errno));
+}
+return r;
+}
+
+int kvm_set_memory_attributes_private(hwaddr start, hwaddr size)
+{
+return kvm_set_memory_attributes(start, size, KVM_MEMORY_ATTRIBUTE_PRIVATE);
+}
+
+int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size)
+{
+return kvm_set_memory_attributes(start, size, 0);
+}
+
 /* Called with KVMMemoryListener.slots_lock held */
 static void kvm_set_phys_mem(KVMMemoryListener *kml,
  MemoryRegionSection *section, bool add)
@@ -2382,6 +2412,7 @@ static int kvm_init(MachineState *ms)
 goto err;
 }
 
+kvm_supported_memory_attributes = kvm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
 kvm_immediate_exit = kvm_check_extension(s, KVM_CAP_IMMEDIATE_EXIT);
 s->nr_slots = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
 
-- 
2.44.0

[PATCH 24/26] physmem: Introduce ram_block_discard_guest_memfd_range()

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

When a memory page is converted from private to shared, the original
private memory is backed by guest_memfd. Introduce
ram_block_discard_guest_memfd_range() for discarding memory in
guest_memfd.

Based on a patch by Isaku Yamahata.

Signed-off-by: Xiaoyao Li 
Reviewed-by: David Hildenbrand 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-12-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/exec/cpu-common.h |  2 ++
 system/physmem.c  | 23 +++
 2 files changed, 25 insertions(+)
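
Not part of the patch: a standalone, Linux-only sketch of the hole-punching
call the new helper relies on. It uses an ordinary memfd from memfd_create()
as an assumed stand-in for rb->guest_memfd, which may behave differently for
a real KVM guest_memfd. The demo_* function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Punch a hole while keeping the file size; returns 0 or -errno. */
static int demo_punch_hole(int fd, off_t start, off_t length)
{
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  start, length)) {
        return -errno;
    }
    return 0;
}

/*
 * Returns 1 if a hole was punched and verified, 0 if memfd or hole
 * punching is unavailable here, -1 on an unexpected result.
 */
static int demo_punch_hole_check(void)
{
    unsigned char buf[4096];
    int result = -1;
    int ret;
    int fd = memfd_create("demo-guest-memfd", 0);

    if (fd < 0) {
        return 0;
    }
    memset(buf, 0xaa, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    ret = demo_punch_hole(fd, 0, sizeof(buf));
    if (ret == -EOPNOTSUPP || ret == -ENOSYS) {
        result = 0;
        goto out;
    }
    if (ret) {
        goto out;
    }
    /* The size must be unchanged and the punched range must read as zero. */
    if (lseek(fd, 0, SEEK_END) != (off_t)sizeof(buf) ||
        pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    for (size_t i = 0; i < sizeof(buf); i++) {
        if (buf[i] != 0) {
            goto out;
        }
    }
    result = 1;
out:
    close(fd);
    return result;
}
```

FALLOC_FL_KEEP_SIZE is what lets the range be released without shrinking the
file, so later offsets into the same fd stay valid.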

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 6346df17ce9..6d5318895a3 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -159,6 +159,8 @@ typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque);
 
 int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length);
+int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t start,
+size_t length);
 
 #endif
 
diff --git a/system/physmem.c b/system/physmem.c
index 5ebcf5be116..c3d04ca9212 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -3721,6 +3721,29 @@ err:
 return ret;
 }
 
+int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t start,
+size_t length)
+{
+int ret = -1;
+
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+ret = fallocate(rb->guest_memfd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+start, length);
+
+if (ret) {
+ret = -errno;
+error_report("%s: Failed to fallocate %s:%" PRIx64 " +%zx (%d)",
+ __func__, rb->idstr, start, length, ret);
+}
+#else
+ret = -ENOSYS;
+error_report("%s: fallocate not available %s:%" PRIx64 " +%zx (%d)",
+ __func__, rb->idstr, start, length, ret);
+#endif
+
+return ret;
+}
+
 bool ramblock_is_pmem(RAMBlock *rb)
 {
 return rb->flags & RAM_PMEM;
-- 
2.44.0

[PATCH 26/26] i386/kvm: Move architectural CPUID leaf generation to separate helper

2024-03-22 Thread Paolo Bonzini

From: Sean Christopherson 

Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

For now this is just a cleanup, so keep the function static.

Signed-off-by: Sean Christopherson 
Signed-off-by: Xiaoyao Li 
Message-ID: <20240229063726.610065-23-xiaoyao...@intel.com>
[Unify error reporting, rename function. - Paolo]
Signed-off-by: Paolo Bonzini 
---
 target/i386/kvm/kvm.c | 446 +-
 1 file changed, 224 insertions(+), 222 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 2577e345502..eab6261e1f5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1752,6 +1752,228 @@ static void kvm_init_nested_state(CPUX86State *env)
 }
 }
 
+static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
+struct kvm_cpuid_entry2 *entries,
+uint32_t cpuid_i)
+{
+uint32_t limit, i, j;
+uint32_t unused;
+struct kvm_cpuid_entry2 *c;
+
+cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
+
+for (i = 0; i <= limit; i++) {
+j = 0;
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+goto full;
+}
+c = &entries[cpuid_i++];
+switch (i) {
+case 2: {
+/* Keep reading function 2 till all the input is received */
+int times;
+
+c->function = i;
+c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC |
+   KVM_CPUID_FLAG_STATE_READ_NEXT;
+cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+times = c->eax & 0xff;
+
+for (j = 1; j < times; ++j) {
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+goto full;
+}
+c = &entries[cpuid_i++];
+c->function = i;
+c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC;
+cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+}
+break;
+}
+case 0x1f:
+if (env->nr_dies < 2) {
+cpuid_i--;
+break;
+}
+/* fallthrough */
+case 4:
+case 0xb:
+case 0xd:
+for (j = 0; ; j++) {
+if (i == 0xd && j == 64) {
+break;
+}
+
+c->function = i;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+c->index = j;
+cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
+if (i == 4 && c->eax == 0) {
+break;
+}
+if (i == 0xb && !(c->ecx & 0xff00)) {
+break;
+}
+if (i == 0x1f && !(c->ecx & 0xff00)) {
+break;
+}
+if (i == 0xd && c->eax == 0) {
+continue;
+}
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+goto full;
+}
+c = &entries[cpuid_i++];
+}
+break;
+case 0x12:
+for (j = 0; ; j++) {
+c->function = i;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+c->index = j;
+cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
+if (j > 1 && (c->eax & 0xf) != 1) {
+break;
+}
+
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+goto full;
+}
+c = &entries[cpuid_i++];
+}
+break;
+case 0x7:
+case 0x14:
+case 0x1d:
+case 0x1e: {
+uint32_t times;
+
+c->function = i;
+c->index = 0;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+times = c->eax;
+
+for (j = 1; j <= times; ++j) {
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+goto full;
+}
+c = &entries[cpuid_i++];
+c->function = i;
+c->index = j;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+}
+break;
+}
+default:
+c->function = i;
+c->flags = 0;
+cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+if (!c->eax && !c->ebx && !c->ecx && !c->edx) {
+/*
+ * KVM already returns all zeroes if a CPUID entry is missing,
+ * so we can omit it and avoid hitting K

[PATCH 20/26] kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot

2024-03-22 Thread Paolo Bonzini

From: Chao Peng 

Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.

With KVM_SET_USER_MEMORY_REGION2, QEMU can set up a memory region that
is backed both by HVA-based shared memory and by guest_memfd-based
private memory.

Signed-off-by: Chao Peng 
Co-developed-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 
Message-ID: <20240320083945.991426-10-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm_int.h |  2 ++
 accel/kvm/kvm-all.c  | 46 +---
 accel/kvm/trace-events   |  2 +-
 3 files changed, 41 insertions(+), 9 deletions(-)
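
Not part of the patch: a sketch of two small computations visible in the
diff that follows, the packing of the address-space id into the upper bits
of the slot number and the derivation of the guest_memfd offset from the
slot's host pointer. The demo_* helper names are hypothetical; the
arithmetic follows the hunks as posted.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the address-space id above bit 15, as mem.slot is built. */
static uint32_t demo_pack_slot(uint16_t slot, uint16_t as_id)
{
    return (uint32_t)slot | ((uint32_t)as_id << 16);
}

/* Unpack the two halves the same way the trace call does. */
static uint16_t demo_slot_as_id(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

static uint16_t demo_slot_index(uint32_t packed)
{
    return (uint16_t)packed;
}

/* Offset of the slot's first byte inside its RAMBlock host mapping. */
static uint64_t demo_guest_memfd_offset(const uint8_t *ram,
                                        const uint8_t *block_host)
{
    return (uint64_t)(ram - block_host);
}
```

The offset is taken relative to the RAMBlock's host base, so a slot that
starts 4096 bytes into the block also starts 4096 bytes into the block's
guest_memfd.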

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 3496be7997a..a5a3fee4119 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -30,6 +30,8 @@ typedef struct KVMSlot
 int as_id;
 /* Cache of the offset in ram address space */
 ram_addr_t ram_start_offset;
+int guest_memfd;
+hwaddr guest_memfd_offset;
 } KVMSlot;
 
 typedef struct KVMMemoryUpdate {
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 6aa0608805b..7fbaf31cbaf 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -284,35 +284,58 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
 static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, bool new)
 {
 KVMState *s = kvm_state;
-struct kvm_userspace_memory_region mem;
+struct kvm_userspace_memory_region2 mem;
 int ret;
 
 mem.slot = slot->slot | (kml->as_id << 16);
 mem.guest_phys_addr = slot->start_addr;
 mem.userspace_addr = (unsigned long)slot->ram;
 mem.flags = slot->flags;
+mem.guest_memfd = slot->guest_memfd;
+mem.guest_memfd_offset = slot->guest_memfd_offset;
 
 if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & KVM_MEM_READONLY) {
 /* Set the slot size to 0 before setting the slot to the desired
  * value. This is needed based on KVM commit 75d61fbc. */
 mem.memory_size = 0;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+
+if (kvm_guest_memfd_supported) {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
+} else {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+}
 if (ret < 0) {
 goto err;
 }
 }
 mem.memory_size = slot->memory_size;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+if (kvm_guest_memfd_supported) {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
+} else {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+}
 slot->old_flags = mem.flags;
 err:
 trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
   mem.guest_phys_addr, mem.memory_size,
-  mem.userspace_addr, ret);
+  mem.userspace_addr, mem.guest_memfd,
+  mem.guest_memfd_offset, ret);
 if (ret < 0) {
-error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
- " start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
- __func__, mem.slot, slot->start_addr,
- (uint64_t)mem.memory_size, strerror(errno));
+if (kvm_guest_memfd_supported) {
+error_report("%s: KVM_SET_USER_MEMORY_REGION2 failed, slot=%d,"
+" start=0x%" PRIx64 ", size=0x%" PRIx64 ","
+" flags=0x%" PRIx32 ", guest_memfd=%" PRId32 ","
+" guest_memfd_offset=0x%" PRIx64 ": %s",
+__func__, mem.slot, slot->start_addr,
+(uint64_t)mem.memory_size, mem.flags,
+mem.guest_memfd, (uint64_t)mem.guest_memfd_offset,
+strerror(errno));
+} else {
+error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
+" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
+__func__, mem.slot, slot->start_addr,
+(uint64_t)mem.memory_size, strerror(errno));
+}
 }
 return ret;
 }
@@ -467,6 +490,10 @@ static int kvm_mem_flags(MemoryRegion *mr)
 if (readonly && kvm_readonly_mem_allowed) {
 flags |= KVM_MEM_READONLY;
 }
+if (memory_region_has_guest_memfd(mr)) {
+assert(kvm_guest_memfd_supported);
+flags |= KVM_MEM_GUEST_MEMFD;
+}
 return flags;
 }
 
@@ -1393,6 +1420,9 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
 mem->ram_start_offset = ram_start_offset;
 mem->ram = ram;
 mem->flags = kvm_mem_flags(mr);
+mem->guest_memfd = mr->ram_block->guest_memfd;
+mem->guest_memfd_offset = (uint8_t*)ram - mr->ram_block->host;
+
 kvm_slot_init_dirty_bitmap(mem);
 err = kvm_set_user_memory_region(kml, mem, true);
 if (err) {

[PATCH 19/26] RAMBlock: Add support of KVM private guest memfd

2024-03-22 Thread Paolo Bonzini

From: Michael Roth 

Add KVM guest_memfd support to RAMBlock so that both normal HVA-based
memory and KVM guest_memfd-based private memory can be associated with one
RAMBlock.

Introduce a new flag, RAM_GUEST_MEMFD. When it is set, RAMBlock setup calls
the KVM ioctl to create a private guest_memfd.

Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest_memfd
is more flexible and extensible than simply relying on the VM type, because
in the future there may be cases where not all the memory of a VM needs
guest_memfd. As a benefit, it also avoids having to get the MachineState in
the memory subsystem.

Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as a TDX VM. How and when to set it for memory
backends will be implemented in the following patches.

Introduce memory_region_has_guest_memfd() to query whether the MemoryRegion
has a KVM guest_memfd allocated.

Signed-off-by: Xiaoyao Li 
Reviewed-by: David Hildenbrand 
Message-ID: <20240320083945.991426-7-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h   | 20 +---
 include/exec/ram_addr.h |  2 +-
 include/exec/ramblock.h |  1 +
 include/sysemu/kvm.h|  3 ++-
 accel/kvm/kvm-all.c | 28 
 accel/stubs/kvm-stub.c  |  5 +
 system/memory.c |  5 +
 system/physmem.c| 34 +++---
 8 files changed, 90 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b31..679a8476852 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -243,6 +243,9 @@ typedef struct IOMMUTLBEvent {
 /* RAM FD is opened read-only */
 #define RAM_READONLY_FD (1 << 11)
 
+/* RAM can be private that has kvm guest memfd backend */
+#define RAM_GUEST_MEMFD   (1 << 12)
+
 static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
@@ -1307,7 +1310,8 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
  * @name: Region name, becomes part of RAMBlock name used in migration stream
  *must be unique within any device
  * @size: size of the region.
- * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE.
+ * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
+ * RAM_GUEST_MEMFD.
  * @errp: pointer to Error*, to store an error if it happens.
  *
  * Note that this function does not do anything to cause the data in the
@@ -1369,7 +1373,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
  * (getpagesize()) will be used.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- * RAM_READONLY_FD
+ * RAM_READONLY_FD, RAM_GUEST_MEMFD
  * @path: the path in which to allocate the RAM.
  * @offset: offset within the file referenced by path
  * @errp: pointer to Error*, to store an error if it happens.
@@ -1399,7 +1403,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
  * @size: size of the region.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- * RAM_READONLY_FD
+ * RAM_READONLY_FD, RAM_GUEST_MEMFD
  * @fd: the fd to mmap.
  * @offset: offset within the file referenced by fd
  * @errp: pointer to Error*, to store an error if it happens.
@@ -1722,6 +1726,16 @@ static inline bool memory_region_is_romd(MemoryRegion 
*mr)
  */
 bool memory_region_is_protected(MemoryRegion *mr);
 
+/**
+ * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
+ * associated
+ *
+ * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
+ *
+ * @mr: the memory region being queried
+ */
+bool memory_region_has_guest_memfd(MemoryRegion *mr);
+
 /**
  * memory_region_get_iommu: check whether a memory region is an iommu
  *
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index de45ba7bc96..07c8f863750 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -110,7 +110,7 @@ long qemu_maxrampagesize(void);
  *  @mr: the memory region where the ram block is
  *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *  RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *  RAM_READONLY_FD
+ *  RAM_READONLY_FD, RAM_GUEST_MEMFD
  *  @mem_path or @fd: specify the backing file or device
  *  @offset: Offset into target file
  *  @errp: pointer to Error*, to store an error if it happens
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 848915ea5bf..459c8917de2 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -41,6 +41,7 @@ struct RAMBlock {
 QLIST_HEAD(, RAMBlockNotifier) ramblock_notif

[PATCH 05/26] ppc/pef: switch to use confidential_guest_kvm_init/reset()

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.

As a bonus, pef.h goes away since there is no direct call from sPAPR
board code to PEF code anymore.

Signed-off-by: Xiaoyao Li 
Signed-off-by: Paolo Bonzini 
---
 include/hw/ppc/pef.h | 17 -
 hw/ppc/pef.c |  9 ++---
 hw/ppc/spapr.c   | 10 +++---
 3 files changed, 13 insertions(+), 23 deletions(-)
 delete mode 100644 include/hw/ppc/pef.h

diff --git a/include/hw/ppc/pef.h b/include/hw/ppc/pef.h
deleted file mode 100644
index 707dbe524c4..000
--- a/include/hw/ppc/pef.h
+++ /dev/null
@@ -1,17 +0,0 @@
-/*
- * PEF (Protected Execution Facility) for POWER support
- *
- * Copyright Red Hat.
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#ifndef HW_PPC_PEF_H
-#define HW_PPC_PEF_H
-
-int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp);
-int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp);
-
-#endif /* HW_PPC_PEF_H */
diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
index d28ed3ba733..47553348b1e 100644
--- a/hw/ppc/pef.c
+++ b/hw/ppc/pef.c
@@ -15,7 +15,6 @@
 #include "sysemu/kvm.h"
 #include "migration/blocker.h"
 #include "exec/confidential-guest-support.h"
-#include "hw/ppc/pef.h"
 
 #define TYPE_PEF_GUEST "pef-guest"
 OBJECT_DECLARE_SIMPLE_TYPE(PefGuest, PEF_GUEST)
@@ -93,7 +92,7 @@ static int kvmppc_svm_off(Error **errp)
 #endif
 }
 
-int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
+static int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 {
 if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
 return 0;
@@ -107,7 +106,7 @@ int pef_kvm_init(ConfidentialGuestSupport *cgs, Error 
**errp)
 return kvmppc_svm_init(cgs, errp);
 }
 
-int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
+static int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
 {
 if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
 return 0;
@@ -131,6 +130,10 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(PefGuest,
 
 static void pef_guest_class_init(ObjectClass *oc, void *data)
 {
+ConfidentialGuestSupportClass *klass = 
CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
+
+klass->kvm_init = pef_kvm_init;
+klass->kvm_reset = pef_kvm_reset;
 }
 
 static void pef_guest_init(Object *obj)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c417f9dd523..7178da53901 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -74,6 +74,7 @@
 #include "hw/virtio/vhost-scsi-common.h"
 
 #include "exec/ram_addr.h"
+#include "exec/confidential-guest-support.h"
 #include "hw/usb.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -86,7 +87,6 @@
 #include "hw/ppc/spapr_tpm_proxy.h"
 #include "hw/ppc/spapr_nvdimm.h"
 #include "hw/ppc/spapr_numa.h"
-#include "hw/ppc/pef.h"
 
 #include "monitor/monitor.h"
 
@@ -1714,7 +1714,9 @@ static void spapr_machine_reset(MachineState *machine, 
ShutdownCause reason)
 qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
 }
 
-pef_kvm_reset(machine->cgs, &error_fatal);
+if (machine->cgs) {
+confidential_guest_kvm_reset(machine->cgs, &error_fatal);
+}
 spapr_caps_apply(spapr);
 spapr_nested_reset(spapr);
 
@@ -2840,7 +2842,9 @@ static void spapr_machine_init(MachineState *machine)
 /*
  * if Secure VM (PEF) support is configured, then initialize it
  */
-pef_kvm_init(machine->cgs, &error_fatal);
+if (machine->cgs) {
+confidential_guest_kvm_init(machine->cgs, &error_fatal);
+}
 
 msi_nonbroken = true;
 
-- 
2.44.0

[PATCH 02/26] q35: Introduce smm_ranges property for q35-pci-host

2024-03-22 Thread Paolo Bonzini

From: Isaku Yamahata 

Add a q35 property to check whether or not SMM ranges, e.g. SMRAM and TSEG,
exist for the target platform.  TDX doesn't support SMM and doesn't
play nice with QEMU modifying related guest memory ranges.

Signed-off-by: Isaku Yamahata 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Xiaoyao Li 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-19-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/hw/i386/pc.h  |  1 +
 include/hw/pci-host/q35.h |  1 +
 hw/i386/pc_q35.c  |  2 ++
 hw/pci-host/q35.c | 42 +++
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 27a68071d77..fb1d4106e50 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -161,6 +161,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
level);
 #define PCI_HOST_PROP_PCI_HOLE64_SIZE  "pci-hole64-size"
 #define PCI_HOST_BELOW_4G_MEM_SIZE "below-4g-mem-size"
 #define PCI_HOST_ABOVE_4G_MEM_SIZE "above-4g-mem-size"
+#define PCI_HOST_PROP_SMM_RANGES   "smm-ranges"
 
 
 void pc_pci_as_mapping_init(MemoryRegion *system_memory,
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index bafcbe67521..22fadfa3ed7 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -50,6 +50,7 @@ struct MCHPCIState {
 MemoryRegion tseg_blackhole, tseg_window;
 MemoryRegion smbase_blackhole, smbase_window;
 bool has_smram_at_smbase;
+bool has_smm_ranges;
 Range pci_hole;
 uint64_t below_4g_mem_size;
 uint64_t above_4g_mem_size;
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b5922b44afa..7f2d85df75f 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -219,6 +219,8 @@ static void pc_q35_init(MachineState *machine)
 x86ms->above_4g_mem_size, NULL);
 object_property_set_bool(phb, PCI_HOST_BYPASS_IOMMU,
  pcms->default_bus_bypass_iommu, NULL);
+object_property_set_bool(phb, PCI_HOST_PROP_SMM_RANGES,
+ x86_machine_is_smm_enabled(x86ms), NULL);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(phb), &error_fatal);
 
 /* pci */
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 98d4a7c253a..0b6cbaed7ed 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -179,6 +179,8 @@ static Property q35_host_props[] = {
  mch.below_4g_mem_size, 0),
 DEFINE_PROP_SIZE(PCI_HOST_ABOVE_4G_MEM_SIZE, Q35PCIHost,
  mch.above_4g_mem_size, 0),
+DEFINE_PROP_BOOL(PCI_HOST_PROP_SMM_RANGES, Q35PCIHost,
+ mch.has_smm_ranges, true),
 DEFINE_PROP_BOOL("x-pci-hole64-fix", Q35PCIHost, pci_hole64_fix, true),
 DEFINE_PROP_END_OF_LIST(),
 };
@@ -214,6 +216,7 @@ static void q35_host_initfn(Object *obj)
 /* mch's object_initialize resets the default value, set it again */
 qdev_prop_set_uint64(DEVICE(s), PCI_HOST_PROP_PCI_HOLE64_SIZE,
  Q35_PCI_HOST_HOLE64_SIZE_DEFAULT);
+
 object_property_add(obj, PCI_HOST_PROP_PCI_HOLE_START, "uint32",
 q35_host_get_pci_hole_start,
 NULL, NULL, NULL);
@@ -476,6 +479,10 @@ static void mch_write_config(PCIDevice *d,
 mch_update_pciexbar(mch);
 }
 
+if (!mch->has_smm_ranges) {
+return;
+}
+
 if (ranges_overlap(address, len, MCH_HOST_BRIDGE_SMRAM,
MCH_HOST_BRIDGE_SMRAM_SIZE)) {
 mch_update_smram(mch);
@@ -494,10 +501,13 @@ static void mch_write_config(PCIDevice *d,
 static void mch_update(MCHPCIState *mch)
 {
 mch_update_pciexbar(mch);
+
 mch_update_pam(mch);
-mch_update_smram(mch);
-mch_update_ext_tseg_mbytes(mch);
-mch_update_smbase_smram(mch);
+if (mch->has_smm_ranges) {
+mch_update_smram(mch);
+mch_update_ext_tseg_mbytes(mch);
+mch_update_smbase_smram(mch);
+}
 
 /*
  * pci hole goes from end-of-low-ram to io-apic.
@@ -538,19 +548,21 @@ static void mch_reset(DeviceState *qdev)
 pci_set_quad(d->config + MCH_HOST_BRIDGE_PCIEXBAR,
  MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
 
-d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
-d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
-d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
-d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
+if (mch->has_smm_ranges) {
+d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
+d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
+d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
+d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
 
-if (mch->ext_tseg_mbytes > 0) {
-pci_set_word(d->config + MCH_HOST_BRIDGE_EXT_TSEG_MB

[PATCH for-9.1 00/26] x86, kvm: common confidential computing subset

2024-03-22 Thread Paolo Bonzini

These are the common bits for TDX and SEV-SNP support for QEMU 9.1.

The main changes compared to what Michael posted are:

1) I am including KVM_SEV_INIT2 support patches without the linux-headers
update hack (however the linux-headers update in these patches is still
not final).  This also includes the bits to track whether guest state
is encrypted, which are needed by TDX as well as SEV-ES/SNP.

2) VFIO currently is blocked, pending a decision on what is worse
between possibly consuming twice the memory and disallowing device
assignment; or someone biting the bullet and implementing the
RamDiscardManager interface.

3) I included another easy patch from the TDX series, "i386/kvm: Move
architectural CPUID leaf generation to separate helper".

Please test. :)

Paolo

Chao Peng (2):
  kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
  kvm: handle KVM_EXIT_MEMORY_FAULT

Isaku Yamahata (2):
  pci-host/q35: Move PAM initialization above SMRAM initialization
  q35: Introduce smm_ranges property for q35-pci-host

Michael Roth (5):
  scripts/update-linux-headers: Add setup_data.h to import list
  scripts/update-linux-headers: Add bits.h to file imports
  [HACK] linux-headers: Update headers for 6.8 + kvm-coco-queue + SNP
  [TO SQUASH] hw/i386: Remove redeclaration of struct setup_data
  RAMBlock: Add support of KVM private guest memfd

Paolo Bonzini (7):
  runstate: skip initial CPU reset if reset is not actually possible
  KVM: track whether guest state is encrypted
  KVM: remove kvm_arch_cpu_check_are_resettable
  target/i386: introduce x86-confidential-guest
  target/i386: Implement mc->kvm_type() to get VM type
  target/i386: SEV: use KVM_SEV_INIT2 if possible
  RAMBlock: make guest_memfd require uncoordinated discard

Sean Christopherson (1):
  i386/kvm: Move architectural CPUID leaf generation to separate helper

Xiaoyao Li (9):
  confidential guest support: Add kvm_init() and kvm_reset() in class
  i386/sev: Switch to use confidential_guest_kvm_init()
  ppc/pef: switch to use confidential_guest_kvm_init/reset()
  s390: Switch to use confidential_guest_kvm_init()
  trace/kvm: Split address space and slot id in
trace_kvm_set_user_memory()
  kvm: Introduce support for memory_attributes
  kvm/memory: Make memory type private by default if it has guest memfd
backend
  HostMem: Add mechanism to opt in kvm guest memfd via MachineState
  physmem: Introduce ram_block_discard_guest_memfd_range()

 include/exec/confidential-guest-support.h |  34 +-
 include/exec/cpu-common.h |   2 +
 include/exec/memory.h |  20 +-
 include/exec/ram_addr.h   |   2 +-
 include/exec/ramblock.h   |   1 +
 include/hw/boards.h   |   2 +
 include/hw/i386/pc.h  |   1 +
 include/hw/pci-host/q35.h |   1 +
 include/hw/ppc/pef.h  |  17 -
 include/standard-headers/asm-x86/bootparam.h  |  17 +-
 include/standard-headers/asm-x86/kvm_para.h   |   3 +-
 include/standard-headers/asm-x86/setup_data.h |  83 ++
 include/standard-headers/linux/ethtool.h  |  48 ++
 include/standard-headers/linux/fuse.h |  39 +-
 .../linux/input-event-codes.h |   1 +
 include/standard-headers/linux/virtio_gpu.h   |   2 +
 include/standard-headers/linux/virtio_snd.h   | 154 
 include/sysemu/hostmem.h  |   1 +
 include/sysemu/kvm.h  |  19 +-
 include/sysemu/kvm_int.h  |   3 +
 linux-headers/asm-arm64/kvm.h |  15 +-
 linux-headers/asm-arm64/sve_context.h |  11 +
 linux-headers/asm-generic/bitsperlong.h   |   4 +
 linux-headers/asm-loongarch/kvm.h |   2 -
 linux-headers/asm-mips/kvm.h  |   2 -
 linux-headers/asm-powerpc/kvm.h   |  45 +-
 linux-headers/asm-riscv/kvm.h |   3 +-
 linux-headers/asm-s390/kvm.h  | 315 +++-
 linux-headers/asm-x86/kvm.h   | 364 -
 linux-headers/linux/bits.h|  15 +
 linux-headers/linux/kvm.h | 717 +-
 linux-headers/linux/psp-sev.h |  71 ++
 target/i386/confidential-guest.h  |  59 ++
 target/i386/kvm/kvm_i386.h|   2 +
 target/i386/sev.h |   2 -
 target/s390x/kvm/pv.h |  14 -
 accel/kvm/kvm-accel-ops.c |   2 +-
 accel/kvm/kvm-all.c   | 236 +-
 accel/stubs/kvm-stub.c|   5 +
 backends/hostmem-file.c   |   1 +
 backends/hostmem-memfd.c  |   1 +
 backends/hostmem-ram.c|   1 +
 backends/hostmem.c|   1 +
 hw/core/machine.c |   5 +
 hw/i386/pc_q35.c  |   2 +
 hw/i386/x86.c

[PATCH 04/26] i386/sev: Switch to use confidential_guest_kvm_init()

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Use confidential_guest_kvm_init() instead of calling SEV
specific sev_kvm_init(). This allows the introduction of multiple
confidential-guest-support subclasses for different x86 vendors.

As a bonus, stubs are not needed anymore since there is no
direct call from target/i386/kvm/kvm.c to SEV code.

Signed-off-by: Xiaoyao Li 
Message-Id: <20240229060038.606591-1-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/sev.h   |   2 -
 target/i386/kvm/kvm.c   |  10 +--
 target/i386/kvm/sev-stub.c  |  21 --
 target/i386/sev.c   | 127 ++--
 target/i386/kvm/meson.build |   2 -
 5 files changed, 69 insertions(+), 93 deletions(-)
 delete mode 100644 target/i386/kvm/sev-stub.c

diff --git a/target/i386/sev.h b/target/i386/sev.h
index e7499c95b1e..9e10d09539a 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -57,6 +57,4 @@ int sev_inject_launch_secret(const char *hdr, const char 
*secret,
 int sev_es_save_reset_vector(void *flash_ptr, uint64_t flash_size);
 void sev_es_set_reset_vector(CPUState *cpu);
 
-int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp);
-
 #endif
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e68cbe92930..de10155b37a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2538,10 +2538,12 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
  * mechanisms are supported in future (e.g. TDX), they'll need
  * their own initialization either here or elsewhere.
  */
-ret = sev_kvm_init(ms->cgs, &local_err);
-if (ret < 0) {
-error_report_err(local_err);
-return ret;
+if (ms->cgs) {
+ret = confidential_guest_kvm_init(ms->cgs, &local_err);
+if (ret < 0) {
+error_report_err(local_err);
+return ret;
+}
 }
 
 has_xcrs = kvm_check_extension(s, KVM_CAP_XCRS);
diff --git a/target/i386/kvm/sev-stub.c b/target/i386/kvm/sev-stub.c
deleted file mode 100644
index 1be5341e8a6..000
--- a/target/i386/kvm/sev-stub.c
+++ /dev/null
@@ -1,21 +0,0 @@
-/*
- * QEMU SEV stub
- *
- * Copyright Advanced Micro Devices 2018
- *
- * Authors:
- *  Brijesh Singh 
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#include "qemu/osdep.h"
-#include "sev.h"
-
-int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
-{
-/* If we get here, cgs must be some non-SEV thing */
-return 0;
-}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 72930ff0dcc..b8f79d34d19 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -353,63 +353,6 @@ static void sev_guest_set_kernel_hashes(Object *obj, bool 
value, Error **errp)
 sev->kernel_hashes = value;
 }
 
-static void
-sev_guest_class_init(ObjectClass *oc, void *data)
-{
-object_class_property_add_str(oc, "sev-device",
-  sev_guest_get_sev_device,
-  sev_guest_set_sev_device);
-object_class_property_set_description(oc, "sev-device",
-"SEV device to use");
-object_class_property_add_str(oc, "dh-cert-file",
-  sev_guest_get_dh_cert_file,
-  sev_guest_set_dh_cert_file);
-object_class_property_set_description(oc, "dh-cert-file",
-"guest owners DH certificate (encoded with base64)");
-object_class_property_add_str(oc, "session-file",
-  sev_guest_get_session_file,
-  sev_guest_set_session_file);
-object_class_property_set_description(oc, "session-file",
-"guest owners session parameters (encoded with base64)");
-object_class_property_add_bool(oc, "kernel-hashes",
-   sev_guest_get_kernel_hashes,
-   sev_guest_set_kernel_hashes);
-object_class_property_set_description(oc, "kernel-hashes",
-"add kernel hashes to guest firmware for measured Linux boot");
-}
-
-static void
-sev_guest_instance_init(Object *obj)
-{
-SevGuestState *sev = SEV_GUEST(obj);
-
-sev->sev_device = g_strdup(DEFAULT_SEV_DEVICE);
-sev->policy = DEFAULT_GUEST_POLICY;
-object_property_add_uint32_ptr(obj, "policy", &sev->policy,
-   OBJ_PROP_FLAG_READWRITE);
-object_property_add_uint32_ptr(obj, "handle", &sev->handle,
-   OBJ_PROP_FLAG_READWRITE);
-object_property_add_uint32_ptr(obj, "cbitpos", &sev->cbitpos,
-   OBJ_PROP_FLAG_READWRITE);
-object_property_add_uint32_ptr(obj, "reduced-phys-bits",
-   &sev->reduced_phys_bits,
-   OBJ_PROP_FLAG_READWRITE);
-}
-
-/* sev guest info */
-static const TypeInfo sev_guest_info = {
-.parent = TYPE_CONFIDENTIAL_GUEST_SUPPOR

[PATCH 08/26] scripts/update-linux-headers: Add bits.h to file imports

2024-03-22 Thread Paolo Bonzini

From: Michael Roth 

Signed-off-by: Michael Roth 
Signed-off-by: Paolo Bonzini 
---
 scripts/update-linux-headers.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index d48856f9e24..5f20434d5c5 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -169,7 +169,7 @@ rm -rf "$output/linux-headers/linux"
 mkdir -p "$output/linux-headers/linux"
 for header in const.h stddef.h kvm.h vfio.h vfio_ccw.h vfio_zdev.h vhost.h \
   psci.h psp-sev.h userfaultfd.h memfd.h mman.h nvme_ioctl.h \
-  vduse.h iommufd.h; do
+  vduse.h iommufd.h bits.h; do
 cp "$tmpdir/include/linux/$header" "$output/linux-headers/linux"
 done
 
-- 
2.44.0

[PATCH 11/26] runstate: skip initial CPU reset if reset is not actually possible

2024-03-22 Thread Paolo Bonzini

Right now, the system reset is concluded by a call to
cpu_synchronize_all_post_reset() in order to sync any changes
that the machine reset callback applied to the CPU state.

However, for VMs with encrypted state such as SEV-ES guests (currently
the only case of guests with non-resettable CPUs) this cannot be done,
because guest state has already been finalized by machine-init-done notifiers.
cpu_synchronize_all_post_reset() does nothing on these guests, and actually
we would like to make it fail if called once the guest has been encrypted.
So, assume that boards that support non-resettable CPUs do not touch
CPU state and that all such setup is done before, at the time of
cpu_synchronize_all_post_init().

Signed-off-by: Paolo Bonzini 
---
 system/runstate.c | 15 ++-
 roms/edk2 |  2 +-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/system/runstate.c b/system/runstate.c
index d6ab860ecaa..cb4905a40fc 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -501,7 +501,20 @@ void qemu_system_reset(ShutdownCause reason)
 default:
 qapi_event_send_reset(shutdown_caused_by_guest(reason), reason);
 }
-cpu_synchronize_all_post_reset();
+
+/*
+ * Some boards use the machine reset callback to point CPUs to the firmware
+ * entry point.  Assume that this is not the case for boards that support
+ * non-resettable CPUs (currently used only for confidential guests), in
+ * which case cpu_synchronize_all_post_init() is enough because
+ * it does _more_  than cpu_synchronize_all_post_reset().
+ */
+if (cpus_are_resettable()) {
+cpu_synchronize_all_post_reset();
+} else {
+assert(runstate_check(RUN_STATE_PRELAUNCH));
+}
+
 vm_set_suspended(false);
 }
 
diff --git a/roms/edk2 b/roms/edk2
index edc6681206c..819cfc6b42a 16
--- a/roms/edk2
+++ b/roms/edk2
@@ -1 +1 @@
-Subproject commit edc6681206c1a8791981a2f911d2fb8b3d2f5768
+Subproject commit 819cfc6b42a68790a23509e4fcc58ceb70e1965e
-- 
2.44.0

[PATCH 17/26] trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

The upper 16 bits of kvm_userspace_memory_region::slot are the
address space id. Parse them separately in trace_kvm_set_user_memory().
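
For illustration, the split described above can be sketched in standalone C.
The helper names below are hypothetical and are not part of QEMU; the bit
layout (address space id in the upper 16 bits, slot id in the lower 16) is the
one this patch relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: upper 16 bits hold the address space id. */
static inline uint16_t slot_as_id(uint32_t slot)
{
    return (uint16_t)(slot >> 16);
}

/* Hypothetical helper: lower 16 bits hold the slot id proper. */
static inline uint16_t slot_id(uint32_t slot)
{
    return (uint16_t)slot;
}

/* Small self-check: address space 1, slot 3 packed into one word. */
void slot_split_selfcheck(void)
{
    uint32_t slot = (1u << 16) | 3u;

    assert(slot_as_id(slot) == 1);
    assert(slot_id(slot) == 3);
}
```

These are the same two expressions the patch passes to the trace point,
mem.slot >> 16 and (uint16_t)mem.slot.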

Signed-off-by: Xiaoyao Li 
Message-ID: <20240229063726.610065-5-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c| 5 +++--
 accel/kvm/trace-events | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a05dea23133..4ac3cf1c9ef 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -303,8 +303,9 @@ static int kvm_set_user_memory_region(KVMMemoryListener 
*kml, KVMSlot *slot, boo
 ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
 slot->old_flags = mem.flags;
 err:
-trace_kvm_set_user_memory(mem.slot, mem.flags, mem.guest_phys_addr,
-  mem.memory_size, mem.userspace_addr, ret);
+trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
+  mem.guest_phys_addr, mem.memory_size,
+  mem.userspace_addr, ret);
 if (ret < 0) {
 error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
  " start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index a25902597b1..9f599abc172 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -15,7 +15,7 @@ kvm_irqchip_update_msi_route(int virq) "Updating MSI route 
virq=%d"
 kvm_irqchip_release_virq(int virq) "virq %d"
 kvm_set_ioeventfd_mmio(int fd, uint64_t addr, uint32_t val, bool assign, 
uint32_t size, bool datamatch) "fd: %d @0x%" PRIx64 " val=0x%x assign: %d size: 
%d match: %d"
 kvm_set_ioeventfd_pio(int fd, uint16_t addr, uint32_t val, bool assign, 
uint32_t size, bool datamatch) "fd: %d @0x%x val=0x%x assign: %d size: %d 
match: %d"
-kvm_set_user_memory(uint32_t slot, uint32_t flags, uint64_t guest_phys_addr, 
uint64_t memory_size, uint64_t userspace_addr, int ret) "Slot#%d flags=0x%x 
gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " ret=%d"
+kvm_set_user_memory(uint16_t as, uint16_t slot, uint32_t flags, uint64_t 
guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, int ret) 
"AddrSpace#%d Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " 
ua=0x%"PRIx64 " ret=%d"
 kvm_clear_dirty_log(uint32_t slot, uint64_t start, uint32_t size) 
"slot#%"PRId32" start 0x%"PRIx64" size 0x%"PRIx32
 kvm_resample_fd_notify(int gsi) "gsi %d"
 kvm_dirty_ring_full(int id) "vcpu %d"
-- 
2.44.0

[PATCH 07/26] scripts/update-linux-headers: Add setup_data.h to import list

2024-03-22 Thread Paolo Bonzini

From: Michael Roth 

Data structures like struct setup_data have been moved to a separate
setup_data.h header which bootparam.h relies on. Add setup_data.h to
the cp_portable() list and sync it along with the other header files.

Note that currently struct setup_data is stripped away as part of
generating bootparam.h, but that handling is not currently needed for
setup_data.h since it doesn't pull in many external
headers/dependencies. However, QEMU currently redefines struct
setup_data in hw/i386/x86.c, so that will need to be removed as part of
any header update that pulls in the new setup_data.h to avoid build
bisect breakage.

Because <asm/setup_data.h> is the first architecture specific #include
in include/standard-headers/, add a new sed substitution to rewrite
asm/ include to the standard-headers/asm-* subdirectory for the current
architecture.

And while at it, remove asm-generic/kvm_para.h from the list of
allowed includes: it does not have a matching substitution, and therefore
it would not be possible to use it on non-Linux systems where there is
no /usr/include/asm-generic/ directory.

Signed-off-by: Michael Roth 
Signed-off-by: Paolo Bonzini 
---
 scripts/update-linux-headers.sh | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index a0006eec6fd..d48856f9e24 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -61,7 +61,7 @@ cp_portable() {
  -e 'linux/const' \
  -e 'linux/kernel' \
  -e 'linux/sysinfo' \
- -e 'asm-generic/kvm_para' \
+ -e 'asm/setup_data.h' \
  > /dev/null
 then
 echo "Unexpected #include in input file $f".
@@ -77,6 +77,7 @@ cp_portable() {
 -e 's/__be\([0-9][0-9]*\)/uint\1_t/g' \
 -e 's/"\(input-event-codes\.h\)"/"standard-headers\/linux\/\1"/' \
 -e 's/<linux\/\([^>]*\)>/"standard-headers\/linux\/\1"/' \
+-e 's/<asm\/\([^>]*\)>/"standard-headers\/asm-'$arch'\/\1"/' \
 -e 's/__bitwise//' \
 -e 's/__attribute__((packed))/QEMU_PACKED/' \
 -e 's/__inline__/inline/' \
@@ -155,11 +156,14 @@ for arch in $ARCHLIST; do
"$tmpdir/include/asm/bootparam.h" > "$tmpdir/bootparam.h"
 cp_portable "$tmpdir/bootparam.h" \
 "$output/include/standard-headers/asm-$arch"
+cp_portable "$tmpdir/include/asm/setup_data.h" \
+"$output/standard-headers/asm-x86"
 fi
 if [ $arch = riscv ]; then
 cp "$tmpdir/include/asm/ptrace.h" "$output/linux-headers/asm-riscv/"
 fi
 done
+arch=
 
 rm -rf "$output/linux-headers/linux"
 mkdir -p "$output/linux-headers/linux"
-- 
2.44.0

[PATCH 12/26] KVM: track whether guest state is encrypted

2024-03-22 Thread Paolo Bonzini

So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing.  With the new
API using VM types, the ioctls will instead fail, which is a safer and
more robust approach.

The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES.  In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to support migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.
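
The intended behaviour can be modelled with a small standalone sketch. All
names here are hypothetical stand-ins (DemoVCpu, demo_*), not the QEMU
functions; the reads counter stands in for kvm_arch_get_registers(), and a
single file-scope flag stands in for kvm_state->guest_state_protected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vCPU: dirty flag plus a read counter. */
typedef struct DemoVCpu {
    bool vcpu_dirty;
    int reads;                  /* how often registers were fetched */
} DemoVCpu;

static bool demo_guest_state_protected;

void demo_mark_guest_state_protected(void)
{
    demo_guest_state_protected = true;
}

/* Fetch registers only while the guest state is still readable. */
void demo_synchronize_state(DemoVCpu *cpu)
{
    if (!cpu->vcpu_dirty && !demo_guest_state_protected) {
        cpu->reads++;           /* stands in for kvm_arch_get_registers() */
        cpu->vcpu_dirty = true;
    }
}

/* Last point at which state may be written back to the guest. */
void demo_synchronize_post_init(DemoVCpu *cpu)
{
    assert(!demo_guest_state_protected);
    cpu->vcpu_dirty = false;    /* stands in for writing registers back */
}
```

Once the flag is set, demo_synchronize_state() becomes a no-op, so the dirty
flag stays false and nothing later tries to write registers back.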

Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm.h |  2 ++
 include/sysemu/kvm_int.h |  1 +
 accel/kvm/kvm-all.c  | 14 --
 target/i386/sev.c|  1 +
 4 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index fad9a7e8ff3..302e8f6f1e5 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -539,6 +539,8 @@ bool kvm_dirty_ring_enabled(void);
 
 uint32_t kvm_dirty_ring_size(void);
 
+void kvm_mark_guest_state_protected(void);
+
 /**
  * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page
  * reported for the VM.
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 882e37e12c5..3496be7997a 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -87,6 +87,7 @@ struct KVMState
 bool kernel_irqchip_required;
 OnOffAuto kernel_irqchip_split;
 bool sync_mmu;
+bool guest_state_protected;
 uint64_t manual_dirty_log_protect;
 /* The man page (and posix) say ioctl numbers are signed int, but
  * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a8cecd040eb..05fa3533c66 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2698,7 +2698,7 @@ bool kvm_cpu_check_are_resettable(void)
 
 static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 {
-if (!cpu->vcpu_dirty) {
+if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
 int ret = kvm_arch_get_registers(cpu);
 if (ret) {
 error_report("Failed to get registers: %s", strerror(-ret));
@@ -2712,7 +2712,7 @@ static void do_kvm_cpu_synchronize_state(CPUState *cpu, 
run_on_cpu_data arg)
 
 void kvm_cpu_synchronize_state(CPUState *cpu)
 {
-if (!cpu->vcpu_dirty) {
+if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
 run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL);
 }
 }
@@ -2747,6 +2747,11 @@ static void do_kvm_cpu_synchronize_post_init(CPUState 
*cpu, run_on_cpu_data arg)
 
 void kvm_cpu_synchronize_post_init(CPUState *cpu)
 {
+/*
+ * This runs before the machine_init_done notifiers, and is the last
+ * opportunity to synchronize the state of confidential guests.
+ */
+assert(!kvm_state->guest_state_protected);
 run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
 }
 
@@ -4094,3 +4099,8 @@ void query_stats_schemas_cb(StatsSchemaList **result, 
Error **errp)
 query_stats_schema_vcpu(first_cpu, &stats_args);
 }
 }
+
+void kvm_mark_guest_state_protected(void)
+{
+kvm_state->guest_state_protected = true;
+}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index b8f79d34d19..c49a8fd55eb 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -755,6 +755,7 @@ sev_launch_get_measure(Notifier *notifier, void *unused)
 if (ret) {
 exit(1);
 }
+kvm_mark_guest_state_protected();
 }
 
 /* query the measurement blob length */
-- 
2.44.0

[PATCH 03/26] confidential guest support: Add kvm_init() and kvm_reset() in class

2024-03-22 Thread Paolo Bonzini

From: Xiaoyao Li 

Different confidential VMs in different architectures all need to do
their specific initialization (and maybe reset) work with KVM.
Currently each of them exposes an individual *_kvm_init() function
and lets machine code or KVM code call it.

To facilitate the introduction of confidential guest technology from
different x86 vendors, add two virtual functions, kvm_init() and kvm_reset()
in ConfidentialGuestSupportClass, and expose two helpers functions for
invodking them.

Signed-off-by: Xiaoyao Li 
Message-Id: <20240229060038.606591-1-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 include/exec/confidential-guest-support.h | 34 ++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/exec/confidential-guest-support.h 
b/include/exec/confidential-guest-support.h
index ba2dd4b5dfc..e5b188cffbf 100644
--- a/include/exec/confidential-guest-support.h
+++ b/include/exec/confidential-guest-support.h
@@ -23,7 +23,10 @@
 #include "qom/object.h"
 
 #define TYPE_CONFIDENTIAL_GUEST_SUPPORT "confidential-guest-support"
-OBJECT_DECLARE_SIMPLE_TYPE(ConfidentialGuestSupport, 
CONFIDENTIAL_GUEST_SUPPORT)
+OBJECT_DECLARE_TYPE(ConfidentialGuestSupport,
+ConfidentialGuestSupportClass,
+CONFIDENTIAL_GUEST_SUPPORT)
+
 
 struct ConfidentialGuestSupport {
 Object parent;
@@ -55,8 +58,37 @@ struct ConfidentialGuestSupport {
 
 typedef struct ConfidentialGuestSupportClass {
 ObjectClass parent;
+
+int (*kvm_init)(ConfidentialGuestSupport *cgs, Error **errp);
+int (*kvm_reset)(ConfidentialGuestSupport *cgs, Error **errp);
 } ConfidentialGuestSupportClass;
 
+static inline int confidential_guest_kvm_init(ConfidentialGuestSupport *cgs,
+  Error **errp)
+{
+ConfidentialGuestSupportClass *klass;
+
+klass = CONFIDENTIAL_GUEST_SUPPORT_GET_CLASS(cgs);
+if (klass->kvm_init) {
+return klass->kvm_init(cgs, errp);
+}
+
+return 0;
+}
+
+static inline int confidential_guest_kvm_reset(ConfidentialGuestSupport *cgs,
+   Error **errp)
+{
+ConfidentialGuestSupportClass *klass;
+
+klass = CONFIDENTIAL_GUEST_SUPPORT_GET_CLASS(cgs);
+if (klass->kvm_reset) {
+return klass->kvm_reset(cgs, errp);
+}
+
+return 0;
+}
+
 #endif /* !CONFIG_USER_ONLY */
 
 #endif /* QEMU_CONFIDENTIAL_GUEST_SUPPORT_H */
-- 
2.44.0
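
For readers following the patch above, here is a minimal standalone sketch of
the "optional virtual function plus helper" pattern it introduces. It does not
use QOM; the DemoGuest and DemoGuestClass types and every demo_* name are
hypothetical stand-ins, and the Error ** parameter is left out to keep the
example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QOM object and class; not QEMU types. */
typedef struct DemoGuest DemoGuest;

typedef struct DemoGuestClass {
    /* Optional hooks: a vendor backend may leave either one NULL. */
    int (*kvm_init)(DemoGuest *guest);
    int (*kvm_reset)(DemoGuest *guest);
} DemoGuestClass;

struct DemoGuest {
    const DemoGuestClass *klass;
    int init_calls;
};

/* Call the hook if the class provides one; otherwise report success. */
static int demo_guest_kvm_init(DemoGuest *guest)
{
    assert(guest != NULL && guest->klass != NULL);
    if (guest->klass->kvm_init) {
        return guest->klass->kvm_init(guest);
    }
    return 0;
}

static int demo_guest_kvm_reset(DemoGuest *guest)
{
    assert(guest != NULL && guest->klass != NULL);
    if (guest->klass->kvm_reset) {
        return guest->klass->kvm_reset(guest);
    }
    return 0;
}

/* Example backend that implements only kvm_init. */
static int demo_vendor_kvm_init(DemoGuest *guest)
{
    guest->init_calls++;
    return 0;
}

static const DemoGuestClass demo_vendor_class = {
    .kvm_init = demo_vendor_kvm_init,
    .kvm_reset = NULL,
};
```

Callers never test the function pointers themselves; a backend without a reset
hook simply gets a successful no-op from the helper.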

[PATCH 01/26] pci-host/q35: Move PAM initialization above SMRAM initialization

2024-03-22 Thread Paolo Bonzini

From: Isaku Yamahata 

In mch_realize(), process PAM initialization before SMRAM initialization so
that a later patch can skip all the SMRAM-related setup with a single check.

Signed-off-by: Isaku Yamahata 
Signed-off-by: Xiaoyao Li 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-18-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 hw/pci-host/q35.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 0d7d4e3f086..98d4a7c253a 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -568,6 +568,16 @@ static void mch_realize(PCIDevice *d, Error **errp)
 /* setup pci memory mapping */
 pc_pci_as_mapping_init(mch->system_memory, mch->pci_address_space);
 
+/* PAM */
+init_pam(&mch->pam_regions[0], OBJECT(mch), mch->ram_memory,
+ mch->system_memory, mch->pci_address_space,
+ PAM_BIOS_BASE, PAM_BIOS_SIZE);
+for (i = 0; i < ARRAY_SIZE(mch->pam_regions) - 1; ++i) {
+init_pam(&mch->pam_regions[i + 1], OBJECT(mch), mch->ram_memory,
+ mch->system_memory, mch->pci_address_space,
+ PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
+}
+
 /* if *disabled* show SMRAM to all CPUs */
 memory_region_init_alias(&mch->smram_region, OBJECT(mch), "smram-region",
  mch->pci_address_space, 
MCH_HOST_BRIDGE_SMRAM_C_BASE,
@@ -634,15 +644,6 @@ static void mch_realize(PCIDevice *d, Error **errp)
 
 object_property_add_const_link(qdev_get_machine(), "smram",
OBJECT(&mch->smram));
-
-init_pam(&mch->pam_regions[0], OBJECT(mch), mch->ram_memory,
- mch->system_memory, mch->pci_address_space,
- PAM_BIOS_BASE, PAM_BIOS_SIZE);
-for (i = 0; i < ARRAY_SIZE(mch->pam_regions) - 1; ++i) {
-init_pam(&mch->pam_regions[i + 1], OBJECT(mch), mch->ram_memory,
- mch->system_memory, mch->pci_address_space,
- PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
-}
 }
 
 uint64_t mch_mcfg_base(void)
-- 
2.44.0

[PATCH] virtio: move logging definitions to hw/virtio/virtio.h

2024-03-22 Thread Paolo Bonzini

They are not included in upstream Linux, and therefore should not be
in standard-headers.  Otherwise, the next update to the headers would
eliminate them.

Cc: Michael S. Tsirkin 
Signed-off-by: Paolo Bonzini 
---
 include/hw/virtio/virtio.h  | 7 +++
 include/standard-headers/linux/virtio_pci.h | 7 ---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b3c74a1bca7..2db5eef432a 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -38,6 +38,13 @@
 #define LM_DISABLE  0x00
 #define LM_ENABLE   0x01
 
+#define LM_LOGGING_CTRL         0
+#define LM_BASE_ADDR_LOW        4
+#define LM_BASE_ADDR_HIGH       8
+#define LM_END_ADDR_LOW         12
+#define LM_END_ADDR_HIGH        16
+#define LM_VRING_STATE_OFFSET   0x20
+
 struct VirtQueue;
 
 static inline hwaddr vring_align(hwaddr addr,
diff --git a/include/standard-headers/linux/virtio_pci.h 
b/include/standard-headers/linux/virtio_pci.h
index 86733278ba3..3e2bc2c97e6 100644
--- a/include/standard-headers/linux/virtio_pci.h
+++ b/include/standard-headers/linux/virtio_pci.h
@@ -221,13 +221,6 @@ struct virtio_pci_cfg_cap {
 #define VIRTIO_PCI_COMMON_ADM_Q_IDX 60
 #define VIRTIO_PCI_COMMON_ADM_Q_NUM 62
 
-#define LM_LOGGING_CTRL         0
-#define LM_BASE_ADDR_LOW        4
-#define LM_BASE_ADDR_HIGH       8
-#define LM_END_ADDR_LOW         12
-#define LM_END_ADDR_HIGH        16
-#define LM_VRING_STATE_OFFSET   0x20
-
 #endif /* VIRTIO_PCI_NO_MODERN */
 
 /* Admin command status. */
-- 
2.44.0

Re: [PATCH 3/7] hw/char/stm32l4x5_usart: Add USART, UART, LPUART types

2024-03-22 Thread Peter Maydell

On Sun, 17 Mar 2024 at 10:41, Arnaud Minier
 wrote:
>
> Create different types for the USART, UART and LPUART of the STM32L4x5
> to deduplicate code and enable the implementation of different
> behaviors depending on the type.
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 
> ---
>  hw/char/stm32l4x5_usart.c | 113 +++---
>  include/hw/char/stm32l4x5_usart.h |  21 +-
>  2 files changed, 92 insertions(+), 42 deletions(-)
>
> diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
> index b56fee5b3a..f58bd56875 100644
> --- a/hw/char/stm32l4x5_usart.c
> +++ b/hw/char/stm32l4x5_usart.c
> @@ -154,9 +154,9 @@ REG32(RDR, 0x24)
>  REG32(TDR, 0x28)
>  FIELD(TDR, TDR, 0, 8)
>
> -static void stm32l4x5_usart_reset_hold(Object *obj)
> +static void stm32l4x5_usart_base_reset_hold(Object *obj)
>  {
> -STM32L4X5UsartState *s = STM32L4X5_USART(obj);
> +Stm32l4x5UsartBaseState *s = STM32L4X5_USART_BASE(obj);

Could you avoid this kind of "add function/type/etc in
one patch and then rename it in a following patch", please?
Give things the right name from the start.

This probably looks something like "squash this patch into
the previous one" in practice.

thanks
-- PMM

[PATCH v3 2/2] Implement SSH commands in QEMU GA for Windows

2024-03-22 Thread aidan_leuck

From: Aidan Leuck 

Signed-off-by: Aidan Leuck 
---
 qga/commands-windows-ssh.c | 791 +
 qga/commands-windows-ssh.h |  26 ++
 qga/meson.build|   5 +-
 qga/qapi-schema.json   |  17 +-
 4 files changed, 828 insertions(+), 11 deletions(-)
 create mode 100644 qga/commands-windows-ssh.c
 create mode 100644 qga/commands-windows-ssh.h

diff --git a/qga/commands-windows-ssh.c b/qga/commands-windows-ssh.c
new file mode 100644
index 00..da402e320c
--- /dev/null
+++ b/qga/commands-windows-ssh.c
@@ -0,0 +1,791 @@
+/*
+ * QEMU Guest Agent win32-specific command implementations for SSH keys.
+ * The implementation is opinionated and expects the SSH implementation to
+ * be OpenSSH.
+ *
+ * Copyright Schweitzer Engineering Laboratories. 2024
+ *
+ * Authors:
+ *  Aidan Leuck 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+
+#include "commands-ssh-core.h"
+#include "commands-windows-ssh.h"
+#include "guest-agent-core.h"
+#include "limits.h"
+#include "lmaccess.h"
+#include "lmapibuf.h"
+#include "lmerr.h"
+#include "qapi/error.h"
+
+#include "qga-qapi-commands.h"
+#include "sddl.h"
+#include "shlobj.h"
+#include "userenv.h"
+
+#define AUTHORIZED_KEY_FILE "authorized_keys"
+#define AUTHORIZED_KEY_FILE_ADMIN "administrators_authorized_keys"
+#define LOCAL_SYSTEM_SID "S-1-5-18"
+#define ADMIN_SID "S-1-5-32-544"
+#define WORLD_SID "S-1-1-0"
+
+/*
+ * Frees userInfo structure. This implements the g_auto cleanup
+ * for the structure.
+ */
+void free_userInfo(PWindowsUserInfo info)
+{
+g_free(info->sshDirectory);
+g_free(info->authorizedKeyFile);
+LocalFree(info->SSID);
+g_free(info->username);
+g_free(info);
+}
+
+/*
+ * Gets the admin SSH folder for OpenSSH. For security reasons, OpenSSH does
+ * not store the authorized_key file in the user's home directory and instead
+ * stores it at %PROGRAMDATA%/ssh. This function returns the path to that
+ * directory on the user's machine.
+ *
+ * parameters:
+ * errp -> error structure to set when an error occurs
+ * returns: The path to the ssh folder in %PROGRAMDATA% or NULL if an error
+ * occurred.
+ */
+static char *get_admin_ssh_folder(Error **errp)
+{
+/* Allocate memory for the program data path */
+g_autofree char *programDataPath = NULL;
+char *authkeys_path = NULL;
+PWSTR pgDataW = NULL;
+g_autoptr(GError) gerr = NULL;
+
+/* Get the KnownFolderPath on the machine. */
+HRESULT folderResult =
+SHGetKnownFolderPath(&FOLDERID_ProgramData, 0, NULL, &pgDataW);
+if (folderResult != S_OK) {
+error_setg(errp, "Failed to retrieve ProgramData folder");
+return NULL;
+}
+
+/* Convert from a wide string back to a standard character string. */
+programDataPath = g_utf16_to_utf8(pgDataW, -1, NULL, NULL, &gerr);
+CoTaskMemFree(pgDataW);
+if (!programDataPath) {
+error_setg(errp,
+   "Failed converting ProgramData folder path to UTF-8: %s",
+   gerr->message);
+return NULL;
+}
+
+/* Build the path to the file. */
+authkeys_path = g_build_filename(programDataPath, "ssh", NULL);
+return authkeys_path;
+}
+
+/*
+ * Gets the path to the SSH folder for the specified user. If the user is an
+ * admin it returns the ssh folder located at %PROGRAMDATA%/ssh. If the user is
+ * not an admin it returns %USERPROFILE%/.ssh
+ *
+ * parameters:
+ * username -> Username to get the SSH folder for
+ * isAdmin -> Whether the user is an admin or not
+ * errp -> Error structure to set any errors that occur.
+ * returns: path to the ssh folder as a string.
+ */
+static char *get_ssh_folder(const char *username, const bool isAdmin,
+Error **errp)
+{
+DWORD maxSize = MAX_PATH;
+g_autofree char *profilesDir = g_new0(char, maxSize);
+
+if (isAdmin) {
+return get_admin_ssh_folder(errp);
+}
+
+/* If not an Admin the SSH key is in the user directory. */
+/* Get the user profile directory on the machine. */
+BOOL ret = GetProfilesDirectory(profilesDir, &maxSize);
+if (!ret) {
+error_setg_win32(errp, GetLastError(),
+ "failed to retrieve profiles directory");
+return NULL;
+}
+
+/* Builds the filename */
+return g_build_filename(profilesDir, username, ".ssh", NULL);
+}
+
+/*
+ * Creates an entry for the everyone group. This is used when the user is an
+ * Administrator. This is consistent with the folder permissions that OpenSSH
+ * creates when it is installed. Anyone can read the file, but only
+ * Administrators and SYSTEM can modify the file.
+ *
+ * parameters:
+ * userInfo -> Information about the current user
+ * pACL -> Pointer to an ACL structure
+ * errp -> Error structure to set any errors that occur
+ * returns: 1 on success, 0 otherwise
+

[PATCH v3 1/2] Refactor common functions between POSIX and Windows implementation

2024-03-22 Thread aidan_leuck

From: Aidan Leuck 

Signed-off-by: Aidan Leuck 
---
 qga/commands-posix-ssh.c | 47 +
 qga/commands-ssh-core.c  | 57 
 qga/commands-ssh-core.h  |  8 ++
 qga/meson.build  |  1 +
 4 files changed, 67 insertions(+), 46 deletions(-)
 create mode 100644 qga/commands-ssh-core.c
 create mode 100644 qga/commands-ssh-core.h

diff --git a/qga/commands-posix-ssh.c b/qga/commands-posix-ssh.c
index 236f80de44..9a71b109f9 100644
--- a/qga/commands-posix-ssh.c
+++ b/qga/commands-posix-ssh.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 
+#include "commands-ssh-core.h"
 #include "qapi/error.h"
 #include "qga-qapi-commands.h"
 
@@ -80,37 +81,6 @@ mkdir_for_user(const char *path, const struct passwd *p,
 return true;
 }
 
-static bool
-check_openssh_pub_key(const char *key, Error **errp)
-{
-/* simple sanity-check, we may want more? */
-if (!key || key[0] == '#' || strchr(key, '\n')) {
-error_setg(errp, "invalid OpenSSH public key: '%s'", key);
-return false;
-}
-
-return true;
-}
-
-static bool
-check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp)
-{
-size_t n = 0;
-strList *k;
-
-for (k = keys; k != NULL; k = k->next) {
-if (!check_openssh_pub_key(k->value, errp)) {
-return false;
-}
-n++;
-}
-
-if (nkeys) {
-*nkeys = n;
-}
-return true;
-}
-
 static bool
 write_authkeys(const char *path, const GStrv keys,
const struct passwd *p, Error **errp)
@@ -139,21 +109,6 @@ write_authkeys(const char *path, const GStrv keys,
 return true;
 }
 
-static GStrv
-read_authkeys(const char *path, Error **errp)
-{
-g_autoptr(GError) err = NULL;
-g_autofree char *contents = NULL;
-
-if (!g_file_get_contents(path, &contents, NULL, &err)) {
-error_setg(errp, "failed to read '%s': %s", path, err->message);
-return NULL;
-}
-
-return g_strsplit(contents, "\n", -1);
-
-}
-
 void
 qmp_guest_ssh_add_authorized_keys(const char *username, strList *keys,
   bool has_reset, bool reset,
diff --git a/qga/commands-ssh-core.c b/qga/commands-ssh-core.c
new file mode 100644
index 00..f165c4a337
--- /dev/null
+++ b/qga/commands-ssh-core.c
@@ -0,0 +1,57 @@
+/*
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "qapi/error.h"
+#include "commands-ssh-core.h"
+
+GStrv read_authkeys(const char *path, Error **errp)
+{
+g_autoptr(GError) err = NULL;
+g_autofree char *contents = NULL;
+
+if (!g_file_get_contents(path, &contents, NULL, &err))
+{
+error_setg(errp, "failed to read '%s': %s", path, err->message);
+return NULL;
+}
+
+return g_strsplit(contents, "\n", -1);
+}
+
+bool check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp)
+{
+size_t n = 0;
+strList *k;
+
+for (k = keys; k != NULL; k = k->next)
+{
+if (!check_openssh_pub_key(k->value, errp))
+{
+return false;
+}
+n++;
+}
+
+if (nkeys)
+{
+*nkeys = n;
+}
+return true;
+}
+
+bool check_openssh_pub_key(const char *key, Error **errp)
+{
+/* simple sanity-check, we may want more? */
+if (!key || key[0] == '#' || strchr(key, '\n'))
+{
+error_setg(errp, "invalid OpenSSH public key: '%s'", key);
+return false;
+}
+
+return true;
+}
diff --git a/qga/commands-ssh-core.h b/qga/commands-ssh-core.h
new file mode 100644
index 00..ef9f600d4d
--- /dev/null
+++ b/qga/commands-ssh-core.h
@@ -0,0 +1,8 @@
+/*
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+GStrv read_authkeys(const char *path, Error **errp);
+bool check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp);
+bool check_openssh_pub_key(const char *key, Error **errp);
diff --git a/qga/meson.build b/qga/meson.build
index 1c3d2a3d1b..d32b401507 100644
--- a/qga/meson.build
+++ b/qga/meson.build
@@ -66,6 +66,7 @@ qga_ss.add(files(
   'guest-agent-command-state.c',
   'main.c',
   'cutils.c',
+  'commands-ssh-core.c'
 ))
 if host_os == 'windows'
   qga_ss.add(files(
-- 
2.44.0
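
As an illustration of the key checks that the patch above moves into
commands-ssh-core.c, here is a minimal standalone sketch in plain C. It mirrors
the same three rejection rules (NULL key, leading '#', embedded newline) but
uses a NULL-terminated array instead of QEMU's strList and drops the Error **
reporting; the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reject NULL keys, comment lines, and keys containing a newline. */
static bool demo_check_pub_key(const char *key)
{
    return key && key[0] != '#' && !strchr(key, '\n');
}

/*
 * Validate every key in a NULL-terminated array. On success, store the
 * number of keys in *nkeys when the caller asked for it.
 */
static bool demo_check_pub_keys(const char *const *keys, size_t *nkeys)
{
    size_t n = 0;

    assert(keys != NULL);
    for (; keys[n] != NULL; n++) {
        if (!demo_check_pub_key(keys[n])) {
            return false;
        }
    }

    if (nkeys) {
        *nkeys = n;
    }
    return true;
}
```

The newline check matters because each key becomes one line of the
authorized keys file, so an embedded newline would let a single "key" add
extra entries.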

[PATCH v3 0/2] Implement SSH commands in QEMU GA for Windows

2024-03-22 Thread aidan_leuck

From: Aidan Leuck 

This series implements guest-ssh-add-authorized-keys,
guest-ssh-remove-authorized-keys, and guest-ssh-get-authorized-keys
for Windows. It is based on Microsoft's OpenSSH implementation,
https://github.com/PowerShell/Win32-OpenSSH. With these commands the guest
agent can support KubeVirt, allowing guest agent propagation to be used to
dynamically inject SSH keys:
https://kubevirt.io/user-guide/virtual_machines/accessing_virtual_machines/#dynamic-ssh-public-key-injection-via-qemu-guest-agent

Changes since v2
* Set indent to 4 spaces
* Moved all comments to C style comments
* Fixed a segfault bug in get_user_info function related to non zeroed memory 
when a user did not exist.
* Used g_new0 instead of g_malloc where applicable
* Modified newlines in qapi-schema.json
* Added newlines at the end of all files
* GError functions now use g_autoptr instead of being freed manually.
* Refactored get_ssh_folder to remove goto error statement
* Fixed uninitialized variable pgDataW
* Modified patch order so that the generalization patch is the first patch
* Removed unnecessary ZeroMemory calls

Changes since v1
* Fixed styling errors
* Moved from wcstombs to g_utf functions
* Removed unnecessary if checks on calls to free
* Fixed copyright headers
* Refactored create_acl functions into base function, admin function and user 
function
* Removed unused user count function
* Split up refactor of existing code into a separate patch

Aidan Leuck (2):
  Refactor common functions between POSIX and Windows implementation
  Implement SSH commands in QEMU GA for Windows

 qga/commands-posix-ssh.c   |  47 +--
 qga/commands-ssh-core.c|  57 +++
 qga/commands-ssh-core.h|   8 +
 qga/commands-windows-ssh.c | 791 +
 qga/commands-windows-ssh.h |  26 ++
 qga/meson.build|   6 +-
 qga/qapi-schema.json   |  17 +-
 7 files changed, 895 insertions(+), 57 deletions(-)
 create mode 100644 qga/commands-ssh-core.c
 create mode 100644 qga/commands-ssh-core.h
 create mode 100644 qga/commands-windows-ssh.c
 create mode 100644 qga/commands-windows-ssh.h

-- 
2.44.0

Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-22 Thread Richard Henderson


On 3/21/24 18:48, Michael Tokarev wrote:

21.03.2024 21:32, Helge Deller wrote:

On 3/21/24 19:25, Sven Schnelle wrote:

Michael Tokarev  writes:


20.03.2024 03:32, Richard Henderson :


Richard Henderson (3):
    target/hppa: Fix assemble_16 insns for wide mode
    target/hppa: Fix assemble_11a insns for wide mode
    target/hppa: Fix assemble_12a insns for wide mode
Sven Schnelle (6):
    target/hppa: ldcw,s uses static shift of 3
    target/hppa: fix shrp for wide mode
    target/hppa: fix access_id check
    target/hppa: exit tb on flush cache instructions
    target/hppa: mask privilege bits in mfia
    target/hppa: fix do_stdby_e()


Is it all -stable material (when appropriate)?


I'd say yes.


Yes.


Picked all 9 for stable-8.2.

And none for stable-7.2.  There, just one of them applies.

I understand most of them can still be applied (it is just adding
new lines here and there; the same lines need to be added to 7.2,
but there the context is missing, so every patch needs manual applying,
which I'm not feeling confident doing).  If any of that is
really good to have in 7.2 (which has de facto become an LTS series),
please re-spin it on top of the stable-7.2 branch and send the result
to qemu-stable@.


This is all for hppa64 support, which was not present in 7.2.

r~

Re: [PULL 00/15] riscv-to-apply queue

2024-03-22 Thread Michael Tokarev


22.03.2024 11:53, Alistair Francis :


RISC-V PR for 9.0

* Do not enable all named features by default
* A range of Vector fixes
* Update APLIC IDC after claiming iforce register
* Remove the dependency of Zvfbfmin to Zfbfmin
* Fix mode in riscv_tlb_fill
* Fix timebase-frequency when using KVM acceleration


Should something from there be picked up for stable (8.2 and probably 7.2)?

Thanks,

/mjt



Daniel Henrique Barboza (10):
   target/riscv: do not enable all named features by default
   target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
   trans_rvv.c.inc: set vstart = 0 in int scalar move insns
   target/riscv/vector_helper.c: fix 'vmvr_v' memcpy endianess
   target/riscv: always clear vstart in whole vec move insns
   target/riscv: always clear vstart for ldst_whole insns
   target/riscv/vector_helpers: do early exit when vstart >= vl
   target/riscv: remove 'over' brconds from vector trans
   trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
   target/riscv/vector_helper.c: optimize loops in ldst helpers

Frank Chang (1):
   hw/intc: Update APLIC IDC after claiming iforce register

Irina Ryapolova (1):
   target/riscv: Fix mode in riscv_tlb_fill

Ivan Klokov (1):
   target/riscv: enable 'vstart_eq_zero' in the end of insns

Max Chou (1):
   target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin

Yong-Xuan Wang (1):
   target/riscv/kvm: fix timebase-frequency when using KVM acceleration

Re: [PATCH v2 1/2] hw/arm: Add support for stm32g000 SoC family

2024-03-22 Thread Peter Maydell

On Wed, 20 Mar 2024 at 20:21, Felipe Balbi  wrote:
>
> Minimal support with USARTs and SPIs working. This SoC will be used to
> create a nucleo-g071rb board.
>
> Signed-off-by: Felipe Balbi 
> ---
>
> Changes since v1:
> - Convert tabs to spaces (checkpatch.pl)
> - Correct lines longer than 80 characters (checkpatch.pl)
> - Correct num-prio-bits (Samuel Tardieu)
> - Correct num-irqs (Found reviewing RM0444)

> +static void stm32g000_soc_initfn(Object *obj)
> +{
> +STM32G000State *s = STM32G000_SOC(obj);
> +int i;
> +
> +object_initialize_child(obj, "armv7m", &s->armv7m, TYPE_ARMV7M);
> +
> +for (i = 0; i < STM_NUM_USARTS; i++) {
> +object_initialize_child(obj, "usart[*]", &s->usart[i],
> +TYPE_STM32F2XX_USART);
> +}
> +

I was just prompted by another patchset on my review queue
to look a bit more carefully at the USART section of the
datasheet, and I think that TYPE_STM32F2XX_USART is not
the correct UART type for this SoC. That UART type has its
registers in the order SR, DR, BRR, CR1, CR2, CR3, GTPR.
The G0x0 SoC describes a UART with more registers, in a
different order (CR1, CR2, CR3, BRR, GTPR, RTOR, RQR,
ISR, ICR, RDR, TDR, PRESC). That's more like the device
that this patchset adds:

https://patchew.org/QEMU/20240317103918.44375-1-arnaud.min...@telecom-paris.fr/

though I haven't tried to cross-check all these reference
manuals to see if it is identical or merely quite close...

thanks
-- PMM

Re: [PATCH 3/7] KVM: track whether guest state is encrypted

2024-03-22 Thread Xiaoyao Li


On 3/19/2024 9:59 PM, Paolo Bonzini wrote:

So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing.  For the new
API using VM types, instead, the ioctls will fail which is a safer and
more robust approach.

The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES.  In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to support migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.

Signed-off-by: Paolo Bonzini 


Reviewed-by: Xiaoyao Li
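
A minimal sketch of the gating described in the commit message above, for
readers who want to see the invariant in isolation. DemoVm, DemoVcpu and the
demo_* functions are hypothetical stand-ins for the KVM state and vCPU; the
register read is modelled as a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoVm {
    bool guest_state_protected;
} DemoVm;

typedef struct DemoVcpu {
    bool dirty;       /* registers cached on the host side, need write-back */
    int reg_reads;    /* stand-in for calls that fetch registers from KVM */
} DemoVcpu;

/* Called once by arch code when guest state becomes encrypted. */
static void demo_mark_guest_state_protected(DemoVm *vm)
{
    vm->guest_state_protected = true;
}

/*
 * Fetch registers only if they are not already cached and the guest
 * state is still readable. Once the state is protected, the vCPU is
 * never marked dirty, so no register write-back is attempted later.
 */
static void demo_synchronize_state(DemoVm *vm, DemoVcpu *cpu)
{
    assert(vm != NULL && cpu != NULL);
    if (!cpu->dirty && !vm->guest_state_protected) {
        cpu->reg_reads++;
        cpu->dirty = true;
    }
}
```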

Re: [PATCH 4/7] KVM: remove kvm_arch_cpu_check_are_resettable

2024-03-22 Thread Xiaoyao Li


On 3/19/2024 9:59 PM, Paolo Bonzini wrote:

Board reset requires writing a fresh CPU state.  As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.

Signed-off-by: Paolo Bonzini 


Reviewed-by: Xiaoyao Li

Re: [PATCH-for-9.0 1/2] hw/clock: Let clock_set_mul_div() return boolean value

2024-03-22 Thread Peter Maydell

On Fri, 22 Mar 2024 at 15:58, Philippe Mathieu-Daudé  wrote:
>
> Let clock_set_mul_div() return a boolean value indicating whether the
> clock has been updated or not, similarly to clock_set().
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/hw/clock.h | 4 +++-
>  hw/core/clock.c| 8 +++-
>  2 files changed, 10 insertions(+), 2 deletions(-)

I guess this makes sense, especially since you often would like to know
this to decide whether to call clock_propagate().

We should also update the docs/devel/clocks.rst to mention this.

thanks
-- PMM

Re: [PATCH v5 5/7] migration/multifd: implement initialization of qpl compression

2024-03-22 Thread Peter Xu

On Fri, Mar 22, 2024 at 02:47:02PM +, Liu, Yuan1 wrote:
> > -Original Message-
> > From: Liu, Yuan1
> > Sent: Friday, March 22, 2024 10:07 AM
> > To: Peter Xu 
> > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com; Zou,
> > Nanhai 
> > Subject: RE: [PATCH v5 5/7] migration/multifd: implement initialization of
> > qpl compression
> > 
> > > -Original Message-
> > > From: Peter Xu 
> > > Sent: Thursday, March 21, 2024 11:28 PM
> > > To: Liu, Yuan1 
> > > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > > de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com;
> > Zou,
> > > Nanhai 
> > > Subject: Re: [PATCH v5 5/7] migration/multifd: implement initialization
> > of
> > > qpl compression
> > >
> > > On Thu, Mar 21, 2024 at 01:37:36AM +, Liu, Yuan1 wrote:
> > > > > -Original Message-
> > > > > From: Peter Xu 
> > > > > Sent: Thursday, March 21, 2024 4:32 AM
> > > > > To: Liu, Yuan1 
> > > > > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > > > > de...@nongnu.org; hao.xi...@bytedance.com;
> > bryan.zh...@bytedance.com;
> > > Zou,
> > > > > Nanhai 
> > > > > Subject: Re: [PATCH v5 5/7] migration/multifd: implement
> > > initialization of
> > > > > qpl compression
> > > > >
> > > > > On Wed, Mar 20, 2024 at 04:23:01PM +, Liu, Yuan1 wrote:
> > > > > > let me explain here, during the decompression operation of IAA,
> > the
> > > > > > decompressed data can be directly output to the virtual address of
> > > the
> > > > > > guest memory by IAA hardware.  It can avoid copying the
> > decompressed
> > > > > data
> > > > > > to guest memory by CPU.
> > > > >
> > > > > I see.
> > > > >
> > > > > > Without -mem-prealloc, all the guest memory is not populated, and
> > > IAA
> > > > > > hardware needs to trigger I/O page fault first and then output the
> > > > > > decompressed data to the guest memory region.  Besides that, CPU
> > > page
> > > > > > faults will also trigger IOTLB flush operation when IAA devices
> > use
> > > SVM.
> > > > >
> > > > > Oh so the IAA hardware already can use CPU pgtables?  Nice..
> > > > >
> > > > > Why IOTLB flush is needed?  AFAIU we're only installing new pages,
> > the
> > > > > request can either come from a CPU access or a DMA.  In all cases
> > > there
> > > > > should have no tearing down of an old page.  Isn't an iotlb flush
> > only
> > > > > needed if a tear down happens?
> > > >
> > > > As far as I know, IAA hardware uses SVM technology to use the CPU's
> > page
> > > table
> > > > for address translation (IOMMU scalable mode directly accesses the CPU
> > > page table).
> > > > Therefore, when the CPU page table changes, the device's Invalidation
> > > operation needs
> > > > to be triggered to update the IOMMU and the device's cache.
> > > >
> > > > My current kernel version is mainline 6.2. The issue I see is as
> > > follows:
> > > > --Handle_mm_fault
> > > >  |
> > > >   -- wp_page_copy
> > >
> > > This is the CoW path.  Not usual at all..
> > >
> > > I assume this issue should only present on destination.  Then the guest
> > > pages should be the destination of such DMAs to happen, which means
> > these
> > > should be write faults, and as we see here it is, otherwise it won't
> > > trigger a CoW.
> > >
> > > However it's not clear to me why a pre-installed zero page existed.  It
> > > means someone read the guest pages first.
> > >
> > > It might be interesting to know _why_ someone reads the guest pages,
> > even
> > > if we know they're all zeros.  If we can avoid such reads then it'll be
> > a
> > > hole rather than a prefaulted read on zero page, then invalidations are
> > > not
> > > needed, and I expect that should fix the iotlb storm issue.
> > 
> > The received pages will be read for zero pages check first. Although
> > these pages are zero pages, and IAA hardware will not access them, the
> > COW happens and causes following IOTLB flush operation. As far as I know,
> > IOMMU quickly detects whether the address range has been used by the
> > device,
> > and does not invalidate the address that is not used by the device, this
> > has
> > not yet been resolved in Linux kernel 6.2. I will check the latest status
> > for
> > this.
> 
> I checked the Linux mainline 6.8 code, there are no big changes for this.
> In version 6.8, if the process needs to flush MMU TLB, then I/O TLB flush
> will be also triggered when the process has SVM devices. I haven't found
> the code to check if pages have been set EA (Extended-Accessed) bit before
> submitting invalidation operations, this is same with version 6.2.
> 
> VT-d 3.6.2
> If the Extended-Accessed-Flag-Enable (EAFE) is 1 in a scalable-mode 
> PASID-table
> entry that references a first-stage paging-structure entry used by the 
> remapping
> hardware, it atomically sets the EA field in that entry. Whenever EA field is 
> atomically set, the A field is also set in the same atomic operation. For 
> softw

Re: [PATCH-for-9.0 2/2] hw/misc/stm32l4x5_rcc: Propagate period when enabling a clock

2024-03-22 Thread Peter Maydell

On Fri, 22 Mar 2024 at 15:59, Philippe Mathieu-Daudé  wrote:
>
> From: Arnaud Minier 
>
> The "clock_set_mul_div" function doesn't propagate the clock period
> to the children if it is changed (e.g. by enabling/disabling a clock
> multiplexer).
> This was overlooked during the implementation due to late changes.
>
> This commit propagates the change if the multiplier or divider changes.
>
> Fixes: ec7d83acbd ("hw/misc/stm32l4x5_rcc: Add an internal clock multiplexer 
> object")
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 
> Message-ID: <20240317103918.44375-2-arnaud.min...@telecom-paris.fr>
> [PMD: Check clock_set_mul_div() return value]
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/misc/stm32l4x5_rcc.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/hw/misc/stm32l4x5_rcc.c b/hw/misc/stm32l4x5_rcc.c
> index bc2d63528b..7ad628b296 100644
> --- a/hw/misc/stm32l4x5_rcc.c
> +++ b/hw/misc/stm32l4x5_rcc.c
> @@ -59,7 +59,10 @@ static void clock_mux_update(RccClockMuxState *mux, bool 
> bypass_source)
>  freq_multiplier = mux->divider;
>  }
>
> -clock_set_mul_div(mux->out, freq_multiplier, mux->multiplier);
> +if (clock_set_mul_div(mux->out, freq_multiplier, mux->multiplier)) {
> +clock_propagate(mux->out);
> +}
> +
>  clock_update(mux->out, clock_get(current_source));

clock_update() also calls clock_propagate(), so this doesn't
seem entirely right: shouldn't we figure out whether we need to
do a clock_propagate() and do it once? (Maybe what seems odd to me
is that clock_set() does clock_propagate() for you but
clock_set_mul_div() does not...)

(Also I think we should have the information we need now to be able
to do the "reduce log spam" in the comment -- if neither
clock_set_mul_div() nor clock_update() needed to do anything
then we didn't actually change the config.)
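
One possible shape for the "work out whether anything changed, then propagate
once" idea, as a standalone sketch. DemoClock and the demo_* functions are
hypothetical and are not the hw/clock API; in particular demo_clock_set() here
only stores the period and never propagates, which is an assumption made for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct DemoClock {
    uint64_t period;
    uint32_t mul;
    uint32_t div;
    int propagations;   /* counts how often children were notified */
} DemoClock;

/* Return true only if the multiplier or divider actually changed. */
static bool demo_clock_set_mul_div(DemoClock *clk, uint32_t mul, uint32_t div)
{
    assert(div != 0);
    if (clk->mul == mul && clk->div == div) {
        return false;
    }
    clk->mul = mul;
    clk->div = div;
    return true;
}

/* Return true only if the period actually changed. */
static bool demo_clock_set(DemoClock *clk, uint64_t period)
{
    if (clk->period == period) {
        return false;
    }
    clk->period = period;
    return true;
}

static void demo_clock_propagate(DemoClock *clk)
{
    clk->propagations++;
}

/* Apply both settings, then notify children at most once. */
static bool demo_mux_update(DemoClock *out, uint32_t mul, uint32_t div,
                            uint64_t src_period)
{
    bool changed = demo_clock_set_mul_div(out, mul, div);

    changed |= demo_clock_set(out, src_period);
    if (changed) {
        demo_clock_propagate(out);
    }
    return changed;
}
```

The boolean returned by demo_mux_update() is also the piece of information
needed to skip logging when the configuration did not change.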

-- PMM

Re: [PATCH v2 0/2] Add support for STM32G0 SoC family

2024-03-22 Thread Peter Maydell

On Wed, 20 Mar 2024 at 20:22, Felipe Balbi  wrote:
>
> Hi all,
>
> These two patches add support for STM32G0 family and nucleo-g071rb
> board. Patches have been tested with minimal embedded rust examples.
>
> Changes since v1:
>
>   - Patch 1:
> - Convert tabs to spaces (checkpatch.pl)
> - Correct lines longer than 80 characters (checkpatch.pl)
> - Correct num-prio-bits (Samuel Tardieu)
> - Correct num-irqs (Found reviewing RM0444)
>
>   - Patch 2:
> - Convert tabs to spaces (checkpatch.pl)
>
> Felipe Balbi (2):
>   hw/arm: Add support for stm32g000 SoC family
>   hw/arm: Add nucleo-g071rb board
>
>  MAINTAINERS|  13 ++
>  hw/arm/Kconfig |  12 ++
>  hw/arm/meson.build |   2 +
>  hw/arm/nucleo-g071rb.c |  70 +
>  hw/arm/stm32g000_soc.c | 253 +
>  include/hw/arm/stm32g000_soc.h |  62 

Hi; I've left review comments on the two patches in this series.
There are a couple of "missing pieces" here:

(1) documentation. Arm board documentation is in rst format
in docs/system/arm/. You can either add the information for
this board to the existing stm32.rst which documents the other
STM32 boards we model, or if you think it's too different to
share a source file you can create a new one with the same
kind of structure. (Using the existing stm32.rst seems likely
to be best to me.)

(2) tests. Are there any conveniently publicly available guest
images from some URL that doesn't mind our CI downloading from
it, that would run on the board model as it is? If so, we could
consider writing an avocado test (these live in tests/avocado/),
which basically can do "run QEMU with this image and look for
this output on the serial port". This is a "nice-to-have", not
a requirement.

thanks
-- PMM

[PATCH] hw/s390x: Include missing 'cpu.h' header

2024-03-22 Thread Philippe Mathieu-Daudé

"cpu.h" is implicitly included. Include it explicitly to
avoid the following error when refactoring headers:

  hw/s390x/s390-stattrib.c:86:40: error: use of undeclared identifier 'TARGET_PAGE_SIZE'
  len = sac->peek_stattr(sas, addr / TARGET_PAGE_SIZE, buflen, vals);
 ^
  hw/s390x/s390-stattrib.c:94:58: error: use of undeclared identifier 'TARGET_PAGE_MASK'
 addr / TARGET_PAGE_SIZE, len, addr & ~TARGET_PAGE_MASK);
 ^
  hw/s390x/s390-stattrib.c:224:40: error: use of undeclared identifier 'TARGET_PAGE_BITS'
  qemu_put_be64(f, (start_gfn << TARGET_PAGE_BITS) | STATTR_FLAG_MORE);
 ^
  In file included from hw/s390x/s390-virtio-ccw.c:17:
  hw/s390x/s390-virtio-hcall.h:22:27: error: unknown type name 'CPUS390XState'
  int s390_virtio_hypercall(CPUS390XState *env);
^

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/s390x/s390-virtio-hcall.h | 2 ++
 hw/s390x/s390-stattrib.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/hw/s390x/s390-virtio-hcall.h b/hw/s390x/s390-virtio-hcall.h
index 9800c4b351..3ae6d6ae3a 100644
--- a/hw/s390x/s390-virtio-hcall.h
+++ b/hw/s390x/s390-virtio-hcall.h
@@ -13,6 +13,7 @@
 #define HW_S390_VIRTIO_HCALL_H
 
 #include "standard-headers/asm-s390/virtio-ccw.h"
+#include "cpu.h"
 
 /* The only thing that we need from the old kvm_virtio.h file */
 #define KVM_S390_VIRTIO_NOTIFY 0
@@ -20,4 +21,5 @@
 typedef int (*s390_virtio_fn)(const uint64_t *args);
 void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
 int s390_virtio_hypercall(CPUS390XState *env);
+
 #endif /* HW_S390_VIRTIO_HCALL_H */
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index c483b62a9b..aaf48ac73f 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -19,6 +19,7 @@
 #include "exec/ram_addr.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
+#include "cpu.h"
 
 /* 512KiB cover 2GB of guest memory */
 #define CMMA_BLOCK_SIZE  (512 * KiB)
-- 
2.41.0

RE: [PATCH v4 1/3] ui/console: Introduce dpy_gl_dmabuf_get_height/width() helpers

2024-03-22 Thread Kim, Dongwon

Hi Marc-André,

> -Original Message-
> From: Marc-André Lureau 
> Sent: Friday, March 22, 2024 2:06 AM
> To: Kim, Dongwon 
> Cc: qemu-devel@nongnu.org; phi...@linaro.org
> Subject: Re: [PATCH v4 1/3] ui/console: Introduce
> dpy_gl_dmabuf_get_height/width() helpers
> 
> Hi Kim
> 
> On Fri, Mar 22, 2024 at 3:45 AM  wrote:
> >
> > From: Dongwon Kim 
> >
> > dpy_gl_dmabuf_get_height() and dpy_gl_dmabuf_get_width() are helpers
> > for retrieving width and height fields from QemuDmaBuf struct.
> >
> 
> There are many places left where width/height fields are still accessed 
> directly.
> 
> If we want to make the whole structure private, we will probably need setters.
[Kim, Dongwon] Are you saying we need setters and getters for every
individual field, and that every place in QEMU that accesses those fields,
including ui/* (e.g. gtk-egl.c), should use the new functions?

> 
> I don't see why the function should silently return 0 when given NULL.
> Imho an assert(dmabuf != NULL) is appropriate (or g_return_val_if_fail).
[Kim, Dongwon] Yeah I can do that. I will update that part.
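
For reference, the accessor style being discussed could look roughly like
the sketch below. It is an illustration only: ToyDmaBuf and the
toy_dmabuf_*() functions are hypothetical stand-ins rather than the real
QemuDmaBuf or dpy_gl_dmabuf_*() code, and it uses a plain assert() where
the actual patch may choose g_return_val_if_fail() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in; the real QemuDmaBuf has many more fields. */
typedef struct ToyDmaBuf {
    uint32_t width;
    uint32_t height;
} ToyDmaBuf;

/* Getter: a NULL buffer is a caller bug, so assert instead of returning 0. */
static uint32_t toy_dmabuf_get_width(const ToyDmaBuf *dmabuf)
{
    assert(dmabuf != NULL);
    return dmabuf->width;
}

static uint32_t toy_dmabuf_get_height(const ToyDmaBuf *dmabuf)
{
    assert(dmabuf != NULL);
    return dmabuf->height;
}

/* Setter: needed once callers can no longer touch the fields directly. */
static void toy_dmabuf_set_size(ToyDmaBuf *dmabuf, uint32_t width,
                                uint32_t height)
{
    assert(dmabuf != NULL);
    dmabuf->width = width;
    dmabuf->height = height;
}
```

With both getters and setters in place, the struct definition could later
move out of the public header without touching the callers again.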

> 
> > Cc: Philippe Mathieu-Daudé 
> > Cc: Marc-André Lureau 
> > Cc: Vivek Kasireddy 
> > Signed-off-by: Dongwon Kim 
> > ---
> >  include/ui/console.h|  2 ++
> >  hw/display/virtio-gpu-udmabuf.c |  7 ---
> >  hw/vfio/display.c   |  9 ++---
> >  ui/console.c| 18 ++
> >  4 files changed, 30 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/ui/console.h b/include/ui/console.h
> > index 0bc7a00ac0..6064487fc4 100644
> > --- a/include/ui/console.h
> > +++ b/include/ui/console.h
> > @@ -358,6 +358,8 @@ void dpy_gl_cursor_dmabuf(QemuConsole *con, QemuDmaBuf *dmabuf,
> >  bool have_hot, uint32_t hot_x, uint32_t hot_y);
> >  void dpy_gl_cursor_position(QemuConsole *con,
> >  uint32_t pos_x, uint32_t pos_y);
> > +uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);
> > +uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);
> >  void dpy_gl_release_dmabuf(QemuConsole *con,
> >  QemuDmaBuf *dmabuf);
> >  void dpy_gl_update(QemuConsole *con,
> > diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
> > index d51184d658..a4ebf828ec 100644
> > --- a/hw/display/virtio-gpu-udmabuf.c
> > +++ b/hw/display/virtio-gpu-udmabuf.c
> > @@ -206,6 +206,7 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
> >  {
> >  struct virtio_gpu_scanout *scanout = &g->parent_obj.scanout[scanout_id];
> >  VGPUDMABuf *new_primary, *old_primary = NULL;
> > +uint32_t width, height;
> >
> >  new_primary = virtio_gpu_create_dmabuf(g, scanout_id, res, fb, r);
> >  if (!new_primary) {
> > @@ -216,10 +217,10 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
> >  old_primary = g->dmabuf.primary[scanout_id];
> >  }
> >
> > +width = dpy_gl_dmabuf_get_width(&new_primary->buf);
> > +height = dpy_gl_dmabuf_get_height(&new_primary->buf);
> >  g->dmabuf.primary[scanout_id] = new_primary;
> > -qemu_console_resize(scanout->con,
> > -new_primary->buf.width,
> > -new_primary->buf.height);
> > +qemu_console_resize(scanout->con, width, height);
> >  dpy_gl_scanout_dmabuf(scanout->con, &new_primary->buf);
> >
> >  if (old_primary) {
> > diff --git a/hw/vfio/display.c b/hw/vfio/display.c
> > index 1aa440c663..c962e5f88f 100644
> > --- a/hw/vfio/display.c
> > +++ b/hw/vfio/display.c
> > @@ -286,6 +286,7 @@ static void vfio_display_dmabuf_update(void *opaque)
> >  VFIOPCIDevice *vdev = opaque;
> >  VFIODisplay *dpy = vdev->dpy;
> >  VFIODMABuf *primary, *cursor;
> > +uint32_t width, height;
> >  bool free_bufs = false, new_cursor = false;
> >
> >  primary = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_PRIMARY);
> > @@ -296,10 +297,12 @@ static void vfio_display_dmabuf_update(void *opaque)
> >  return;
> >  }
> >
> > +width = dpy_gl_dmabuf_get_width(&primary->buf);
> > +height = dpy_gl_dmabuf_get_height(&primary->buf);
> > +
> >  if (dpy->dmabuf.primary != primary) {
> >  dpy->dmabuf.primary = primary;
> > -qemu_console_resize(dpy->con,
> > -primary->buf.width, primary->buf.height);
> > +qemu_console_resize(dpy->con, width, height);
> >  dpy_gl_scanout_dmabuf(dpy->con, &primary->buf);
> >  free_bufs = true;
> >  }
> > @@ -328,7 +331,7 @@ static void vfio_display_dmabuf_update(void *opaque)
> >  cursor->pos_updates = 0;
> >  }
> >
> > -dpy_gl_update(dpy->con, 0, 0, primary->buf.width, primary->buf.height);
> > +dpy_gl_update(dpy->con, 0, 0, width, height);
> >
> >  if (free_bufs) {
> >  vfio_display_free_dmabufs(vdev);
> > diff --git a/ui/console.c b/ui/console.c
> > index 43226c5c14..1d0513a733 100644
> > --- a/ui/console.c

[PATCH] hw/ppc/spapr: Include missing 'sysemu/tcg.h' header

2024-03-22 Thread Philippe Mathieu-Daudé

"sysemu/tcg.h" declares tcg_enabled(), and is implicitly included.
Include it explicitly to avoid the following error when refactoring
headers:

  hw/ppc/spapr.c:2612:9: error: call to undeclared function 'tcg_enabled'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
if (tcg_enabled()) {
^

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ppc/spapr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c417f9dd52..e9bc97fee0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -35,6 +35,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/hostmem.h"
 #include "sysemu/numa.h"
+#include "sysemu/tcg.h"
 #include "sysemu/qtest.h"
 #include "sysemu/reset.h"
 #include "sysemu/runstate.h"
-- 
2.41.0

RE: [PATCH v4 3/3] ui/console: Introduce dpy_gl_create_dmabuf() helper

2024-03-22 Thread Kim, Dongwon

Hi Marc-André,

> -Original Message-
> From: Marc-André Lureau 
> Sent: Friday, March 22, 2024 2:06 AM
> To: Kim, Dongwon 
> Cc: qemu-devel@nongnu.org; phi...@linaro.org
> Subject: Re: [PATCH v4 3/3] ui/console: Introduce dpy_gl_create_dmabuf()
> helper
> 
> Hi
> 
> On Fri, Mar 22, 2024 at 3:45 AM  wrote:
> >
> > From: Dongwon Kim 
> >
> > dpy_gl_create_dmabuf() allocates a QemuDmaBuf and initializes its fields.
> > hw/display modules, hw/vfio and ui/dbus-listener now use this method
> > to create QemuDmaBuf instead of declaring and initializing it on their
> > own.
> >
> > Cc: Philippe Mathieu-Daudé 
> > Cc: Marc-André Lureau 
> > Cc: Vivek Kasireddy 
> > Signed-off-by: Dongwon Kim 
> > ---
> >  include/hw/vfio/vfio-common.h   |  2 +-
> >  include/hw/virtio/virtio-gpu.h  |  4 ++--
> >  include/ui/console.h|  6 ++
> >  hw/display/vhost-user-gpu.c | 33 ++---
> >  hw/display/virtio-gpu-udmabuf.c | 23 ---
> >  hw/vfio/display.c   | 26 +++---
> >  ui/console.c| 28 
> >  ui/dbus-listener.c  | 28 
> >  8 files changed, 86 insertions(+), 64 deletions(-)
> >
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index b9da6c08ef..d66e27db02 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -148,7 +148,7 @@ typedef struct VFIOGroup {
> >  } VFIOGroup;
> >
> >  typedef struct VFIODMABuf {
> > -QemuDmaBuf buf;
> > +QemuDmaBuf *buf;
> >  uint32_t pos_x, pos_y, pos_updates;
> >  uint32_t hot_x, hot_y, hot_updates;
> >  int dmabuf_id;
> > diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
> > index ed44cdad6b..56d6e821bf 100644
> > --- a/include/hw/virtio/virtio-gpu.h
> > +++ b/include/hw/virtio/virtio-gpu.h
> > @@ -169,7 +169,7 @@ struct VirtIOGPUBaseClass {
> >  DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)
> >
> >  typedef struct VGPUDMABuf {
> > -QemuDmaBuf buf;
> > +QemuDmaBuf *buf;
> >  uint32_t scanout_id;
> >  QTAILQ_ENTRY(VGPUDMABuf) next;
> >  } VGPUDMABuf;
> > @@ -238,7 +238,7 @@ struct VhostUserGPU {
> >  VhostUserBackend *vhost;
> >  int vhost_gpu_fd; /* closed by the chardev */
> >  CharBackend vhost_chr;
> > -QemuDmaBuf dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
> > +QemuDmaBuf *dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
> >  bool backend_blocked;
> >  };
> >
> > diff --git a/include/ui/console.h b/include/ui/console.h
> > index d5334a806c..01e998264b 100644
> > --- a/include/ui/console.h
> > +++ b/include/ui/console.h
> > @@ -358,6 +358,12 @@ void dpy_gl_cursor_dmabuf(QemuConsole *con, QemuDmaBuf *dmabuf,
> >  bool have_hot, uint32_t hot_x, uint32_t hot_y);
> >  void dpy_gl_cursor_position(QemuConsole *con,
> >  uint32_t pos_x, uint32_t pos_y);
> > +QemuDmaBuf *dpy_gl_create_dmabuf(uint32_t width, uint32_t height,
> > + uint32_t stride, uint32_t x,
> > + uint32_t y, uint32_t backing_width,
> > + uint32_t backing_height, uint32_t fourcc,
> > + uint64_t modifier, uint32_t dmabuf_fd,
> > + bool allow_fences, bool y0_top);
> >  uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);
> >  uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);
> >  int32_t dpy_gl_dmabuf_get_fd(QemuDmaBuf *dmabuf);
> > diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
> > index 709c8a02a1..0e49a934ed 100644
> > --- a/hw/display/vhost-user-gpu.c
> > +++ b/hw/display/vhost-user-gpu.c
> > @@ -249,6 +249,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
> >  case VHOST_USER_GPU_DMABUF_SCANOUT: {
> >  VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
> >  int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
> > +uint64_t modifier = 0;
> >  QemuDmaBuf *dmabuf;
> >
> >  if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
> > @@ -261,30 +262,32 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
> >
> >  g->parent_obj.enable = 1;
> >  con = g->parent_obj.scanout[m->scanout_id].con;
> > -dmabuf = &g->dmabuf[m->scanout_id];
> > -if (dmabuf->fd >= 0) {
> > -close(dmabuf->fd);
> > -dmabuf->fd = -1;
> > +dmabuf = g->dmabuf[m->scanout_id];
> > +if (dmabuf) {
> > +int dmabuf_fd = dpy_gl_dmabuf_get_fd(dmabuf);
> > +if (dmabuf_fd >= 0) {
> > +close(dmabuf_fd);
> > +}
> > +dpy_gl_release_dmabuf(con, dmabuf);
> >  }
> > -dpy_gl_release_dmabuf(con, dmabuf);
> > +
> >  if (fd == -1) {
> >  dpy_gl_scanout_disable(con);
> >

Re: [PATCH v2 2/2] hw/arm: Add nucleo-g071rb board

2024-03-22 Thread Peter Maydell

On Wed, 20 Mar 2024 at 20:21, Felipe Balbi  wrote:
>
> This board is based around STM32G071RB SoC, a Cortex-M0 based
> device. More information can be found at:
>
> https://www.st.com/en/product/nucleo-g071rb.html

Could you put this URL in a comment in the source file too, please?

>
> Signed-off-by: Felipe Balbi 
> ---
>
> Changes since v1:
>
> - Convert tabs to spaces (checkpatch.pl)
>
>  MAINTAINERS|  6 
>  hw/arm/Kconfig |  6 
>  hw/arm/meson.build |  1 +
>  hw/arm/nucleo-g071rb.c | 70 ++
>  4 files changed, 83 insertions(+)
>  create mode 100644 hw/arm/nucleo-g071rb.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bce2eb3ad70b..052ce4dcfb97 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1116,6 +1116,12 @@ L: qemu-...@nongnu.org
>  S: Maintained
>  F: hw/arm/netduinoplus2.c
>
> +Nucleo G071RB
> +M: Felipe Balbi 
> +L: qemu-...@nongnu.org
> +S: Maintained
> +F: hw/arm/nucleo-g071rb.c
> +
>  Olimex STM32 H405
>  M: Felipe Balbi 
>  L: qemu-...@nongnu.org
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 28a46d2b1ad3..5938bb8208a1 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -310,6 +310,12 @@ config STM32VLDISCOVERY
>  depends on TCG && ARM
>  select STM32F100_SOC
>
> +config NUCLEO_G071RB
> +bool
> +default y
> +depends on TCG && ARM
> +select STM32G000_SOC
> +
>  config STRONGARM
>  bool
>  select PXA2XX
> diff --git a/hw/arm/meson.build b/hw/arm/meson.build
> index 9c4137a988e1..580c2d55fc3f 100644
> --- a/hw/arm/meson.build
> +++ b/hw/arm/meson.build
> @@ -18,6 +18,7 @@ arm_ss.add(when: 'CONFIG_REALVIEW', if_true: files('realview.c'))
>  arm_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa-ref.c'))
>  arm_ss.add(when: 'CONFIG_STELLARIS', if_true: files('stellaris.c'))
>  arm_ss.add(when: 'CONFIG_STM32VLDISCOVERY', if_true: files('stm32vldiscovery.c'))
> +arm_ss.add(when: 'CONFIG_NUCLEO_G071RB', if_true: files('nucleo-g071rb.c'))
>  arm_ss.add(when: 'CONFIG_ZYNQ', if_true: files('xilinx_zynq.c'))
>  arm_ss.add(when: 'CONFIG_SABRELITE', if_true: files('sabrelite.c'))
>
> diff --git a/hw/arm/nucleo-g071rb.c b/hw/arm/nucleo-g071rb.c
> new file mode 100644
> index 000000000000..580b52bacf2c
> --- /dev/null
> +++ b/hw/arm/nucleo-g071rb.c
> @@ -0,0 +1,70 @@
> +/*
> + * ST Nucleo G071RB
> + *
> + * Copyright (c) 2024 Felipe Balbi 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/boards.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/qdev-clock.h"
> +#include "qemu/error-report.h"
> +#include "hw/arm/stm32g000_soc.h"
> +#include "hw/arm/boot.h"
> +
> +/* nucleo_g071rb implementation is derived from olimex-stm32-h405.c */
> +
> +/* Main SYSCLK frequency in Hz (48MHz) */
> +#define SYSCLK_FRQ 48000000ULL
> +
> +static void nucleo_g071rb_init(MachineState *machine)
> +{
> +DeviceState *dev;
> +Clock *sysclk;
> +
> +/* This clock doesn't need migration because it is fixed-frequency */
> +sysclk = clock_new(OBJECT(machine), "SYSCLK");
> +clock_set_hz(sysclk, SYSCLK_FRQ);
> +
> +dev = qdev_new(TYPE_STM32G000_SOC);
> +object_property_add_child(OBJECT(machine), "soc", OBJECT(dev));
> +qdev_connect_clock_in(dev, "sysclk", sysclk);
> +sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> +
> +armv7m_load_kernel(ARM_CPU(first_cpu),
> +   machine->kernel_filename,
> +   0, FLASH_SIZE);
> +}
> +
> +static void nucleo_g071rb_machine_init(MachineClass *mc)
> +{
> +static const char * const valid_cpu_types[] = {
> +ARM_CPU_TYPE_NAME("cortex-m0"),
> +NULL
> +};
> +
> +mc->desc = "ST Nucleo-G071RB (Cortex-M0)";
> +mc->init = nucleo_g071rb_init;
> +mc->valid_cpu_types = valid_cpu_types;
> +}

Like the olim

Re: [PATCH v2 1/2] hw/arm: Add support for stm32g000 SoC family

2024-03-22 Thread Peter Maydell

On Wed, 20 Mar 2024 at 20:21, Felipe Balbi  wrote:
>
> Minimal support with USARTs and SPIs working. This SoC will be used to
> create a nucleo-g071rb board.
>
> Signed-off-by: Felipe Balbi 

Hi; thanks for this patchset, it looks pretty good, so I think
my review comments are mostly going to be fairly minor.

A note on timing: we're currently in freeze for the QEMU 9.0 release,
so although we can code review this patchset now, it won't go
upstream until we've released 9.0 and reopened the git trunk for
development (that's scheduled for mid-to-late April).

Do you plan to contribute further devices for this SoC in future,
or is the subset modelled in this patchset sufficient for your
uses? (I don't mind either way, just curious.)


> ---
>
> Changes since v1:
> - Convert tabs to spaces (checkpatch.pl)
> - Correct lines longer than 80 characters (checkpatch.pl)
> - Correct num-prio-bits (Samuel Tardieu)
> - Correct num-irqs (Found reviewing RM0444)
>
>  MAINTAINERS|   7 +
>  hw/arm/Kconfig |   6 +
>  hw/arm/meson.build |   1 +
>  hw/arm/stm32g000_soc.c | 253 +
>  include/hw/arm/stm32g000_soc.h |  62 

The reference manual calls this SoC family "STM32G0x0", so I
think we should be in line with that and use stm32g0x0 in
filenames etc rather than 000. (This also matches what we've
done with the stm32l4x5.)

>  5 files changed, 329 insertions(+)
>  create mode 100644 hw/arm/stm32g000_soc.c
>  create mode 100644 include/hw/arm/stm32g000_soc.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 409d7db4d457..bce2eb3ad70b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1134,6 +1134,13 @@ F: hw/misc/stm32l4x5_rcc.c
>  F: hw/gpio/stm32l4x5_gpio.c
>  F: include/hw/*/stm32l4x5_*.h
>
> +STM32G000 SoC Family
> +M: Felipe Balbi 
> +L: qemu-...@nongnu.org
> +S: Maintained
> +F: hw/arm/stm32g000_soc.c
> +F: include/hw/*/stm32g000_*.h
> +
>  B-L475E-IOT01A IoT Node
>  M: Arnaud Minier 
>  M: Inès Varhol 
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 893a7bff66b9..28a46d2b1ad3 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -463,6 +463,12 @@ config STM32F405_SOC
>  select STM32F4XX_SYSCFG
>  select STM32F4XX_EXTI
>
> +config STM32G000_SOC
> +bool
> +select ARM_V7M
> +select STM32F2XX_USART
> +select STM32F2XX_SPI
> +
>  config B_L475E_IOT01A
>  bool
>  default y
> diff --git a/hw/arm/meson.build b/hw/arm/meson.build
> index 6808135c1f79..9c4137a988e1 100644
> --- a/hw/arm/meson.build
> +++ b/hw/arm/meson.build
> @@ -34,6 +34,7 @@ arm_ss.add(when: ['CONFIG_RASPI', 'TARGET_AARCH64'], if_true: files('bcm2838.c',
>  arm_ss.add(when: 'CONFIG_STM32F100_SOC', if_true: files('stm32f100_soc.c'))
>  arm_ss.add(when: 'CONFIG_STM32F205_SOC', if_true: files('stm32f205_soc.c'))
>  arm_ss.add(when: 'CONFIG_STM32F405_SOC', if_true: files('stm32f405_soc.c'))
> +arm_ss.add(when: 'CONFIG_STM32G000_SOC', if_true: files('stm32g000_soc.c'))
>  arm_ss.add(when: 'CONFIG_B_L475E_IOT01A', if_true: files('b-l475e-iot01a.c'))
>  arm_ss.add(when: 'CONFIG_STM32L4X5_SOC', if_true: files('stm32l4x5_soc.c'))
>  arm_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: files('xlnx-zynqmp.c', 'xlnx-zcu102.c'))
> diff --git a/hw/arm/stm32g000_soc.c b/hw/arm/stm32g000_soc.c
> new file mode 100644
> index 000000000000..48531d41fcc7
> --- /dev/null
> +++ b/hw/arm/stm32g000_soc.c
> @@ -0,0 +1,253 @@
> +/*
> + * STM32G000 SoC
> + *
> + * Copyright (c) 2024 Felipe Balbi 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */


Somewhere near the top of the file it's nice to include a URL and
title for the documentation, if publicly available. I think in this
case it's

RM0454 Reference manual STM32G0x0 advanced Arm-based 32-bit MCUs
https://www.st.com/resource/en/reference_manual/rm0454-stm32g0x0-advanced-armbased-32bit-mcus-stm

[PULL 2/3] migration/postcopy: Fix high frequency sync

2024-03-22 Thread peterx

From: Peter Xu 

With the current code base I can observe an extremely high sync count during
precopy, as long as one enables postcopy-ram=on before switchover to
postcopy.

To provide some context of when QEMU decides to do a full sync: it checks
must_precopy (which implies "data must be sent during precopy phase"), and
as long as it is lower than the threshold size we calculated (out of
bandwidth and expected downtime) QEMU will kick off the slow/exact sync.

However, when postcopy is enabled (even if still during precopy phase), RAM
reports all pages as can_postcopy and reports must_precopy==0.  Then
"must_precopy <= threshold_size" almost always triggers and enforces a slow
sync for every call to migration_iteration_run() when postcopy is enabled,
even if it is not used.  That is insane.

It turns out it was a regression introduced in the previous refactoring in
8.0, as reported by Nina [1]:

  (a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")

Then a workaround patch was applied at the end of the release (8.0-rc4) to fix it:

  (b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")

However that "workaround" was overlooked during the cleanup in this
9.0 release, in this commit:

  (c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")

Then the issue was re-exposed as reported by Nina [1].

The problem with (b) is that it only fixed the case for RAM, rather than
for all the other iterators.  Here a slow sync should only be required if all
dirty data (precopy+postcopy) is less than the threshold_size that QEMU
calculated.  It is even debatable whether a sync is needed once switched to
postcopy.  Currently ram_state_pending_exact() is mostly a noop once
switched to postcopy, and that logic seems to apply to all the other
iterators too, as syncing the dirty bitmap during postcopy doesn't make much
sense.  However let's leave such a change for later, as we're in the rc phase.

So rather than reusing commit (b), this patch provides the complete fix for
all iterators.  While at it, clean up the surrounding lines a little.
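
As a rough illustration of the two conditions, here is a standalone sketch.
It is not the QEMU code: toy_exact_sync_before() and toy_exact_sync_after()
are hypothetical helpers that only mirror the comparisons described above,
and the sizes used with them are made-up numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pre-patch condition: only the must_precopy estimate is compared. */
static bool toy_exact_sync_before(uint64_t must_precopy,
                                  uint64_t threshold_size)
{
    return must_precopy <= threshold_size;
}

/* Post-patch condition: all pending dirty data is compared. */
static bool toy_exact_sync_after(uint64_t must_precopy,
                                 uint64_t can_postcopy,
                                 uint64_t threshold_size)
{
    /* Toy precondition: the sum must not wrap around. */
    assert(must_precopy <= UINT64_MAX - can_postcopy);

    uint64_t pending_size = must_precopy + can_postcopy;

    return pending_size < threshold_size;
}
```

With postcopy-ram=on, RAM reports must_precopy == 0, so the first helper
returns true for any threshold, while the second only returns true once the
total pending data has really dropped below the threshold.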

[1] https://gitlab.com/qemu-project/qemu/-/issues/1565

Reported-by: Nina Schoetterl-Glausch 
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240320214453.584374-1-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 migration/migration.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 047b6b49cf..9fe8fd2afd 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3199,17 +3199,16 @@ typedef enum {
  */
 static MigIterateState migration_iteration_run(MigrationState *s)
 {
-uint64_t must_precopy, can_postcopy;
+uint64_t must_precopy, can_postcopy, pending_size;
 Error *local_err = NULL;
 bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
 bool can_switchover = migration_can_switchover(s);
 
 qemu_savevm_state_pending_estimate(&must_precopy, &can_postcopy);
-uint64_t pending_size = must_precopy + can_postcopy;
-
+pending_size = must_precopy + can_postcopy;
 trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);
 
-if (must_precopy <= s->threshold_size) {
+if (pending_size < s->threshold_size) {
 qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
 pending_size = must_precopy + can_postcopy;
 trace_migrate_pending_exact(pending_size, must_precopy, can_postcopy);
-- 
2.44.0

[PULL 3/3] migration/multifd: Fix clearing of mapped-ram zero pages

2024-03-22 Thread peterx

From: Fabiano Rosas 

When the zero page detection is done in the multifd threads, we need
to iterate the second part of the pages->offset array and clear the
file bitmap for each zero page. The piece of code we merged to do that
is wrong.

The reason this has passed all the tests is because the bitmap is
initialized with zeroes already, so clearing the bits only really has
an effect during live migration and when a data page goes from having
data to no data.
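
A standalone sketch of the intended index ranges, for illustration only:
ToyPages and toy_set_file_bitmap() are hypothetical stand-ins, with a plain
bool array in place of the ramblock file bitmap. It assumes, as described
above, that the first normal_num entries of the offset array are data pages
and the remaining entries up to num are zero pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_PAGES 8

/* Toy stand-in for the pages structure; not MultiFDPages_t. */
typedef struct ToyPages {
    uint32_t num;                   /* total entries in offset[] */
    uint32_t normal_num;            /* leading entries that carry data */
    uint32_t offset[TOY_MAX_PAGES]; /* page indexes into the bitmap */
} ToyPages;

static void toy_set_file_bitmap(const ToyPages *pages, bool *bitmap)
{
    assert(pages->normal_num <= pages->num);

    /* First part of the array: pages with data, set their bits. */
    for (uint32_t i = 0; i < pages->normal_num; i++) {
        bitmap[pages->offset[i]] = true;
    }

    /* Second part of the array: zero pages, clear their bits. */
    for (uint32_t i = pages->normal_num; i < pages->num; i++) {
        bitmap[pages->offset[i]] = false;
    }
}
```

For comparison, a loop that starts at num and runs while i < num - normal_num
never executes, because the start index is already at least as large as the
bound.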

Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240321201242.6009-1-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/multifd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index d2f0238f70..2802afe79d 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -111,7 +111,6 @@ void multifd_send_channel_created(void)
 static void multifd_set_file_bitmap(MultiFDSendParams *p)
 {
 MultiFDPages_t *pages = p->pages;
-uint32_t zero_num = p->pages->num - p->pages->normal_num;
 
 assert(pages->block);
 
@@ -119,7 +118,7 @@ static void multifd_set_file_bitmap(MultiFDSendParams *p)
 ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], true);
 }
 
-for (int i = p->pages->num; i < zero_num; i++) {
+for (int i = p->pages->normal_num; i < p->pages->num; i++) {
 ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], false);
 }
 }
-- 
2.44.0

[PATCH-for-9.1 v2 0/3] exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET

2024-03-22 Thread Philippe Mathieu-Daudé

Since v1:
- prerequisites merged
- s/CONFIG_TARGET/COMPILING_PER_TARGET/ (Peter)

'NEED_CPU_H' guards target-specific code.
Clarify by renaming it to COMPILING_PER_TARGET.

Philippe Mathieu-Daudé (3):
  gdbstub: Simplify #ifdef'ry in helpers.h
  hw/core: Remove check on NEED_CPU_H in tcg-cpu-ops.h
  exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET

 meson.build   | 4 ++--
 include/exec/cpu-defs.h   | 2 +-
 include/exec/helper-head.h| 4 ++--
 include/exec/memop.h  | 4 ++--
 include/exec/memory.h | 4 ++--
 include/exec/tswap.h  | 4 ++--
 include/gdbstub/helpers.h | 9 -
 include/hw/core/cpu.h | 4 ++--
 include/hw/core/tcg-cpu-ops.h | 3 ---
 include/qemu/osdep.h  | 2 +-
 include/sysemu/hvf.h  | 8 
 include/sysemu/kvm.h  | 6 +++---
 include/sysemu/nvmm.h | 4 ++--
 include/sysemu/whpx.h | 4 ++--
 include/sysemu/xen.h  | 4 ++--
 target/arm/kvm-consts.h   | 4 ++--
 scripts/analyze-inclusions| 6 +++---
 17 files changed, 36 insertions(+), 40 deletions(-)

-- 
2.41.0

[PATCH-for-9.1 v2 2/3] hw/core: Remove check on NEED_CPU_H in tcg-cpu-ops.h

2024-03-22 Thread Philippe Mathieu-Daudé

Commit fd3f7d24d4 ("include/hw/core: Remove i386 conditional
on fake_user_interrupt") removed the need to check NEED_CPU_H.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 include/hw/core/tcg-cpu-ops.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index bf8ff8e3ee..dc1f16a977 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -49,7 +49,6 @@ struct TCGCPUOps {
 /** @debug_excp_handler: Callback for handling debug exceptions */
 void (*debug_excp_handler)(CPUState *cpu);
 
-#ifdef NEED_CPU_H
 #ifdef CONFIG_USER_ONLY
 /**
  * @fake_user_interrupt: Callback for 'fake exception' handling.
@@ -174,8 +173,6 @@ struct TCGCPUOps {
  */
 bool (*need_replay_interrupt)(int interrupt_request);
 #endif /* !CONFIG_USER_ONLY */
-#endif /* NEED_CPU_H */
-
 };
 
 #if defined(CONFIG_USER_ONLY)
-- 
2.41.0

[PATCH-for-9.1 v2 3/3] exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET

2024-03-22 Thread Philippe Mathieu-Daudé

'NEED_CPU_H' guards target-specific code; it is defined by meson
together with the 'CONFIG_TARGET' definition. Rename NEED_CPU_H
to COMPILING_PER_TARGET to clarify its meaning.

Mechanical change running:

 $ sed -i s/NEED_CPU_H/COMPILING_PER_TARGET/g $(git grep -l NEED_CPU_H)

then manually add a /* COMPILING_PER_TARGET */ comment
after the '#endif' when the block is large.

Inspired-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 meson.build| 4 ++--
 include/exec/cpu-defs.h| 2 +-
 include/exec/helper-head.h | 4 ++--
 include/exec/memop.h   | 4 ++--
 include/exec/memory.h  | 4 ++--
 include/exec/tswap.h   | 4 ++--
 include/gdbstub/helpers.h  | 2 +-
 include/hw/core/cpu.h  | 4 ++--
 include/qemu/osdep.h   | 2 +-
 include/sysemu/hvf.h   | 8 
 include/sysemu/kvm.h   | 6 +++---
 include/sysemu/nvmm.h  | 4 ++--
 include/sysemu/whpx.h  | 4 ++--
 include/sysemu/xen.h   | 4 ++--
 target/arm/kvm-consts.h| 4 ++--
 scripts/analyze-inclusions | 6 +++---
 16 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/meson.build b/meson.build
index b375248a76..f13ad52f5f 100644
--- a/meson.build
+++ b/meson.build
@@ -3637,7 +3637,7 @@ foreach d, list : target_modules
 if target.endswith('-softmmu')
   config_target = config_target_mak[target]
   target_inc = [include_directories('target' / 
config_target['TARGET_BASE_ARCH'])]
-  c_args = ['-DNEED_CPU_H',
+  c_args = ['-DCOMPILING_PER_TARGET',
 '-DCONFIG_TARGET="@0@-config-target.h"'.format(target),
 '-DCONFIG_DEVICES="@0@-config-devices.h"'.format(target)]
   target_module_ss = module_ss.apply(config_target, strict: false)
@@ -3820,7 +3820,7 @@ foreach target : target_dirs
   target_base_arch = config_target['TARGET_BASE_ARCH']
   arch_srcs = [config_target_h[target]]
   arch_deps = []
-  c_args = ['-DNEED_CPU_H',
+  c_args = ['-DCOMPILING_PER_TARGET',
 '-DCONFIG_TARGET="@0@-config-target.h"'.format(target),
 '-DCONFIG_DEVICES="@0@-config-devices.h"'.format(target)]
   link_args = emulator_link_args
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 3915438b83..0dbef3010c 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -19,7 +19,7 @@
 #ifndef CPU_DEFS_H
 #define CPU_DEFS_H
 
-#ifndef NEED_CPU_H
+#ifndef COMPILING_PER_TARGET
 #error cpu.h included from common code
 #endif
 
diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h
index 28ceab0a46..5ef467a79d 100644
--- a/include/exec/helper-head.h
+++ b/include/exec/helper-head.h
@@ -43,7 +43,7 @@
 #define dh_ctype_noreturn G_NORETURN void
 #define dh_ctype(t) dh_ctype_##t
 
-#ifdef NEED_CPU_H
+#ifdef COMPILING_PER_TARGET
 # ifdef TARGET_LONG_BITS
 #  if TARGET_LONG_BITS == 32
 #   define dh_alias_tl i32
@@ -54,7 +54,7 @@
 #  endif
 # endif
 # define dh_ctype_tl target_ulong
-#endif
+#endif /* COMPILING_PER_TARGET */
 
 /* We can't use glue() here because it falls foul of C preprocessor
recursive expansion rules.  */
diff --git a/include/exec/memop.h b/include/exec/memop.h
index a86dc6743a..06417ff361 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -35,7 +35,7 @@ typedef enum MemOp {
 MO_LE= 0,
 MO_BE= MO_BSWAP,
 #endif
-#ifdef NEED_CPU_H
+#ifdef COMPILING_PER_TARGET
 #if TARGET_BIG_ENDIAN
 MO_TE= MO_BE,
 #else
@@ -135,7 +135,7 @@ typedef enum MemOp {
 MO_BESL  = MO_BE | MO_SL,
 MO_BESQ  = MO_BE | MO_SQ,
 
-#ifdef NEED_CPU_H
+#ifdef COMPILING_PER_TARGET
 MO_TEUW  = MO_TE | MO_UW,
 MO_TEUL  = MO_TE | MO_UL,
 MO_TEUQ  = MO_TE | MO_UQ,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b3..bb51e90fe1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -3087,7 +3087,7 @@ address_space_write_cached(MemoryRegionCache *cache, 
hwaddr addr,
 MemTxResult address_space_set(AddressSpace *as, hwaddr addr,
   uint8_t c, hwaddr len, MemTxAttrs attrs);
 
-#ifdef NEED_CPU_H
+#ifdef COMPILING_PER_TARGET
 /* enum device_endian to MemOp.  */
 static inline MemOp devend_memop(enum device_endian end)
 {
@@ -3105,7 +3105,7 @@ static inline MemOp devend_memop(enum device_endian end)
 return (end == non_host_endianness) ? MO_BSWAP : 0;
 #endif
 }
-#endif
+#endif /* COMPILING_PER_TARGET */
 
 /*
  * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
diff --git a/include/exec/tswap.h b/include/exec/tswap.h
index 68944a880b..5089cd6a4c 100644
--- a/include/exec/tswap.h
+++ b/include/exec/tswap.h
@@ -15,11 +15,11 @@
  * If we're in target-specific code, we can hard-code the swapping
  * condition, otherwise we have to do (slower) run-time checks.
  */
-#ifdef NEED_CPU_H
+#ifdef COMPILING_PER_TARGET
 #define target_needs_bswap()  (HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN)
 #else
 #define target_needs_bswap()  (target_words_bigendian() != HOST_BIG_E

[PULL 0/3] Migration 20240322 patches

2024-03-22 Thread peterx

From: Peter Xu 

The following changes since commit 853546f8128476eefb701d4a55b2781bb3a46faa:

  Merge tag 'pull-loongarch-20240322' of https://gitlab.com/gaosong/qemu into 
staging (2024-03-22 10:59:57 +)

are available in the Git repository at:

  https://gitlab.com/peterx/qemu.git tags/migration-20240322-pull-request

for you to fetch changes up to 8fa1a21c6edc2bf7de85984944848ab9ac49e937:

  migration/multifd: Fix clearing of mapped-ram zero pages (2024-03-22 12:12:08 
-0400)


Migration pull for 9.0-rc1

- Fabiano's patch to revert fd: support on mapped-ram
- Peter's fix on postcopy regression on unnecessary dirty syncs
- Fabiano's fix for rare mapped-ram corruption in zero page handling



Fabiano Rosas (2):
  migration: Revert mapped-ram multifd support to fd: URI
  migration/multifd: Fix clearing of mapped-ram zero pages

Peter Xu (1):
  migration/postcopy: Fix high frequency sync

 migration/fd.h   |  2 --
 migration/fd.c   | 56 
 migration/file.c | 19 ++--
 migration/migration.c| 20 ++---
 migration/multifd.c  |  5 +---
 tests/qtest/migration-test.c | 43 ---
 6 files changed, 12 insertions(+), 133 deletions(-)

-- 
2.44.0

[PULL 1/3] migration: Revert mapped-ram multifd support to fd: URI

2024-03-22 Thread peterx

From: Fabiano Rosas 

This reverts commit decdc76772c453ff1444612e910caa0d45cd8eac in full
and also the relevant migration-tests from
7a09f092834641b7a793d50a3a261073bbb404a6.

After the addition of the new QAPI-based migration address API in 8.2
we've been converting an "fd:" URI into a SocketAddress, missing the
fact that the "fd:" syntax could also be used for a plain file instead
of a socket. This is a problem because the SocketAddress is part of
the API, so we're effectively asking users to create a "socket"
channel to pass in a plain file.

The easiest way to fix this situation is to deprecate the usage of
both SocketAddress and "fd:" when used with a plain file for
migration. Since this has been possible since 8.2, we can wait until
9.1 to deprecate it.

For 9.0, however, we should avoid adding further support to migration
to a plain file using the old "fd:" syntax or the new SocketAddress
API, and instead require the usage of either the old-style "file:" URI
or the FileMigrationArgs::filename field of the new API with the
"/dev/fdset/NN" syntax, both of which are already supported.

Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240319210941.1907-1-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/fd.h   |  2 --
 migration/fd.c   | 56 
 migration/file.c | 19 ++--
 migration/migration.c| 13 -
 migration/multifd.c  |  2 --
 tests/qtest/migration-test.c | 43 ---
 6 files changed, 8 insertions(+), 127 deletions(-)

diff --git a/migration/fd.h b/migration/fd.h
index 0c0a18d9e7..b901bc014e 100644
--- a/migration/fd.h
+++ b/migration/fd.h
@@ -20,6 +20,4 @@ void fd_start_incoming_migration(const char *fdname, Error 
**errp);
 
 void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
  Error **errp);
-void fd_cleanup_outgoing_migration(void);
-int fd_args_get_fd(void);
 #endif
diff --git a/migration/fd.c b/migration/fd.c
index fe0d096abd..449adaa2de 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -15,42 +15,19 @@
  */
 
 #include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "channel.h"
 #include "fd.h"
 #include "file.h"
 #include "migration.h"
 #include "monitor/monitor.h"
-#include "io/channel-file.h"
-#include "io/channel-socket.h"
 #include "io/channel-util.h"
-#include "options.h"
 #include "trace.h"
 
 
-static struct FdOutgoingArgs {
-int fd;
-} outgoing_args;
-
-int fd_args_get_fd(void)
-{
-return outgoing_args.fd;
-}
-
-void fd_cleanup_outgoing_migration(void)
-{
-if (outgoing_args.fd > 0) {
-close(outgoing_args.fd);
-outgoing_args.fd = -1;
-}
-}
-
 void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error 
**errp)
 {
 QIOChannel *ioc;
 int fd = monitor_get_fd(monitor_cur(), fdname, errp);
-int newfd;
-
 if (fd == -1) {
 return;
 }
@@ -62,18 +39,6 @@ void fd_start_outgoing_migration(MigrationState *s, const 
char *fdname, Error **
 return;
 }
 
-/*
- * This is dup()ed just to avoid referencing an fd that might
- * be already closed by the iochannel.
- */
-newfd = dup(fd);
-if (newfd == -1) {
-error_setg_errno(errp, errno, "Could not dup FD %d", fd);
-object_unref(ioc);
-return;
-}
-outgoing_args.fd = newfd;
-
 qio_channel_set_name(ioc, "migration-fd-outgoing");
 migration_channel_connect(s, ioc, NULL, NULL);
 object_unref(OBJECT(ioc));
@@ -104,20 +69,9 @@ void fd_start_incoming_migration(const char *fdname, Error 
**errp)
 return;
 }
 
-if (migrate_multifd()) {
-if (fd_is_socket(fd)) {
-error_setg(errp,
-   "Multifd migration to a socket FD is not supported");
-object_unref(ioc);
-return;
-}
-
-file_create_incoming_channels(ioc, errp);
-} else {
-qio_channel_set_name(ioc, "migration-fd-incoming");
-qio_channel_add_watch_full(ioc, G_IO_IN,
-   fd_accept_incoming_migration,
-   NULL, NULL,
-   g_main_context_get_thread_default());
-}
+qio_channel_set_name(ioc, "migration-fd-incoming");
+qio_channel_add_watch_full(ioc, G_IO_IN,
+   fd_accept_incoming_migration,
+   NULL, NULL,
+   g_main_context_get_thread_default());
 }
diff --git a/migration/file.c b/migration/file.c
index b6e8ba13f2..ab18ba505a 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -11,7 +11,6 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "channel.h"
-#include "fd.h"
 #include "file.h"
 #include "migration.h"
 #include "io/channel-file.h"
@@ -55,27 +54,15 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
 {
 QIOChannelFile *io

[PATCH-for-9.1 v2 1/3] gdbstub: Simplify #ifdef'ry in helpers.h

2024-03-22 Thread Philippe Mathieu-Daudé

Slightly simplify by checking NEED_CPU_H definition in header.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/gdbstub/helpers.h | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/gdbstub/helpers.h b/include/gdbstub/helpers.h
index c573aef2dc..53e88d807c 100644
--- a/include/gdbstub/helpers.h
+++ b/include/gdbstub/helpers.h
@@ -12,7 +12,10 @@
 #ifndef _GDBSTUB_HELPERS_H_
 #define _GDBSTUB_HELPERS_H_
 
-#ifdef NEED_CPU_H
+#ifndef NEED_CPU_H
+#error "gdbstub helpers should only be included by target specific code"
+#endif
+
 #include "cpu.h"
 
 /*
@@ -96,8 +99,4 @@ static inline uint8_t *gdb_get_reg_ptr(GByteArray *buf, int 
len)
 #define ldtul_p(addr) ldl_p(addr)
 #endif
 
-#else
-#error "gdbstub helpers should only be included by target specific code"
-#endif
-
 #endif /* _GDBSTUB_HELPERS_H_ */
-- 
2.41.0

[PATCH-for-9.0 1/2] hw/clock: Let clock_set_mul_div() return boolean value

2024-03-22 Thread Philippe Mathieu-Daudé

Let clock_set_mul_div() return a boolean value indicating whether
the clock has been updated or not, similarly to clock_set().

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/clock.h | 4 +++-
 hw/core/clock.c| 8 +++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/hw/clock.h b/include/hw/clock.h
index bb12117f67..eb58599131 100644
--- a/include/hw/clock.h
+++ b/include/hw/clock.h
@@ -357,6 +357,8 @@ char *clock_display_freq(Clock *clk);
  * @multiplier: multiplier value
  * @divider: divider value
  *
+ * @return: true if the clock is changed.
+ *
  * By default, a Clock's children will all run with the same period
  * as their parent. This function allows you to adjust the multiplier
  * and divider used to derive the child clock frequency.
@@ -374,6 +376,6 @@ char *clock_display_freq(Clock *clk);
  * Note that this function does not call clock_propagate(); the
  * caller should do that if necessary.
  */
-void clock_set_mul_div(Clock *clk, uint32_t multiplier, uint32_t divider);
+bool clock_set_mul_div(Clock *clk, uint32_t multiplier, uint32_t divider);
 
 #endif /* QEMU_HW_CLOCK_H */
diff --git a/hw/core/clock.c b/hw/core/clock.c
index d82e44cd1a..a19c7db7df 100644
--- a/hw/core/clock.c
+++ b/hw/core/clock.c
@@ -143,14 +143,20 @@ char *clock_display_freq(Clock *clk)
 return freq_to_str(clock_get_hz(clk));
 }
 
-void clock_set_mul_div(Clock *clk, uint32_t multiplier, uint32_t divider)
+bool clock_set_mul_div(Clock *clk, uint32_t multiplier, uint32_t divider)
 {
 assert(divider != 0);
 
+if (clk->multiplier == multiplier && clk->divider == divider) {
+return false;
+}
+
 trace_clock_set_mul_div(CLOCK_PATH(clk), clk->multiplier, multiplier,
 clk->divider, divider);
 clk->multiplier = multiplier;
 clk->divider = divider;
+
+return true;
 }
 
 static void clock_initfn(Object *obj)
-- 
2.41.0
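
For readers skimming the series, here is a minimal standalone sketch of
how a caller might use the new return value. It is not QEMU code:
DemoClock, demo_clock_propagate() and demo_update() are hypothetical
stand-ins, and the propagate step only counts calls instead of walking
child clocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's Clock; only the fields used here. */
typedef struct DemoClock {
    uint32_t multiplier;
    uint32_t divider;
    unsigned propagate_count;   /* number of demo_clock_propagate() calls */
} DemoClock;

/* Mirrors the patched logic: report whether anything actually changed. */
static bool demo_clock_set_mul_div(DemoClock *clk, uint32_t multiplier,
                                   uint32_t divider)
{
    assert(divider != 0);

    if (clk->multiplier == multiplier && clk->divider == divider) {
        return false;
    }
    clk->multiplier = multiplier;
    clk->divider = divider;
    return true;
}

/* Placeholder for propagation to children; here it only counts calls. */
static void demo_clock_propagate(DemoClock *clk)
{
    clk->propagate_count++;
}

/* Caller pattern: propagate only when the settings really changed. */
static void demo_update(DemoClock *clk, uint32_t mul, uint32_t div)
{
    if (demo_clock_set_mul_div(clk, mul, div)) {
        demo_clock_propagate(clk);
    }
}
```

Returning false for an unchanged setting lets callers skip a redundant
propagation pass without having to compare the old values themselves.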

[PATCH-for-9.0 2/2] hw/misc/stm32l4x5_rcc: Propagate period when enabling a clock

2024-03-22 Thread Philippe Mathieu-Daudé

From: Arnaud Minier 

The "clock_set_mul_div" function doesn't propagate the clock period
to the children if it is changed (e.g. by enabling/disabling a clock
multiplexer).
This was overlooked during the implementation due to late changes.

This commit propagates the change if the multiplier or divider changes.

Fixes: ec7d83acbd ("hw/misc/stm32l4x5_rcc: Add an internal clock multiplexer 
object")
Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 
Message-ID: <20240317103918.44375-2-arnaud.min...@telecom-paris.fr>
[PMD: Check clock_set_mul_div() return value]
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/misc/stm32l4x5_rcc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/misc/stm32l4x5_rcc.c b/hw/misc/stm32l4x5_rcc.c
index bc2d63528b..7ad628b296 100644
--- a/hw/misc/stm32l4x5_rcc.c
+++ b/hw/misc/stm32l4x5_rcc.c
@@ -59,7 +59,10 @@ static void clock_mux_update(RccClockMuxState *mux, bool 
bypass_source)
 freq_multiplier = mux->divider;
 }
 
-clock_set_mul_div(mux->out, freq_multiplier, mux->multiplier);
+if (clock_set_mul_div(mux->out, freq_multiplier, mux->multiplier)) {
+clock_propagate(mux->out);
+}
+
 clock_update(mux->out, clock_get(current_source));
 
 src_freq = clock_get_hz(current_source);
-- 
2.41.0

[PATCH-for-9.0 0/2] hw/clock: Propagate clock changes when STM32L4X5 MUX is updated

2024-03-22 Thread Philippe Mathieu-Daudé

Per 
https://www.qemu.org/docs/master/devel/clocks.html#clock-multiplier-and-divider-settings:

  Note that clock_set_mul_div() does not automatically call
  clock_propagate(). If you make a runtime change to the
  multiplier or divider you must call clock_propagate() yourself.

Fix what we forgot to do in recent commit ec7d83acbd
("hw/misc/stm32l4x5_rcc: Add an internal clock multiplexer object")

Arnaud Minier (1):
  hw/misc/stm32l4x5_rcc: Propagate period when enabling a clock

Philippe Mathieu-Daudé (1):
  hw/clock: Let clock_set_mul_div() return boolean value

 include/hw/clock.h  | 4 +++-
 hw/core/clock.c | 8 +++-
 hw/misc/stm32l4x5_rcc.c | 5 -
 3 files changed, 14 insertions(+), 3 deletions(-)

-- 
2.41.0

RE: [PATCH-for-9.1 08/27] target/hexagon: Convert to TCGCPUOps::get_cpu_state()

2024-03-22 Thread Brian Cain



> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Tuesday, March 19, 2024 10:43 AM
> To: qemu-devel@nongnu.org
> Cc: qemu-s3...@nongnu.org; Richard Henderson
> ; qemu-...@nongnu.org; qemu-
> a...@nongnu.org; qemu-ri...@nongnu.org; Anton Johansson ;
> Philippe Mathieu-Daudé ; Brian Cain
> 
> Subject: [PATCH-for-9.1 08/27] target/hexagon: Convert to
> TCGCPUOps::get_cpu_state()
> 
> Convert cpu_get_tb_cpu_state() to TCGCPUOps::get_cpu_state().
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Brian Cain 

> ---
>  target/hexagon/cpu.h | 14 --
>  target/hexagon/cpu.c | 13 +
>  2 files changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
> index 935a9c3276..1d42c33827 100644
> --- a/target/hexagon/cpu.h
> +++ b/target/hexagon/cpu.h
> @@ -134,20 +134,6 @@ struct ArchCPU {
> 
>  FIELD(TB_FLAGS, IS_TIGHT_LOOP, 0, 1)
> 
> -#define TARGET_HAS_CPU_GET_TB_CPU_STATE
> -
> -static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, vaddr *pc,
> -uint64_t *cs_base, uint32_t *flags)
> -{
> -uint32_t hex_flags = 0;
> -*pc = env->gpr[HEX_REG_PC];
> -*cs_base = 0;
> -if (*pc == env->gpr[HEX_REG_SA0]) {
> -hex_flags = FIELD_DP32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP, 1);
> -}
> -*flags = hex_flags;
> -}
> -
>  typedef HexagonCPU ArchCPU;
> 
>  void hexagon_translate_init(void);
> diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
> index 3a716b9be3..5e0a9441f2 100644
> --- a/target/hexagon/cpu.c
> +++ b/target/hexagon/cpu.c
> @@ -273,6 +273,18 @@ static void hexagon_restore_state_to_opc(CPUState
> *cs,
>  cpu_env(cs)->gpr[HEX_REG_PC] = data[0];
>  }
> 
> +static void hexagon_get_cpu_state(CPUHexagonState *env, vaddr *pc,
> +  uint64_t *cs_base, uint32_t *flags)
> +{
> +uint32_t hex_flags = 0;
> +*pc = env->gpr[HEX_REG_PC];
> +*cs_base = 0;
> +if (*pc == env->gpr[HEX_REG_SA0]) {
> +hex_flags = FIELD_DP32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP, 1);
> +}
> +*flags = hex_flags;
> +}
> +
>  static void hexagon_cpu_reset_hold(Object *obj)
>  {
>  CPUState *cs = CPU(obj);
> @@ -327,6 +339,7 @@ static const TCGCPUOps hexagon_tcg_ops = {
>  .initialize = hexagon_translate_init,
>  .synchronize_from_tb = hexagon_cpu_synchronize_from_tb,
>  .restore_state_to_opc = hexagon_restore_state_to_opc,
> +.get_cpu_state = hexagon_get_cpu_state,
>  };
> 
>  static void hexagon_cpu_class_init(ObjectClass *c, void *data)
> --
> 2.41.0
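
For reference, a self-contained sketch of what the moved hook computes.
This is an illustration under assumptions, not the QEMU implementation:
DemoEnv, the DEMO_* register indices and demo_deposit32() are
hypothetical stand-ins for CPUHexagonState, HEX_REG_* and FIELD_DP32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices and flag layout, for illustration only. */
#define DEMO_REG_SA0 0
#define DEMO_REG_PC  1
#define DEMO_NUM_REGS 2
#define DEMO_TIGHT_LOOP_SHIFT  0
#define DEMO_TIGHT_LOOP_LENGTH 1

typedef struct DemoEnv {
    uint32_t gpr[DEMO_NUM_REGS];
} DemoEnv;

/* Deposit 'field' into 'value' at bit 'start', 'length' bits wide. */
static uint32_t demo_deposit32(uint32_t value, int start, int length,
                               uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Report pc, cs_base and flags; flag a "tight loop" when pc == SA0. */
static void demo_get_cpu_state(const DemoEnv *env, uint64_t *pc,
                               uint64_t *cs_base, uint32_t *flags)
{
    uint32_t f = 0;

    *pc = env->gpr[DEMO_REG_PC];
    *cs_base = 0;
    if (*pc == env->gpr[DEMO_REG_SA0]) {
        f = demo_deposit32(f, DEMO_TIGHT_LOOP_SHIFT,
                           DEMO_TIGHT_LOOP_LENGTH, 1);
    }
    *flags = f;
}
```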

Re: [PATCH 3/3] qapi: Fix bogus documentation of query-migrationthreads

2024-03-22 Thread John Snow

On Fri, Mar 22, 2024, 9:51 AM Markus Armbruster  wrote:

> The doc comment documents an argument that doesn't exist.  Would
> fail compilation if it was marked up correctly.  Delete.
>
> The Returns: section fails to refer to the data type, leaving the user
> to guess.  Fix that.
>
> The command name violates QAPI naming rules: it should be
> query-migration-threads.  Too late to fix.
>
> Reported-by: John Snow 
> Fixes: 671326201dac (migration: Introduce interface query-migrationthreads)
> Signed-off-by: Markus Armbruster 
>

Reviewed-by: John Snow 

---
>  qapi/migration.json | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index f6238b6980..e47ad7a63b 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -2419,9 +2419,7 @@
>  #
>  # Returns information of migration threads
>  #
> -# data: migration thread name
> -#
> -# Returns: information about migration threads
> +# Returns: @MigrationThreadInfo
>  #
>  # Since: 7.2
>  ##
> --
> 2.44.0
>
>

Re: [PATCH 01/12] qapi: Drop stray Arguments: line from qmp_capabilities docs

2024-03-22 Thread John Snow

On Fri, Mar 22, 2024, 10:09 AM Markus Armbruster  wrote:

> Reported-by: John Snow 
> Fixes: 119ebac1feb2 (qapi-schema: use generated marshaller for
> 'qmp_capabilities')
> Signed-off-by: Markus Armbruster 
>

Reviewed-by: John Snow 

---
>  qapi/control.json | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/qapi/control.json b/qapi/control.json
> index f404daef60..6bdbf077c2 100644
> --- a/qapi/control.json
> +++ b/qapi/control.json
> @@ -11,8 +11,6 @@
>  #
>  # Enable QMP capabilities.
>  #
> -# Arguments:
> -#
>  # @enable: An optional list of QMPCapability values to enable.  The
>  # client must not enable any capability that is not mentioned in
>  # the QMP greeting message.  If the field is not provided, it
> --
> 2.44.0
>
>

Re: [PATCH 5/7] target/i386: introduce x86-confidential-guest

2024-03-22 Thread Xiaoyao Li


On 3/19/2024 9:59 PM, Paolo Bonzini wrote:

Introduce a common superclass for x86 confidential guest implementations.
It will extend ConfidentialGuestSupportClass with a method that provides
the VM type to be passed to KVM_CREATE_VM.

Signed-off-by: Paolo Bonzini 


Reviewed-by: Xiaoyao Li

Re: [PATCH 0/3] qapi/migration: Doc fixes

2024-03-22 Thread Peter Xu

On Fri, Mar 22, 2024 at 02:51:14PM +0100, Markus Armbruster wrote:
> I'd like to get these into the release.  Please review.
> 
> Markus Armbruster (3):
>   qapi: Improve migration TLS documentation
>   qapi: Resync MigrationParameter and MigrateSetParameters
>   qapi: Fix bogus documentation of query-migrationthreads

With Fabiano's fixup squashed:

Reviewed-by: Peter Xu 

Thanks!

-- 
Peter Xu

Re: [PATCH 6/7] target/i386: Implement mc->kvm_type() to get VM type

2024-03-22 Thread Xiaoyao Li


On 3/19/2024 9:59 PM, Paolo Bonzini wrote:

From: Xiaoyao Li 

KVM is introducing a new API to create confidential guests, which
will be used by TDX and SEV-SNP but is also available for SEV and
SEV-ES.  The API uses the VM type argument to KVM_CREATE_VM to
identify which confidential computing technology to use.

Since there are no other expected uses of VM types, delegate
mc->kvm_type() for x86 boards to the confidential-guest-support
object pointed to by ms->cgs.
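
The delegation described above can be sketched in isolation as follows.
This is a simplified model, not the patch itself: the Demo* types and
the DEMO_SEV_VM value are hypothetical, and treating a missing
confidential-guest object as "default VM type" is an assumption made
for the example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DEFAULT_VM 0
#define DEMO_SEV_VM     2   /* hypothetical value, for illustration */

typedef struct DemoGuest DemoGuest;

typedef struct DemoGuestClass {
    /* Optional callback: which VM type to pass when creating the VM. */
    int (*kvm_type)(DemoGuest *cg);
} DemoGuestClass;

struct DemoGuest {
    const DemoGuestClass *klass;
};

/* Example implementation a SEV-like subclass might provide. */
static int demo_sev_kvm_type(DemoGuest *cg)
{
    (void)cg;
    return DEMO_SEV_VM;
}

/* Call the class callback if present, else use the default VM type. */
static int demo_guest_kvm_type(DemoGuest *cg)
{
    assert(cg != NULL && cg->klass != NULL);
    return cg->klass->kvm_type ? cg->klass->kvm_type(cg) : DEMO_DEFAULT_VM;
}

/* Board-level hook: delegate to the guest object when one is set. */
static int demo_machine_kvm_type(DemoGuest *cgs)
{
    return cgs ? demo_guest_kvm_type(cgs) : DEMO_DEFAULT_VM;
}
```

Keeping the callback optional means existing confidential-guest classes
that do not care about VM types need no change.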

For example, if a sev-guest object is specified to confidential-guest-support,
like,

   qemu -machine ...,confidential-guest-support=sev0 \
-object sev-guest,id=sev0,...

it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
is supported, and if so use them together with the KVM_SEV_INIT2
function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to
KVM_SEV_INIT and KVM_SEV_ES_INIT.

This is a preparatory work towards TDX and SEV-SNP support, but it
will also enable support for VMSA features such as DebugSwap, which
are only available via KVM_SEV_INIT2.

Co-developed-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 
Signed-off-by: Paolo Bonzini 


Reviewed-by: Xiaoyao Li 

some nits below.


---
  target/i386/confidential-guest.h | 19 ++
  target/i386/kvm/kvm_i386.h   |  2 ++
  hw/i386/x86.c|  6 +
  target/i386/kvm/kvm.c| 44 
  4 files changed, 71 insertions(+)

diff --git a/target/i386/confidential-guest.h b/target/i386/confidential-guest.h
index ca12d5a8fba..532e172a60b 100644
--- a/target/i386/confidential-guest.h
+++ b/target/i386/confidential-guest.h
@@ -36,5 +36,24 @@ struct X86ConfidentialGuest {
  struct X86ConfidentialGuestClass {
  /*  */
  ConfidentialGuestSupportClass parent;
+
+/*  */
+int (*kvm_type)(X86ConfidentialGuest *cg);
  };
+
+/**
+ * x86_confidential_guest_kvm_type:
+ *
+ * Calls #X86ConfidentialGuestClass.unplug callback of @plug_handler.


ah, forgot to change the callback name after copy+paste.


+ */
+static inline int x86_confidential_guest_kvm_type(X86ConfidentialGuest *cg)
+{
+X86ConfidentialGuestClass *klass = X86_CONFIDENTIAL_GUEST_GET_CLASS(cg);
+
+if (klass->kvm_type) {
+return klass->kvm_type(cg);
+} else {
+return 0;
+}
+}
  #endif
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 30fedcffea3..02168122787 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -37,6 +37,7 @@ bool kvm_hv_vpindex_settable(void);
  bool kvm_enable_sgx_provisioning(KVMState *s);
  bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
  
+int kvm_get_vm_type(MachineState *ms, const char *vm_type);

  void kvm_arch_reset_vcpu(X86CPU *cs);
  void kvm_arch_after_reset_vcpu(X86CPU *cpu);
  void kvm_arch_do_init_vcpu(X86CPU *cs);
@@ -49,6 +50,7 @@ void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask);
  
  #ifdef CONFIG_KVM
  
+bool kvm_is_vm_type_supported(int type);

  bool kvm_has_adjust_clock_stable(void);
  bool kvm_has_exception_payload(void);
  void kvm_synchronize_all_tsc(void);
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index ffbda48917f..2d4b148cd25 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1389,6 +1389,11 @@ static void machine_set_sgx_epc(Object *obj, Visitor *v, 
const char *name,
  qapi_free_SgxEPCList(list);
  }
  
+static int x86_kvm_type(MachineState *ms, const char *vm_type)

+{
+return kvm_enabled() ? kvm_get_vm_type(ms, vm_type) : 0;
+}
+
  static void x86_machine_initfn(Object *obj)
  {
  X86MachineState *x86ms = X86_MACHINE(obj);
@@ -1413,6 +1418,7 @@ static void x86_machine_class_init(ObjectClass *oc, void 
*data)
  mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
  mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
  mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
+mc->kvm_type = x86_kvm_type;
  x86mc->save_tsc_khz = true;
  x86mc->fwcfg_dma_enabled = true;
  nc->nmi_monitor_handler = x86_nmi;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 0ec69109a2b..e109648f260 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -31,6 +31,7 @@
  #include "sysemu/kvm_int.h"
  #include "sysemu/runstate.h"
  #include "kvm_i386.h"
+#include "../confidential-guest.h"
  #include "sev.h"
  #include "xen-emu.h"
  #include "hyperv.h"
@@ -161,6 +162,49 @@ static KVMMSRHandlers 
msr_handlers[KVM_MSR_FILTER_MAX_RANGES];
  static RateLimit bus_lock_ratelimit_ctrl;
  static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
  
+static const char *vm_type_name[] = {

+[KVM_X86_DEFAULT_VM] = "default",
+};
+
+bool kvm_is_vm_type_supported(int type)
+{
+uint32_t machine_types;


The name of machine_types confuses me a lot. why not supported_vm_types?


+
+/*
+ * old KVM doesn't support KVM_CAP_VM_TYPES but KVM_X86_DEFAULT_VM
+ * is always supported
+ */
+if (type == KVM_X86_DEFAULT_VM) {
+

Re: [PATCH for-9.0 v2] vhost-vdpa: check vhost_vdpa_set_vring_ready() return value

2024-03-22 Thread Philippe Mathieu-Daudé


On 22/3/24 10:23, Stefano Garzarella wrote:

vhost_vdpa_set_vring_ready() could already fail, but if Linux's
patch [1] is merged, it becomes more likely to fail when
userspace does not activate virtqueues before DRIVER_OK and
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is not negotiated.

So better check its return value anyway.
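
The general pattern is simple enough to show standalone. The sketch
below is not the vhost-vdpa code: DemoDev, demo_set_vring_ready() and
demo_enable_all() are hypothetical, and the failure is simulated. It
only shows a loop that stops and reports the first error instead of
ignoring it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device model: 'fail_at' simulates a failing virtqueue. */
typedef struct DemoDev {
    int nvqs;      /* number of virtqueues */
    int fail_at;   /* index that fails, or -1 for none */
    int enabled;   /* how many virtqueues were enabled */
} DemoDev;

/* Enable one virtqueue; returns 0 on success, negative errno on error. */
static int demo_set_vring_ready(DemoDev *d, int idx)
{
    assert(idx >= 0 && idx < d->nvqs);
    if (idx == d->fail_at) {
        return -EINVAL;
    }
    d->enabled++;
    return 0;
}

/* Enable every virtqueue, propagating the first failure to the caller. */
static int demo_enable_all(DemoDev *d)
{
    for (int i = 0; i < d->nvqs; i++) {
        int r = demo_set_vring_ready(d, i);
        if (r < 0) {
            return r;
        }
    }
    return 0;
}
```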

[1] 
https://lore.kernel.org/virtualization/20240206145154.118044-1-sgarz...@redhat.com/T/#u

Acked-by: Eugenio Pérez 
Acked-by: Jason Wang 
Signed-off-by: Stefano Garzarella 
---
Based-on: 20240315155949.86066-1-kw...@redhat.com

v1: https://patchew.org/QEMU/20240207092702.25242-1-sgarz...@redhat.com/
v2:
  - added acks
  - rebased on top of 
https://patchew.org/QEMU/20240315155949.86066-1-kw...@redhat.com/
---
  net/vhost-vdpa.c | 15 ---
  1 file changed, 12 insertions(+), 3 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PULL 1/1] target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int'

2024-03-22 Thread Michael Tokarev


22.03.2024 13:03, Song Gao :

qemu-system-loongarch64 fails an assertion with the option '-d int':
helper_idle() raises an exception EXCP_HLT, but the exception name is
undefined.

Signed-off-by: Song Gao 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240321123606.1704900-1-gaos...@loongson.cn>


Is this more qemu-stable material?  You Cc'd it to me but I'm not sure
what I should do with it.

For patches suitable for -stable, please Cc: qemu-sta...@nongnu.org.

Thanks,

/mjt

Re: [PATCH 0/4] hw/nmi: Remove @cpu_index argument

2024-03-22 Thread Peter Maydell

On Fri, 22 Mar 2024 at 14:08, Cédric Le Goater  wrote:
>
> On 3/20/24 16:00, Peter Maydell wrote:
> > On Wed, 20 Mar 2024 at 14:10, Mark Burton  wrote:
> >> I’d broaden this to all ’signals’ (IRQ, Reset etc) - and I guess
> >> similar statements apply, with the “bridge” between the function
> >> and the GPIO mechanism moved closer or further from the originator(s)
> >> of the activity.
> >>
> >> The issue isn’t my “machine” model, rather the compose-ability of
> >> (any) such machine.  A-priori, a model writer doesn’t know if they
> >> should respond directly to an NMI or not - Hence they dont know if
> >> they should implement the TYPE_NMI or not. That’s a decision only
> >> the machine composer knows.
> >> My suggestion would be to use a GPIO interface to models, which can
> >> then be appropriately wired. (And, hence, to have a single place
> >> that implements the TYPE_NMI interface and provides the GPIO wire
> >> ready for wiring to appropriate devices).
> >
> > I feel like that's a long way in the future, but my back-of-the-envelope
> > design sketch of that is that the TYPE_MACHINE class that's implementing
> > the "I am just a container for all the devices that the user has
> > specified and wired together" machine would itself implement TYPE_NMI and
> > when an NMI came in it would assert a GPIO line that the user could
> > wire up, or not wire up, as they chose.
> >
> > Right now we can't do that though, because, among other reasons,
> > TYPE_MACHINE isn't a TYPE_DEVICE. (I do want to fix that, though:
> > I'm hoping it won't be too difficult.)
>
> Oh that's interesting. Will that introduce an extra level of container
> with multiple machines below ?

No, I don't intend that we should have multiple machines in one
simulation, only that the thing which is "container for all the
machine's devices" shouldn't be a weirdly distinct type from
the SoC "container for devices" devices. What I'm primarily hoping
to remedy by making TYPE_MACHINE a subclass of TYPE_DEVICE is
inconsistencies like:
 * reset of machine objects is nonstandard
 * machine models can't use facilities like having qdev gpio
   lines, so wind up calling qemu_allocate_irqs() directly

None of these are big things, but they're a bit paper-cut-ish.

-- PMM

Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-03-22 Thread Daniel P . Berrangé

On Tue, Mar 12, 2024 at 03:48:49AM -0400, Shaoqin Huang wrote:
> The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
> which PMU events are provided to the guest. Add a new option
> `kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
> Without the filter, all PMU events are exposed from host to guest by
> default. The usage of the new sub-option can be found from the updated
> document (docs/system/arm/cpu-features.rst).
> 
> Here is an example which shows how to use the PMU Event Filtering, when
> we launch a guest by use kvm, add such command line:
> 
>   # qemu-system-aarch64 \
> -accel kvm \
> -cpu host,kvm-pmu-filter="D:0x11-0x11"

I mistakenly sent some comments to the older v7 (despite this v8 already
existing) about the design of this syntax.  So, for linking up the threads:

 https://lists.nongnu.org/archive/html/qemu-devel/2024-03/msg04703.html

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH] migration/postcopy: Fix high frequency sync

2024-03-22 Thread Peter Xu

On Thu, Mar 21, 2024 at 12:20:32PM -0400, Peter Xu wrote:
> On Wed, Mar 20, 2024 at 05:44:53PM -0400, pet...@redhat.com wrote:
> > From: Peter Xu 
> > 
> > On current code base I can observe extremely high sync count during
> > precopy, as long as one enables postcopy-ram=on before switchover to
> > postcopy.
> > 
> > To provide some context of when we decide to do a full sync: we check
> > must_precopy (which implies "data must be sent during precopy phase"), and
> > as long as it is lower than the threshold size we calculated (out of
> > bandwidth and expected downtime) we will kick off the slow sync.
> > 
> > However, when postcopy is enabled (even if still during precopy phase), RAM
> > only reports all pages as can_postcopy, and report must_precopy==0.  Then
> > "must_precopy <= threshold_size" mostly always triggers and enforces a slow
> > sync for every call to migration_iteration_run() when postcopy is enabled
> > even if not used.  That is insane.
> > 
> > It turns out it was a regress bug introduced in the previous refactoring in
> > QEMU 8.0 in late 2022. Fix this by checking the whole RAM size rather than
> > must_precopy, like before.  Not copy stable yet as many things changed, and
> > even if this should be a major performance regression, no functional change
> > has observed (and that's also probably why nobody found it).  I only notice
> > this when looking for another bug reported by Nina.
> > 
> > When at it, cleanup a little bit on the lines around.
> > 
> > Cc: Nina Schoetterl-Glausch 
> > Fixes: c8df4a7aef ("migration: Split save_live_pending() into 
> > state_pending_*")
> > Signed-off-by: Peter Xu 
> 
> queued for 9.0-rc1.

When I was testing today on an old 8.2.0 binary I found that it's actually
working all fine..

It's because 28ef5339c3 ("migration: fix ram_state_pending_exact()")
actually fixed exactly the same issue, though that was a partial fix, which
I'd consider a "workaround" (because it only fixed RAM, while the
issue lies in the core calculations), and which was overlooked in the cleanup
patch I did..  This patch should provide the complete fix.  I didn't check
whether other iterators can be affected, though.

To make it clearer, I'll change the Fixes to point to my cleanup patch, as
that's indeed the first commit to expose this issue again at least for a
generic postcopy use case (aka, my fault to break it..).  Then it also
means stable branches are all fine.  I also rewrote the commit log.
Attaching the updated version here just for reference (no code changes).

8<
From 32e3146be16fef9d0fe7b0818265c9d07bb51de3 Mon Sep 17 00:00:00 2001
From: Peter Xu 
Date: Wed, 20 Mar 2024 17:44:53 -0400
Subject: [PATCH] migration/postcopy: Fix high frequency sync

With the current code base I can observe an extremely high sync count during
precopy, as long as one enables postcopy-ram=on before switchover to
postcopy.

To provide some context of when QEMU decides to do a full sync: it checks
must_precopy (which implies "data must be sent during precopy phase"), and
as long as it is lower than the threshold size we calculated (out of
bandwidth and expected downtime) QEMU will kick off the slow/exact sync.

However, when postcopy is enabled (even if still during precopy phase), RAM
only reports all pages as can_postcopy, and reports must_precopy==0.  Then
"must_precopy <= threshold_size" almost always triggers and enforces a slow
sync for every call to migration_iteration_run() when postcopy is enabled
even if not used.  That is insane.

It turns out it was a regression introduced in the previous refactoring in
8.0 as reported by Nina [1]:

  (a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")

Then a workaround patch was applied at the end of that release (8.0-rc4) to fix it:

  (b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")

However that "workaround" was overlooked during the cleanup in this
9.0 release, in this commit:

  (c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")

Then the issue was re-exposed as reported by Nina [1].

The problem with (b) is that it only fixed the case for RAM, rather than
for the rest of the iterators.  Here a slow sync should only be required if
all dirty data (precopy+postcopy) is less than the threshold_size that QEMU
calculated.  It is even debatable whether a sync is needed once switched to
postcopy.  Currently ram_state_pending_exact() is mostly a no-op once
switched to postcopy, and that logic seems to apply to the rest of the
iterators too, as syncing the dirty bitmap during postcopy doesn't make much
sense.  However let's leave such a change for later, as we're in the rc phase.

So rather than reusing commit (b), this patch provides the complete fix for
all iterators.  While at it, clean up the surrounding lines a little.

[1] https://gitlab.com/qemu-project/qemu/-/issues/1565

Reported-by: Nina Schoetterl-Glausch 
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
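
For readers skimming the thread, the difference between the two conditions described in the commit log can be modelled in a few lines of C. This is only a sketch under stated assumptions: the function names are hypothetical and this is not the QEMU code itself; only must_precopy, can_postcopy and threshold_size are taken from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check described in the commit log as broken:
 * only the must_precopy estimate is compared against the threshold.
 * With postcopy enabled RAM reports must_precopy == 0, so this is true
 * on (almost) every iteration.
 */
static bool exact_sync_needed_before(uint64_t must_precopy,
                                     uint64_t can_postcopy,
                                     uint64_t threshold_size)
{
    (void)can_postcopy; /* ignored: this is the problem being described */
    return must_precopy <= threshold_size;
}

/*
 * Hypothetical model of the intended behaviour: a slow/exact sync is only
 * worth doing once all dirty data (precopy + postcopy) is below the
 * threshold.
 */
static bool exact_sync_needed_after(uint64_t must_precopy,
                                    uint64_t can_postcopy,
                                    uint64_t threshold_size)
{
    return must_precopy + can_postcopy < threshold_size;
}
```

With, say, 4 GiB reported as can_postcopy and a 300 MiB threshold, the first predicate asks for an exact sync on every call while the second does not, which matches the high sync count described above.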

RE: [PATCH v5 5/7] migration/multifd: implement initialization of qpl compression

2024-03-22 Thread Liu, Yuan1

> -Original Message-
> From: Liu, Yuan1
> Sent: Friday, March 22, 2024 10:07 AM
> To: Peter Xu 
> Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com; Zou,
> Nanhai 
> Subject: RE: [PATCH v5 5/7] migration/multifd: implement initialization of
> qpl compression
> 
> > -Original Message-
> > From: Peter Xu 
> > Sent: Thursday, March 21, 2024 11:28 PM
> > To: Liu, Yuan1 
> > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com;
> Zou,
> > Nanhai 
> > Subject: Re: [PATCH v5 5/7] migration/multifd: implement initialization
> of
> > qpl compression
> >
> > On Thu, Mar 21, 2024 at 01:37:36AM +, Liu, Yuan1 wrote:
> > > > -Original Message-
> > > > From: Peter Xu 
> > > > Sent: Thursday, March 21, 2024 4:32 AM
> > > > To: Liu, Yuan1 
> > > > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > > > de...@nongnu.org; hao.xi...@bytedance.com;
> bryan.zh...@bytedance.com;
> > Zou,
> > > > Nanhai 
> > > > Subject: Re: [PATCH v5 5/7] migration/multifd: implement
> > initialization of
> > > > qpl compression
> > > >
> > > > On Wed, Mar 20, 2024 at 04:23:01PM +, Liu, Yuan1 wrote:
> > > > > let me explain here, during the decompression operation of IAA,
> the
> > > > > decompressed data can be directly output to the virtual address of
> > the
> > > > > guest memory by IAA hardware.  It can avoid copying the
> decompressed
> > > > data
> > > > > to guest memory by CPU.
> > > >
> > > > I see.
> > > >
> > > > > Without -mem-prealloc, all the guest memory is not populated, and
> > IAA
> > > > > hardware needs to trigger I/O page fault first and then output the
> > > > > decompressed data to the guest memory region.  Besides that, CPU
> > page
> > > > > faults will also trigger IOTLB flush operation when IAA devices
> use
> > SVM.
> > > >
> > > > Oh so the IAA hardware already can use CPU pgtables?  Nice..
> > > >
> > > > Why IOTLB flush is needed?  AFAIU we're only installing new pages,
> the
> > > > request can either come from a CPU access or a DMA.  In all cases
> > there
> > > > should have no tearing down of an old page.  Isn't an iotlb flush
> only
> > > > needed if a tear down happens?
> > >
> > > As far as I know, IAA hardware uses SVM technology to use the CPU's
> page
> > table
> > > for address translation (IOMMU scalable mode directly accesses the CPU
> > page table).
> > > Therefore, when the CPU page table changes, the device's Invalidation
> > operation needs
> > > to be triggered to update the IOMMU and the device's cache.
> > >
> > > My current kernel version is mainline 6.2. The issue I see is as
> > follows:
> > > --Handle_mm_fault
> > >  |
> > >   -- wp_page_copy
> >
> > This is the CoW path.  Not usual at all..
> >
> > I assume this issue should only present on destination.  Then the guest
> > pages should be the destination of such DMAs to happen, which means
> these
> > should be write faults, and as we see here it is, otherwise it won't
> > trigger a CoW.
> >
> > However it's not clear to me why a pre-installed zero page existed.  It
> > means someone read the guest pages first.
> >
> > It might be interesting to know _why_ someone reads the guest pages,
> even
> > if we know they're all zeros.  If we can avoid such reads then it'll be
> a
> > hole rather than a prefaulted read on zero page, then invalidations are
> > not
> > needed, and I expect that should fix the iotlb storm issue.
> 
> The received pages will be read for zero pages check first. Although
> these pages are zero pages, and IAA hardware will not access them, the
> COW happens and causes following IOTLB flush operation. As far as I know,
> IOMMU quickly detects whether the address range has been used by the
> device, and does not invalidate the address that is not used by the
> device, this has not yet been resolved in Linux kernel 6.2. I will check
> the latest status for this.

I checked the Linux mainline 6.8 code, and there are no big changes here.
In version 6.8, if the process needs to flush the MMU TLB, an I/O TLB flush
is also triggered when the process has SVM devices. I haven't found any
code that checks whether a page has the EA (Extended-Accessed) bit set
before submitting invalidation operations; this is the same as in version 6.2.

VT-d 3.6.2
If the Extended-Accessed-Flag-Enable (EAFE) is 1 in a scalable-mode PASID-table
entry that references a first-stage paging-structure entry used by the remapping
hardware, it atomically sets the EA field in that entry. Whenever EA field is
atomically set, the A field is also set in the same atomic operation. For
software usages where the first-stage paging structures are shared across
heterogeneous agents (e.g., CPUs and accelerator devices such as GPUs), the EA
flag may be used by software to identify pages accessed by non-CPU agent(s) (as
opposed to the A flag which indicates access by any agent sharing the paging
struct

RE: [PATCH v2 1/2] Implement SSH commands in QEMU GA for Windows

2024-03-22 Thread Aidan Leuck

Thanks for the feedback, Daniel, I will get these issues resolved shortly. 
Thank you for your patience, this is my first time committing to QEMU. 
Aidan Leuck

-Original Message-
From: Daniel P. Berrangé  
Sent: Friday, March 22, 2024 4:32 AM
To: Aidan Leuck 
Cc: qemu-devel@nongnu.org; kkost...@redhat.com
Subject: Re: [PATCH v2 1/2] Implement SSH commands in QEMU GA for Windows

On Thu, Mar 21, 2024 at 04:07:24PM +, aidan_le...@selinc.com wrote:
> From: aidaleuc 
>
> Signed-off-by: aidaleuc 
> ---
>  qga/commands-windows-ssh.c | 848 +
>  qga/commands-windows-ssh.h |  26 ++
>  qga/meson.build            |   9 +-
>  qga/qapi-schema.json       |  22 +-
>  4 files changed, 892 insertions(+), 13 deletions(-)
>  create mode 100644 qga/commands-windows-ssh.c
>  create mode 100644 qga/commands-windows-ssh.h
>
> diff --git a/qga/commands-windows-ssh.c b/qga/commands-windows-ssh.c
> new file mode 100644
> index 00..566266f465
> --- /dev/null
> +++ b/qga/commands-windows-ssh.c
> @@ -0,0 +1,848 @@


> +static char *get_admin_ssh_folder(Error **errp) {
> +  // Allocate memory for the program data path

Please use C /* ... */  comments, not C++ comment style

> +  g_autofree char *programDataPath = NULL;
> +  char *authkeys_path = NULL;
> +  PWSTR pgDataW;

Nothing initializes pgDataW here.

> +  GError *gerr = NULL;

Declare that as 'g_autoptr(GError) gerr = NULL', avoiding the need for manual
g_error_free calls.

> +
> +  // Get the KnownFolderPath on the machine.
> +  HRESULT folderResult =
> +  SHGetKnownFolderPath(&FOLDERID_ProgramData, 0, NULL, &pgDataW);
> +  if (folderResult != S_OK) {

This method is the initializer for 'pgDataW', but in this error scenario, is
there any guarantee that 'pgDataW' will actually have been initialized? We're
about to jump to an 'error' label that will try to free this.

> +error_setg(errp, "Failed to retrieve ProgramData folder");
> +goto error;

QEMU indent standard is 4 spaces, rather than 2 spaces. Applies throughout this 
patch.

> +  }
> +
> +  // Convert from a wide string back to a standard character string.
> +  programDataPath = g_utf16_to_utf8(pgDataW, -1, NULL, NULL, &gerr);

If you put the 'CoTaskMemFree(pgDataW)' call here

> +  if (!programDataPath) {

and call error_setg(..., gerr->message); here

we can avoid the need for any 'error:' cleanup block at all. All places can 
just 'return NULL' immediately.

> +goto error;
> +  }
> +
> +  // Build the path to the file.
> +  authkeys_path = g_build_filename(programDataPath, "ssh", NULL);
> +  CoTaskMemFree(pgDataW);
> +  return authkeys_path;
> +
> +error:
> +  CoTaskMemFree(pgDataW);

...before we access pgDataW here potentially uninitialized.

> +
> +  if (gerr) {
> +error_setg(errp,"Failed to convert program data path from wide string to standard utf 8 string. %s", gerr->message);
> +g_error_free(gerr);
> +  }
> +
> +  return NULL;
> +}
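
The restructuring suggested in the comments above (initialise the pointer, release the intermediate buffer right after its last use, and return NULL directly instead of jumping to an 'error:' label) can be sketched in portable C. This is an illustration only: lookup_folder() and convert() are hypothetical stand-ins for SHGetKnownFolderPath() and g_utf16_to_utf8(), plain malloc()/free() stand in for the Windows and GLib allocators, and error reporting is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Small portable helper: heap copy of a string, or NULL on failure. */
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy) {
        memcpy(copy, s, n);
    }
    return copy;
}

/*
 * Hypothetical stand-in for the folder lookup: on success stores a heap
 * buffer in *out and returns 0; on failure leaves *out untouched.
 */
static int lookup_folder(int fail, char **out)
{
    if (fail) {
        return -1;
    }
    *out = dup_str("C:\\ProgramData");
    return *out ? 0 : -1;
}

/* Hypothetical stand-in for the string conversion: new copy or NULL. */
static char *convert(const char *raw, int fail)
{
    return fail ? NULL : dup_str(raw);
}

static char *admin_ssh_folder(int fail_lookup, int fail_convert)
{
    char *raw = NULL;      /* initialised, so it is never freed as garbage */
    char *converted;
    char *result;
    size_t len;

    if (lookup_folder(fail_lookup, &raw) != 0) {
        return NULL;       /* nothing was allocated, nothing to clean up */
    }

    converted = convert(raw, fail_convert);
    free(raw);             /* last use of raw: release it right here */
    if (!converted) {
        return NULL;       /* no cleanup label needed */
    }

    len = strlen(converted) + sizeof("\\ssh");
    result = malloc(len);
    if (result) {
        snprintf(result, len, "%s\\ssh", converted);
    }
    free(converted);
    return result;
}
```

Because each resource is released at the point where it stops being needed, every failure path is a plain return and no path can free an uninitialised pointer.
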
> +
> +/*
> + * Gets the path to the SSH folder for the specified user. If the user is an
> + * admin it returns the ssh folder located at %PROGRAMDATA%/ssh. If the user is
> + * not an admin it returns %USERPROFILE%/.ssh
> + *
> + * parameters:
> + * username -> Username to get the SSH folder for
> + * isAdmin -> Whether the user is an admin or not
> + * errp -> Error structure to set any errors that occur.
> + * returns: path to the ssh folder as a string.
> + */
> +static char *get_ssh_folder(const char *username, const bool isAdmin,
> +Error **errp) {
> +  if (isAdmin) {
> +return get_admin_ssh_folder(errp);
> +  }
> +
> +  // If not an Admin the SSH key is in the user directory.
> +  DWORD maxSize = MAX_PATH;

QEMU preference is for all variables to be declared at the start of the code 
scope block. This is especially important when using g_autofree at the same 
time as 'goto', as if a goto jumps over a variable declaration, it will remain 
uninitialized.
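
A minimal sketch of why the declaration has to come first, assuming GCC or Clang: GLib's g_autofree is built on the compiler's cleanup attribute, which the hypothetical AUTO_FREE macro below imitates with plain free(). The function and macro names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cleanup helper, in the spirit of what g_autofree does. */
static void free_str(char **p)
{
    free(*p);
}
#define AUTO_FREE __attribute__((cleanup(free_str)))

/* Returns the length of "<user>/.ssh", or -1 on failure. */
static int ssh_path_length(const char *user)
{
    /*
     * Declared and initialised at the top of the block: every exit path
     * below, including both gotos, runs the cleanup on a valid value.
     */
    AUTO_FREE char *buf = NULL;
    int len = -1;

    if (user == NULL) {
        goto out;          /* buf is still NULL, so free(NULL) is harmless */
    }

    buf = malloc(strlen(user) + sizeof("/.ssh"));
    if (buf == NULL) {
        goto out;
    }
    strcpy(buf, user);
    strcat(buf, "/.ssh");
    len = (int)strlen(buf);

out:
    return len;            /* cleanup frees buf when the scope is left */
}
```

Had buf been declared after the first goto, the jump would skip its initialiser and the cleanup handler would run on an indeterminate pointer.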

> +  g_autofree char* profilesDir = g_malloc(maxSize);

Use 'char *' rather than 'char*', and use g_new0 to guarantee the memory is 
initialized.

> +
> +  // Get the user profile directory on the machine.
> +  BOOL ret = GetProfilesDirectory(profilesDir, &maxSize);
> +  if (!ret) {
> +error_setg_win32(errp, GetLastError(),
> + "failed to retrieve profiles directory");
> +return NULL;
> +  }
> +
> +  // Builds the filename
> +  return g_build_filename(profilesDir, username, ".ssh", NULL);
> +}
> +
> +/*
> + * Creates an entry for the everyone group. This is used when the user is an Administrator
> + * This is consistent with the folder permissions that OpenSSH creates when it
> + * is installed. Anyone can read the file, but only Administrators and SYSTEM can
> + * modify the file.
> + *
> + * parameters:
> + * userInfo -> Information about the current user
> + * pACL -> Pointer to an ACL structure
> + * errp -> Error structure to set any errors th

Re: [PATCH 1/3] qapi: Improve migration TLS documentation

2024-03-22 Thread Markus Armbruster

Fabiano Rosas  writes:

> Markus Armbruster  writes:
>
>> MigrateSetParameters is about setting parameters, and
>> MigrationParameters is about querying them.  Their documentation of
>> @tls-creds and @tls-hostname has residual damage from a failed attempt
>> at de-duplicating them (see commit de63ab61241 "migrate: Share common
>> MigrationParameters struct" and commit 1bda8b3c695 "migration: Unshare
>> MigrationParameters struct for now").
>>
>> MigrateSetParameters documentation issues:
>>
>> * It claims plain text mode "was reported by omitting tls-creds"
>>   before 2.9.  MigrateSetParameters is not used for reporting, so this
>>   is misleading.  Delete.
>>
>> * It similarly claims hostname defaulting to migration URI "was
>>   reported by omitting tls-hostname" before 2.9.  Delete as well.
>>
>> Rephrase the remaining @tls-hostname contents for clarity.
>>
>> Enum MigrationParameter mirrors the members of struct
>> MigrateSetParameters.  Differences to MigrateSetParameters's member
>> documentation are pointless.  Copy the new text to MigrationParameter.
>>
>> MigrationParameters documentation issues:
>>
>> * @tls-creds runs the two last sentences together without punctuation.
>>   Fix that.
>>
>> * Much of the contents on @tls-hostname only applies to setting
>>   parameters, resulting in confusion.  Replace by a suitable abridged
>>   version of the new MigrateSetParameters text, and a note on
>>   @tls-hostname omission in 2.8.
>>
>> Additional damage is due to flawed doc fix commit
>> 66fcb9d651d (qapi/migration: Add missing tls-authz documentation):
>> since it copied the missing MigrateSetParameters text from
>> MigrationParameters instead of MigrationParameter, the part on
>> recreating @tls-authz on the fly is missing.  Copy that, too.
>>
>> Signed-off-by: Markus Armbruster 
>> ---
>>  qapi/migration.json | 63 +++--
>>  1 file changed, 32 insertions(+), 31 deletions(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index aa1b39bce1..cbcc6946eb 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -809,16 +809,19 @@
>>  # for establishing a TLS connection over the migration data
>>  # channel.  On the outgoing side of the migration, the credentials
>>  # must be for a 'client' endpoint, while for the incoming side the
>> -# credentials must be for a 'server' endpoint.  Setting this will
>> -# enable TLS for all migrations.  The default is unset, resulting
>> -# in unsecured migration at the QEMU level.  (Since 2.7)
>> +# credentials must be for a 'server' endpoint.  Setting this to a
>> +# non-empty string enables TLS for all migrations.  An empty
>> +# string means that QEMU will use plain text mode for migration,
>> +# rather than TLS.  (Since 2.7)
>>  #
>> -# @tls-hostname: hostname of the target host for the migration.  This
>> -# is required when using x509 based TLS credentials and the
>> -# migration URI does not already include a hostname.  For example
>> -# if using fd: or exec: based migration, the hostname must be
>> -# provided so that the server's x509 certificate identity can be
>> -# validated.  (Since 2.7)
>> +# @tls-hostname: migration target's hostname for validating the
>> +# server's x509 certificate identify.  If empty, QEMU will use the
>
> identity

ACK!  Also the other two copies you noted.  Thanks!

[...]

Re: [PATCH 3/3] qapi: Fix bogus documentation of query-migrationthreads

2024-03-22 Thread Fabiano Rosas

Markus Armbruster  writes:

> The doc comment documents an argument that doesn't exist.  Would
> fail compilation if it was marked up correctly.  Delete.
>
> The Returns: section fails to refer to the data type, leaving the user
> to guess.  Fix that.
>
> The command name violates QAPI naming rules: it should be
> query-migration-threads.  Too late to fix.
>
> Reported-by: John Snow 
> Fixes: 671326201dac (migration: Introduce interface query-migrationthreads)
> Signed-off-by: Markus Armbruster 

Reviewed-by: Fabiano Rosas

[PATCH 07/12] qapi: Fix abbreviation punctuation in doc comments

2024-03-22 Thread Markus Armbruster

Signed-off-by: Markus Armbruster 
---
 qapi/migration.json | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index aa1b39bce1..faeb7d1ca9 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1762,7 +1762,7 @@
 #default network.
 #
 # 5. For now, number of migration streams is restricted to one,
-#i.e number of items in 'channels' list is just 1.
+#i.e. number of items in 'channels' list is just 1.
 #
 # 6. The 'uri' and 'channels' arguments are mutually exclusive;
 #exactly one of the two should be present.
@@ -1839,7 +1839,7 @@
 # 3. The uri format is the same as for -incoming
 #
 # 4. For now, number of migration streams is restricted to one,
-#i.e number of items in 'channels' list is just 1.
+#i.e. number of items in 'channels' list is just 1.
 #
 # 5. The 'uri' and 'channels' arguments are mutually exclusive;
 #exactly one of the two should be present.
-- 
2.44.0
