On Tue, Jun 23, 2026 at 12:00:04AM +0400, Marc-André Lureau wrote:
> Hi Peter
> 
> On Mon, Jun 22, 2026 at 11:28 PM Peter Xu <[email protected]> wrote:
> >
> > On Mon, Jun 22, 2026 at 03:53:33PM +0400, Marc-André Lureau wrote:
> > > Hi Peter
> > >
> > > On Fri, Jun 19, 2026 at 7:13 PM Peter Xu <[email protected]> wrote:
> > > >
> > > > On Fri, Jun 19, 2026 at 12:11:48AM +0400, Marc-André Lureau wrote:
> > > > > Hi
> > > > >
> > > > > On Thu, Jun 4, 2026 at 5:46 PM Marc-André Lureau
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > This is an attempt to fix the incompatibility of virtio-mem with
> > > > > > > confidential VMs. The solution implements what was discussed earlier
> > > > > > > with D. Hildenbrand:
> > > > > > https://patchwork.ozlabs.org/project/qemu-devel/patch/[email protected]/#3502238
> > > > > >
> > > > > > > The first patches are misc cleanups. Then some code refactoring to
> > > > > > > have a split manager/source. And finally, the manager learns to deal
> > > > > > > with multiple sources.
> > > > > >
> > > > > > This has been tested together with the Linux kernel series from
> > > > > > Zhenzhong Duan [1] for TDX guests.
> > > > > >
> > > > > > (help fix https://issues.redhat.com/browse/RHEL-131968)
> > > > >
> > > > > Can the patch 1-11 be queued or are we missing something?
> > > > > (RFC patch 12 can be dropped for now)
> > > >
> > > > Likely yes.. one thing to double check with you before I do: We don't
> > > > need the kernel series, do we?  On unplug, I expect that with the
> > > > truncation approach this series proposes, KVM will emit
> > > > TDH.MEM.PAGE.REMOVE and then unaccept is done (?).
> > >
> > > > Say, what happens if we run QEMU with this series applied, but without
> > > > the kernel series?
> > >
> > > The kernel series is needed at least for PAGE.ACCEPT. Without it, QEMU
> > > will have KVM_RUN return EIO, and finish into assert (while tearing
> > > down ioeventfd).
> >
> > Could you elaborate a bit more on why ACCEPT would fail?
> >
> > I used to ask the event flow here:
> >
> > https://lore.kernel.org/qemu-devel/[email protected]/
> >
> > If AUG existed, then why ACCEPT would fail?
> >
> > PS: I didn't read a lot of what a Linux guest would do; I know there're
> > some lazy accept approach, but IIUC it's only a matter of time to ACCEPT,
> > not correctness.  My understanding is if we properly notify these new slots
> > with AUG, then it should be able to ACCEPT?
> 
> Yes, it will, but it needs the kernel patches to do it (or virtio-mem
> will let the guest access un-accepted pages and qemu will crash). I
> submitted a few proposals before Zhenzhong Duan proposed the last
> iteration.

OK, I think I misunderstood both the crash and also what Zhenzhong's series
is trying to do.  After a closer look, it seems to be a proposal for any
plug/unplug to work.  I'm surprised (if I get it right this time..) that TD
didn't seem to support DIMM hotplugs besides virtio-mem.

> 
> > >
> > > > What confused me a bit is the dependency of this series vs. the kernel
> > > > one.  They seem to use different approaches, but then I don't understand
> > > > why this series was tested with the kernel change.
> > >
> > > My understanding is that the kernel may perform TDG.MEM.PAGE.RELEASE.
> > > That depends on TDX config TDCS_CONFIG_PAGE_RELEASE which qemu/kvm
> > > doesn't currently control. I don't know whether this is then
> > > redundant/needless with qemu doing discard on the guest_memfd..
> >
> > The problem is if this series depends on the kernel series, should we then
> > wait for the kernel solution be accepted first in case it'll change?
> 
> I don't think qemu needs to wait for the kernel to be fixed.
> Furthermore, this series is not tdx/sev/etc specific

AFAIU it is CoCo specific; otherwise we always have only 1 source, hence no
need for this series to provide this function.

That said, I agree with you.  It looks like there's no major plan to add
anything specific to QEMU.

Patches 1-10 are all reviewed and correctly resolve the known problem with
>1 RAM sources.  It's a design problem, and I don't see how we can bypass it.

Patch 11 is very reasonable on its own, so I assume the guest ACCEPT
problem will need to evolve separately.  The direction looks correct, with
only some corner cases left to think about (ACPI coverage, lazy accept,
etc.) that I saw in the discussion.

> 
> >
> > But obviously I still don't yet fully understand how this whole thing
> > works.. :(
> 
> I am also slow, because I don't focus on this most of the time.
> Someone with more experience and dedication would handle it better.

I queued patches 1-11 and will send a pull request this week.  Thanks for
all the explanations.

-- 
Peter Xu

