On Thu, Oct 09, 2025 at 10:32:34AM -0400, Steven Sistare wrote: > On 10/9/2025 10:19 AM, Daniel P. Berrangé wrote: > > On Thu, Oct 09, 2025 at 10:11:17AM -0400, Steven Sistare wrote: > > > On 10/8/2025 6:22 AM, Duan, Zhenzhong wrote: > > > > > > > > > > > > > -----Original Message----- > > > > > From: Steven Sistare <[email protected]> > > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute > > > > > "query-balloon" after CPR transfer > > > > > > > > > > On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote: > > > > > > > -----Original Message----- > > > > > > > From: Steven Sistare <[email protected]> > > > > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute > > > > > > > "query-balloon" after CPR transfer > > > > > > > > > > > > > > On 9/28/2025 4:54 AM, Zhenzhong Duan wrote: > > > > > > > > After CPR transfer, source QEMU closes kvm fd and sets > > > > > > > > kvm_state to > > > > > > > NULL, > > > > > > > > "query-balloon" will check kvm_state->sync_mmu and trigger NULL > > > > > pointer > > > > > > > > reference. > > > > > > > > > > > > > > > > We don't need to NULL kvm_state as all states in kvm_state > > > > > > > > aren't > > > > > released > > > > > > > > actually. Just closing kvm fd is enough so we could still query > > > > > > > > states > > > > > > > > through "query_*" qmp command. > > > > > > > > > > > > > > IMO this does not make sense. Much of the state in kvm_state was > > > > > derived > > > > > > >from ioctl's on the descriptors, and closing them invalidates it. > > > > > > >Asking > > > > > > > historical questions about what used to be makes no sense. > > > > > > > > > > > > You also have your valid point. > > > > > > > > > > > > > > > > > > > > Clearing kvm_state and setting kvm_allowed=false would be a safer > > > > > > > fix. > > > > > > > > > > Setting kvm_allowed=false causes kvm_enabled() to return false which > > > > > should > > > > > prevent kvm_state from being dereferenced anywhere: > > > > > > > > > > #define kvm_enabled() (kvm_allowed) > > > > > > > > > > Eg for the balloon: > > > > > > > > > > static bool have_balloon(Error **errp) > > > > > { > > > > > if (kvm_enabled() && !kvm_has_sync_mmu()) { > > > > > > > > OK, will do, clearing kvm_allowed is a good idea. > > > > > > > > Currently there are two qmp commands "query-balloon" and > > > > "query-cpu-definitions" > > > > causing SIGSEGV after CPR-transfer, clearing kvm_allowed helps for both. > > > > But clearing both kvm_allowed and kvm_state cause SIGSEGV on > > > > "query-cpu-definitions". > > > > > > > > I'll send a patch to clearing only kvm_allowed if you are fine with it. > > > > > > I still don't love the idea. kvm_state is no longer valid. > > > > > > It sounds like "query-cpu-definitions" is missing a check for > > > kvm_enabled(). > > > > This patch / bug feels like it is side-stepping a general conceptual > > question. After cpr-transfer is complete what QMP commands are still > > valid to call in general ? This thread mentions two which have caused > > bugs, but beyond that it feels like a large subset of QMP comamnds > > are conceptually invalid to use. > > Agreed. I don't see why these commands should be issued to the dead instance. > How about migration status, quit, and nothing else? > Half serious.
I was hoping you'd suggest such a short list :-) Essentially a case of calling qmp_for_each_command(), and in the callback run qmp_disable_command() for everything not in the allow-list. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
