On Fri, 7 Sep 2018 16:52:17 -0400 "Michael S. Tsirkin" <m...@redhat.com> wrote:
> On Wed, Aug 29, 2018 at 03:51:38PM +0200, Igor Mammedov wrote: > > On Wed, 29 Aug 2018 09:15:53 -0400 > > "Michael S. Tsirkin" <m...@redhat.com> wrote: > > > > > On Wed, Aug 29, 2018 at 10:43:11AM +0200, Igor Mammedov wrote: > > > > On Wed, 29 Aug 2018 12:54:40 +1000 > > > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > > > > > On Tue, Aug 28, 2018 at 03:18:48PM +0200, Igor Mammedov wrote: > > > > > > On Tue, 28 Aug 2018 10:52:37 +1000 > > > > > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > > > > > > > > > On Mon, Aug 27, 2018 at 04:02:39PM +0200, Greg Kurz wrote: > > > > > > > > On Mon, 27 Aug 2018 13:07:10 +0200 > > > > > > > > Igor Mammedov <imamm...@redhat.com> wrote: > > > > > > > > > > > > > > > > > The first cpu unplug wasn't ever supported and corresponding > > > > > > > > > monitor/qmp commands refuse to unplug it. However guest is > > > > > > > > > able > > > > > > > > > to issue eject request either using following command: > > > > > > > > > # echo 1 >/sys/devices/system/cpu/cpu0/firmware_node/eject > > > > > > > > > > > > > > > > > > > > > > > > > I can't reproduce the issue with a pc guest and current > > > > > > > > master... > > > > > > > > > > > > > > > > All I seem to get is an error in dmesg: > > > > > > > > > > > > > > > > [ 97.435446] processor cpu0: Offline failed. > > > > > > > > > > > > > > > > > or directly writing to cpu hotplug registers, which makes > > > > > > > > > qemu crash with SIGSEGV following back trace: > > > > > > > > > > > > > > > > > > kvm_flush_coalesced_mmio_buffer () > > > > > > > > > while (ring->first != ring->last) > > > > > > > > > ... > > > > > > > > > qemu_flush_coalesced_mmio_buffer > > > > > > > > > prepare_mmio_access > > > > > > > > > flatview_read_continue > > > > > > > > > flatview_read > > > > > > > > > address_space_read_full > > > > > > > > > address_space_rw > > > > > > > > > kvm_cpu_exec(cpu!0) > > > > > > > > > qemu_kvm_cpu_thread_fn > > > > > > > > > > > > > > > > > > the reason for which is that ring == > > > > > > > > > KVMState::coalesced_mmio_ring > > > > > > > > > happens to be a part of 1st CPU that was uplugged by guest. > > > > > > > > > > > > > > > > > > Fix it by forbidding 1st cpu unplug from guest side and in > > > > > > > > > addition > > > > > > > > > remove CPU0._EJ0 ACPI method to make clear that unplug of the > > > > > > > > > first > > > > > > > > > CPU is not supported. > > > > > > > > > > > > > > > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > > > > > > > > > --- > > > > > > > > > CCing spapr and s390x folks in case targets need to prevent > > > > > > > > > 1st CPU unplug as well > > > > > > > > > > > > > > > > > > > > > > > > > A spapr guest can _release_ the first cpu by doing something > > > > > > > > like: > > > > > > > > > > > > > > > > # echo -n "/cpus/PowerPC,POWER8@0" > > > > > > > > > /sys/devices/system/cpu/release > > > > > > > > > > > > > > > > But AFAIK, this doesn't unplug the cpu from a QEMU standpoint. > > > > > > > > > > > > > > > > > > > > > > Unplugging CPU 0 with device_del should be ok too. > > > > > > Do you mean real unplugging (cpu0 object freed) or just remove > > > > > > request? > > > > > > > > > > Real unplugging should be possible. I'm not sure how thorougly it's > > > > > been tested, though. > > > > Well, common kvm code in qemu seems to be in disagreement with it > > > > as backtrace in this patch shows also usage of first_cpu macro > > > > won't survive such unplug. > > > > > > Paolo - any take on this? Do we need to make cpu 0 special like this? > > It probably would take a bunch of refactoring to get rid of first_cpu&co > > dependencies and besides of experimenting with cpu0 unplug in guest kernel > > there isn't any other value in it, so it probably not worth the effort. > > > > On top of that, for pc/q35 machine we would need to select boot cpu > > in some other way (right now it's hardwired to first_cpu). > > > > It seems that seabios might work if cpu0 isn't present, don't know about > > OVMF. > > OK, I queued this, but can you please add some code comments > so we remember why it is like this if someone changes the code? > Even better maybe to add an API along the lines of: > cpu_is_hotpluggable > return cpu != first_cpu I like an API idea, I'll try to look into how to use use it in a generic way.