On Fri, 7 Sep 2018 16:52:17 -0400
"Michael S. Tsirkin" <m...@redhat.com> wrote:

> On Wed, Aug 29, 2018 at 03:51:38PM +0200, Igor Mammedov wrote:
> > On Wed, 29 Aug 2018 09:15:53 -0400
> > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> >   
> > > On Wed, Aug 29, 2018 at 10:43:11AM +0200, Igor Mammedov wrote:  
> > > > On Wed, 29 Aug 2018 12:54:40 +1000
> > > > David Gibson <da...@gibson.dropbear.id.au> wrote:
> > > >     
> > > > > On Tue, Aug 28, 2018 at 03:18:48PM +0200, Igor Mammedov wrote:    
> > > > > > On Tue, 28 Aug 2018 10:52:37 +1000
> > > > > > David Gibson <da...@gibson.dropbear.id.au> wrote:
> > > > > >       
> > > > > > > On Mon, Aug 27, 2018 at 04:02:39PM +0200, Greg Kurz wrote:      
> > > > > > > > On Mon, 27 Aug 2018 13:07:10 +0200
> > > > > > > > Igor Mammedov <imamm...@redhat.com> wrote:
> > > > > > > >         
> > > > > > > > > > Unplugging the first CPU was never supported, and the
> > > > > > > > > > corresponding monitor/QMP commands refuse to unplug it.
> > > > > > > > > > However, the guest is able to issue an eject request,
> > > > > > > > > > either with the following command:
> > > > > > > > >   # echo 1 >/sys/devices/system/cpu/cpu0/firmware_node/eject  
> > > > > > > > >       
> > > > > > > > 
> > > > > > > > > I can't reproduce the issue with a pc guest and current master...
> > > > > > > > 
> > > > > > > > All I seem to get is an error in dmesg:
> > > > > > > > 
> > > > > > > > [   97.435446] processor cpu0: Offline failed.
> > > > > > > >         
> > > > > > > > > > or by writing directly to the CPU hotplug registers, which
> > > > > > > > > > makes QEMU crash with SIGSEGV and the following backtrace:
> > > > > > > > > 
> > > > > > > > >    kvm_flush_coalesced_mmio_buffer ()
> > > > > > > > >        while (ring->first != ring->last)
> > > > > > > > >    ...
> > > > > > > > >    qemu_flush_coalesced_mmio_buffer
> > > > > > > > >    prepare_mmio_access
> > > > > > > > >    flatview_read_continue
> > > > > > > > >    flatview_read
> > > > > > > > >    address_space_read_full
> > > > > > > > >    address_space_rw
> > > > > > > > >    kvm_cpu_exec(cpu!0)
> > > > > > > > >    qemu_kvm_cpu_thread_fn
> > > > > > > > > 
> > > > > > > > > > The reason is that ring == KVMState::coalesced_mmio_ring
> > > > > > > > > > happens to be a part of the 1st CPU, which was unplugged
> > > > > > > > > > by the guest.
> > > > > > > > > > 
> > > > > > > > > > Fix it by forbidding 1st CPU unplug from the guest side and,
> > > > > > > > > > in addition, remove the CPU0._EJ0 ACPI method to make clear
> > > > > > > > > > that unplug of the first CPU is not supported.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com>
> > > > > > > > > ---
> > > > > > > > > > CCing spapr and s390x folks in case targets need to prevent
> > > > > > > > > > 1st CPU unplug as well
> > > > > > > > >         
> > > > > > > > 
> > > > > > > > > A spapr guest can _release_ the first cpu by doing something like:
> > > > > > > > > 
> > > > > > > > > # echo -n "/cpus/PowerPC,POWER8@0" > /sys/devices/system/cpu/release
> > > > > > > > 
> > > > > > > > But AFAIK, this doesn't unplug the cpu from a QEMU standpoint.  
> > > > > > > >       
> > > > > > > 
> > > > > > > Unplugging CPU 0 with device_del should be ok too.      
> > > > > > > Do you mean real unplugging (cpu0 object freed) or just a remove
> > > > > > > request?
> > > > > 
> > > > > > Real unplugging should be possible.  I'm not sure how thoroughly it's
> > > > > > been tested, though.
> > > > > Well, common KVM code in QEMU seems to disagree, as the backtrace
> > > > > in this patch shows. Also, uses of the first_cpu macro won't
> > > > > survive such an unplug.
> > > 
> > > Paolo - any take on this? Do we need to make cpu 0 special like this?  
> > > It would probably take a bunch of refactoring to get rid of the
> > > first_cpu&co dependencies, and besides experimenting with cpu0 unplug
> > > in the guest kernel there isn't any other value in it, so it's
> > > probably not worth the effort.
> > 
> > On top of that, for pc/q35 machine we would need to select boot cpu
> > in some other way (right now it's hardwired to first_cpu).
> > 
> > > It seems that seabios might work if cpu0 isn't present; I don't know
> > > about OVMF.
> 
> OK, I queued this, but can you please add some code comments
> so we remember why it is like this if someone changes the code?
> Even better maybe to add an API along the lines of:
>     cpu_is_hotpluggable
>         return cpu != first_cpu
I like the API idea.
I'll try to look into how to use it in a generic way.
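
To make the suggestion concrete, here is a rough, standalone sketch of what such a helper could look like. It is not QEMU code: the CPUState struct, the first_cpu variable, cpu_list_add() and cpu_try_eject() below are simplified stand-ins invented for illustration, and only the cpu_is_hotpluggable name and the "cpu != first_cpu" check come from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's CPUState; only what the sketch needs. */
typedef struct CPUState {
    int cpu_index;
    struct CPUState *next;
} CPUState;

/* Stand-in for the first_cpu macro: here simply the head of the CPU list. */
static CPUState *first_cpu;

/* Append a CPU at the tail so the boot CPU stays at the head of the list. */
static void cpu_list_add(CPUState *cpu)
{
    CPUState **p = &first_cpu;

    assert(cpu != NULL);
    cpu->next = NULL;
    while (*p) {
        p = &(*p)->next;
    }
    *p = cpu;
}

/*
 * The proposed check: the first CPU is never unpluggable, because common
 * code (e.g. users of first_cpu) assumes it exists for the VM's lifetime.
 */
static bool cpu_is_hotpluggable(const CPUState *cpu)
{
    return cpu != NULL && cpu != first_cpu;
}

/*
 * Assumed eject path: refuse the request up front instead of tearing down
 * state that other code still depends on. Returns 0 on success, -1 if the
 * request is refused or the CPU is not in the list.
 */
static int cpu_try_eject(CPUState *cpu)
{
    if (!cpu_is_hotpluggable(cpu)) {
        return -1;
    }
    for (CPUState **p = &first_cpu; *p; p = &(*p)->next) {
        if (*p == cpu) {
            *p = cpu->next;   /* unlink from the list */
            cpu->next = NULL;
            return 0;
        }
    }
    return -1;
}
```

With a single predicate like this, both the monitor/QMP path and the guest-triggered eject path could share the same check, so the reason cpu0 is special would be documented in one place.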

