On 05/16/2017 06:10 PM, Greg Kurz wrote:
> On Tue, 16 May 2017 17:18:27 +0200
> Cédric Le Goater <c...@kaod.org> wrote:
> 
>> On 05/16/2017 02:55 PM, Laurent Vivier wrote:
>>> On 16/05/2017 14:50, Cédric Le Goater wrote:  
>>>> On 05/16/2017 02:03 PM, Laurent Vivier wrote:  
>>>>> On 26/04/2017 09:00, David Gibson wrote:  
>>>>>> From: Cédric Le Goater <c...@kaod.org>
>>>>>>
>>>>>> Today, all the ICPs are created before the CPUs, stored in an array
>>>>>> under the sPAPR machine and linked to the CPU when the core threads
>>>>>> are realized. This modeling brings some complexity when a lookup in
>>>>>> the array is required and it can be simplified by allocating the ICPs
>>>>>> when the CPUs are.
>>>>>>
>>>>>> This is the purpose of this proposal which introduces a new 'icp_type'
>>>>>> field under the machine and creates the ICP objects of the right type
>>>>>> (KVM or not) before the PowerPCCPU object are.
>>>>>>
>>>>>> This change allows more cleanups : the removal of the icps array under
>>>>>> the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id()
>>>>>> helper.
>>>>>>
>>>>>> Signed-off-by: Cédric Le Goater <c...@kaod.org>
>>>>>> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>
>>>>>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
>>>>>> ---
>>>>>>  hw/intc/xics.c          | 11 -----------
>>>>>>  hw/ppc/spapr.c          | 47 
>>>>>> ++++++++++++++---------------------------------
>>>>>>  hw/ppc/spapr_cpu_core.c | 18 ++++++++++++++----
>>>>>>  include/hw/ppc/spapr.h  |  2 +-
>>>>>>  include/hw/ppc/xics.h   |  2 --
>>>>>>  5 files changed, 29 insertions(+), 51 deletions(-)
>>>>>>  
>>>>>
>>>>> This commit breaks CPU re-hotplugging with KVM
>>>>>
>>>>> the sequence "device_add, device_del, device_add" brings to the
>>>>> following error message:
>>>>>
>>>>>     Unable to connect CPUx to kernel XICS: Device or resource busy
>>>>>
>>>>> It comes from icp_kvm_cpu_setup():
>>>>>
>>>>> ...
>>>>>     ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0, kernel_xics_fd,
>>>>>                               kvm_arch_vcpu_id(cs));
>>>>>     if (ret < 0) {
>>>>>         error_report("Unable to connect CPU%ld to kernel XICS: %s",
>>>>>                      kvm_arch_vcpu_id(cs), strerror(errno));
>>>>>         exit(1);
>>>>>     }
>>>>> ..
>>>>>
>>>>> It should be protected by cap_irq_xics_enabled:
>>>>>
>>>>> ...
>>>>>     /*
>>>>>      * If we are reusing a parked vCPU fd corresponding to the CPU
>>>>>      * which was hot-removed earlier we don't have to renable
>>>>>      * KVM_CAP_IRQ_XICS capability again.
>>>>>      */
>>>>>     if (icp->cap_irq_xics_enabled) {
>>>>>         return;
>>>>>     }
>>>>>
>>>>> ...
>>>>>     ret = kvm_vcpu_enable_cap(...);
>>>>> ...
>>>>>     icp->cap_irq_xics_enabled = true;
>>>>> ...
>>>>>
>>>>> But since this commit, "icp" is a new object on each call:
>>>>>
>>>>> spapr_cpu_core_realize_child()
>>>>> ...
>>>>>     obj = object_new(spapr->icp_type);
>>>>> ...
>>>>>     xics_cpu_setup(XICS_FABRIC(spapr), cpu, ICP(obj));
>>>>>     ...
>>>>>             icpc->cpu_setup(icp, cpu); -> icp_kvm_cpu_setup()
>>>>>     ...
>>>>> ...
>>>>>
>>>>> and "cap_irq_xics_enabled" is reinitialized.
>>>>>
>>>>> Any idea how to fix that?  
>>>>
>>>> it seems that a cleanup is not done in the kernel. We are missing
>>>> a way to call kvmppc_xics_free_icp() from QEMU. Today the only
>>>> way is to destroy the vcpu.   
>>>
>>> The commit introducing this hack, for reference:
>>>
>>> commit a45863bda90daa8ec39e5a312b9734fd4665b016
>>> Author: Bharata B Rao <bhar...@linux.vnet.ibm.com>
>>> Date:   Thu Jul 2 16:23:20 2015 +1000
>>>
>>>     xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
>>>     
>>>     When supporting CPU hot removal by parking the vCPU fd and reusing
>>>     it during hotplug again, there can be cases where we try to reenable
>>>     KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled.
>>>     Introduce a boolean member in ICPState to track this and don't
>>>     reenable the CAP if it was already enabled earlier.
>>>     
>>>     Re-enabling this CAP should ideally work, but currently it results in
>>>     kernel trying to create and associate ICP with this vCPU and that
>>>     fails since there is already an ICP associated with it. Hence this
>>>     patch is needed to work around this problem in the kernel.
>>>     
>>>     This change allows CPU hot removal to work for sPAPR.
>>>     
>>>     Signed-off-by: Bharata B Rao <bhar...@linux.vnet.ibm.com>
>>>     Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>
>>>     Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
>>>     Signed-off-by: Alexander Graf <ag...@suse.de>  
>>
>> OK. 
>>
>> Greg is looking at re-adding the ICPState array because of a 
>> migration issue with older machines. We might need to do so 
>> unconditionally ...
>>
> 
> That would be a pity to carry on with the pre-allocated ICPStates for
> new machine types just because of that... What about keeping track
> of all the cap_irq_xics_enabled flags in a separate max_cpus sized
> static array ?

Could we use 'cpu->unplug' instead ? 

C. 


>> But for that specific issue, I think it would have been better 
>> to clean up the kernel state. Is that possible ? 
>>
> 
> Commit 4c055ab54fae ("cpu: Reclaim vCPU objects") gives some more details
> on why we don't destroy the vCPU in KVM on unplug, but rather park the vCPU
> fd for later use... so I'm not sure we can clean up the kernel state.
> 
> But since the vCPU is still present, maybe we can find a way to tell KVM
> that we want to reuse an already present ICP ?
> 
>> Thanks,
>>
>> C.
>>  
>>
>>>> Else we need to reintroduce the array of icps (again) to keep some 
>>>> xics state ... but that just sucks :/ Let me think about it. 
>>>>  
>>>
>>> Thanks,
>>> Laurent  
>>>> C.
>>>>  
>>>   
>>
> 


Reply via email to