On 2015.4.29 5:31, Boris Ostrovsky wrote:
> On 04/28/2015 08:30 AM, Ouyang Zhaowei (Charles) wrote:
>>
>> On 2015.4.26 7:31, Boris Ostrovsky wrote:
>>> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote:
>>>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the 
>>>> structure HYPERVISOR_shared_info,
>>>> and the others is not, but after 'xl save -c/restore' the vcpu_info will 
>>>> be reinitialized,
>>>> the vcpu_info of all the vcpus will be considered as the member of 
>>>> HYPERVISOR_shared_info.
>>>> This will cause the cpu1 and other cpu keep receiving interrupts, and the 
>>>> cpu0 is waiting them to
>>>> finish the job.
>>>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save 
>>>> -c/restore'.
>>>>
>>>> Signed-off-by: Charles Ouyang <ouyangzhao...@huawei.com>
>>>> ---
>>>>    arch/x86/xen/suspend.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
>>>> index d949769..b2bed45 100644
>>>> --- a/arch/x86/xen/suspend.c
>>>> +++ b/arch/x86/xen/suspend.c
>>>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled)
>>>>    {
>>>>    #ifdef CONFIG_XEN_PVHVM
>>>>           int cpu;
>>>> -       xen_hvm_init_shared_info();
>>>> +       if (!suspend_cancelled)
>>>> +               xen_hvm_init_shared_info();
>>>>           xen_callback_vector();
>>>>           xen_unplug_emulated_devices();
>>>>           if (xen_feature(XENFEAT_hvm_safe_pvclock)) {
>>> Do we need to call other routines if suspend is canceled?
>>>
>>> Also, if suspend is canceled then we don't do xen_irq_resume() if that's 
>>> what you meant by "vcpu_info will be reinitialized". Were you referring 
>>> some other re-initialization?
>>>
>> Hi Boris,
>>
>> Sorry I didn't make myself clear.
>>
>> About the "vcpu_info reinitialize", I mean in the function 
>> "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all
>> point to HYPERVISOR_shared_info->vcpu_info[cpu].
>>
>> void __ref xen_hvm_init_shared_info(void)
>> ----
>> 1702          * When xen_hvm_init_shared_info is run at boot time only vcpu 
>> 0 is
>> 1703          * online but xen_hvm_init_shared_info is run at resume time 
>> too and
>> 1704          * in that case multiple vcpus might be online. */
>> 1705         for_each_online_cpu(cpu) {
>> 1706                 /* Leave it to be NULL. */
>> 1707                 if (cpu >= MAX_VIRT_CPUS)
>> 1708                         continue;
>> 1709                 per_cpu(xen_vcpu, cpu) = 
>> &HYPERVISOR_shared_info->vcpu_info[cpu];
>> 1710         }
>> 1711 }
>>
>>
>> But on Xen boot the init function "xen_start_kernel" only set the cpu0 to 
>> point to  HYPERVISOR_shared_info->vcpu_info[0]
>>
>> asmlinkage __visible void __init xen_start_kernel(void)
> 
> 
> We are talking about HVM guests here and xen_start_kernel is only called for 
> PV.  But even if it was, xen_vcpu pointers for other VCPUs are set in 
> xen_vcpu_setup(), which is called when non-boot VCPUs are coming up.
> 
> And I wonder whether the actual problem is that we don't call 
> xen_vcpu_setup() on canceled suspend (as we don't need to, really) and 
> therefore if we call xen_hvm_init_shared_info() then per_cpu(xen_vcpu,cpu) 
> for *non-boot* cpus is will become wrong.
> 

Yes, you are right, in xen_vcpu_setup() non-boot VCPUs is set to point to 
xen_vcpu_info

static void xen_vcpu_setup(int cpu)
----
 208         vcpup = &per_cpu(xen_vcpu_info, cpu);
...
 227                 /* This cpu is using the registered vcpu info, even if
 228                    later ones fail to. */
 229                 per_cpu(xen_vcpu, cpu) = vcpup;

But on canceled suspend if we call xen_hvm_init_shared_info(), the non-boot 
VCPUS will be reset to point to HYPERVISOR_shared_info->vcpu_info[cpu] which is 
a wrong address.
So I suggest we don't call xen_hvm_init_shared_info() when suspend is canceled.

> -boris
> 
>> ----
>> 1563         /* Don't do the full vcpu_info placement stuff until we have a
>> 1564            possible map and a non-dummy shared_info. */
>> 1565         per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
>> 1566
>> 1567         local_irq_disable();
>>
>> Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup().
>>
>> So after xl save -c/restore, the pointer xen_vcpu will be reset in function 
>> "xen_hvm_init_shared_info" and point to a wrong place.
>> This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not 
>> necessary to call xen_hvm_init_shared_info again if
>> suspend is canceled.
>>
>>> (The patch itself looks like the right thing to do though).
>>>
>>> -boris
>>>
>>> .
>>>
> 
> 
> .
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to