On 07/07/2016 10:31, Wanpeng Li wrote:
> 2016-07-07 16:10 GMT+08:00 Paolo Bonzini <pbonz...@redhat.com>:
>>
>>
>> On 07/07/2016 05:46, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng...@hotmail.com>
>>>
>>> We will go to vcpu_run() loop after L0 emulates VMRESUME which incurs
>>> kvm_sched_out and kvm_sched_in operations since cond_resched() will be
>>> called once need resched. Preemption timer will be reprogrammed if vCPU
>>> is scheduled to a different pCPU. Then the preemption timer bit of vmcs02
>>> will be set if L0 enable preemption timer to run L1 even if L1 doesn't
>>> enable preemption timer to run L2.
>>>
>>> This patch fix it by don't reprogram preemption timer of vmcs02 if L1's
>>> vCPU is scheduled on diffent pCPU when we are in the way to vmresume
>>> nested guest.
>>
>> Again, this is wrong.  There is no reason why L1's APIC timer cannot be
>> emulated through the vmcs12's preemption timer setting.  The only issue
>> is getting the pin-based execution controls right.
>
> This patch doesn't intend to implement "L1 TSC deadline timer to
> trigger while L2 is running", it just solves why vmcs02 is set even if
>
>     exec_control = vmcs12->pin_based_vm_exec_control;
>     exec_control |= vmcs_config.pin_based_exec_ctrl;
>     exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
>
> We should set pin-based execution controls right to implement "L1 TSC
> deadline timer to trigger while L2 is running".

Ok, now I get it, but I still cannot understand the logic in your patch.
You write:

    if (!is_guest_mode(vcpu) &&
        kvm_lapic_hv_timer_in_use(vcpu) &&
        kvm_x86_ops->set_hv_timer(vcpu,
                kvm_get_lapic_tscdeadline_msr(vcpu)))
            kvm_lapic_switch_to_sw_timer(vcpu);

but this means that while L2 runs you miss L1's APIC timer interrupt.
Do you want this instead:

    if (kvm_lapic_hv_timer_in_use(vcpu) &&
        (is_guest_mode(vcpu) ||
         kvm_x86_ops->set_hv_timer(vcpu,
                kvm_get_lapic_tscdeadline_msr(vcpu))))
            kvm_lapic_switch_to_sw_timer(vcpu);

?

Thanks,

Paolo