RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
Wu, Feng wrote on 2014-12-19:
> KVM is using the Linux scheduler, when the preempted vCPU (in
> runqueue) is scheduled again depends on the scheduling algorithm
> itself, I think it is a little hard for us to get involved.
>
> I think what you mentioned is a little like the urgent interrupt in
> VT-d PI Spec. For this kind of interrupts, if an interrupt is coming
> for a preempted vCPU (waiting in the run queue), we need to schedule
> the vCPU immediately. This is some real time things. And we don't
> support urgent interrupt so far.

Yes. IIRC, if we use two global vectors mechanism properly, there should
be no need to use hardware urgent interrupt mechanism. :)
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
> -----Original Message-----
> From: Zhang, Yang Z
> Sent: Friday, December 19, 2014 1:26 PM
> To: Wu, Feng; Paolo Bonzini; kvm@vger.kernel.org
> Cc: io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> kvm@vger.kernel.org
> Subject: RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
>
> I see your point. But from performance point, if we can schedule the vCPU to
> another PCPU to handle the interrupt, it would be helpful. But I remember current
> KVM will not schedule the vCPU in run queue (even though it got preempted) to
> another pCPU to run (am I right?). So it may be hard to do it.
>

KVM is using the Linux scheduler, when the preempted vCPU (in runqueue)
is scheduled again depends on the scheduling algorithm itself, I think it
is a little hard for us to get involved.

I think what you mentioned is a little like the urgent interrupt in VT-d
PI Spec. For this kind of interrupts, if an interrupt is coming for a
preempted vCPU (waiting in the run queue), we need to schedule the vCPU
immediately. This is some real time things. And we don't support urgent
interrupt so far.

Thanks,
Feng
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
Wu, Feng wrote on 2014-12-19:
> Wakeup vector is only used for blocking case, when vCPU is preempted
> and waiting in the runqueue, the NV is the notification vector.

I see your point. But from performance point, if we can schedule the vCPU
to another PCPU to handle the interrupt, it would be helpful. But I
remember current KVM will not schedule the vCPU in run queue (even though
it got preempted) to another pCPU to run (am I right?). So it may be hard
to do it.

Best regards,
Yang
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
> -----Original Message-----
> From: Zhang, Yang Z
> Sent: Friday, December 19, 2014 12:44 PM
> To: Wu, Feng; Paolo Bonzini; kvm@vger.kernel.org
> Cc: io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> kvm@vger.kernel.org
> Subject: RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
>
> The PI vector for vCPU1 is notification vector, but the PI vector for vCPU0
> should be wakeup vector. Why vCPU1 will consume this PI event?

Wakeup vector is only used for blocking case, when vCPU is preempted and
waiting in the runqueue, the NV is the notification vector.

Thanks,
Feng
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
Wu, Feng wrote on 2014-12-19:
> See this scenario:
> vCPU0 is running on pCPU0
> --> vCPU0 is preempted by vCPU1
> --> Then vCPU1 is running on pCPU0 and vCPU0 is waiting for schedule
> --> in runqueue
>
> If we don't set SN for vCPU0, then all subsequent interrupts for
> vCPU0 is posted to vCPU1, this will consume hardware and software

The PI vector for vCPU1 is notification vector, but the PI vector for
vCPU0 should be wakeup vector. Why vCPU1 will consume this PI event?

> efforts and in fact it is not needed at all. If SN is set for vCPU0,
> VT-d hardware will not issue Notification Event for vCPU0 when an
> interrupt is for it, but just setting the related PIR bit.
>
> Thanks,
> Feng

Best regards,
Yang
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
> -----Original Message-----
> From: Zhang, Yang Z
> Sent: Friday, December 19, 2014 11:33 AM
> To: Wu, Feng; Paolo Bonzini; kvm@vger.kernel.org
> Cc: io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> kvm@vger.kernel.org
> Subject: RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
>
> I mean what happens if we don't set SN bit. From my point, if preempter
> already disabled the interrupt, it is ok to leave SN bit as zero. But if
> preempter
> enabled the interrupt, doesn't this mean he allow interrupt to happen? BTW,
> since there already has ON bit, so this means there only have one interrupt
> arrived at most and it doesn't hurt performance. Do we really need to set SN
> bit?

See this scenario:
vCPU0 is running on pCPU0
--> vCPU0 is preempted by vCPU1
--> Then vCPU1 is running on pCPU0 and vCPU0 is waiting for schedule in runqueue

If we don't set SN for vCPU0, then all subsequent interrupts for vCPU0
is posted to vCPU1, this will consume hardware and software efforts and
in fact it is not needed at all. If SN is set for vCPU0, VT-d hardware
will not issue Notification Event for vCPU0 when an interrupt is for it,
but just setting the related PIR bit.

Thanks,
Feng
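
The SN and ON behaviour discussed in the message above can be sketched
with a small user-space model. This is only an illustration under stated
assumptions: struct pi_desc_model, pi_post_interrupt() and
pi_vector_pending() are hypothetical names, the layout is not the real
descriptor layout, and every interrupt is treated as non-urgent, as in
the patch description.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of a posted-interrupt descriptor.
 * The field names mirror the terms used in this thread (PIR, ON, SN);
 * this is NOT the hardware layout and NOT the KVM structure.
 */
struct pi_desc_model {
	uint32_t pir[8];	/* 256 posted-interrupt request bits */
	bool on;		/* Outstanding Notification */
	bool sn;		/* Suppress Notification */
};

/*
 * Model of posting one non-urgent interrupt.
 * The PIR bit is always recorded. A notification event is generated
 * only when SN is clear and no notification is already outstanding.
 * Returns true if a notification event would be sent.
 */
static bool pi_post_interrupt(struct pi_desc_model *pd, uint8_t vector)
{
	pd->pir[vector / 32] |= UINT32_C(1) << (vector % 32);

	if (pd->sn || pd->on)
		return false;

	pd->on = true;
	return true;
}

/* Is the given vector recorded in the PIR? */
static bool pi_vector_pending(const struct pi_desc_model *pd, uint8_t vector)
{
	return (pd->pir[vector / 32] >> (vector % 32)) & 1u;
}
```

With sn set, pi_post_interrupt() only records the vector, which is the
"just setting the related PIR bit" case. The ON check models the remark
that at most one notification is outstanding at a time.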
Re: [question] Why newer QEMU may lose irq when doing migration?
2014-12-17 18:46 GMT+08:00 Paolo Bonzini:
>
>
> On 17/12/2014 04:46, Wincy Van wrote:
>> Hi, all:
>>
>> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
>> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
>> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
>> introduced a bug (see
>> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
>>
>> From the description "Unlike the old qemu-kvm, which really never did
>> that, with new QEMU it is for some reason
>> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
>>
>> Why could new QEMU do that? I can not find any codes about the "some
>> reason"..
>> As we know, once a irq is set in kvm's ioapic, the ioapic will send
>> that irq to lapic, this is an "atomic" operation.
>
> It can happen if the IRQ is masked in the IOAPIC, for example. Until
> commit 0bc830b, KVM could not distinguish two cases:
>
> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
> masked
>
> 2) an edge-triggered interrupt that was raised and delivered, but for
> which userspace left the level to 1.
>

Thank you Paolo. It seems that QEMU's rtc behavior is case 2.

But before this patchset, a rtc interrupt may be lost when doing
migration, and guest will not acknowledge it, then the newer rtc
interrupts are ignored forever. I think this is none of the cases above,
because the interrupt was lost. It must be something wrong here.

>> Then, kvm will inject them in inject_pending_event(or set rvi in
>> apic-v case). QEMU will also save the pending irq when doing
>> migration.
>
> No, QEMU does not save the pending IRQ. IRQs are stateless in QEMU.
> The assumption is that after a qemu_set_irq the IRQ will be
> delivered---possibly on the other side of the migration, but it will be
> delivered.
>

I find that in kvm_arch_vcpu_ioctl_get_sregs, KVM will save pending IRQs
into sregs->interrupt_bitmap and QEMU will save it. Isn't it right?

Thanks,
Wincy

> Paolo
>
>> I can not find a point which guest could lose a irq, but this scenario
>> really exists.
>>
>> Any ideas?
>>
>>
>> Thanks,
>>
>> Wincy
>>
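
The two cases listed in the quoted reply can be told apart once IRR is
cleared at delivery, as the title of commit 0bc830b suggests. The sketch
below is a toy model of a single edge-triggered pin, not KVM's ioapic
code: struct ioapic_pin_model and pin_set_level() are hypothetical
names, and tracking the previous line level to detect a rising edge is
an assumption made for the illustration.

```c
#include <stdbool.h>

/* Hypothetical model of one edge-triggered IOAPIC pin (not KVM code). */
struct ioapic_pin_model {
	bool masked;		/* pin is masked in the redirection entry */
	bool irr;		/* request bit for this pin */
	bool level;		/* last line level set by userspace */
	unsigned int delivered;	/* interrupts handed on for delivery */
};

/*
 * Set the line level. A rising edge latches IRR. If the pin is not
 * masked, the interrupt is delivered and IRR is cleared at delivery.
 */
static void pin_set_level(struct ioapic_pin_model *p, bool level)
{
	bool rising = level && !p->level;

	p->level = level;
	if (!rising)
		return;

	p->irr = true;
	if (!p->masked) {
		p->delivered++;
		p->irr = false;	/* cleared at delivery */
	}
}
```

In this model, case 1 (raised while masked) ends with irr set and
nothing delivered, while case 2 (raised and delivered, level left at 1)
ends with irr clear, so a nonzero IRR on an edge-triggered pin means the
interrupt is still pending.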
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
Wu, Feng wrote on 2014-12-19:
> If a vCPU is preempted, the 'SN' bit is set, the subsequent interrupts
> are suppressed for posting.

I mean what happens if we don't set SN bit. From my point, if preempter
already disabled the interrupt, it is ok to leave SN bit as zero. But if
preempter enabled the interrupt, doesn't this mean he allow interrupt to
happen? BTW, since there already has ON bit, so this means there only
have one interrupt arrived at most and it doesn't hurt performance. Do we
really need to set SN bit?

Best regards,
Yang
[Bug 90081] New: Windows Guest does not resized
https://bugzilla.kernel.org/show_bug.cgi?id=90081

            Bug ID: 90081
           Summary: Windows Guest does not resized
           Product: Virtualization
           Version: unspecified
    Kernel Version: 2.6.32-504.3.3.el6.x86_64 #1 SMP Wed Dec 17 01:55:02
                    UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: kvm
          Assignee: virtualization_...@kernel-bugs.osdl.org
          Reporter: theway...@gmail.com
        Regression: No

Created attachment 161291
  --> https://bugzilla.kernel.org/attachment.cgi?id=161291&action=edit
Disk Management Page

Issue:
For a Windows 2008 R2 SP1 guest, the disk partition was resized: Disk
Management shows 30GB, but My Computer shows the template's default disk
size of 10GB.

[root@node101 ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)

We have tried 6.5 and 6.6. We have two other KVM nodes for testing; 6.5
is working well on the other node, and it does not have this issue.
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
> -----Original Message-----
> From: iommu-boun...@lists.linux-foundation.org
> [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of Zhang, Yang Z
> Sent: Thursday, December 18, 2014 11:10 PM
> To: Paolo Bonzini; kvm@vger.kernel.org
> Cc: io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> kvm@vger.kernel.org
> Subject: RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
>
> Paolo Bonzini wrote on 2014-12-18:
> >
> >
> > On 18/12/2014 04:14, Wu, Feng wrote:
> >>>
> >>> On 12/12/2014 16:14, Feng Wu wrote:
> >>>> Currently, we don't support urgent interrupt, all interrupts are
> >>>> recognized as non-urgent interrupt, so we cannot send
> >>>> posted-interrupt when 'SN' is set.
> >>>
> >>> Can this happen? If the vcpu is in guest mode, it cannot have been
> >>> scheduled out, and that's the only case when SN is set.
> >>>
> >>> Paolo
> >>
> >> Currently, the only place where SN is set is vCPU is preempted and
>
> If the vCPU is preempted, shouldn't the subsequent be ignored? What happens
> if a PI occurs when vCPU is preempted?

If a vCPU is preempted, the 'SN' bit is set, the subsequent interrupts
are suppressed for posting.

Thanks,
Feng

>
> >> waiting for the next scheduling in the runqueue. But I am not sure
> >> whether we need to set SN for other purpose in future. Adding SN
> >> checking here is just to follow the Spec. non-urgent interrupts are
> >> suppressed when SN is set.
> >
> > I would change that to a WARN_ON_ONCE then.
>
>
> Best regards,
> Yang
RE: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Paolo Bonzini > Sent: Thursday, December 18, 2014 4:37 PM > To: linux-ker...@vger.kernel.org > Cc: io...@lists.linux-foundation.org; kvm@vger.kernel.org; > linux-ker...@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU > is blocked > > > > On 18/12/2014 04:16, Wu, Feng wrote: > >>> pre-block: > >>> - Add the vCPU to the blocked per-CPU list > >>> - Clear 'SN' > >> > >> Should SN be already clear (and NV set to POSTED_INTR_VECTOR)? > > > > I think the SN bit should be clear here, Adding it here is just to make sure > > SN is clear when vCPU is blocked, so it can receive wakeup notification > > event > later. > > Then, please, WARN if the SN bit is set inside the if (vcpu->blocked). > Inside that if you can just add the vCPU to the blocked list on vcpu_put. > > >> Can it > >> happen that you go from sched-out to blocked without doing a sched-in > first? > >> > > > > I cannot imagine this scenario, can you please be more specific? Thanks a > > lot! > > I cannot either. :) But it would be the case where SN is not cleared. > So we agree that it cannot happen. > > >> In fact, if this is possible, what happens if vcpu->preempted && > >> vcpu->blocked? > > > > In fact, vcpu->preempted && vcpu->blocked happens sometimes, but I think > there is > > no issues. Please refer to the following case: > > I agree that there should be no issues. But if it can happen, it's better: > > 1) to separate the handling of preemption and blocking: preemption > handles SN/NV/NDST, blocking handles the wakeup list. > Sorry, I don't quite understand this. I think handling of preemption and blocking is separated in vmx_vcpu_put(). For vmx_vcpu_load(), the handling of SN/NV/NDST is common for preemption and blocking. 
Thanks, Feng > 2) to change this > > + } else if (vcpu->blocked) { > + /* > + * The vcpu is blocked on the wait queue. > + * Store the blocked vCPU on the list of the > + * vcpu->wakeup_cpu, which is the destination > + * of the wake-up notification event. > > to just > > } > if (vcpu->blocked) { > ... > } > > kvm_vcpu_block() > > -> vcpu->blocked = true; > > -> prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE); > > > > before schedule() is called, this vcpu is woken up by another guy, so > > the state of the vcpu associated thread is changed to TASK_RUNNING, > > then preemption happens after interrupts or the following schedule() is > > hit, this will call kvm_sched_out(), in which current->state == > TASK_RUNNING > > and vcpu->preempted is set to true. So now vcpu->preempted and > vcpu->blocked > > are both true. In vmx_vcpu_put(), we will check vcpu->preempted first, > > so > > the vCPU will not be blocked, and the vcpu->blocked will be set the > > false in > > vmx_vcpu_load(). > > > > But maybe I need do a little change to the vmx_vcpu_load() like below: > > > > /* > > * Delete the vCPU from the related wakeup queue > > * if we are resuming from blocked state > > */ > > if (vcpu->blocked) { > > vcpu->blocked = false; > > + /* if wakeup_cpu == -1, the > > vcpu is currently not > blocked on any > > + pCPU, don't need dequeue here > > */ > > + if (vcpu->wakeup_cpu != -1) { > > > spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock, > > vcpu->wakeup_cpu), flags); > > list_del(&vcpu->blocked_vcpu_list); > > > spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock, > > vcpu->wakeup_cpu), flags); > > vcpu->wakeup_cpu = -1; > > + } > > } > > Good idea. > > Paolo > > > Any ideas about this? Thanks a lot! 
> > > > Thanks, > > Feng > > > > > > -> schedule(); > > > > > >> > >>> - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR > >>> > >>> post-block: > >>> - Remove the vCPU from the per-CPU list > >> > >> Paolo > >> > >>> Signed-off-by: Feng Wu > >> -- > >> To unsubscribe from this list: send the line "unsubscribe kvm" in > >> the body of a message to majord...@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://
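The guarded dequeue discussed in this message can be sketched in userspace C roughly as follows. This is a minimal sketch under stated assumptions, not the patch itself: the `demo_` types and helpers are hypothetical stand-ins for the kernel's list and vCPU structures, the per-CPU spinlocks from the quoted hunk are left out, and only the "queue on block, dequeue only if wakeup_cpu != -1" logic is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *h)
{
	h->prev = h->next = h;
}

static void demo_list_add(struct demo_list *n, struct demo_list *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

static void demo_list_del(struct demo_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Hypothetical subset of the vCPU state named in the thread. */
struct demo_vcpu {
	bool blocked;
	int wakeup_cpu;			/* -1: not queued on any pCPU */
	struct demo_list blocked_vcpu_list;
};

/* vcpu_put side: a preempted vCPU is never put on the wakeup list. */
static void demo_vcpu_put(struct demo_vcpu *v, bool preempted,
			  struct demo_list *per_cpu_heads, int cpu)
{
	assert(cpu >= 0);
	if (preempted)
		return;		/* SN handling for preemption is not modelled */
	if (v->blocked) {
		demo_list_add(&v->blocked_vcpu_list, &per_cpu_heads[cpu]);
		v->wakeup_cpu = cpu;
	}
}

/* vcpu_load side: dequeue only if the vCPU was actually queued. */
static void demo_vcpu_load(struct demo_vcpu *v)
{
	if (!v->blocked)
		return;
	v->blocked = false;
	if (v->wakeup_cpu != -1) {
		/* the real code would hold the per-CPU lock here */
		demo_list_del(&v->blocked_vcpu_list);
		v->wakeup_cpu = -1;
	}
}
```

In the preempted-and-blocked case described above, `demo_vcpu_put()` leaves `wakeup_cpu` at -1, so `demo_vcpu_load()` clears `blocked` without touching a list the vCPU was never on.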
Re: [PATCH] kvm: x86: remove vmx_vm_has_apicv() outside of hwapic_isr_update()
On 2014/12/1 19:43, Paolo Bonzini wrote:
>
> On 01/12/2014 10:28, Tiejun Chen wrote:
>> In most cases where hwapic_isr_update() is called, we already check
>> whether kvm_apic_vid_enabled() == 1, and in fact
>>
>> kvm_apic_vid_enabled()
>>     -> kvm_x86_ops->vm_has_apicv()
>>         -> vmx_vm_has_apicv(), or '0' in the svm case
>>
>> So it's unnecessary to check this again inside hwapic_isr_update();
>> just remove the vmx_vm_has_apicv() check and follow the other callers.
>
> If you want to do this, please NULL out the function pointer instead, as
> KVM already does for hwapic_irr_update.

Are you suggesting something like the below?

	if (enable_apicv)
		...
	else {
		kvm_x86_ops->hwapic_irr_update = NULL;

But there is a small difference when NULLing out hwapic_isr_update():

static int vmx_vm_has_apicv(struct kvm *kvm)
{
	return enable_apicv && irqchip_in_kernel(kvm);
}

Yes, I can do something like this:

static __init int hardware_setup(void)
{
	...
	if (enable_apicv) {
		...
		if (!irqchip_in_kernel(kvm))
			kvm_x86_ops->hwapic_isr_update = NULL;
	} else {
		...
		kvm_x86_ops->hwapic_isr_update = NULL;

But this means we would have to change hardware_setup() to get 'kvm'
inside, then rework the other callers of hwapic_isr_update(). Is that
really worthwhile? My intent here is only to remove a duplicated check
with a small amount of code, so it may not be worth changing much more.

Tiejun

> Paolo
>
>> Signed-off-by: Tiejun Chen
>> ---
>>  arch/x86/kvm/lapic.c | 3 ++-
>>  arch/x86/kvm/vmx.c   | 3 ---
>>  2 files changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index e0e5642..2ddc426 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1739,7 +1739,8 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>  	if (kvm_x86_ops->hwapic_irr_update)
>>  		kvm_x86_ops->hwapic_irr_update(vcpu,
>>  				apic_find_highest_irr(apic));
>> -	kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>> +	if (kvm_apic_vid_enabled(vcpu->kvm))
>> +		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
>>  	kvm_rtc_eoi_tracking_restore_one(vcpu);
>>  }
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 6a951d8..f0c16a9 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -7406,9 +7406,6 @@ static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>>  	u16 status;
>>  	u8 old;
>>
>> -	if (!vmx_vm_has_apicv(kvm))
>> -		return;
>> -
>>  	if (isr == -1)
>>  		isr = 0;
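The "NULL out the function pointer" pattern raised in this thread can be illustrated with a small userspace sketch. Everything named `demo_` is hypothetical and only stands in for kvm_x86_ops and its callers. The sketch models just the global enable_apicv switch; it deliberately leaves out the per-VM irqchip_in_kernel(kvm) condition, which is the part that cannot be decided at setup time and is the difficulty pointed out in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kvm_x86_ops; only one hook matters here. */
struct demo_x86_ops {
	void (*hwapic_isr_update)(int isr);
};

static int demo_last_isr = -2;	/* last value the hook saw, -2 = never called */

static void demo_hwapic_isr_update(int isr)
{
	if (isr == -1)
		isr = 0;
	demo_last_isr = isr;
}

/* Setup time: decide once whether the hook exists at all. */
static void demo_hardware_setup(struct demo_x86_ops *ops, bool enable_apicv)
{
	assert(ops != NULL);
	ops->hwapic_isr_update = enable_apicv ? demo_hwapic_isr_update : NULL;
}

/*
 * Call site: test the pointer instead of re-checking the feature inside
 * the hook. Returns true if the hook was invoked.
 */
static bool demo_post_state_restore(const struct demo_x86_ops *ops,
				    int highest_isr)
{
	if (!ops->hwapic_isr_update)
		return false;
	ops->hwapic_isr_update(highest_isr);
	return true;
}
```

With this shape the hook body needs no feature check of its own, at the cost of every caller testing the pointer first.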
RE: [v3 16/26] KVM: Make struct kvm_irq_routing_table accessible
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Paolo Bonzini > Sent: Thursday, December 18, 2014 12:18 AM > To: linux-ker...@vger.kernel.org > Cc: io...@lists.linux-foundation.org; kvm@vger.kernel.org; > linux-ker...@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [v3 16/26] KVM: Make struct kvm_irq_routing_table accessible > > > > On 12/12/2014 16:14, Feng Wu wrote: > > Move struct kvm_irq_routing_table from irqchip.c to kvm_host.h, > > so we can use it outside of irqchip.c. > > > > Signed-off-by: Feng Wu > > --- > > include/linux/kvm_host.h | 19 +++ > > virt/kvm/irqchip.c | 11 --- > > 2 files changed, 19 insertions(+), 11 deletions(-) > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > > index 0b9659d..cfa85ac 100644 > > --- a/include/linux/kvm_host.h > > +++ b/include/linux/kvm_host.h > > @@ -335,6 +335,25 @@ struct kvm_kernel_irq_routing_entry { > > struct hlist_node link; > > }; > > > > +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING > > + > > +struct kvm_irq_routing_table { > > + int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS]; > > + struct kvm_kernel_irq_routing_entry *rt_entries; > > + u32 nr_rt_entries; > > + /* > > +* Array indexed by gsi. Each entry contains list of irq chips > > +* the gsi is connected to. > > +*/ > > + struct hlist_head map[0]; > > +}; > > + > > +#else > > + > > +struct kvm_irq_routing_table {}; > > If possible, just make this "struct kvm_irq_routing_table;" and pull > this line to include/linux/kvm_types.h. > > Paolo Do you mean move the definition of struct kvm_irq_routing_table to include/linux/kvm_types.h and add a declaration here? 
Thanks, Feng > > > + > > +#endif > > + > > #ifndef KVM_PRIVATE_MEM_SLOTS > > #define KVM_PRIVATE_MEM_SLOTS 0 > > #endif > > diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c > > index 7f256f3..cdf29a6 100644 > > --- a/virt/kvm/irqchip.c > > +++ b/virt/kvm/irqchip.c > > @@ -31,17 +31,6 @@ > > #include > > #include "irq.h" > > > > -struct kvm_irq_routing_table { > > - int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS]; > > - struct kvm_kernel_irq_routing_entry *rt_entries; > > - u32 nr_rt_entries; > > - /* > > -* Array indexed by gsi. Each entry contains list of irq chips > > -* the gsi is connected to. > > -*/ > > - struct hlist_head map[0]; > > -}; > > - > > int kvm_irq_map_gsi(struct kvm *kvm, > > struct kvm_kernel_irq_routing_entry *entries, int gsi) > > { > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [v3 23/26] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
> -----Original Message-----
> From: linux-kernel-ow...@vger.kernel.org
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Paolo Bonzini
> Sent: Thursday, December 18, 2014 4:32 PM
> To: linux-ker...@vger.kernel.org
> Cc: io...@lists.linux-foundation.org; kvm@vger.kernel.org;
> linux-ker...@vger.kernel.org; kvm@vger.kernel.org
> Subject: Re: [v3 23/26] KVM: Update Posted-Interrupts Descriptor when vCPU
> is preempted
>
> On 18/12/2014 04:15, Wu, Feng wrote:
> > Thanks for your comments, Paolo!
> >
> > If we use u64 new_control, we cannot use new.sn any more.
> > Maybe we can change the struct pi_desc {} like this:
> >
> > typedef struct pid_control{
> >         u64     on      : 1,
> >                 sn      : 1,
> >                 rsvd_1  : 13,
> >                 ndm     : 1,
> >                 nv      : 8,
> >                 rsvd_2  : 8,
> >                 ndst    : 32;
> > }pid_control_t;
> >
> > struct pi_desc {
> >         u32 pir[8];     /* Posted interrupt requested */
> >         pid_control_t control;
>
> Probably something like this to keep the union:
>
>         typedef union pid_control {
>                 u64 full;
>                 struct {
>                         u64 on : 1,
>                         ...
>                 } fields;
>         };
>
> >         u32 rsvd[6];
> > } __aligned(64);
> >
> > Then we can define pid_control_t new_control, old_control. And use
> > new_control.sn = 0.
> >
> > What is your opinion?
>
> Sure. Alternatively, keep using struct pi_desc new; just
> do not zero it, nor access any field outside the control word.
>
> Paolo

Yes, this is also a good idea. Thanks!

Thanks,
Feng
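For readers who want to try the union layout suggested here, a minimal userspace sketch follows. It assumes GCC or Clang on little-endian x86 (bit-field ordering is implementation-defined in C), uses the compilers' `__atomic` builtins in place of the kernel's cmpxchg(), and assumes the control word is updated with a compare-and-swap loop, which the quoted text does not show. The `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 64-bit control word, following the field list proposed in the thread:
 * 1 + 1 + 13 + 1 + 8 + 8 + 32 = 64 bits.
 */
union demo_pid_control {
	uint64_t full;
	struct {
		uint64_t on     : 1,
			 sn     : 1,
			 rsvd_1 : 13,
			 ndm    : 1,
			 nv     : 8,
			 rsvd_2 : 8,
			 ndst   : 32;
	} fields;
};

/*
 * Clear SN and set NV in one atomic step. The fields are edited on a
 * private copy and published with a compare-and-swap on the whole word,
 * so a concurrent change to another field (for example ON) is either
 * preserved or causes a retry.
 */
static void demo_pi_set_nv_clear_sn(union demo_pid_control *ctl, uint8_t nv)
{
	union demo_pid_control old, new;

	assert(ctl != NULL);
	old.full = __atomic_load_n(&ctl->full, __ATOMIC_RELAXED);
	do {
		new.full = old.full;
		new.fields.sn = 0;
		new.fields.nv = nv;
		/* on failure, old.full is refreshed with the current value */
	} while (!__atomic_compare_exchange_n(&ctl->full, &old.full, new.full,
					      0, __ATOMIC_SEQ_CST,
					      __ATOMIC_RELAXED));
}
```

Keeping `full` next to `fields` is what lets the named fields and the single 64-bit compare-and-swap coexist, which is the point of keeping the union.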
RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
Wu, Feng wrote on 2014-12-19:
>
> Paolo Bonzini wrote on 2014-12-19:
>> jiang@linux.intel.com
>> Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org;
>> io...@lists.linux-foundation.org; kvm@vger.kernel.org
>> Subject: Re: [v3 13/26] KVM: Define a new interface
>> kvm_find_dest_vcpu() for VT-d PI
>>
>> On 18/12/2014 15:49, Zhang, Yang Z wrote:
>>>>> Here, we introduce a method similar to 'apic_arb_prio' to handle
>>>>> guest lowest-priority interrupts when VT-d PI is used. Here is
>>>>> the idea:
>>>>> - Each vCPU has a counter 'round_robin_counter'.
>>>>> - When the guest sets an interrupt to lowest priority, we choose
>>>>>   the vCPU with the smallest 'round_robin_counter' as the
>>>>>   destination, then increase it.
>>>
>>> How can this work well? Are all subsequent interrupts delivered to
>>> one vCPU? It shouldn't be the best solution; it needs more
>>> consideration.
>>
>> Well, it's a hardware limitation. The alternative (which is easy to
>> implement) is to only do PI for single-CPU interrupts. This should
>> work well for multiqueue NICs (and of course for UP guests :)), so
>> perhaps it's a good idea to only support that as a first attempt.
>>
>> Paolo
>
> Paolo, what do you mean by "single-CPU interrupts"? Do you mean we

It should be the same idea as I mentioned in another thread: deliver the
interrupt to a single CPU (maybe the first matched vCPU?).

> don't support lowest priority interrupts for PI? But Linux uses
> lowest priority in most cases. If so, we can hardly get any benefit
> from this feature for Linux guests.
>
> Thanks,
> Feng
>
>>> Also, I think you should take apic_arb_prio into consideration,
>>> since the priority is for the whole vCPU, not for one interrupt.

Best regards,
Yang
RE: [v3 06/26] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
Wu, Feng wrote on 2014-12-19: > > > Zhang, Yang Z wrote on 2014-12-18: >> jiang@linux.intel.com >> Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org; >> io...@lists.linux-foundation.org; kvm@vger.kernel.org; Wu, Feng >> Subject: RE: [v3 06/26] iommu, x86: No need to migrating irq for >> VT-d Posted-Interrupts >> >> Feng Wu wrote on 2014-12-12: >>> We don't need to migrate the irqs for VT-d Posted-Interrupts here. >>> When 'pst' is set in IRTE, the associated irq will be posted to >>> guests instead of interrupt remapping. The destination of the >>> interrupt is set in Posted-Interrupts Descriptor, and the >>> migration happens during vCPU scheduling. >>> >>> However, we still update the cached irte here, which can be used >>> when changing back to remapping mode. >>> >>> Signed-off-by: Feng Wu >>> Reviewed-by: Jiang Liu >>> --- >>> drivers/iommu/intel_irq_remapping.c | 6 +- >>> 1 file changed, 5 insertions(+), 1 deletion(-) diff --git >>> a/drivers/iommu/intel_irq_remapping.c >>> b/drivers/iommu/intel_irq_remapping.c index 48c2051..ab9057a >>> 100644 >>> --- a/drivers/iommu/intel_irq_remapping.c +++ >>> b/drivers/iommu/intel_irq_remapping.c @@ -977,6 +977,7 @@ >>> intel_ir_set_affinity(struct irq_data *data, const struct cpumask >>> *mask, { >>> struct intel_ir_data *ir_data = data->chip_data;struct irte >>> *irte = >>> &ir_data->irte_entry; +struct irte_pi *irte_pi = (struct irte_pi >>> *)irte;struct irq_cfg *cfg = irqd_cfg(data); struct irq_data *parent >>> = data->parent_data; int ret; >>> @@ -991,7 +992,10 @@ intel_ir_set_affinity(struct irq_data *data, >>> const struct cpumask *mask, >>> */ >>> irte->vector = cfg->vector; >>> irte->dest_id = IRTE_DEST(cfg->dest_apicid); >>> - modify_irte(&ir_data->irq_2_iommu, irte); >>> + >>> + /* We don't need to modify irte if the interrupt is for posting. */ >>> + if (irte_pi->pst != 1) >>> + modify_irte(&ir_data->irq_2_iommu, irte); >> >> What happens if user changes the IRQ affinity manually? 
> > If the IRQ is posted, its affinity is controlled by guest (irq <---> > vCPU <> pCPU), it has no effect when host changes its affinity. That's the problem: User is able to changes it in host but it never takes effect since it is actually controlled by guest. I guess it will break the IRQ balance too. > > Thanks, > Feng > >> >>> >>> /* >>> * After this point, all the interrupts will start arriving >> >> >> Best regards, >> Yang >> Best regards, Yang -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [v3 06/26] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
> -Original Message- > From: Zhang, Yang Z > Sent: Thursday, December 18, 2014 10:26 PM > To: Wu, Feng; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; > x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; > dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; > jiang@linux.intel.com > Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org; > io...@lists.linux-foundation.org; kvm@vger.kernel.org; Wu, Feng > Subject: RE: [v3 06/26] iommu, x86: No need to migrating irq for VT-d > Posted-Interrupts > > Feng Wu wrote on 2014-12-12: > > We don't need to migrate the irqs for VT-d Posted-Interrupts here. > > When 'pst' is set in IRTE, the associated irq will be posted to guests > > instead of interrupt remapping. The destination of the interrupt is > > set in Posted-Interrupts Descriptor, and the migration happens during > > vCPU scheduling. > > > > However, we still update the cached irte here, which can be used when > > changing back to remapping mode. > > > > Signed-off-by: Feng Wu > > Reviewed-by: Jiang Liu > > --- > > drivers/iommu/intel_irq_remapping.c | 6 +- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/intel_irq_remapping.c > > b/drivers/iommu/intel_irq_remapping.c index 48c2051..ab9057a 100644 --- > > a/drivers/iommu/intel_irq_remapping.c +++ > > b/drivers/iommu/intel_irq_remapping.c @@ -977,6 +977,7 @@ > > intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask, > > { > > struct intel_ir_data *ir_data = data->chip_data;struct irte > > *irte = > > &ir_data->irte_entry; +struct irte_pi *irte_pi = (struct irte_pi > > *)irte;struct irq_cfg *cfg = irqd_cfg(data); struct irq_data *parent > > = data->parent_data; int ret; > > @@ -991,7 +992,10 @@ intel_ir_set_affinity(struct irq_data *data, > > const struct cpumask *mask, > > */ > > irte->vector = cfg->vector; > > irte->dest_id = IRTE_DEST(cfg->dest_apicid); > > - modify_irte(&ir_data->irq_2_iommu, irte); > > + > > + /* We don't need to modify 
irte if the interrupt is for posting. */ > > + if (irte_pi->pst != 1) > > + modify_irte(&ir_data->irq_2_iommu, irte); > > What happens if user changes the IRQ affinity manually? If the IRQ is posted, its affinity is controlled by guest (irq <---> vCPU <> pCPU), it has no effect when host changes its affinity. Thanks, Feng > > > > > /* > > * After this point, all the interrupts will start arriving > > > Best regards, > Yang > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
> -----Original Message-----
> From: Zhang, Yang Z
> Sent: Friday, December 19, 2014 9:14 AM
> To: Paolo Bonzini; Wu, Feng; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; g...@kernel.org; dw...@infradead.org;
> j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com
> Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org;
> io...@lists.linux-foundation.org; kvm@vger.kernel.org
> Subject: RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for
> VT-d PI
>
> Paolo Bonzini wrote on 2014-12-19:
> >
> > On 18/12/2014 15:49, Zhang, Yang Z wrote:
> >>>> Here, we introduce a method similar to 'apic_arb_prio' to handle
> >>>> guest lowest-priority interrupts when VT-d PI is used. Here is
> >>>> the idea:
> >>>> - Each vCPU has a counter 'round_robin_counter'.
> >>>> - When the guest sets an interrupt to lowest priority, we choose
> >>>>   the vCPU with the smallest 'round_robin_counter' as the
> >>>>   destination, then increase it.
> >>
> >> How can this work well? Are all subsequent interrupts delivered to
> >> one vCPU? It shouldn't be the best solution; it needs more
> >> consideration.
> >
> > Well, it's a hardware limitation. The alternative (which is easy to
>
> Agree, it is limited by hardware. But lowest priority distributes
> interrupts more efficiently than fixed mode, and the current
> implementation effectively switches lowest-priority mode to fixed mode.
> In an interrupt-intensive environment this may be a bottleneck, and the
> VM may not benefit greatly from VT-d PI. But agree again, it is really a
> hardware limitation.
>
> > implement) is to only do PI for single-CPU interrupts. This should
> > work well for multiqueue NICs (and of course for UP guests :)), so
> > perhaps it's a good idea to only support that as a first attempt.
>
> The easier way is to deliver the interrupt to the first matched vCPU we
> find. The round_robin_counter really doesn't help here, since the
> interrupt is delivered by hardware directly.
>
> > Paolo
> >
> >> Also, I think you should take apic_arb_prio into consideration,
> >> since the priority is for the whole vCPU, not for one interrupt.
>
> Best regards,
> Yang

In fact, the current solution was discussed with Rajesh, who is on the Cc
list. Here are Rajesh's original words:

"When you see a guest requesting a lowest priority interrupts (by
programming the virtual IOAPIC, or by programming the virtual MSI/MSI-X
registers), have KVM associate it to a vCPU. Or, put another way, use the
'apic_arb_prio' method you describe below, but instead of using it at
time of interrupt (which you no longer have control with posted interrupt
direct delivery), do it at time of initializing the interrupt resource.
This way, if the guest asks for 4 lowest priority interrupts, and say you
[have] a guest with two vCPUs, the first interrupt request will be
serviced by KVM by assigning it through posting to vCPU0, the next one
goes to vCPU1, the next one would go back to vCPU0, and so forth.. You
could also choose to do this based on vector hashing instead of
round-robin."

Thanks,
Feng
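The setup-time assignment described in the quoted words can be tried out with a small userspace sketch. It assumes a per-vCPU counter named after the 'round_robin_counter' in the patch description; the `demo_` structure, the function names, the tie-break on the lowest index, and the modulo used for the hashing variant are all assumptions made for illustration, not the kernel's kvm_find_dest_vcpu().

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_VCPUS 2

/* Hypothetical per-vCPU state: how many interrupts were bound to it. */
struct demo_vcpu {
	unsigned int round_robin_counter;
};

/*
 * Called once when the guest programs a lowest-priority interrupt, not
 * at delivery time. Picks the vCPU with the smallest counter (lowest
 * index wins ties) and bumps that counter.
 */
static size_t demo_find_dest_vcpu(struct demo_vcpu *vcpus, size_t nr)
{
	size_t i, best = 0;

	assert(vcpus != NULL && nr > 0);
	for (i = 1; i < nr; i++)
		if (vcpus[i].round_robin_counter <
		    vcpus[best].round_robin_counter)
			best = i;
	vcpus[best].round_robin_counter++;
	return best;
}

/* The alternative mentioned in the quote: hash on the guest vector. */
static size_t demo_find_dest_vcpu_hashed(unsigned int vector, size_t nr)
{
	assert(nr > 0);
	return vector % nr;
}
```

For a guest with two vCPUs, four successive calls to `demo_find_dest_vcpu()` pick vCPU0, vCPU1, vCPU0, vCPU1, matching the example in the quote. Once an interrupt is bound, every occurrence of it goes to the same vCPU, which is the limitation debated in this thread.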
RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Friday, December 19, 2014 12:58 AM
> To: Zhang, Yang Z; Wu, Feng; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; g...@kernel.org; dw...@infradead.org;
> j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com
> Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org;
> io...@lists.linux-foundation.org; kvm@vger.kernel.org
> Subject: Re: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for
> VT-d PI
>
> On 18/12/2014 15:49, Zhang, Yang Z wrote:
> >>> Here, we introduce a method similar to 'apic_arb_prio' to handle
> >>> guest lowest-priority interrupts when VT-d PI is used. Here is
> >>> the idea:
> >>> - Each vCPU has a counter 'round_robin_counter'.
> >>> - When the guest sets an interrupt to lowest priority, we choose
> >>>   the vCPU with the smallest 'round_robin_counter' as the
> >>>   destination, then increase it.
> >
> > How can this work well? Are all subsequent interrupts delivered to
> > one vCPU? It shouldn't be the best solution; it needs more
> > consideration.
>
> Well, it's a hardware limitation. The alternative (which is easy to
> implement) is to only do PI for single-CPU interrupts. This should work
> well for multiqueue NICs (and of course for UP guests :)), so perhaps
> it's a good idea to only support that as a first attempt.
>
> Paolo

Paolo, what do you mean by "single-CPU interrupts"? Do you mean we don't
support lowest priority interrupts for PI? But Linux uses lowest priority
in most cases. If so, we can hardly get any benefit from this feature for
Linux guests.

Thanks,
Feng

> > Also, I think you should take apic_arb_prio into consideration,
> > since the priority is for the whole vCPU, not for one interrupt.
RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
Paolo Bonzini wrote on 2014-12-19:
>
> On 18/12/2014 15:49, Zhang, Yang Z wrote:
>>>> Here, we introduce a method similar to 'apic_arb_prio' to handle
>>>> guest lowest-priority interrupts when VT-d PI is used. Here is
>>>> the idea:
>>>> - Each vCPU has a counter 'round_robin_counter'.
>>>> - When the guest sets an interrupt to lowest priority, we choose
>>>>   the vCPU with the smallest 'round_robin_counter' as the
>>>>   destination, then increase it.
>>
>> How can this work well? Are all subsequent interrupts delivered to
>> one vCPU? It shouldn't be the best solution; it needs more
>> consideration.
>
> Well, it's a hardware limitation. The alternative (which is easy to

Agree, it is limited by hardware. But lowest priority distributes
interrupts more efficiently than fixed mode, and the current
implementation effectively switches lowest-priority mode to fixed mode.
In an interrupt-intensive environment this may be a bottleneck, and the
VM may not benefit greatly from VT-d PI. But agree again, it is really a
hardware limitation.

> implement) is to only do PI for single-CPU interrupts. This should
> work well for multiqueue NICs (and of course for UP guests :)), so
> perhaps it's a good idea to only support that as a first attempt.

The easier way is to deliver the interrupt to the first matched vCPU we
find. The round_robin_counter really doesn't help here, since the
interrupt is delivered by hardware directly.

> Paolo
>
>> Also, I think you should take apic_arb_prio into consideration,
>> since the priority is for the whole vCPU, not for one interrupt.

Best regards,
Yang
RE: [v3 21/26] x86, irq: Define a global vector for VT-d Posted-Interrupts
> -Original Message- > From: Zhang, Yang Z > Sent: Thursday, December 18, 2014 10:55 PM > To: Wu, Feng; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; > x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; > dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; > jiang@linux.intel.com > Cc: eric.au...@linaro.org; linux-ker...@vger.kernel.org; > io...@lists.linux-foundation.org; kvm@vger.kernel.org; Wu, Feng > Subject: RE: [v3 21/26] x86, irq: Define a global vector for VT-d > Posted-Interrupts > > Feng Wu wrote on 2014-12-12: > > Currently, we use a global vector as the Posted-Interrupts > > Notification Event for all the vCPUs in the system. We need to > > introduce another global vector for VT-d Posted-Interrtups, which will > > be used to wakeup the sleep vCPU when an external interrupt from a > direct-assigned device happens for that vCPU. > > > > Hi Feng, > > Since the idea of two global vectors mechanism is from me, please add me to > the comments. No problem, Yang, I will add a "suggested-by Yang Zhang " in this patch. Thanks a lot! 
Thanks, Feng > > > Signed-off-by: Feng Wu > > --- > > arch/x86/include/asm/entry_arch.h | 2 ++ > > arch/x86/include/asm/hardirq.h | 1 + > > arch/x86/include/asm/hw_irq.h | 2 ++ > > arch/x86/include/asm/irq_vectors.h | 1 + > > arch/x86/kernel/entry_64.S | 2 ++ > > arch/x86/kernel/irq.c | 27 > +++ > > arch/x86/kernel/irqinit.c | 2 ++ > > 7 files changed, 37 insertions(+) > > diff --git a/arch/x86/include/asm/entry_arch.h > > b/arch/x86/include/asm/entry_arch.h index dc5fa66..27ca0af 100644 --- > > a/arch/x86/include/asm/entry_arch.h +++ > > b/arch/x86/include/asm/entry_arch.h @@ -23,6 +23,8 @@ > > BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR) #ifdef > > CONFIG_HAVE_KVM BUILD_INTERRUPT3(kvm_posted_intr_ipi, > POSTED_INTR_VECTOR, > > smp_kvm_posted_intr_ipi) > > +BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, > POSTED_INTR_WAKEUP_VECTOR, > > +smp_kvm_posted_intr_wakeup_ipi) > > #endif > > > > /* > > diff --git a/arch/x86/include/asm/hardirq.h > > b/arch/x86/include/asm/hardirq.h index 0f5fb6b..9866065 100644 > > --- a/arch/x86/include/asm/hardirq.h > > +++ b/arch/x86/include/asm/hardirq.h > > @@ -14,6 +14,7 @@ typedef struct { > > #endif #ifdef CONFIG_HAVE_KVM unsigned int kvm_posted_intr_ipis; > > + unsigned int kvm_posted_intr_wakeup_ipis; #endifunsigned int > > x86_platform_ipis; /* arch dependent */unsigned int apic_perf_irqs; > > diff --git a/arch/x86/include/asm/hw_irq.h > > b/arch/x86/include/asm/hw_irq.h index e7ae6eb..38fac9b 100644 > > --- a/arch/x86/include/asm/hw_irq.h > > +++ b/arch/x86/include/asm/hw_irq.h > > @@ -29,6 +29,7 @@ > > extern asmlinkage void apic_timer_interrupt(void); extern asmlinkage > > void x86_platform_ipi(void); extern asmlinkage void > > kvm_posted_intr_ipi(void); +extern asmlinkage void > > kvm_posted_intr_wakeup_ipi(void); > > extern asmlinkage void error_interrupt(void); extern asmlinkage void > > irq_work_interrupt(void); > > > > @@ -92,6 +93,7 @@ extern void > > trace_call_function_single_interrupt(void); > > #define 
trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt > > #define trace_reboot_interrupt reboot_interrupt #define > > trace_kvm_posted_intr_ipi kvm_posted_intr_ipi > > +#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi > > #endif /* CONFIG_TRACING */ > > > > struct irq_domain; > > diff --git a/arch/x86/include/asm/irq_vectors.h > > b/arch/x86/include/asm/irq_vectors.h index b26cb12..dca94f2 100644 --- > > a/arch/x86/include/asm/irq_vectors.h +++ > > b/arch/x86/include/asm/irq_vectors.h @@ -105,6 +105,7 @@ > > /* Vector for KVM to deliver posted interrupt IPI */ #ifdef > > CONFIG_HAVE_KVM #define POSTED_INTR_VECTOR 0xf2 +#define > > POSTED_INTR_WAKEUP_VECTOR 0xf1 #endif > > > > /* > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > > index e61c14a..a598447 100644 --- a/arch/x86/kernel/entry_64.S +++ > > b/arch/x86/kernel/entry_64.S @@ -960,6 +960,8 @@ apicinterrupt > > X86_PLATFORM_IPI_VECTOR \ #ifdef CONFIG_HAVE_KVM > > apicinterrupt3 POSTED_INTR_VECTOR \ > > kvm_posted_intr_ipi smp_kvm_posted_intr_ipi > > +apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \ > > + kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi > > #endif > > > > #ifdef CONFIG_X86_MCE_THRESHOLD > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index > > 922d285..47408c3 100644 > > --- a/arch/x86/kernel/irq.c > > +++ b/arch/x86/kernel/irq.c > > @@ -237,6 +237,9 @@ __visible void smp_x86_platform_ipi(struct pt_regs > > *regs) } > > > > #ifdef CONFIG_HAVE_KVM > > +void (*wakeup_handler_callback)(void) = NULL; > > +EXPORT_SYMBOL_GPL(wakeup_handler_callback); + > > /* > > * Handler for POSTED_INTERRUPT_VECTOR. > > */ > > @@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(s
Re: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
On 18/12/2014 15:49, Zhang, Yang Z wrote:
>>> Here, we introduce a method similar to 'apic_arb_prio' to handle
>>> guest lowest-priority interrupts when VT-d PI is used. Here is
>>> the idea:
>>> - Each vCPU has a counter 'round_robin_counter'.
>>> - When the guest sets an interrupt to lowest priority, we choose
>>>   the vCPU with the smallest 'round_robin_counter' as the
>>>   destination, then increase it.
>
> How can this work well? Are all subsequent interrupts delivered to
> one vCPU? It shouldn't be the best solution; it needs more
> consideration.

Well, it's a hardware limitation. The alternative (which is easy to
implement) is to only do PI for single-CPU interrupts. This should work
well for multiqueue NICs (and of course for UP guests :)), so perhaps
it's a good idea to only support that as a first attempt.

Paolo

> Also, I think you should take apic_arb_prio into consideration,
> since the priority is for the whole vCPU, not for one interrupt.
RE: [v3 12/26] KVM: Initialize VT-d Posted-Interrupts Descriptor
Feng Wu wrote on 2014-12-12:
> This patch initializes the VT-d Posted-Interrupts Descriptor.
>
> Signed-off-by: Feng Wu
> ---
>  arch/x86/kvm/vmx.c | 27 +++
>  1 file changed, 27 insertions(+)
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 0b1383e..66ca275 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -45,6 +45,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "trace.h"
> @@ -4433,6 +4434,30 @@ static void ept_set_mmio_spte_mask(void)
>  	kvm_mmu_set_mmio_spte_mask((0x3ull << 62) | 0x6ull);
>  }
> +static void pi_desc_init(struct vcpu_vmx *vmx)
> +{
> +	unsigned int dest;
> +
> +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> +		return;
> +
> +	/*
> +	 * Initialize Posted-Interrupt Descriptor
> +	 */
> +
> +	pi_clear_sn(&vmx->pi_desc);
> +	vmx->pi_desc.nv = POSTED_INTR_VECTOR;

Here.

> +
> +	/* Physical mode for Notification Event */
> +	vmx->pi_desc.ndm = 0;

And from here..

> +	dest = cpu_physical_id(vmx->vcpu.cpu);
> +
> +	if (x2apic_enabled())
> +		vmx->pi_desc.ndst = dest;
> +	else
> +		vmx->pi_desc.ndst = (dest << 8) & 0xFF00;
> +}
> +

..to here are useless. The right place to update PI descriptor is where
vcpu got loaded not in initialization.

>  /*
>   * Sets up the vmcs for emulated real mode.
>   */
> @@ -4476,6 +4501,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
>
>  		vmcs_write64(POSTED_INTR_NV, POSTED_INTR_VECTOR);
>  		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
> +
> +		pi_desc_init(vmx);
>  	}
>
>  	if (ple_gap) {

Best regards,
Yang
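For readers following the review comment, the NDST encoding in the quoted hunk can be looked at in isolation. The following is only a minimal user-space sketch, not the code that was merged: pi_encode_ndst() is a hypothetical helper name, and the only thing taken from the patch is the rule that the destination is stored as-is with x2APIC enabled and shifted into bits 15:8 otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): compute the value the
 * quoted patch stores in pi_desc.ndst for a given physical APIC ID.
 */
static uint32_t pi_encode_ndst(uint32_t dest, bool x2apic)
{
	if (x2apic)
		return dest;              /* x2APIC: destination used unchanged */
	return (dest << 8) & 0xFF00;      /* xAPIC: 8-bit APIC ID in bits 15:8 */
}
```

If the descriptor is updated where the vCPU is loaded, as suggested above, a helper of this shape would presumably be called from that path with the physical ID of the CPU being loaded onto.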
RE: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
Paolo Bonzini wrote on 2014-12-18:
>
>
> On 18/12/2014 04:14, Wu, Feng wrote:
>>
>>
>> linux-kernel-ow...@vger.kernel.org wrote on
>> mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Paolo:
>>> x...@kernel.org; Gleb Natapov; Paolo Bonzini; dw...@infradead.org;
>>> joro-zlv9swrftaidnm+yrof...@public.gmane.org; Alex Williamson;
>>> joro-zLv9SwRftAIdnm+Jiang
>>> Liu
>>> Cc: io...@lists.linux-foundation.org;
>>> linux-kernel-u79uwxl29ty76z2rm5m...@public.gmane.org; KVM list;
>>> Eric Auger
>>> Subject: Re: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is
>>> set
>>>
>>>
>>>
>>> On 12/12/2014 16:14, Feng Wu wrote:
>>>> Currently, we don't support urgent interrupt, all interrupts
>>>> are recognized as non-urgent interrupt, so we cannot send
>>>> posted-interrupt when 'SN' is set.
>>>
>>> Can this happen? If the vcpu is in guest mode, it cannot have been
>>> scheduled out, and that's the only case when SN is set.
>>>
>>> Paolo
>>
>> Currently, the only place where SN is set is vCPU is preempted and

If the vCPU is preempted, shouldn't the subsequent be ignored? What
happens if a PI occurs when vCPU is preempted?

>> waiting for the next scheduling in the runqueue. But I am not sure
>> whether we need to set SN for other purpose in future. Adding SN
>> checking here is just to follow the Spec. non-urgent interrupts are
>> suppressed when SN is set.
>
> I would change that to a WARN_ON_ONCE then.

Best regards,
Yang
RE: [v3 21/26] x86, irq: Define a global vector for VT-d Posted-Interrupts
Feng Wu wrote on 2014-12-12: > Currently, we use a global vector as the Posted-Interrupts > Notification Event for all the vCPUs in the system. We need to > introduce another global vector for VT-d Posted-Interrtups, which will > be used to wakeup the sleep vCPU when an external interrupt from a > direct-assigned device happens for that vCPU. > Hi Feng, Since the idea of two global vectors mechanism is from me, please add me to the comments. > Signed-off-by: Feng Wu > --- > arch/x86/include/asm/entry_arch.h | 2 ++ > arch/x86/include/asm/hardirq.h | 1 + > arch/x86/include/asm/hw_irq.h | 2 ++ > arch/x86/include/asm/irq_vectors.h | 1 + > arch/x86/kernel/entry_64.S | 2 ++ > arch/x86/kernel/irq.c | 27 +++ > arch/x86/kernel/irqinit.c | 2 ++ > 7 files changed, 37 insertions(+) > diff --git a/arch/x86/include/asm/entry_arch.h > b/arch/x86/include/asm/entry_arch.h index dc5fa66..27ca0af 100644 --- > a/arch/x86/include/asm/entry_arch.h +++ > b/arch/x86/include/asm/entry_arch.h @@ -23,6 +23,8 @@ > BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR) #ifdef > CONFIG_HAVE_KVM BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR, >smp_kvm_posted_intr_ipi) > +BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR, > + smp_kvm_posted_intr_wakeup_ipi) > #endif > > /* > diff --git a/arch/x86/include/asm/hardirq.h > b/arch/x86/include/asm/hardirq.h index 0f5fb6b..9866065 100644 > --- a/arch/x86/include/asm/hardirq.h > +++ b/arch/x86/include/asm/hardirq.h > @@ -14,6 +14,7 @@ typedef struct { > #endif #ifdef CONFIG_HAVE_KVMunsigned int kvm_posted_intr_ipis; > +unsigned int kvm_posted_intr_wakeup_ipis; #endifunsigned int > x86_platform_ipis; /* arch dependent */unsigned int apic_perf_irqs; > diff --git a/arch/x86/include/asm/hw_irq.h > b/arch/x86/include/asm/hw_irq.h index e7ae6eb..38fac9b 100644 > --- a/arch/x86/include/asm/hw_irq.h > +++ b/arch/x86/include/asm/hw_irq.h > @@ -29,6 +29,7 @@ > extern asmlinkage void apic_timer_interrupt(void); extern asmlinkage 
> void x86_platform_ipi(void); extern asmlinkage void > kvm_posted_intr_ipi(void); +extern asmlinkage void > kvm_posted_intr_wakeup_ipi(void); > extern asmlinkage void error_interrupt(void); extern asmlinkage void > irq_work_interrupt(void); > > @@ -92,6 +93,7 @@ extern void > trace_call_function_single_interrupt(void); > #define trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt > #define trace_reboot_interrupt reboot_interrupt #define > trace_kvm_posted_intr_ipi kvm_posted_intr_ipi > +#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi > #endif /* CONFIG_TRACING */ > > struct irq_domain; > diff --git a/arch/x86/include/asm/irq_vectors.h > b/arch/x86/include/asm/irq_vectors.h index b26cb12..dca94f2 100644 --- > a/arch/x86/include/asm/irq_vectors.h +++ > b/arch/x86/include/asm/irq_vectors.h @@ -105,6 +105,7 @@ > /* Vector for KVM to deliver posted interrupt IPI */ #ifdef > CONFIG_HAVE_KVM #define POSTED_INTR_VECTOR 0xf2 +#define > POSTED_INTR_WAKEUP_VECTOR0xf1 #endif > > /* > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index e61c14a..a598447 100644 --- a/arch/x86/kernel/entry_64.S +++ > b/arch/x86/kernel/entry_64.S @@ -960,6 +960,8 @@ apicinterrupt > X86_PLATFORM_IPI_VECTOR \ #ifdef CONFIG_HAVE_KVM > apicinterrupt3 POSTED_INTR_VECTOR \ > kvm_posted_intr_ipi smp_kvm_posted_intr_ipi > +apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \ > + kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi > #endif > > #ifdef CONFIG_X86_MCE_THRESHOLD > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index > 922d285..47408c3 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -237,6 +237,9 @@ __visible void smp_x86_platform_ipi(struct pt_regs > *regs) } > > #ifdef CONFIG_HAVE_KVM > +void (*wakeup_handler_callback)(void) = NULL; > +EXPORT_SYMBOL_GPL(wakeup_handler_callback); + > /* > * Handler for POSTED_INTERRUPT_VECTOR. 
> */ > @@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(struct > pt_regs *regs) > > set_irq_regs(old_regs); > } > + > +/* > + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR. > + */ > +__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs) { > + struct pt_regs *old_regs = set_irq_regs(regs); > + > + ack_APIC_irq(); > + > + irq_enter(); > + > + exit_idle(); > + > + inc_irq_stat(kvm_posted_intr_wakeup_ipis); > + > + if (wakeup_handler_callback) > + wakeup_handler_callback(); > + > + irq_exit(); > + > + set_irq_regs(old_regs); > +} > + > #endif > > __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs) diff > --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c index > 70e181e..844673c 100644 --- a/arch/x86/kernel/irqinit.c +++ > b/arch/x86/kernel/irqinit.c @@ -144,6 +144,8
RE: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
Feng Wu wrote on 2014-12-12:
> This patch defines a new interface kvm_find_dest_vcpu for
> VT-d PI, which can return the destination vCPU of the
> interrupt for guests.
>
> Since VT-d PI cannot handle broadcast/multicast interrupt,
> Here we only handle Fixed and Lowest priority interrupts.
>
> The current method of handling guest lowest priority interrupts
> is to use a counter 'apic_arb_prio' for each vCPU, we choose the
> vCPU with smallest 'apic_arb_prio' and then increase it by 1.
> However, for VT-d PI, we cannot re-use this, since we no longer
> have control to 'apic_arb_prio' with posted interrupt direct
> delivery by Hardware.
>
> Here, we introduce a similar way with 'apic_arb_prio' to handle guest
> lowest priority interrupts when VT-d PI is used. Here are the ideas:
> - Each vCPU has a counter 'round_robin_counter'.
> - When guests sets an interrupts to lowest priority, we choose the
>   vCPU with smallest 'round_robin_counter' as the destination, then
>   increase it.

How this can work well? All subsequent interrupts are delivered to one
vCPU? It shouldn't be the best solution, need more consideration. Also,
I think you should take the apic_arb_prio into consideration since the
priority is for the whole vCPU not for one interrupt.

Best regards,
Yang
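The selection rule described in the quoted commit message can be sketched as follows. This is a simplified user-space model under stated assumptions, not the kvm_find_dest_vcpu() from the patch: struct vcpu_model and pick_lowest_priority_dest() are hypothetical names, and breaking ties toward the lowest index is an assumption the commit message does not spell out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU state mentioned in the patch. */
struct vcpu_model {
	unsigned int round_robin_counter;
};

/*
 * Pick the vCPU with the smallest round_robin_counter, then bump its
 * counter. Ties go to the lowest index (an assumption). n must be >= 1.
 */
static size_t pick_lowest_priority_dest(struct vcpu_model *vcpus, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (vcpus[i].round_robin_counter < vcpus[best].round_robin_counter)
			best = i;

	vcpus[best].round_robin_counter++;
	return best;
}
```

Note that the counter only advances when a destination is chosen. Once that destination is in place, every later occurrence of the same interrupt goes to the same vCPU, which appears to be the behavior questioned in the reply above.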
RE: [v3 06/26] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
Feng Wu wrote on 2014-12-12:
> We don't need to migrate the irqs for VT-d Posted-Interrupts here.
> When 'pst' is set in IRTE, the associated irq will be posted to guests
> instead of interrupt remapping. The destination of the interrupt is
> set in Posted-Interrupts Descriptor, and the migration happens during
> vCPU scheduling.
>
> However, we still update the cached irte here, which can be used when
> changing back to remapping mode.
>
> Signed-off-by: Feng Wu
> Reviewed-by: Jiang Liu
> ---
>  drivers/iommu/intel_irq_remapping.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> index 48c2051..ab9057a 100644
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -977,6 +977,7 @@ intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  {
>  	struct intel_ir_data *ir_data = data->chip_data;
>  	struct irte *irte = &ir_data->irte_entry;
> +	struct irte_pi *irte_pi = (struct irte_pi *)irte;
>  	struct irq_cfg *cfg = irqd_cfg(data);
>  	struct irq_data *parent = data->parent_data;
>  	int ret;
> @@ -991,7 +992,10 @@ intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  	 */
>  	irte->vector = cfg->vector;
>  	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
> -	modify_irte(&ir_data->irq_2_iommu, irte);
> +
> +	/* We don't need to modify irte if the interrupt is for posting. */
> +	if (irte_pi->pst != 1)
> +		modify_irte(&ir_data->irq_2_iommu, irte);

What happens if user changes the IRQ affinity manually?

>
>  	/*
>  	 * After this point, all the interrupts will start arriving

Best regards,
Yang
Re: [patch 2/3] KVM: x86: add option to advance tscdeadline hrtimer expiration
On Wed, Dec 17, 2014 at 08:36:27PM +0100, Radim Krcmar wrote:
> 2014-12-17 15:41-0200, Marcelo Tosatti:
> > On Wed, Dec 17, 2014 at 03:58:13PM +0100, Radim Krcmar wrote:
> > > 2014-12-16 09:08-0500, Marcelo Tosatti:
> > > > +	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
> > > > +	apic->lapic_timer.expired_tscdeadline = 0;
> > > > +	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
> > > > +
> > > > +	while (guest_tsc < tsc_deadline) {
> > > > +		int delay = min(tsc_deadline - guest_tsc, 1000ULL);
> > >
> > > Why break the __delay() loop into smaller parts?
> >
> > So that you can handle interrupts, in case this code ever moves
> > outside IRQ protected region.
>
> __delay() works only if it is delay_tsc(), which has this handled ...
> (It even considers rescheduling with unsynchronized TSC.)
>
> delay_tsc(delay) translates roughly to
>
>   end = read_tsc() + delay;
>   while (read_tsc() < end);
>
> so the code of our while loop has a structure like
>
>   while ((guest_tsc = read_tsc()) < tsc_deadline) {
>     end = read_tsc() + min(tsc_deadline - guest_tsc, 1000);
>     while (read_tsc() < end);
>   }
>
> which complicates our original idea of
>
>   while (read_tsc() < tsc_deadline);
>
> (but I'm completely fine with it.)

True. I can change to a direct wait if that is preferred.

> > > > +		__delay(delay);
> > >
> > > (Does not have to call delay_tsc, but I guess it won't change.)
> > >
> > > > +		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
> > > > +	}
> > > >  }
> > > >
> > >
> > > Btw. simple automatic delta tuning had worse results?
> >
> > Haven't tried automatic tuning.
> >
> > So what happens on a realtime environment is this: you execute the fixed
> > number of instructions from interrupt handling all the way to VM-entry.
> >
> > Well, almost fixed. Fixed is the number of apic_timer_fn plus KVM
> > instructions. You can also execute host scheduler and timekeeping
> > processing.
> >
> > In practice, the length to execute that instruction sequence is a bell
> > shaped normal distribution around the average (the right side is
> > slightly higher due to host scheduler and timekeeping processing).
> >
> > You want to advance the timer by the rightmost bucket, that way you
> > guarantee lower possible latencies (which is the interest here).
>
> (Lower latencies would likely be achieved by having a timer that issues
> posted interrupts from another CPU, and the guest set to busy idle.)

Yes.

> > That said, i don't see advantage in automatic tuning for the usecase
> > which this targets.
>
> Thanks, it doesn't make much difference in the long RT setup checklist.

Exactly.

> ---
> I was asking just because I consider programming to equal automation ...
> If we know that we will always set this to the rightmost bucket anyway,
> it could be done like this
>
>   if ((s64)(delta = guest_tsc - tsc_deadline) > 0)
>     tsc_deadline_delta += delta;
>   ...
>   advance_ns = kvm_tsc_to_ns(tsc_deadline_delta);
>
> instead of a script that runs a test and sets the variable.
> (On the other hand, it would probably have to be more complicated to
> reach the same level of flexibility.)

You'd have to guarantee the vcpus are never interrupted by other work,
such as processing host interrupts, otherwise you could get high
increments for tsc_deadline_delta.

So to tune that value you do:

1) Boot guest.
2) Setup certain vCPUs as realtime (large checklist), which includes
   pinning and host interrupt routing.
3) Measure with cyclictest on those vCPUs with the realtime conditions.

So it's also a matter of configuration. But yes the code above would set
advance_ns to the rightmost bucket.
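The two wait strategies compared in this exchange can be modeled outside the kernel. What follows is only a sketch under loud assumptions: read_tsc() is a fake counter that advances 7 ticks per read (standing in for the guest TSC read), delay_tsc() follows the rough translation given in the quoted mail rather than the real kernel routine, and wait_chunked(), wait_direct(), and MAX_CHUNK are hypothetical names.

```c
#include <stdint.h>

/* Fake TSC for illustration: every read advances the counter by 7 ticks. */
static uint64_t fake_tsc;

static uint64_t read_tsc(void)
{
	fake_tsc += 7;
	return fake_tsc;
}

/* Rough model of delay_tsc(), as paraphrased in the thread. */
static void delay_tsc(uint64_t delay)
{
	uint64_t end = read_tsc() + delay;

	while (read_tsc() < end)
		;
}

#define MAX_CHUNK 1000ULL	/* chunk size used in the quoted patch */

/* Chunked wait: spin in slices of at most MAX_CHUNK ticks. */
static uint64_t wait_chunked(uint64_t tsc_deadline)
{
	uint64_t guest_tsc = read_tsc();

	while (guest_tsc < tsc_deadline) {
		uint64_t remaining = tsc_deadline - guest_tsc;

		delay_tsc(remaining < MAX_CHUNK ? remaining : MAX_CHUNK);
		guest_tsc = read_tsc();
	}
	return guest_tsc;
}

/* Direct wait: the simpler loop offered as the alternative. */
static uint64_t wait_direct(uint64_t tsc_deadline)
{
	uint64_t guest_tsc;

	while ((guest_tsc = read_tsc()) < tsc_deadline)
		;
	return guest_tsc;
}
```

Both functions return only once the observed counter has reached the deadline. The chunked form simply gives the outer loop a chance to run between slices, which is the property discussed above.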
[PULL] vhost: cleanups and fixes
The following changes since commit f01a2a811ae04124fc9382925038fcbbd2f0b7c8: virtio_ccw: finalize_features error handling (2014-12-09 21:42:06 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus for you to fetch changes up to 5ff16110c637726111662c1df41afd9df7ef36bd: virtio_pci: restore module attributes (2014-12-17 00:59:40 +0200) vhost/virtio: virtio 1.0 related fixes Most importantly, this fixes using virtio_pci as a module. Further, the big virtio 1.0 conversion missed a couple of places. This fixes them up. This isn't 100% sparse-clean yet because on many architectures get_user triggers sparse warnings when used with __bitwise tag (when same tag is on both pointer and value read). I posted a patchset to fix it up by adding __force on all arches that don't already have it (many do), when that's merged these warnings will go away. Cc: Rusty Russell Signed-off-by: Michael S. Tsirkin Herbert Xu (1): virtio_pci: restore module attributes Michael S. 
Tsirkin (15): virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore virtio_config: fix virtio_cread_bytes virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY virtio_pci: move probe to common file virtio_pci: add VIRTIO_PCI_NO_LEGACY virtio: core support for config generation tools/virtio: more stubs tools/virtio: fix vringh test tools/virtio: 64 bit features tools/virtio: enable -Werror tools/virtio: add virtio 1.0 in virtio_test tools/virtio: add virtio 1.0 in vringh_test vringh: 64 bit features vringh: update for virtio 1.0 APIs mic/host: fix up virtio 1.0 APIs drivers/virtio/virtio_pci_common.h | 7 +- include/linux/virtio_config.h | 29 +++- include/linux/vringh.h | 37 +- include/uapi/linux/virtio_pci.h| 15 ++-- tools/virtio/linux/virtio.h| 1 + tools/virtio/linux/virtio_byteorder.h | 8 +++ tools/virtio/linux/virtio_config.h | 70 +- tools/virtio/uapi/linux/virtio_types.h | 1 + drivers/misc/mic/host/mic_debugfs.c| 18 +++-- drivers/vhost/vringh.c | 125 - drivers/virtio/virtio.c| 37 ++ drivers/virtio/virtio_pci_common.c | 39 +- drivers/virtio/virtio_pci_legacy.c | 24 +-- tools/virtio/virtio_test.c | 15 +++- tools/virtio/vringh_test.c | 5 +- tools/virtio/Makefile | 2 +- 16 files changed, 324 insertions(+), 109 deletions(-) create mode 100644 tools/virtio/linux/virtio_byteorder.h create mode 100644 tools/virtio/uapi/linux/virtio_types.h -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PULL 00/18] ppc patch queue 2014-12-18
On 18/12/2014 01:46, Alexander Graf wrote:
> Hi Paolo,
>
> This is my current patch queue for ppc. Please pull.
>
> After the merge with Linus' tree, e500v2 compilation will be broken because
> commit 69111bac42f5 broke it upstream. Could you please take care to apply the
> fix I CC'ed you on for it?

Pulled. Thanks!

Paolo "don't panic"
Re: [PATCH] KVM: add KVM_CAP_VMX_APICV to advertise hardware apic-v support
On 18/12/2014 09:57, Stefan Fritsch wrote:
> I think setting the "recommended" bits to be best for the host where the
> VM is first started should always be ok. When it migrates to a different
> host, performance may not be optimal but otherwise it will work just fine.
> After all, non-optimal performance is not different from the current
> situation.

What if, due to a bug, a guest mysteriously starts failing on some hosts
of a cluster but not others? This is really not a path you want to walk
down.

Paolo
[GIT PULL] KVM changes for 3.19
Linus, The following changes since commit 0df1f2487d2f0d04703f142813d53615d62a1da4: Linux 3.18-rc3 (2014-11-02 15:01:51 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus for you to fetch changes up to 2c4aa55a6af070262cca425745e8e54310e96b8d: Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into HEAD (2014-12-18 09:39:55 +0100) IA64 removal will cause a trivial merge conflict. 3.19 changes for KVM: - spring cleaning: removed support for IA64, and for hardware-assisted virtualization on the PPC970 - ARM, PPC, s390 all had only small fixes For x86: - small performance improvements (though only on weird guests) - usual round of hardware-compliancy fixes from Nadav - APICv fixes - XSAVES support for hosts and guests. XSAVES hosts were broken because the (non-KVM) XSAVES patches inadvertently changed the KVM userspace ABI whenever XSAVES was enabled; hence, this part is going to stable. Guest support is just a matter of exposing the feature and CPUID leaves support. Right now KVM is broken for PPC BookE in your tree (doesn't compile). I'll reply to the pull request with a patch, please apply it either before the pull request or in the merge commit, in order to preserve bisectability somewhat. 
Alexander Graf (1): KVM: PPC: BookE: Improve irq inject tracepoint Andre Przywara (1): arm/arm64: KVM: avoid unnecessary guest register mangling on MMIO read Andy Lutomirski (4): x86,kvm,vmx: Don't trap writes to CR4.TSD x86, kvm, vmx: Always use LOAD_IA32_EFER if available x86, kvm, vmx: Don't set LOAD_IA32_EFER when host and guest match x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit Aneesh Kumar K.V (1): KVM: PPC: Book3S HV: Add missing HPTE unlock Anton Blanchard (1): KVM: PPC: Book3S: Enable in-kernel XICS emulation by default Ard Biesheuvel (4): arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn() kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn() kvm: add a memslot flag for incoherent memory regions arm, arm64: KVM: handle potential incoherency of readonly memslots Bandan Das (1): KVM: nVMX: Disable unrestricted mode if ept=0 Chao Peng (1): KVM: x86: Enable Intel AVX-512 for guest Chris J Arges (1): kvm: svm: move WARN_ON in svm_adjust_tsc_offset Christian Borntraeger (4): KVM: s390: Fix ipte locking KVM: s390: flush CPU on load control KVM: s390: trigger the right CPU exit for floating interrupts KVM: track pid for VCPU only on KVM_RUN ioctl Christoffer Dall (12): arm/arm64: vgic: Remove unreachable irq_clear_pending arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABI arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot arm/arm64: KVM: Introduce stage2_unmap_vm arm/arm64: KVM: Rename vgic_initialized to vgic_ready arm/arm64: KVM: Add (new) vgic_initialized macro arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized arm/arm64: KVM: Initialize the vgic on-demand when injecting IRQs arm/arm64: KVM: Require in-kernel vgic for the arch timers Cédric Le Goater (1): KVM: PPC: Book3S HV: ptes are big endian David Hildenbrand (9): KVM: s390: 
sigp: dispatch orders with one target in a separate function KVM: s390: sigp: move target cpu checks into dispatcher KVM: s390: sigp: separate preparation handlers KVM: s390: sigp: instruction counters for all sigp orders KVM: s390: sigp: inject emergency calls in a separate function KVM: s390: sigp: split handling of SIGP STOP (AND STORE STATUS) KVM: s390: external param not valid for cpu timer and ckc KVM: don't check for PF_VCPU when yielding KVM: s390: some ext irqs have to clear the ext cpu addr David Matlack (1): kvm: x86: add trace event for pvclock updates Dominik Dingel (2): KVM: trivial fix comment regarding __kvm_set_memory_region KVM: fix vm device attribute documentation Heiko Carstens (1): KVM: s390: fix handling of lctl[g]/stctl[g] Igor Mammedov (7): kvm: x86: increase user memory slots to 509 kvm: memslots: replace heap sort with an insertion sort pass kvm: update_memslots: drop not needed check for the same number of pages kvm: update_memslots: drop not needed check for the same slot kvm: search_memslots: add simple LRU memslot caching kvm: change memslot sorting rule from size to GFN kvm: optimize GFN to memslot lookup with large slots amount Jan Kiszka (1): KVM: nVMX: Disable preemption while r
[PATCH] KVM: PPC: E500: Compile fix in this_cpu_write
From: Alexander Graf

Commit 69111bac42f5 (powerpc: Replace __get_cpu_var uses) introduced
compile breakage to the e500 target by introducing invalid automatically
created C syntax.

Fix up the breakage and make the code compile again.

Signed-off-by: Alexander Graf
Signed-off-by: Paolo Bonzini
---
 arch/powerpc/kvm/e500.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index c09550c96357..b29ce752c7d6 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -78,7 +78,7 @@ static inline int local_sid_setup_one(struct id *entry)
 
 	sid = __this_cpu_inc_return(pcpu_last_used_sid);
 	if (sid < NUM_TIDS) {
-		__this_cpu_write(pcpu_sids)entry[sid], entry);
+		__this_cpu_write(pcpu_sids.entry[sid], entry);
 		entry->val = sid;
 		entry->pentry = this_cpu_ptr(&pcpu_sids.entry[sid]);
 		ret = sid;
-- 
1.8.3.1
Re: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
On 18/12/2014 04:14, Wu, Feng wrote:
>
>
>> -Original Message-
>> From: linux-kernel-ow...@vger.kernel.org
>> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Paolo Bonzini
>> Sent: Thursday, December 18, 2014 1:43 AM
>> To: Wu, Feng; Thomas Gleixner; Ingo Molnar; H. Peter Anvin; x...@kernel.org;
>> Gleb Natapov; Paolo Bonzini; dw...@infradead.org;
>> joro-zlv9swrftaidnm+yrof...@public.gmane.org; Alex
>> Williamson; Jiang Liu
>> Cc: io...@lists.linux-foundation.org;
>> linux-kernel-u79uwxl29ty76z2rm5m...@public.gmane.org; KVM list;
>> Eric Auger
>> Subject: Re: [v3 25/26] KVM: Suppress posted-interrupt when 'SN' is set
>>
>>
>>
>> On 12/12/2014 16:14, Feng Wu wrote:
>>> Currently, we don't support urgent interrupt, all interrupts
>>> are recognized as non-urgent interrupt, so we cannot send
>>> posted-interrupt when 'SN' is set.
>>
>> Can this happen? If the vcpu is in guest mode, it cannot have been
>> scheduled out, and that's the only case when SN is set.
>>
>> Paolo
>
> Currently, the only place where SN is set is vCPU is preempted and waiting for
> the next scheduling in the runqueue. But I am not sure whether we need to
> set SN for other purpose in future. Adding SN checking here is just to follow
> the Spec. non-urgent interrupts are suppressed when SN is set.

I would change that to a WARN_ON_ONCE then.

Paolo

> Thanks,
> Feng
>
>>
>>> Signed-off-by: Feng Wu
>>
>>> ---
>>>  arch/x86/kvm/vmx.c | 11 +--
>>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index a1c83a2..0aee151 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -4401,15 +4401,22 @@ static int vmx_vm_has_apicv(struct kvm *kvm)
>>>  static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>>>  {
>>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> -	int r;
>>> +	int r, sn;
>>>
>>>  	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>>>  		return;
>>>
>>> +	/*
>>> +	 * Currently, we don't support urgent interrupt, all interrupts
>>> +	 * are recognized as non-urgent interrupt, so we cannot send
>>> +	 * posted-interrupt when 'SN' is set.
>>> +	 */
>>> +	sn = pi_test_sn(&vmx->pi_desc);
>>> +
>>>  	r = pi_test_and_set_on(&vmx->pi_desc);
>>>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>  #ifdef CONFIG_SMP
>>> -	if (!r && (vcpu->mode == IN_GUEST_MODE))
>>> +	if (!r && !sn && (vcpu->mode == IN_GUEST_MODE))
>>>  		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>>>  				POSTED_INTR_VECTOR);
>>>  	else
>>> --
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
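The condition changed by the quoted hunk can be read as a small truth function. The sketch below is a user-space model only: should_send_pi_ipi() and enum vcpu_mode_model are hypothetical names, and the fallback taken when the condition is false is cut off in the quoted patch, so it is not modeled here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vcpu->mode values used in the patch. */
enum vcpu_mode_model { MODEL_OUTSIDE_GUEST_MODE, MODEL_IN_GUEST_MODE };

/*
 * Mirror of the patched condition: send the notification IPI only if
 * ON was not already set, SN is clear, and the vCPU is in guest mode.
 */
static bool should_send_pi_ipi(bool on_was_set, bool sn,
			       enum vcpu_mode_model mode)
{
	return !on_was_set && !sn && mode == MODEL_IN_GUEST_MODE;
}
```

Following the argument in the reply above, sn set together with guest mode is not expected to occur, which is why a WARN_ON_ONCE is proposed in place of silently folding sn into the condition.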
Re: [PATCH] KVM: add KVM_CAP_VMX_APICV to advertise hardware apic-v support
On Thu, 18 Dec 2014, Paolo Bonzini wrote:
> On 18/12/2014 04:37, zhanghailiang wrote:
> >> set this bit. But the HV_X64_APIC_ACCESS_RECOMMENDED bit should
> >> probably not be set if the host supports apic-v. I haven't done any
> >
> > That is what this patch want to do ;)
>
> This would cause a guest ABI change upon migration from APICv to
> non-APICv hosts, so it is not possible. Perhaps we need to set bit 8
> but not bit 3.

I think setting the "recommended" bits to be best for the host where the
VM is first started should always be ok. When it migrates to a different
host, performance may not be optimal but otherwise it will work just fine.
After all, non-optimal performance is not different from the current
situation.

It's important not to change the "supported" bits dynamically, though.
Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
On 18/12/2014 04:16, Wu, Feng wrote:
>>> pre-block:
>>> - Add the vCPU to the blocked per-CPU list
>>> - Clear 'SN'
>>
>> Should SN be already clear (and NV set to POSTED_INTR_VECTOR)?
>
> I think the SN bit should be clear here, Adding it here is just to make sure
> SN is clear when vCPU is blocked, so it can receive wakeup notification event
> later.

Then, please, WARN if the SN bit is set inside the if (vcpu->blocked).
Inside that if you can just add the vCPU to the blocked list on vcpu_put.

>> Can it
>> happen that you go from sched-out to blocked without doing a sched-in first?
>>
>
> I cannot imagine this scenario, can you please be more specific? Thanks a lot!

I cannot either. :) But it would be the case where SN is not cleared.
So we agree that it cannot happen.

>> In fact, if this is possible, what happens if vcpu->preempted &&
>> vcpu->blocked?
>
> In fact, vcpu->preempted && vcpu->blocked happens sometimes, but I think
> there is no issues. Please refer to the following case:

I agree that there should be no issues. But if it can happen, it's better:

1) to separate the handling of preemption and blocking: preemption
   handles SN/NV/NDST, blocking handles the wakeup list.

2) to change this

+	} else if (vcpu->blocked) {
+		/*
+		 * The vcpu is blocked on the wait queue.
+		 * Store the blocked vCPU on the list of the
+		 * vcpu->wakeup_cpu, which is the destination
+		 * of the wake-up notification event.

   to just

	}
	if (vcpu->blocked) {
		...
	}

> kvm_vcpu_block()
> 	-> vcpu->blocked = true;
> 	-> prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
>
> 	before schedule() is called, this vcpu is woken up by another guy, so
> 	the state of the vcpu associated thread is changed to TASK_RUNNING,
> 	then preemption happens after interrupts or the following schedule() is
> 	hit, this will call kvm_sched_out(), in which current->state == TASK_RUNNING
> 	and vcpu->preempted is set to true. So now vcpu->preempted and vcpu->blocked
> 	are both true. In vmx_vcpu_put(), we will check vcpu->preempted first, so
> 	the vCPU will not be blocked, and the vcpu->blocked will be set the false in
> 	vmx_vcpu_load().
>
> 	But maybe I need do a little change to the vmx_vcpu_load() like below:
>
> 		/*
> 		 * Delete the vCPU from the related wakeup queue
> 		 * if we are resuming from blocked state
> 		 */
> 		if (vcpu->blocked) {
> 			vcpu->blocked = false;
> +			/* if wakeup_cpu == -1, the vcpu is currently not blocked on any
> +			   pCPU, don't need dequeue here */
> +			if (vcpu->wakeup_cpu != -1) {
> 				spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
> 						vcpu->wakeup_cpu), flags);
> 				list_del(&vcpu->blocked_vcpu_list);
> 				spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock,
> 						vcpu->wakeup_cpu), flags);
> 				vcpu->wakeup_cpu = -1;
> +			}
> 		}

Good idea.

Paolo

> Any ideas about this? Thanks a lot!
>
> Thanks,
> Feng
>
>
> 	-> schedule();
>
>
>>
>>> - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
>>>
>>> post-block:
>>> - Remove the vCPU from the per-CPU list
>>
>> Paolo
>>
>>> Signed-off-by: Feng Wu
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [v2 PATCH] KVM: nVMX: consult PFEC_MASK and PFEC_MATCH when generating #PF VM-exit
On 16/12/2014 20:35, Eugene Korenevsky wrote:
> When generating #PF VM-exit, check equality:
> (PFEC & PFEC_MASK) == PFEC_MATCH
> If there is equality, the 14 bit of exception bitmap is used to take decision
> about generating #PF VM-exit. If there is inequality, inverted 14 bit is used.
>
> Signed-off-by: Eugene Korenevsky
> ---
>  arch/x86/kvm/vmx.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 09ccf6c..a8ef8265 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8206,6 +8206,18 @@ static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
>       vcpu->arch.walk_mmu = &vcpu->arch.mmu;
>  }
>
> +static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
> +                                           u16 error_code)
> +{
> +     bool inequality, bit;
> +
> +     bit = (vmcs12->exception_bitmap & (1u << PF_VECTOR)) != 0;
> +     inequality =
> +             (error_code & vmcs12->page_fault_error_code_mask) !=
> +              vmcs12->page_fault_error_code_match;
> +     return inequality ^ bit;
> +}
> +
>  static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
>               struct x86_exception *fault)
>  {
> @@ -8213,8 +8225,7 @@ static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
>
>       WARN_ON(!is_guest_mode(vcpu));
>
> -     /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> -     if (vmcs12->exception_bitmap & (1u << PF_VECTOR))
> +     if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code))
>               nested_vmx_vmexit(vcpu, to_vmx(vcpu)->exit_reason,
>                                 vmcs_read32(VM_EXIT_INTR_INFO),
>                                 vmcs_readl(EXIT_QUALIFICATION));
>

Applied to kvm/queue, thanks.

Paolo
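The rule in the commit message can be tried outside the kernel. The following is a minimal standalone sketch of the same check, assuming a hypothetical cut-down struct vmcs12_sketch that holds only the three fields the helper reads; it mirrors the logic of the patch but is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_VECTOR 14

/* Hypothetical stand-in for struct vmcs12: only the fields the check uses. */
struct vmcs12_sketch {
    uint32_t exception_bitmap;
    uint32_t page_fault_error_code_mask;
    uint32_t page_fault_error_code_match;
};

/*
 * If (PFEC & PFEC_MASK) == PFEC_MATCH, bit 14 of the exception bitmap
 * decides whether #PF causes a VM exit; otherwise the inverted bit decides.
 */
static bool is_page_fault_vmexit(const struct vmcs12_sketch *vmcs12,
                                 uint16_t error_code)
{
    bool bit = (vmcs12->exception_bitmap & (1u << PF_VECTOR)) != 0;
    bool inequality =
        (error_code & vmcs12->page_fault_error_code_mask) !=
        vmcs12->page_fault_error_code_match;

    return inequality ^ bit;
}
```

With mask and match both zero the comparison always succeeds, so the result is simply the bitmap bit, which is the behaviour the code had before the patch. With mask = 1 and match = 1 (the present bit of the error code) and the bitmap bit set, only faults whose error code has that bit set cause a VM exit.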
Re: [v3 23/26] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
On 18/12/2014 04:15, Wu, Feng wrote:
> Thanks for your comments, Paolo!
>
> If we use u64 new_control, we cannot use new.sn any more.
> Maybe we can change the struct pi_desc {} like this:
>
> typedef struct pid_control{
>       u64     on      : 1,
>               sn      : 1,
>               rsvd_1  : 13,
>               ndm     : 1,
>               nv      : 8,
>               rsvd_2  : 8,
>               ndst    : 32;
> }pid_control_t;
>
> struct pi_desc {
>       u32 pir[8]; /* Posted interrupt requested */
>       pid_control_t control;

Probably something like this to keep the union:

        typedef union pid_control {
                u64 full;
                struct {
                        u64 on : 1,
                        ...
                } fields;
        };

>       u32 rsvd[6];
> } __aligned(64);
>
>
> Then we can define pid_control_t new_control, old_control. And use
> new_control.sn = 0.
>
> What is your opinion?

Sure. Alternatively, keep using struct pi_desc new; just do not zero
it, nor access any field outside the control word.

Paolo
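To make the union idea concrete, here is a small userspace sketch that uses the field widths from the quoted struct and clears SN with a compare-and-swap on the whole 64-bit word. Everything in it is an assumption made for illustration: uint64_t bit-fields and their bit order depend on the compiler (GCC and Clang accept them), pid_clear_sn() is a hypothetical helper, and C11 atomics stand in for the kernel's cmpxchg.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Control word of the posted-interrupt descriptor, viewed two ways. */
union pid_control {
    uint64_t full;
    struct {
        uint64_t on     : 1,
                 sn     : 1,
                 rsvd_1 : 13,
                 ndm    : 1,
                 nv     : 8,
                 rsvd_2 : 8,
                 ndst   : 32;
    } fields;
};

_Static_assert(sizeof(union pid_control) == sizeof(uint64_t),
               "control word must be exactly 64 bits");

/*
 * Hypothetical helper: clear SN while leaving every other field alone.
 * The new value is built through the bit-field view and published
 * through the u64 view, so the update happens in one atomic step.
 */
static void pid_clear_sn(_Atomic uint64_t *control)
{
    union pid_control old_ctl, new_ctl;

    old_ctl.full = atomic_load(control);
    do {
        new_ctl = old_ctl;
        new_ctl.fields.sn = 0;
        /* On failure old_ctl.full is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(control, &old_ctl.full,
                                           new_ctl.full));
}
```

The point of the union is visible in the loop: named fields such as sn stay available for building the new value, while the full member gives the single 64-bit quantity that the compare-and-swap needs.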
Re: [PATCH] KVM: add KVM_CAP_VMX_APICV to advertise hardware apic-v support
On 18/12/2014 04:37, zhanghailiang wrote:
>> set this bit. But the HV_X64_APIC_ACCESS_RECOMMENDED bit should
>> probably not be set if the host supports apic-v. I haven't done any
>
> That is what this patch wants to do ;)

This would cause a guest ABI change upon migration from APICv to
non-APICv hosts, so it is not possible.

Perhaps we need to set bit 8 but not bit 3.

Paolo