Re: [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on
On Wed, Apr 05, 2017 at 11:12:12AM +0200, Andrew Jones wrote: > On Wed, Apr 05, 2017 at 10:50:05AM +0200, Christoffer Dall wrote: > > On Wed, Apr 05, 2017 at 10:35:59AM +0200, Andrew Jones wrote: > > > On Tue, Apr 04, 2017 at 09:42:08PM +0200, Christoffer Dall wrote: > > > > On Fri, Mar 31, 2017 at 06:06:57PM +0200, Andrew Jones wrote: > > > > > From: Levente Kurusa > > > > > > > > > > When two vcpus issue PSCI_CPU_ON on the same core at the same time, > > > > > then it's possible for them to both enter the target vcpu's setup > > > > > at the same time. This results in unexpected behaviors at best, > > > > > and the potential for some nasty bugs at worst. > > > > > > > > > > Signed-off-by: Levente Kurusa > > > > > Signed-off-by: Andrew Jones > > > > > --- > > > > > arch/arm/kvm/psci.c | 4 ++-- > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c > > > > > index f732484abc7a..0204daa899b1 100644 > > > > > --- a/arch/arm/kvm/psci.c > > > > > +++ b/arch/arm/kvm/psci.c > > > > > @@ -88,7 +88,8 @@ static unsigned long kvm_psci_vcpu_on(struct > > > > > kvm_vcpu *source_vcpu) > > > > >*/ > > > > > if (!vcpu) > > > > > return PSCI_RET_INVALID_PARAMS; > > > > > - if (!test_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > > > + > > > > > + if (!test_and_clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > > > if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) > > > > > return PSCI_RET_ALREADY_ON; > > > > > else > > > > > @@ -116,7 +117,6 @@ static unsigned long kvm_psci_vcpu_on(struct > > > > > kvm_vcpu *source_vcpu) > > > > >* the general puspose registers are undefined upon CPU_ON. > > > > >*/ > > > > > vcpu_set_reg(vcpu, 0, context_id); > > > > > - clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests); > > > > > > > > > > wq = kvm_arch_vcpu_wq(vcpu); > > > > > swake_up(wq); > > > > > -- > > > > > 2.9.3 > > > > > > > > > > > > > Depending on what you end up doing with the requests, if you keep the > > > > bool flag you could just use the kvm->lock mutex instead. > > > > > > > > Have you considered if there are any potential races between > > > > kvm_psci_system_off() being called on one VCPU while two other VPCUs are > > > > turning on the same CPU that is being turend off as part of system-wide > > > > power down as well? > > > > > > Sounds like a nice unit test. I haven't considered it, but I guess > > > the kvm_psci_system_off/reset calling VCPU will ultimately "win", as > > > it'll cause an exit to userspace that initiates a shutdown/reset. > > > When the VCPUs are restarted then vcpu init should reset the power_off > > > state correctly. As long as the race this patch addresses is fixed, then > > > I'm not sure there should be any risk with the actual system_off/reset > > > being delayed wrt a vcpu being "on'ed" again, nor with there being more > > > than one VCPU trying to "on" it at the same time. > > > > > > > > > > > I'm wondering if this means we should take the kvm->lock at a higher > > > > level when handling PSCI events... > > > > > > That would simplify our analysis of the PSCI emulation, but I'm not > > > sure we want to give a guest the power to constantly acquire that > > > mutex with a barrage of PSCI calls. Maybe we should create a PSCI > > > mutex? In order to avoid holding it too long we may want power_off to > > > be more than a boolean though, i.e. the PENDING state might also be > > > a good idea to represent. > > > > > > > Hmm, the kvm->lock mutex is per-VM, so if a VM wants to use its CPU > > resources by taking its own mutex, I don't really see the problem. > > I was worried about management paths that lead to a need for that > lock. For example, I see x86's kvm_free_vcpus(), called from > kvm_arch_destroy_vm(), acquires it. A quick grep of ARM code doesn't > reveal anything though. > Even in that case, PSCI is guaranteed to make progress, right? So I still don't understand the challenge. In any case, I'll have a look over this patch again when you respin. Thanks, -Christoffer ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on
On Wed, Apr 05, 2017 at 10:50:05AM +0200, Christoffer Dall wrote: > On Wed, Apr 05, 2017 at 10:35:59AM +0200, Andrew Jones wrote: > > On Tue, Apr 04, 2017 at 09:42:08PM +0200, Christoffer Dall wrote: > > > On Fri, Mar 31, 2017 at 06:06:57PM +0200, Andrew Jones wrote: > > > > From: Levente Kurusa > > > > > > > > When two vcpus issue PSCI_CPU_ON on the same core at the same time, > > > > then it's possible for them to both enter the target vcpu's setup > > > > at the same time. This results in unexpected behaviors at best, > > > > and the potential for some nasty bugs at worst. > > > > > > > > Signed-off-by: Levente Kurusa > > > > Signed-off-by: Andrew Jones > > > > --- > > > > arch/arm/kvm/psci.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c > > > > index f732484abc7a..0204daa899b1 100644 > > > > --- a/arch/arm/kvm/psci.c > > > > +++ b/arch/arm/kvm/psci.c > > > > @@ -88,7 +88,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > > > > *source_vcpu) > > > > */ > > > > if (!vcpu) > > > > return PSCI_RET_INVALID_PARAMS; > > > > - if (!test_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > > + > > > > + if (!test_and_clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > > if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) > > > > return PSCI_RET_ALREADY_ON; > > > > else > > > > @@ -116,7 +117,6 @@ static unsigned long kvm_psci_vcpu_on(struct > > > > kvm_vcpu *source_vcpu) > > > > * the general puspose registers are undefined upon CPU_ON. > > > > */ > > > > vcpu_set_reg(vcpu, 0, context_id); > > > > - clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests); > > > > > > > > wq = kvm_arch_vcpu_wq(vcpu); > > > > swake_up(wq); > > > > -- > > > > 2.9.3 > > > > > > > > > > Depending on what you end up doing with the requests, if you keep the > > > bool flag you could just use the kvm->lock mutex instead. > > > > > > Have you considered if there are any potential races between > > > kvm_psci_system_off() being called on one VCPU while two other VPCUs are > > > turning on the same CPU that is being turend off as part of system-wide > > > power down as well? > > > > Sounds like a nice unit test. I haven't considered it, but I guess > > the kvm_psci_system_off/reset calling VCPU will ultimately "win", as > > it'll cause an exit to userspace that initiates a shutdown/reset. > > When the VCPUs are restarted then vcpu init should reset the power_off > > state correctly. As long as the race this patch addresses is fixed, then > > I'm not sure there should be any risk with the actual system_off/reset > > being delayed wrt a vcpu being "on'ed" again, nor with there being more > > than one VCPU trying to "on" it at the same time. > > > > > > > > I'm wondering if this means we should take the kvm->lock at a higher > > > level when handling PSCI events... > > > > That would simplify our analysis of the PSCI emulation, but I'm not > > sure we want to give a guest the power to constantly acquire that > > mutex with a barrage of PSCI calls. Maybe we should create a PSCI > > mutex? In order to avoid holding it too long we may want power_off to > > be more than a boolean though, i.e. the PENDING state might also be > > a good idea to represent. > > > > Hmm, the kvm->lock mutex is per-VM, so if a VM wants to use its CPU > resources by taking its own mutex, I don't really see the problem. I was worried about management paths that lead to a need for that lock. For example, I see x86's kvm_free_vcpus(), called from kvm_arch_destroy_vm(), acquires it. A quick grep of ARM code doesn't reveal anything though. Thanks, drew ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on
On Wed, Apr 05, 2017 at 10:35:59AM +0200, Andrew Jones wrote: > On Tue, Apr 04, 2017 at 09:42:08PM +0200, Christoffer Dall wrote: > > On Fri, Mar 31, 2017 at 06:06:57PM +0200, Andrew Jones wrote: > > > From: Levente Kurusa > > > > > > When two vcpus issue PSCI_CPU_ON on the same core at the same time, > > > then it's possible for them to both enter the target vcpu's setup > > > at the same time. This results in unexpected behaviors at best, > > > and the potential for some nasty bugs at worst. > > > > > > Signed-off-by: Levente Kurusa > > > Signed-off-by: Andrew Jones > > > --- > > > arch/arm/kvm/psci.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c > > > index f732484abc7a..0204daa899b1 100644 > > > --- a/arch/arm/kvm/psci.c > > > +++ b/arch/arm/kvm/psci.c > > > @@ -88,7 +88,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > > > *source_vcpu) > > >*/ > > > if (!vcpu) > > > return PSCI_RET_INVALID_PARAMS; > > > - if (!test_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > + > > > + if (!test_and_clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > > if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) > > > return PSCI_RET_ALREADY_ON; > > > else > > > @@ -116,7 +117,6 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > > > *source_vcpu) > > >* the general puspose registers are undefined upon CPU_ON. > > >*/ > > > vcpu_set_reg(vcpu, 0, context_id); > > > - clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests); > > > > > > wq = kvm_arch_vcpu_wq(vcpu); > > > swake_up(wq); > > > -- > > > 2.9.3 > > > > > > > Depending on what you end up doing with the requests, if you keep the > > bool flag you could just use the kvm->lock mutex instead. > > > > Have you considered if there are any potential races between > > kvm_psci_system_off() being called on one VCPU while two other VPCUs are > > turning on the same CPU that is being turend off as part of system-wide > > power down as well? > > Sounds like a nice unit test. I haven't considered it, but I guess > the kvm_psci_system_off/reset calling VCPU will ultimately "win", as > it'll cause an exit to userspace that initiates a shutdown/reset. > When the VCPUs are restarted then vcpu init should reset the power_off > state correctly. As long as the race this patch addresses is fixed, then > I'm not sure there should be any risk with the actual system_off/reset > being delayed wrt a vcpu being "on'ed" again, nor with there being more > than one VCPU trying to "on" it at the same time. > > > > > I'm wondering if this means we should take the kvm->lock at a higher > > level when handling PSCI events... > > That would simplify our analysis of the PSCI emulation, but I'm not > sure we want to give a guest the power to constantly acquire that > mutex with a barrage of PSCI calls. Maybe we should create a PSCI > mutex? In order to avoid holding it too long we may want power_off to > be more than a boolean though, i.e. the PENDING state might also be > a good idea to represent. > Hmm, the kvm->lock mutex is per-VM, so if a VM wants to use its CPU resources by taking its own mutex, I don't really see the problem. -Christoffer ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on
On Tue, Apr 04, 2017 at 09:42:08PM +0200, Christoffer Dall wrote: > On Fri, Mar 31, 2017 at 06:06:57PM +0200, Andrew Jones wrote: > > From: Levente Kurusa > > > > When two vcpus issue PSCI_CPU_ON on the same core at the same time, > > then it's possible for them to both enter the target vcpu's setup > > at the same time. This results in unexpected behaviors at best, > > and the potential for some nasty bugs at worst. > > > > Signed-off-by: Levente Kurusa > > Signed-off-by: Andrew Jones > > --- > > arch/arm/kvm/psci.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c > > index f732484abc7a..0204daa899b1 100644 > > --- a/arch/arm/kvm/psci.c > > +++ b/arch/arm/kvm/psci.c > > @@ -88,7 +88,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > > *source_vcpu) > > */ > > if (!vcpu) > > return PSCI_RET_INVALID_PARAMS; > > - if (!test_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > + > > + if (!test_and_clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > > if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) > > return PSCI_RET_ALREADY_ON; > > else > > @@ -116,7 +117,6 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > > *source_vcpu) > > * the general puspose registers are undefined upon CPU_ON. > > */ > > vcpu_set_reg(vcpu, 0, context_id); > > - clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests); > > > > wq = kvm_arch_vcpu_wq(vcpu); > > swake_up(wq); > > -- > > 2.9.3 > > > > Depending on what you end up doing with the requests, if you keep the > bool flag you could just use the kvm->lock mutex instead. > > Have you considered if there are any potential races between > kvm_psci_system_off() being called on one VCPU while two other VPCUs are > turning on the same CPU that is being turend off as part of system-wide > power down as well? Sounds like a nice unit test. I haven't considered it, but I guess the kvm_psci_system_off/reset calling VCPU will ultimately "win", as it'll cause an exit to userspace that initiates a shutdown/reset. When the VCPUs are restarted then vcpu init should reset the power_off state correctly. As long as the race this patch addresses is fixed, then I'm not sure there should be any risk with the actual system_off/reset being delayed wrt a vcpu being "on'ed" again, nor with there being more than one VCPU trying to "on" it at the same time. > > I'm wondering if this means we should take the kvm->lock at a higher > level when handling PSCI events... That would simplify our analysis of the PSCI emulation, but I'm not sure we want to give a guest the power to constantly acquire that mutex with a barrage of PSCI calls. Maybe we should create a PSCI mutex? In order to avoid holding it too long we may want power_off to be more than a boolean though, i.e. the PENDING state might also be a good idea to represent. Thanks, drew ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on
On Fri, Mar 31, 2017 at 06:06:57PM +0200, Andrew Jones wrote: > From: Levente Kurusa > > When two vcpus issue PSCI_CPU_ON on the same core at the same time, > then it's possible for them to both enter the target vcpu's setup > at the same time. This results in unexpected behaviors at best, > and the potential for some nasty bugs at worst. > > Signed-off-by: Levente Kurusa > Signed-off-by: Andrew Jones > --- > arch/arm/kvm/psci.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c > index f732484abc7a..0204daa899b1 100644 > --- a/arch/arm/kvm/psci.c > +++ b/arch/arm/kvm/psci.c > @@ -88,7 +88,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > *source_vcpu) >*/ > if (!vcpu) > return PSCI_RET_INVALID_PARAMS; > - if (!test_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > + > + if (!test_and_clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests)) { > if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) > return PSCI_RET_ALREADY_ON; > else > @@ -116,7 +117,6 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu > *source_vcpu) >* the general puspose registers are undefined upon CPU_ON. >*/ > vcpu_set_reg(vcpu, 0, context_id); > - clear_bit(KVM_REQ_POWER_OFF, &vcpu->requests); > > wq = kvm_arch_vcpu_wq(vcpu); > swake_up(wq); > -- > 2.9.3 > Depending on what you end up doing with the requests, if you keep the bool flag you could just use the kvm->lock mutex instead. Have you considered if there are any potential races between kvm_psci_system_off() being called on one VCPU while two other VPCUs are turning on the same CPU that is being turend off as part of system-wide power down as well? I'm wondering if this means we should take the kvm->lock at a higher level when handling PSCI events... Thanks, -Christoffer ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm