Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/2020 2:08 PM, Tao Xu wrote:
> On 11/3/20 12:43 AM, Andy Lutomirski wrote:
>> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>>> ...
>>> +static int handle_notify(struct kvm_vcpu *vcpu)
>>> +{
>>> +        unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>>> +
>>> +        /*
>>> +         * Notify VM exit happened while executing iret from NMI,
>>> +         * "blocked by NMI" bit has to be set before next VM entry.
>>> +         */
>>> +        if (exit_qualification & NOTIFY_VM_CONTEXT_VALID) {
>>> +                if (enable_vnmi &&
>>> +                    (exit_qualification & INTR_INFO_UNBLOCK_NMI))
>>> +                        vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
>>> +                                      GUEST_INTR_STATE_NMI);
>>
>> This needs actual documentation in the SDM or at least ISE please.

Hi Andy,

Do you mean that the SDM or ISE should call out that "blocked by NMI"
needs to be restored if bit 12 of the exit qualification is set and the
VMM decides to re-enter the guest?

You can refer to SDM 27.2.3, "Information about NMI unblocking Due to
IRET", in the latest SDM (325462-072US).

> Notify VM-Exit is defined in ISE, chapter 9.2:
> https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
>
> I will add this information into commit message. Thank you for
> reminding me.
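For readers following the NMI-unblocking question, the condition in the quoted handle_notify() fragment can be restated as a small standalone C sketch. This is not the patch's code: restore_nmi_blocking() is a hypothetical helper, the bit position used for NOTIFY_VM_CONTEXT_VALID is a placeholder (the thread does not give it), and the value of GUEST_INTR_STATE_NMI is an assumption. Only bit 12 for the IRET/NMI-unblocking flag comes from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position; the thread does not state the real one. */
#define NOTIFY_VM_CONTEXT_VALID  (1ULL << 0)
/* Bit 12 of the exit qualification, as mentioned in the reply above. */
#define INTR_INFO_UNBLOCK_NMI    (1ULL << 12)
/* Assumed value of the "blocking by NMI" interruptibility-state bit. */
#define GUEST_INTR_STATE_NMI     (1U << 3)

/*
 * Hypothetical helper: return the interruptibility state to use for the
 * next VM entry. NMI blocking is re-asserted only when the exit reports
 * a valid context, virtual NMIs are enabled, and the exit interrupted an
 * IRET that was unblocking NMIs.
 */
uint32_t restore_nmi_blocking(uint64_t exit_qualification,
                              bool enable_vnmi,
                              uint32_t interruptibility)
{
    if ((exit_qualification & NOTIFY_VM_CONTEXT_VALID) &&
        enable_vnmi &&
        (exit_qualification & INTR_INFO_UNBLOCK_NMI))
        interruptibility |= GUEST_INTR_STATE_NMI;

    return interruptibility;
}
```

Keeping the decision in a pure function like this makes the three preconditions easy to check one at a time; in the real handler the result would be written back to the VMCS rather than returned.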
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/2020 2:25 AM, Paolo Bonzini wrote:
> On 02/11/20 19:01, Andy Lutomirski wrote:
>> What's the point? Surely the kernel should reliably mitigate the
>> flaw, and the kernel should decide how to do so.
>
> There is some slowdown in trapping #DB and #AC unconditionally. Though
> for these two cases nobody should care so I agree with keeping the code
> simple and keeping the workaround.

OK.

> Also, why would this trigger after more than a few hundred cycles,
> something like the length of the longest microcode loop? HZ*10 seems
> like a very generous estimate already.

As Sean said in another mail, 1/10 of a tick is only a placeholder. Glad
to see that all of you think it should be smaller. We'll come up with a
more reasonable candidate once we can test on real silicon.
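To put the "1/10 of a tick" placeholder next to "a few hundred cycles", here is a rough arithmetic sketch in C. It assumes the notify window is counted in cycles of some clock of known frequency, which the thread does not confirm; notify_window_cycles() and its parameters are hypothetical names used only for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of clock cycles in 1/divisor of a
 * periodic tick.
 *   clock_hz - frequency of the clock the window is counted in (assumed)
 *   hz       - periodic tick rate, i.e. the kernel's HZ
 *   divisor  - 10 for the "1/10 of a tick" placeholder
 * Returns 0 for degenerate inputs; this sketch treats that as
 * "feature disabled".
 */
uint64_t notify_window_cycles(uint64_t clock_hz, unsigned int hz,
                              unsigned int divisor)
{
    if (hz == 0 || divisor == 0)
        return 0;

    return clock_hz / ((uint64_t)hz * divisor);
}
```

With a 2 GHz clock and HZ=250, notify_window_cycles(2000000000ULL, 250, 10) gives 800000 cycles (400 microseconds), several orders of magnitude above the few hundred cycles suggested in the quoted message.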
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/2020 2:12 PM, Tao Xu wrote:
> On 11/3/20 6:53 AM, Jim Mattson wrote:
>> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>>> There are some cases that malicious virtual machines can cause CPU stuck
>>> (event windows don't open up), e.g., infinite loop in microcode when
>>> nested #AC (CVE-2015-5307). No event window obviously means no events,
>>> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
>>> hardware CPU can't be used by host or other VM.
>>>
>>> To resolve those cases, it can enable a notify VM exit if no
>>> event window occur in VMX non-root mode for a specified amount of
>>> time (notify window).
>>>
>>> Expose a module param for setting notify window, default setting it to
>>> the time as 1/10 of periodic tick, and user can set it to 0 to disable
>>> this feature.
>>>
>>> TODO:
>>> 1. The appropriate value of notify window.
>>> 2. Another patch to disable interception of #DB and #AC when notify
>>>    VM-Exiting is enabled.
>>>
>>> Co-developed-by: Xiaoyao Li
>>> Signed-off-by: Tao Xu
>>> Signed-off-by: Xiaoyao Li
>>
>> Do you have test cases?

Yes, we have. The nested #AC case (CVE-2015-5307) is a known test case,
though we need to tweak KVM to disable interception of #AC for it.

> Not yet, because we are waiting real silicon to do some test. I should
> add RFC next time before I test it in hardware.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/20 6:53 AM, Jim Mattson wrote:
> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>> There are some cases that malicious virtual machines can cause CPU stuck
>> (event windows don't open up), e.g., infinite loop in microcode when
>> nested #AC (CVE-2015-5307). No event window obviously means no events,
>> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
>> hardware CPU can't be used by host or other VM.
>>
>> To resolve those cases, it can enable a notify VM exit if no
>> event window occur in VMX non-root mode for a specified amount of
>> time (notify window).
>>
>> Expose a module param for setting notify window, default setting it to
>> the time as 1/10 of periodic tick, and user can set it to 0 to disable
>> this feature.
>>
>> TODO:
>> 1. The appropriate value of notify window.
>> 2. Another patch to disable interception of #DB and #AC when notify
>>    VM-Exiting is enabled.
>>
>> Co-developed-by: Xiaoyao Li
>> Signed-off-by: Tao Xu
>> Signed-off-by: Xiaoyao Li
>
> Do you have test cases?

Not yet, because we are waiting for real silicon to do some testing. Next
time I should add an RFC tag until I have tested it on hardware.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/20 12:43 AM, Andy Lutomirski wrote:
> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>> There are some cases that malicious virtual machines can cause CPU stuck
>> (event windows don't open up), e.g., infinite loop in microcode when
>> nested #AC (CVE-2015-5307). No event window obviously means no events,
>> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
>> hardware CPU can't be used by host or other VM.
>>
>> To resolve those cases, it can enable a notify VM exit if no
>> event window occur in VMX non-root mode for a specified amount of
>> time (notify window).
>>
>> Expose a module param for setting notify window, default setting it to
>> the time as 1/10 of periodic tick, and user can set it to 0 to disable
>> this feature.
>>
>> TODO:
>> 1. The appropriate value of notify window.
>> 2. Another patch to disable interception of #DB and #AC when notify
>>    VM-Exiting is enabled.
>
> Whoa there.
>
> A VM control that says "hey, CPU, if you messed up and livelocked for
> a long time, please break out of the loop" is not a substitute for
> fixing the livelocks. So I don't think you get to disable
> interception of #DB and #AC. I also think you should print a loud
> warning and have some intelligent handling when this new exit
> triggers.
>
>> +static int handle_notify(struct kvm_vcpu *vcpu)
>> +{
>> +        unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>> +
>> +        /*
>> +         * Notify VM exit happened while executing iret from NMI,
>> +         * "blocked by NMI" bit has to be set before next VM entry.
>> +         */
>> +        if (exit_qualification & NOTIFY_VM_CONTEXT_VALID) {
>> +                if (enable_vnmi &&
>> +                    (exit_qualification & INTR_INFO_UNBLOCK_NMI))
>> +                        vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
>> +                                      GUEST_INTR_STATE_NMI);
>
> This needs actual documentation in the SDM or at least ISE please.

Notify VM-Exit is defined in ISE, chapter 9.2:
https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf

I will add this information to the commit message. Thank you for
reminding me.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/20 1:32 AM, Sean Christopherson wrote:
> On Mon, Nov 02, 2020 at 02:14:45PM +0800, Tao Xu wrote:
>> There are some cases that malicious virtual machines can cause CPU stuck
>> (event windows don't open up), e.g., infinite loop in microcode when
>> nested #AC (CVE-2015-5307). No event window obviously means no events,
>> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
>> hardware CPU can't be used by host or other VM.
>>
>> To resolve those cases, it can enable a notify VM exit if no
>> event window occur in VMX non-root mode for a specified amount of
>> time (notify window).
>>
>> Expose a module param for setting notify window, default setting it to
>> the time as 1/10 of periodic tick, and user can set it to 0 to disable
>> this feature.
>>
>> TODO:
>> 1. The appropriate value of notify window.
>> 2. Another patch to disable interception of #DB and #AC when notify
>>    VM-Exiting is enabled.
>>
>> Co-developed-by: Xiaoyao Li
>> Signed-off-by: Tao Xu
>> Signed-off-by: Xiaoyao Li
>
> Incorrect ordering, since you're sending the patch, you "handled" it
> last, therefore your SOB should come last, i.e.:
>
>   Co-developed-by: Xiaoyao Li
>   Signed-off-by: Xiaoyao Li
>   Signed-off-by: Tao Xu

OK, I will correct this.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 11/3/20 1:31 AM, Sean Christopherson wrote:
> On Mon, Nov 02, 2020 at 08:43:30AM -0800, Andy Lutomirski wrote:
>> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>>> 2. Another patch to disable interception of #DB and #AC when notify
>>>    VM-Exiting is enabled.
>>
>> Whoa there.
>>
>> A VM control that says "hey, CPU, if you messed up and livelocked for
>> a long time, please break out of the loop" is not a substitute for
>> fixing the livelocks. So I don't think you get to disable
>> interception of #DB and #AC.
>
> I think that can be incorporated into a module param, i.e. let the
> platform owner decide which tool(s) they want to use to mitigate the
> legacy architecture flaws.
>
>> I also think you should print a loud warning
>
> I'm not so sure on this one, e.g. userspace could just spin up a new
> instance of its malicious guest and spam the kernel log.
>
>> and have some intelligent handling when this new exit triggers.
>
> We discussed something similar in the context of the new bus lock
> VM-Exit. I don't know that it makes sense to try and add intelligence
> into the kernel. In many use cases, e.g. clouds, the userspace VMM is
> trusted (inasmuch as userspace can be trusted), while the guest is
> completely untrusted. Reporting the error to userspace and letting the
> userspace stack take action is likely preferable to doing something
> fancy in the kernel.
>
> Tao, this patch should probably be tagged RFC, at least until we can
> experiment with the threshold on real silicon. KVM and kernel behavior
> may depend on the accuracy of detecting actual attacks, e.g. if we can
> set a threshold that has zero false negatives and near-zero false
> positives, then it probably makes sense to be more assertive in how
> such VM-Exits are reported and logged.

Sorry, I should have added the RFC tag to this patch. I will add it next
time.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>
> There are some cases that malicious virtual machines can cause CPU stuck
> (event windows don't open up), e.g., infinite loop in microcode when
> nested #AC (CVE-2015-5307). No event window obviously means no events,
> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
> hardware CPU can't be used by host or other VM.
>
> To resolve those cases, it can enable a notify VM exit if no
> event window occur in VMX non-root mode for a specified amount of
> time (notify window).
>
> Expose a module param for setting notify window, default setting it to
> the time as 1/10 of periodic tick, and user can set it to 0 to disable
> this feature.
>
> TODO:
> 1. The appropriate value of notify window.
> 2. Another patch to disable interception of #DB and #AC when notify
>    VM-Exiting is enabled.
>
> Co-developed-by: Xiaoyao Li
> Signed-off-by: Tao Xu
> Signed-off-by: Xiaoyao Li

Do you have test cases?
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Mon, Nov 02, 2020 at 10:01:16AM -0800, Andy Lutomirski wrote:
> On Mon, Nov 2, 2020 at 9:31 AM Sean Christopherson wrote:
> >
> > On Mon, Nov 02, 2020 at 08:43:30AM -0800, Andy Lutomirski wrote:
> > > On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
> > > > 2. Another patch to disable interception of #DB and #AC when notify
> > > >    VM-Exiting is enabled.
> > >
> > > Whoa there.
> > >
> > > A VM control that says "hey, CPU, if you messed up and livelocked for
> > > a long time, please break out of the loop" is not a substitute for
> > > fixing the livelocks. So I don't think you get to disable
> > > interception of #DB and #AC.
> >
> > I think that can be incorporated into a module param, i.e. let the
> > platform owner decide which tool(s) they want to use to mitigate the
> > legacy architecture flaws.
>
> What's the point? Surely the kernel should reliably mitigate the
> flaw, and the kernel should decide how to do so.

IMO, setting a reasonably low threshold _is_ mitigating such flaws. E.g.
it's entirely possible, if not likely, that we can push the threshold
below various ENCLS instruction latencies. Now I'm curious as to how
exactly the accounting is done under the hood, e.g. I assume retiring
uops of a massive instruction is enough to reset the timer, but I haven't
actually read the specs in detail.

If userspace is truly malicious, it can easily spawn new VMs/processes to
carry out its attack, e.g. exiting to userspace on these VM-Exits
effectively throttles userspace as much as straight killing the process.

> > > I also think you should print a loud warning
> >
> > I'm not so sure on this one, e.g. userspace could just spin up a new
> > instance of its malicious guest and spam the kernel log.
>
> pr_warn_once()?

Or ratelimited. My point was that a straight WARN would be less than
ideal.

> If this triggers, it's a *bug*, right? Kernel or CPU.

Sort of? Many (all?) of the known scenarios that can trigger this exit
are unlikely to ever be fixed in silicon. I'm not saying they shouldn't
be fixed, just that practically speaking they are highly unlikely to be
fixed anytime soon. The infinite #DB/#AC recursion flaws are inarguably
dumb CPU behavior, but there are other scenarios that are less cut and
dried, i.e. may not be fixable without non-trivial tradeoffs.

> > > and have some intelligent handling when this new exit triggers.
> >
> > We discussed something similar in the context of the new bus lock
> > VM-Exit. I don't know that it makes sense to try and add intelligence
> > into the kernel. In many use cases, e.g. clouds, the userspace VMM is
> > trusted (inasmuch as userspace can be trusted), while the guest is
> > completely untrusted. Reporting the error to userspace and letting
> > the userspace stack take action is likely preferable to doing
> > something fancy in the kernel.
> >
> > Tao, this patch should probably be tagged RFC, at least until we can
> > experiment with the threshold on real silicon. KVM and kernel
> > behavior may depend on the accuracy of detecting actual attacks, e.g.
> > if we can set a threshold that has zero false negatives and near-zero
> > false positives, then it probably makes sense to be more assertive in
> > how such VM-Exits are reported and logged.
>
> If you can actually find a threshold that reliably mitigates the bug
> and does not allow a guest to cause undesirably large latency in the
> host, then fine. 1/10 of a tick is way too long, I think.

Yes, this was my internal review feedback as well. Either that got lost
along the way or I wasn't clear enough in stating what should be used as
a placeholder until we have silicon in hand.
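As an illustration of the "ratelimited" alternative to a one-shot or unconditional warning, the idea can be sketched in plain userspace C as an interval-plus-burst limiter. This is not the kernel's printk ratelimit code; struct ratelimit and ratelimit_allow() are hypothetical names, and the caller supplies timestamps in whatever unit it likes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: at most `burst` messages per `interval`. */
struct ratelimit {
    uint64_t interval;     /* window length, in caller-defined time units */
    unsigned int burst;    /* messages allowed per window */
    uint64_t begin;        /* start time of the current window */
    unsigned int printed;  /* messages emitted in the current window */
};

/* Return true if a message may be logged at time `now`. */
bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
    /* Start a fresh window once the current one has elapsed. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }

    if (rl->printed >= rl->burst)
        return false;      /* budget for this window is used up */

    rl->printed++;
    return true;
}
```

Unlike a once-only warning, this keeps reporting if the condition persists over time, while a guest that triggers the exit in a tight loop can only produce `burst` log lines per window.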
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On 02/11/20 19:01, Andy Lutomirski wrote:
> What's the point? Surely the kernel should reliably mitigate the
> flaw, and the kernel should decide how to do so.

There is some slowdown in trapping #DB and #AC unconditionally. Though
for these two cases nobody should care so I agree with keeping the code
simple and keeping the workaround.

Also, why would this trigger after more than a few hundred cycles,
something like the length of the longest microcode loop? HZ*10 seems
like a very generous estimate already.

Paolo

>>> I also think you should print a loud warning
>> I'm not so sure on this one, e.g. userspace could just spin up a new
>> instance of its malicious guest and spam the kernel log.
> pr_warn_once()? If this triggers, it's a *bug*, right? Kernel or CPU.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Mon, Nov 2, 2020 at 9:31 AM Sean Christopherson wrote:
>
> On Mon, Nov 02, 2020 at 08:43:30AM -0800, Andy Lutomirski wrote:
> > On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
> > > 2. Another patch to disable interception of #DB and #AC when notify
> > >    VM-Exiting is enabled.
> >
> > Whoa there.
> >
> > A VM control that says "hey, CPU, if you messed up and livelocked for
> > a long time, please break out of the loop" is not a substitute for
> > fixing the livelocks. So I don't think you get to disable
> > interception of #DB and #AC.
>
> I think that can be incorporated into a module param, i.e. let the
> platform owner decide which tool(s) they want to use to mitigate the
> legacy architecture flaws.

What's the point? Surely the kernel should reliably mitigate the
flaw, and the kernel should decide how to do so.

> > I also think you should print a loud warning
>
> I'm not so sure on this one, e.g. userspace could just spin up a new
> instance of its malicious guest and spam the kernel log.

pr_warn_once()? If this triggers, it's a *bug*, right? Kernel or CPU.

> > and have some intelligent handling when this new exit triggers.
>
> We discussed something similar in the context of the new bus lock
> VM-Exit. I don't know that it makes sense to try and add intelligence
> into the kernel. In many use cases, e.g. clouds, the userspace VMM is
> trusted (inasmuch as userspace can be trusted), while the guest is
> completely untrusted. Reporting the error to userspace and letting the
> userspace stack take action is likely preferable to doing something
> fancy in the kernel.
>
> Tao, this patch should probably be tagged RFC, at least until we can
> experiment with the threshold on real silicon. KVM and kernel behavior
> may depend on the accuracy of detecting actual attacks, e.g. if we can
> set a threshold that has zero false negatives and near-zero false
> positives, then it probably makes sense to be more assertive in how
> such VM-Exits are reported and logged.

If you can actually find a threshold that reliably mitigates the bug
and does not allow a guest to cause undesirably large latency in the
host, then fine. 1/10 of a tick is way too long, I think.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Mon, Nov 02, 2020 at 02:14:45PM +0800, Tao Xu wrote:
> There are some cases that malicious virtual machines can cause CPU stuck
> (event windows don't open up), e.g., infinite loop in microcode when
> nested #AC (CVE-2015-5307). No event window obviously means no events,
> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
> hardware CPU can't be used by host or other VM.
>
> To resolve those cases, it can enable a notify VM exit if no
> event window occur in VMX non-root mode for a specified amount of
> time (notify window).
>
> Expose a module param for setting notify window, default setting it to
> the time as 1/10 of periodic tick, and user can set it to 0 to disable
> this feature.
>
> TODO:
> 1. The appropriate value of notify window.
> 2. Another patch to disable interception of #DB and #AC when notify
>    VM-Exiting is enabled.
>
> Co-developed-by: Xiaoyao Li
> Signed-off-by: Tao Xu
> Signed-off-by: Xiaoyao Li

Incorrect ordering, since you're sending the patch, you "handled" it
last, therefore your SOB should come last, i.e.:

  Co-developed-by: Xiaoyao Li
  Signed-off-by: Xiaoyao Li
  Signed-off-by: Tao Xu
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Mon, Nov 02, 2020 at 08:43:30AM -0800, Andy Lutomirski wrote:
> On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
> > 2. Another patch to disable interception of #DB and #AC when notify
> >    VM-Exiting is enabled.
>
> Whoa there.
>
> A VM control that says "hey, CPU, if you messed up and livelocked for
> a long time, please break out of the loop" is not a substitute for
> fixing the livelocks. So I don't think you get to disable
> interception of #DB and #AC.

I think that can be incorporated into a module param, i.e. let the
platform owner decide which tool(s) they want to use to mitigate the
legacy architecture flaws.

> I also think you should print a loud warning

I'm not so sure on this one, e.g. userspace could just spin up a new
instance of its malicious guest and spam the kernel log.

> and have some intelligent handling when this new exit triggers.

We discussed something similar in the context of the new bus lock
VM-Exit. I don't know that it makes sense to try and add intelligence
into the kernel. In many use cases, e.g. clouds, the userspace VMM is
trusted (inasmuch as userspace can be trusted), while the guest is
completely untrusted. Reporting the error to userspace and letting the
userspace stack take action is likely preferable to doing something
fancy in the kernel.

Tao, this patch should probably be tagged RFC, at least until we can
experiment with the threshold on real silicon. KVM and kernel behavior
may depend on the accuracy of detecting actual attacks, e.g. if we can
set a threshold that has zero false negatives and near-zero false
positives, then it probably makes sense to be more assertive in how such
VM-Exits are reported and logged.
Re: [PATCH] KVM: VMX: Enable Notify VM exit
On Sun, Nov 1, 2020 at 10:14 PM Tao Xu wrote:
>
> There are some cases that malicious virtual machines can cause CPU stuck
> (event windows don't open up), e.g., infinite loop in microcode when
> nested #AC (CVE-2015-5307). No event window obviously means no events,
> e.g. NMIs, SMIs, and IRQs will all be blocked, may cause the related
> hardware CPU can't be used by host or other VM.
>
> To resolve those cases, it can enable a notify VM exit if no
> event window occur in VMX non-root mode for a specified amount of
> time (notify window).
>
> Expose a module param for setting notify window, default setting it to
> the time as 1/10 of periodic tick, and user can set it to 0 to disable
> this feature.
>
> TODO:
> 1. The appropriate value of notify window.
> 2. Another patch to disable interception of #DB and #AC when notify
>    VM-Exiting is enabled.

Whoa there.

A VM control that says "hey, CPU, if you messed up and livelocked for
a long time, please break out of the loop" is not a substitute for
fixing the livelocks. So I don't think you get to disable
interception of #DB and #AC. I also think you should print a loud
warning and have some intelligent handling when this new exit
triggers.

> +static int handle_notify(struct kvm_vcpu *vcpu)
> +{
> +        unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
> +
> +        /*
> +         * Notify VM exit happened while executing iret from NMI,
> +         * "blocked by NMI" bit has to be set before next VM entry.
> +         */
> +        if (exit_qualification & NOTIFY_VM_CONTEXT_VALID) {
> +                if (enable_vnmi &&
> +                    (exit_qualification & INTR_INFO_UNBLOCK_NMI))
> +                        vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
> +                                      GUEST_INTR_STATE_NMI);

This needs actual documentation in the SDM or at least ISE please.