Signed-off-by: Nikunj A. Dadhania <nik...@linux.vnet.ibm.com>
---
 Documentation/virtual/kvm/msr.txt                |    4 ++
 Documentation/virtual/kvm/paravirt-tlb-flush.txt |   53 ++++++++++++++++++++++
 2 files changed, 57 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/paravirt-tlb-flush.txt

diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 7304710..92a6af6 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -256,3 +256,7 @@ MSR_KVM_EOI_EN: 0x4b564d04
        guest must both read the least significant bit in the memory area and
        clear it using a single CPU instruction, such as test and clear, or
        compare and exchange.
+
+MSR_KVM_VCPU_STATE: 0x4b564d05
+
+See Documentation/virtual/kvm/paravirt-tlb-flush.txt
diff --git a/Documentation/virtual/kvm/paravirt-tlb-flush.txt b/Documentation/virtual/kvm/paravirt-tlb-flush.txt
new file mode 100644
index 0000000..0eaabd7
--- /dev/null
+++ b/Documentation/virtual/kvm/paravirt-tlb-flush.txt
@@ -0,0 +1,53 @@
+KVM - Paravirt TLB Flush
+Nikunj A Dadhania <nik...@linux.vnet.ibm.com>, IBM, 2012
+========================================================
+
+Remote flushing APIs busy-wait, which is fine in the bare-metal
+scenario. Within a guest, however, the vcpus might have been
+pre-empted or blocked. In this scenario, the initiator vcpu would end
+up busy-waiting for a long time.
+
+Avoiding this requires that the running/not-running state of each
+vcpu be visible within the guest, so that it can decide whom to wait
+for. The following MSR introduces vcpu running state information.
+
+Using this MSR we have implemented a paravirt TLB flush that does
+not wait for vcpus that are not running. TLB flushing for those
+vcpus is deferred and is done on guest enter.
+
+MSR_KVM_VCPU_STATE: 0x4b564d05
+
+       data: 64-byte aligned physical address of a memory area which must
+       be in guest RAM, plus an enable bit in bit 0. This memory is expected
+       to hold a copy of the following structure:
+
+       struct kvm_vcpu_state {
+               __u64 state;
+               __u32 pad[14];
+       };
+
+       whose data will be filled in by the hypervisor/guest. Only one
+       write, or registration, is needed for each VCPU.  The interval
+       between updates of this structure is arbitrary and
+       implementation-dependent.  The hypervisor may update this
+       structure at any time it sees fit until a value with bit 0 ==
+       0 is written to the MSR. The guest is required to make sure
+       this structure is initialized to zero.
+
+       This enables a VCPU to know the running status of sibling
+       VCPUs. The information can further be used to decide whether
+       an IPI needs to be sent to a non-running VCPU, avoiding an
+       unnecessary wait for it, e.g. in flush_tlb_others_ipi.
+
+       Fields have the following meanings:
+
+               state: has the following bit fields:
+
+               Bit 0 - vcpu running state. The hypervisor sets this
+                       to reflect whether the vcpu is running. Value
+                       1 means the vcpu is running and value 0 means
+                       the vcpu is pre-empted out.
+
+               Bit 1 - flush request. When set, the hypervisor should
+                       flush the tlb during guest enter/exit.
+
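
Note (not part of the patch): the registration value and the flush
decision described in paravirt-tlb-flush.txt can be sketched in plain
userspace C as below. This is only an illustration under stated
assumptions, not the code from this series. The struct name, the bit
macro names and both helper functions are hypothetical. It also assumes
that the initiator vcpu is the one that sets bit 1 for a non-running
target, which the text above implies but does not state outright. Real
guest code would have to update the state word atomically, since the
hypervisor may change it concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSR index as listed in the msr.txt hunk of the patch. */
#define MSR_KVM_VCPU_STATE       0x4b564d05u
/* Bit 0 of the value written to the MSR enables the feature. */
#define VCPU_STATE_MSR_ENABLE    (1ULL << 0)

/* Hypothetical names for the two documented bits of 'state'. */
#define VCPU_STATE_RUNNING       (1ULL << 0)
#define VCPU_STATE_SHOULD_FLUSH  (1ULL << 1)

/* Layout as documented: 8 + 14 * 4 = 64 bytes. The name is made up. */
struct vcpu_state_area {
	uint64_t state;
	uint32_t pad[14];
};
_Static_assert(sizeof(struct vcpu_state_area) == 64,
	       "state area must be exactly 64 bytes");

/*
 * Build the value to write to the MSR: the 64-byte aligned physical
 * address of the area with the enable bit set. Returns 0 (feature
 * disabled) if the address is not 64-byte aligned.
 */
static uint64_t vcpu_state_msr_value(uint64_t phys_addr)
{
	if (phys_addr & 63)
		return 0;
	return phys_addr | VCPU_STATE_MSR_ENABLE;
}

enum flush_action {
	FLUSH_SEND_IPI,	/* target is running: send IPI and wait */
	FLUSH_DEFERRED,	/* target not running: flush on guest enter */
};

/*
 * Decide how to flush one remote vcpu. Not atomic: a real
 * implementation would need something like compare-and-exchange so
 * that a concurrent hypervisor update of bit 0 is not lost.
 */
static enum flush_action flush_one(struct vcpu_state_area *target)
{
	assert(target != NULL);

	if (target->state & VCPU_STATE_RUNNING)
		return FLUSH_SEND_IPI;

	/* Assumption: the initiator requests the deferred flush. */
	target->state |= VCPU_STATE_SHOULD_FLUSH;
	return FLUSH_DEFERRED;
}
```

With these helpers, an initiator would call flush_one() once per target
vcpu and wait only for the targets that returned FLUSH_SEND_IPI. A
pre-empted target is left with bit 1 set and is never waited for.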
