Re: [PATCH RFC V9 0/19] Paravirtualized ticket spinlocks

2013-06-07 Thread Raghavendra K T

On 06/03/2013 11:51 AM, Raghavendra K T wrote:

On 06/03/2013 07:10 AM, Raghavendra K T wrote:

On 06/02/2013 09:50 PM, Jiannan Ouyang wrote:

On Sun, Jun 2, 2013 at 1:07 AM, Gleb Natapov g...@redhat.com wrote:


High level question here. We have high hopes that the Preemptable Ticket
Spinlock patch series by Jiannan Ouyang will solve most, if not all, of the
ticket-spinlock problems in overcommit scenarios without needing PV.
So how does this patch series compare with his patches on PLE enabled
processors?



No experiment results yet.

An error is reported on a 20 core VM. I'm in the middle of an internship
relocation, and will start work on it next week.


Preemptable spinlocks' testing update:
While testing on a 32 core machine with 32 guest vcpus, I hit the same
softlockup problem that Andrew had reported.

After that I started tuning TIMEOUT_UNIT, and when I went up to (1 << 8),
things seemed to be manageable for undercommit cases.
But even after tuning, I still see degradation for undercommit w.r.t. the
baseline itself on the 32 core machine
(37.5% degradation w.r.t. baseline).
I can give the full report after all the tests complete.

For over-commit cases, I again started hitting softlockups (and the
degradation is worse). But as I said in the preemptable thread, the
concept of preemptable locks looks promising (though I am still not a
fan of the embedded TIMEOUT mechanism).

Here is my opinion on the TODOs for preemptable locks to make them better
(I think I need to paste this in the preemptable thread also):

1. The current TIMEOUT_UNIT seems to be on the higher side, and it does not
scale well with large guests or with overcommit. We need a sort of adaptive
mechanism, and better still, different TIMEOUT_UNITs for different types of
lock. The hashing mechanism that was used in Rik's spinlock backoff series
probably fits better.

2. I do not think TIMEOUT_UNIT by itself would work well when we have a
big queue on a lock (for large guests / overcommit). One way is to add a
PV hook that does a yield hypercall immediately for waiters above some
THRESHOLD so that they don't burn the CPU; a rough sketch follows below.
(I can do a POC to check whether that idea improves the situation at some
later point of time.)
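
A very rough sketch of (2), in case it helps the discussion. All names here
(QUEUE_THRESHOLD, pv_yield_hypercall) are hypothetical, not from any posted
series:

/*
 * Sketch: a waiter that finds itself deep in the ticket queue yields
 * immediately instead of burning its TIMEOUT budget.
 */
#define QUEUE_THRESHOLD	4

static void ticket_wait(arch_spinlock_t *lock, __ticket_t ticket)
{
	for (;;) {
		__ticket_t head = ACCESS_ONCE(lock->tickets.head);

		if (head == ticket)
			return;				/* lock is ours */

		if ((__ticket_t)(ticket - head) > QUEUE_THRESHOLD)
			pv_yield_hypercall(lock, ticket); /* hypothetical */
		else
			cpu_relax();		/* shallow waiter keeps spinning */
	}
}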



Preemptable-lock results from my run with 2^8 TIMEOUT:

+----+------------+-----------+------------+-----------+--------------+
                ebizzy (records/sec)  higher is better
+----+------------+-----------+------------+-----------+--------------+
          base       stdev       patched      stdev     %improvement
+----+------------+-----------+------------+-----------+--------------+
1x    5574.9000    237.4997    3484.2000    113.4449     -37.50202
2x    2741.5000    561.3090     351.5000    140.5420     -87.17855
3x    2146.2500    216.7718     194.8333     85.0303     -90.92215
4x    1663.0000    141.9235     101.0000     57.7853     -93.92664
+----+------------+-----------+------------+-----------+--------------+
+----+------------+-----------+------------+-----------+--------------+
                dbench (Throughput)  higher is better
+----+------------+-----------+------------+-----------+--------------+
          base       stdev       patched      stdev     %improvement
+----+------------+-----------+------------+-----------+--------------+
1x   14111.5600    754.4525    3930.1602   2547.2369     -72.14936
2x    2481.6270     71.2665     181.1816     89.5368     -92.69908
3x    1510.2483     31.8634     104.7243     53.2470     -93.06576
4x    1029.4875     16.9166      72.3738     38.2432     -92.96992
+----+------------+-----------+------------+-----------+--------------+

Note: we cannot trust the overcommit results because of soft lockups.



Hi, I tried
(1) TIMEOUT=(2^7)

(2) having a yield hypercall that uses kvm_vcpu_on_spin() to do a directed
yield to other vCPUs.

Now I do not see any soft-lockups in the overcommit cases, and the results
are better (except ebizzy 1x). For dbench I now see it is closer to the
base, and even an improvement in 4x; a sketch of the hypercall follows
below.
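
(For reference, the host side of (2) can be as small as this; the hypercall
number is hypothetical, kvm_vcpu_on_spin() is the existing PLE helper:

static int kvm_pv_yield_op(struct kvm_vcpu *vcpu)
{
	kvm_vcpu_on_spin(vcpu);	/* directed yield to a runnable vCPU */
	return 0;
}

wired up as one more case in kvm_emulate_hypercall().)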


+----+------------+-----------+------------+-----------+--------------+
                ebizzy (records/sec)  higher is better
+----+------------+-----------+------------+-----------+--------------+
          base       stdev       patched      stdev     %improvement
+----+------------+-----------+------------+-----------+--------------+
1x    5574.9000    237.4997     523.7000      1.4181     -90.60611
2x    2741.5000    561.3090     597.8000     34.9755     -78.19442
3x    2146.2500    216.7718     902.6667     82.4228     -57.94215
4x    1663.0000    141.9235    1245.0000     67.2989     -25.13530
+----+------------+-----------+------------+-----------+--------------+
+----+------------+-----------+------------+-----------+--------------+
                dbench (Throughput)  higher is better
+----+------------+-----------+------------+-----------+--------------+
          base       stdev       patched      stdev     %improvement
+----+------------+-----------+------------+-----------+--------------+
1x   14111.5600    754.4525     884.9051     24.4723     -93.72922
2x    2481.6270     71.2665    2383.5700    333.2435      -3.95132
3x    1510.2483     31.8634    1477.7358     50.5126      -2.15279
4x    1029.4875     16.9166    1075.9225     13.9911      +4.51050
+----+------------+-----------+------------+-----------+--------------+

[PATCH] get 2% or more performance improved by reducing spin_lock race

2013-06-07 Thread Qinchuanyu
The wake_up_process() call is enclosed by spin_lock/unlock in
vhost_work_queue, but it could be done outside the spin_lock.
I have tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf;
the numbers are below.
                original             modified
thread_num  tp(Gbps)  vhost(%)  |  tp(Gbps)  vhost(%)
1           9.59      28.82     |  9.59      27.49
8           9.61      32.92     |  9.62      26.77
64          9.58      46.48     |  9.55      38.99
256         9.6       63.7      |  9.6       52.59

Signed-off-by: Chuanyu Qin qinchua...@huawei.com
---
 drivers/vhost/vhost.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
-- 
1.7.3.1.msysgit.0

Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-07 Thread Asias He
On Thu, Jun 6, 2013 at 8:03 PM, Pekka Enberg penb...@kernel.org wrote:
 On Tue, May 28, 2013 at 2:49 PM, Cyrill Gorcunov gorcu...@openvz.org wrote:
 If the cpu vendor string is not filtered on an AMD host machine, we get
 unhandled msr reads:

 | [1709265.368464] kvm: 25706: cpu6 unhandled rdmsr: 0xc0010048
 | [1709265.397161] kvm: 25706: cpu7 unhandled rdmsr: 0xc0010048
 | [1709265.425774] kvm: 25706: cpu8 unhandled rdmsr: 0xc0010048

 Thus, provide our own string and the kernel will use generic cpu init.

 Reported-by: Ingo Molnar mi...@kernel.org
 CC: Pekka Enberg penb...@kernel.org
 CC: Sasha Levin sasha.le...@oracle.com
 CC: Asias He as...@redhat.com
 Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
 ---
  tools/kvm/x86/cpuid.c |8 
  1 file changed, 8 insertions(+)

 Index: linux-2.6.git/tools/kvm/x86/cpuid.c
 ===================================================================
 --- linux-2.6.git.orig/tools/kvm/x86/cpuid.c
 +++ linux-2.6.git/tools/kvm/x86/cpuid.c
 @@ -12,6 +12,7 @@

  static void filter_cpuid(struct kvm_cpuid2 *kvm_cpuid)
  {
 +	unsigned int signature[3];
 	unsigned int i;

 	/*
 @@ -21,6 +22,13 @@ static void filter_cpuid(struct kvm_cpui
 		struct kvm_cpuid_entry2 *entry = &kvm_cpuid->entries[i];

 		switch (entry->function) {
 +		case 0:
 +			/* Vendor name */
 +			memcpy(signature, "LKVMLKVMLKVM", 12);
 +			entry->ebx = signature[0];
 +			entry->ecx = signature[1];
 +			entry->edx = signature[2];
 +			break;
 		case 1:
 			/* Set X86_FEATURE_HYPERVISOR */
 			if (entry->index == 0)

 Ping! Is there someone out there who has an AMD box they could test this on?

I tested it on an AMD box.  Guest boots with this patch, guest does not
boot without it.  I am not seeing the msr warning in either case.




--
Asias


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-07 Thread Peter Maydell
On 6 June 2013 15:59, Anthony Liguori aligu...@us.ibm.com wrote:
 We would still use virtio-pci for existing devices.  Only new devices
 would use virtio-pcie.

Surely you'd want to support both for any new device, because
(a) transport is orthogonal to backend functionality
(b) not all existing machine models have PCI-e?

Or am I misunderstanding?

thanks
-- PMM


Re: [PATCH] get 2% or more performance improved by reducing spin_lock race

2013-06-07 Thread Jason Wang
On 06/07/2013 03:31 PM, Qinchuanyu wrote:
 The wake_up_process() call is enclosed by spin_lock/unlock in
 vhost_work_queue, but it could be done outside the spin_lock.
 I have tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf;
 the numbers are below.
                 original             modified
 thread_num  tp(Gbps)  vhost(%)  |  tp(Gbps)  vhost(%)
 1           9.59      28.82     |  9.59      27.49
 8           9.61      32.92     |  9.62      26.77
 64          9.58      46.48     |  9.55      38.99
 256         9.6       63.7      |  9.6       52.59

 Signed-off-by: Chuanyu Qin qinchua...@huawei.com
 ---
  drivers/vhost/vhost.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)

 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index 94dbd25..8bee109 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
 +		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
 -	}
 -	spin_unlock_irqrestore(&dev->work_lock, flags);
 +	} else
 +		spin_unlock_irqrestore(&dev->work_lock, flags);
  }
  
  void vhost_poll_queue(struct vhost_poll *poll)

Hi Chuanyu:

I believe the patch needs a better name, such as:

vhost: wake up worker outside spinlock


[PATCH v3 5/6] KVM: MMU: add tracepoint for check_mmio_spte

2013-06-07 Thread Xiao Guangrong
It is useful for debugging mmio spte invalidation.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c  |  9 +++--
 arch/x86/kvm/mmutrace.h | 24 
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bdc95bc..1fd2c05 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -281,8 +281,13 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, 
gfn_t gfn,
 
 static bool check_mmio_spte(struct kvm *kvm, u64 spte)
 {
-   return likely(get_mmio_spte_generation(spte) ==
-   kvm_current_mmio_generation(kvm));
+   unsigned int kvm_gen, spte_gen;
+
+   kvm_gen = kvm_current_mmio_generation(kvm);
+   spte_gen = get_mmio_spte_generation(spte);
+
+   trace_check_mmio_spte(spte, kvm_gen, spte_gen);
+   return likely(kvm_gen == spte_gen);
 }
 
 static inline u64 rsvd_bits(int s, int e)
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index ad24757..9d2e0ff 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -298,6 +298,30 @@ TRACE_EVENT(
 	__entry->mmu_valid_gen, __entry->mmu_used_pages
 	)
 );
+
+
+TRACE_EVENT(
+	check_mmio_spte,
+	TP_PROTO(u64 spte, unsigned int kvm_gen, unsigned int spte_gen),
+	TP_ARGS(spte, kvm_gen, spte_gen),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, kvm_gen)
+		__field(unsigned int, spte_gen)
+		__field(u64, spte)
+	),
+
+	TP_fast_assign(
+		__entry->kvm_gen = kvm_gen;
+		__entry->spte_gen = spte_gen;
+		__entry->spte = spte;
+	),
+
+	TP_printk("spte %llx kvm_gen %x spte-gen %x valid %d", __entry->spte,
+		  __entry->kvm_gen, __entry->spte_gen,
+		  __entry->kvm_gen == __entry->spte_gen
+	)
+);
 #endif /* _TRACE_KVMMMU_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/6] KVM: MMU: retain more available bits on mmio spte

2013-06-07 Thread Xiao Guangrong
Let the mmio spte use only bit 62 and bit 63 of the upper 32 bits; then
bit 52 ~ bit 61 can be used for other purposes.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/vmx.c | 4 ++--
 arch/x86/kvm/x86.c | 8 +++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 260a919..78ee123 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4176,10 +4176,10 @@ static void ept_set_mmio_spte_mask(void)
 	/*
 	 * EPT Misconfigurations can be generated if the value of bits 2:0
 	 * of an EPT paging-structure entry is 110b (write/execute).
-	 * Also, magic bits (0xffull << 49) is set to quickly identify mmio
+	 * Also, magic bits (0x3ull << 62) is set to quickly identify mmio
 	 * spte.
 	 */
-	kvm_mmu_set_mmio_spte_mask(0xffull << 49 | 0x6ull);
+	kvm_mmu_set_mmio_spte_mask((0x3ull << 62) | 0x6ull);
 }
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6402951..54059ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5263,7 +5263,13 @@ static void kvm_set_mmio_spte_mask(void)
 	 * Set the reserved bits and the present bit of an paging-structure
 	 * entry to generate page fault with PFER.RSV = 1.
 	 */
-	mask = ((1ull << (62 - maxphyaddr + 1)) - 1) << maxphyaddr;
+	/* Mask the reserved physical address bits. */
+	mask = ((1ull << (51 - maxphyaddr + 1)) - 1) << maxphyaddr;
+
+	/* Bit 62 is always reserved for 32bit host. */
+	mask |= 0x3ull << 62;
+
+	/* Set the present bit. */
 	mask |= 1ull;
 
 #ifdef CONFIG_X86_64
-- 
1.8.1.4



[PATCH v3 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value

2013-06-07 Thread Xiao Guangrong
This gives us a chance to exercise the mmio generation number wrap-around
handling: the generation now starts close to MMIO_MAX_GEN, so it wraps after
only a few memslot updates.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1fd2c05..7d50a2d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -235,7 +235,12 @@ static unsigned int get_mmio_spte_generation(u64 spte)
 
 static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
 {
-	return kvm_memslots(kvm)->generation & MMIO_GEN_MASK;
+	/*
+	 * Init kvm generation close to MMIO_MAX_GEN to easily test the
+	 * code of handling generation number wrap-around.
+	 */
+	return (kvm_memslots(kvm)->generation +
+	      MMIO_MAX_GEN - 13) & MMIO_GEN_MASK;
 }
 
 static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
-- 
1.8.1.4



[PATCH v3 4/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-07 Thread Xiao Guangrong
This patch introduces a very simple and scalable way to invalidate all mmio
sptes: it need not walk any shadow pages or hold mmu-lock.

KVM maintains a global mmio valid generation-number which is stored in
kvm->memslots.generation, and every mmio spte stores the current global
generation-number into its available bits when it is created.

When KVM needs to zap all mmio sptes, it simply increases the global
generation-number. When the guest does an mmio access, KVM intercepts an
MMIO #PF, walks the shadow page table, and gets the mmio spte. If the
generation-number on the spte does not equal the global generation-number,
it goes to the normal #PF handler to update the mmio spte.

Since 19 bits are used to store the generation-number in an mmio spte, we
zap all mmio sptes when the number wraps around.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.c  | 54 +++--
 arch/x86/kvm/mmu.h  |  5 +++-
 arch/x86/kvm/paging_tmpl.h  |  7 --
 arch/x86/kvm/vmx.c  |  4 +++
 arch/x86/kvm/x86.c  |  3 +--
 6 files changed, 61 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1f98c1b..90d05ed 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -773,7 +773,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 struct kvm_memory_slot *slot,
 gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
-void kvm_mmu_zap_mmio_sptes(struct kvm *kvm);
+void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 044d8c0..bdc95bc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -205,9 +205,11 @@ EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 #define MMIO_SPTE_GEN_LOW_SHIFT3
 #define MMIO_SPTE_GEN_HIGH_SHIFT   52
 
+#define MMIO_GEN_SHIFT 19
 #define MMIO_GEN_LOW_SHIFT 9
 #define MMIO_GEN_LOW_MASK		((1 << MMIO_GEN_LOW_SHIFT) - 1)
-#define MMIO_MAX_GEN			((1 << 19) - 1)
+#define MMIO_GEN_MASK			((1 << MMIO_GEN_SHIFT) - 1)
+#define MMIO_MAX_GEN			((1 << MMIO_GEN_SHIFT) - 1)
 
 static u64 generation_mmio_spte_mask(unsigned int gen)
 {
@@ -231,17 +233,23 @@ static unsigned int get_mmio_spte_generation(u64 spte)
return gen;
 }
 
+static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
+{
+	return kvm_memslots(kvm)->generation & MMIO_GEN_MASK;
+}
+
 static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
   unsigned access)
 {
 	struct kvm_mmu_page *sp = page_header(__pa(sptep));
-	u64 mask = generation_mmio_spte_mask(0);
+	unsigned int gen = kvm_current_mmio_generation(kvm);
+	u64 mask = generation_mmio_spte_mask(gen);
 
 	access &= ACC_WRITE_MASK | ACC_USER_MASK;
 	mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
 	sp->mmio_cached = true;
 
-	trace_mark_mmio_spte(sptep, gfn, access, 0);
+	trace_mark_mmio_spte(sptep, gfn, access, gen);
 	mmu_spte_set(sptep, mask);
 }
 
@@ -271,6 +279,12 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, 
gfn_t gfn,
return false;
 }
 
+static bool check_mmio_spte(struct kvm *kvm, u64 spte)
+{
+   return likely(get_mmio_spte_generation(spte) ==
+   kvm_current_mmio_generation(kvm));
+}
+
 static inline u64 rsvd_bits(int s, int e)
 {
 	return ((1ULL << (e - s + 1)) - 1) << s;
@@ -3235,6 +3249,9 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, 
u64 addr, bool direct)
gfn_t gfn = get_mmio_spte_gfn(spte);
unsigned access = get_mmio_spte_access(spte);
 
+	if (!check_mmio_spte(vcpu->kvm, spte))
+   return RET_MMIO_PF_INVALID;
+
if (direct)
addr = 0;
 
@@ -3276,8 +3293,12 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, 
gva_t gva,
 
 	pgprintk("%s: gva %lx error %x\n", __func__, gva, error_code);
 
-	if (unlikely(error_code & PFERR_RSVD_MASK))
-		return handle_mmio_page_fault(vcpu, gva, error_code, true);
+	if (unlikely(error_code & PFERR_RSVD_MASK)) {
+		r = handle_mmio_page_fault(vcpu, gva, error_code, true);
+
+		if (likely(r != RET_MMIO_PF_INVALID))
+			return r;
+	}
 
r = mmu_topup_memory_caches(vcpu);
if (r)
@@ -3353,8 +3374,12 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t 
gpa, u32 error_code,
ASSERT(vcpu);
 	ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
 
-   

[PATCH v3 2/6] KVM: MMU: store generation-number into mmio spte

2013-06-07 Thread Xiao Guangrong
Store the generation-number into bit 3 ~ bit 11 and bit 52 ~ bit 61; in
total 19 bits can be used, which should be enough for nearly all common
cases.

In this patch the generation-number is always 0; it will be changed in a
later patch.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c | 58 ++
 arch/x86/kvm/mmutrace.h| 10 
 arch/x86/kvm/paging_tmpl.h |  3 ++-
 3 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6941fa7..eca91bd 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -197,15 +197,52 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 
-static void mark_mmio_spte(u64 *sptep, u64 gfn, unsigned access)
+/*
+ * spte bits of bit 3 ~ bit 11 are used as low 9 bits of generation number,
+ * the bits of bits 52 ~ bit 61 are used as high 10 bits of generation
+ * number.
+ */
+#define MMIO_SPTE_GEN_LOW_SHIFT3
+#define MMIO_SPTE_GEN_HIGH_SHIFT   52
+
+#define MMIO_GEN_LOW_SHIFT 9
+#define MMIO_GEN_LOW_MASK	((1 << MMIO_GEN_LOW_SHIFT) - 1)
+#define MMIO_MAX_GEN		((1 << 19) - 1)
+
+static u64 generation_mmio_spte_mask(unsigned int gen)
+{
+   u64 mask;
+
+	WARN_ON(gen > MMIO_MAX_GEN);
+
+	mask = (gen & MMIO_GEN_LOW_MASK) << MMIO_SPTE_GEN_LOW_SHIFT;
+	mask |= ((u64)gen >> MMIO_GEN_LOW_SHIFT) << MMIO_SPTE_GEN_HIGH_SHIFT;
+	return mask;
+}
+
+static unsigned int get_mmio_spte_generation(u64 spte)
+{
+   unsigned int gen;
+
+	spte &= ~shadow_mmio_mask;
+
+	gen = (spte >> MMIO_SPTE_GEN_LOW_SHIFT) & MMIO_GEN_LOW_MASK;
+	gen |= (spte >> MMIO_SPTE_GEN_HIGH_SHIFT) << MMIO_GEN_LOW_SHIFT;
+	return gen;
+}
+
+static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
+  unsigned access)
 {
struct kvm_mmu_page *sp =  page_header(__pa(sptep));
+   u64 mask = generation_mmio_spte_mask(0);
 
 	access &= ACC_WRITE_MASK | ACC_USER_MASK;
-
+	mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
 	sp->mmio_cached = true;
-	trace_mark_mmio_spte(sptep, gfn, access);
-	mmu_spte_set(sptep, shadow_mmio_mask | access | gfn << PAGE_SHIFT);
+
+	trace_mark_mmio_spte(sptep, gfn, access, 0);
+	mmu_spte_set(sptep, mask);
 }
 
 static bool is_mmio_spte(u64 spte)
@@ -223,10 +260,11 @@ static unsigned get_mmio_spte_access(u64 spte)
 	return (spte & ~shadow_mmio_mask) & ~PAGE_MASK;
 }
 
-static bool set_mmio_spte(u64 *sptep, gfn_t gfn, pfn_t pfn, unsigned access)
+static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
+ pfn_t pfn, unsigned access)
 {
if (unlikely(is_noslot_pfn(pfn))) {
-   mark_mmio_spte(sptep, gfn, access);
+   mark_mmio_spte(kvm, sptep, gfn, access);
return true;
}
 
@@ -2364,7 +2402,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
u64 spte;
int ret = 0;
 
-   if (set_mmio_spte(sptep, gfn, pfn, pte_access))
+	if (set_mmio_spte(vcpu->kvm, sptep, gfn, pfn, pte_access))
return 0;
 
spte = PT_PRESENT_MASK;
@@ -3427,8 +3465,8 @@ static inline void protect_clean_gpte(unsigned *access, 
unsigned gpte)
 	*access &= mask;
 }
 
-static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
-  int *nr_present)
+static bool sync_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
+  unsigned access, int *nr_present)
 {
if (unlikely(is_mmio_spte(*sptep))) {
if (gfn != get_mmio_spte_gfn(*sptep)) {
@@ -3437,7 +3475,7 @@ static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, 
unsigned access,
}
 
(*nr_present)++;
-   mark_mmio_spte(sptep, gfn, access);
+   mark_mmio_spte(kvm, sptep, gfn, access);
return true;
}
 
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index eb444dd..ad24757 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -199,23 +199,25 @@ DEFINE_EVENT(kvm_mmu_page_class, kvm_mmu_prepare_zap_page,
 
 TRACE_EVENT(
mark_mmio_spte,
-   TP_PROTO(u64 *sptep, gfn_t gfn, unsigned access),
-   TP_ARGS(sptep, gfn, access),
+   TP_PROTO(u64 *sptep, gfn_t gfn, unsigned access, unsigned int gen),
+   TP_ARGS(sptep, gfn, access, gen),
 
TP_STRUCT__entry(
__field(void *, sptep)
__field(gfn_t, gfn)
__field(unsigned, access)
+   __field(unsigned int, gen)
),
 
TP_fast_assign(
 		__entry->sptep = sptep;
 		__entry->gfn = gfn;
 		__entry->access = access;
+		__entry->gen = gen;
),
 
-	TP_printk("sptep:%p gfn %llx access %x", __entry->sptep, __entry->gfn,
-   

[PATCH v3 3/6] KVM: MMU: make return value of mmio page fault handler more readable

2013-06-07 Thread Xiao Guangrong
Define some meaningful names instead of raw code

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c | 15 +--
 arch/x86/kvm/mmu.h | 14 ++
 arch/x86/kvm/vmx.c |  4 ++--
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index eca91bd..044d8c0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3222,17 +3222,12 @@ static u64 walk_shadow_page_get_mmio_spte(struct 
kvm_vcpu *vcpu, u64 addr)
return spte;
 }
 
-/*
- * If it is a real mmio page fault, return 1 and emulat the instruction
- * directly, return 0 to let CPU fault again on the address, -1 is
- * returned if bug is detected.
- */
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 {
u64 spte;
 
if (quickly_check_mmio_pf(vcpu, addr, direct))
-   return 1;
+   return RET_MMIO_PF_EMULATE;
 
spte = walk_shadow_page_get_mmio_spte(vcpu, addr);
 
@@ -3245,7 +3240,7 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, 
u64 addr, bool direct)
 
trace_handle_mmio_page_fault(addr, gfn, access);
vcpu_cache_mmio_info(vcpu, addr, gfn, access);
-   return 1;
+   return RET_MMIO_PF_EMULATE;
}
 
/*
@@ -3253,13 +3248,13 @@ int handle_mmio_page_fault_common(struct kvm_vcpu 
*vcpu, u64 addr, bool direct)
 * it's a BUG if the gfn is not a mmio page.
 */
 	if (direct && !check_direct_spte_mmio_pf(spte))
-   return -1;
+   return RET_MMIO_PF_BUG;
 
/*
 * If the page table is zapped by other cpus, let CPU fault again on
 * the address.
 */
-   return 0;
+   return RET_MMIO_PF_RETRY;
 }
 EXPORT_SYMBOL_GPL(handle_mmio_page_fault_common);
 
@@ -3269,7 +3264,7 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, 
u64 addr,
int ret;
 
ret = handle_mmio_page_fault_common(vcpu, addr, direct);
-	WARN_ON(ret < 0);
+   WARN_ON(ret == RET_MMIO_PF_BUG);
return ret;
 }
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 922bfae..ba6a19c 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -52,6 +52,20 @@
 
 int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
+
+/*
+ * Return values of handle_mmio_page_fault_common:
+ * RET_MMIO_PF_EMULATE: it is a real mmio page fault, emulate the instruction
+ *  directly.
+ * RET_MMIO_PF_RETRY: let CPU fault again on the address.
+ * RET_MMIO_PF_BUG: bug is detected.
+ */
+enum {
+   RET_MMIO_PF_EMULATE = 1,
+   RET_MMIO_PF_RETRY = 0,
+   RET_MMIO_PF_BUG = -1
+};
+
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool 
direct);
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 78ee123..85c8d51 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5366,10 +5366,10 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 
ret = handle_mmio_page_fault_common(vcpu, gpa, true);
-   if (likely(ret == 1))
+   if (likely(ret == RET_MMIO_PF_EMULATE))
return x86_emulate_instruction(vcpu, gpa, 0, NULL, 0) ==
  EMULATE_DONE;
-   if (unlikely(!ret))
+   if (unlikely(ret == RET_MMIO_PF_RETRY))
return 1;
 
/* It is the real ept misconfig */
-- 
1.8.1.4



[PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-07 Thread Xiao Guangrong
Changelog:
V3:
  All of these changes are from Gleb's review:
  1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
  2) smartly adjust the kvm generation number in kvm_current_mmio_generation()
     to avoid kvm_memslots->generation overflow.

V2:
  - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
  - use kvm-memslots-generation as kvm global generation-number
  - fix comment and codestyle
  - init kvm generation close to mmio wrap-around value
  - keep kvm_mmu_zap_mmio_sptes

The current way is to hold the hot mmu-lock and walk all shadow pages; this
does not scale. This patchset introduces a very simple and scalable way to
fast invalidate all mmio sptes: it need not walk any shadow pages or hold
any locks.

The idea is simple:
KVM maintains a global mmio valid generation-number which is stored in
kvm->memslots.generation, and every mmio spte stores the current global
generation-number into its available bits when it is created.

When KVM needs to zap all mmio sptes, it simply increases the global
generation-number. When the guest does an mmio access, KVM intercepts an
MMIO #PF, walks the shadow page table, and gets the mmio spte. If the
generation-number on the spte does not equal the global generation-number,
it goes to the normal #PF handler to update the mmio spte.

Since 19 bits are used to store the generation-number in an mmio spte, we
zap all mmio sptes when the number wraps around; a standalone sketch of the
packing follows below.
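
For illustration, here is a standalone sketch of the 19-bit packing this
series uses (it mirrors generation_mmio_spte_mask()/get_mmio_spte_generation()
from patch 2; only the explicit high-bits mask is an addition here, since the
kernel code clears shadow_mmio_mask first):

#include <stdio.h>
#include <stdint.h>

#define GEN_LOW_SHIFT	3	/* low 9 bits live at spte bits 3..11 */
#define GEN_HIGH_SHIFT	52	/* high 10 bits live at spte bits 52..61 */
#define GEN_LOW_BITS	9
#define GEN_LOW_MASK	((1u << GEN_LOW_BITS) - 1)
#define GEN_MAX		((1u << 19) - 1)

static uint64_t pack_gen(unsigned int gen)
{
	uint64_t mask;

	mask  = (uint64_t)(gen & GEN_LOW_MASK) << GEN_LOW_SHIFT;
	mask |= ((uint64_t)gen >> GEN_LOW_BITS) << GEN_HIGH_SHIFT;
	return mask;
}

static unsigned int unpack_gen(uint64_t spte)
{
	unsigned int gen;

	gen  = (spte >> GEN_LOW_SHIFT) & GEN_LOW_MASK;
	gen |= ((spte >> GEN_HIGH_SHIFT) & ((1u << 10) - 1)) << GEN_LOW_BITS;
	return gen;
}

int main(void)
{
	unsigned int g = 0x6789a & GEN_MAX;	/* arbitrary 19-bit value */

	/* Round-trip: both printed generations should be identical. */
	printf("gen %#x -> spte bits %#llx -> gen %#x\n",
	       g, (unsigned long long)pack_gen(g), unpack_gen(g));
	return 0;
}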

Xiao Guangrong (6):
  KVM: MMU: retain more available bits on mmio spte
  KVM: MMU: store generation-number into mmio spte
  KVM: MMU: make return value of mmio page fault handler more readable
  KVM: MMU: fast invalidate all mmio sptes
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: init kvm generation close to mmio wrap-around value

 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/mmu.c  | 131 
 arch/x86/kvm/mmu.h  |  17 ++
 arch/x86/kvm/mmutrace.h |  34 +--
 arch/x86/kvm/paging_tmpl.h  |  10 ++-
 arch/x86/kvm/vmx.c  |  12 ++--
 arch/x86/kvm/x86.c  |  11 +++-
 7 files changed, 177 insertions(+), 40 deletions(-)

-- 
1.8.1.4



[PATCH RFC 0/2] KVM: s390: virtio-ccw adapter interrupts.

2013-06-07 Thread Cornelia Huck
Hi,

here's the guest driver support for adapter interrupts in virtio-ccw.

We use one summary indicator per page of indicators. For each device,
we try to find a space in an indicator where all of its virtqueues fit.

Locking probably needs some more love, but it seems to work fine so far.
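
A sketch of that search, for illustration only (the driver patch keeps a
per-64-bit-word 'used' bitmap in struct airq_vq; this assumes a device has
fewer than BITS_PER_LONG virtqueues):

static int find_vq_slot(unsigned long used, int nr_vqs)
{
	int shift;

	for (shift = 0; shift <= BITS_PER_LONG - nr_vqs; shift++) {
		unsigned long want = ((1UL << nr_vqs) - 1) << shift;

		if (!(used & want))
			return shift;	/* nr_vqs contiguous free bits */
	}
	return -1;	/* word full; try the next indicator word */
}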

Cornelia Huck (2):
  KVM: s390: virtio-ccw: Handle command rejects.
  KVM: s390: virtio-ccw adapter interrupt support.

 arch/s390/include/asm/irq.h   |   1 +
 arch/s390/kernel/irq.c|   1 +
 drivers/s390/kvm/virtio_ccw.c | 307 --
 3 files changed, 298 insertions(+), 11 deletions(-)

-- 
1.8.1.6



[PATCH RFC] s390/virtio-ccw: Adapter interrupt support.

2013-06-07 Thread Cornelia Huck
Handle the new CCW_CMD_SET_IND_ADAPTER command, enabling adapter interrupts
on guest request. When active, host->guest notifications will be handled
via global_indicator -> queue indicators instead of queue indicators +
subchannel I/O interrupt. Indicators for virtqueues may be present at an
offset; a sketch of the resulting notification path follows below.
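
The notification path can be pictured like this (illustrative sketch only,
not the patch code; the indicator read-modify-write is simplified and not
atomic here):

static void virtio_ccw_adapter_notify(VirtioCcwDevice *dev, uint16_t vq_idx)
{
    /* Set the virtqueue's bit in the guest's 64-bit indicator word. */
    uint64_t ind = ldq_phys(dev->indicators);

    ind |= 1ULL << (vq_idx + dev->ind_shift);
    stq_phys(dev->indicators, ind);
    /* Set the summary indicator, then raise one adapter interrupt. */
    stb_phys(dev->summary_indicator, 0x01);
    css_adapter_interrupt(dev->thinint_isc);
}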

Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/css.c|   10 
 hw/s390x/css.h|2 ++
 hw/s390x/virtio-ccw.c |   66 -
 hw/s390x/virtio-ccw.h |4 +++
 target-s390x/ioinst.h |2 ++
 target-s390x/kvm.c|8 --
 trace-events  |1 +
 7 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index f82abfe..323c232 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -115,6 +115,15 @@ void css_conditional_io_interrupt(SubchDev *sch)
 }
 }
 
+void css_adapter_interrupt(uint8_t isc)
+{
+S390CPU *cpu = s390_cpu_addr2state(0);
+uint32_t io_int_word = (isc << 27) | IO_INT_WORD_AI;
+
+trace_css_adapter_interrupt(isc);
+s390_io_interrupt(cpu, 0, 0, 0, io_int_word);
+}
+
 static void sch_handle_clear_func(SubchDev *sch)
 {
 PMCW *p = &sch->curr_status.pmcw;
@@ -1256,6 +1265,7 @@ void css_reset_sch(SubchDev *sch)
 sch->channel_prog = 0x0;
 sch->last_cmd_valid = false;
 sch->orb = NULL;
+sch->thinint_active = false;
 }
 
 void css_reset(void)
diff --git a/hw/s390x/css.h b/hw/s390x/css.h
index 85ed05d..ab5d4c4 100644
--- a/hw/s390x/css.h
+++ b/hw/s390x/css.h
@@ -77,6 +77,7 @@ struct SubchDev {
 CCW1 last_cmd;
 bool last_cmd_valid;
 ORB *orb;
+bool thinint_active;
 /* transport-provided data: */
 int (*ccw_cb) (SubchDev *, CCW1);
 SenseId id;
@@ -96,4 +97,5 @@ void css_queue_crw(uint8_t rsc, uint8_t erc, int chain, 
uint16_t rsid);
 void css_generate_sch_crws(uint8_t cssid, uint8_t ssid, uint16_t schid,
int hotplugged, int add);
 void css_generate_chp_crws(uint8_t cssid, uint8_t chpid);
+void css_adapter_interrupt(uint8_t isc);
 #endif
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 44f5772..ccebd11 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -101,6 +101,13 @@ typedef struct VirtioFeatDesc {
 uint8_t index;
 } QEMU_PACKED VirtioFeatDesc;
 
+typedef struct VirtioThinintInfo {
+hwaddr summary_indicator;
+hwaddr device_indicator;
+uint16_t ind_shift;
+uint8_t isc;
+} QEMU_PACKED VirtioThinintInfo;
+
 /* Specify where the virtqueues for the subchannel are in guest memory. */
 static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, uint32_t align,
   uint16_t index, uint16_t num)
@@ -149,6 +156,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 bool check_len;
 int len;
 hwaddr hw_len;
+VirtioThinintInfo *thinint;
 
 if (!dev) {
 return -EINVAL;
@@ -328,6 +336,11 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 ret = -EINVAL;
 break;
 }
+if (sch->thinint_active) {
+/* Trigger a command reject. */
+ret = -ENOSYS;
+break;
+}
 if (!ccw.cda) {
 ret = -EFAULT;
 } else {
@@ -379,6 +392,42 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 ret = 0;
 }
 break;
+case CCW_CMD_SET_IND_ADAPTER:
+if (check_len) {
+if (ccw.count != sizeof(*thinint)) {
+ret = -EINVAL;
+break;
+}
+} else if (ccw.count < sizeof(*thinint)) {
+/* Can't execute command. */
+ret = -EINVAL;
+break;
+}
+len = sizeof(*thinint);
+hw_len = len;
+if (!ccw.cda) {
+ret = -EFAULT;
+} else if (dev->indicators && !sch->thinint_active) {
+/* Trigger a command reject. */
+ret = -ENOSYS;
+} else {
+thinint = cpu_physical_memory_map(ccw.cda, &hw_len, 0);
+if (!thinint) {
+ret = -EFAULT;
+} else {
+len = hw_len;
+dev->summary_indicator = thinint->summary_indicator;
+dev->indicators = thinint->device_indicator;
+dev->thinint_isc = thinint->isc;
+dev->ind_shift = thinint->ind_shift;
+cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
+sch->thinint_active = ((dev->indicators != 0) &&
+                       (dev->summary_indicator != 0));
+sch->curr_status.scsw.count = ccw.count - len;
+ret = 0;
+}
+}
+break;
 default:
 ret = -ENOSYS;
 break;
@@ -411,6 +460,7 @@ static int virtio_ccw_device_init(VirtioCcwDevice *dev, 
VirtIODevice *vdev)
 sch->channel_prog = 0x0;
 sch->last_cmd_valid = false;
 sch->orb = NULL;
+sch->thinint_active = false;
 

[PATCH RFC] virtio-ccw: Document adapter interrupts.

2013-06-07 Thread Cornelia Huck
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 virtio-spec.lyx |  147 +--
 1 file changed, 144 insertions(+), 3 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 6e188d0..697351e 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -10701,11 +10701,18 @@ status open
 
 \begin_layout LyX-Code
 
-\change_inserted -385801441 1343732726
+\change_inserted -385801441 1369814105
 
 #define CCW_CMD_READ_VQ_CONF 0x32
 \end_layout
 
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369814140
+
+#define CCW_CMD_SET_IND_ADAPTER 0x63
+\end_layout
+
 \end_inset
 
 
@@ -11045,11 +11052,136 @@ To communicate the location of the indicator bits 
for host-guest notification,
 
 \begin_layout Standard
 
-\change_inserted -385801441 1347015749
+\change_inserted -385801441 1369814376
 For the indicator bits used in the configuration change host-guest 
notification
 , the CCW_CMD_SET_CONF_IND command is used analogously.
 \end_layout
 
+\begin_layout Subsubsection*
+
+\change_inserted -385801441 1369814399
+Setting Up Indicators For Adapter Interrupts
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1369815013
+If the guest wishes to use adapter interrupts for host-guest notification,
+ it may use the CCW_CMD_SET_IND_ADAPTER command instead of CCW_CMD_SET_IND.
+ Note that usage of those two mechanisms is mutually exclusive.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1369815065
+CCW_CMD_SET_IND_ADAPTER uses the following communication block:
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+\begin_inset listings
+inline false
+status open
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+struct thinint_area {
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+unsigned long summary_indicator;
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+unsigned long indicator;
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+u16 shift;
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+u8 isc;
+\end_layout
+
+\begin_layout LyX-Code
+
+\change_inserted -385801441 1369815367
+
+} __packed;
+\change_unchanged
+
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1370345028
+
+\family typewriter
+summary_indicator
+\family default
+ contains the guest address of a byte value to be used as a summary indicator
+ which is set to != 0 every time the host wants to signal the guest for
+ any of the indicators and unset by the guest to signify that it received
+ the notification.
+ 
+\family typewriter
+isc
+\family default
+ is the interruption subclass to be used for the adapter interrupt.
+ Note that an isc/summary indicator pair must match for any subsequent requests
+ to set up adapter interrupts.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1369816401
+
+\family typewriter
+indicator
+\family default
+ contains the guest address of the 64 bit indicators to be used; 
+\family typewriter
+shift
+\family default
+ contains the offset of the queue indicators for the device in this value.
+ All queue indicators for a device must fit into the same 64 bit value.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1369814707
+Hosts not supporting adapter interrupts for virtio-ccw may fail this command
+ with a command reject.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted -385801441 1369814766
+Configuration change host-guest notification is always set up using
+CCW_CMD_SET_CONF_IND.
+\end_layout
+
 \begin_layout Subsection*
 
 \change_inserted -385801441 1343732726
@@ -11064,7 +11196,7 @@ Host-Guest Notification
 
 \begin_layout Standard
 
-\change_inserted -385801441 1347015762
+\change_inserted -385801441 1369814838
 For notifying the guest of virtqueue buffers, the host sets the corresponding
  bit in the guest-provided indicators.
  If an interrupt is not already pending for the subchannel, the host generates
@@ -11073,6 +11205,15 @@ For notifying the guest of virtqueue buffers, the host 
sets the corresponding
 
 \begin_layout Standard
 
+\change_inserted -385801441 1369815397
+Alternatively, if the guest enabled adapter interrupts for a device,
+notification happens via setting the bit in the guest-provided indicators, setting
+ the summary indicator and generating an adapter interrupt for the registered
+ interruption subclass.
+\end_layout
+
+\begin_layout Standard
+
 \change_inserted -385801441 1347015847
 If the host wants to notify the guest about configuration changes, it sets
  bit 0 in the configuration indicators and generates an unsolicited I/O
-- 
1.7.9.5


[PATCH RFC] qemu: Adapter interrupts for virtio-ccw.

2013-06-07 Thread Cornelia Huck
Hi,

here's the qemu patch that implements the new adapter indicators ccw
in virtio-ccw and injects adapter interrupts for the devices enabled
for it.

Cornelia Huck (1):
  s390/virtio-ccw: Adapter interrupt support.

 hw/s390x/css.c|   10 
 hw/s390x/css.h|2 ++
 hw/s390x/virtio-ccw.c |   66 -
 hw/s390x/virtio-ccw.h |4 +++
 target-s390x/ioinst.h |2 ++
 target-s390x/kvm.c|8 --
 trace-events  |1 +
 7 files changed, 90 insertions(+), 3 deletions(-)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC 2/2] KVM: s390: virtio-ccw adapter interrupt support.

2013-06-07 Thread Cornelia Huck
Implement the new CCW_CMD_SET_IND_ADAPTER command and try to enable
adapter interrupts for every device on the first startup. If the host
does not support adapter interrupts, fall back to normal I/O interrupts
(a sketch of that fallback follows below).

virtio-ccw adapter interrupts use the same isc as normal I/O subchannels
and share a summary indicator for all devices sharing the same indicator
area.

Indicator bits for the individual virtqueues may be contained in the same
indicator area for different devices.
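
A sketch of that fallback (helper names are hypothetical; the real paths
are in this patch):

static int virtio_ccw_setup_indicators(struct virtio_ccw_device *vcdev,
				       struct ccw1 *ccw)
{
	int ret = -EOPNOTSUPP;

	if (virtio_ccw_use_airq)
		/* Host rejects CCW_CMD_SET_IND_ADAPTER -> -EOPNOTSUPP (patch 1/2). */
		ret = virtio_ccw_register_adapter_ind(vcdev, ccw);
	if (ret == -EOPNOTSUPP) {
		virtio_ccw_use_airq = 0;	/* remember for later devices */
		ret = virtio_ccw_register_classic_ind(vcdev, ccw);
	}
	return ret;
}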

Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 arch/s390/include/asm/irq.h   |   1 +
 arch/s390/kernel/irq.c|   1 +
 drivers/s390/kvm/virtio_ccw.c | 296 --
 3 files changed, 289 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/irq.h b/arch/s390/include/asm/irq.h
index 87c17bf..ba75d32 100644
--- a/arch/s390/include/asm/irq.h
+++ b/arch/s390/include/asm/irq.h
@@ -42,6 +42,7 @@ enum interruption_class {
IRQIO_PCI,
IRQIO_MSI,
IRQIO_VIR,
+   IRQIO_VAI,
NMI_NMI,
CPU_RST,
NR_ARCH_IRQS
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index f7fb589..39237cc 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -82,6 +82,7 @@ static const struct irq_class irqclass_sub_desc[NR_ARCH_IRQS] 
= {
[IRQIO_PCI]  = {.name = PCI, .desc = [I/O] PCI Interrupt },
[IRQIO_MSI]  = {.name = MSI, .desc = [I/O] MSI Interrupt },
[IRQIO_VIR]  = {.name = VIR, .desc = [I/O] Virtual I/O Devices},
+   [IRQIO_VAI]  = {.name = VAI, .desc = [I/O] Virtual I/O Devices AI},
[NMI_NMI]= {.name = NMI, .desc = [NMI] Machine Check},
[CPU_RST]= {.name = RST, .desc = [CPU] CPU Restart},
 };
diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index d6c7aba..be15b6b 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -32,6 +32,8 @@
 #include asm/cio.h
 #include asm/ccwdev.h
 #include asm/virtio-ccw.h
+#include asm/isc.h
+#include asm/airq.h
 
 /*
  * virtio related functions
@@ -58,6 +60,8 @@ struct virtio_ccw_device {
unsigned long indicators;
unsigned long indicators2;
struct vq_config_block *config_block;
+   bool is_thinint;
+   void *airq_info;
 };
 
 struct vq_info_block {
@@ -72,15 +76,42 @@ struct virtio_feature_desc {
__u8 index;
 } __packed;
 
+struct virtio_thinint_area {
+   unsigned long summary_indicator;
+   unsigned long indicator;
+   u16 shift;
+   u8 isc;
+} __packed;
+
 struct virtio_ccw_vq_info {
struct virtqueue *vq;
int num;
void *queue;
struct vq_info_block *info_block;
+   int bit_nr;
struct list_head node;
long cookie;
 };
 
+#define VIRTIO_AIRQ_ISC IO_SCH_ISC /* inherit from subchannel */
+
+#define VIRTIO_DEV_CHUNK (PAGE_SIZE/sizeof(unsigned long))
+#define MAX_AIRQ_AREAS 8
+
+static int virtio_ccw_use_airq = 1;
+
+struct airq_vq {
+   unsigned long used;
+   void *map[BITS_PER_LONG];
+};
+struct airq_info {
+   rwlock_t lock;
+   u8 *summary_indicator;
+   unsigned long indicators[VIRTIO_DEV_CHUNK];
+   struct airq_vq airq_vqs[VIRTIO_DEV_CHUNK];
+};
+static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+
 #define CCW_CMD_SET_VQ 0x13
 #define CCW_CMD_VDEV_RESET 0x33
 #define CCW_CMD_SET_IND 0x43
@@ -91,6 +122,7 @@ struct virtio_ccw_vq_info {
 #define CCW_CMD_WRITE_CONF 0x21
 #define CCW_CMD_WRITE_STATUS 0x31
 #define CCW_CMD_READ_VQ_CONF 0x32
+#define CCW_CMD_SET_IND_ADAPTER 0x63
 
 #define VIRTIO_CCW_DOING_SET_VQ 0x0001
 #define VIRTIO_CCW_DOING_RESET 0x0004
@@ -102,6 +134,7 @@ struct virtio_ccw_vq_info {
 #define VIRTIO_CCW_DOING_SET_IND 0x0100
 #define VIRTIO_CCW_DOING_READ_VQ_CONF 0x0200
 #define VIRTIO_CCW_DOING_SET_CONF_IND 0x0400
+#define VIRTIO_CCW_DOING_SET_IND_ADAPTER 0x0800
 #define VIRTIO_CCW_INTPARM_MASK 0x
 
 static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
@@ -109,6 +142,141 @@ static struct virtio_ccw_device *to_vc_device(struct 
virtio_device *vdev)
return container_of(vdev, struct virtio_ccw_device, vdev);
 }
 
+static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
+{
+   int i, j;
+   struct airq_vq *p;
+   unsigned long flags;
+
+	write_lock_irqsave(&info->lock, flags);
+	for (i = 0; i < VIRTIO_DEV_CHUNK; i++) {
+		p = &info->airq_vqs[i];
+		for_each_set_bit(j, &p->used,
+				 sizeof(p->used) * BITS_PER_BYTE)
+			if (p->map[j] == vq) {
+				p->map[j] = NULL;
+				clear_bit(j, &p->used);
+				break;
+			}
+	}
+	write_unlock_irqrestore(&info->lock, flags);
+}
+
+static void virtio_airq_handler(void *indicator, void *data)
+{
+   int i, bit;
+   unsigned long 

[PATCH RFC] Adapter interrupts for virtio-ccw.

2013-06-07 Thread Cornelia Huck
Hi,

here's a proposal to support adapter (aka thin) interrupts for virtio-ccw.

The basic idea is to make host-guest signalling on s390 more lightweight.
Normal I/O interrupts have two parts: an interrupt that is made pending on
any of the guest cpus, and status that is made pending on the subchannel.
This means that we need two exits for every interrupt.

With adapter interrupts, only the I/O interrupt remains - no status is made
pending for the subchannel. To find out which virtqueue the signal was for,
we rely on indicators.

To set this up, the guest uses a new ccw (which is used instead of the
normal ccw to set up indicators). The payload contains pointers to two
indicators (first level and second level) and the offset at which the
virtqueue indicators start in the second level indicators, as well as the
interruption subclass (which will usually be the same as the isc for the
device). This is partially inspired by what qdio does today.

I have seen some nice speedup on simple dd with my current implementation.
Adapter interrupts are also a prereq for implementing irqfd on s390, since
they eliminate the need for manipulating subchannel status.

Cornelia Huck (1):
  virtio-ccw: Document adapter interrupts.

 virtio-spec.lyx |  147 +--
 1 file changed, 144 insertions(+), 3 deletions(-)

-- 
1.7.9.5



[PATCH RFC 1/2] KVM: s390: virtio-ccw: Handle command rejects.

2013-06-07 Thread Cornelia Huck
A command reject for a ccw may happen if we run on a host not supporting
a certain feature. We want to be able to handle this as special case of
command failure, so let's split this off from the generic -EIO error code.

Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 drivers/s390/kvm/virtio_ccw.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index 779dc51..d6c7aba 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -639,8 +639,15 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND))) {
/* OK */
}
-	if (irb_is_error(irb))
-		vcdev->err = -EIO; /* XXX - use real error */
+	if (irb_is_error(irb)) {
+		/* Command reject? */
+		if ((scsw_dstat(&irb->scsw) & DEV_STAT_UNIT_CHECK) &&
+		    (irb->ecw[0] & SNS0_CMD_REJECT))
+			vcdev->err = -EOPNOTSUPP;
+		else
+			/* Map everything else to -EIO. */
+			vcdev->err = -EIO;
+	}
 	if (vcdev->curr_io & activity) {
 		switch (activity) {
 		case VIRTIO_CCW_DOING_READ_FEAT:
-- 
1.8.1.6



[PATCH] vhost: wake up worker outside spin_lock

2013-06-07 Thread Qinchuanyu
The wake_up_process() call is enclosed by spin_lock/unlock in
vhost_work_queue, but it could be done outside the spin_lock.
I have tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf;
the numbers are below.
                original             modified
thread_num  tp(Gbps)  vhost(%)  |  tp(Gbps)  vhost(%)
1           9.59      28.82     |  9.59      27.49
8           9.61      32.92     |  9.62      26.77
64          9.58      46.48     |  9.55      38.99
256         9.6       63.7      |  9.6       52.59

Signed-off-by: Chuanyu Qin qinchua...@huawei.com
---
 drivers/vhost/vhost.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
--
1.7.3.1.msysgit.0

Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-07 Thread Pekka Enberg

On 06/07/2013 11:17 AM, Asias He wrote:

Ping! Is there someone out there who has an AMD box they could test this on?


I tested it on an AMD box.  Guest boots with this patch, guest does not
boot without it.  I am not seeing the msr warning in either case.


That's pretty interesting. Can you please provide your /proc/cpuinfo so 
I can include it in the changelog?


Pekka



Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-07 Thread Asias He
On Fri, Jun 07, 2013 at 02:06:40PM +0300, Pekka Enberg wrote:
 On 06/07/2013 11:17 AM, Asias He wrote:
 Ping! Is there someone out there who has an AMD box they could test this on?
 
 I tested it on an AMD box.  Guest boots with this patch, guest does not
 boot without it.  I am not seeing the msr warning in either case.
 
 That's pretty interesting. Can you please provide your /proc/cpuinfo
 so I can include it in the changelog?

Indeed. Here you go:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model   : 1
model name  : AMD Opteron(TM) Processor 6274
stepping: 2
microcode   : 0x6000626
cpu MHz : 2200.034
cache size  : 2048 KB
physical id : 0
siblings: 16
core id : 0
cpu cores   : 8
apicid  : 32
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc
extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1
sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy
abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4
nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv
svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
pausefilter pfthreshold
bogomips: 4400.06
TLB size: 1536 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb

-- 
Asias


Re: [PATCH] get 2% or more performance improved by reducing spin_lock race

2013-06-07 Thread Sergei Shtylyov

Hello.

On 07-06-2013 11:31, Qinchuanyu wrote:


The wake_up_process() call is enclosed by spin_lock/unlock in
vhost_work_queue, but it could be done outside the spin_lock.
I have tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf;
the numbers are below.
                original             modified
thread_num  tp(Gbps)  vhost(%)  |  tp(Gbps)  vhost(%)
1           9.59      28.82     |  9.59      27.49
8           9.61      32.92     |  9.62      26.77
64          9.58      46.48     |  9.55      38.99
256         9.6       63.7      |  9.6       52.59


   Could you align your columns?


Signed-off-by: Chuanyu Qin qinchua...@huawei.com
---
  drivers/vhost/vhost.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)



diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);


   You should have {} in the *else* branch if you have it in the *if*
branch (and vice versa), according to Documentation/CodingStyle.
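
I.e., with the patch's logic unchanged, that would be:

	if (list_empty(&work->node)) {
		list_add_tail(&work->node, &dev->work_list);
		work->queue_seq++;
		spin_unlock_irqrestore(&dev->work_lock, flags);
		wake_up_process(dev->worker);
	} else {
		spin_unlock_irqrestore(&dev->work_lock, flags);
	}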


WBR, Sergei



Re: [PATCH RFC V9 0/19] Paravirtualized ticket spinlocks

2013-06-07 Thread Andrew Theurer
On Fri, 2013-06-07 at 11:45 +0530, Raghavendra K T wrote:
 On 06/03/2013 11:51 AM, Raghavendra K T wrote:
  On 06/03/2013 07:10 AM, Raghavendra K T wrote:
  On 06/02/2013 09:50 PM, Jiannan Ouyang wrote:
  On Sun, Jun 2, 2013 at 1:07 AM, Gleb Natapov g...@redhat.com wrote:
 
  High level question here. We have high hopes that the Preemptable Ticket
  Spinlock patch series by Jiannan Ouyang will solve most, if not all, of
  the ticket-spinlock problems in overcommit scenarios without needing PV.
  So how does this patch series compare with his patches on PLE enabled
  processors?
 
 
  No experiment results yet.
 
  An error is reported on a 20 core VM. I'm in the middle of an internship
  relocation, and will start work on it next week.
 
  Preemptable spinlocks' testing update:
  While testing on a 32 core machine with 32 guest vcpus, I hit the same
  softlockup problem that Andrew had reported.
 
  After that I started tuning TIMEOUT_UNIT, and when I went up to (1 << 8),
  things seemed to be manageable for undercommit cases.
  But even after tuning, I still see degradation for undercommit w.r.t. the
  baseline itself on the 32 core machine
  (37.5% degradation w.r.t. baseline).
  I can give the full report after all the tests complete.
 
  For over-commit cases, I again started hitting softlockups (and the
  degradation is worse). But as I said in the preemptable thread, the
  concept of preemptable locks looks promising (though I am still not a
  fan of the embedded TIMEOUT mechanism).
 
  Here is my opinion on the TODOs for preemptable locks to make them better
  (I think I need to paste this in the preemptable thread also):
 
  1. The current TIMEOUT_UNIT seems to be on the higher side, and it does
  not scale well with large guests or with overcommit. We need a sort of
  adaptive mechanism, and better still, different TIMEOUT_UNITs for
  different types of lock. The hashing mechanism that was used in Rik's
  spinlock backoff series probably fits better.
 
  2. I do not think TIMEOUT_UNIT by itself would work well when we have a
  big queue on a lock (for large guests / overcommit). One way is to add a
  PV hook that does a yield hypercall immediately for waiters above some
  THRESHOLD so that they don't burn the CPU.
  (I can do a POC to check whether that idea improves the situation at some
  later point of time.)
 
 
  Preemptable-lock results from my run with 2^8 TIMEOUT:
 
  +----+------------+-----------+------------+-----------+--------------+
                  ebizzy (records/sec)  higher is better
  +----+------------+-----------+------------+-----------+--------------+
            base       stdev       patched      stdev     %improvement
  +----+------------+-----------+------------+-----------+--------------+
  1x    5574.9000    237.4997    3484.2000    113.4449     -37.50202
  2x    2741.5000    561.3090     351.5000    140.5420     -87.17855
  3x    2146.2500    216.7718     194.8333     85.0303     -90.92215
  4x    1663.0000    141.9235     101.0000     57.7853     -93.92664
  +----+------------+-----------+------------+-----------+--------------+
  +----+------------+-----------+------------+-----------+--------------+
                  dbench (Throughput)  higher is better
  +----+------------+-----------+------------+-----------+--------------+
            base       stdev       patched      stdev     %improvement
  +----+------------+-----------+------------+-----------+--------------+
  1x   14111.5600    754.4525    3930.1602   2547.2369     -72.14936
  2x    2481.6270     71.2665     181.1816     89.5368     -92.69908
  3x    1510.2483     31.8634     104.7243     53.2470     -93.06576
  4x    1029.4875     16.9166      72.3738     38.2432     -92.96992
  +----+------------+-----------+------------+-----------+--------------+
 
  Note: we cannot trust the overcommit results because of soft lockups.
 
 
 Hi, I tried
 (1) TIMEOUT=(2^7)
 
 (2) having yield hypercall that uses kvm_vcpu_on_spin() to do directed 
 yield to other vCPUs.
 
 Now I do not see any soft-lockup in overcommit cases and results are 
 better now (except ebizzy 1x). and for dbench I see now it is closer to 
 base and even improvement in 4x
 
 +----+------------+-----------+------------+-----------+--------------+
               ebizzy (records/sec) higher is better
 +----+------------+-----------+------------+-----------+--------------+
           base       stdev       patched      stdev     %improvement
 +----+------------+-----------+------------+-----------+--------------+
 1x    5574.9000    237.4997     523.7000      1.4181     -90.60611
 2x    2741.5000    561.3090     597.8000     34.9755     -78.19442
 3x    2146.2500    216.7718     902.6667     82.4228     -57.94215
 4x    1663.0000    141.9235    1245.0000     67.2989     -25.13530
 +----+------------+-----------+------------+-----------+--------------+
 +----+------------+-----------+------------+-----------+--------------+
                dbench (Throughput) higher is better
 +----+------------+-----------+------------+-----------+--------------+
           base       stdev       patched      stdev     %improvement
 +----+------------+-----------+------------+-----------+--------------+

Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-07 Thread Asias He
On Fri, Jun 07, 2013 at 08:20:33PM +0800, Asias He wrote:
 On Fri, Jun 07, 2013 at 02:06:40PM +0300, Pekka Enberg wrote:
  On 06/07/2013 11:17 AM, Asias He wrote:
  Ping! Is there someone out there who has an AMD box they could test this
  on?
  
  I tested it on an AMD box.  The guest boots with this patch; it does not
  boot without it.  I am not seeing the MSR warning in either case.
  
  That's pretty interesting. Can you please provide your /proc/cpuinfo
  so I can include it in the changelog?
 
 Indeed. Here you go:
 
 processor   : 0
 vendor_id   : AuthenticAMD
 cpu family  : 21
 model   : 1
 model name  : AMD Opteron(TM) Processor 6274
 stepping: 2
 microcode   : 0x6000626
 cpu MHz : 2200.034
 cache size  : 2048 KB
 physical id : 0
 siblings: 16
 core id : 0
 cpu cores   : 8
 apicid  : 32
 initial apicid  : 0
 fpu : yes
 fpu_exception   : yes
 cpuid level : 13
 wp  : yes
 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
 mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
 fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc
 extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1
 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy
 abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4
 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv
 svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
 pausefilter pfthreshold
 bogomips: 4400.06
 TLB size: 1536 4K pages
 clflush size: 64
 cache_alignment : 64
 address sizes   : 48 bits physical, 48 bits virtual
 power management: ts ttp tm 100mhzsteps hwpstate cpb

And in guest:

# cat /proc/cpuinfo 
processor   : 0
vendor_id   : LKVMLKVMLKVM
cpu family  : 21
model   : 1
model name  : 15/01
stepping: 2
cpu MHz : 0.000
cache size  : 0 KB
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt
pdpe1gb lm nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes
xsave avx hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw xop fma4 npt nrip_save tsc_adjust
bogomips: 1340.41
clflush size: 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:


-- 
Asias


[PATCH] vhost: wake up worker outside spin_lock

2013-06-07 Thread Qin Chuanyu
the wake_up_process() call is enclosed by spin_lock/unlock in
vhost_work_queue,
but it can be done outside the spin_lock.
I have tested it with kernel 3.0.27 and a suse11-sp2 guest using iperf;
the numbers are below.
             original              |       modified
thread_num   tp(Gbps)   vhost(%)   |  tp(Gbps)   vhost(%)
1            9.59       28.82      |  9.59       27.49
8            9.61       32.92      |  9.62       26.77
64           9.58       46.48      |  9.55       38.99
256          9.60       63.70      |  9.60       52.59

Signed-off-by: Chuanyu Qin qinchua...@huawei.com
---
 drivers/vhost/vhost.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..dcc7a17 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,11 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
+	} else {
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
 }

 void vhost_poll_queue(struct vhost_poll *poll)
--
1.7.3.1.msysgit.0
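
For readability, here is how the queueing path reads with the change
applied (a reconstruction from the diff above, not part of the posting):

	static inline void vhost_work_queue(struct vhost_dev *dev,
					    struct vhost_work *work)
	{
		unsigned long flags;

		spin_lock_irqsave(&dev->work_lock, flags);
		if (list_empty(&work->node)) {
			list_add_tail(&work->node, &dev->work_list);
			work->queue_seq++;
			/* drop the lock before waking the worker thread */
			spin_unlock_irqrestore(&dev->work_lock, flags);
			wake_up_process(dev->worker);
		} else {
			spin_unlock_irqrestore(&dev->work_lock, flags);
		}
	}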




Re: [PATCH] vhost: wake up worker outside spin_lock

2013-06-07 Thread Asias He
On Fri, Jun 7, 2013 at 9:50 PM, Qin Chuanyu qinchua...@huawei.com wrote:
 the wake_up_process() call is enclosed by spin_lock/unlock in
 vhost_work_queue,
 but it can be done outside the spin_lock.
 I have tested it with kernel 3.0.27 and a suse11-sp2 guest using iperf;
 the numbers are below.
              original              |       modified

 thread_num   tp(Gbps)   vhost(%)   |  tp(Gbps)   vhost(%)
 1            9.59       28.82      |  9.59       27.49
 8            9.61       32.92      |  9.62       26.77
 64           9.58       46.48      |  9.55       38.99
 256          9.60       63.70      |  9.60       52.59

 Signed-off-by: Chuanyu Qin qinchua...@huawei.com
 ---
  drivers/vhost/vhost.c |4 +++-

  1 files changed, 3 insertions(+), 1 deletions(-)

 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index 94dbd25..dcc7a17 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -146,9 +146,11 @@ static inline void vhost_work_queue(struct vhost_dev
 *dev,

 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
 +		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
 +	} else {
 +		spin_unlock_irqrestore(&dev->work_lock, flags);
 	}
 -	spin_unlock_irqrestore(&dev->work_lock, flags);

  }

Hmm, this looks clearer to me:

--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -156,14 +156,17 @@ EXPORT_SYMBOL_GPL(vhost_poll_flush);
 void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
 {
 	unsigned long flags;
+	bool wakeup = false;
 
 	spin_lock_irqsave(&dev->work_lock, flags);
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
-		wake_up_process(dev->worker);
+		wakeup = true;
 	}
 	spin_unlock_irqrestore(&dev->work_lock, flags);
+	if (wakeup)
+		wake_up_process(dev->worker);
 }



  void vhost_poll_queue(struct vhost_poll *poll)
 --
 1.7.3.1.msysgit.0





--
Asias


[PULL] port of KVM to arm64

2013-06-07 Thread Marc Zyngier
Catalin, Will,

Please consider pulling the following branch to get the current
KVM/arm64 code merged in 3.11.

Note that this code doesn't get built as it stands, as it depends on 
other bits and pieces coming from both the main KVM tree and the
KVM/ARM tree. Once these dependencies are met, I'll post the last
patch enabling the KVM/arm64 code.

Thanks,

M.

The following changes since commit d683b96b072dc4680fc74964eca77e6a23d1fa6e:

  Linux 3.10-rc4 (2013-06-02 17:11:17 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
kvm-arm64/kvm-for-3.11

for you to fetch changes up to 6544c4736594302a142e24dea014c472484b53ca:

  arm64: KVM: document kernel object mappings in HYP (2013-06-07 14:17:52 +0100)


Marc Zyngier (33):
  arm64: KVM: define HYP and Stage-2 translation page flags
  arm64: KVM: HYP mode idmap support
  arm64: KVM: EL2 register definitions
  arm64: KVM: system register definitions for 64bit guests
  arm64: KVM: Basic ESR_EL2 helpers and vcpu register access
  arm64: KVM: fault injection into a guest
  arm64: KVM: architecture specific MMU backend
  arm64: KVM: user space interface
  arm64: KVM: system register handling
  arm64: KVM: CPU specific system registers handling
  arm64: KVM: virtual CPU reset
  arm64: KVM: kvm_arch and kvm_vcpu_arch definitions
  arm64: KVM: MMIO access backend
  arm64: KVM: guest one-reg interface
  arm64: KVM: hypervisor initialization code
  arm64: KVM: HYP mode world switch implementation
  arm64: KVM: Exit handling
  arm64: KVM: Plug the VGIC
  ARM: KVM: timer: allow DT matching for ARMv8 cores
  arm64: KVM: Plug the arch timer
  arm64: KVM: PSCI implementation
  arm64: KVM: Build system integration
  arm64: KVM: define 32bit specific registers
  arm64: KVM: 32bit GP register access
  arm64: KVM: 32bit conditional execution emulation
  arm64: KVM: 32bit handling of coprocessor traps
  arm64: KVM: CPU specific 32bit coprocessor access
  arm64: KVM: 32bit specific register world switch
  arm64: KVM: 32bit guest fault injection
  arm64: KVM: enable initialization of a 32bit vcpu
  arm64: KVM: userspace API documentation
  arm64: KVM: MAINTAINERS update
  arm64: KVM: document kernel object mappings in HYP

 Documentation/arm64/memory.txt |7 +
 Documentation/virtual/kvm/api.txt  |   58 +++--
 MAINTAINERS|9 +
 arch/arm/kvm/arch_timer.c  |1 +
 arch/arm64/Makefile|2 +-
 arch/arm64/include/asm/kvm_arm.h   |  245 +++
 arch/arm64/include/asm/kvm_asm.h   |  104 
 arch/arm64/include/asm/kvm_coproc.h|   56 +
 arch/arm64/include/asm/kvm_emulate.h   |  180 ++
 arch/arm64/include/asm/kvm_host.h  |  202 
 arch/arm64/include/asm/kvm_mmio.h  |   59 +
 arch/arm64/include/asm/kvm_mmu.h   |  135 +++
 arch/arm64/include/asm/kvm_psci.h  |   23 ++
 arch/arm64/include/asm/memory.h|6 +
 arch/arm64/include/asm/pgtable-hwdef.h |   19 ++
 arch/arm64/include/asm/pgtable.h   |   12 +
 arch/arm64/include/uapi/asm/kvm.h  |  168 +
 arch/arm64/kernel/asm-offsets.c|   34 +++
 arch/arm64/kernel/vmlinux.lds.S|   20 ++
 arch/arm64/kvm/Makefile|   23 ++
 arch/arm64/kvm/emulate.c   |  158 
 arch/arm64/kvm/guest.c |  265 
 arch/arm64/kvm/handle_exit.c   |  124 ++
 arch/arm64/kvm/hyp-init.S  |  107 +
 arch/arm64/kvm/hyp.S   |  831 +++
 arch/arm64/kvm/inject_fault.c  |  203 
 arch/arm64/kvm/regmap.c|  168 +
 arch/arm64/kvm/reset.c |  112 +
 arch/arm64/kvm/sys_regs.c  | 1050 
 arch/arm64/kvm/sys_regs.h  |  138 +++
 arch/arm64/kvm/sys_regs_generic_v8.c   |   95 
 include/uapi/linux/kvm.h   |2 +
 32 files changed, 4596 insertions(+), 20 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_arm.h
 create mode 100644 arch/arm64/include/asm/kvm_asm.h
 create mode 100644 arch/arm64/include/asm/kvm_coproc.h
 create mode 100644 arch/arm64/include/asm/kvm_emulate.h
 create mode 100644 arch/arm64/include/asm/kvm_host.h
 create mode 100644 arch/arm64/include/asm/kvm_mmio.h
 create mode 100644 arch/arm64/include/asm/kvm_mmu.h
 create mode 100644 arch/arm64/include/asm/kvm_psci.h
 create mode 100644 arch/arm64/include/uapi/asm/kvm.h
 create mode 100644 arch/arm64/kvm/Makefile
 create mode 100644 

[GIT PULL] VFIO fix for v3.10-rc5

2013-06-07 Thread Alex Williamson
Hi Linus,

The following changes since commit d683b96b072dc4680fc74964eca77e6a23d1fa6e:

  Linux 3.10-rc4 (2013-06-02 17:11:17 +0900)

are available in the git repository at:

  git://github.com/awilliam/linux-vfio.git tags/vfio-v3.10-rc5

for you to fetch changes up to 9a6aa279d3d17af73a029fa40654e92f4e75e8bb:

  vfio: fix crash on rmmod (2013-06-05 08:54:16 -0600)


vfio - fix rmmod crash


Alexey Kardashevskiy (1):
  vfio: fix crash on rmmod

 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)




[PATCH 01/31] MIPS: Move allocate_kscratch to cpu-probe.c and make it public.

2013-06-07 Thread David Daney
From: David Daney dda...@caviumnetworks.com

Signed-off-by: David Daney dda...@caviumnetworks.com
---
 arch/mips/include/asm/mipsregs.h |  2 ++
 arch/mips/kernel/cpu-probe.c | 29 +
 arch/mips/mm/tlbex.c | 20 +---
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 87e6207..6e0da5aa 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -1806,6 +1806,8 @@ __BUILD_SET_C0(brcm_cmt_ctrl)
 __BUILD_SET_C0(brcm_config)
 __BUILD_SET_C0(brcm_mode)
 
+int allocate_kscratch(void);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_MIPSREGS_H */
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index c6568bf..ee1014e 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1064,3 +1064,32 @@ __cpuinit void cpu_report(void)
 	if (c->options & MIPS_CPU_FPU)
 		printk(KERN_INFO "FPU revision is: %08x\n", c->fpu_id);
 }
+
+static DEFINE_SPINLOCK(kscratch_used_lock);
+
+static unsigned int kscratch_used_mask;
+
+int allocate_kscratch(void)
+{
+   int r;
+   unsigned int a;
+
+	spin_lock(&kscratch_used_lock);
+
+	a = cpu_data[0].kscratch_mask & ~kscratch_used_mask;
+
+	r = ffs(a);
+
+	if (r == 0) {
+		r = -1;
+		goto out;
+	}
+
+	r--; /* make it zero based */
+
+	kscratch_used_mask |= (1 << r);
+out:
+	spin_unlock(&kscratch_used_lock);
+
+   return r;
+}
diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index ce9818e..001b87c 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -30,6 +30,7 @@
 #include <linux/cache.h>
 
 #include <asm/cacheflush.h>
+#include <asm/mipsregs.h>
 #include <asm/pgtable.h>
 #include <asm/war.h>
 #include <asm/uasm.h>
@@ -307,25 +308,6 @@ static int check_for_high_segbits __cpuinitdata;
 
 static int check_for_high_segbits __cpuinitdata;
 
-static unsigned int kscratch_used_mask __cpuinitdata;
-
-static int __cpuinit allocate_kscratch(void)
-{
-   int r;
-	unsigned int a = cpu_data[0].kscratch_mask & ~kscratch_used_mask;
-
-   r = ffs(a);
-
-   if (r == 0)
-   return -1;
-
-   r--; /* make it zero based */
-
-	kscratch_used_mask |= (1 << r);
-
-   return r;
-}
-
 static int scratch_reg __cpuinitdata;
 static int pgd_reg __cpuinitdata;
 enum vmalloc64_mode {not_refill, refill_scratch, refill_noscratch};
-- 
1.7.11.7
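
A minimal usage sketch of the now-public interface (hypothetical caller,
not part of this patch):

	/*
	 * Reserve one KScratch register at init time; allocate_kscratch()
	 * returns a zero-based register index, or -1 if the CPU has no
	 * free KScratch registers left.
	 */
	static int my_kscratch_reg = -1;

	static int __init my_feature_init(void)
	{
		my_kscratch_reg = allocate_kscratch();
		if (my_kscratch_reg < 0)
			return -ENODEV;
		return 0;
	}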



[PATCH 11/31] MIPS: Rearrange branch.c so it can be used by kvm code.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Introduce __compute_return_epc_for_insn0() entry point.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/branch.h |  7 +
 arch/mips/kernel/branch.c  | 63 +++---
 2 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/arch/mips/include/asm/branch.h b/arch/mips/include/asm/branch.h
index e28a3e0..b3de685 100644
--- a/arch/mips/include/asm/branch.h
+++ b/arch/mips/include/asm/branch.h
@@ -37,6 +37,13 @@ static inline unsigned long exception_epc(struct pt_regs *regs)
 
 #define BRANCH_LIKELY_TAKEN 0x0001
 
+extern int __compute_return_epc(struct pt_regs *regs);
+extern int __compute_return_epc_for_insn(struct pt_regs *regs,
+					 union mips_instruction insn);
+extern int __compute_return_epc_for_insn0(struct pt_regs *regs,
+					   union mips_instruction insn,
+					   unsigned int (*get_fcr31)(void));
+
 static inline int compute_return_epc(struct pt_regs *regs)
 {
 	if (get_isa16_mode(regs->cp0_epc)) {
diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c
index 46c2ad0..e47145b 100644
--- a/arch/mips/kernel/branch.c
+++ b/arch/mips/kernel/branch.c
@@ -195,17 +195,18 @@ int __MIPS16e_compute_return_epc(struct pt_regs *regs)
 }
 
 /**
- * __compute_return_epc_for_insn - Computes the return address and do emulate
+ * __compute_return_epc_for_insn0 - Computes the return address and do emulate
  * branch simulation, if required.
  *
  * @regs:  Pointer to pt_regs
  * @insn:  branch instruction to decode
- * @returns:   -EFAULT on error and forces SIGBUS, and on success
+ * @returns:   -EFAULT on error, and on success
  * returns 0 or BRANCH_LIKELY_TAKEN as appropriate after
  * evaluating the branch.
  */
-int __compute_return_epc_for_insn(struct pt_regs *regs,
-				  union mips_instruction insn)
+int __compute_return_epc_for_insn0(struct pt_regs *regs,
+				   union mips_instruction insn,
+				   unsigned int (*get_fcr31)(void))
 {
unsigned int bit, fcr31, dspcontrol;
 	long epc = regs->cp0_epc;
@@ -281,7 +282,7 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
 
case bposge32_op:
if (!cpu_has_dsp)
-   goto sigill;
+   return -EFAULT;
 
dspcontrol = rddsp(0x01);
 
@@ -364,13 +365,7 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
 * And now the FPA/cp1 branch instructions.
 */
case cop1_op:
-   preempt_disable();
-   if (is_fpu_owner())
-		asm volatile("cfc1\t%0,$31" : "=r" (fcr31));
-	else
-		fcr31 = current->thread.fpu.fcr31;
-   preempt_enable();
-
+   fcr31 = get_fcr31();
 	bit = (insn.i_format.rt >> 2);
bit += (bit != 0);
bit += 23;
@@ -434,11 +429,47 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
}
 
return ret;
+}
+EXPORT_SYMBOL_GPL(__compute_return_epc_for_insn0);
 
-sigill:
-	printk("%s: DSP branch but not DSP ASE - sending SIGBUS.\n",
-	       current->comm);
-	force_sig(SIGBUS, current);
-	return -EFAULT;
+static unsigned int __get_fcr31(void)
+{
+   unsigned int fcr31;
+
+   preempt_disable();
+   if (is_fpu_owner())
+		asm volatile(
+			".set push\n"
+			"\t.set mips1\n"
+			"\tcfc1\t%0,$31\n"
+			"\t.set pop" : "=r" (fcr31));
+	else
+		fcr31 = current->thread.fpu.fcr31;
+   preempt_enable();
+   return fcr31;
+}
+
+/**
+ * __compute_return_epc_for_insn - Computes the return address and do emulate
+ * branch simulation, if required.
+ *
+ * @regs:  Pointer to pt_regs
+ * @insn:  branch instruction to decode
+ * @returns:   -EFAULT on error and forces SIGBUS, and on success
+ * returns 0 or BRANCH_LIKELY_TAKEN as appropriate after
+ * evaluating the branch.
+ */
+int __compute_return_epc_for_insn(struct pt_regs *regs,
+				  union mips_instruction insn)
+{
+	int r = __compute_return_epc_for_insn0(regs, insn, __get_fcr31);
+
+	if (r < 0) {
+		printk("%s: DSP branch but not DSP ASE - sending SIGBUS.\n",
+		       current->comm);
+		force_sig(SIGBUS, current);
+	}
+
+   return r;
 }
 EXPORT_SYMBOL_GPL(__compute_return_epc_for_insn);
 
-- 
1.7.11.7
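
A sketch of how a KVM caller could use the new entry point, supplying
the guest's FCSR instead of reading the host FPU (the helper names here
are hypothetical):

	/* Report the guest's FCSR rather than the host's. */
	static unsigned int kvm_guest_fcr31(void)
	{
		return current_guest_vcpu()->arch.fcsr;	/* hypothetical accessor */
	}

	static int kvm_emulate_branch(struct pt_regs *regs,
				      union mips_instruction insn)
	{
		/* No SIGBUS is forced here; the caller handles -EFAULT. */
		return __compute_return_epc_for_insn0(regs, insn, kvm_guest_fcr31);
	}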


[PATCH 12/31] MIPS: Add instruction format information for WAIT, MTC0, MFC0, et al.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

---
 arch/mips/include/uapi/asm/inst.h | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index 0f4aec2..133abc1 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -117,7 +117,8 @@ enum bcop_op {
 enum cop0_coi_func {
tlbr_op   = 0x01, tlbwi_op  = 0x02,
tlbwr_op  = 0x06, tlbp_op   = 0x08,
-   rfe_op= 0x10, eret_op   = 0x18
+   rfe_op= 0x10, eret_op   = 0x18,
+   wait_op   = 0x20
 };
 
 /*
@@ -567,6 +568,24 @@ struct b_format {  /* BREAK and SYSCALL */
;)))
 };
 
+struct c0_format {		/* WAIT, TLB?? */
+	BITFIELD_FIELD(unsigned int opcode : 6,
+	BITFIELD_FIELD(unsigned int co : 1,
+	BITFIELD_FIELD(unsigned int code : 19,
+	BITFIELD_FIELD(unsigned int func : 6,
+	;))))
+};
+
+struct c0m_format {		/* MTC0, MFC0, ... */
+	BITFIELD_FIELD(unsigned int opcode : 6,
+	BITFIELD_FIELD(unsigned int func : 5,
+	BITFIELD_FIELD(unsigned int rt : 5,
+	BITFIELD_FIELD(unsigned int rd : 5,
+	BITFIELD_FIELD(unsigned int code : 8,
+	BITFIELD_FIELD(unsigned int sel : 3,
+	;))))))
+};
+
 struct ps_format { /* MIPS-3D / paired single format */
BITFIELD_FIELD(unsigned int opcode : 6,
BITFIELD_FIELD(unsigned int rs : 5,
@@ -857,6 +876,8 @@ union mips_instruction {
struct f_format f_format;
struct ma_format ma_format;
struct b_format b_format;
+   struct c0_format c0_format;
+   struct c0m_format c0m_format;
struct ps_format ps_format;
struct v_format v_format;
struct fb_format fb_format;
-- 
1.7.11.7
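
As an illustration, the new formats let a trapped instruction word be
decoded along these lines (a sketch; fetching the word from CP0_BadInstr
comes with later patches in the series):

	/* Is this a COP0 WAIT instruction? */
	static bool is_wait_insn(u32 word)
	{
		union mips_instruction insn;

		insn.word = word;
		return insn.c0_format.opcode == cop0_op &&
		       insn.c0_format.co == 1 &&
		       insn.c0_format.func == wait_op;
	}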



[PATCH 30/31] mips/kvm: Enable MIPSVZ in Kconfig/Makefile

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Also let CPU_CAVIUM_OCTEON select KVM.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/Kconfig  | 1 +
 arch/mips/kvm/Kconfig  | 9 +
 arch/mips/kvm/Makefile | 1 +
 3 files changed, 11 insertions(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 7a58ab9..16e3d22 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1426,6 +1426,7 @@ config CPU_CAVIUM_OCTEON
select LIBFDT
select USE_OF
select USB_EHCI_BIG_ENDIAN_MMIO
+   select HAVE_KVM
help
  The Cavium Octeon processor is a highly integrated chip containing
  many ethernet hardware widgets for networking tasks. The processor
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index 95c0d22..32a5016 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -48,6 +48,15 @@ config KVM_MIPS_DEBUG_COP0_COUNTERS
 
  If unsure, say N.
 
+config KVM_MIPSVZ
+	bool "Kernel-based Virtual Machine (KVM) using hardware MIPS-VZ support"
+   depends on HAVE_KVM
+   select KVM
+   ---help---
+ Support for hosting Guest kernels on hardware with the
+ MIPS-VZ hardware module.
+
+
 source drivers/vhost/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/arch/mips/kvm/Makefile b/arch/mips/kvm/Makefile
index 3377197..595358f 100644
--- a/arch/mips/kvm/Makefile
+++ b/arch/mips/kvm/Makefile
@@ -13,3 +13,4 @@ kvm_mipste-objs	:= kvm_mips_emul.o kvm_locore.o kvm_mips_int.o \
 
 obj-$(CONFIG_KVM)  += $(common-objs) kvm_mips.o
 obj-$(CONFIG_KVM_MIPSTE)   += kvm_mipste.o
+obj-$(CONFIG_KVM_MIPSVZ)   += kvm_mipsvz.o kvm_mipsvz_guest.o
-- 
1.7.11.7



[PATCH 27/31] mips/kvm: Gate the use of kvm_local_flush_tlb_all() by KVM_MIPSTE

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Only the trap-and-emulate KVM code needs a special TLB flusher.  All
other configurations should use the regular version.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/mmu_context.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/mmu_context.h 
b/arch/mips/include/asm/mmu_context.h
index 5609a32..04d0b74 100644
--- a/arch/mips/include/asm/mmu_context.h
+++ b/arch/mips/include/asm/mmu_context.h
@@ -117,7 +117,7 @@ get_new_asid(unsigned long cpu)
 	if (!((asid += ASID_INC) & ASID_MASK)) {
if (cpu_has_vtag_icache)
flush_icache_all();
-#ifdef CONFIG_VIRTUALIZATION
+#if IS_ENABLED(CONFIG_KVM_MIPSTE)
kvm_local_flush_tlb_all();  /* start new asid cycle */
 #else
local_flush_tlb_all();  /* start new asid cycle */
-- 
1.7.11.7



[PATCH 31/31] mips/kvm: Allow for up to 8 KVM vcpus per vm.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The mipsvz implementation allows for SMP, so let's be able to create
all those vcpus.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/kvm_host.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 9f209e1..0a5e218 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -20,7 +20,7 @@
 #include linux/spinlock.h
 
 
-#define KVM_MAX_VCPUS  1
+#define KVM_MAX_VCPUS  8
 #define KVM_USER_MEM_SLOTS 8
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS  0
-- 
1.7.11.7



[PATCH 24/31] mips/kvm: Add thread_struct fields used by MIPSVZ hosts.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

... and their accessors in asm-offsets.c

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/processor.h | 6 ++
 arch/mips/kernel/asm-offsets.c| 5 +
 2 files changed, 11 insertions(+)

diff --git a/arch/mips/include/asm/processor.h 
b/arch/mips/include/asm/processor.h
index 1470b7b..e0aa198 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -198,6 +198,7 @@ typedef struct {
 #define ARCH_MIN_TASKALIGN 8
 
 struct mips_abi;
+struct kvm_vcpu;
 
 /*
  * If you change thread_struct remember to change the #defines below too!
@@ -230,6 +231,11 @@ struct thread_struct {
unsigned long cp0_badvaddr; /* Last user fault */
unsigned long cp0_baduaddr; /* Last kernel fault accessing USEG */
unsigned long error_code;
+#ifdef CONFIG_KVM_MIPSVZ
+   struct kvm_vcpu *vcpu;
+   unsigned int mm_asid;
+   unsigned int guest_asid;
+#endif
 #ifdef CONFIG_CPU_CAVIUM_OCTEON
 struct octeon_cop2_state cp2 __attribute__ ((__aligned__(128)));
 struct octeon_cvmseg_state cvmseg __attribute__ ((__aligned__(128)));
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index c5cc28f..37fd9e2 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -132,6 +132,11 @@ void output_thread_defines(void)
   thread.cp0_baduaddr);
OFFSET(THREAD_ECODE, task_struct, \
   thread.error_code);
+#ifdef CONFIG_KVM_MIPSVZ
+   OFFSET(THREAD_VCPU, task_struct, thread.vcpu);
+   OFFSET(THREAD_MM_ASID, task_struct, thread.mm_asid);
+   OFFSET(THREAD_GUEST_ASID, task_struct, thread.guest_asid);
+#endif
BLANK();
 }
 
-- 
1.7.11.7



[PATCH 23/31] mips/kvm: Hook into CP unusable exception handler.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The MIPS VZ KVM code needs this to be able to manage the FPU.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kernel/traps.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index fca0a2f..2bdeb32 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -56,6 +56,7 @@
 #include asm/types.h
 #include asm/stacktrace.h
 #include asm/uasm.h
+#include asm/kvm_mips_vz.h
 
 extern void check_wait(void);
 extern asmlinkage void rollback_handle_int(void);
@@ -1045,6 +1046,13 @@ asmlinkage void do_cpu(struct pt_regs *regs)
int status;
unsigned long __maybe_unused flags;
 
+#ifdef CONFIG_KVM_MIPSVZ
+   if (test_tsk_thread_flag(current, TIF_GUESTMODE)) {
+   if (mipsvz_cp_unusable(regs))
+   return;
+   }
+#endif
+
 	die_if_kernel("do_cpu invoked from kernel context!", regs);
 
 	cpid = (regs->cp0_cause >> CAUSEB_CE) & 3;
-- 
1.7.11.7
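
The hook itself lands later in the series; its expected shape is roughly
(a sketch, not the real implementation):

	/*
	 * Return true if the fault was taken in guest mode and KVM handled
	 * it, so do_cpu() must not process the trap any further.
	 */
	bool mipsvz_cp_unusable(struct pt_regs *regs)
	{
		bool handled = false;

		/* e.g. hand the guest the FPU, or queue a guest exception */

		return handled;
	}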



[PATCH 26/31] mips/kvm: Split up Kconfig and Makefile definitions in preparation for MIPSVZ.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Create the symbol KVM_MIPSTE, and use it to select the trap-and-emulate
specific things.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kvm/Kconfig  | 14 +-
 arch/mips/kvm/Makefile | 14 --
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index 2c15590..95c0d22 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -16,18 +16,22 @@ menuconfig VIRTUALIZATION
 if VIRTUALIZATION
 
 config KVM
-	tristate "Kernel-based Virtual Machine (KVM) support"
-   depends on HAVE_KVM
+   tristate
select PREEMPT_NOTIFIERS
+
+config KVM_MIPSTE
+	tristate "Kernel-based Virtual Machine (KVM) 32-bit trap-and-emulate"
+   depends on HAVE_KVM
+   select KVM
select ANON_INODES
select KVM_MMIO
---help---
- Support for hosting Guest kernels.
+ Support for hosting Guest kernels with modified address space layout.
  Currently supported on MIPS32 processors.
 
 config KVM_MIPS_DYN_TRANS
 	bool "KVM/MIPS: Dynamic binary translation to reduce traps"
-   depends on KVM
+   depends on KVM_MIPSTE
---help---
 	  When running in Trap & Emulate mode patch privileged
  instructions to reduce the number of traps.
@@ -36,7 +40,7 @@ config KVM_MIPS_DYN_TRANS
 
 config KVM_MIPS_DEBUG_COP0_COUNTERS
 	bool "Maintain counters for COP0 accesses"
-   depends on KVM
+   depends on KVM_MIPSTE
---help---
  Maintain statistics for Guest COP0 accesses.
  A histogram of COP0 accesses is printed when the VM is
diff --git a/arch/mips/kvm/Makefile b/arch/mips/kvm/Makefile
index 78d87bb..3377197 100644
--- a/arch/mips/kvm/Makefile
+++ b/arch/mips/kvm/Makefile
@@ -1,13 +1,15 @@
 # Makefile for KVM support for MIPS
 #
 
-common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
+common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o)
 
 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/mips/kvm
 
-kvm-objs := $(common-objs) kvm_mips.o kvm_mips_emul.o kvm_locore.o \
-   kvm_mips_int.o kvm_mips_stats.o kvm_mips_commpage.o \
-   kvm_mips_dyntrans.o kvm_trap_emul.o
+kvm_mipste-objs:= kvm_mips_emul.o kvm_locore.o kvm_mips_int.o \
+  kvm_mips_stats.o kvm_mips_commpage.o \
+  kvm_mips_dyntrans.o kvm_trap_emul.o kvm_cb.o \
+  kvm_tlb.o \
+  $(addprefix ../../../virt/kvm/, coalesced_mmio.o)
 
-obj-$(CONFIG_KVM)  += kvm.o
-obj-y  += kvm_cb.o kvm_tlb.o
+obj-$(CONFIG_KVM)  += $(common-objs) kvm_mips.o
+obj-$(CONFIG_KVM_MIPSTE)   += kvm_mipste.o
-- 
1.7.11.7



[PATCH 28/31] mips/kvm: Only use KVM_COALESCED_MMIO_PAGE_OFFSET with KVM_MIPSTE

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The forthcoming MIPSVZ code doesn't currently use this, so it must
only be enabled for KVM_MIPSTE.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/kvm_host.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 505b804..9f209e1 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -25,7 +25,9 @@
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS  0
 
+#ifdef CONFIG_KVM_MIPSTE
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+#endif
 
 /* Don't support huge pages */
 #define KVM_HPAGE_GFN_SHIFT(x) 0
-- 
1.7.11.7



[PATCH 25/31] mips/kvm: Add some asm-offsets constants used by MIPSVZ.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kernel/asm-offsets.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 37fd9e2..db09376 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -19,6 +19,7 @@
 
 #include linux/kvm_host.h
 #include asm/kvm_mips_te.h
+#include asm/kvm_mips_vz.h
 
 void output_ptreg_defines(void)
 {
@@ -345,6 +346,8 @@ void output_pbe_defines(void)
 void output_kvm_defines(void)
 {
COMMENT( KVM/MIPS Specfic offsets. );
+   OFFSET(KVM_ARCH_IMPL, kvm, arch.impl);
+   OFFSET(KVM_VCPU_KVM, kvm_vcpu, kvm);
DEFINE(VCPU_ARCH_SIZE, sizeof(struct kvm_vcpu_arch));
OFFSET(VCPU_RUN, kvm_vcpu, run);
OFFSET(VCPU_HOST_ARCH, kvm_vcpu, arch);
@@ -411,5 +414,9 @@ void output_kvm_defines(void)
OFFSET(COP0_TLB_HI, mips_coproc, reg[MIPS_CP0_TLB_HI][0]);
OFFSET(COP0_STATUS, mips_coproc, reg[MIPS_CP0_STATUS][0]);
BLANK();
+
+   COMMENT( Linux struct kvm mipsvz offsets. );
+   OFFSET(KVM_MIPS_VZ_PGD, kvm_mips_vz, pgd);
+   BLANK();
 }
 #endif
-- 
1.7.11.7



[PATCH 18/31] mips/kvm: Add pt_regs slots for BadInstr and BadInstrP

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

These save the instruction word to be used by MIPSVZ code for
instruction emulation.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/ptrace.h | 4 
 arch/mips/kernel/asm-offsets.c | 4 
 2 files changed, 8 insertions(+)

diff --git a/arch/mips/include/asm/ptrace.h b/arch/mips/include/asm/ptrace.h
index 5e6cd09..d080716 100644
--- a/arch/mips/include/asm/ptrace.h
+++ b/arch/mips/include/asm/ptrace.h
@@ -46,6 +46,10 @@ struct pt_regs {
unsigned long long mpl[3];/* MTM{0,1,2} */
unsigned long long mtp[3];/* MTP{0,1,2} */
 #endif
+#ifdef CONFIG_KVM_MIPSVZ
+	unsigned int cp0_badinstr;	/* Only populated on do_page_fault_{0,1} */
+	unsigned int cp0_badinstrp;	/* Only populated on do_page_fault_{0,1} */
+#endif
 } __aligned(8);
 
 struct task_struct;
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 03bf363..c5cc28f 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -71,6 +71,10 @@ void output_ptreg_defines(void)
OFFSET(PT_MPL, pt_regs, mpl);
OFFSET(PT_MTP, pt_regs, mtp);
 #endif /* CONFIG_CPU_CAVIUM_OCTEON */
+#ifdef CONFIG_KVM_MIPSVZ
+   OFFSET(PT_BADINSTR, pt_regs, cp0_badinstr);
+   OFFSET(PT_BADINSTRP, pt_regs, cp0_badinstrp);
+#endif
DEFINE(PT_SIZE, sizeof(struct pt_regs));
BLANK();
 }
-- 
1.7.11.7



[PATCH 19/31] mips/kvm: Add host definitions for MIPS VZ based host.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/kvm_mips_vz.h | 29 +
 1 file changed, 29 insertions(+)
 create mode 100644 arch/mips/include/asm/kvm_mips_vz.h

diff --git a/arch/mips/include/asm/kvm_mips_vz.h 
b/arch/mips/include/asm/kvm_mips_vz.h
new file mode 100644
index 0000000..dfc6951
--- /dev/null
+++ b/arch/mips/include/asm/kvm_mips_vz.h
@@ -0,0 +1,29 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file COPYING in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2013 Cavium, Inc.
+ */
+#ifndef _ASM_KVM_MIPS_VZ_H
+#define _ASM_KVM_MIPS_VZ_H
+
+struct kvm;
+
+struct kvm_mips_vz {
+   struct mutex guest_mm_lock;
+   pgd_t *pgd; /* Translations for this host. */
+   spinlock_t irq_chip_lock;
+   struct page *irq_chip;
+   unsigned int asid[NR_CPUS]; /* Per CPU ASIDs for pgd. */
+};
+
+bool mipsvz_page_fault(struct pt_regs *regs, unsigned long write,
+  unsigned long address);
+
+bool mipsvz_cp_unusable(struct pt_regs *regs);
+int mipsvz_arch_init(void *opaque);
+int mipsvz_arch_hardware_enable(void *garbage);
+int mipsvz_init_vm(struct kvm *kvm, unsigned long type);
+
+#endif /* _ASM_KVM_MIPS_VZ_H */
-- 
1.7.11.7



[PATCH 14/31] mips/kvm: Add thread_info flag to indicate operation in MIPS VZ Guest Mode.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/thread_info.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/thread_info.h 
b/arch/mips/include/asm/thread_info.h
index 895320e..a7a894a 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -109,6 +109,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_RESTORE_SIGMASK	9	/* restore signal mask in do_signal() */
 #define TIF_USEDFPU		16	/* FPU was used by this task this quantum (SMP) */
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
+#define TIF_GUESTMODE		19	/* If set, running in VZ Guest mode. */
 #define TIF_FIXADE		20	/* Fix address errors in software */
 #define TIF_LOGADE		21	/* Log address errors to syslog */
 #define TIF_32BIT_REGS 22  /* also implies 16/32 fprs */
@@ -124,6 +125,7 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_SECCOMP   (1TIF_SECCOMP)
 #define _TIF_NOTIFY_RESUME (1TIF_NOTIFY_RESUME)
 #define _TIF_USEDFPU   (1TIF_USEDFPU)
+#define _TIF_GUESTMODE (1TIF_GUESTMODE)
 #define _TIF_FIXADE(1TIF_FIXADE)
 #define _TIF_LOGADE(1TIF_LOGADE)
 #define _TIF_32BIT_REGS(1TIF_32BIT_REGS)
-- 
1.7.11.7



[PATCH 20/31] mips/kvm: Hook into TLB fault handlers.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

If the CPU is operating in guest mode when a TLB-related exception
occurs, give KVM a chance to do the emulation.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/mm/fault.c   | 8 
 arch/mips/mm/tlbex-fault.S | 6 ++
 2 files changed, 14 insertions(+)

diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 0fead53..9391da49 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -26,6 +26,7 @@
 #include <asm/ptrace.h>
 #include <asm/highmem.h>	/* For VMALLOC_END */
 #include <linux/kdebug.h>
+#include <asm/kvm_mips_vz.h>
 
 /*
  * This routine handles page faults.  It determines the address,
@@ -50,6 +51,13 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, unsigned long writ
 		       field, regs->cp0_epc);
 #endif
 
+#ifdef CONFIG_KVM_MIPSVZ
+   if (test_tsk_thread_flag(current, TIF_GUESTMODE)) {
+   if (mipsvz_page_fault(regs, write, address))
+   return;
+   }
+#endif
+
 #ifdef CONFIG_KPROBES
/*
 * This is to notify the fault handler of the kprobes.  The
diff --git a/arch/mips/mm/tlbex-fault.S b/arch/mips/mm/tlbex-fault.S
index 318855e..df0f70b 100644
--- a/arch/mips/mm/tlbex-fault.S
+++ b/arch/mips/mm/tlbex-fault.S
@@ -14,6 +14,12 @@
NESTED(tlb_do_page_fault_\write, PT_SIZE, sp)
SAVE_ALL
 	MFC0	a2, CP0_BADVADDR
+#ifdef CONFIG_KVM_MIPSVZ
+	mfc0	v0, CP0_BADINSTR
+	mfc0	v1, CP0_BADINSTRP
+	sw	v0, PT_BADINSTR(sp)
+	sw	v1, PT_BADINSTRP(sp)
+#endif
 	KMODE
 	move	a0, sp
 	REG_S	a2, PT_BVADDR(sp)
-- 
1.7.11.7



[PATCH 21/31] mips/kvm: Allow set_except_vector() to be used from MIPSVZ code.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

We need to move it out of __init so we don't have section mismatch problems.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/uasm.h | 2 +-
 arch/mips/kernel/traps.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h
index 370d967..90b4f5e 100644
--- a/arch/mips/include/asm/uasm.h
+++ b/arch/mips/include/asm/uasm.h
@@ -11,7 +11,7 @@
 
 #include <linux/types.h>
 
-#ifdef CONFIG_EXPORT_UASM
+#if defined(CONFIG_EXPORT_UASM) || IS_ENABLED(CONFIG_KVM_MIPSVZ)
 #include <linux/export.h>
 #define __uasminit
 #define __uasminitdata
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index f008795..fca0a2f 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -1457,7 +1457,7 @@ unsigned long ebase;
 unsigned long exception_handlers[32];
 unsigned long vi_handlers[64];
 
-void __init *set_except_vector(int n, void *addr)
+void __uasminit *set_except_vector(int n, void *addr)
 {
unsigned long handler = (unsigned long) addr;
unsigned long old_handler;
-- 
1.7.11.7

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 22/31] mips/kvm: Split get_new_mmu_context into two parts.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The new function get_new_asid(), split out of get_new_mmu_context(), can now be used from MIPSVZ code.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/mmu_context.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/mmu_context.h 
b/arch/mips/include/asm/mmu_context.h
index 8201160..5609a32 100644
--- a/arch/mips/include/asm/mmu_context.h
+++ b/arch/mips/include/asm/mmu_context.h
@@ -108,8 +108,8 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 
 #ifndef CONFIG_MIPS_MT_SMTC
 /* Normal, classic MIPS get_new_mmu_context */
-static inline void
-get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
+static inline unsigned long
+get_new_asid(unsigned long cpu)
 {
extern void kvm_local_flush_tlb_all(void);
unsigned long asid = asid_cache(cpu);
@@ -125,7 +125,13 @@ get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
if (!asid)  /* fix version if needed */
asid = ASID_FIRST_VERSION;
}
+   return asid;
+}
 
+static inline void
+get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
+{
+   unsigned long asid = get_new_asid(cpu);
cpu_context(cpu, mm) = asid_cache(cpu) = asid;
 }
 
-- 
1.7.11.7
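
A hypothetical MIPSVZ-side sketch of why the split helps: a fresh ASID
can be taken for the guest without touching any host mm_struct (the
guest_asid field is added by a later patch in this series):

	static void mipsvz_new_guest_asid(int cpu)
	{
		unsigned long asid = get_new_asid(cpu);

		asid_cache(cpu) = asid;		/* account for the consumed ASID */
		current->thread.guest_asid = asid & ASID_MASK;
	}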



[PATCH 17/31] MIPS: Quit exposing Kconfig symbols in uapi headers.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The kernel's struct pt_regs has many fields conditional on various
Kconfig variables; we cannot be exporting this garbage to user-space.

Move the kernel's definition to asm/ptrace.h, and put a uapi-only
version in uapi/asm/ptrace.h gated by #ifndef __KERNEL__.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/ptrace.h  | 32 
 arch/mips/include/uapi/asm/ptrace.h | 17 ++---
 2 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/arch/mips/include/asm/ptrace.h b/arch/mips/include/asm/ptrace.h
index a3186f2..5e6cd09 100644
--- a/arch/mips/include/asm/ptrace.h
+++ b/arch/mips/include/asm/ptrace.h
@@ -16,6 +16,38 @@
 #include asm/isadep.h
 #include uapi/asm/ptrace.h
 
+/*
+ * This struct defines the way the registers are stored on the stack during a
+ * system call/exception. As usual the registers k0/k1 aren't being saved.
+ */
+struct pt_regs {
+#ifdef CONFIG_32BIT
+   /* Pad bytes for argument save space on the stack. */
+   unsigned long pad0[6];
+#endif
+
+   /* Saved main processor registers. */
+   unsigned long regs[32];
+
+   /* Saved special registers. */
+   unsigned long cp0_status;
+   unsigned long hi;
+   unsigned long lo;
+#ifdef CONFIG_CPU_HAS_SMARTMIPS
+   unsigned long acx;
+#endif
+   unsigned long cp0_badvaddr;
+   unsigned long cp0_cause;
+   unsigned long cp0_epc;
+#ifdef CONFIG_MIPS_MT_SMTC
+   unsigned long cp0_tcstatus;
+#endif /* CONFIG_MIPS_MT_SMTC */
+#ifdef CONFIG_CPU_CAVIUM_OCTEON
+   unsigned long long mpl[3];/* MTM{0,1,2} */
+   unsigned long long mtp[3];/* MTP{0,1,2} */
+#endif
+} __aligned(8);
+
 struct task_struct;
 
 extern int ptrace_getregs(struct task_struct *child, __s64 __user *data);
diff --git a/arch/mips/include/uapi/asm/ptrace.h 
b/arch/mips/include/uapi/asm/ptrace.h
index 4d58d84..b26f7e3 100644
--- a/arch/mips/include/uapi/asm/ptrace.h
+++ b/arch/mips/include/uapi/asm/ptrace.h
@@ -22,16 +22,12 @@
 #define DSP_CONTROL77
 #define ACX78
 
+#ifndef __KERNEL__
 /*
  * This struct defines the way the registers are stored on the stack during a
  * system call/exception. As usual the registers k0/k1 aren't being saved.
  */
 struct pt_regs {
-#ifdef CONFIG_32BIT
-   /* Pad bytes for argument save space on the stack. */
-   unsigned long pad0[6];
-#endif
-
/* Saved main processor registers. */
unsigned long regs[32];
 
@@ -39,20 +35,11 @@ struct pt_regs {
unsigned long cp0_status;
unsigned long hi;
unsigned long lo;
-#ifdef CONFIG_CPU_HAS_SMARTMIPS
-   unsigned long acx;
-#endif
unsigned long cp0_badvaddr;
unsigned long cp0_cause;
unsigned long cp0_epc;
-#ifdef CONFIG_MIPS_MT_SMTC
-   unsigned long cp0_tcstatus;
-#endif /* CONFIG_MIPS_MT_SMTC */
-#ifdef CONFIG_CPU_CAVIUM_OCTEON
-   unsigned long long mpl[3];/* MTM{0,1,2} */
-   unsigned long long mtp[3];/* MTP{0,1,2} */
-#endif
 } __attribute__ ((aligned (8)));
+#endif /* __KERNEL__ */
 
 /* Arbitrarily choose the same ptrace numbers as used by the Sparc code. */
 #define PTRACE_GETREGS 12
-- 
1.7.11.7



[PATCH 13/31] mips/kvm: Add accessors for MIPS VZ registers.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

There are accessors for both the guest control registers and the
guest CP0 context.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/mipsregs.h | 260 +++
 1 file changed, 260 insertions(+)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 6f03c72..0addfec 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -50,10 +50,13 @@
 #define CP0_WIRED $6
 #define CP0_INFO $7
 #define CP0_BADVADDR $8
+#define CP0_BADINSTR $8, 1
+#define CP0_BADINSTRP $8, 2
 #define CP0_COUNT $9
 #define CP0_ENTRYHI $10
 #define CP0_COMPARE $11
 #define CP0_STATUS $12
+#define CP0_GUESTCTL0 $12, 6
 #define CP0_CAUSE $13
 #define CP0_EPC $14
 #define CP0_PRID $15
@@ -623,6 +626,10 @@
 #define MIPS_FPIR_L		(_ULCAST_(1) << 21)
 #define MIPS_FPIR_F64		(_ULCAST_(1) << 22)
 
+/* Bits in the MIPS VZ GuestCtl0 Register */
+#define MIPS_GUESTCTL0B_GM 31
+#define MIPS_GUESTCTL0F_GM	(_ULCAST_(1) << MIPS_GUESTCTL0B_GM)
+
 #ifndef __ASSEMBLY__
 
 /*
@@ -851,6 +858,144 @@ do {							\
local_irq_restore(__flags); \
 } while (0)
 
+/*
+ * Macros to access the VZ Guest system control coprocessor
+ */
+
+#define __read_32bit_gc0_register(source, sel)				\
+	({ int __res;							\
+	__asm__ __volatile__(						\
+		".set mips64r2\n\t"					\
+		".set\tvirt\n\t"					\
+		".ifeq 0-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 0\n\t"				\
+		".endif\n\t"						\
+		".ifeq 1-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 1\n\t"				\
+		".endif\n\t"						\
+		".ifeq 2-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 2\n\t"				\
+		".endif\n\t"						\
+		".ifeq 3-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 3\n\t"				\
+		".endif\n\t"						\
+		".ifeq 4-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 4\n\t"				\
+		".endif\n\t"						\
+		".ifeq 5-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 5\n\t"				\
+		".endif\n\t"						\
+		".ifeq 6-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 6\n\t"				\
+		".endif\n\t"						\
+		".ifeq 7-" #sel "\n\t"					\
+		"mfgc0\t%0, " #source ", 7\n\t"				\
+		".endif\n\t"						\
+		".set\tmips0"						\
+		: "=r" (__res));					\
+	__res;								\
+})
+
+#define __read_64bit_gc0_register(source, sel)				\
+	({ unsigned long long __res;					\
+	__asm__ __volatile__(						\
+		".set mips64r2\n\t"					\
+		".set\tvirt\n\t"					\
+		".ifeq 0-" #sel "\n\t"					\
+		"dmfgc0\t%0, " #source ", 0\n\t"			\
+		".endif\n\t"						\
+		".ifeq 1-" #sel "\n\t"					\
+		"dmfgc0\t%0, " #source ", 1\n\t"			\
+		".endif\n\t"						\
+		".ifeq 2-" #sel "\n\t"					\
+		"dmfgc0\t%0, " #source ", 2\n\t"			\
+		".endif\n\t"						\
+		".ifeq 3-" #sel "\n\t"					\
+		"dmfgc0\t%0, " #source ", 3\n\t"			\
+		".endif\n\t"						\
+		".ifeq 4-" #sel "\n\t"					\
+		"dmfgc0\t%0, " #source ", 

[PATCH 15/31] mips/kvm: Exception handling to leave and reenter guest mode.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Currently this is a little complex; here are the facts about how it works:

o When running in Guest mode we set the high bit of CP0_XCONTEXT.  If
  this bit is clear, we don't do anything special on an exception.

o If we are in guest mode, upon an exception we:

  1) load the stack pointer from the mips_kvm_rootsp array instead of
 kernelsp.

  2) Clear GuestCtl[GM] and high bit of CP0_XCONTEXT.

  3) Restore host ASID and PGD pointer.

o Upon restarting from an exception we test the task TIF_GUESTMODE
  flag if it is clear, nothing special is done.

o If Guest mode is active for the thread we:

  1) Compare the stack pointer to mips_kvm_rootsp, if it doesn't match
 we are not reentering guest mode, so no more special processing
 is done.

  2) If reentering guest mode:

  2a) Set high bit of CP0_XCONTEXT and GuestCtl[GM].

  2b) Set Guest mode ASID and PGD pointer.

This allows a single set of exception handlers to be used for both
host and guest mode operation.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/include/asm/stackframe.h | 135 -
 1 file changed, 132 insertions(+), 3 deletions(-)

diff --git a/arch/mips/include/asm/stackframe.h 
b/arch/mips/include/asm/stackframe.h
index 20627b2..bf2ec48 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -17,6 +17,7 @@
 #include <asm/asmmacro.h>
 #include <asm/mipsregs.h>
 #include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
 
 /*
  * For SMTC kernel, global IE should be left set, and interrupts
@@ -98,7 +99,9 @@
 #define CPU_ID_REG CP0_CONTEXT
 #define CPU_ID_MFC0 MFC0
 #endif
-   .macro  get_saved_sp/* SMP variation */
+#define CPU_ID_MASK ((1 << 13) - 1)
+
+   .macro  get_saved_sp_for_save_some  /* SMP variation */
CPU_ID_MFC0 k0, CPU_ID_REG
 #if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32)
lui k1, %hi(kernelsp)
@@ -110,15 +113,49 @@
 	dsll	k1, 16
 #endif
 	LONG_SRL	k0, PTEBASE_SHIFT
+#ifdef CONFIG_KVM_MIPSVZ
+	andi	k0, CPU_ID_MASK	/* high bits indicate guest mode. */
+#endif
LONG_ADDU   k1, k0
LONG_L  k1, %lo(kernelsp)(k1)
.endm
 
+   .macro get_saved_sp
+   CPU_ID_MFC0 k0, CPU_ID_REG
+   get_saved_sp_for_save_some
+   .endm
+
+   .macro  get_mips_kvm_rootsp /* SMP variation */
+#if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32)
+   lui k1, %hi(mips_kvm_rootsp)
+#else
+	lui	k1, %highest(mips_kvm_rootsp)
+	daddiu	k1, %higher(mips_kvm_rootsp)
+	dsll	k1, 16
+	daddiu	k1, %hi(mips_kvm_rootsp)
+	dsll	k1, 16
+#endif
+	LONG_SRL	k0, PTEBASE_SHIFT
+	andi	k0, CPU_ID_MASK	/* high bits indicate guest mode. */
+	LONG_ADDU	k1, k0
+	LONG_L	k1, %lo(mips_kvm_rootsp)(k1)
+   .endm
+
.macro  set_saved_sp stackp temp temp2
CPU_ID_MFC0 \temp, CPU_ID_REG
 	LONG_SRL	\temp, PTEBASE_SHIFT
+#ifdef CONFIG_KVM_MIPSVZ
+	andi	k0, CPU_ID_MASK	/* high bits indicate guest mode. */
+#endif
LONG_S  \stackp, kernelsp(\temp)
.endm
+
+   .macro  set_mips_kvm_rootsp stackp temp
+   CPU_ID_MFC0 \temp, CPU_ID_REG
+	LONG_SRL	\temp, PTEBASE_SHIFT
+	andi	k0, CPU_ID_MASK	/* high bits indicate guest mode. */
+   LONG_S  \stackp, mips_kvm_rootsp(\temp)
+   .endm
 #else
 	.macro	get_saved_sp	/* Uniprocessor variation */
 #ifdef CONFIG_CPU_JUMP_WORKAROUNDS
@@ -152,9 +189,27 @@
LONG_L  k1, %lo(kernelsp)(k1)
.endm
 
+   .macro  get_mips_kvm_rootsp /* Uniprocessor variation */
+#if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32)
+   lui k1, %hi(mips_kvm_rootsp)
+#else
+   lui k1, %highest(mips_kvm_rootsp)
+   daddiu  k1, %higher(mips_kvm_rootsp)
+	dsll	k1, k1, 16
+	daddiu	k1, %hi(mips_kvm_rootsp)
+	dsll	k1, k1, 16
+#endif
+   LONG_L  k1, %lo(mips_kvm_rootsp)(k1)
+   .endm
+
+
.macro  set_saved_sp stackp temp temp2
LONG_S  \stackp, kernelsp
.endm
+
+   .macro  set_mips_kvm_rootsp stackp temp
+   LONG_S  \stackp, mips_kvm_rootsp
+   .endm
 #endif
 
.macro  SAVE_SOME
@@ -164,11 +219,21 @@
 	mfc0	k0, CP0_STATUS
 	sll	k0, 3		/* extract cu0 bit */
 	.set	noreorder
+#ifdef CONFIG_KVM_MIPSVZ
+  

[PATCH 16/31] mips/kvm: Add exception handler for MIPSVZ Guest exceptions.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kernel/genex.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index 163e299..ce0be96 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -486,6 +486,9 @@ NESTED(nmi_handler, PT_SIZE, sp)
BUILD_HANDLER mcheck mcheck cli verbose /* #24 */
BUILD_HANDLER mt mt sti silent  /* #25 */
BUILD_HANDLER dsp dsp sti silent/* #26 */
+#ifdef CONFIG_KVM_MIPSVZ
+   BUILD_HANDLER hypervisor hypervisor cli silent  /* #27 */
+#endif
BUILD_HANDLER reserved reserved sti verbose /* others */
 
.align  5
-- 
1.7.11.7



[PATCH 07/31] mips/kvm: Rename VCPU_registername to KVM_VCPU_ARCH_registername

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

This makes it follow the pattern where the structure name is the
symbol name prefix.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kernel/asm-offsets.c |  68 +++---
 arch/mips/kvm/kvm_locore.S | 206 -
 2 files changed, 137 insertions(+), 137 deletions(-)

diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 22bf8f5..a0aa12c 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -351,40 +351,40 @@ void output_kvm_defines(void)
 
OFFSET(VCPU_GUEST_INST, kvm_vcpu_arch, guest_inst);
 
-   OFFSET(VCPU_R0, kvm_vcpu_arch, gprs[0]);
-   OFFSET(VCPU_R1, kvm_vcpu_arch, gprs[1]);
-   OFFSET(VCPU_R2, kvm_vcpu_arch, gprs[2]);
-   OFFSET(VCPU_R3, kvm_vcpu_arch, gprs[3]);
-   OFFSET(VCPU_R4, kvm_vcpu_arch, gprs[4]);
-   OFFSET(VCPU_R5, kvm_vcpu_arch, gprs[5]);
-   OFFSET(VCPU_R6, kvm_vcpu_arch, gprs[6]);
-   OFFSET(VCPU_R7, kvm_vcpu_arch, gprs[7]);
-   OFFSET(VCPU_R8, kvm_vcpu_arch, gprs[8]);
-   OFFSET(VCPU_R9, kvm_vcpu_arch, gprs[9]);
-   OFFSET(VCPU_R10, kvm_vcpu_arch, gprs[10]);
-   OFFSET(VCPU_R11, kvm_vcpu_arch, gprs[11]);
-   OFFSET(VCPU_R12, kvm_vcpu_arch, gprs[12]);
-   OFFSET(VCPU_R13, kvm_vcpu_arch, gprs[13]);
-   OFFSET(VCPU_R14, kvm_vcpu_arch, gprs[14]);
-   OFFSET(VCPU_R15, kvm_vcpu_arch, gprs[15]);
-   OFFSET(VCPU_R16, kvm_vcpu_arch, gprs[16]);
-   OFFSET(VCPU_R17, kvm_vcpu_arch, gprs[17]);
-   OFFSET(VCPU_R18, kvm_vcpu_arch, gprs[18]);
-   OFFSET(VCPU_R19, kvm_vcpu_arch, gprs[19]);
-   OFFSET(VCPU_R20, kvm_vcpu_arch, gprs[20]);
-   OFFSET(VCPU_R21, kvm_vcpu_arch, gprs[21]);
-   OFFSET(VCPU_R22, kvm_vcpu_arch, gprs[22]);
-   OFFSET(VCPU_R23, kvm_vcpu_arch, gprs[23]);
-   OFFSET(VCPU_R24, kvm_vcpu_arch, gprs[24]);
-   OFFSET(VCPU_R25, kvm_vcpu_arch, gprs[25]);
-   OFFSET(VCPU_R26, kvm_vcpu_arch, gprs[26]);
-   OFFSET(VCPU_R27, kvm_vcpu_arch, gprs[27]);
-   OFFSET(VCPU_R28, kvm_vcpu_arch, gprs[28]);
-   OFFSET(VCPU_R29, kvm_vcpu_arch, gprs[29]);
-   OFFSET(VCPU_R30, kvm_vcpu_arch, gprs[30]);
-   OFFSET(VCPU_R31, kvm_vcpu_arch, gprs[31]);
-   OFFSET(VCPU_LO, kvm_vcpu_arch, lo);
-   OFFSET(VCPU_HI, kvm_vcpu_arch, hi);
+   OFFSET(KVM_VCPU_ARCH_R0, kvm_vcpu_arch, gprs[0]);
+   OFFSET(KVM_VCPU_ARCH_R1, kvm_vcpu_arch, gprs[1]);
+   OFFSET(KVM_VCPU_ARCH_R2, kvm_vcpu_arch, gprs[2]);
+   OFFSET(KVM_VCPU_ARCH_R3, kvm_vcpu_arch, gprs[3]);
+   OFFSET(KVM_VCPU_ARCH_R4, kvm_vcpu_arch, gprs[4]);
+   OFFSET(KVM_VCPU_ARCH_R5, kvm_vcpu_arch, gprs[5]);
+   OFFSET(KVM_VCPU_ARCH_R6, kvm_vcpu_arch, gprs[6]);
+   OFFSET(KVM_VCPU_ARCH_R7, kvm_vcpu_arch, gprs[7]);
+   OFFSET(KVM_VCPU_ARCH_R8, kvm_vcpu_arch, gprs[8]);
+   OFFSET(KVM_VCPU_ARCH_R9, kvm_vcpu_arch, gprs[9]);
+   OFFSET(KVM_VCPU_ARCH_R10, kvm_vcpu_arch, gprs[10]);
+   OFFSET(KVM_VCPU_ARCH_R11, kvm_vcpu_arch, gprs[11]);
+   OFFSET(KVM_VCPU_ARCH_R12, kvm_vcpu_arch, gprs[12]);
+   OFFSET(KVM_VCPU_ARCH_R13, kvm_vcpu_arch, gprs[13]);
+   OFFSET(KVM_VCPU_ARCH_R14, kvm_vcpu_arch, gprs[14]);
+   OFFSET(KVM_VCPU_ARCH_R15, kvm_vcpu_arch, gprs[15]);
+   OFFSET(KVM_VCPU_ARCH_R16, kvm_vcpu_arch, gprs[16]);
+   OFFSET(KVM_VCPU_ARCH_R17, kvm_vcpu_arch, gprs[17]);
+   OFFSET(KVM_VCPU_ARCH_R18, kvm_vcpu_arch, gprs[18]);
+   OFFSET(KVM_VCPU_ARCH_R19, kvm_vcpu_arch, gprs[19]);
+   OFFSET(KVM_VCPU_ARCH_R20, kvm_vcpu_arch, gprs[20]);
+   OFFSET(KVM_VCPU_ARCH_R21, kvm_vcpu_arch, gprs[21]);
+   OFFSET(KVM_VCPU_ARCH_R22, kvm_vcpu_arch, gprs[22]);
+   OFFSET(KVM_VCPU_ARCH_R23, kvm_vcpu_arch, gprs[23]);
+   OFFSET(KVM_VCPU_ARCH_R24, kvm_vcpu_arch, gprs[24]);
+   OFFSET(KVM_VCPU_ARCH_R25, kvm_vcpu_arch, gprs[25]);
+   OFFSET(KVM_VCPU_ARCH_R26, kvm_vcpu_arch, gprs[26]);
+   OFFSET(KVM_VCPU_ARCH_R27, kvm_vcpu_arch, gprs[27]);
+   OFFSET(KVM_VCPU_ARCH_R28, kvm_vcpu_arch, gprs[28]);
+   OFFSET(KVM_VCPU_ARCH_R29, kvm_vcpu_arch, gprs[29]);
+   OFFSET(KVM_VCPU_ARCH_R30, kvm_vcpu_arch, gprs[30]);
+   OFFSET(KVM_VCPU_ARCH_R31, kvm_vcpu_arch, gprs[31]);
+   OFFSET(KVM_VCPU_ARCH_LO, kvm_vcpu_arch, lo);
+   OFFSET(KVM_VCPU_ARCH_HI, kvm_vcpu_arch, hi);
OFFSET(KVM_VCPU_ARCH_EPC, kvm_vcpu_arch, epc);
OFFSET(VCPU_COP0, kvm_vcpu_arch, cop0);
OFFSET(VCPU_GUEST_KERNEL_ASID, kvm_vcpu_arch, guest_kernel_asid);
diff --git a/arch/mips/kvm/kvm_locore.S b/arch/mips/kvm/kvm_locore.S
index a434bbe..7a33ee7 100644
--- a/arch/mips/kvm/kvm_locore.S
+++ b/arch/mips/kvm/kvm_locore.S
@@ -175,52 +175,52 @@ FEXPORT(__kvm_mips_load_asid)
 mtc0    zero,  CP0_HWRENA
 
 /* Now load up the Guest Context from VCPU */
-LONG_L $1, VCPU_R1(k1)
-LONG_L $2, VCPU_R2(k1)
-LONG_L $3, VCPU_R3(k1)
-
-LONG_L 

[PATCH 10/31] mips/kvm: Implement ioctls to get and set FPU registers.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The current implementation does nothing with them, but future MIPSVZ
work needs them.  Also add the asm-offsets accessors for the fields.

Signed-off-by: David Daney david.da...@cavium.com
---
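
A note for readers new to this interface: once these ioctls return 0
instead of -ENOIOCTLCMD, the state becomes reachable from userspace via
the standard KVM_GET_FPU/KVM_SET_FPU vcpu ioctls. A minimal sketch (the
vcpu_fd descriptor and the FCSR tweak are illustrative only, not part
of this patch):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Read the guest FPU state, flip an FCSR bit, write it back.
     * vcpu_fd is assumed to be an already-open KVM vcpu fd. */
    static int poke_guest_fcsr(int vcpu_fd)
    {
            struct kvm_fpu fpu;

            if (ioctl(vcpu_fd, KVM_GET_FPU, &fpu) < 0)
                    return -1;
            fpu.fcsr ^= 0x3;        /* illustrative: toggle rounding mode bits */
            return ioctl(vcpu_fd, KVM_SET_FPU, &fpu);
    }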
 arch/mips/include/asm/kvm_host.h |  8 
 arch/mips/kernel/asm-offsets.c   |  8 
 arch/mips/kvm/kvm_mips.c | 26 --
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 16013c7..505b804 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -102,6 +102,14 @@ struct kvm_vcpu_arch {
unsigned long lo;
unsigned long epc;
 
+   /* FPU state */
+   u64 fpr[32];
+   u32 fir;
+   u32 fccr;
+   u32 fexr;
+   u32 fenr;
+   u32 fcsr;
+
void *impl;
 };
 
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 5a9222e..03bf363 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -377,6 +377,14 @@ void output_kvm_defines(void)
OFFSET(KVM_VCPU_ARCH_HI, kvm_vcpu, arch.hi);
OFFSET(KVM_VCPU_ARCH_EPC, kvm_vcpu, arch.epc);
OFFSET(KVM_VCPU_ARCH_IMPL, kvm_vcpu, arch.impl);
+   BLANK();
+   OFFSET(KVM_VCPU_ARCH_FPR,   kvm_vcpu, arch.fpr);
+   OFFSET(KVM_VCPU_ARCH_FIR,   kvm_vcpu, arch.fir);
+   OFFSET(KVM_VCPU_ARCH_FCCR,  kvm_vcpu, arch.fccr);
+   OFFSET(KVM_VCPU_ARCH_FEXR,  kvm_vcpu, arch.fexr);
+   OFFSET(KVM_VCPU_ARCH_FENR,  kvm_vcpu, arch.fenr);
+   OFFSET(KVM_VCPU_ARCH_FCSR,  kvm_vcpu, arch.fcsr);
+   BLANK();
 
OFFSET(KVM_MIPS_VCPU_TE_HOST_EBASE, kvm_mips_vcpu_te, host_ebase);
OFFSET(KVM_MIPS_VCPU_TE_GUEST_EBASE, kvm_mips_vcpu_te, guest_ebase);
diff --git a/arch/mips/kvm/kvm_mips.c b/arch/mips/kvm/kvm_mips.c
index 041caad..18c8dc8 100644
--- a/arch/mips/kvm/kvm_mips.c
+++ b/arch/mips/kvm/kvm_mips.c
@@ -465,12 +465,34 @@ int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   return -ENOIOCTLCMD;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++)
+   fpu->fpr[i] = vcpu->arch.fpr[i];
+
+   fpu->fir = vcpu->arch.fir;
+   fpu->fccr = vcpu->arch.fccr;
+   fpu->fexr = vcpu->arch.fexr;
+   fpu->fenr = vcpu->arch.fenr;
+   fpu->fcsr = vcpu->arch.fcsr;
+
+   return 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   return -ENOIOCTLCMD;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++)
+   vcpu->arch.fpr[i] = fpu->fpr[i];
+
+   vcpu->arch.fir = fpu->fir;
+   vcpu->arch.fccr = fpu->fccr;
+   vcpu->arch.fexr = fpu->fexr;
+   vcpu->arch.fenr = fpu->fenr;
+   vcpu->arch.fcsr = fpu->fcsr;
+
+   return 0;
 }
 
 int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
-- 
1.7.11.7



[PATCH 08/31] mips/kvm: Fix code formatting in arch/mips/kvm/kvm_locore.S

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

It was a completely inconsistent mix of spaces and tabs.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kvm/kvm_locore.S | 921 +++--
 1 file changed, 464 insertions(+), 457 deletions(-)

diff --git a/arch/mips/kvm/kvm_locore.S b/arch/mips/kvm/kvm_locore.S
index 7a33ee7..7c2933a 100644
--- a/arch/mips/kvm/kvm_locore.S
+++ b/arch/mips/kvm/kvm_locore.S
@@ -1,13 +1,13 @@
 /*
-* This file is subject to the terms and conditions of the GNU General Public
-* License.  See the file COPYING in the main directory of this archive
-* for more details.
-*
-* Main entry point for the guest, exception handling.
-*
-* Copyright (C) 2012  MIPS Technologies, Inc.  All rights reserved.
-* Authors: Sanjay Lal sanj...@kymasys.com
-*/
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file COPYING in the main directory of this archive
+ * for more details.
+ *
+ * Main entry point for the guest, exception handling.
+ *
+ * Copyright (C) 2012  MIPS Technologies, Inc.  All rights reserved.
+ * Authors: Sanjay Lal sanj...@kymasys.com
+ */
 
 #include <asm/asm.h>
 #include <asm/asmmacro.h>
@@ -57,172 +57,177 @@
  */
 
 FEXPORT(__kvm_mips_vcpu_run)
-.set    push
-.set    noreorder
-.set    noat
-
-/* k0/k1 not being used in host kernel context */
-   addiu   k1,sp, -PT_SIZE
-LONG_S $0, PT_R0(k1)
-LONG_S $1, PT_R1(k1)
-LONG_S $2, PT_R2(k1)
-LONG_S $3, PT_R3(k1)
-
-LONG_S $4, PT_R4(k1)
-LONG_S $5, PT_R5(k1)
-LONG_S $6, PT_R6(k1)
-LONG_S $7, PT_R7(k1)
-
-LONG_S $8,  PT_R8(k1)
-LONG_S $9,  PT_R9(k1)
-LONG_S $10, PT_R10(k1)
-LONG_S $11, PT_R11(k1)
-LONG_S $12, PT_R12(k1)
-LONG_S $13, PT_R13(k1)
-LONG_S $14, PT_R14(k1)
-LONG_S $15, PT_R15(k1)
-LONG_S $16, PT_R16(k1)
-LONG_S $17, PT_R17(k1)
-
-LONG_S $18, PT_R18(k1)
-LONG_S $19, PT_R19(k1)
-LONG_S $20, PT_R20(k1)
-LONG_S $21, PT_R21(k1)
-LONG_S $22, PT_R22(k1)
-LONG_S $23, PT_R23(k1)
-LONG_S $24, PT_R24(k1)
-LONG_S $25, PT_R25(k1)
+   .set    push
+   .set    noreorder
+   .set    noat
+
+   /* k0/k1 not being used in host kernel context */
+   addiu   k1, sp, -PT_SIZE
+   LONG_S  $0, PT_R0(k1)
+   LONG_S  $1, PT_R1(k1)
+   LONG_S  $2, PT_R2(k1)
+   LONG_S  $3, PT_R3(k1)
+
+   LONG_S  $4, PT_R4(k1)
+   LONG_S  $5, PT_R5(k1)
+   LONG_S  $6, PT_R6(k1)
+   LONG_S  $7, PT_R7(k1)
+
+   LONG_S  $8,  PT_R8(k1)
+   LONG_S  $9,  PT_R9(k1)
+   LONG_S  $10, PT_R10(k1)
+   LONG_S  $11, PT_R11(k1)
+   LONG_S  $12, PT_R12(k1)
+   LONG_S  $13, PT_R13(k1)
+   LONG_S  $14, PT_R14(k1)
+   LONG_S  $15, PT_R15(k1)
+   LONG_S  $16, PT_R16(k1)
+   LONG_S  $17, PT_R17(k1)
+
+   LONG_S  $18, PT_R18(k1)
+   LONG_S  $19, PT_R19(k1)
+   LONG_S  $20, PT_R20(k1)
+   LONG_S  $21, PT_R21(k1)
+   LONG_S  $22, PT_R22(k1)
+   LONG_S  $23, PT_R23(k1)
+   LONG_S  $24, PT_R24(k1)
+   LONG_S  $25, PT_R25(k1)
 
	/* XXXKYMA k0/k1 not saved, not being used if we got here through an ioctl() */
 
-LONG_S $28, PT_R28(k1)
-LONG_S $29, PT_R29(k1)
-LONG_S $30, PT_R30(k1)
-LONG_S $31, PT_R31(k1)
+   LONG_S  $28, PT_R28(k1)
+   LONG_S  $29, PT_R29(k1)
+   LONG_S  $30, PT_R30(k1)
+   LONG_S  $31, PT_R31(k1)
 
-/* Save hi/lo */
-   mflo    v0
-   LONG_S  v0, PT_LO(k1)
-   mfhi    v1
-   LONG_S  v1, PT_HI(k1)
+   /* Save hi/lo */
+   mflo    v0
+   LONG_S  v0, PT_LO(k1)
+   mfhi    v1
+   LONG_S  v1, PT_HI(k1)
 
/* Save host status */
-   mfc0    v0, CP0_STATUS
-   LONG_S  v0, PT_STATUS(k1)
+   mfc0    v0, CP0_STATUS
+   LONG_S  v0, PT_STATUS(k1)
 
/* Save host ASID, shove it into the BVADDR location */
-   mfc0    v1,CP0_ENTRYHI
-   andi    v1, 0xff
-   LONG_S  v1, PT_HOST_ASID(k1)
+   mfc0    v1, CP0_ENTRYHI
+   andi    v1, 0xff
+   LONG_S  v1, PT_HOST_ASID(k1)
 
-/* Save DDATA_LO, will be used to store pointer to vcpu */
-mfc0    v1, CP0_DDATA_LO
-LONG_S  v1, PT_HOST_USERLOCAL(k1)
+   mfc0    v1, CP0_DDATA_LO
+   LONG_S  v1, PT_HOST_USERLOCAL(k1)
 
-/* DDATA_LO has pointer to vcpu */
-mtc0    a1,CP0_DDATA_LO
+   /* DDATA_LO has pointer to vcpu */
+   mtc0    a1, CP0_DDATA_LO
 

[PATCH 06/31] mips/kvm: Rename kvm_vcpu_arch.pc to kvm_vcpu_arch.epc

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

The proper MIPS name for this register is EPC, so use that.

Change the asm-offsets name to KVM_VCPU_ARCH_EPC, so that the symbol
name prefix matches the structure name.

Signed-off-by: David Daney david.da...@cavium.com
---
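
A note on what this means for userspace: the KVM_REG_MIPS_PC id is
unchanged, so existing ONE_REG users keep working; only the backing
field is renamed. A hedged sketch of that path (the vcpu_fd descriptor
and helper name are hypothetical):

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <sys/ioctl.h>

    /* Fetch the guest program counter (now stored in arch.epc)
     * through the ONE_REG interface shown in kvm_mips_get_reg(). */
    static int read_guest_pc(int vcpu_fd, uint64_t *pc)
    {
            struct kvm_one_reg reg = {
                    .id   = KVM_REG_MIPS_PC,
                    .addr = (uintptr_t)pc,
            };

            return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
    }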
 arch/mips/include/asm/kvm_host.h |   2 +-
 arch/mips/kernel/asm-offsets.c   |   2 +-
 arch/mips/kvm/kvm_locore.S   |   6 +-
 arch/mips/kvm/kvm_mips.c |  12 ++--
 arch/mips/kvm/kvm_mips_emul.c| 140 +++
 arch/mips/kvm/kvm_mips_int.c |   8 +--
 arch/mips/kvm/kvm_trap_emul.c|  20 +++---
 7 files changed, 95 insertions(+), 95 deletions(-)

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 4d6fa0b..d9ee320 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -363,7 +363,7 @@ struct kvm_vcpu_arch {
unsigned long gprs[32];
unsigned long hi;
unsigned long lo;
-   unsigned long pc;
+   unsigned long epc;
 
/* FPU State */
struct mips_fpu_struct fpu;
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 0845091..22bf8f5 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -385,7 +385,7 @@ void output_kvm_defines(void)
OFFSET(VCPU_R31, kvm_vcpu_arch, gprs[31]);
OFFSET(VCPU_LO, kvm_vcpu_arch, lo);
OFFSET(VCPU_HI, kvm_vcpu_arch, hi);
-   OFFSET(VCPU_PC, kvm_vcpu_arch, pc);
+   OFFSET(KVM_VCPU_ARCH_EPC, kvm_vcpu_arch, epc);
OFFSET(VCPU_COP0, kvm_vcpu_arch, cop0);
OFFSET(VCPU_GUEST_KERNEL_ASID, kvm_vcpu_arch, guest_kernel_asid);
OFFSET(VCPU_GUEST_USER_ASID, kvm_vcpu_arch, guest_user_asid);
diff --git a/arch/mips/kvm/kvm_locore.S b/arch/mips/kvm/kvm_locore.S
index e86fa2a..a434bbe 100644
--- a/arch/mips/kvm/kvm_locore.S
+++ b/arch/mips/kvm/kvm_locore.S
@@ -151,7 +151,7 @@ FEXPORT(__kvm_mips_vcpu_run)
 
 
/* Set Guest EPC */
-   LONG_L  t0, VCPU_PC(k1)
+   LONG_L  t0, KVM_VCPU_ARCH_EPC(k1)
 mtc0    t0, CP0_EPC
 
 FEXPORT(__kvm_mips_load_asid)
@@ -330,7 +330,7 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
 
 /* Save Host level EPC, BadVaddr and Cause to VCPU, useful to process the exception */
 mfc0    k0,CP0_EPC
-LONG_S  k0, VCPU_PC(k1)
+LONG_S  k0, KVM_VCPU_ARCH_EPC(k1)
 
 mfc0    k0, CP0_BADVADDR
 LONG_S  k0, VCPU_HOST_CP0_BADVADDR(k1)
@@ -438,7 +438,7 @@ __kvm_mips_return_to_guest:
 
 
/* Set Guest EPC */
-   LONG_L  t0, VCPU_PC(k1)
+   LONG_L  t0, KVM_VCPU_ARCH_EPC(k1)
 mtc0    t0, CP0_EPC
 
 /* Set the ASID for the Guest Kernel */
diff --git a/arch/mips/kvm/kvm_mips.c b/arch/mips/kvm/kvm_mips.c
index 6018e2a..4ac5ab4 100644
--- a/arch/mips/kvm/kvm_mips.c
+++ b/arch/mips/kvm/kvm_mips.c
@@ -583,7 +583,7 @@ static int kvm_mips_get_reg(struct kvm_vcpu *vcpu,
v = (long)vcpu->arch.lo;
break;
case KVM_REG_MIPS_PC:
-   v = (long)vcpu->arch.pc;
+   v = (long)vcpu->arch.epc;
break;
 
case KVM_REG_MIPS_CP0_INDEX:
@@ -658,7 +658,7 @@ static int kvm_mips_set_reg(struct kvm_vcpu *vcpu,
vcpu->arch.lo = v;
break;
case KVM_REG_MIPS_PC:
-   vcpu->arch.pc = v;
+   vcpu->arch.epc = v;
break;
 
case KVM_REG_MIPS_CP0_INDEX:
@@ -890,7 +890,7 @@ int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu)
return -1;
 
printk("VCPU Register Dump:\n");
-   printk("\tpc = 0x%08lx\n", vcpu->arch.pc);;
+   printk("\tepc = 0x%08lx\n", vcpu->arch.epc);;
printk("\texceptions: %08lx\n", vcpu->arch.pending_exceptions);
 
for (i = 0; i < 32; i += 4) {
@@ -920,7 +920,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
vcpu->arch.gprs[0] = 0; /* zero is special, and cannot be set. */
vcpu->arch.hi = regs->hi;
vcpu->arch.lo = regs->lo;
-   vcpu->arch.pc = regs->pc;
+   vcpu->arch.epc = regs->pc;
 
return 0;
 }
@@ -934,7 +934,7 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 
regs->hi = vcpu->arch.hi;
regs->lo = vcpu->arch.lo;
-   regs->pc = vcpu->arch.pc;
+   regs->pc = vcpu->arch.epc;
 
return 0;
 }
@@ -1014,7 +1014,7 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
uint32_t cause = vcpu->arch.host_cp0_cause;
uint32_t exccode = (cause >> CAUSEB_EXCCODE) & 0x1f;
-   uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc;
+   uint32_t __user *opc = (uint32_t __user *) vcpu->arch.epc;
unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr;
enum emulation_result er = EMULATE_DONE;
int ret = RESUME_GUEST;
diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index 

[PATCH 04/31] mips/kvm: Add casts to avoid pointer width mismatch build failures.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

When building for 64-bits we need these casts to make it build.

Signed-off-by: David Daney david.da...@cavium.com
---
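
The underlying failure mode, reduced to a standalone example (nothing
below is from the patch itself): on a 64-bit target, converting a
32-bit integer directly to a pointer, or printing a 64-bit pointer
difference with a 32-bit format, draws warnings (hard errors under
-Werror). Widening through (long) resolves both:

    /* Illustrative only; a 64-bit mips gcc shows the warnings the
     * casts below silence. */
    #include <stdio.h>

    int main(void)
    {
            unsigned int cp0_val = 0x80000000u; /* stand-in for a CP0 read */
            char buf[16];

            /* (void *)cp0_val: cast to pointer from integer of
             * different size. Going through (long) first widens (and
             * sign-extends) before the pointer conversion. */
            void *p = (void *)(long)(int)cp0_val;

            /* Pointer differences are ptrdiff_t (64-bit here); narrow
             * explicitly when a 32-bit format specifier is used. */
            printf("%p, %#x bytes\n", p, (unsigned)(&buf[8] - &buf[0]));
            return 0;
    }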
 arch/mips/kvm/kvm_mips.c  | 4 ++--
 arch/mips/kvm/kvm_mips_dyntrans.c | 4 ++--
 arch/mips/kvm/kvm_mips_emul.c | 2 +-
 arch/mips/kvm/kvm_tlb.c   | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/mips/kvm/kvm_mips.c b/arch/mips/kvm/kvm_mips.c
index d934b01..6018e2a 100644
--- a/arch/mips/kvm/kvm_mips.c
+++ b/arch/mips/kvm/kvm_mips.c
@@ -303,7 +303,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, 
unsigned int id)
}
 
/* Save Linux EBASE */
-   vcpu->arch.host_ebase = (void *)read_c0_ebase();
+   vcpu->arch.host_ebase = (void *)(long)(read_c0_ebase() & 0x3ff);
 
gebase = kzalloc(ALIGN(size, PAGE_SIZE), GFP_KERNEL);
 
@@ -339,7 +339,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, 
unsigned int id)
offset = 0x2000;
kvm_info("Installing KVM Exception handlers @ %p, %#x bytes\n",
 gebase + offset,
-mips32_GuestExceptionEnd - mips32_GuestException);
+(unsigned)(mips32_GuestExceptionEnd - mips32_GuestException));
 
memcpy(gebase + offset, mips32_GuestException,
   mips32_GuestExceptionEnd - mips32_GuestException);
diff --git a/arch/mips/kvm/kvm_mips_dyntrans.c b/arch/mips/kvm/kvm_mips_dyntrans.c
index 96528e2..dd0b8f9 100644
--- a/arch/mips/kvm/kvm_mips_dyntrans.c
+++ b/arch/mips/kvm/kvm_mips_dyntrans.c
@@ -94,7 +94,7 @@ kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
  cop0);
}
 
-   if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) {
+   if (KVM_GUEST_KSEGX((unsigned long)opc) == KVM_GUEST_KSEG0) {
kseg0_opc =
CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa
   (vcpu, (unsigned long) opc));
@@ -129,7 +129,7 @@ kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
offsetof(struct mips_coproc,
 reg[rd][sel]) + offsetof(struct kvm_mips_commpage, cop0);
 
-   if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) {
+   if (KVM_GUEST_KSEGX((unsigned long)opc) == KVM_GUEST_KSEG0) {
kseg0_opc =
CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa
   (vcpu, (unsigned long) opc));
diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index 4b6274b..af9a661 100644
--- a/arch/mips/kvm/kvm_mips_emul.c
+++ b/arch/mips/kvm/kvm_mips_emul.c
@@ -892,7 +892,7 @@ int kvm_mips_sync_icache(unsigned long va, struct kvm_vcpu *vcpu)
pfn = kvm->arch.guest_pmap[gfn];
pa = (pfn << PAGE_SHIFT) | offset;
 
-   printk("%s: va: %#lx, unmapped: %#x\n", __func__, va, CKSEG0ADDR(pa));
+   printk("%s: va: %#lx, unmapped: %#lx\n", __func__, va, CKSEG0ADDR(pa));
 
mips32_SyncICache(CKSEG0ADDR(pa), 32);
return 0;
diff --git a/arch/mips/kvm/kvm_tlb.c b/arch/mips/kvm/kvm_tlb.c
index c777dd3..5e189be 100644
--- a/arch/mips/kvm/kvm_tlb.c
+++ b/arch/mips/kvm/kvm_tlb.c
@@ -353,7 +353,7 @@ int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
unsigned long entrylo0 = 0, entrylo1 = 0;
 
 
-   pfn0 = CPHYSADDR(vcpu->arch.kseg0_commpage) >> PAGE_SHIFT;
+   pfn0 = CPHYSADDR((unsigned long)vcpu->arch.kseg0_commpage) >> PAGE_SHIFT;
pfn1 = 0;
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) | (0x1 << 1);
@@ -916,7 +916,7 @@ uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu)
inst = *(opc);
}
local_irq_restore(flags);
-   } else if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) {
+   } else if (KVM_GUEST_KSEGX((unsigned long)opc) == KVM_GUEST_KSEG0) {
paddr =
kvm_mips_translate_guest_kseg0_to_hpa(vcpu,
 (unsigned long) opc);
-- 
1.7.11.7



[PATCH 02/31] MIPS: Save and restore K0/K1 when CONFIG_KVM_MIPSVZ

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

We cannot clobber any registers on exceptions as any guest will need
them all.

Signed-off-by: David Daney david.da...@cavium.com
---
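
One constant worth decoding before reading the diff: the 0xc
initializer added to kscratch_used_mask below marks KScratch selects 2
and 3 — the CP0_KSCRATCH1/CP0_KSCRATCH2 registers the entry code
claims — as permanently reserved. A sketch of the equivalent bit math,
assuming the mask is indexed by select number:

    /* Bits 2 and 3 of the mask correspond to $31 sel 2 and $31 sel 3,
     * the two KScratch registers used to save k0/k1 on exception
     * entry: (1u << 2) | (1u << 3) == 0xc. */
    #define KSCRATCH1_SEL 2   /* CP0_KSCRATCH1 is $31, 2 */
    #define KSCRATCH2_SEL 3   /* CP0_KSCRATCH2 is $31, 3 */

    static const unsigned int mipsvz_static_kscratch_mask =
            (1u << KSCRATCH1_SEL) | (1u << KSCRATCH2_SEL);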
 arch/mips/include/asm/mipsregs.h   |  2 ++
 arch/mips/include/asm/stackframe.h | 15 +++
 arch/mips/kernel/cpu-probe.c   |  7 ++-
 arch/mips/kernel/genex.S   |  5 +
 arch/mips/kernel/scall64-64.S  | 12 
 arch/mips/kernel/scall64-n32.S | 12 
 arch/mips/kernel/traps.c   |  5 +
 arch/mips/mm/tlbex.c   | 25 +
 8 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 6e0da5aa..6f03c72 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -73,6 +73,8 @@
 #define CP0_TAGHI $29
 #define CP0_ERROREPC $30
 #define CP0_DESAVE $31
+#define CP0_KSCRATCH1 $31, 2
+#define CP0_KSCRATCH2 $31, 3
 
 /*
  * R4640/R4650 cp0 register names.  These registers are listed
diff --git a/arch/mips/include/asm/stackframe.h 
b/arch/mips/include/asm/stackframe.h
index a89d1b1..20627b2 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -181,6 +181,16 @@
 #endif
LONG_S  k0, PT_R29(sp)
LONG_S  $3, PT_R3(sp)
+#ifdef CONFIG_KVM_MIPSVZ
+   /*
+* With KVM_MIPSVZ, we must not clobber k0/k1
+* they were saved before they were used
+*/
+   MFC0    k0, CP0_KSCRATCH1
+   MFC0    $3, CP0_KSCRATCH2
+   LONG_S  k0, PT_R26(sp)
+   LONG_S  $3, PT_R27(sp)
+#endif
/*
 * You might think that you don't need to save $0,
 * but the FPU emulator and gdb remote debug stub
@@ -447,6 +457,11 @@
.endm
 
.macro  RESTORE_SP_AND_RET
+
+#ifdef CONFIG_KVM_MIPSVZ
+   LONG_L  k0, PT_R26(sp)
+   LONG_L  k1, PT_R27(sp)
+#endif
LONG_L  sp, PT_R29(sp)
.setmips3
eret
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index ee1014e..7a07edb 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1067,7 +1067,12 @@ __cpuinit void cpu_report(void)
 
 static DEFINE_SPINLOCK(kscratch_used_lock);
 
-static unsigned int kscratch_used_mask;
+static unsigned int kscratch_used_mask
+#ifdef CONFIG_KVM_MIPSVZ
+/* KVM_MIPSVZ implementation uses these two statically. */
+= 0xc
+#endif
+;
 
 int allocate_kscratch(void)
 {
diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index 31fa856..163e299 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -46,6 +46,11 @@
 NESTED(except_vec3_generic, 0, sp)
.set    push
.set    noat
+#ifdef CONFIG_KVM_MIPSVZ
+   /* With KVM_MIPSVZ, we must not clobber k0/k1 */
+   MTC0    k0, CP0_KSCRATCH1
+   MTC0    k1, CP0_KSCRATCH2
+#endif
 #if R5432_CP0_INTERRUPT_WAR
 mfc0    k0, CP0_INDEX
 #endif
diff --git a/arch/mips/kernel/scall64-64.S b/arch/mips/kernel/scall64-64.S
index 97a5909..5ff4882 100644
--- a/arch/mips/kernel/scall64-64.S
+++ b/arch/mips/kernel/scall64-64.S
@@ -62,6 +62,9 @@ NESTED(handle_sys64, PT_SIZE, sp)
 jalr    t2  # Do The Real Thing (TM)
 
li  t0, -EMAXERRNO - 1  # error?
+#if defined(CONFIG_KVM_MIPSVZ) && defined(CONFIG_FAST_ACCESS_TO_THREAD_POINTER)
+   ld  t2, TI_TP_VALUE($28)
+#endif
sltu    t0, t0, v0
sd  t0, PT_R7(sp)   # set error flag
beqz    t0, 1f
@@ -70,6 +73,9 @@ NESTED(handle_sys64, PT_SIZE, sp)
dnegu   v0  # error
sd  t1, PT_R0(sp)   # save it for syscall restarting
 1: sd  v0, PT_R2(sp)   # result
+#if defined(CONFIG_KVM_MIPSVZ) && defined(CONFIG_FAST_ACCESS_TO_THREAD_POINTER)
+   sd  t2, PT_R26(sp)
+#endif
 
 n64_syscall_exit:
j   syscall_exit_partial
@@ -93,6 +99,9 @@ syscall_trace_entry:
 jalr    t0
 
li  t0, -EMAXERRNO - 1  # error?
+#if defined(CONFIG_KVM_MIPSVZ) && defined(CONFIG_FAST_ACCESS_TO_THREAD_POINTER)
+   ld  t2, TI_TP_VALUE($28)
+#endif
sltu    t0, t0, v0
sd  t0, PT_R7(sp)   # set error flag
beqz    t0, 1f
@@ -101,6 +110,9 @@ syscall_trace_entry:
dnegu   v0  # error
sd  t1, PT_R0(sp)   # save it for syscall restarting
 1: sd  v0, PT_R2(sp)   # result
+#if defined(CONFIG_KVM_MIPSVZ) && defined(CONFIG_FAST_ACCESS_TO_THREAD_POINTER)
+   sd  t2, PT_R26(sp)
+#endif
 
j   syscall_exit
 
diff --git a/arch/mips/kernel/scall64-n32.S b/arch/mips/kernel/scall64-n32.S
index edcb659..cba35b4 100644
--- a/arch/mips/kernel/scall64-n32.S
+++ b/arch/mips/kernel/scall64-n32.S

[PATCH 03/31] mips/kvm: Fix 32-bitisms in kvm_locore.S

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

For a warning-free compile, we need to use the width-aware PTR_LI and
PTR_LA macros.  Use the LI variant for immediate data and the LA
variant for addresses.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kvm/kvm_locore.S | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/mips/kvm/kvm_locore.S b/arch/mips/kvm/kvm_locore.S
index dca2aa6..e86fa2a 100644
--- a/arch/mips/kvm/kvm_locore.S
+++ b/arch/mips/kvm/kvm_locore.S
@@ -310,7 +310,7 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
 LONG_S  t0, VCPU_R26(k1)
 
 /* Get GUEST k1 and save it in VCPU */
-la  t1, ~0x2ff
+   PTR_LI  t1, ~0x2ff
 mfc0    t0, CP0_EBASE
 and t0, t0, t1
 LONG_L  t0, 0x3000(t0)
@@ -384,14 +384,14 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
 mtc0    k0, CP0_DDATA_LO
 
 /* Restore RDHWR access */
-la  k0, 0x200F
+   PTR_LI  k0, 0x200F
 mtc0    k0,  CP0_HWRENA
 
 /* Jump to handler */
 FEXPORT(__kvm_mips_jump_to_handler)
 /* XXXKYMA: not sure if this is safe, how large is the stack?? */
 /* Now jump to the kvm_mips_handle_exit() to see if we can deal with this in the kernel */
-la  t9,kvm_mips_handle_exit
+   PTR_LA  t9, kvm_mips_handle_exit
 jalr.hb t9
 addiu   sp,sp, -CALLFRAME_SIZ   /* BD Slot */
 
@@ -566,7 +566,7 @@ __kvm_mips_return_to_host:
 mtlo    k0
 
 /* Restore RDHWR access */
-la  k0, 0x200F
+   PTR_LI  k0, 0x200F
 mtc0    k0,  CP0_HWRENA
 
 
-- 
1.7.11.7



[PATCH 05/31] mips/kvm: Use generic cache flushing functions.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

We don't know if we have the r4k specific functions available, so use
universally available __flush_cache_all() instead.  This takes longer
as it flushes both i-cache and d-cache, but is available for all CPUs.

Signed-off-by: David Daney david.da...@cavium.com
---
 arch/mips/kvm/kvm_mips_emul.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index af9a661..a2c6687 100644
--- a/arch/mips/kvm/kvm_mips_emul.c
+++ b/arch/mips/kvm/kvm_mips_emul.c
@@ -916,8 +916,6 @@ kvm_mips_emulate_cache(uint32_t inst, uint32_t *opc, uint32_t cause,
   struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
struct mips_coproc *cop0 = vcpu->arch.cop0;
-   extern void (*r4k_blast_dcache) (void);
-   extern void (*r4k_blast_icache) (void);
enum emulation_result er = EMULATE_DONE;
int32_t offset, cache, op_inst, op, base;
struct kvm_vcpu_arch *arch = &vcpu->arch;
@@ -954,9 +952,9 @@ kvm_mips_emulate_cache(uint32_t inst, uint32_t *opc, uint32_t cause,
 arch->gprs[base], offset);
 
if (cache == MIPS_CACHE_DCACHE)
-   r4k_blast_dcache();
+   __flush_cache_all();
else if (cache == MIPS_CACHE_ICACHE)
-   r4k_blast_icache();
+   __flush_cache_all();
else {
printk("%s: unsupported CACHE INDEX operation\n",
   __func__);
-- 
1.7.11.7



Re: [PATCH 00/31] KVM/MIPS: Implement hardware virtualization via the MIPS-VZ extensions.

2013-06-07 Thread David Daney
I should also add that I will shortly send patches for the kvm tool 
required to drive this VM as well as a small set of patches that create 
a para-virtualized MIPS/Linux guest kernel.


The idea is that, because there is no standard SMP Linux system, we
create a standard para-virtualized system that uses a handful of
hypercalls but mostly just uses virtio devices.  It has no emulated
real hardware (no 8250 UART, no emulated legacy anything...)


David Daney


On 06/07/2013 04:03 PM, David Daney wrote:


[PATCH 00/31] KVM/MIPS: Implement hardware virtualization via the MIPS-VZ extensions.

2013-06-07 Thread David Daney
From: David Daney david.da...@cavium.com

These patches take a somewhat different approach to MIPS
virtualization via the MIPS-VZ extensions than the patches previously
sent by Sanjay Lal.

Several facts about the code:

o Existing exception handlers are modified to hook in to KVM instead
  of intercepting all exceptions via the EBase register, and then
  chaining to real exception handlers.

o Able to boot 64-bit SMP guests that use the FPU (I have booted 4-way
  SMP 64-bit MIPS/Linux).

o Additional overhead on every exception even when *no* vCPU is running.

o Lower interrupt overhead than the EBase interception method when a
  vCPU *is* running.

o This code is somewhat smaller than the existing trap/emulate
  implementation (about 2100 lines vs. about 5300 lines)

o Currently probably only usable on the OCTEON III CPU model, as some
  MIPS-VZ implementation-defined behaviors were assumed to have the
  OCTEON III behavior.

Note: I think Ralf already has the 17/31 (MIPS: Quit exposing Kconfig
symbols in uapi headers.) queued, but I also include it here.

David Daney (31):
  MIPS: Move allocate_kscratch to cpu-probe.c and make it public.
  MIPS: Save and restore K0/K1 when CONFIG_KVM_MIPSVZ
  mips/kvm: Fix 32-bitisms in kvm_locore.S
  mips/kvm: Add casts to avoid pointer width mismatch build failures.
  mips/kvm: Use generic cache flushing functions.
  mips/kvm: Rename kvm_vcpu_arch.pc to  kvm_vcpu_arch.epc
  mips/kvm: Rename VCPU_registername to KVM_VCPU_ARCH_registername
  mips/kvm: Fix code formatting in arch/mips/kvm/kvm_locore.S
  mips/kvm: Factor trap-and-emulate support into a pluggable
implementation.
  mips/kvm: Implement ioctls to get and set FPU registers.
  MIPS: Rearrange branch.c so it can be used by kvm code.
  MIPS: Add instruction format information for WAIT, MTC0, MFC0, et al.
  mips/kvm: Add accessors for MIPS VZ registers.
  mips/kvm: Add thread_info flag to indicate operation in MIPS VZ Guest
Mode.
  mips/kvm: Exception handling to leave and reenter guest mode.
  mips/kvm: Add exception handler for MIPSVZ Guest exceptions.
  MIPS: Quit exposing Kconfig symbols in uapi headers.
  mips/kvm: Add pt_regs slots for BadInstr and BadInstrP
  mips/kvm: Add host definitions for MIPS VZ based host.
  mips/kvm: Hook into TLB fault handlers.
  mips/kvm: Allow set_except_vector() to be used from MIPSVZ code.
  mips/kvm: Split get_new_mmu_context into two parts.
  mips/kvm: Hook into CP unusable exception handler.
  mips/kvm: Add thread_struct fields used by MIPSVZ hosts.
  mips/kvm: Add some asm-offsets constants used by MIPSVZ.
  mips/kvm: Split up Kconfig and Makefile definitions in preparation
for MIPSVZ.
  mips/kvm: Gate the use of kvm_local_flush_tlb_all() by KVM_MIPSTE
  mips/kvm: Only use KVM_COALESCED_MMIO_PAGE_OFFSET with KVM_MIPSTE
  mips/kvm: Add MIPSVZ support.
  mips/kvm: Enable MIPSVZ in Kconfig/Makefile
  mips/kvm: Allow for up to 8 KVM vcpus per vm.

 arch/mips/Kconfig   |1 +
 arch/mips/include/asm/branch.h  |7 +
 arch/mips/include/asm/kvm_host.h|  622 +---
 arch/mips/include/asm/kvm_mips_te.h |  589 +++
 arch/mips/include/asm/kvm_mips_vz.h |   29 +
 arch/mips/include/asm/mipsregs.h|  264 +
 arch/mips/include/asm/mmu_context.h |   12 +-
 arch/mips/include/asm/processor.h   |6 +
 arch/mips/include/asm/ptrace.h  |   36 +
 arch/mips/include/asm/stackframe.h  |  150 ++-
 arch/mips/include/asm/thread_info.h |2 +
 arch/mips/include/asm/uasm.h|2 +-
 arch/mips/include/uapi/asm/inst.h   |   23 +-
 arch/mips/include/uapi/asm/ptrace.h |   17 +-
 arch/mips/kernel/asm-offsets.c  |  124 ++-
 arch/mips/kernel/branch.c   |   63 +-
 arch/mips/kernel/cpu-probe.c|   34 +
 arch/mips/kernel/genex.S|8 +
 arch/mips/kernel/scall64-64.S   |   12 +
 arch/mips/kernel/scall64-n32.S  |   12 +
 arch/mips/kernel/traps.c|   15 +-
 arch/mips/kvm/Kconfig   |   23 +-
 arch/mips/kvm/Makefile  |   15 +-
 arch/mips/kvm/kvm_locore.S  |  980 +-
 arch/mips/kvm/kvm_mips.c|  768 ++
 arch/mips/kvm/kvm_mips_comm.h   |1 +
 arch/mips/kvm/kvm_mips_commpage.c   |9 +-
 arch/mips/kvm/kvm_mips_dyntrans.c   |4 +-
 arch/mips/kvm/kvm_mips_emul.c   |  312 +++---
 arch/mips/kvm/kvm_mips_int.c|   53 +-
 arch/mips/kvm/kvm_mips_int.h|2 -
 arch/mips/kvm/kvm_mips_stats.c  |6 +-
 arch/mips/kvm/kvm_mipsvz.c  | 1894 +++
 arch/mips/kvm/kvm_mipsvz_guest.S|  234 +
 arch/mips/kvm/kvm_tlb.c |  140 +--
 arch/mips/kvm/kvm_trap_emul.c   |  932 +++--
 arch/mips/mm/fault.c|8 +
 arch/mips/mm/tlbex-fault.S  |6 +
 arch/mips/mm/tlbex.c|   45 +-
 39 files changed, 5299 insertions(+), 2161 deletions(-)
 create mode 100644 arch/mips/include/asm/kvm_mips_te.h
 create mode 

Re: [PATCH 20/31] mips/kvm: Hook into TLB fault handlers.

2013-06-07 Thread Sergei Shtylyov

Hello.

On 06/08/2013 03:03 AM, David Daney wrote:


From: David Daney david.da...@cavium.com

If the CPU is operating in guest mode when a TLB related exception
occurs, give KVM a chance to do emulation.

Signed-off-by: David Daney david.da...@cavium.com
---
  arch/mips/mm/fault.c   | 8 
  arch/mips/mm/tlbex-fault.S | 6 ++
  2 files changed, 14 insertions(+)

diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 0fead53..9391da49 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c

[...]

@@ -50,6 +51,13 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, unsigned long writ
   field, regs->cp0_epc);
  #endif
  
+#ifdef CONFIG_KVM_MIPSVZ

+   if (test_tsk_thread_flag(current, TIF_GUESTMODE)) {
+   if (mipsvz_page_fault(regs, write, address))


   Any reason not to collapse these into a single *if*?


+   return;
+   }
+#endif
+





Re: [PATCH RFC V9 0/19] Paravirtualized ticket spinlocks

2013-06-07 Thread Jiannan Ouyang
Raghu, thanks for your input. I'm more than glad to work together with
you to make this idea work better.

-Jiannan

On Thu, Jun 6, 2013 at 11:15 PM, Raghavendra K T
raghavendra...@linux.vnet.ibm.com wrote:

Re: [PATCH 20/31] mips/kvm: Hook into TLB fault handlers.

2013-06-07 Thread David Daney

On 06/07/2013 04:34 PM, Sergei Shtylyov wrote:

Hello.

On 06/08/2013 03:03 AM, David Daney wrote:


From: David Daney david.da...@cavium.com

If the CPU is operating in guest mode when a TLB related exception
occurs, give KVM a chance to do emulation.

Signed-off-by: David Daney david.da...@cavium.com
---
  arch/mips/mm/fault.c   | 8 
  arch/mips/mm/tlbex-fault.S | 6 ++
  2 files changed, 14 insertions(+)

diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 0fead53..9391da49 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c

[...]

@@ -50,6 +51,13 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, unsigned long writ
 field, regs->cp0_epc);
  #endif
+#ifdef CONFIG_KVM_MIPSVZ
+if (test_tsk_thread_flag(current, TIF_GUESTMODE)) {
+if (mipsvz_page_fault(regs, write, address))


Any reason not to collapse these into a single *if*?



It makes the conditional call to mipsvz_page_fault() less obvious.

Certainly the same semantics can be achieved several different ways.
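
For the record, the collapsed form being suggested would read roughly
like this (same behavior, one compound condition):

    #ifdef CONFIG_KVM_MIPSVZ
            if (test_tsk_thread_flag(current, TIF_GUESTMODE) &&
                mipsvz_page_fault(regs, write, address))
                    return;
    #endif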

David Daney



+return;
+}
+#endif
+









[PATCH] kvmclock: clock should count only if vm is running

2013-06-07 Thread Marcelo Tosatti

kvmclock should not count while vm is paused, because:

1) if the vm is paused for long periods, timekeeping 
math can overflow while converting the (large) clocksource 
delta to nanoseconds.

2) Users rely on CLOCK_MONOTONIC to count run time, that is, 
time which OS has been in a runnable state (see CLOCK_BOOTTIME).

Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
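
To make the overflow concern in (1) concrete: kvmclock deltas go
through the usual clocksource scaling, sketched below with illustrative
mult/shift values (not taken from any particular platform). With
shift = 22 and a 1 GHz counter, the 64-bit intermediate product wraps
once the paused delta passes 2^42 cycles, i.e. roughly 73 minutes:

    #include <stdint.h>

    /* Clocksource-style cycles-to-nanoseconds conversion. The
     * intermediate product delta * mult is what overflows when a VM
     * stays paused for a long time. */
    static inline uint64_t cyc2ns(uint64_t delta, uint32_t mult,
                                  uint32_t shift)
    {
            return (delta * mult) >> shift;
    }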

diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
index 87d4d0f..7d2d005 100644
--- a/hw/i386/kvm/clock.c
+++ b/hw/i386/kvm/clock.c
@@ -28,38 +28,6 @@ typedef struct KVMClockState {
 bool clock_valid;
 } KVMClockState;
 
-static void kvmclock_pre_save(void *opaque)
-{
-KVMClockState *s = opaque;
-struct kvm_clock_data data;
-int ret;
-
-if (s->clock_valid) {
-return;
-}
-ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
-if (ret < 0) {
-fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
-data.clock = 0;
-}
-s->clock = data.clock;
-/*
- * If the VM is stopped, declare the clock state valid to avoid re-reading
- * it on next vmsave (which would return a different value). Will be reset
- * when the VM is continued.
- */
-s->clock_valid = !runstate_is_running();
-}
-
-static int kvmclock_post_load(void *opaque, int version_id)
-{
-KVMClockState *s = opaque;
-struct kvm_clock_data data;
-
-data.clock = s->clock;
-data.flags = 0;
-return kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data);
-}
 
 static void kvmclock_vm_state_change(void *opaque, int running,
  RunState state)
@@ -70,8 +38,18 @@ static void kvmclock_vm_state_change(void *opaque, int running,
 int ret;
 
 if (running) {
+struct kvm_clock_data data;
+
 s->clock_valid = false;
 
+data.clock = s->clock;
+data.flags = 0;
+ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data);
+if (ret < 0) {
+fprintf(stderr, "KVM_SET_CLOCK failed: %s\n", strerror(ret));
+abort();
+}
+
 if (!cap_clock_ctrl) {
 return;
 }
@@ -84,6 +62,26 @@ static void kvmclock_vm_state_change(void *opaque, int running,
 return;
 }
 }
+} else {
+struct kvm_clock_data data;
+int ret;
+
+if (s->clock_valid) {
+return;
+}
+ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
+if (ret < 0) {
+fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
+abort();
+}
+s->clock = data.clock;
+
+/*
+ * If the VM is stopped, declare the clock state valid to
+ * avoid re-reading it on next vmsave (which would return
+ * a different value). Will be reset when the VM is continued.
+ */
+s->clock_valid = !runstate_is_running();
 }
 }
 
@@ -100,8 +98,6 @@ static const VMStateDescription kvmclock_vmsd = {
 .version_id = 1,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
-.pre_save = kvmclock_pre_save,
-.post_load = kvmclock_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT64(clock, KVMClockState),
 VMSTATE_END_OF_LIST()


[PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

2013-06-07 Thread Xiao Guangrong
From: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com

Currently, memory synchronization is missing in emulator_fix_hypercall();
see commit 758ccc89b83
(KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall).

This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
kvm_vcpus_hang_on_page_end(), which unmap the patched page from the guest
and use kvm_flush_remote_tlbs() as the serializing instruction to
ensure memory coherence.
[ The SDM says that INVEPT, INVVPID and MOV (to control register, with
  the exception of MOV CR8) are serializing instructions. ]

The mmu-lock is held while the host patches the page, so that vcpus are
blocked from fixing further page faults on it.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c | 25 +
 arch/x86/kvm/mmu.h |  3 +++
 arch/x86/kvm/x86.c |  7 +++
 3 files changed, 35 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7d50a2d..35cd0b6 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4536,6 +4536,31 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4])
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_get_spte_hierarchy);

+/*
+ * Force vcpu to hang when it is trying to access the specified page.
+ *
+ * kvm_vcpus_hang_on_page_start and kvm_vcpus_hang_on_page_end should
+ * be used in pairs and they are currently used to sync memory access
+ * between vcpus when host cross-modifies the code segment of guest.
+ *
+ * We unmap the page from the guest and do memory synchronization by
+ * kvm_flush_remote_tlbs() under the protection of mmu-lock. If vcpu
+ * accesses the page, it will trigger #PF and be blocked on mmu-lock.
+ */
+void kvm_vcpus_hang_on_page_start(struct kvm *kvm, gfn_t gfn)
+{
+   spin_lock(&kvm->mmu_lock);
+
+   /* kvm_flush_remote_tlbs() can act as serializing instruction. */
+   if (kvm_unmap_hva(kvm, gfn_to_hva(kvm, gfn)))
+   kvm_flush_remote_tlbs(kvm);
+}
+
+void kvm_vcpus_hang_on_page_end(struct kvm *kvm)
+{
+   spin_unlock(&kvm->mmu_lock);
+}
+
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
 {
ASSERT(vcpu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 5b59c57..35910be 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -115,4 +115,7 @@ static inline bool permission_fault(struct kvm_mmu *mmu, unsigned pte_access,
 }

 void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm);
+
+void kvm_vcpus_hang_on_page_start(struct kvm *kvm, gfn_t gfn);
+void kvm_vcpus_hang_on_page_end(struct kvm *kvm);
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e4afa7..776bf1a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5528,8 +5528,15 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
char instruction[3];
unsigned long rip = kvm_rip_read(vcpu);
+   gpa_t gpa;
+
+   gpa = kvm_mmu_gva_to_gpa_fetch(vcpu, rip, NULL);
+   if (gpa == UNMAPPED_GVA)
+   return X86EMUL_PROPAGATE_FAULT;

+   kvm_vcpus_hang_on_page_start(vcpu->kvm, gpa_to_gfn(gpa));
kvm_x86_ops->patch_hypercall(vcpu, instruction);
+   kvm_vcpus_hang_on_page_end(vcpu->kvm);

return emulator_write_emulated(ctxt, rip, instruction, 3, NULL);
 }
-- 
1.8.1.4
