Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-26 Thread Scott Wood

On 07/25/2013 03:50:42 AM, Gleb Natapov wrote:

> Why does ppc use page_is_ram() for mmap? How should I know? But looking at
> the function, it does so only as a fallback if
> ppc_md.phys_mem_access_prot() is not provided. Making access to MMIO
> noncached as a safe fallback makes sense.


There's only one current implementation of  
ppc_md.phys_mem_access_prot(), which is pci_phys_mem_access_prot(),  
which also uses page_is_ram().  If page_is_ram() returns false then it  
checks for write-combining PCI.  But yes, we would want to call  
ppc_md.phys_mem_access_prot() if present.
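
A minimal sketch of the lookup order described above, assuming a toy memory map. The enum values, the hook type, the PFN ranges, and every `sketch_` name are hypothetical stand-ins for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cacheability attributes, not the kernel's pgprot values. */
enum sketch_attr { ATTR_CACHED, ATTR_NONCACHED, ATTR_WRITE_COMBINE };

/* Hypothetical hook type modelled on ppc_md.phys_mem_access_prot(). */
typedef enum sketch_attr (*sketch_prot_hook)(uint64_t pfn);

/* Toy memory map (assumption): frames below this limit are RAM. */
#define SKETCH_RAM_PFN_LIMIT 0x1000u
/* Toy write-combining PCI window (assumption). */
#define SKETCH_WC_PFN_START  0x8000u
#define SKETCH_WC_PFN_END    0x9000u

static bool sketch_page_is_ram(uint64_t pfn)
{
    return pfn < SKETCH_RAM_PFN_LIMIT;
}

/* Hook resembling the PCI case: RAM first, then the write-combining check. */
static enum sketch_attr sketch_pci_hook(uint64_t pfn)
{
    if (sketch_page_is_ram(pfn))
        return ATTR_CACHED;
    if (pfn >= SKETCH_WC_PFN_START && pfn < SKETCH_WC_PFN_END)
        return ATTR_WRITE_COMBINE;
    return ATTR_NONCACHED;
}

/* Use the platform hook if present; otherwise fall back to page_is_ram(). */
static enum sketch_attr sketch_choose_attr(sketch_prot_hook hook, uint64_t pfn)
{
    if (hook)
        return hook(pfn);
    return sketch_page_is_ram(pfn) ? ATTR_CACHED : ATTR_NONCACHED;
}
```

With no hook, a frame in the toy PCI window comes back noncached; with `sketch_pci_hook` installed, the same frame comes back write-combining, which is why calling the hook when present matters for matching the host mapping.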


Copying from the host PTE would be ideal if it doesn't come with a
noticeable performance impact compared to other methods, but one way or  
another we want to be sure we match.


> It also makes sense to allow noncached access to reserved RAM
> sometimes.


Perhaps, but that's not KVM's decision to make.  You should get the  
same result as if you mmaped it -- because QEMU already did and we need  
to be consistent.  Not to mention the large page kernel mapping that  
will have been done on e500...


-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Benjamin Herrenschmidt
On Fri, 2013-07-26 at 15:03 +, Bhushan Bharat-R65777 wrote:
> Wouldn't searching the Linux PTE be overkill?

That's the best approach. Also we are searching it already to resolve
the page fault. That does mean we search twice but on the other hand
that also means it's hot in the cache.

Cheers,
Ben




[PULL 02/25] kvm: Change prototype of kvm_update_guest_debug()

2013-07-26 Thread Andreas Färber
From: Stefan Weil 

Passing a CPUState pointer instead of a CPUArchState pointer eliminates
the last target dependent data type in sysemu/kvm.h.

It also simplifies the code.

Signed-off-by: Stefan Weil 
Acked-by: Paolo Bonzini 
Signed-off-by: Andreas Färber 
---
 exec.c   |  5 ++---
 include/sysemu/kvm.h |  2 +-
 kvm-all.c| 17 +
 kvm-stub.c   |  2 +-
 target-i386/kvm.c|  2 +-
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/exec.c b/exec.c
index 3ba9525..c4f2894 100644
--- a/exec.c
+++ b/exec.c
@@ -590,15 +590,14 @@ void cpu_breakpoint_remove_all(CPUArchState *env, int mask)
 void cpu_single_step(CPUState *cpu, int enabled)
 {
 #if defined(TARGET_HAS_ICE)
-CPUArchState *env = cpu->env_ptr;
-
 if (cpu->singlestep_enabled != enabled) {
 cpu->singlestep_enabled = enabled;
 if (kvm_enabled()) {
-kvm_update_guest_debug(env, 0);
+kvm_update_guest_debug(cpu, 0);
 } else {
 /* must flush all the translated code to avoid inconsistencies */
 /* XXX: only flush what is necessary */
+CPUArchState *env = cpu->env_ptr;
 tb_flush(env);
 }
 }
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index f8ac448..de74411 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -174,7 +174,7 @@ int kvm_insert_breakpoint(CPUState *cpu, target_ulong addr,
 int kvm_remove_breakpoint(CPUState *cpu, target_ulong addr,
   target_ulong len, int type);
 void kvm_remove_all_breakpoints(CPUState *cpu);
-int kvm_update_guest_debug(CPUArchState *env, unsigned long reinject_trap);
+int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap);
 #ifndef _WIN32
 int kvm_set_signal_mask(CPUState *cpu, const sigset_t *sigset);
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index 4fb4ccb..716860f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1883,9 +1883,8 @@ static void kvm_invoke_set_guest_debug(void *data)
&dbg_data->dbg);
 }
 
-int kvm_update_guest_debug(CPUArchState *env, unsigned long reinject_trap)
+int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap)
 {
-CPUState *cpu = ENV_GET_CPU(env);
 struct kvm_set_guest_debug_data data;
 
 data.dbg.control = reinject_trap;
@@ -1935,9 +1934,7 @@ int kvm_insert_breakpoint(CPUState *cpu, target_ulong addr,
 }
 
 for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
-CPUArchState *env = cpu->env_ptr;
-
-err = kvm_update_guest_debug(env, 0);
+err = kvm_update_guest_debug(cpu, 0);
 if (err) {
 return err;
 }
@@ -1977,9 +1974,7 @@ int kvm_remove_breakpoint(CPUState *cpu, target_ulong addr,
 }
 
 for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
-CPUArchState *env = cpu->env_ptr;
-
-err = kvm_update_guest_debug(env, 0);
+err = kvm_update_guest_debug(cpu, 0);
 if (err) {
 return err;
 }
@@ -2007,15 +2002,13 @@ void kvm_remove_all_breakpoints(CPUState *cpu)
 kvm_arch_remove_all_hw_breakpoints();
 
 for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
-CPUArchState *env = cpu->env_ptr;
-
-kvm_update_guest_debug(env, 0);
+kvm_update_guest_debug(cpu, 0);
 }
 }
 
 #else /* !KVM_CAP_SET_GUEST_DEBUG */
 
-int kvm_update_guest_debug(CPUArchState *env, unsigned long reinject_trap)
+int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap)
 {
 return -EINVAL;
 }
diff --git a/kvm-stub.c b/kvm-stub.c
index 7b2233a..771360b 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,7 +78,7 @@ void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
 
-int kvm_update_guest_debug(CPUArchState *env, unsigned long reinject_trap)
+int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap)
 {
 return -ENOSYS;
 }
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3c9d10a..376fc70 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1618,7 +1618,7 @@ static int kvm_guest_debug_workarounds(X86CPU *cpu)
  */
 if (reinject_trap ||
 (!kvm_has_robust_singlestep() && cs->singlestep_enabled)) {
-ret = kvm_update_guest_debug(env, reinject_trap);
+ret = kvm_update_guest_debug(cs, reinject_trap);
 }
 return ret;
 }
-- 
1.8.1.4



[PULL 03/25] target-s390x: Fix CPUState rework fallout

2013-07-26 Thread Andreas Färber
From: Christian Borntraeger 

Commit f17ec444c3d39f76bcd8b71c2c05d5754bfe333e
exec: Change cpu_memory_rw_debug() argument to CPUState

failed to update the s390x KVM code, breaking the build.

Let's fix it up.

Signed-off-by: Christian Borntraeger 
Signed-off-by: Andreas Färber 
---
 target-s390x/kvm.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 60e94f8..85f0112 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -345,12 +345,10 @@ void *kvm_arch_ram_alloc(ram_addr_t size)
 
 int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
 {
-S390CPU *cpu = S390_CPU(cs);
-CPUS390XState *env = &cpu->env;
 static const uint8_t diag_501[] = {0x83, 0x24, 0x05, 0x01};
 
-if (cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 4, 0) ||
-cpu_memory_rw_debug(env, bp->pc, (uint8_t *)diag_501, 4, 1)) {
+if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&bp->saved_insn, 4, 0) ||
+cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)diag_501, 4, 1)) {
 return -EINVAL;
 }
 return 0;
@@ -358,16 +356,14 @@ int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
 
 int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
 {
-S390CPU *cpu = S390_CPU(cs);
-CPUS390XState *env = &cpu->env;
 uint8_t t[4];
 static const uint8_t diag_501[] = {0x83, 0x24, 0x05, 0x01};
 
-if (cpu_memory_rw_debug(env, bp->pc, t, 4, 0)) {
+if (cpu_memory_rw_debug(cs, bp->pc, t, 4, 0)) {
 return -EINVAL;
 } else if (memcmp(t, diag_501, 4)) {
 return -EINVAL;
-} else if (cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 1, 1)) {
+} else if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&bp->saved_insn, 1, 1)) {
 return -EINVAL;
 }
 
-- 
1.8.1.4



Re: kernel 3.10.1 - "NMI received for unknown reason"

2013-07-26 Thread Gleb Natapov
On Fri, Jul 26, 2013 at 08:25:19PM +0200, Stefan Pietsch wrote:
> Hi all,
> 
> starting a virtual machine (Debian sid) with KVM on my host running
> kernel 3.10.1 (Debian 3.10-1-686-pae) produces these messages:
> 
Are those messages printed by a host or a guest?

> [  765.522920] Uhhuh. NMI received for unknown reason 31 on CPU 0.
> [  765.522927] Do you have a strange power saving mode enabled?
> [  765.522930] Dazed and confused, but trying to continue
> [  770.487732] Uhhuh. NMI received for unknown reason 21 on CPU 0.
> [  770.487740] Do you have a strange power saving mode enabled?
> [  770.487742] Dazed and confused, but trying to continue
> [  846.340966] Uhhuh. NMI received for unknown reason 31 on CPU 1.
> [  846.340973] Do you have a strange power saving mode enabled?
> [  846.340976] Dazed and confused, but trying to continue
> [  847.563023] Uhhuh. NMI received for unknown reason 31 on CPU 0.
> [  847.563029] Do you have a strange power saving mode enabled?
> [  847.563032] Dazed and confused, but trying to continue
> 
> I can disable the messages (echo 0 > /proc/sys/kernel/nmi_watchdog) and
> the host and the virtual machine are running as usual.
> 
> 
> Host CPU: Intel(R) Core(TM) Duo CPU L2400
> 

--
Gleb.


kernel 3.10.1 - "NMI received for unknown reason"

2013-07-26 Thread Stefan Pietsch
Hi all,

starting a virtual machine (Debian sid) with KVM on my host running
kernel 3.10.1 (Debian 3.10-1-686-pae) produces these messages:

[  765.522920] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[  765.522927] Do you have a strange power saving mode enabled?
[  765.522930] Dazed and confused, but trying to continue
[  770.487732] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[  770.487740] Do you have a strange power saving mode enabled?
[  770.487742] Dazed and confused, but trying to continue
[  846.340966] Uhhuh. NMI received for unknown reason 31 on CPU 1.
[  846.340973] Do you have a strange power saving mode enabled?
[  846.340976] Dazed and confused, but trying to continue
[  847.563023] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[  847.563029] Do you have a strange power saving mode enabled?
[  847.563032] Dazed and confused, but trying to continue

I can disable the messages (echo 0 > /proc/sys/kernel/nmi_watchdog) and
the host and the virtual machine are running as usual.
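
A minimal sketch of why the kernel calls these reasons "unknown", assuming the two-digit value in each message is the NMI reason byte printed in hexadecimal and that the SERR and IOCHK bit values below match the definitions Linux uses for x86 (`NMI_REASON_SERR`, `NMI_REASON_IOCHK`). The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the x86 definitions in Linux. */
#define NMI_REASON_SERR  0x80u  /* PCI system error */
#define NMI_REASON_IOCHK 0x40u  /* I/O channel check */
#define NMI_REASON_MASK  (NMI_REASON_SERR | NMI_REASON_IOCHK)

/* True if the reason byte names a source the generic handler recognises. */
static bool sketch_nmi_reason_is_known(uint8_t reason)
{
    return (reason & NMI_REASON_MASK) != 0;
}
```

Neither 0x31 nor 0x21 has bit 6 or bit 7 set, so under these assumptions the reason byte alone does not identify a hardware error source for either value.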


Host CPU: Intel(R) Core(TM) Duo CPU L2400



Regards,
Stefan


[RFC 1/2] KVM: s390: add and extend interrupt information data structs

2013-07-26 Thread Jens Freimann
With the currently available struct kvm_s390_interrupt it is not possible to
inject all kinds of interrupts as defined in the z/Architecture. Add
interruption parameters to the structures to make sure we can inject all kinds
of interrupts and move it to kvm.h
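
A rough sketch of the shape the diff below implies: one type field plus a union of per-interrupt payloads, accessed as `irq.type`, `irq.mchk.cr14`, `irq.io.io_int_word`, and `irq.emerg.code`. The member names come from the diff; the type tags, the `sketch_` names, and the exact layout are assumptions, since the real definition added to include/uapi/linux/kvm.h is not shown in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type tags; the real KVM_S390_* values are not shown here. */
enum sketch_irq_type {
    SKETCH_IRQ_MCHK = 1,
    SKETCH_IRQ_IO = 2,
    SKETCH_IRQ_EMERG = 3
};

struct sketch_io_info {
    uint16_t subchannel_id;
    uint16_t subchannel_nr;
    uint32_t io_int_parm;
    uint32_t io_int_word;
};

struct sketch_mchk_info {
    uint64_t cr14;
    uint64_t mcic;
};

struct sketch_emerg_info {
    uint16_t code;
};

/* One tagged record; the anonymous union (C11) holds the payload for 'type'. */
struct sketch_irq {
    uint64_t type;
    union {
        struct sketch_io_info io;
        struct sketch_mchk_info mchk;
        struct sketch_emerg_info emerg;
    };
};

/* Mirrors the machine-check test in the diff: guest CR14 must enable the subclass. */
static bool sketch_mchk_deliverable(uint64_t guest_cr14, const struct sketch_irq *irq)
{
    if (irq->type != SKETCH_IRQ_MCHK)
        return false;
    return (guest_cr14 & irq->mchk.cr14) != 0;
}
```

Embedding one such record in the kernel-internal list entry is what lets the call sites in the diff change from `inti->type` to `inti->irq.type` without any other restructuring.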

Signed-off-by: Jens Freimann 
---
 arch/s390/include/asm/kvm_host.h |  45 +-
 arch/s390/kvm/interrupt.c| 189 +++
 arch/s390/kvm/priv.c |  22 ++---
 arch/s390/kvm/sigp.c |  14 +--
 include/uapi/linux/kvm.h |  62 +
 5 files changed, 175 insertions(+), 157 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3238d40..c755a9d 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -162,18 +163,6 @@ struct kvm_vcpu_stat {
u32 diagnose_9c;
 };
 
-struct kvm_s390_io_info {
-   __u16subchannel_id;/* 0x0b8 */
-   __u16subchannel_nr;/* 0x0ba */
-   __u32io_int_parm;  /* 0x0bc */
-   __u32io_int_word;  /* 0x0c0 */
-};
-
-struct kvm_s390_ext_info {
-   __u32 ext_params;
-   __u64 ext_params2;
-};
-
 #define PGM_OPERATION0x01
 #define PGM_PRIVILEGED_OP   0x02
 #define PGM_EXECUTE  0x03
@@ -182,39 +171,9 @@ struct kvm_s390_ext_info {
 #define PGM_SPECIFICATION0x06
 #define PGM_DATA 0x07
 
-struct kvm_s390_pgm_info {
-   __u16 code;
-};
-
-struct kvm_s390_prefix_info {
-   __u32 address;
-};
-
-struct kvm_s390_extcall_info {
-   __u16 code;
-};
-
-struct kvm_s390_emerg_info {
-   __u16 code;
-};
-
-struct kvm_s390_mchk_info {
-   __u64 cr14;
-   __u64 mcic;
-};
-
 struct kvm_s390_interrupt_info {
struct list_head list;
-   u64 type;
-   union {
-   struct kvm_s390_io_info io;
-   struct kvm_s390_ext_info ext;
-   struct kvm_s390_pgm_info pgm;
-   struct kvm_s390_emerg_info emerg;
-   struct kvm_s390_extcall_info extcall;
-   struct kvm_s390_prefix_info prefix;
-   struct kvm_s390_mchk_info mchk;
-   };
+   struct kvm_s390_irq irq;
 };
 
 /* for local_interrupt.action_flags */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 7f35cb3..25cf71d 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -65,7 +65,7 @@ static u64 int_word_to_isc_bits(u32 int_word)
 static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
  struct kvm_s390_interrupt_info *inti)
 {
-   switch (inti->type) {
+   switch (inti->irq.type) {
case KVM_S390_INT_EXTERNAL_CALL:
if (psw_extint_disabled(vcpu))
return 0;
@@ -97,19 +97,19 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
case KVM_S390_MCHK:
if (psw_mchk_disabled(vcpu))
return 0;
-   if (vcpu->arch.sie_block->gcr[14] & inti->mchk.cr14)
+   if (vcpu->arch.sie_block->gcr[14] & inti->irq.mchk.cr14)
return 1;
return 0;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
if (psw_ioint_disabled(vcpu))
return 0;
if (vcpu->arch.sie_block->gcr[6] &
-   int_word_to_isc_bits(inti->io.io_int_word))
+   int_word_to_isc_bits(inti->irq.io.io_int_word))
return 1;
return 0;
default:
printk(KERN_WARNING "illegal interrupt type %llx\n",
-  inti->type);
+  inti->irq.type);
BUG();
}
return 0;
@@ -146,7 +146,7 @@ static void __set_cpuflag(struct kvm_vcpu *vcpu, u32 flag)
 static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
  struct kvm_s390_interrupt_info *inti)
 {
-   switch (inti->type) {
+   switch (inti->irq.type) {
case KVM_S390_INT_EXTERNAL_CALL:
case KVM_S390_INT_EMERGENCY:
case KVM_S390_INT_SERVICE:
@@ -182,14 +182,14 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
const unsigned short table[] = { 2, 4, 4, 6 };
int rc = 0;
 
-   switch (inti->type) {
+   switch (inti->irq.type) {
case KVM_S390_INT_EMERGENCY:
VCPU_EVENT(vcpu, 4, "%s", "interrupt: sigp emerg");
vcpu->stat.deliver_emergency_signal++;
-   trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
-inti->emerg.code, 0);
+   trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->irq.type,
+

[RFC 2/2] KVM: s390: add floating irq controller

2013-07-26 Thread Jens Freimann
This patch adds a floating irq controller as a kvm_device.
It will be necessary for migration of floating interrupts as well
as for hardening the reset code by allowing user space to explicitly
remove all pending floating interrupts.
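
The queueing logic that this patch moves into __inject_vm() keeps I/O interrupts in ISC order while appending everything else at the tail. A self-contained sketch of that insertion rule follows; it uses a hand-rolled circular list instead of the kernel's list_head helpers, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_node {
    struct sketch_node *prev, *next;
    int is_io;          /* nonzero for I/O interrupts */
    uint64_t isc_bits;  /* sort key, meaningful only when is_io is set */
};

/* An empty circular list is a head that points at itself. */
static void sketch_list_init(struct sketch_node *head)
{
    head->prev = head->next = head;
}

/* Link n immediately before pos, which is what list_add_tail() does. */
static void sketch_insert_before(struct sketch_node *n, struct sketch_node *pos)
{
    n->prev = pos->prev;
    n->next = pos;
    pos->prev->next = n;
    pos->prev = n;
}

/* Non-I/O entries go to the tail; I/O entries go before the first I/O
 * entry with a strictly larger key, so equal keys keep arrival order. */
static void sketch_enqueue(struct sketch_node *head, struct sketch_node *n)
{
    struct sketch_node *iter;

    if (!n->is_io) {
        sketch_insert_before(n, head);
        return;
    }
    for (iter = head->next; iter != head; iter = iter->next) {
        if (!iter->is_io)
            continue;
        if (iter->isc_bits <= n->isc_bits)
            continue;
        break;
    }
    /* If the loop ran off the end, iter == head and this appends at the tail. */
    sketch_insert_before(n, iter);
}
```

Because the comparison is `<=`, a new I/O interrupt never overtakes an already queued one with the same key.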

Signed-off-by: Jens Freimann 

---
 arch/s390/include/uapi/asm/kvm.h |   5 +
 arch/s390/kvm/interrupt.c| 192 +++
 arch/s390/kvm/kvm-s390.c |   1 +
 include/linux/kvm_host.h |   1 +
 include/uapi/linux/kvm.h |   1 +
 virt/kvm/kvm_main.c  |   3 +
 6 files changed, 163 insertions(+), 40 deletions(-)

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index d25da59..33d52b8 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -16,6 +16,11 @@
 
 #define __KVM_S390
 
+/* Device control API: s390-specific devices */
+#define KVM_DEV_FLIC_DEQUEUE 1
+#define KVM_DEV_FLIC_ENQUEUE 2
+#define KVM_DEV_FLIC_CLEAR_IRQS 3
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
/* general purpose regs for s390 */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 25cf71d..065a402 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -656,15 +656,57 @@ struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
return inti;
 }
 
-int kvm_s390_inject_vm(struct kvm *kvm,
-  struct kvm_s390_interrupt *s390int)
+static void __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
 {
struct kvm_s390_local_interrupt *li;
struct kvm_s390_float_interrupt *fi;
-   struct kvm_s390_interrupt_info *inti, *iter;
-   struct kvm_s390_irq *irq;
+   struct kvm_s390_interrupt_info *iter;
int sigcpu;
 
+   mutex_lock(&kvm->lock);
+   fi = &kvm->arch.float_int;
+   spin_lock(&fi->lock);
+   if (!is_ioint(inti->irq.type))
+   list_add_tail(&inti->list, &fi->list);
+   else {
+   u64 isc_bits = int_word_to_isc_bits(inti->irq.io.io_int_word);
+
+   /* Keep I/O interrupts sorted in isc order. */
+   list_for_each_entry(iter, &fi->list, list) {
+   if (!is_ioint(iter->irq.type))
+   continue;
+   if (int_word_to_isc_bits(iter->irq.io.io_int_word)
+   <= isc_bits)
+   continue;
+   break;
+   }
+   list_add_tail(&inti->list, &iter->list);
+   }
+   atomic_set(&fi->active, 1);
+   sigcpu = find_first_bit(fi->idle_mask, KVM_MAX_VCPUS);
+   if (sigcpu == KVM_MAX_VCPUS) {
+   do {
+   sigcpu = fi->next_rr_cpu++;
+   if (sigcpu == KVM_MAX_VCPUS)
+   sigcpu = fi->next_rr_cpu = 0;
+   } while (fi->local_int[sigcpu] == NULL);
+   }
+   li = fi->local_int[sigcpu];
+   spin_lock_bh(&li->lock);
+   atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
+   if (waitqueue_active(li->wq))
+   wake_up_interruptible(li->wq);
+   spin_unlock_bh(&li->lock);
+   spin_unlock(&fi->lock);
+   mutex_unlock(&kvm->lock);
+}
+
+int kvm_s390_inject_vm(struct kvm *kvm,
+  struct kvm_s390_interrupt *s390int)
+{
+   struct kvm_s390_interrupt_info *inti;
+   struct kvm_s390_irq *irq;
+
inti = kzalloc(sizeof(*inti), GFP_KERNEL);
if (!inti)
return -ENOMEM;
@@ -712,42 +754,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
}
trace_kvm_s390_inject_vm(s390int->type, s390int->parm, s390int->parm64, 2);
 
-   mutex_lock(&kvm->lock);
-   fi = &kvm->arch.float_int;
-   spin_lock(&fi->lock);
-   if (!is_ioint(inti->irq.type))
-   list_add_tail(&inti->list, &fi->list);
-   else {
-   u64 isc_bits = int_word_to_isc_bits(inti->irq.io.io_int_word);
-
-   /* Keep I/O interrupts sorted in isc order. */
-   list_for_each_entry(iter, &fi->list, list) {
-   if (!is_ioint(iter->irq.type))
-   continue;
-   if (int_word_to_isc_bits(iter->irq.io.io_int_word)
-   <= isc_bits)
-   continue;
-   break;
-   }
-   list_add_tail(&inti->list, &iter->list);
-   }
-   atomic_set(&fi->active, 1);
-   sigcpu = find_first_bit(fi->idle_mask, KVM_MAX_VCPUS);
-   if (sigcpu == KVM_MAX_VCPUS) {
-   do {
-   sigcpu = fi->next_rr_cpu++;
-   if (sigcpu == KVM_MAX_VCPUS)
-   sigcpu = fi->next_rr_cpu = 0;
-   } while (fi->local_int[sigcpu] == NULL);
-   }
-   li = fi->local_int[sigcpu];
-   spin_lock_bh(&li->lock);
-

[RFC 0/2] KVM: s390: add floating irq controller

2013-07-26 Thread Jens Freimann
This series adds a kvm_device that acts as an irq controller for floating
interrupts.  As a first step it implements functionality to retrieve and inject
interrupts for the purpose of migration and for hardening the reset code by
allowing user space to explicitly remove all pending floating interrupts.

The PFAULT patches will also use this device for enabling/disabling pfault;
the pfault patch series will therefore be reworked to use this device, depending
on review feedback.

* Patch 1 adds a new data structure to hold interrupt information. The current
one (struct kvm_s390_interrupt) does not allow injecting all kinds of interrupts,
e.g. some data for program interrupts and machine check interruptions is
missing.

* Patch 2 adds a kvm_device which supports getting/setting currently pending
floating interrupts as well as deleting all currently pending interrupts


Jens Freimann (2):
  s390/kvm: add and extend interrupt information data structs
  s390/kvm: add floating irq controller

 arch/s390/include/asm/kvm_host.h |  45 +
 arch/s390/include/uapi/asm/kvm.h |   5 +
 arch/s390/kvm/interrupt.c| 365 +--
 arch/s390/kvm/kvm-s390.c |   1 +
 arch/s390/kvm/priv.c |  22 +--
 arch/s390/kvm/sigp.c |  14 +-
 include/linux/kvm_host.h |   1 +
 include/uapi/linux/kvm.h |  63 +++
 virt/kvm/kvm_main.c  |   3 +
 9 files changed, 330 insertions(+), 189 deletions(-)

-- 
1.8.0.1



RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


> -Original Message-
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
> Behalf Of Alexander Graf
> Sent: Friday, July 26, 2013 2:20 PM
> To: Benjamin Herrenschmidt
> Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org;
> linuxppc-...@lists.ozlabs.org; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
> 
> 
> On 26.07.2013, at 10:26, Benjamin Herrenschmidt wrote:
> 
> > On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
> >> If the page is RAM then map this as cacheable and coherent (set "M"
> >> bit) otherwise this page is treated as I/O and map this as cache
> >> inhibited and guarded (set  "I + G")
> >>
> >> This helps setting proper MMU mapping for direct assigned device.
> >>
> >> NOTE: There can be devices that require cacheable mapping, which is not yet
> supported.
> >
> > Why don't you do like server instead and enforce the use of the same I
> > and M bits as the corresponding qemu PTE ?
> 
> Specifically, Ben is talking about this code:
> 
> 
> /* Translate to host virtual address */
> hva = __gfn_to_hva_memslot(memslot, gfn);
> 
> /* Look up the Linux PTE for the backing page */
> pte_size = psize;
> pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
> if (pte_present(pte)) {
> if (writing && !pte_write(pte))
> /* make the actual HPTE be read-only */
> ptel = hpte_make_readonly(ptel);
> is_io = hpte_cache_bits(pte_val(pte));
> pa = pte_pfn(pte) << PAGE_SHIFT;
> }
> 

Wouldn't searching the Linux PTE be overkill?

=Bharat





[PATCH 6/8] KVM: s390: Fix sparse warnings in priv.c

2013-07-26 Thread Christian Borntraeger
From: Thomas Huth 

sparse complained about the missing UL postfix for long constants.
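
A small sketch of the register update this patch touches. The hex constants in the diff below are truncated in this archive, so the two masks here are assumptions chosen to fit the code comment ("extracts the mask half of the psw") and the `>> 32` shift. The sketch uses `UINT64_C` so it also builds where `unsigned long` is 32 bits; the kernel code can use a plain `UL` suffix because `unsigned long` is 64 bits on s390x. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed masks: upper and lower 32-bit halves of a 64-bit register. */
#define SKETCH_HI32_MASK UINT64_C(0xffffffff00000000)
#define SKETCH_LO32_MASK UINT64_C(0x00000000ffffffff)

/* Store the upper half of psw_mask in the low word of *r1 and, if r2 is
 * non-NULL, the lower half in the low word of *r2. High words are kept. */
static void sketch_extract_psw(uint64_t psw_mask, uint64_t *r1, uint64_t *r2)
{
    *r1 &= SKETCH_HI32_MASK;
    *r1 |= psw_mask >> 32;
    if (r2) {
        *r2 &= SKETCH_HI32_MASK;
        *r2 |= psw_mask & SKETCH_LO32_MASK;
    }
}
```

A 64-bit constant written without a suffix still gets a wide enough type from the compiler, but sparse flags it, which is the warning being silenced here.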

Signed-off-by: Thomas Huth 
Acked-by: Cornelia Huck 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/kvm/priv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 2883026..bb69df0 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -491,12 +491,12 @@ static int handle_epsw(struct kvm_vcpu *vcpu)
kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
 
/* This basically extracts the mask half of the psw. */
-   vcpu->run->s.regs.gprs[reg1] &= 0x;
+   vcpu->run->s.regs.gprs[reg1] &= 0xUL;
vcpu->run->s.regs.gprs[reg1] |= vcpu->arch.sie_block->gpsw.mask >> 32;
if (reg2) {
-   vcpu->run->s.regs.gprs[reg2] &= 0x;
+   vcpu->run->s.regs.gprs[reg2] &= 0xUL;
vcpu->run->s.regs.gprs[reg2] |=
-   vcpu->arch.sie_block->gpsw.mask & 0x;
+   vcpu->arch.sie_block->gpsw.mask & 0xUL;
}
return 0;
 }
-- 
1.8.3.1



[PATCH 5/8] KVM: s390: declare virtual HW facilities

2013-07-26 Thread Christian Borntraeger
From: Michael Mueller 

The patch renames the array holding the HW facility bitmaps.
This allows interpreting the variable as a set of virtual
machine specific "virtual" facilities. The basic idea is
to make virtual facilities externally manageable in the future.
An availability test for virtual facilities has been added
as well.
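
A sketch of what such an availability test amounts to, assuming the facility list is a byte array in which facility 0 is the most significant bit of byte 0. The function name, the explicit length parameter, and the bounds check are assumptions for illustration; the patch itself delegates to __test_facility().

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if facility bit nr is set in list (MSB-first numbering), else 0.
 * Bits beyond the end of the list are reported as not available. */
static int sketch_test_facility(unsigned long nr, const uint8_t *list, size_t len)
{
    if ((nr >> 3) >= len)
        return 0;
    return (list[nr >> 3] & (0x80u >> (nr & 7))) != 0;
}
```

With a list of `{0x80, 0x01}`, facilities 0 and 15 test as available and facility 1 does not.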

Signed-off-by: Michael Mueller 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/kvm/kvm-s390.c | 23 +++
 arch/s390/kvm/kvm-s390.h |  3 +++
 arch/s390/kvm/priv.c | 11 ---
 3 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 39894aa..776dafe 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "kvm-s390.h"
 #include "gaccess.h"
@@ -84,9 +85,15 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ NULL }
 };
 
-static unsigned long long *facilities;
+unsigned long *vfacilities;
 static struct gmap_notifier gmap_notifier;
 
+/* test availability of vfacility */
+static inline int test_vfacility(unsigned long nr)
+{
+   return __test_facility(nr, (void *) vfacilities);
+}
+
 /* Section: not file related */
 int kvm_arch_hardware_enable(void *garbage)
 {
@@ -387,7 +394,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb   = 6;
vcpu->arch.sie_block->ecb2  = 8;
vcpu->arch.sie_block->eca   = 0xC1002001U;
-   vcpu->arch.sie_block->fac   = (int) (long) facilities;
+   vcpu->arch.sie_block->fac   = (int) (long) vfacilities;
hrtimer_init(&vcpu->arch.ckc_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
tasklet_init(&vcpu->arch.tasklet, kvm_s390_tasklet,
 (unsigned long) vcpu);
@@ -1133,20 +1140,20 @@ static int __init kvm_s390_init(void)
 * to hold the maximum amount of facilities. On the other hand, we
 * only set facilities that are known to work in KVM.
 */
-   facilities = (unsigned long long *) get_zeroed_page(GFP_KERNEL|GFP_DMA);
-   if (!facilities) {
+   vfacilities = (unsigned long *) get_zeroed_page(GFP_KERNEL|GFP_DMA);
+   if (!vfacilities) {
kvm_exit();
return -ENOMEM;
}
-   memcpy(facilities, S390_lowcore.stfle_fac_list, 16);
-   facilities[0] &= 0xff82fff3f47cULL;
-   facilities[1] &= 0x001cULL;
+   memcpy(vfacilities, S390_lowcore.stfle_fac_list, 16);
+   vfacilities[0] &= 0xff82fff3f47cUL;
+   vfacilities[1] &= 0x001cUL;
return 0;
 }
 
 static void __exit kvm_s390_exit(void)
 {
-   free_page((unsigned long) facilities);
+   free_page((unsigned long) vfacilities);
kvm_exit();
 }
 
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 028ca9f..faa4df6 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -24,6 +24,9 @@
 
 typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
 
+/* declare vfacilities extern */
+extern unsigned long *vfacilities;
+
 /* negativ values are error codes, positive values for internal conditions */
 #define SIE_INTERCEPT_RERUNVCPU(1<<0)
 #define SIE_INTERCEPT_UCONTROL (1<<1)
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 4cdc54e..2883026 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -228,7 +228,6 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
 
 static int handle_stfl(struct kvm_vcpu *vcpu)
 {
-   unsigned int facility_list;
int rc;
 
vcpu->stat.instruction_stfl++;
@@ -236,15 +235,13 @@ static int handle_stfl(struct kvm_vcpu *vcpu)
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
-   /* only pass the facility bits, which we can handle */
-   facility_list = S390_lowcore.stfl_fac_list & 0xff82fff3;
-
rc = copy_to_guest(vcpu, offsetof(struct _lowcore, stfl_fac_list),
-  &facility_list, sizeof(facility_list));
+  vfacilities, 4);
if (rc)
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
-   VCPU_EVENT(vcpu, 5, "store facility list value %x", facility_list);
-   trace_kvm_s390_handle_stfl(vcpu, facility_list);
+   VCPU_EVENT(vcpu, 5, "store facility list value %x",
+  *(unsigned int *) vfacilities);
+   trace_kvm_s390_handle_stfl(vcpu, *(unsigned int *) vfacilities);
return 0;
 }
 
-- 
1.8.3.1



[PATCH 7/8] KVM: s390: Add helper function for setting condition code

2013-07-26 Thread Christian Borntraeger
From: Thomas Huth 

Introduced a helper function for setting the CC in the
guest PSW to improve the readability of the code.
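
A standalone sketch of the same clear-then-set pattern on a plain 64-bit value. The function name and the `& 3` on the argument are assumptions added here as a defensive choice; the helper in the diff below shifts `cc` without masking it.

```c
#include <stdint.h>

/* The condition code occupies two bits starting at bit 44 (counting from
 * the least significant bit), as in the shifts used by the patch. */
#define SKETCH_PSW_CC_SHIFT 44

/* Return mask with its condition-code field replaced by cc (0..3). */
static uint64_t sketch_set_psw_cc(uint64_t mask, unsigned int cc)
{
    mask &= ~(UINT64_C(3) << SKETCH_PSW_CC_SHIFT);
    mask |= (uint64_t)(cc & 3u) << SKETCH_PSW_CC_SHIFT;
    return mask;
}
```

Clearing before setting matters: an OR alone, as in the old `|= 3ul << 44` lines, is only correct for cc 3, because it cannot turn already-set bits off.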

Signed-off-by: Thomas Huth 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/kvm/kvm-s390.h |  7 +++
 arch/s390/kvm/priv.c | 15 ++-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index faa4df6..dc99f1c 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -115,6 +115,13 @@ static inline u64 kvm_s390_get_base_disp_rs(struct kvm_vcpu *vcpu)
return (base2 ? vcpu->run->s.regs.gprs[base2] : 0) + disp2;
 }
 
+/* Set the condition code in the guest program status word */
+static inline void kvm_s390_set_psw_cc(struct kvm_vcpu *vcpu, unsigned long cc)
+{
+   vcpu->arch.sie_block->gpsw.mask &= ~(3UL << 44);
+   vcpu->arch.sie_block->gpsw.mask |= cc << 44;
+}
+
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
 void kvm_s390_tasklet(unsigned long parm);
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index bb69df0..59200ee 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -164,8 +164,7 @@ static int handle_tpi(struct kvm_vcpu *vcpu)
kfree(inti);
 no_interrupt:
/* Set condition code and we're done. */
-   vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44);
-   vcpu->arch.sie_block->gpsw.mask |= (cc & 3ul) << 44;
+   kvm_s390_set_psw_cc(vcpu, cc);
return 0;
 }
 
@@ -220,8 +219,7 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
 * Set condition code 3 to stop the guest from issueing channel
 * I/O instructions.
 */
-   vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44);
-   vcpu->arch.sie_block->gpsw.mask |= (3 & 3ul) << 44;
+   kvm_s390_set_psw_cc(vcpu, 3);
return 0;
}
 }
@@ -384,7 +382,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
if (fc > 3) {
-   vcpu->arch.sie_block->gpsw.mask |= 3ul << 44; /* cc 3 */
+   kvm_s390_set_psw_cc(vcpu, 3);
return 0;
}
 
@@ -394,7 +392,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 
if (fc == 0) {
vcpu->run->s.regs.gprs[0] = 3 << 28;
-   vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44);  /* cc 0 */
+   kvm_s390_set_psw_cc(vcpu, 0);
return 0;
}
 
@@ -428,12 +426,11 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
}
trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
free_page(mem);
-   vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44);
+   kvm_s390_set_psw_cc(vcpu, 0);
vcpu->run->s.regs.gprs[0] = 0;
return 0;
 out_no_data:
-   /* condition code 3 */
-   vcpu->arch.sie_block->gpsw.mask |= 3ul << 44;
+   kvm_s390_set_psw_cc(vcpu, 3);
 out_exception:
free_page(mem);
return rc;
-- 
1.8.3.1



[PATCH 4/8] KVM: s390: fix task size check

2013-07-26 Thread Christian Borntraeger
From: Martin Schwidefsky 

The gmap_map_segment function uses PGDIR_SIZE in the check for the
maximum address in the task's address space. This incorrectly limits
the amount of memory usable for a kvm guest to 4TB. The correct limit
is (1UL << 53). As TASK_SIZE has different values (4TB vs 8PB)
depending on the existence of the fourth page table level, create
a new define 'TASK_MAX_SIZE' for (1UL << 53).

Signed-off-by: Martin Schwidefsky 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/include/asm/processor.h | 2 ++
 arch/s390/mm/pgtable.c            | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6b49987..83c85c2 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -43,6 +43,7 @@ extern void execve_tail(void);
 #ifndef CONFIG_64BIT
 
 #define TASK_SIZE  (1UL << 31)
+#define TASK_MAX_SIZE  (1UL << 31)
 #define TASK_UNMAPPED_BASE (1UL << 30)
 
 #else /* CONFIG_64BIT */
@@ -51,6 +52,7 @@ extern void execve_tail(void);
 #define TASK_UNMAPPED_BASE (test_thread_flag(TIF_31BIT) ? \
(1UL << 30) : (1UL << 41))
 #define TASK_SIZE  TASK_SIZE_OF(current)
+#define TASK_MAX_SIZE  (1UL << 53)
 
 #endif /* CONFIG_64BIT */
 
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 6d33248..967d0bf 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -335,7 +335,7 @@ int gmap_map_segment(struct gmap *gmap, unsigned long from,
 
if ((from | to | len) & (PMD_SIZE - 1))
return -EINVAL;
-   if (len == 0 || from + len > PGDIR_SIZE ||
+   if (len == 0 || from + len > TASK_MAX_SIZE ||
from + len < from || to + len < to)
return -EINVAL;
 
-- 
1.8.3.1
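
The effect of the changed limit can be checked with a small userspace model of
the range test in gmap_map_segment(). This is only a sketch: the _MODEL
constants and gmap_range_ok() are hypothetical names, the 1 MB segment size is
an assumption, and the limit is passed as a parameter so the old (4TB) and new
(8PB) behaviour can be compared side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMD_SIZE_MODEL      (UINT64_C(1) << 20)	/* assumed 1 MB segment size */
#define PGDIR_SIZE_MODEL    (UINT64_C(1) << 42)	/* 4 TB, the old limit */
#define TASK_MAX_SIZE_MODEL (UINT64_C(1) << 53)	/* 8 PB, the new limit */

/* Same checks as the patched hunk: segment alignment, non-zero length,
 * upper limit on the guest range, and no wrap-around on either range. */
static inline bool gmap_range_ok(uint64_t from, uint64_t to, uint64_t len,
				 uint64_t limit)
{
	if ((from | to | len) & (PMD_SIZE_MODEL - 1))
		return false;
	if (len == 0 || from + len > limit ||
	    from + len < from || to + len < to)
		return false;
	return true;
}

/* Usage: a segment at 5 TB is rejected by the old limit only. */
static inline void gmap_range_example(void)
{
	uint64_t five_tb = UINT64_C(5) << 40;

	assert(!gmap_range_ok(five_tb, 0, PMD_SIZE_MODEL, PGDIR_SIZE_MODEL));
	assert(gmap_range_ok(five_tb, 0, PMD_SIZE_MODEL, TASK_MAX_SIZE_MODEL));
}
```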



[PATCH 8/8] KVM: s390: Make KVM_HVA_ERR_BAD usable on s390

2013-07-26 Thread Christian Borntraeger
From: Dominik Dingel 

Current common code uses PAGE_OFFSET to indicate a bad host virtual address.
As this check won't work on architectures that don't map kernel and user memory
into the same address space (e.g. s390), such architectures can now provide
their own KVM_HVA_ERR_BAD defines.

Signed-off-by: Dominik Dingel 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/include/asm/kvm_host.h | 8 ++++++++
 include/linux/kvm_host.h         | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3238d40..e87ecaa 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -274,6 +274,14 @@ struct kvm_arch{
int css_support;
 };
 
+#define KVM_HVA_ERR_BAD    (-1UL)
+#define KVM_HVA_ERR_RO_BAD (-2UL)
+
+static inline bool kvm_is_error_hva(unsigned long addr)
+{
+   return IS_ERR_VALUE(addr);
+}
+
 extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 extern char sie_exit;
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c11c7686..ca645a0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -85,6 +85,12 @@ static inline bool is_noslot_pfn(pfn_t pfn)
return pfn == KVM_PFN_NOSLOT;
 }
 
+/*
+ * architectures with KVM_HVA_ERR_BAD other than PAGE_OFFSET (e.g. s390)
+ * provide own defines and kvm_is_error_hva
+ */
+#ifndef KVM_HVA_ERR_BAD
+
 #define KVM_HVA_ERR_BAD    (PAGE_OFFSET)
 #define KVM_HVA_ERR_RO_BAD (PAGE_OFFSET + PAGE_SIZE)
 
@@ -93,6 +99,8 @@ static inline bool kvm_is_error_hva(unsigned long addr)
return addr >= PAGE_OFFSET;
 }
 
+#endif
+
 #define KVM_ERR_PTR_BAD_PAGE   (ERR_PTR(-ENOENT))
 
 static inline bool is_error_page(struct page *page)
-- 
1.8.3.1
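
To see why the s390 variant works without a split kernel/user address space,
the check can be modelled in userspace as below. This is a sketch under the
assumption that IS_ERR_VALUE() treats the top 4095 values of unsigned long as
error codes; all _MODEL and _model names are invented for the example. With
the common-code test (addr >= PAGE_OFFSET), a PAGE_OFFSET of 0, which I
believe is what s390 uses, would classify every address as an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's MAX_ERRNO. */
#define MAX_ERRNO_MODEL 4095UL

#define HVA_ERR_BAD_MODEL    (-1UL)
#define HVA_ERR_RO_BAD_MODEL (-2UL)

/* Model of kvm_is_error_hva() on s390: only the last MAX_ERRNO values
 * of the address range are treated as error markers. */
static inline bool is_error_hva_model(unsigned long addr)
{
	return addr >= (unsigned long)-MAX_ERRNO_MODEL;
}

/* Usage: both markers are errors, ordinary addresses are not, even
 * when they sit in the upper half of the address space. */
static inline void hva_error_example(void)
{
	unsigned long high_addr = (~0UL >> 1) + 1;

	assert(is_error_hva_model(HVA_ERR_BAD_MODEL));
	assert(is_error_hva_model(HVA_ERR_RO_BAD_MODEL));
	assert(!is_error_hva_model(high_addr));
	assert(!is_error_hva_model(0x1000UL));
}
```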



[PATCH 2/8] KVM: s390: fix pfmf non-quiescing control handling

2013-07-26 Thread Christian Borntraeger
From: Heiko Carstens 

Fix the test within handle_pfmf() that checks whether the host has the
NQ key-setting facility installed.
Right now the code incorrectly generates a program check in the
guest if the NQ control bit for a pfmf request is set and the host
has the NQ key-setting facility installed.

Signed-off-by: Heiko Carstens 
Reviewed-by: Thomas Huth 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/kvm/priv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 0da3e6e..4cdc54e 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -532,8 +533,7 @@ static int handle_pfmf(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
/* Only provide non-quiescing support if the host supports it */
-   if (vcpu->run->s.regs.gprs[reg1] & PFMF_NQ &&
-   S390_lowcore.stfl_fac_list & 0x00020000)
+   if (vcpu->run->s.regs.gprs[reg1] & PFMF_NQ && !test_facility(14))
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
/* No support for conditional-SSKE */
-- 
1.8.3.1
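
The corrected condition can be illustrated with a small userspace model. It
assumes, as I understand the s390 convention, that facility bits are numbered
from the most significant bit of the facility list, so facility 14 is the 0x02
bit of the second byte. test_facility_model() and pfmf_nq_rejected() are
hypothetical names for this sketch, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of a facility test: bit 0 is the leftmost bit of byte 0. */
static inline bool test_facility_model(const uint8_t *fac_list, unsigned int nr)
{
	return (fac_list[nr >> 3] & (0x80u >> (nr & 7))) != 0;
}

/* Condition from the patched handle_pfmf(): reject the request only if
 * the guest asked for NQ and the host lacks facility 14. */
static inline bool pfmf_nq_rejected(bool nq_requested, const uint8_t *fac_list)
{
	return nq_requested && !test_facility_model(fac_list, 14);
}

/* Usage: a host with facility 14 accepts NQ requests, one without
 * it rejects them; requests without NQ are never rejected here. */
static inline void pfmf_nq_example(void)
{
	const uint8_t with_nq[4] = { 0x00, 0x02, 0x00, 0x00 };
	const uint8_t without_nq[4] = { 0x00, 0x00, 0x00, 0x00 };

	assert(!pfmf_nq_rejected(true, with_nq));
	assert(pfmf_nq_rejected(true, without_nq));
	assert(!pfmf_nq_rejected(false, without_nq));
}
```

The old test was effectively inverted: it injected the specification exception
exactly when the facility bit was set.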



[PATCH 3/8] KVM: s390: allow sie enablement for multi-threaded programs

2013-07-26 Thread Christian Borntraeger
From: Martin Schwidefsky 

Improve the code to upgrade the standard 2K page tables to 4K page tables
with PGSTEs to allow the operation to happen when the program is already
multi-threaded.

Signed-off-by: Martin Schwidefsky 
Signed-off-by: Christian Borntraeger 
---
 arch/s390/include/asm/mmu.h |   2 -
 arch/s390/include/asm/mmu_context.h |  19 +---
 arch/s390/include/asm/pgtable.h |  11 +++
 arch/s390/mm/pgtable.c  | 181 +++-
 4 files changed, 129 insertions(+), 84 deletions(-)

diff --git a/arch/s390/include/asm/mmu.h b/arch/s390/include/asm/mmu.h
index 6340178..ff132ac 100644
--- a/arch/s390/include/asm/mmu.h
+++ b/arch/s390/include/asm/mmu.h
@@ -12,8 +12,6 @@ typedef struct {
unsigned long asce_bits;
unsigned long asce_limit;
unsigned long vdso_base;
-   /* Cloned contexts will be created with extended page tables. */
-   unsigned int alloc_pgste:1;
/* The mmu context has extended page tables. */
unsigned int has_pgste:1;
 } mm_context_t;
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 084e775..4fb67a0 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -21,24 +21,7 @@ static inline int init_new_context(struct task_struct *tsk,
 #ifdef CONFIG_64BIT
mm->context.asce_bits |= _ASCE_TYPE_REGION3;
 #endif
-   if (current->mm && current->mm->context.alloc_pgste) {
-   /*
-* alloc_pgste indicates, that any NEW context will be created
-* with extended page tables. The old context is unchanged. The
-* page table allocation and the page table operations will
-* look at has_pgste to distinguish normal and extended page
-* tables. The only way to create extended page tables is to
-* set alloc_pgste and then create a new context (e.g. dup_mm).
-* The page table allocation is called after init_new_context
-* and if has_pgste is set, it will create extended page
-* tables.
-*/
-   mm->context.has_pgste = 1;
-   mm->context.alloc_pgste = 1;
-   } else {
-   mm->context.has_pgste = 0;
-   mm->context.alloc_pgste = 0;
-   }
+   mm->context.has_pgste = 0;
mm->context.asce_limit = STACK_TOP_MAX;
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 75fb726..7a60bb9 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1361,6 +1361,17 @@ static inline pmd_t pmd_mkwrite(pmd_t pmd)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLB_PAGE */
 
+static inline void pmdp_flush_lazy(struct mm_struct *mm,
+  unsigned long address, pmd_t *pmdp)
+{
+   int active = (mm == current->active_mm) ? 1 : 0;
+
+   if ((atomic_read(&mm->context.attach_count) & 0xffff) > active)
+   __pmd_idte(address, pmdp);
+   else
+   mm->context.flush_mm = 1;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
 #define __HAVE_ARCH_PGTABLE_DEPOSIT
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index a8154a1..6d33248 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -731,6 +731,11 @@ void gmap_do_ipte_notify(struct mm_struct *mm, unsigned long addr, pte_t *pte)
spin_unlock(&gmap_notifier_lock);
 }
 
+static inline int page_table_with_pgste(struct page *page)
+{
+   return atomic_read(&page->_mapcount) == 0;
+}
+
 static inline unsigned long *page_table_alloc_pgste(struct mm_struct *mm,
unsigned long vmaddr)
 {
@@ -750,7 +755,7 @@ static inline unsigned long *page_table_alloc_pgste(struct mm_struct *mm,
mp->vmaddr = vmaddr & PMD_MASK;
INIT_LIST_HEAD(&mp->mapper);
page->index = (unsigned long) mp;
-   atomic_set(&page->_mapcount, 3);
+   atomic_set(&page->_mapcount, 0);
table = (unsigned long *) page_to_phys(page);
clear_table(table, _PAGE_TYPE_EMPTY, PAGE_SIZE/2);
clear_table(table + PTRS_PER_PTE, 0, PAGE_SIZE/2);
@@ -821,6 +826,11 @@ EXPORT_SYMBOL(set_guest_storage_key);
 
 #else /* CONFIG_PGSTE */
 
+static inline int page_table_with_pgste(struct page *page)
+{
+   return 0;
+}
+
 static inline unsigned long *page_table_alloc_pgste(struct mm_struct *mm,
unsigned long vmaddr)
 {
@@ -897,12 +907,12 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
struct page *page;
unsigned int bit, mask;
 
-   if (mm_has_pgste(mm)) {
+   page = pfn_to_page(__pa(table) >> PAGE_SHIFT);
+   if (page_table_with_pgste(page)) {
gmap_disconnect_pgtable(mm, t

[PATCH 1/8] KVM: s390: move kvm_guest_enter,exit closer to sie

2013-07-26 Thread Christian Borntraeger
From: Dominik Dingel 

Any uaccess between guest_enter and guest_exit could trigger a page fault;
the page fault handler would then handle it as a guest fault and translate
a user address as a guest address.

Signed-off-by: Dominik Dingel 
Signed-off-by: Christian Borntraeger 
CC: sta...@vger.kernel.org
---
 arch/s390/kvm/kvm-s390.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a3d797b..39894aa 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -702,14 +702,25 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
return rc;
 
vcpu->arch.sie_block->icptcode = 0;
-   preempt_disable();
-   kvm_guest_enter();
-   preempt_enable();
VCPU_EVENT(vcpu, 6, "entering sie flags %x",
   atomic_read(&vcpu->arch.sie_block->cpuflags));
trace_kvm_s390_sie_enter(vcpu,
 atomic_read(&vcpu->arch.sie_block->cpuflags));
+
+   /*
+* As PF_VCPU will be used in fault handler, between guest_enter
+* and guest_exit should be no uaccess.
+*/
+   preempt_disable();
+   kvm_guest_enter();
+   preempt_enable();
rc = sie64a(vcpu->arch.sie_block, vcpu->run->s.regs.gprs);
+   kvm_guest_exit();
+
+   VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
+  vcpu->arch.sie_block->icptcode);
+   trace_kvm_s390_sie_exit(vcpu, vcpu->arch.sie_block->icptcode);
+
if (rc > 0)
rc = 0;
if (rc < 0) {
@@ -721,10 +732,6 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
rc = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
}
}
-   VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
-  vcpu->arch.sie_block->icptcode);
-   trace_kvm_s390_sie_exit(vcpu, vcpu->arch.sie_block->icptcode);
-   kvm_guest_exit();
 
memcpy(&vcpu->run->s.regs.gprs[14], &vcpu->arch.sie_block->gg14, 16);
return rc;
-- 
1.8.3.1
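
The hazard described in the commit message can be pictured with a toy model.
This is not kernel code: struct task_model, the _model functions and the enum
are invented for the illustration, and the single boolean stands in for the
PF_VCPU flag that the fault handler consults.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_kind { FAULT_USER, FAULT_GUEST };

/* Stand-in for the per-task PF_VCPU flag. */
struct task_model {
	bool pf_vcpu;
};

static inline void guest_enter_model(struct task_model *t)
{
	t->pf_vcpu = true;
}

static inline void guest_exit_model(struct task_model *t)
{
	t->pf_vcpu = false;
}

/* The fault handler decides how to translate the faulting address
 * purely from the flag, so it cannot tell a uaccess apart from a
 * real guest fault while the flag is set. */
static inline enum fault_kind classify_fault(const struct task_model *t)
{
	return t->pf_vcpu ? FAULT_GUEST : FAULT_USER;
}

/* Usage: a fault is treated as a guest fault only inside the window. */
static inline void guest_window_example(void)
{
	struct task_model t = { false };

	assert(classify_fault(&t) == FAULT_USER);
	guest_enter_model(&t);
	assert(classify_fault(&t) == FAULT_GUEST);
	guest_exit_model(&t);
	assert(classify_fault(&t) == FAULT_USER);
}
```

Keeping the window as small as possible, directly around sie64a(), means the
event logging and tracing calls can no longer fault while the flag is set.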



[PATCH 0/8] KVM: s390: fixes and cleanup

2013-07-26 Thread Christian Borntraeger
Gleb, Paolo,

here are some fixes and cleanups for KVM/s390.

The first two patches
"KVM: s390: move kvm_guest_enter,exit closer to sie"
and
"KVM: s390: fix pfmf non-quiescing control handling"

should go into 3.11. Everything else looks more like 3.12.
Please apply.

Christian


Dominik Dingel (2):
  KVM: s390: move kvm_guest_enter,exit closer to sie
  KVM: s390: Make KVM_HVA_ERR_BAD usable on s390

Heiko Carstens (1):
  KVM: s390: fix pfmf non-quiescing control handling

Martin Schwidefsky (2):
  KVM: s390: allow sie enablement for multi-threaded programs
  KVM: s390: fix task size check

Michael Mueller (1):
  KVM: s390: declare virtual HW facilities

Thomas Huth (2):
  KVM: s390: Fix sparse warnings in priv.c
  KVM: s390: Add helper function for setting condition code

 arch/s390/include/asm/kvm_host.h|   8 ++
 arch/s390/include/asm/mmu.h |   2 -
 arch/s390/include/asm/mmu_context.h |  19 +---
 arch/s390/include/asm/pgtable.h |  11 +++
 arch/s390/include/asm/processor.h   |   2 +
 arch/s390/kvm/kvm-s390.c|  44 ++---
 arch/s390/kvm/kvm-s390.h|  10 ++
 arch/s390/kvm/priv.c|  36 +++
 arch/s390/mm/pgtable.c  | 183 +++-
 include/linux/kvm_host.h|   8 ++
 10 files changed, 202 insertions(+), 121 deletions(-)

-- 
1.8.3.1



RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
> Behalf Of Alexander Graf
> Sent: Friday, July 26, 2013 2:20 PM
> To: Benjamin Herrenschmidt
> Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org;
> linuxppc-...@lists.ozlabs.org; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
> 
> 
> On 26.07.2013, at 10:26, Benjamin Herrenschmidt wrote:
> 
> > On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
> >> If the page is RAM then map this as cacheable and coherent (set "M"
> >> bit) otherwise this page is treated as I/O and map this as cache
> >> inhibited and guarded (set  "I + G")
> >>
> >> This helps setting proper MMU mapping for direct assigned device.
> >>
> >> NOTE: There can be devices that require cacheable mapping, which is not yet supported.
> >
> > Why don't you do like server instead and enforce the use of the same I
> > and M bits as the corresponding qemu PTE ?
> 
> Specifically, Ben is talking about this code:
> 
> 
> /* Translate to host virtual address */
> hva = __gfn_to_hva_memslot(memslot, gfn);
> 
> /* Look up the Linux PTE for the backing page */
> pte_size = psize;
> pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
> if (pte_present(pte)) {
> if (writing && !pte_write(pte))
> /* make the actual HPTE be read-only */
> ptel = hpte_make_readonly(ptel);
> is_io = hpte_cache_bits(pte_val(pte));
> pa = pte_pfn(pte) << PAGE_SHIFT;
> }
> 

Ok

Thanks
-Bharat


> 
> Alex
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body
> of a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html




RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


> -----Original Message-----
> From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
> Sent: Friday, July 26, 2013 1:57 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
> ag...@suse.de; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
> 
> On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
> > If the page is RAM then map this as cacheable and coherent (set "M"
> > bit) otherwise this page is treated as I/O and map this as cache
> > inhibited and guarded (set  "I + G")
> >
> > This helps setting proper MMU mapping for direct assigned device.
> >
> > NOTE: There can be devices that require cacheable mapping, which is not yet supported.
> 
> Why don't you do like server instead and enforce the use of the same I and M
> bits as the corresponding qemu PTE ?

Ben/Alex, I will look into the code. Can you please describe how this is 
handled on server?

Thanks
-Bharat

> 
> Cheers,
> Ben.
> 
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  arch/powerpc/kvm/e500_mmu_host.c |   24 +++++++++++++++++++-----
> >  1 files changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500_mmu_host.c
> > b/arch/powerpc/kvm/e500_mmu_host.c
> > index 1c6a9d7..5cbdc8f 100644
> > --- a/arch/powerpc/kvm/e500_mmu_host.c
> > +++ b/arch/powerpc/kvm/e500_mmu_host.c
> > @@ -64,13 +64,27 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
> > return mas3;
> >  }
> >
> > -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
> > +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
> >  {
> > +   u32 mas2_attr;
> > +
> > +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
> > +
> > +   if (kvm_is_mmio_pfn(pfn)) {
> > +   /*
> > +* If page is not RAM then it is treated as I/O page.
> > +* Map it with cache inhibited and guarded (set "I" + "G").
> > +*/
> > +   mas2_attr |= MAS2_I | MAS2_G;
> > +   return mas2_attr;
> > +   }
> > +
> > +   /* Map RAM pages as cacheable (Not setting "I" in MAS2) */
> >  #ifdef CONFIG_SMP
> > -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
> > -#else
> > -   return mas2 & MAS2_ATTRIB_MASK;
> > +   /* Also map as coherent (set "M") in SMP */
> > +   mas2_attr |= MAS2_M;
> >  #endif
> > +   return mas2_attr;
> >  }
> >
> >  /*
> > @@ -313,7 +327,7 @@ static void kvmppc_e500_setup_stlbe(
> > /* Force IPROT=0 for all guest mappings. */
> > stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
> > stlbe->mas2 = (gvaddr & MAS2_EPN) |
> > - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
> > + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
> > stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
> > e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
> >
> 
> 



Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Alexander Graf

On 26.07.2013, at 10:26, Benjamin Herrenschmidt wrote:

> On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
>> If the page is RAM then map this as cacheable and coherent (set "M" bit)
>> otherwise this page is treated as I/O and map this as cache inhibited
>> and guarded (set  "I + G")
>> 
>> This helps setting proper MMU mapping for direct assigned device.
>> 
>> NOTE: There can be devices that require cacheable mapping, which is not yet 
>> supported.
> 
> Why don't you do like server instead and enforce the use of the same I
> and M bits as the corresponding qemu PTE ?

Specifically, Ben is talking about this code:


/* Translate to host virtual address */
hva = __gfn_to_hva_memslot(memslot, gfn);

/* Look up the Linux PTE for the backing page */
pte_size = psize;
pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
if (pte_present(pte)) {
if (writing && !pte_write(pte))
/* make the actual HPTE be read-only */
ptel = hpte_make_readonly(ptel);
is_io = hpte_cache_bits(pte_val(pte));
pa = pte_pfn(pte) << PAGE_SHIFT;
}


Alex



Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Benjamin Herrenschmidt
On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
> If the page is RAM then map this as cacheable and coherent (set "M" bit)
> otherwise this page is treated as I/O and map this as cache inhibited
> and guarded (set  "I + G")
> 
> This helps setting proper MMU mapping for direct assigned device.
> 
> NOTE: There can be devices that require cacheable mapping, which is not yet 
> supported.

Why don't you do like server instead and enforce the use of the same I
and M bits as the corresponding qemu PTE ?

Cheers,
Ben.

> Signed-off-by: Bharat Bhushan 
> ---
>  arch/powerpc/kvm/e500_mmu_host.c |   24 +++++++++++++++++++-----
>  1 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
> b/arch/powerpc/kvm/e500_mmu_host.c
> index 1c6a9d7..5cbdc8f 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -64,13 +64,27 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
> usermode)
>   return mas3;
>  }
>  
> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
>  {
> + u32 mas2_attr;
> +
> + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
> +
> + if (kvm_is_mmio_pfn(pfn)) {
> + /*
> +  * If page is not RAM then it is treated as I/O page.
> +  * Map it with cache inhibited and guarded (set "I" + "G").
> +  */
> + mas2_attr |= MAS2_I | MAS2_G;
> + return mas2_attr;
> + }
> +
> + /* Map RAM pages as cacheable (Not setting "I" in MAS2) */
>  #ifdef CONFIG_SMP
> - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
> -#else
> - return mas2 & MAS2_ATTRIB_MASK;
> + /* Also map as coherent (set "M") in SMP */
> + mas2_attr |= MAS2_M;
>  #endif
> + return mas2_attr;
>  }
>  
>  /*
> @@ -313,7 +327,7 @@ static void kvmppc_e500_setup_stlbe(
>   /* Force IPROT=0 for all guest mappings. */
>   stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
>   stlbe->mas2 = (gvaddr & MAS2_EPN) |
> -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
> +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
>   stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
>   e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
>  
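
The attribute selection in the quoted patch can be tried in isolation with the
userspace sketch below. The MAS2 bit values are what I recall from the Book E
MAS2 layout and should be treated as assumptions, as should the attribute mask
(X0 | X1); the is_mmio and smp parameters replace kvm_is_mmio_pfn() and the
CONFIG_SMP #ifdef so that both paths can be exercised in one build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MAS2 bit values; names carry a _MODEL suffix because they
 * are not taken from the kernel headers. */
#define MAS2_X0_MODEL 0x40u
#define MAS2_X1_MODEL 0x20u
#define MAS2_I_MODEL  0x08u	/* cache inhibited */
#define MAS2_M_MODEL  0x04u	/* memory coherence required */
#define MAS2_G_MODEL  0x02u	/* guarded */
#define MAS2_ATTRIB_MASK_MODEL (MAS2_X0_MODEL | MAS2_X1_MODEL)

/* Model of the patched e500_shadow_mas2_attrib(): MMIO pages get I+G,
 * RAM pages stay cacheable and additionally get M on SMP. */
static inline uint32_t shadow_mas2_attrib_model(uint32_t mas2, bool is_mmio,
						bool smp)
{
	uint32_t attr = mas2 & MAS2_ATTRIB_MASK_MODEL;

	if (is_mmio)
		return attr | MAS2_I_MODEL | MAS2_G_MODEL;
	if (smp)
		attr |= MAS2_M_MODEL;
	return attr;
}

/* Usage: guest-supplied WIMGE bits are dropped, host policy decides. */
static inline void shadow_mas2_example(void)
{
	assert(shadow_mas2_attrib_model(0, true, true) ==
	       (MAS2_I_MODEL | MAS2_G_MODEL));
	assert(shadow_mas2_attrib_model(0, false, true) == MAS2_M_MODEL);
	assert(shadow_mas2_attrib_model(MAS2_I_MODEL, false, false) == 0);
}
```

This models the RAM-versus-MMIO policy of the patch only. The alternative
raised in this thread, copying the I and M bits from the corresponding host
PTE, would replace the is_mmio decision with bits read from the Linux PTE.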

