Using block device instead of character device for virtio-serial

2014-02-05 Thread Jobin Raju George
I am trying to establish a communication mechanism between the guest
and its host using virtio-serial. For this I am using the following to
boot the VM:

qemu-system-x86_64 -m 1024 \
-name ubuntu_vm \
-hda ubuntu \
-device virtio-serial \
-chardev socket,path=/tmp/virt_socket,server,nowait,id=virt_socket \
-device virtconsole,chardev=virt_socket,name=ubuntu_vm_soc

This creates a character device on the guest machine and a UNIX socket
on the host machine.

1) Is there a way I can create sockets on the host as well as the guest?
2) Is there a way I can create a block device for communication?

I need a block device since the amount of data to be transferred is
huge and the frequency of the data transfer is quite high.

Thanks in advance!

-- 

Thanks and regards,
Jobin Raju George
Final Year, Information Technology
College of Engineering Pune
Alternate e-mail: georgejr10...@coep.ac.in
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Discard is not working

2014-02-05 Thread chickenmarkus

Hello,

Having read the "invitation" to post one-off questions without
subscribing, please excuse the intrusion.


First, my host setup:

 * in general Debian Wheezy
 * Kernel: 3.11-0.bpo.2-amd64
   (http://packages.debian.org/wheezy-backports/linux-image-3.11-0.bpo.2-amd64)
 * LVM 2.02.98 (http://packages.debian.org/jessie/lvm2)
 * thin-provisioning-tools 0.2.8-1
   (http://packages.debian.org/jessie/thin-provisioning-tools)
 * Qemu-KVM: 1.7.0 (http://packages.debian.org/jessie/qemu-kvm)

The newer kernel and the packages pulled from Jessie are needed to get 
thin volumes with discard working on my SSD. Everything is fine up to 
this point.


Afterwards I start a guest (also Debian Wheezy) with the following 
command (via libvirt), as root (for testing only):


   kvm [...] \
   -drive file=/dev/ssd0/sarabi,if=none,id=drive-scsi0-0-0-0,format=raw,discard=unmap \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
   [...]

Neither lsblk -D nor an attempt with fstrim succeeds; both report that 
discard is not supported.
I read that guests need at least kernel 3.4 to support discard (and 
virtio-scsi), but installing the same backports kernel 3.11 in the 
guest did not change anything.


Did I misunderstand something? Is the command wrong? Are any 
requirements not met?


Bye Markus



[PATCH 09/51] arm, kvm: Fix CPU hotplug callback registration

2014-02-05 Thread Srivatsa S. Bhat
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the kvm code in arm by using this latter form of callback registration.

Cc: Christoffer Dall 
Cc: Gleb Natapov 
Cc: Paolo Bonzini 
Cc: Russell King 
Cc: kvm...@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Signed-off-by: Srivatsa S. Bhat 
---

 arch/arm/kvm/arm.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 1d8248e..e2ef4c4 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1050,21 +1050,26 @@ int kvm_arch_init(void *opaque)
}
}
 
+   cpu_maps_update_begin();
+
err = init_hyp_mode();
if (err)
goto out_err;
 
-   err = register_cpu_notifier(&hyp_init_cpu_nb);
+   err = __register_cpu_notifier(&hyp_init_cpu_nb);
if (err) {
kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
goto out_err;
}
 
+   cpu_maps_update_done();
+
hyp_cpu_pm_init();
 
kvm_coproc_table_init();
return 0;
 out_err:
+   cpu_maps_update_done();
return err;
 }
 



[PATCH 27/51] x86, kvm: Fix CPU hotplug callback registration

2014-02-05 Thread Srivatsa S. Bhat
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the kvm code in x86 by using this latter form of callback registration.

Cc: Gleb Natapov 
Cc: Paolo Bonzini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat 
---

 arch/x86/kvm/x86.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 39c28f09..e3893b7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5365,7 +5365,8 @@ static void kvm_timer_init(void)
int cpu;
 
max_tsc_khz = tsc_khz;
-   register_hotcpu_notifier(&kvmclock_cpu_notifier_block);
+
+   cpu_maps_update_begin();
if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
 #ifdef CONFIG_CPU_FREQ
struct cpufreq_policy policy;
@@ -5382,6 +5383,10 @@ static void kvm_timer_init(void)
pr_debug("kvm: max_tsc_khz = %ld\n", max_tsc_khz);
for_each_online_cpu(cpu)
smp_call_function_single(cpu, tsc_khz_changed, NULL, 1);
+
+   __register_hotcpu_notifier(&kvmclock_cpu_notifier_block);
+   cpu_maps_update_done();
+
 }
 
 static DEFINE_PER_CPU(struct kvm_vcpu *, current_vcpu);



[PATCH] ARM: KVM: fix warning in mmu.c

2014-02-05 Thread Marc Zyngier
Compiling with THP enabled leads to the following warning:

arch/arm/kvm/mmu.c: In function ‘unmap_range’:
arch/arm/kvm/mmu.c:177:39: warning: ‘pte’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
   if (kvm_pmd_huge(*pmd) || page_empty(pte)) {
^
Code inspection reveals that these two cases are mutually exclusive,
so GCC is a bit overzealous here. But silence it anyway by setting
pte to NULL if kvm_pmd_huge(*pmd) is true.

Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index ea21b6a..3020221 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -169,12 +169,14 @@ static void unmap_range(struct kvm *kvm, pgd_t *pgdp,
pte = pte_offset_kernel(pmd, addr);
clear_pte_entry(kvm, pte, addr);
next = addr + PAGE_SIZE;
+   } else {
+   pte = NULL;
}
 
/*
 * If the pmd entry is to be cleared, walk back up the ladder
 */
-   if (kvm_pmd_huge(*pmd) || page_empty(pte)) {
+   if (kvm_pmd_huge(*pmd) || (pte && page_empty(pte))) {
clear_pmd_entry(kvm, pmd, addr);
next = pmd_addr_end(addr, end);
if (page_empty(pmd) && !page_empty(pud)) {
-- 
1.8.3.4



[PATCH v3 09/11] ARM: KVM: introduce per-vcpu HYP Configuration Register

2014-02-05 Thread Marc Zyngier
So far, KVM/ARM has used a fixed HCR configuration per guest, except for
the VI/VF/VA bits used to control interrupts in the absence of a VGIC.

With the upcoming need to dynamically reconfigure trapping, it becomes
necessary to allow the HCR to be changed on a per-vcpu basis.

The fix here is to mimic what KVM/arm64 already does: a per vcpu HCR
field, initialized at setup time.

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_arm.h  | 1 -
 arch/arm/include/asm/kvm_host.h | 9 ++---
 arch/arm/kernel/asm-offsets.c   | 1 +
 arch/arm/kvm/guest.c| 1 +
 arch/arm/kvm/interrupts_head.S  | 9 +++--
 5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index 1d3153c..a843e74 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -69,7 +69,6 @@
 #define HCR_GUEST_MASK (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
HCR_TWE | HCR_SWIO | HCR_TIDCP)
-#define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
 
 /* System Control Register (SCTLR) bits */
 #define SCTLR_TE   (1 << 30)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 228ae1c..86be18c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -101,6 +101,12 @@ struct kvm_vcpu_arch {
/* The CPU type we expose to the VM */
u32 midr;
 
+   /* HYP trapping configuration */
+   u32 hcr;
+
+   /* Interrupt related fields */
+   u32 irq_lines;  /* IRQ and FIQ levels */
+
/* Exception Information */
struct kvm_vcpu_fault_info fault;
 
@@ -128,9 +134,6 @@ struct kvm_vcpu_arch {
/* IO related fields */
struct kvm_decode mmio_decode;
 
-   /* Interrupt related fields */
-   u32 irq_lines;  /* IRQ and FIQ levels */
-
/* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache;
 
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index dbe0476..713e807 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -174,6 +174,7 @@ int main(void)
   DEFINE(VCPU_FIQ_REGS,offsetof(struct kvm_vcpu, arch.regs.fiq_regs));
   DEFINE(VCPU_PC,  offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_pc));
   DEFINE(VCPU_CPSR,offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_cpsr));
+  DEFINE(VCPU_HCR, offsetof(struct kvm_vcpu, arch.hcr));
   DEFINE(VCPU_IRQ_LINES,   offsetof(struct kvm_vcpu, arch.irq_lines));
   DEFINE(VCPU_HSR, offsetof(struct kvm_vcpu, arch.fault.hsr));
   DEFINE(VCPU_HxFAR,   offsetof(struct kvm_vcpu, arch.fault.hxfar));
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 2786eae..b23a59c 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -38,6 +38,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
+   vcpu->arch.hcr = HCR_GUEST_MASK;
return 0;
 }
 
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 4a2a97a..7cb41e1 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -597,17 +597,14 @@ vcpu  .req r0  @ vcpu pointer always in r0
 
 /* Enable/Disable: stage-2 trans., trap interrupts, trap wfi, trap smc */
 .macro configure_hyp_role operation
-   mrc p15, 4, r2, c1, c1, 0   @ HCR
-   bic r2, r2, #HCR_VIRT_EXCP_MASK
-   ldr r3, =HCR_GUEST_MASK
.if \operation == vmentry
-   orr r2, r2, r3
+   ldr r2, [vcpu, #VCPU_HCR]
ldr r3, [vcpu, #VCPU_IRQ_LINES]
orr r2, r2, r3
.else
-   bic r2, r2, r3
+   mov r2, #0
.endif
-   mcr p15, 4, r2, c1, c1, 0
+   mcr p15, 4, r2, c1, c1, 0   @ HCR
 .endm
 
 .macro load_vcpu
-- 
1.8.3.4



[PATCH v3 02/11] arm64: KVM: allows discrimination of AArch32 sysreg access

2014-02-05 Thread Marc Zyngier
The current handling of AArch32 trapping is slightly less than
perfect, as it is not possible (from a handler point of view)
to distinguish it from an AArch64 access, nor to tell a 32bit
from a 64bit access either.

Fix this by introducing two additional flags:
- is_aarch32: true if the access was made in AArch32 mode
- is_32bit: true if is_aarch32 == true and an MCR/MRC instruction
  was used to perform the access (as opposed to MCRR/MRRC).

This allows a handler to cover all the possible conditions in which
a system register gets trapped.

Signed-off-by: Marc Zyngier 
Acked-by: Christoffer Dall 
---
 arch/arm64/kvm/sys_regs.c | 6 ++
 arch/arm64/kvm/sys_regs.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 02e9d09..bf03e0f 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -437,6 +437,8 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
u32 hsr = kvm_vcpu_get_hsr(vcpu);
int Rt2 = (hsr >> 10) & 0xf;
 
+   params.is_aarch32 = true;
+   params.is_32bit = false;
params.CRm = (hsr >> 1) & 0xf;
params.Rt = (hsr >> 5) & 0xf;
params.is_write = ((hsr & 1) == 0);
@@ -480,6 +482,8 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
struct sys_reg_params params;
u32 hsr = kvm_vcpu_get_hsr(vcpu);
 
+   params.is_aarch32 = true;
+   params.is_32bit = true;
params.CRm = (hsr >> 1) & 0xf;
params.Rt  = (hsr >> 5) & 0xf;
params.is_write = ((hsr & 1) == 0);
@@ -549,6 +553,8 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run)
struct sys_reg_params params;
unsigned long esr = kvm_vcpu_get_hsr(vcpu);
 
+   params.is_aarch32 = false;
+   params.is_32bit = false;
params.Op0 = (esr >> 20) & 3;
params.Op1 = (esr >> 14) & 0x7;
params.CRn = (esr >> 10) & 0xf;
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index d50d372..d411e25 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -30,6 +30,8 @@ struct sys_reg_params {
u8  Op2;
u8  Rt;
boolis_write;
+   boolis_aarch32;
+   boolis_32bit;   /* Only valid if is_aarch32 is true */
 };
 
 struct sys_reg_desc {
-- 
1.8.3.4



[PATCH v3 01/11] arm64: KVM: force cache clean on page fault when caches are off

2014-02-05 Thread Marc Zyngier
In order for a guest with caches off to observe data written
to a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as
guest accesses bypass the cache completely until the guest
decides to enable it).

For this purpose, hook into the coherent_icache_guest_page
function and flush the region if the guest SCTLR_EL1
register doesn't show the MMU and caches as being enabled.
The function also gets renamed to coherent_cache_guest_page.

Signed-off-by: Marc Zyngier 
Reviewed-by: Catalin Marinas 
Reviewed-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_mmu.h   |  4 ++--
 arch/arm/kvm/mmu.c   |  4 ++--
 arch/arm64/include/asm/kvm_mmu.h | 16 
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 2d122ad..6d0f3d3 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -116,8 +116,8 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 
 struct kvm;
 
-static inline void coherent_icache_guest_page(struct kvm *kvm, hva_t hva,
- unsigned long size)
+static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
+unsigned long size)
 {
/*
 * If we are going to insert an instruction page and the icache is
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 7789857..fc71a8d 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -715,7 +715,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_set_s2pmd_writable(&new_pmd);
kvm_set_pfn_dirty(pfn);
}
-   coherent_icache_guest_page(kvm, hva & PMD_MASK, PMD_SIZE);
+   coherent_cache_guest_page(vcpu, hva & PMD_MASK, PMD_SIZE);
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
pte_t new_pte = pfn_pte(pfn, PAGE_S2);
@@ -723,7 +723,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_set_s2pte_writable(&new_pte);
kvm_set_pfn_dirty(pfn);
}
-   coherent_icache_guest_page(kvm, hva, PAGE_SIZE);
+   coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, false);
}
 
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 7f1f940..6eaf69b 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -106,7 +106,6 @@ static inline bool kvm_is_write_fault(unsigned long esr)
return true;
 }
 
-static inline void kvm_clean_dcache_area(void *addr, size_t size) {}
 static inline void kvm_clean_pgd(pgd_t *pgd) {}
 static inline void kvm_clean_pmd_entry(pmd_t *pmd) {}
 static inline void kvm_clean_pte(pte_t *pte) {}
@@ -124,9 +123,19 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 
 struct kvm;
 
-static inline void coherent_icache_guest_page(struct kvm *kvm, hva_t hva,
- unsigned long size)
+#define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
+
+static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 {
+   return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
+}
+
+static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
+unsigned long size)
+{
+   if (!vcpu_has_cache_enabled(vcpu))
+   kvm_flush_dcache_to_poc((void *)hva, size);
+
if (!icache_is_aliasing()) {/* PIPT */
flush_icache_range(hva, hva + size);
} else if (!icache_is_aivivt()) {   /* non ASID-tagged VIVT */
@@ -135,7 +144,6 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, hva_t hva,
}
 }
 
-#define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
 #define kvm_virt_to_phys(x)__virt_to_phys((unsigned long)(x))
 
 #endif /* __ASSEMBLY__ */
-- 
1.8.3.4



[PATCH v3 06/11] ARM: KVM: force cache clean on page fault when caches are off

2014-02-05 Thread Marc Zyngier
In order for a guest with caches disabled to observe data written
to a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as guest
accesses are completely bypassing the cache until it decides to
enable it).

For this purpose, hook into the coherent_cache_guest_page
function and flush the region if the guest SCTLR
register doesn't show the MMU and caches as being enabled.

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_mmu.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 0931cda..b62ca91 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -122,9 +122,19 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 
 struct kvm;
 
+#define kvm_flush_dcache_to_poc(a,l)   __cpuc_flush_dcache_area((a), (l))
+
+static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
+{
+   return (vcpu->arch.cp15[c1_SCTLR] & 0b101) == 0b101;
+}
+
 static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
 unsigned long size)
 {
+   if (!vcpu_has_cache_enabled(vcpu))
+   kvm_flush_dcache_to_poc((void *)hva, size);
+   
/*
 * If we are going to insert an instruction page and the icache is
 * either VIPT or PIPT, there is a potential problem where the host
@@ -145,7 +155,6 @@ static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
}
 }
 
-#define kvm_flush_dcache_to_poc(a,l)   __cpuc_flush_dcache_area((a), (l))
 #define kvm_virt_to_phys(x)virt_to_idmap((unsigned long)(x))
 
 void stage2_flush_vm(struct kvm *kvm);
-- 
1.8.3.4



[PATCH v3 07/11] ARM: KVM: fix handling of trapped 64bit coprocessor accesses

2014-02-05 Thread Marc Zyngier
Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
changed the way we match the 64bit coprocessor access from
user space, but didn't update the trap handler for the same
set of registers.

The effect is that a trapped 64bit access is never matched, leading
to a fault being injected into the guest. This went unnoticed, as we
didn't really trap any 64bit registers so far.

Placing the CRm field of the access into the CRn field of the matching
structure fixes the problem. Also update the debug code to emit the
expected string in case of a failing match.

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
---
 arch/arm/kvm/coproc.c | 4 ++--
 arch/arm/kvm/coproc.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..126c90d 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -443,7 +443,7 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
struct coproc_params params;
 
-   params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
+   params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
params.is_64bit = true;
@@ -451,7 +451,7 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
params.Op2 = 0;
params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
-   params.CRn = 0;
+   params.CRm = 0;
 
return emulate_cp15(vcpu, ¶ms);
 }
diff --git a/arch/arm/kvm/coproc.h b/arch/arm/kvm/coproc.h
index 0461d5c..c5ad7ff 100644
--- a/arch/arm/kvm/coproc.h
+++ b/arch/arm/kvm/coproc.h
@@ -58,8 +58,8 @@ static inline void print_cp_instr(const struct coproc_params *p)
 {
/* Look, we even formatted it for you to paste into the table! */
if (p->is_64bit) {
-   kvm_pr_unimpl(" { CRm(%2lu), Op1(%2lu), is64, func_%s },\n",
- p->CRm, p->Op1, p->is_write ? "write" : "read");
+   kvm_pr_unimpl(" { CRm64(%2lu), Op1(%2lu), is64, func_%s },\n",
+ p->CRn, p->Op1, p->is_write ? "write" : "read");
} else {
kvm_pr_unimpl(" { CRn(%2lu), CRm(%2lu), Op1(%2lu), Op2(%2lu), is32,"
  " func_%s },\n",
-- 
1.8.3.4



[PATCH v3 03/11] arm64: KVM: trap VM system registers until MMU and caches are ON

2014-02-05 Thread Marc Zyngier
In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.

Signed-off-by: Marc Zyngier 
Reviewed-by: Catalin Marinas 
Reviewed-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_arm.h |  3 +-
 arch/arm64/include/asm/kvm_asm.h |  3 +-
 arch/arm64/kvm/sys_regs.c| 90 ++--
 3 files changed, 82 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index c98ef47..fd0a651 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -62,6 +62,7 @@
  * RW: 64bit by default, can be overriden for 32bit VMs
  * TAC:Trap ACTLR
  * TSC:Trap SMC
+ * TVM:Trap VM ops (until M+C set in SCTLR_EL1)
  * TSW:Trap cache operations by set/way
  * TWE:Trap WFE
  * TWI:Trap WFI
@@ -74,7 +75,7 @@
  * SWIO:   Turn set/way invalidates into set/way clean+invalidate
  */
 #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
-HCR_BSU_IS | HCR_FB | HCR_TAC | \
+HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
 HCR_AMO | HCR_IMO | HCR_FMO | \
 HCR_SWIO | HCR_TIDCP | HCR_RW)
 #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 3d796b4..89d7796 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -81,7 +81,8 @@
 #define c13_TID_URW(TPIDR_EL0 * 2) /* Thread ID, User R/W */
 #define c13_TID_URO(TPIDRRO_EL0 * 2)/* Thread ID, User R/O */
 #define c13_TID_PRIV   (TPIDR_EL1 * 2) /* Thread ID, Privileged */
-#define c10_AMAIR  (AMAIR_EL1 * 2) /* Aux Memory Attr Indirection Reg */
+#define c10_AMAIR0 (AMAIR_EL1 * 2) /* Aux Memory Attr Indirection Reg */
+#define c10_AMAIR1 (c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
 #define c14_CNTKCTL(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
 #define NR_CP15_REGS   (NR_SYS_REGS * 2)
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index bf03e0f..2097e5e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -121,6 +121,46 @@ done:
 }
 
 /*
+ * Generic accessor for VM registers. Only called as long as HCR_TVM
+ * is set.
+ */
+static bool access_vm_reg(struct kvm_vcpu *vcpu,
+ const struct sys_reg_params *p,
+ const struct sys_reg_desc *r)
+{
+   unsigned long val;
+
+   BUG_ON(!p->is_write);
+
+   val = *vcpu_reg(vcpu, p->Rt);
+   if (!p->is_aarch32) {
+   vcpu_sys_reg(vcpu, r->reg) = val;
+   } else {
+   vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+   if (!p->is_32bit)
+   vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+   }
+   return true;
+}
+
+/*
+ * SCTLR_EL1 accessor. Only called as long as HCR_TVM is set.  If the
+ * guest enables the MMU, we stop trapping the VM sys_regs and leave
+ * it in complete control of the caches.
+ */
+static bool access_sctlr(struct kvm_vcpu *vcpu,
+const struct sys_reg_params *p,
+const struct sys_reg_desc *r)
+{
+   access_vm_reg(vcpu, p, r);
+
+   if (vcpu_has_cache_enabled(vcpu))   /* MMU+Caches enabled? */
+   vcpu->arch.hcr_el2 &= ~HCR_TVM;
+
+   return true;
+}
+
+/*
  * We could trap ID_DFR0 and tell the guest we don't support performance
  * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
  * NAKed, so it will read the PMCR anyway.
@@ -185,32 +225,32 @@ static const struct sys_reg_desc sys_reg_descs[] = {
  NULL, reset_mpidr, MPIDR_EL1 },
/* SCTLR_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b000),
- NULL, reset_val, SCTLR_EL1, 0x00C50078 },
+ access_sctlr, reset_val, SCTLR_EL1, 0x00C50078 },
/* CPACR_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b010),
  NULL, reset_val, CPACR_EL1, 0 },
/* TTBR0_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b000),
- NULL, reset_unknown, TTBR0_EL1 },
+ access_vm_reg, reset_unknown, TTBR0_EL1 },
/* TTBR1_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b001),
- NULL, reset_unknown, TTBR1_EL1 },
+ access_vm_reg, reset_unknown, TTBR1_EL1 },
/* TCR_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b010),
- NULL, reset_val, TCR_EL1, 0 },
+ access_vm_reg, reset_val, TCR_EL1, 0 },
 
/*

[PATCH v3 11/11] ARM: KVM: trap VM system registers until MMU and caches are ON

2014-02-05 Thread Marc Zyngier
In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.

Signed-off-by: Marc Zyngier 
---
 arch/arm/include/asm/kvm_arm.h |  3 +-
 arch/arm/kvm/coproc.c  | 74 +-
 arch/arm/kvm/coproc.h  |  4 +++
 arch/arm/kvm/coproc_a15.c  |  2 +-
 arch/arm/kvm/coproc_a7.c   |  2 +-
 5 files changed, 66 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index a843e74..816db0b 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -55,6 +55,7 @@
  * The bits we set in HCR:
  * TAC:Trap ACTLR
  * TSC:Trap SMC
+ * TVM:Trap VM ops (until MMU and caches are on)
  * TSW:Trap cache operations by set/way
  * TWI:Trap WFI
  * TWE:Trap WFE
@@ -68,7 +69,7 @@
  */
 #define HCR_GUEST_MASK (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
-   HCR_TWE | HCR_SWIO | HCR_TIDCP)
+   HCR_TVM | HCR_TWE | HCR_SWIO | HCR_TIDCP)
 
 /* System Control Register (SCTLR) bits */
 #define SCTLR_TE   (1 << 30)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index a5a54a4..c58a351 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -205,6 +206,44 @@ done:
 }
 
 /*
+ * Generic accessor for VM registers. Only called as long as HCR_TVM
+ * is set.
+ */
+static bool access_vm_reg(struct kvm_vcpu *vcpu,
+ const struct coproc_params *p,
+ const struct coproc_reg *r)
+{
+   BUG_ON(!p->is_write);
+
+   vcpu->arch.cp15[r->reg] = *vcpu_reg(vcpu, p->Rt1);
+   if (p->is_64bit)
+   vcpu->arch.cp15[r->reg + 1] = *vcpu_reg(vcpu, p->Rt2);
+
+   return true;
+}
+
+/*
+ * SCTLR accessor. Only called as long as HCR_TVM is set.  If the
+ * guest enables the MMU, we stop trapping the VM sys_regs and leave
+ * it in complete control of the caches.
+ *
+ * Used by the cpu-specific code.
+ */
+bool access_sctlr(struct kvm_vcpu *vcpu,
+ const struct coproc_params *p,
+ const struct coproc_reg *r)
+{
+   access_vm_reg(vcpu, p, r);
+
+   if (vcpu_has_cache_enabled(vcpu)) { /* MMU+Caches enabled? */
+   vcpu->arch.hcr &= ~HCR_TVM;
+   stage2_flush_vm(vcpu->kvm);
+   }
+
+   return true;
+}
+
+/*
  * We could trap ID_DFR0 and tell the guest we don't support performance
  * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
  * NAKed, so it will read the PMCR anyway.
@@ -261,33 +300,36 @@ static const struct coproc_reg cp15_regs[] = {
{ CRn( 1), CRm( 0), Op1( 0), Op2( 2), is32,
NULL, reset_val, c1_CPACR, 0x00000000 },
 
-   /* TTBR0/TTBR1: swapped by interrupt.S. */
-   { CRm64( 2), Op1( 0), is64, NULL, reset_unknown64, c2_TTBR0 },
-   { CRm64( 2), Op1( 1), is64, NULL, reset_unknown64, c2_TTBR1 },
-
-   /* TTBCR: swapped by interrupt.S. */
+   /* TTBR0/TTBR1/TTBCR: swapped by interrupt.S. */
+   { CRm64( 2), Op1( 0), is64, access_vm_reg, reset_unknown64, c2_TTBR0 },
+   { CRn(2), CRm( 0), Op1( 0), Op2( 0), is32,
+   access_vm_reg, reset_unknown, c2_TTBR0 },
+   { CRn(2), CRm( 0), Op1( 0), Op2( 1), is32,
+   access_vm_reg, reset_unknown, c2_TTBR1 },
{ CRn( 2), CRm( 0), Op1( 0), Op2( 2), is32,
-   NULL, reset_val, c2_TTBCR, 0x },
+   access_vm_reg, reset_val, c2_TTBCR, 0x },
+   { CRm64( 2), Op1( 1), is64, access_vm_reg, reset_unknown64, c2_TTBR1 },
+
 
/* DACR: swapped by interrupt.S. */
{ CRn( 3), CRm( 0), Op1( 0), Op2( 0), is32,
-   NULL, reset_unknown, c3_DACR },
+   access_vm_reg, reset_unknown, c3_DACR },
 
/* DFSR/IFSR/ADFSR/AIFSR: swapped by interrupt.S. */
{ CRn( 5), CRm( 0), Op1( 0), Op2( 0), is32,
-   NULL, reset_unknown, c5_DFSR },
+   access_vm_reg, reset_unknown, c5_DFSR },
{ CRn( 5), CRm( 0), Op1( 0), Op2( 1), is32,
-   NULL, reset_unknown, c5_IFSR },
+   access_vm_reg, reset_unknown, c5_IFSR },
{ CRn( 5), CRm( 1), Op1( 0), Op2( 0), is32,
-   NULL, reset_unknown, c5_ADFSR },
+   access_vm_reg, reset_unknown, c5_ADFSR },
{ CRn( 5), CRm( 1), Op1( 0), Op2( 1), is32,
-   NULL, reset

[PATCH v3 04/11] arm64: KVM: flush VM pages before letting the guest enable caches

2014-02-05 Thread Marc Zyngier
When the guest runs with caches disabled (like in an early boot
sequence, for example), all the writes are directly going to RAM,
bypassing the caches altogether.

Once the MMU and caches are enabled, whatever sits in the cache
becomes suddenly visible, which isn't what the guest expects.

A way to avoid this potential disaster is to invalidate the cache
when the MMU is being turned on. For this, we hook into the SCTLR_EL1
trapping code, and scan the stage-2 page tables, invalidating the
pages/sections that have already been mapped in.

Signed-off-by: Marc Zyngier 
Reviewed-by: Catalin Marinas 
---
 arch/arm/include/asm/kvm_mmu.h   |  8 
 arch/arm/kvm/mmu.c   | 93 
 arch/arm64/include/asm/kvm_mmu.h |  4 ++
 arch/arm64/kvm/sys_regs.c        |  5 ++-
 4 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 6d0f3d3..0931cda 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -114,6 +114,12 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
pmd_val(*pmd) |= L_PMD_S2_RDWR;
 }
 
+/* Open coded pgd_addr_end that can deal with 64bit addresses */
+#define kvm_pgd_addr_end(addr, end)\
+({ u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;\
+   (__boundary - 1 < (end) - 1)? __boundary: (end);\
+})
+
 struct kvm;
 
 static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
@@ -142,6 +148,8 @@ static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
 #define kvm_flush_dcache_to_poc(a,l)   __cpuc_flush_dcache_area((a), (l))
 #define kvm_virt_to_phys(x)    virt_to_idmap((unsigned long)(x))
 
+void stage2_flush_vm(struct kvm *kvm);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ARM_KVM_MMU_H__ */
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index fc71a8d..ea21b6a 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -187,6 +187,99 @@ static void unmap_range(struct kvm *kvm, pgd_t *pgdp,
}
 }
 
+static void stage2_flush_ptes(struct kvm *kvm, pmd_t *pmd,
+ phys_addr_t addr, phys_addr_t end)
+{
+   pte_t *pte;
+
+   pte = pte_offset_kernel(pmd, addr);
+   do {
+   if (!pte_none(*pte)) {
+   hva_t hva = gfn_to_hva(kvm, addr >> PAGE_SHIFT);
+   kvm_flush_dcache_to_poc((void*)hva, PAGE_SIZE);
+   }
+   } while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+static void stage2_flush_pmds(struct kvm *kvm, pud_t *pud,
+ phys_addr_t addr, phys_addr_t end)
+{
+   pmd_t *pmd;
+   phys_addr_t next;
+
+   pmd = pmd_offset(pud, addr);
+   do {
+   next = pmd_addr_end(addr, end);
+   if (!pmd_none(*pmd)) {
+   if (kvm_pmd_huge(*pmd)) {
+   hva_t hva = gfn_to_hva(kvm, addr >> PAGE_SHIFT);
+   kvm_flush_dcache_to_poc((void*)hva, PMD_SIZE);
+   } else {
+   stage2_flush_ptes(kvm, pmd, addr, next);
+   }
+   }
+   } while (pmd++, addr = next, addr != end);
+}
+
+static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
+ phys_addr_t addr, phys_addr_t end)
+{
+   pud_t *pud;
+   phys_addr_t next;
+
+   pud = pud_offset(pgd, addr);
+   do {
+   next = pud_addr_end(addr, end);
+   if (!pud_none(*pud)) {
+   if (pud_huge(*pud)) {
+   hva_t hva = gfn_to_hva(kvm, addr >> PAGE_SHIFT);
+   kvm_flush_dcache_to_poc((void*)hva, PUD_SIZE);
+   } else {
+   stage2_flush_pmds(kvm, pud, addr, next);
+   }
+   }
+   } while(pud++, addr = next, addr != end);
+}
+
+static void stage2_flush_memslot(struct kvm *kvm,
+struct kvm_memory_slot *memslot)
+{
+   phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
+   phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
+   phys_addr_t next;
+   pgd_t *pgd;
+
+   pgd = kvm->arch.pgd + pgd_index(addr);
+   do {
+   next = kvm_pgd_addr_end(addr, end);
+   stage2_flush_puds(kvm, pgd, addr, next);
+   } while (pgd++, addr = next, addr != end);
+}
+
+/**
+ * stage2_flush_vm - Invalidate cache for pages mapped in stage 2
+ * @kvm: The struct kvm pointer
+ *
+ * Go through the stage 2 page tables and invalidate any cache lines
+ * backing memory already mapped to the VM.
+ */
+void stage2_flush_vm(struct kvm *kvm)
+{
+   struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;
+   int idx;
+
+   idx = srcu_read_lock(&kvm->srcu);
+   spin_lock

[PATCH v3 10/11] ARM: KVM: add world-switch for AMAIR{0,1}

2014-02-05 Thread Marc Zyngier
HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1.
In order to minimise the amount of surprise a guest could generate by
trying to access these registers with caches off, add them to the
list of registers we switch/handle.

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_asm.h |  4 +++-
 arch/arm/kvm/coproc.c  |  6 ++
 arch/arm/kvm/interrupts_head.S | 12 ++--
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 661da11..53b3c4a 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -48,7 +48,9 @@
 #define c13_TID_URO    26  /* Thread ID, User R/O */
 #define c13_TID_PRIV   27  /* Thread ID, Privileged */
 #define c14_CNTKCTL    28  /* Timer Control Register (PL1) */
-#define NR_CP15_REGS   29  /* Number of regs (incl. invalid) */
+#define c10_AMAIR0 29  /* Auxilary Memory Attribute Indirection Reg0 */
+#define c10_AMAIR1 30  /* Auxilary Memory Attribute Indirection Reg1 */
+#define NR_CP15_REGS   31  /* Number of regs (incl. invalid) */
 
 #define ARM_EXCEPTION_RESET  0
 #define ARM_EXCEPTION_UNDEFINED   1
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 126c90d..a5a54a4 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -328,6 +328,12 @@ static const struct coproc_reg cp15_regs[] = {
{ CRn(10), CRm( 2), Op1( 0), Op2( 1), is32,
NULL, reset_unknown, c10_NMRR},
 
+   /* AMAIR0/AMAIR1: swapped by interrupt.S. */
+   { CRn(10), CRm( 3), Op1( 0), Op2( 0), is32,
+   access_vm_reg, reset_unknown, c10_AMAIR0},
+   { CRn(10), CRm( 3), Op1( 0), Op2( 1), is32,
+   access_vm_reg, reset_unknown, c10_AMAIR1},
+
/* VBAR: swapped by interrupt.S. */
{ CRn(12), CRm( 0), Op1( 0), Op2( 0), is32,
NULL, reset_val, c12_VBAR, 0x },
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 7cb41e1..e4eaf30 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -303,13 +303,17 @@ vcpu  .req  r0  @ vcpu pointer always in r0
 
mrc p15, 0, r2, c14, c1, 0  @ CNTKCTL
   mrrc    p15, 0, r4, r5, c7  @ PAR
+   mrc p15, 0, r6, c10, c3, 0  @ AMAIR0
+   mrc p15, 0, r7, c10, c3, 1  @ AMAIR1
 
.if \store_to_vcpu == 0
-   push    {r2,r4-r5}
+   push    {r2,r4-r7}
.else
str r2, [vcpu, #CP15_OFFSET(c14_CNTKCTL)]
add r12, vcpu, #CP15_OFFSET(c7_PAR)
   strd    r4, r5, [r12]
+   str r6, [vcpu, #CP15_OFFSET(c10_AMAIR0)]
+   str r7, [vcpu, #CP15_OFFSET(c10_AMAIR1)]
.endif
 .endm
 
@@ -322,15 +326,19 @@ vcpu  .req  r0  @ vcpu pointer always in r0
  */
 .macro write_cp15_state read_from_vcpu
.if \read_from_vcpu == 0
-   pop {r2,r4-r5}
+   pop {r2,r4-r7}
.else
ldr r2, [vcpu, #CP15_OFFSET(c14_CNTKCTL)]
add r12, vcpu, #CP15_OFFSET(c7_PAR)
   ldrd    r4, r5, [r12]
+   ldr r6, [vcpu, #CP15_OFFSET(c10_AMAIR0)]
+   ldr r7, [vcpu, #CP15_OFFSET(c10_AMAIR1)]
.endif
 
mcr p15, 0, r2, c14, c1, 0  @ CNTKCTL
   mcrr    p15, 0, r4, r5, c7  @ PAR
+   mcr p15, 0, r6, c10, c3, 0  @ AMAIR0
+   mcr p15, 0, r7, c10, c3, 1  @ AMAIR1
 
.if \read_from_vcpu == 0
pop {r2-r12}
-- 
1.8.3.4



[PATCH v3 08/11] ARM: KVM: fix ordering of 64bit coprocessor accesses

2014-02-05 Thread Marc Zyngier
Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
added an ordering dependency for the 64bit registers.

The order described is: CRn, CRm, Op1, Op2, 64bit-first.

Unfortunately, the implementation is: CRn, 64bit-first, CRm...

Move the 64bit test to be last in order to match the documentation.

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
---
 arch/arm/kvm/coproc.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/coproc.h b/arch/arm/kvm/coproc.h
index c5ad7ff..8dda870 100644
--- a/arch/arm/kvm/coproc.h
+++ b/arch/arm/kvm/coproc.h
@@ -135,13 +135,13 @@ static inline int cmp_reg(const struct coproc_reg *i1,
return -1;
if (i1->CRn != i2->CRn)
return i1->CRn - i2->CRn;
-   if (i1->is_64 != i2->is_64)
-   return i2->is_64 - i1->is_64;
if (i1->CRm != i2->CRm)
return i1->CRm - i2->CRm;
if (i1->Op1 != i2->Op1)
return i1->Op1 - i2->Op1;
-   return i1->Op2 - i2->Op2;
+   if (i1->Op2 != i2->Op2)
+   return i1->Op2 - i2->Op2;
+   return i2->is_64 - i1->is_64;
 }
 
 
-- 
1.8.3.4



[PATCH v3 00/11] arm/arm64: KVM: host cache maintenance when guest caches are off

2014-02-05 Thread Marc Zyngier
When we run a guest with caches disabled, we don't flush the cache to
the Point of Coherency, hence possibly missing bits of data that have
been written in the cache, but have not yet reached memory.

We also have the opposite issue: when a guest enables its cache,
whatever sits in the cache is suddenly going to become visible,
shadowing whatever the guest has written into RAM.

There are several approaches to these issues:
- Using the DC bit when caches are off: this breaks guests that assume
  caches are off while doing DMA operations. Bootloaders, for example.
  It also breaks I-D coherency.
- Fetch the memory attributes on translation fault, and flush the
  cache while handling the fault. This relies on using the PAR_EL1
  register to obtain the Stage-1 memory attributes, and tends to be
  slow.
- Detecting the translation faults occurring with MMU off (and
  performing a cache clean), and trapping SCTLR_EL1 to detect the
  moment when the guest is turning its caches on (and performing a
  cache invalidation). Trapping of SCTLR_EL1 is then disabled to
  ensure the best performance.

This patch series implements the last solution, for both arm and
arm64. Tested on TC2 (ARMv7) and FVP model (ARMv8).

From v2 (http://www.spinics.net/lists/arm-kernel/msg302472.html):
- Addressed most (hopefully all) of Christoffer's comments
- Added a new LPAE pmd_addr_end to deal with 40bit IPAs

From v1 (http://www.spinics.net/lists/kvm/msg99404.html):
- Fixed AArch32 VM handling on arm64 (Reported by Anup)
- Added ARMv7 support:
  * Fixed a couple of issues regarding handling of 64bit cp15 regs
  * Per-vcpu HCR
  * Switching of AMAIR0 and AMAIR1

Marc Zyngier (11):
  arm64: KVM: force cache clean on page fault when caches are off
  arm64: KVM: allows discrimination of AArch32 sysreg access
  arm64: KVM: trap VM system registers until MMU and caches are ON
  arm64: KVM: flush VM pages before letting the guest enable caches
  ARM: LPAE: provide an IPA capable pmd_addr_end
  ARM: KVM: force cache clean on page fault when caches are off
  ARM: KVM: fix handling of trapped 64bit coprocessor accesses
  ARM: KVM: fix ordering of 64bit coprocessor accesses
  ARM: KVM: introduce per-vcpu HYP Configuration Register
  ARM: KVM: add world-switch for AMAIR{0,1}
  ARM: KVM: trap VM system registers until MMU and caches are ON

 arch/arm/include/asm/kvm_arm.h|  4 +-
 arch/arm/include/asm/kvm_asm.h|  4 +-
 arch/arm/include/asm/kvm_host.h   |  9 ++--
 arch/arm/include/asm/kvm_mmu.h| 23 ++--
 arch/arm/include/asm/pgtable-3level.h |  5 ++
 arch/arm/kernel/asm-offsets.c |  1 +
 arch/arm/kvm/coproc.c | 84 ++---
 arch/arm/kvm/coproc.h | 14 +++--
 arch/arm/kvm/coproc_a15.c |  2 +-
 arch/arm/kvm/coproc_a7.c  |  2 +-
 arch/arm/kvm/guest.c  |  1 +
 arch/arm/kvm/interrupts_head.S| 21 +---
 arch/arm/kvm/mmu.c| 97 +-
 arch/arm64/include/asm/kvm_arm.h  |  3 +-
 arch/arm64/include/asm/kvm_asm.h  |  3 +-
 arch/arm64/include/asm/kvm_mmu.h  | 20 +--
 arch/arm64/kvm/sys_regs.c | 99 ++-
 arch/arm64/kvm/sys_regs.h |  2 +
 18 files changed, 332 insertions(+), 62 deletions(-)

-- 
1.8.3.4



[PATCH v3 05/11] ARM: LPAE: provide an IPA capable pmd_addr_end

2014-02-05 Thread Marc Zyngier
The default pmd_addr_end macro uses an unsigned long to represent
the VA. When used with KVM and stage-2 translation, the VA is
actually an IPA, which is up to 40 bits. This also affects the
SMMU driver, which deals with stage-2 translation as well.

Instead, provide an implementation that can cope with larger VAs
by using a u64. This version overloads the default
one provided in include/asm-generic/pgtable.h.

Signed-off-by: Marc Zyngier 
---
 arch/arm/include/asm/pgtable-3level.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 03243f7..594867b 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -262,6 +262,11 @@ static inline int has_transparent_hugepage(void)
return 1;
 }
 
+#define pmd_addr_end(addr, end)                                        \
+({ u64 __boundary = ((addr) + PMD_SIZE) & PMD_MASK;\
+   (__boundary - 1 < (end) - 1)? __boundary: (end);\
+})
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_PGTABLE_3LEVEL_H */
-- 
1.8.3.4



Re: Get a vm fd using kvm API's ioctls

2014-02-05 Thread Vincent KHERBACHE
OK, this is what I feared: modifying QEMU is the only way to get the dirty
bitmap from a VM launched by a QEMU process.

And this is definitely what I will do. Thanks again for your help!


Paolo Bonzini wrote:
>On 05/02/2014 18:17, Vincent KHERBACHE wrote:
>> Thank you for your reply!
>>
>> I take note of your remark about the bitmap clearing; it could be a
>> real issue.
>>
>> But I really have no idea how I can 'ask' the other process for
>> something?
>
>You modify the source code for that program. :)  It's likely QEMU or 
>kvmtool, in either case it's free software. :)
>
>Paolo

-- 
Vincent KHERBACHE


Re: Get a vm fd using kvm API's ioctls

2014-02-05 Thread Paolo Bonzini

On 05/02/2014 18:17, Vincent KHERBACHE wrote:

Thank you for your reply!

I take note of your remark about the bitmap clearing; it could be a
real issue.

But I really have no idea how I can 'ask' the other process for something?


You modify the source code for that program. :)  It's likely QEMU or 
kvmtool, in either case it's free software. :)


Paolo


Re: Get a vm fd using kvm API's ioctls

2014-02-05 Thread Vincent KHERBACHE
On 05/02/2014 17:52, Paolo Bonzini wrote:
> On 05/02/2014 17:30, Vincent KHERBACHE wrote:
>> Hi all,
>>
>> I'm trying to get the dirty bitmap of a specific VM, using
>> KVM_GET_DIRTY_LOG ioctl.
>>
>> For this purpose, I should be able to get the file descriptor of an
>> existing VM by doing something like :
>>
>> kvm_fd = open("/dev/kvm")
>> ...
>> b = ioctl(KVM_GET_DIRTY_LOG, vm_fd)
>>
>>
>> I also can see, from the API documentation
>> (https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt), that
>> there is the following restriction concerning VM ioctls :
>>
>> "Only run VM ioctls from the same process (address space) that was used
>> to create the VM."
>>
>>
>> Is there a way to get the fd of a running VM (created from another
>> process), or maybe a better/easier manner to get the dirty bitmap ?
> 
> You can ask the other process to retrieve the dirty bitmap and place
> it in a shared memory segment.
> 
> However, note that KVM_GET_DIRTY_LOG retrieves _and clears_ the dirty
> bitmap.  So if the "owner" of the running VM is already using the dirty
> bitmap, calling KVM_GET_DIRTY_LOG will likely break that usage.


Thank you for your reply!

I take note of your remark about the bitmap clearing; it could be a
real issue.

But I really have no idea how I can 'ask' the other process for something?


Regards.
-- 
Vincent KHERBACHE


Re: Get a vm fd using kvm API's ioctls

2014-02-05 Thread Paolo Bonzini

On 05/02/2014 17:30, Vincent KHERBACHE wrote:

Hi all,

I'm trying to get the dirty bitmap of a specific VM, using
KVM_GET_DIRTY_LOG ioctl.

For this purpose, I should be able to get the file descriptor of an
existing VM by doing something like :

kvm_fd = open("/dev/kvm")
...
b = ioctl(KVM_GET_DIRTY_LOG, vm_fd)


I also can see, from the API documentation
(https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt), that
there is the following restriction concerning VM ioctls :

"Only run VM ioctls from the same process (address space) that was used
to create the VM."


Is there a way to get the fd of a running VM (created from another
process), or maybe a better/easier manner to get the dirty bitmap ?


You can ask the other process to retrieve the dirty bitmap and place 
it in a shared memory segment.


However, note that KVM_GET_DIRTY_LOG retrieves _and clears_ the dirty 
bitmap.  So if the "owner" of the running VM is already using the dirty 
bitmap, calling KVM_GET_DIRTY_LOG will likely break that usage.


Paolo



Get a vm fd using kvm API's ioctls

2014-02-05 Thread Vincent KHERBACHE
Hi all,

I'm trying to get the dirty bitmap of a specific VM, using
KVM_GET_DIRTY_LOG ioctl.

For this purpose, I should be able to get the file descriptor of an
existing VM by doing something like :

kvm_fd = open("/dev/kvm")
...
b = ioctl(KVM_GET_DIRTY_LOG, vm_fd)


I also can see, from the API documentation
(https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt), that
there is the following restriction concerning VM ioctls :

"Only run VM ioctls from the same process (address space) that was used
to create the VM."


Is there a way to get the fd of a running VM (created from another
process), or maybe a better/easier manner to get the dirty bitmap ?


Any help would be welcome.
Thanks,

-- 
Vincent KHERBACHE


Re: QEMU P2P migration speed

2014-02-05 Thread Paolo Bonzini

On 05/02/2014 11:46, Andrey Korolyov wrote:

On 02/05/2014 11:27 AM, Paolo Bonzini wrote:

On 04/02/2014 18:06, Andrey Korolyov wrote:

Migration time is almost independent of VM RSS(varies by ten percent at
maximum), for situation when VM is active on target host, time is about
85 seconds to migrate 8G between hosts, and when it is turned off,
migration time *increasing* to 120s. For curious ones, frequency
management is completely inactive on both nodes, neither CStates
mechanism. Interconnection is relatively fast (20+Gbit/s by IPoIB).


What version of QEMU?

Paolo


Ancie.. ehm, stable - 1.1.2 from wheezy. Should I try 1.6/1.7?


Yeah, you can check out the release notes on wiki.qemu.org to find out
which versions had good improvements.  You can also try compiling
straight from git; there are more speedups there.


Paolo



Re: [GIT PULL] tree-wide: clean up no longer required #include

2014-02-05 Thread Paul Gortmaker
[Re: [GIT PULL] tree-wide: clean up no longer required #include ] 
On 05/02/2014 (Wed 07:41) Ingo Molnar wrote:

> 
> * Stephen Rothwell  wrote:
> 
> > Hi Ingo,
> > 
> > On Wed, 5 Feb 2014 07:06:33 +0100 Ingo Molnar  wrote:
> > > 
> > > So, if you meant Linus to pull it, you probably want to cite a real 
> > > Git URI along the lines of:
> > > 
> > >git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git
> > 
> > Paul provided the proper git url further down in the mail along with the
> > usual pull request message (I guess he should have put that bit at the
> > top).
> 
> Yeah, indeed, and it even comes with a signed tag, which is an extra 
> nice touch:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git 
> tags/init-cleanup
> 
> (I guess the https was mentioned first to lower expectations.)

Just to clarify, the init.git was the repo of raw commits+series file
that was used for testing on linux next; now useless, except for showing
the last several weeks of history (hence the visual http link).  The
signed tag [separate repo] is the application of those commits against
the 3.14-rc1 tag, which was the end goal from the beginning.

Does history matter?  In the case of a cleanup like this, it does only
in the immediate context of this pull request; to help distinguish this
work from some short lived half baked idea that also had its testing
invalidated by arbitrarily rebasing onto the latest shiny tag.

I wouldn't have even mentioned the patch repo, except for the fact that
I know how Linus loves arbitrary rebases [and malformed pull requests]  :)

Thanks,
Paul.
--

> 
> Thanks,
> 
>   Ingo


Re: QEMU P2P migration speed

2014-02-05 Thread Andrey Korolyov
On 02/05/2014 11:27 AM, Paolo Bonzini wrote:
> On 04/02/2014 18:06, Andrey Korolyov wrote:
>> Migration time is almost independent of VM RSS(varies by ten percent at
>> maximum), for situation when VM is active on target host, time is about
>> 85 seconds to migrate 8G between hosts, and when it is turned off,
>> migration time *increasing* to 120s. For curious ones, frequency
>> management is completely inactive on both nodes, neither CStates
>> mechanism. Interconnection is relatively fast (20+Gbit/s by IPoIB).
> 
> What version of QEMU?
> 
> Paolo

Ancie.. ehm, stable - 1.1.2 from wheezy. Should I try 1.6/1.7?


Re: [RFC PATCH 01/10] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation

2014-02-05 Thread Alexander Graf

On 31.01.2014, at 23:17, Paul Mackerras  wrote:

> On Fri, Jan 31, 2014 at 11:47:44AM +0100, Alexander Graf wrote:
>> 
>> On 31.01.2014, at 11:38, Aneesh Kumar K.V wrote:
>> 
>>> Alexander Graf  writes:
>>> 
>>>> On 01/28/2014 05:44 PM, Aneesh Kumar K.V wrote:
>>>>> We definitely don't need to emulate mtspr, because both the registers
>>>>> are hypervisor resource.
>>>>
>>>> This patch description doesn't cover what the patch actually does. It
>>>> changes the implementation from "always tell the guest it uses 100%" to
>>>> "give the guest an accurate amount of cpu time spent inside guest
>>>> context".
>>> 
>>> Will fix that
>>> 
 
>>>> Also, I think we either go with full hyp semantics which means we also
>>>> emulate the offset or we go with no hyp awareness in the guest at all
>>>> which means we also don't emulate SPURR which is a hyp privileged
>>>> register.
>>> 
>>> Can you clarify this ?
>> 
>> In the 2.06 ISA SPURR is hypervisor privileged. That changed for 2.07 where 
>> it became supervisor privileged. So I suppose your patch is ok. When 
>> reviewing those patches I only had 2.06 around because power.org was broken.
> 
> It's always been supervisor privilege for reading and hypervisor
> privilege for writing, ever since it was introduced in 2.05, and that
> hasn't changed.  So I think what Aneesh is doing is correct.

This is what ISA 2.06B says:

308  SPURR  hypv  hypv  64  S
309  PURR   hypv  yes   64  S

And this is ISA 2.07:

308  SPURR  hypv  yes   64  S
309  PURR   hypv  yes   64  S

So as you can see, from 2.06 to 2.07 SPURR became supervisor readable. Either
the spec is wrong, the respective POWER CPUs don't implement the spec
correctly, or "hypv" doesn't mean "hypv" but rather "may be hypv or yes".

I think in the context of this patch it's perfectly reasonable to treat SPURR 
as supervisor readable.


Alex
