Re: [patch 04/12] [PATCH] kvm-s390-ucontrol: export SIE control block to user

2011-12-10 Thread Carsten Otte

On 09.12.2011 17:06, Alexander Graf wrote:

> Same as this. It's an s390 specific hack, so it should be identified as such.
Naming is fine either way with me. Sasha Levin and Avi seemed to prefer 
not to have _S390 in it.


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 10/12] [PATCH] kvm-s390: storage key interface

2011-12-10 Thread Carsten Otte

On 09.12.2011 14:46, heica...@linux.vnet.ibm.com wrote:
> On Fri, Dec 09, 2011 at 01:49:35PM +0100, Carsten Otte wrote:
>> This patch introduces an interface to access the guest visible
>> storage keys. It supports three operations that model the behavior
>> that SSKE/ISKE/RRBE instructions would have if they were issued by
>> the guest. These instructions are all documented in the z architecture
>> principles of operation book.
>>
>> Signed-off-by: Carsten Otte <co...@de.ibm.com>
>> ---
>
> [...]
>
>> +   spin_lock(&current->mm->page_table_lock);
>> +   ptep = ptep_for_addr(addr);
>> +   if (!ptep)
>> +   goto out_unlock;
>
> FWIW, this is also a bit odd: if the guest would perform a storage key
> operation on such an address it would succeed. If the host will do it,
> it will fail (which doesn't match your description above).
> No?

Good catch, will fix.



[PATCH 0/2] pit/hpet: Fix legacy mode switching

2011-12-10 Thread Jan Kiszka
This is a small preparatory series to allow the introduction of the KVM
in-kernel PIT. Of course, it is also a fix for the various bugs in the
related PIT/HPET code. See patches for details.

Jan Kiszka (2):
  hpet: Save/restore cached RTC IRQ level
  i8254: Rework & fix interaction with HPET in legacy mode

 hw/alpha_dp264.c   |2 +-
 hw/hpet.c  |   64 +--
 hw/hpet_emul.h |3 ++
 hw/i8254.c |   60 +++-
 hw/mips_fulong2e.c |2 +-
 hw/mips_jazz.c |2 +-
 hw/mips_malta.c|2 +-
 hw/mips_r4k.c  |2 +-
 hw/pc.c|   13 --
 hw/pc.h|   13 +-
 hw/ppc_prep.c  |2 +-
 11 files changed, 100 insertions(+), 65 deletions(-)

-- 
1.7.3.4



[PATCH 1/2] hpet: Save/restore cached RTC IRQ level

2011-12-10 Thread Jan Kiszka
From: Jan Kiszka <jan.kis...@siemens.com>

In legacy mode, the HPET suppresses the RTC interrupt delivery via IRQ
8 but keeps track of the RTC output level and applies it when legacy
mode is turned off again. This value has to be preserved across save/
restore as it cannot be reconstructed otherwise.
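
The behaviour described above can be sketched as a small standalone C
model. This is only an illustration under simplifying assumptions:
struct hpet_model and the model_* functions are hypothetical names, not
QEMU code. The sketch shows why the cached level has to be part of the
saved state: if it is dropped, leaving legacy mode on the destination
replays the wrong level on IRQ 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant HPET state. */
struct hpet_model {
    bool    legacy_mode;    /* legacy replacement mode active? */
    uint8_t rtc_irq_level;  /* last level seen on the RTC output */
    uint8_t irq8_out;       /* level currently driven on IRQ 8 */
};

/* RTC output changed: always cache it, forward it only outside legacy mode. */
static void model_rtc_irq(struct hpet_model *s, int level)
{
    assert(s);
    s->rtc_irq_level = (uint8_t)(level != 0);
    if (!s->legacy_mode) {
        s->irq8_out = s->rtc_irq_level;
    }
}

/* Mode switch: suppress IRQ 8 on entry, replay the cached level on exit. */
static void model_set_legacy(struct hpet_model *s, bool on)
{
    assert(s);
    if (on && !s->legacy_mode) {
        s->irq8_out = 0;
    } else if (!on && s->legacy_mode) {
        s->irq8_out = s->rtc_irq_level;
    }
    s->legacy_mode = on;
}

/* Model of save/restore: the cached level survives only if transferred. */
static struct hpet_model model_migrate(const struct hpet_model *src,
                                       bool transfer_rtc_level)
{
    assert(src);
    struct hpet_model dst = *src;

    if (!transfer_rtc_level) {
        dst.rtc_irq_level = 0;  /* destination keeps its reset value */
    }
    return dst;
}
```

In this model, a level raised by the RTC while legacy mode is active is
only visible on IRQ 8 after the mode switch if model_migrate() was asked
to transfer it.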

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/hpet.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 5312df7..1b64e6a 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -240,6 +240,24 @@ static int hpet_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static bool hpet_rtc_irq_level_needed(void *opaque)
+{
+HPETState *s = opaque;
+
+return s->rtc_irq_level != 0;
+}
+
+static const VMStateDescription vmstate_hpet_rtc_irq_level = {
+.name = "hpet/rtc_irq_level",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8(rtc_irq_level, HPETState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_hpet_timer = {
 .name = "hpet_timer",
 .version_id = 1,
@@ -273,6 +291,14 @@ static const VMStateDescription vmstate_hpet = {
 VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
 vmstate_hpet_timer, HPETTimer),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &vmstate_hpet_rtc_irq_level,
+.needed = hpet_rtc_irq_level_needed,
+}, {
+/* empty */
+}
 }
 };
 
-- 
1.7.3.4



[PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Jan Kiszka
From: Jan Kiszka <jan.kis...@siemens.com>

When the HPET enters legacy mode, the IRQ output of the PIT is
suppressed and replaced by the HPET timer 0. But the current code to
emulate this was broken in many ways. It reset the PIT state after
re-enabling, it worked against a stale static PIT structure, and it did
not properly save/restore the IRQ output mask in the PIT vmstate.

This patch solves the PIT IRQ control in a different way. On x86, it
redirects the PIT IRQ to the HPET, just like the RTC, but it also
keeps the control line from the HPET to the PIT. This allows disabling
the PIT QEMU timer when it is not needed. The PIT's view of the control
line state is now saved in the same format that qemu-kvm is already
using.

Note that, in contrast to the suppressed RTC IRQ line, we do not need to
save/restore the PIT line state in the HPET. As we trigger a PIT IRQ
update via the control line, the line state is reconstructed on mode
switch.
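
The routing described above can be modelled in a few lines of standalone
C. This is a sketch under assumptions, not the QEMU implementation:
struct board_model and all model_* functions are hypothetical names. It
illustrates why no PIT line state needs to be saved in the HPET:
re-enabling the PIT through the control line triggers an IRQ update, and
that update reconstructs the level on IRQ 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names are illustrative and not QEMU's. */
struct board_model {
    bool legacy_mode;   /* HPET legacy replacement mode */
    bool pit_enabled;   /* control line from the HPET to the PIT */
    int  pit_out;       /* level the PIT counter logic wants to drive */
    int  irq0;          /* level seen by the interrupt controller */
};

/* PIT output arrives at the HPET, which forwards it unless in legacy mode. */
static void model_hpet_pit_input(struct board_model *b, int level)
{
    if (!b->legacy_mode) {
        b->irq0 = level;
    }
}

/* The PIT re-evaluates its output; a disabled PIT stays silent. */
static void model_pit_irq_update(struct board_model *b)
{
    if (b->pit_enabled) {
        model_hpet_pit_input(b, b->pit_out);
    }
}

/* Control line handler: enabling the PIT triggers an IRQ update. */
static void model_pit_set_enabled(struct board_model *b, bool enabled)
{
    b->pit_enabled = enabled;
    if (enabled) {
        model_pit_irq_update(b);
    }
}

/* Legacy mode switch as seen from the HPET configuration register. */
static void model_set_legacy(struct board_model *b, bool on)
{
    assert(b);
    if (on && !b->legacy_mode) {
        b->legacy_mode = true;
        model_pit_set_enabled(b, false);
        b->irq0 = 0;
    } else if (!on && b->legacy_mode) {
        b->legacy_mode = false;
        b->irq0 = 0;
        model_pit_set_enabled(b, true);
    }
}
```

After a round trip through legacy mode, irq0 again carries the PIT
output level although the model never stored it on the HPET side.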

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/alpha_dp264.c   |2 +-
 hw/hpet.c  |   38 +---
 hw/hpet_emul.h |3 ++
 hw/i8254.c |   60 +--
 hw/mips_fulong2e.c |2 +-
 hw/mips_jazz.c |2 +-
 hw/mips_malta.c|2 +-
 hw/mips_r4k.c  |2 +-
 hw/pc.c|   13 --
 hw/pc.h|   13 +--
 hw/ppc_prep.c  |2 +-
 11 files changed, 74 insertions(+), 65 deletions(-)

diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index fcc20e9..412ccf0 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -70,7 +70,7 @@ static void clipper_init(ram_addr_t ram_size,
 pci_bus = typhoon_init(ram_size, rtc_irq, cpus, clipper_pci_map_irq);
 
 rtc_init(1980, rtc_irq);
-pit_init(0x40, 0);
+pit_init(0x40, isa_get_irq(0));
 isa_create_simple("i8042");
 
 /* VGA setup.  Don't bother loading the bios.  */
diff --git a/hw/hpet.c b/hw/hpet.c
index 1b64e6a..ace0b1d 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -64,6 +64,7 @@ typedef struct HPETState {
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 uint32_t flags;
 uint8_t rtc_irq_level;
+qemu_irq pit_enabled;
 uint8_t num_timers;
 HPETTimer timer[HPET_MAX_TIMERS];
 
@@ -572,12 +573,15 @@ static void hpet_ram_write(void *opaque, target_phys_addr_t addr,
 hpet_del_timer(&s->timer[i]);
 }
 }
-/* i8254 and RTC are disabled when HPET is in legacy mode */
+/* i8254 and RTC output pins are disabled
+ * when HPET is in legacy mode */
 if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-hpet_pit_disable();
+qemu_set_irq(s->pit_enabled, 0);
+qemu_irq_lower(s->irqs[0]);
 qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-hpet_pit_enable();
+qemu_irq_lower(s->irqs[0]);
+qemu_set_irq(s->pit_enabled, 1);
 qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
 }
 break;
@@ -631,7 +635,6 @@ static void hpet_reset(DeviceState *d)
 {
 HPETState *s = FROM_SYSBUS(HPETState, sysbus_from_qdev(d));
 int i;
-static int count = 0;
 
 for (i = 0; i < s->num_timers; i++) {
 HPETTimer *timer = &s->timer[i];
@@ -648,29 +651,27 @@ static void hpet_reset(DeviceState *d)
 timer->wrap_flag = 0;
 }
 
+qemu_set_irq(s->pit_enabled, 1);
 s->hpet_counter = 0ULL;
 s->hpet_offset = 0ULL;
 s->config = 0ULL;
-if (count > 0) {
-/* we don't enable pit when hpet_reset is first called (by hpet_init)
- * because hpet is taking over for pit here. On subsequent invocations,
- * hpet_reset is called due to system reset. At this point control must
- * be returned to pit until SW reenables hpet.
- */
-hpet_pit_enable();
-}
 hpet_cfg.hpet[s->hpet_id].event_timer_block_id = (uint32_t)s->capability;
 hpet_cfg.hpet[s->hpet_id].address = sysbus_from_qdev(d)->mmio[0].addr;
-count = 1;
 }
 
-static void hpet_handle_rtc_irq(void *opaque, int n, int level)
+static void hpet_handle_legacy_irq(void *opaque, int n, int level)
 {
 HPETState *s = FROM_SYSBUS(HPETState, opaque);
 
-s->rtc_irq_level = level;
-if (!hpet_in_legacy_mode(s)) {
-qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
+if (n == HPET_LEGACY_PIT_INT) {
+if (!hpet_in_legacy_mode(s)) {
+qemu_set_irq(s->irqs[0], level);
+}
+} else {
+s->rtc_irq_level = level;
+if (!hpet_in_legacy_mode(s)) {
+qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
+}
 }
 }
 
@@ -713,7 +714,8 @@ static int hpet_init(SysBusDevice *dev)
 s->capability |= (s->num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
 s->capability |= ((HPET_CLK_PERIOD) << 32);
 
-qdev_init_gpio_in(&dev->qdev, 

[patch 00/12] Ucontrol patchset V5

2011-12-10 Thread Carsten Otte
Hi Avi, Hi Marcelo,

this iteration of the patchset has two changes:
- Handling of null PTEs is fixed (thanks Heiko)
- Typo in comment is fixed (thanks Joachim)

so long,
Carsten



[patch 01/12] [PATCH] kvm-s390: add parameter for KVM_CREATE_VM

2011-12-10 Thread Carsten Otte
This patch introduces a new config option for user controlled kernel
virtual machines. It introduces an optional parameter to
KVM_CREATE_VM in order to create a user controlled virtual machine.
The parameter is passed to kvm_arch_init_vm for all architectures.
Valid values for the new parameter are KVM_VM_REGULAR (defined to 0
for backward compatibility to old KVM_CREATE_VM) and
KVM_VM_UCONTROL for s390 only.
Note that the user controlled virtual machines require CAP_SYS_ADMIN
privileges.
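
The type check described above can be sketched as a standalone C
function. This is a userspace model under stated assumptions, not the
kernel code: model_check_vm_type() is a hypothetical name, and the two
boolean parameters stand in for CONFIG_KVM_UCONTROL and for
capable(CAP_SYS_ADMIN). The constant values are the ones this series
defines. As in the s390 hunk of this patch, a missing privilege is
reported with the same -EINVAL as an unknown type.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Values as defined by this patch series. */
#define KVM_VM_REGULAR  0UL
#define KVM_VM_UCONTROL 1UL

/*
 * Hypothetical userspace model of the s390 type check.
 * ucontrol_built_in stands for CONFIG_KVM_UCONTROL,
 * has_cap_sys_admin for capable(CAP_SYS_ADMIN).
 */
static int model_check_vm_type(unsigned long type, bool ucontrol_built_in,
                               bool has_cap_sys_admin)
{
    if (type == KVM_VM_REGULAR) {
        return 0;
    }
    if (type == KVM_VM_UCONTROL && ucontrol_built_in && has_cap_sys_admin) {
        return 0;
    }
    /* Unknown type, option not built in, or missing privilege. */
    return -EINVAL;
}
```

Because KVM_VM_REGULAR is 0, callers that pass no argument to
KVM_CREATE_VM keep working unchanged.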

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 Documentation/virtual/kvm/api.txt |7 ++-
 arch/ia64/kvm/kvm-ia64.c  |5 -
 arch/powerpc/kvm/powerpc.c|5 -
 arch/s390/kvm/Kconfig |9 +
 arch/s390/kvm/kvm-s390.c  |   30 +-
 arch/s390/kvm/kvm-s390.h  |   10 ++
 arch/x86/kvm/x86.c|5 -
 include/linux/kvm.h   |3 +++
 include/linux/kvm_host.h  |2 +-
 virt/kvm/kvm_main.c   |   19 +--
 10 files changed, 79 insertions(+), 16 deletions(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -95,7 +95,7 @@ described as 'basic' will be available.
 Capability: basic
 Architectures: all
 Type: system ioctl
-Parameters: none
+Parameters: machine type identifier (KVM_VM_*)
 Returns: a VM fd that can be used to control the new virtual machine.
 
 The new VM has no virtual cpus and no memory.  An mmap() of a VM fd
@@ -103,6 +103,11 @@ will access the virtual machine's physic
 corresponds to guest physical address zero.  Use of mmap() on a VM fd
 is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
 available.
+You most certainly want to use KVM_VM_REGULAR as machine type.
+
+In order to create user controlled virtual machines on S390, check
+KVM_CAP_UCONTROL and use KVM_VM_UCONTROL as machine type as
+privileged user (CAP_SYS_ADMIN).
 
 4.3 KVM_GET_MSR_INDEX_LIST
 
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -809,10 +809,13 @@ static void kvm_build_io_pmt(struct kvm
 #define GUEST_PHYSICAL_RR4 0x2739
 #define VMM_INIT_RR0x1660
 
-int kvm_arch_init_vm(struct kvm *kvm)
+int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
BUG_ON(!kvm);
 
+   if (type != KVM_VM_REGULAR)
+   return -EINVAL;
+
kvm->arch.is_sn2 = ia64_platform_is("sn2");
 
kvm->arch.metaphysical_rr0 = GUEST_PHYSICAL_RR0;
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -171,8 +171,11 @@ void kvm_arch_check_processor_compat(voi
*(int *)rtn = kvmppc_core_check_processor_compat();
 }
 
-int kvm_arch_init_vm(struct kvm *kvm)
+int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
+   if (type != KVM_VM_REGULAR)
+   return -EINVAL;
+
return kvmppc_core_init_vm(kvm);
 }
 
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -34,6 +34,15 @@ config KVM
 
  If unsure, say N.
 
+config KVM_UCONTROL
+   bool "Userspace controlled virtual machines"
+   depends on KVM
+   ---help---
+ Allow CAP_SYS_ADMIN users to create KVM virtual machines that are
+ controlled by userspace.
+
+ If unsure, say N.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -171,11 +171,28 @@ long kvm_arch_vm_ioctl(struct file *filp
return r;
 }
 
-int kvm_arch_init_vm(struct kvm *kvm)
+int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
int rc;
char debug_name[16];
 
+   rc = -EINVAL;
+#ifdef CONFIG_KVM_UCONTROL
+   switch (type) {
+   case KVM_VM_REGULAR:
+   break;
+   case KVM_VM_UCONTROL:
+   if (!capable(CAP_SYS_ADMIN))
+   goto out_err;
+   break;
+   default:
+   goto out_err;
+   }
+#else
+   if (type != KVM_VM_REGULAR)
+   goto out_err;
+#endif
+
rc = s390_enable_sie();
if (rc)
goto out_err;
@@ -198,10 +215,13 @@ int kvm_arch_init_vm(struct kvm *kvm)
debug_register_view(kvm->arch.dbf, &debug_sprintf_view);
VM_EVENT(kvm, 3, "%s", "vm created");
 
-   kvm->arch.gmap = gmap_alloc(current->mm);
-   if (!kvm->arch.gmap)
-   goto out_nogmap;
-
+   if (type == KVM_VM_REGULAR) {
+   kvm->arch.gmap = gmap_alloc(current->mm);
+   if (!kvm->arch.gmap)
+   goto out_nogmap;
+   } else {
+   kvm->arch.gmap = NULL;
+   }
return 0;
 out_nogmap:
debug_unregister(kvm->arch.dbf);
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -47,6 +47,16 @@ static inline int __cpu_is_stopped(struc
return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOP_INT;
 

[patch 05/12] [PATCH] kvm-s390-ucontrol: disable in-kernel handling of SIE intercepts

2011-12-10 Thread Carsten Otte
This patch disables in-kernel handling of SIE intercepts for user
controlled virtual machines. All intercepts are passed to userspace
via KVM_EXIT_SIE exit reason just like SIE intercepts that cannot be
handled in-kernel for regular KVM guests.

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 arch/s390/kvm/kvm-s390.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -572,7 +572,10 @@ rerun_vcpu:
rc = __vcpu_run(vcpu);
if (rc)
break;
-   rc = kvm_handle_sie_intercept(vcpu);
+   if (kvm_is_ucontrol(vcpu->kvm))
+   rc = -EOPNOTSUPP;
+   else
+   rc = kvm_handle_sie_intercept(vcpu);
} while (!signal_pending(current) && !rc);
 
if (rc == SIE_INTERCEPT_RERUNVCPU)



[patch 09/12] [PATCH] kvm-s390: fix assumption for KVM_MAX_VCPUS

2011-12-10 Thread Carsten Otte
This patch fixes definition of the idle_mask and the local_int array
in kvm_s390_float_interrupt. Previous definition had 64 cpus max
hardcoded instead of using KVM_MAX_VCPUS.
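
As a side note on the array sizing, the following standalone sketch
compares the expression used in this patch with a per-bit rounding. The
helper names are hypothetical, and the comparison is an observation, not
a claim about what the author intended: the patch expression rounds up
by sizeof(long), i.e. by bytes, so it yields at least as many longs as a
bitmap with one bit per vcpu strictly needs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long on the build platform. */
#define MODEL_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Array length as written in the patch: round up by sizeof(long). */
static size_t patch_idle_mask_longs(size_t max_vcpus)
{
    return (max_vcpus + sizeof(long) - 1) / sizeof(long);
}

/* Minimum number of longs needed for a bitmap with nbits bits. */
static size_t bitmap_longs(size_t nbits)
{
    return (nbits + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;
}
```

Both helpers grow with the vcpu limit, which is the point of the patch;
the second one only shows the tighter bound.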

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 arch/s390/include/asm/kvm_host.h |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -220,8 +220,9 @@ struct kvm_s390_float_interrupt {
struct list_head list;
atomic_t active;
int next_rr_cpu;
-   unsigned long idle_mask [(64 + sizeof(long) - 1) / sizeof(long)];
-   struct kvm_s390_local_interrupt *local_int[64];
+   unsigned long idle_mask[(KVM_MAX_VCPUS + sizeof(long) - 1)
+   / sizeof(long)];
+   struct kvm_s390_local_interrupt *local_int[KVM_MAX_VCPUS];
 };
 
 



[patch 08/12] [PATCH] kvm-s390-ucontrol: disable sca

2011-12-10 Thread Carsten Otte
This patch makes sure user controlled virtual machines do not use a
system control area (sca). This is needed in order to create
virtual machines with more cpus than the size of the sca [64].

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 arch/s390/kvm/kvm-s390.c |   30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -234,10 +234,13 @@ out_err:
 void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
VCPU_EVENT(vcpu, 3, "%s", "free cpu");
-   clear_bit(63 - vcpu->vcpu_id, (unsigned long *) &vcpu->kvm->arch.sca->mcn);
-   if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda ==
-   (__u64) vcpu->arch.sie_block)
-   vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
+   if (!kvm_is_ucontrol(vcpu->kvm)) {
+   clear_bit(63 - vcpu->vcpu_id,
+ (unsigned long *) &vcpu->kvm->arch.sca->mcn);
+   if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda ==
+   (__u64) vcpu->arch.sie_block)
+   vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
+   }
smp_mb();
 
if (kvm_is_ucontrol(vcpu->kvm))
@@ -374,12 +377,19 @@ struct kvm_vcpu *kvm_arch_vcpu_create(st
goto out_free_cpu;
 
vcpu->arch.sie_block->icpua = id;
-   BUG_ON(!kvm->arch.sca);
-   if (!kvm->arch.sca->cpu[id].sda)
-   kvm->arch.sca->cpu[id].sda = (__u64) vcpu->arch.sie_block;
-   vcpu->arch.sie_block->scaoh = (__u32)(((__u64)kvm->arch.sca) >> 32);
-   vcpu->arch.sie_block->scaol = (__u32)(__u64)kvm->arch.sca;
-   set_bit(63 - id, (unsigned long *) &kvm->arch.sca->mcn);
+   if (!kvm_is_ucontrol(kvm)) {
+   if (!kvm->arch.sca) {
+   WARN_ON_ONCE(1);
+   goto out_free_cpu;
+   }
+   if (!kvm->arch.sca->cpu[id].sda)
+   kvm->arch.sca->cpu[id].sda =
+   (__u64) vcpu->arch.sie_block;
+   vcpu->arch.sie_block->scaoh =
+   (__u32)(((__u64)kvm->arch.sca) >> 32);
+   vcpu->arch.sie_block->scaol = (__u32)(__u64)kvm->arch.sca;
+   set_bit(63 - id, (unsigned long *) &kvm->arch.sca->mcn);
+   }
 
spin_lock_init(&vcpu->arch.local_int.lock);
INIT_LIST_HEAD(&vcpu->arch.local_int.list);



[patch 04/12] [PATCH] kvm-s390-ucontrol: export SIE control block to user

2011-12-10 Thread Carsten Otte
This patch exports the s390 SIE hardware control block to userspace
via the mapping of the vcpu file descriptor. In order to do so,
a new arch callback named kvm_arch_vcpu_fault is introduced for all
architectures. It allows to map architecture specific pages.
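
From the userspace side, the mapping described above could look roughly
like the sketch below. It is an illustration under assumptions, not
tested against a real s390 host: map_sie_block() and
sie_block_file_offset() are hypothetical helper names, the protection
flags are a guess, and the control block is assumed to fit into one
page. The only values taken from this patch are the constant
KVM_S390_SIE_PAGE_OFFSET and the fact that the fault handler compares it
against the page offset of the mapping, which means mmap() needs the
offset in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Value as defined by this patch (a page offset, not a byte offset). */
#define KVM_S390_SIE_PAGE_OFFSET 1

/* Convert the page offset into the byte offset that mmap() expects. */
static off_t sie_block_file_offset(long page_size)
{
    return (off_t)KVM_S390_SIE_PAGE_OFFSET * page_size;
}

/*
 * Hypothetical helper: map the SIE control block of a vcpu fd.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
static void *map_sie_block(int vcpu_fd)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0) {
        return MAP_FAILED;
    }
    return mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, vcpu_fd, sie_block_file_offset(page_size));
}
```

On a vcpu fd that does not belong to a user controlled virtual machine,
the fault handler in this patch returns VM_FAULT_SIGBUS, so the mapping
would fault on first access.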

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 Documentation/virtual/kvm/api.txt |5 +
 arch/ia64/kvm/kvm-ia64.c  |5 +
 arch/powerpc/kvm/powerpc.c|5 +
 arch/s390/kvm/kvm-s390.c  |   13 +
 arch/x86/kvm/x86.c|5 +
 include/linux/kvm.h   |1 +
 include/linux/kvm_host.h  |1 +
 virt/kvm/kvm_main.c   |2 +-
 8 files changed, 36 insertions(+), 1 deletion(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -218,6 +218,11 @@ allocation of vcpu ids.  For example, if
 single-threaded guest vcpus, it should make all vcpu ids be a multiple
 of the number of vcpus per vcore.
 
+For virtual cpus that have been created with S390 user controlled virtual
+machines, the resulting vcpu fd can be memory mapped at page offset
+KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
+cpu's hardware control block.
+
 4.8 KVM_GET_DIRTY_LOG (vm ioctl)
 
 Capability: basic
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1566,6 +1566,11 @@ out:
return r;
 }
 
+int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
+{
+   return VM_FAULT_SIGBUS;
+}
+
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -659,6 +659,11 @@ out:
return r;
 }
 
+int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
+{
+   return VM_FAULT_SIGBUS;
+}
+
 static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 {
u32 inst_lis = 0x3c00;
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -769,6 +769,19 @@ long kvm_arch_vcpu_ioctl(struct file *fi
return r;
 }
 
+int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
+{
+#ifdef CONFIG_KVM_UCONTROL
+   if ((vmf->pgoff == KVM_S390_SIE_PAGE_OFFSET)
+ && (kvm_is_ucontrol(vcpu->kvm))) {
+   vmf->page = virt_to_page(vcpu->arch.sie_block);
+   get_page(vmf->page);
+   return 0;
+   }
+#endif
+   return VM_FAULT_SIGBUS;
+}
+
 /* Section: memory related */
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
   struct kvm_memory_slot *memslot,
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2790,6 +2790,11 @@ out:
return r;
 }
 
+int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
+{
+   return VM_FAULT_SIGBUS;
+}
+
 static int kvm_vm_ioctl_set_tss_addr(struct kvm *kvm, unsigned long addr)
 {
int ret;
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -439,6 +439,7 @@ struct kvm_ppc_pvinfo {
 
 #define KVM_VM_REGULAR  0
 #define KVM_VM_UCONTROL 1
+#define KVM_S390_SIE_PAGE_OFFSET 1
 
 /*
  * ioctls for /dev/kvm fds:
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -449,6 +449,7 @@ long kvm_arch_dev_ioctl(struct file *fil
unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg);
+int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
 
 int kvm_dev_ioctl_check_extension(long ext);
 
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1657,7 +1657,7 @@ static int kvm_vcpu_fault(struct vm_area
page = virt_to_page(vcpu->kvm->coalesced_mmio_ring);
 #endif
else
-   return VM_FAULT_SIGBUS;
+   return kvm_arch_vcpu_fault(vcpu, vmf);
get_page(page);
vmf->page = page;
return 0;



[patch 07/12] [PATCH] kvm-s390-ucontrol: interface to inject faults on a vcpu page table

2011-12-10 Thread Carsten Otte
This patch allows the user to fault in pages on a virtual cpu's
address space for user controlled virtual machines. Typically this
is superfluous because userspace can just create a mapping and
let the kernel's page fault logic take care of it. There is one
exception: SIE won't start if the lowcore is not present. Normally
the kernel takes care of this [handle_validity() in
arch/s390/kvm/intercept.c] but since the kernel does not handle
intercepts for user controlled virtual machines, userspace needs to
be able to handle this condition.

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 Documentation/virtual/kvm/api.txt |   16 
 arch/s390/kvm/kvm-s390.c  |6 ++
 include/linux/kvm.h   |1 +
 3 files changed, 23 insertions(+)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1498,6 +1498,22 @@ This ioctl unmaps the memory in the vcpu
 vcpu_addr with the length length. The field user_addr is ignored.
 All parameters need to be aligned by 1 megabyte.
 
+4.66 KVM_S390_VCPU_FAULT
+
+Capability: KVM_CAP_UCONTROL
+Architectures: s390
+Type: vcpu ioctl
+Parameters: vcpu absolute address (in)
+Returns: 0 in case of success
+
+This call creates a page table entry on the virtual cpu's address space
+(for user controlled virtual machines) or the virtual machine's address
+space (for regular virtual machines). This only works for minor faults,
+thus it's recommended to access subject memory page via the user page
+table upfront. This is useful to handle validity intercepts for user
+controlled virtual machines to fault in the virtual cpu's lowcore pages
+prior to calling the KVM_RUN ioctl.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -767,6 +767,12 @@ long kvm_arch_vcpu_ioctl(struct file *fi
break;
}
 #endif
+   case KVM_S390_VCPU_FAULT: {
+   r = gmap_fault(arg, vcpu->arch.gmap);
+   if (!IS_ERR_VALUE(r))
+   r = 0;
+   break;
+   }
default:
r = -EINVAL;
}
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -673,6 +673,7 @@ struct kvm_s390_ucas_mapping {
 };
 #define KVM_S390_UCAS_MAP    _IOW(KVMIO, 0x50, struct kvm_s390_ucas_mapping)
 #define KVM_S390_UCAS_UNMAP  _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping)
+#define KVM_S390_VCPU_FAULT _IOW(KVMIO, 0x52, unsigned long)
 
 /* Device model IOC */
 #define KVM_CREATE_IRQCHIP _IO(KVMIO,   0x60)



[patch 12/12] [PATCH] kvm-s390: Fix return code for unknown ioctl numbers

2011-12-10 Thread Carsten Otte
This patch fixes the return code of kvm_arch_vcpu_ioctl in case
of an unknown ioctl number.

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 arch/s390/kvm/kvm-s390.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -889,7 +889,7 @@ long kvm_arch_vcpu_ioctl(struct file *fi
break;
}
default:
-   r = -EINVAL;
+   r = -ENOTTY;
}
return r;
 }
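
The convention behind this one-line change can be shown with a small
standalone dispatcher. The sketch is a model only: model_vcpu_ioctl()
and the two command numbers are hypothetical and are not KVM ioctls. It
separates the two error cases: a command the handler does not know at
all gets -ENOTTY, while a known command with a bad argument gets
-EINVAL.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers for the sketch; not real KVM ioctls. */
enum { MODEL_IOCTL_SET = 1, MODEL_IOCTL_GET = 2 };

static long model_vcpu_ioctl(unsigned int cmd, long arg)
{
    long r;

    switch (cmd) {
    case MODEL_IOCTL_SET:
        /* Known command, bad argument: -EINVAL is appropriate here. */
        r = (arg >= 0) ? 0 : -EINVAL;
        break;
    case MODEL_IOCTL_GET:
        r = 0;
        break;
    default:
        /* Command not understood by this handler at all. */
        r = -ENOTTY;
    }
    return r;
}
```

With distinct codes, a caller can tell an unsupported ioctl apart from a
supported one that rejected its argument.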



[patch 11/12] [PATCH] kvm-s390-ucontrol: announce capability for user controlled vms

2011-12-10 Thread Carsten Otte
This patch announces a new capability KVM_CAP_UCONTROL that
indicates that kvm can now support virtual machines that are
controlled by userspace.

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 arch/s390/kvm/kvm-s390.c |3 +++
 include/linux/kvm.h  |1 +
 2 files changed, 4 insertions(+)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -231,6 +231,9 @@ int kvm_dev_ioctl_check_extension(long e
case KVM_CAP_S390_PSW:
case KVM_CAP_S390_GMAP:
case KVM_CAP_SYNC_MMU:
+#ifdef CONFIG_KVM_UCONTROL
+   case KVM_CAP_UCONTROL:
+#endif
r = 1;
break;
default:
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -574,6 +574,7 @@ struct kvm_s390_keyop {
 #define KVM_CAP_MAX_VCPUS 66   /* returns max vcpus per vm */
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_UCONTROL 72
 
 #ifdef KVM_CAP_IRQ_ROUTING
 



[patch 10/12] [PATCH] kvm-s390: storage key interface

2011-12-10 Thread Carsten Otte
This patch introduces an interface to access the guest visible
storage keys. It supports three operations that model the behavior
that SSKE/ISKE/RRBE instructions would have if they were issued by
the guest. These instructions are all documented in the z architecture
principles of operation book.
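
The three operations can be illustrated with a standalone model that
works on a single guest visible key byte. This is a sketch under
assumptions and leaves out everything the kernel side has to do (page
tables, pgste locking, the real storage key): struct model_keyop and
model_keyop() are hypothetical names, the SKEY_* masks follow the
storage key layout from the Principles of Operation, and returning
-EINVAL for an unknown operation as well as returning the whole previous
key from RRBE are choices made for this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Operation codes as defined by this patch. */
#define KVM_S390_KEYOP_SSKE 0x01
#define KVM_S390_KEYOP_ISKE 0x02
#define KVM_S390_KEYOP_RRBE 0x03

/* Storage key layout: access control, fetch protection, reference, change. */
#define SKEY_ACC 0xF0
#define SKEY_F   0x08
#define SKEY_R   0x04
#define SKEY_C   0x02

/* Hypothetical reduced form of the ioctl parameter (no user_addr). */
struct model_keyop {
    uint8_t key;
    uint8_t operation;
};

/* Apply one key operation to the guest visible key byte. */
static int model_keyop(uint8_t *guest_key, struct model_keyop *kop)
{
    assert(guest_key && kop);
    switch (kop->operation) {
    case KVM_S390_KEYOP_SSKE:
        /* Set: the provided key becomes the guest visible key. */
        *guest_key = kop->key & (SKEY_ACC | SKEY_F | SKEY_R | SKEY_C);
        return 0;
    case KVM_S390_KEYOP_ISKE:
        /* Insert: report the guest visible key to the caller. */
        kop->key = *guest_key;
        return 0;
    case KVM_S390_KEYOP_RRBE:
        /* Reset reference bit: report the old state, then clear R. */
        kop->key = *guest_key;
        *guest_key = (uint8_t)(*guest_key & ~SKEY_R);
        return 0;
    default:
        return -EINVAL; /* assumption of this sketch */
    }
}
```

A second RRBE on the same key reports the reference bit as 0, because
the first call already cleared it.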

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 Documentation/virtual/kvm/api.txt |   38 +
 arch/s390/include/asm/kvm_host.h  |4 +
 arch/s390/include/asm/pgtable.h   |1 
 arch/s390/kvm/kvm-s390.c  |  108 --
 arch/s390/mm/pgtable.c|   70 ++--
 include/linux/kvm.h   |7 ++
 6 files changed, 207 insertions(+), 21 deletions(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1514,6 +1514,44 @@ table upfront. This is useful to handle
 controlled virtual machines to fault in the virtual cpu's lowcore pages
 prior to calling the KVM_RUN ioctl.
 
+4.67 KVM_S390_KEYOP
+
+Capability: KVM_CAP_UCONTROL
+Architectures: s390
+Type: vm ioctl
+Parameters: struct kvm_s390_keyop (in+out)
+Returns: 0 in case of success
+
+The parameter looks like this:
+   struct kvm_s390_keyop {
+   __u64 user_addr;
+   __u8  key;
+   __u8  operation;
+   };
+
+user_addr  contains the userspace address of a memory page
+key        contains the guest visible storage key as defined by the
+           z Architecture Principles of Operation book, including key
+           value for key controlled storage protection, the fetch
+           protection bit, and the reference and change indicator bits
+operation  indicates the key operation that should be performed
+
+The following operations are supported:
+KVM_S390_KEYOP_SSKE:
+   This operation behaves just like the set storage key extended (SSKE)
+   instruction would, if it were issued by the guest. The storage key
+   provided in key is placed in the guest visible storage key.
+KVM_S390_KEYOP_ISKE:
+   This operation behaves just like the insert storage key extended (ISKE)
+   instruction would, if it were issued by the guest. After this call,
+   the guest visible storage key is presented in the key field.
+KVM_S390_KEYOP_RRBE:
+   This operation behaves just like the reset referenced bit extended
+   (RRBE) instruction would, if it were issued by the guest. The guest
+   visible reference bit is cleared, and the value presented in the key
+   field after this call has the reference bit set to 1 in case the
+   guest view of the reference bit was 1 prior to this call.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -24,6 +24,10 @@
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
 
+#define KVM_S390_KEYOP_SSKE 0x01
+#define KVM_S390_KEYOP_ISKE 0x02
+#define KVM_S390_KEYOP_RRBE 0x03
+
 struct sca_entry {
atomic_t scn;
__u32   reserved;
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1254,6 +1254,7 @@ static inline pte_t mk_swap_pte(unsigned
 extern int vmem_add_mapping(unsigned long start, unsigned long size);
 extern int vmem_remove_mapping(unsigned long start, unsigned long size);
 extern int s390_enable_sie(void);
+extern pte_t *ptep_for_addr(unsigned long addr);
 
 /*
  * No page table caches to initialise
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -112,13 +112,115 @@ void kvm_arch_exit(void)
 {
 }
 
+static long kvm_s390_keyop(struct kvm_s390_keyop *kop)
+{
+   unsigned long addr = kop->user_addr;
+   pte_t *ptep;
+   pgste_t pgste;
+   int r;
+   unsigned long skey;
+   unsigned long bits;
+
+   /* make sure this process is a hypervisor */
+   r = -EINVAL;
+   if (!mm_has_pgste(current->mm))
+   goto out;
+
+   r = -EFAULT;
+   if (addr >= PGDIR_SIZE)
+   goto out;
+
+   spin_lock(&current->mm->page_table_lock);
+   ptep = ptep_for_addr(addr);
+   if (IS_ERR(ptep)) {
+   r = PTR_ERR(ptep);
+   goto out_unlock;
+   }
+
+   pgste = pgste_get_lock(ptep);
+
+   switch (kop->operation) {
+   case KVM_S390_KEYOP_SSKE:
+   pgste = pgste_update_all(ptep, pgste);
+   /* set the real key back w/o rc bits */
+   skey = kop->key & (_PAGE_ACC_BITS | _PAGE_FP_BIT);
+   if (pte_present(*ptep)) {
+   page_set_storage_key(pte_val(*ptep), skey, 1);
+   /* avoid race clobbering changed bit */
+   pte_val(*ptep) |= _PAGE_SWC;
+   }
+   /* put acc+f plus guest referenced and changed into the pgste */
+   pgste_val(pgste) &= ~(RCP_ACC_BITS | RCP_FP_BIT | RCP_GR_BIT
+  

[patch 02/12] [PATCH] kvm-s390-ucontrol: per vcpu address spaces

2011-12-10 Thread Carsten Otte
This patch introduces two ioctls for virtual cpus, that are only
valid for kernel virtual machines that are controlled by userspace.
Each virtual cpu has its individual address space in this mode of
operation, and each address space is backed by the gmap
implementation just like the address space for regular KVM guests.
KVM_S390_UCAS_MAP allows to map a part of the user's virtual address
space to the vcpu. Starting offset and length in both the user and
the vcpu address space need to be aligned to 1M.
KVM_S390_UCAS_UNMAP can be used to unmap a range of memory from a
virtual cpu in a similar way.
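
The 1M alignment requirement can be checked in userspace before issuing
the ioctl. The following is a minimal sketch: the struct mirrors the
field names of kvm_s390_ucas_mapping from this patch, while
struct model_ucas_mapping, UCAS_ALIGN and ucas_mapping_is_aligned() are
hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Segment size required for all three fields: 1 megabyte. */
#define UCAS_ALIGN (UINT64_C(1) << 20)

/* Mirrors the fields of struct kvm_s390_ucas_mapping in this patch. */
struct model_ucas_mapping {
    uint64_t user_addr;
    uint64_t vcpu_addr;
    uint64_t length;
};

/* True if start addresses and length are all multiples of 1M. */
static bool ucas_mapping_is_aligned(const struct model_ucas_mapping *m)
{
    assert(m);
    return ((m->user_addr | m->vcpu_addr | m->length) & (UCAS_ALIGN - 1)) == 0;
}
```

OR-ing the three values first means that a single mask test catches a
misaligned field in any position.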

Signed-off-by: Carsten Otte <co...@de.ibm.com>
---
---
 Documentation/virtual/kvm/api.txt |   38 
 arch/s390/kvm/kvm-s390.c  |   50 +-
 include/linux/kvm.h   |   10 +++
 3 files changed, 97 insertions(+), 1 deletion(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1455,6 +1455,44 @@ is supported; 2 if the processor require
 an RMA, or 1 if the processor can use an RMA but doesn't require it,
 because it supports the Virtual RMA (VRMA) facility.
 
+4.64 KVM_S390_UCAS_MAP
+
+Capability: KVM_CAP_UCONTROL
+Architectures: s390
+Type: vcpu ioctl
+Parameters: struct kvm_s390_ucas_mapping (in)
+Returns: 0 in case of success
+
+The parameter is defined like this:
+   struct kvm_s390_ucas_mapping {
+   __u64 user_addr;
+   __u64 vcpu_addr;
+   __u64 length;
+   };
+
+This ioctl maps the memory at user_addr with the length length to
+the vcpu's address space starting at vcpu_addr. All parameters need to
+be aligned to 1 megabyte.
+
+4.65 KVM_S390_UCAS_UNMAP
+
+Capability: KVM_CAP_UCONTROL
+Architectures: s390
+Type: vcpu ioctl
+Parameters: struct kvm_s390_ucas_mapping (in)
+Returns: 0 in case of success
+
+The parameter is defined like this:
+   struct kvm_s390_ucas_mapping {
+   __u64 user_addr;
+   __u64 vcpu_addr;
+   __u64 length;
+   };
+
+This ioctl unmaps the memory in the vcpu's address space starting at
+vcpu_addr with the length length. The field user_addr is ignored.
+All parameters need to be aligned to 1 megabyte.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -239,6 +239,10 @@ void kvm_arch_vcpu_destroy(struct kvm_vc
(__u64) vcpu->arch.sie_block)
vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
smp_mb();
+
+   if (kvm_is_ucontrol(vcpu->kvm))
+   gmap_free(vcpu->arch.gmap);
+
free_page((unsigned long)(vcpu->arch.sie_block));
kvm_vcpu_uninit(vcpu);
kfree(vcpu);
@@ -269,12 +273,20 @@ void kvm_arch_destroy_vm(struct kvm *kvm
kvm_free_vcpus(kvm);
free_page((unsigned long)(kvm->arch.sca));
debug_unregister(kvm->arch.dbf);
-   gmap_free(kvm->arch.gmap);
+   if (!kvm_is_ucontrol(kvm))
+   gmap_free(kvm->arch.gmap);
 }
 
 /* Section: vcpu related */
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+   if (kvm_is_ucontrol(vcpu->kvm)) {
+   vcpu->arch.gmap = gmap_alloc(current->mm);
+   if (!vcpu->arch.gmap)
+   return -ENOMEM;
+   return 0;
+   }
+
vcpu->arch.gmap = vcpu->kvm->arch.gmap;
return 0;
 }
@@ -693,6 +705,42 @@ long kvm_arch_vcpu_ioctl(struct file *fi
case KVM_S390_INITIAL_RESET:
r = kvm_arch_vcpu_ioctl_initial_reset(vcpu);
break;
+#ifdef CONFIG_KVM_UCONTROL
+   case KVM_S390_UCAS_MAP: {
+   struct kvm_s390_ucas_mapping ucasmap;
+
+   if (copy_from_user(&ucasmap, argp, sizeof(ucasmap))) {
+   r = -EFAULT;
+   break;
+   }
+
+   if (!kvm_is_ucontrol(vcpu->kvm)) {
+   r = -EINVAL;
+   break;
+   }
+
+   r = gmap_map_segment(vcpu->arch.gmap, ucasmap.user_addr,
+ucasmap.vcpu_addr, ucasmap.length);
+   break;
+   }
+   case KVM_S390_UCAS_UNMAP: {
+   struct kvm_s390_ucas_mapping ucasmap;
+
+   if (copy_from_user(&ucasmap, argp, sizeof(ucasmap))) {
+   r = -EFAULT;
+   break;
+   }
+
+   if (!kvm_is_ucontrol(vcpu->kvm)) {
+   r = -EINVAL;
+   break;
+   }
+
+   r = gmap_unmap_segment(vcpu->arch.gmap, ucasmap.vcpu_addr,
+   ucasmap.length);
+   break;
+   }
+#endif
default:
r = -EINVAL;
}
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -657,6 +657,16 @@ struct kvm_clock_data {
struct 
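
Editorial sketch, not part of the posted patch: the 1 MB alignment rule
described in the api.txt hunk above could be checked in userspace before the
ioctl is issued, roughly as follows. The struct is a local mirror of
kvm_s390_ucas_mapping, and the helper names (ucas_aligned, ucas_prepare) as
well as the rejection of a zero length are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of struct kvm_s390_ucas_mapping from the patch. Real code
 * would take the definition from <linux/kvm.h> on a kernel with this series. */
struct ucas_mapping {
	uint64_t user_addr;
	uint64_t vcpu_addr;
	uint64_t length;
};

#define UCAS_SEGMENT_SIZE (1ULL << 20)	/* 1 MB, the alignment the ioctl asks for */

/* True if value is a multiple of 1 MB. */
static bool ucas_aligned(uint64_t value)
{
	return (value & (UCAS_SEGMENT_SIZE - 1)) == 0;
}

/* Hypothetical helper: fill *map and return 0, or return -1 if any field is
 * misaligned. Rejecting a zero length is an assumption of this sketch. */
static int ucas_prepare(struct ucas_mapping *map, uint64_t user_addr,
			uint64_t vcpu_addr, uint64_t length)
{
	if (length == 0 || !ucas_aligned(user_addr) ||
	    !ucas_aligned(vcpu_addr) || !ucas_aligned(length))
		return -1;

	map->user_addr = user_addr;
	map->vcpu_addr = vcpu_addr;
	map->length = length;
	return 0;
}
```

A caller would then pass the filled structure to the vcpu file descriptor
with ioctl(vcpu_fd, KVM_S390_UCAS_MAP, &map). For KVM_S390_UCAS_UNMAP the
same structure is used and user_addr is ignored.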

[patch 03/12] [PATCH] kvm-s390-ucontrol: export page faults to user

2011-12-10 Thread Carsten Otte
This patch introduces a new exit reason in the kvm_run structure
named KVM_EXIT_UCONTROL. This exit indicates that a virtual cpu
has recognized a fault on the host page table. The idea is that
userspace can handle this fault by mapping memory at the fault
location into the cpu's address space and then continue to run the
virtual cpu.

Signed-off-by: Carsten Otte co...@de.ibm.com
---
---
 Documentation/virtual/kvm/api.txt |   14 ++
 arch/s390/kvm/kvm-s390.c  |   32 +++-
 arch/s390/kvm/kvm-s390.h  |1 +
 include/linux/kvm.h   |6 ++
 4 files changed, 48 insertions(+), 5 deletions(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1653,6 +1653,20 @@ s390 specific.
 
 s390 specific.
 
+   /* KVM_EXIT_UCONTROL */
+   struct {
+   __u64 trans_exc_code;
+   __u32 pgm_code;
+   } s390_ucontrol;
+
+s390 specific. A page fault has occurred for a user controlled virtual
+machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
+resolved by the kernel.
+The program code and the translation exception code that were placed
+in the cpu's lowcore are presented here as defined by the z Architecture
+Principles of Operation Book in the Chapter for Dynamic Address Translation
+(DAT).
+
/* KVM_EXIT_DCR */
struct {
__u32 dcrn;
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -499,8 +499,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(stru
return -EINVAL; /* not implemented yet */
 }
 
-static void __vcpu_run(struct kvm_vcpu *vcpu)
+static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
+   int rc;
+
memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
 
if (need_resched())
@@ -517,9 +519,15 @@ static void __vcpu_run(struct kvm_vcpu *
local_irq_enable();
VCPU_EVENT(vcpu, 6, "entering sie flags %x",
   atomic_read(&vcpu->arch.sie_block->cpuflags));
-   if (sie64a(vcpu->arch.sie_block, vcpu->arch.guest_gprs)) {
-   VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction");
-   kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+   rc = sie64a(vcpu->arch.sie_block, vcpu->arch.guest_gprs);
+   if (rc) {
+   if (kvm_is_ucontrol(vcpu->kvm)) {
+   rc = SIE_INTERCEPT_UCONTROL;
+   } else {
+   VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction");
+   kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+   rc = 0;
+   }
}
VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
   vcpu->arch.sie_block->icptcode);
@@ -528,6 +536,7 @@ static void __vcpu_run(struct kvm_vcpu *
local_irq_enable();
 
memcpy(&vcpu->arch.guest_gprs[14], &vcpu->arch.sie_block->gg14, 16);
+   return rc;
 }
 
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
@@ -548,6 +557,7 @@ rerun_vcpu:
case KVM_EXIT_UNKNOWN:
case KVM_EXIT_INTR:
case KVM_EXIT_S390_RESET:
+   case KVM_EXIT_UCONTROL:
break;
default:
BUG();
@@ -559,7 +569,9 @@ rerun_vcpu:
might_fault();
 
do {
-   __vcpu_run(vcpu);
+   rc = __vcpu_run(vcpu);
+   if (rc)
+   break;
rc = kvm_handle_sie_intercept(vcpu);
} while (!signal_pending(current) && !rc);
 
@@ -571,6 +583,16 @@ rerun_vcpu:
rc = -EINTR;
}
 
+#ifdef CONFIG_KVM_UCONTROL
+   if (rc == SIE_INTERCEPT_UCONTROL) {
+   kvm_run->exit_reason = KVM_EXIT_UCONTROL;
+   kvm_run->s390_ucontrol.trans_exc_code =
+   current->thread.gmap_addr;
+   kvm_run->s390_ucontrol.pgm_code = 0x10;
+   rc = 0;
+   }
+#endif
+
if (rc == -EOPNOTSUPP) {
/* intercept cannot be handled in-kernel, prepare kvm-run */
kvm_run->exit_reason = KVM_EXIT_S390_SIEIC;
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -26,6 +26,7 @@ typedef int (*intercept_handler_t)(struc
 
 /* negativ values are error codes, positive values for internal conditions */
 #define SIE_INTERCEPT_RERUNVCPU(10)
+#define SIE_INTERCEPT_UCONTROL (11)
 int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu);
 
 #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -162,6 +162,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_INTERNAL_ERROR   17
 #define KVM_EXIT_OSI  18
 #define KVM_EXIT_PAPR_HCALL  19
+#define KVM_EXIT_UCONTROL20
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
@@ -249,6 +250,11 @@ struct kvm_run {
 #define KVM_S390_RESET_CPU_INIT  8
 #define 
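
Editorial sketch, not part of the posted patch: on the userspace side, the
exit described above might be turned into a mapping request along these
lines. The struct mirrors the s390_ucontrol member added to kvm_run; the
helper names, and the idea of rounding the reported address down to its 1 MB
segment so it can be fed to KVM_S390_UCAS_MAP, are assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXIT_UCONTROL	20	/* value of KVM_EXIT_UCONTROL in this patch */
#define PGM_CODE_FAULT	0x10	/* value the patch stores in pgm_code */
#define SEGMENT_MASK	(~((1ULL << 20) - 1))

/* Local mirror of the s390_ucontrol member of struct kvm_run. */
struct ucontrol_exit {
	uint64_t trans_exc_code;
	uint32_t pgm_code;
};

/* True if this exit is one that userspace is expected to resolve by mapping. */
static bool is_resolvable_fault(uint32_t exit_reason,
				const struct ucontrol_exit *e)
{
	return exit_reason == EXIT_UCONTROL && e->pgm_code == PGM_CODE_FAULT;
}

/* Start of the 1 MB segment that contains the reported fault address. */
static uint64_t fault_segment(const struct ucontrol_exit *e)
{
	return e->trans_exc_code & SEGMENT_MASK;
}
```

A run loop would call fault_segment() on such an exit, map backing memory at
that vcpu address, and re-enter KVM_RUN.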

[patch 06/12] [PATCH] kvm-s390-ucontrol: disable in-kernel irq stack

2011-12-10 Thread Carsten Otte
This patch disables the in-kernel interrupt stack for KVM virtual
machines that are controlled by userspace. Userspace has to take care
of handling interrupts on its own.

Signed-off-by: Carsten Otte co...@de.ibm.com
---
---
 arch/s390/kvm/kvm-s390.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -511,7 +511,8 @@ static int __vcpu_run(struct kvm_vcpu *v
if (test_thread_flag(TIF_MCCK_PENDING))
s390_handle_mcck();
 
-   kvm_s390_deliver_pending_interrupts(vcpu);
+   if (!kvm_is_ucontrol(vcpu->kvm))
+   kvm_s390_deliver_pending_interrupts(vcpu);
 
vcpu->arch.sie_block->icptcode = 0;
local_irq_disable();



Re: [PATCH v4 12/15] kvm: x86: Add user space part for in-kernel APIC

2011-12-10 Thread Blue Swirl
On Fri, Dec 9, 2011 at 07:52, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2011-12-09 08:45, Jan Kiszka wrote:
 On 2011-12-08 22:16, Blue Swirl wrote:
 On Thu, Dec 8, 2011 at 11:52, Jan Kiszka jan.kis...@siemens.com wrote:
 This introduces the alternative APIC backend which makes use of KVM's
 in-kernel device model. External NMI injection via LINT1 is emulated by
 checking the current state of the in-kernel APIC, only injecting a NMI
 into the VCPU if LINT1 is unmasked and configured to DM_NMI.

 MSI is not yet supported, so we disable this when the in-kernel model is
 in use.

 CC: Lai Jiangshan la...@cn.fujitsu.com
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  Makefile.target   |    2 +-
  hw/kvm/apic.c     |  154 
 +
  hw/pc.c           |   15 --
  kvm.h             |    3 +
  target-i386/kvm.c |    8 +++
  5 files changed, 176 insertions(+), 6 deletions(-)
  create mode 100644 hw/kvm/apic.c

 diff --git a/Makefile.target b/Makefile.target
 index b549988..76de485 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -236,7 +236,7 @@ obj-i386-y += vmport.o
  obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
  obj-i386-y += debugcon.o multiboot.o
  obj-i386-y += pc_piix.o
 -obj-i386-$(CONFIG_KVM) += kvm/clock.o
 +obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
  obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o

  # shared objects
 diff --git a/hw/kvm/apic.c b/hw/kvm/apic.c
 new file mode 100644
 index 000..3924f9e
 --- /dev/null
 +++ b/hw/kvm/apic.c
 @@ -0,0 +1,154 @@
 +/*
 + * KVM in-kernel APIC support
 + *
 + * Copyright (c) 2011 Siemens AG
 + *
 + * Authors:
 + *  Jan Kiszka          jan.kis...@siemens.com
 + *
 + * This work is licensed under the terms of the GNU GPL version 2.
 + * See the COPYING file in the top-level directory.
 + */
 +#include "hw/apic_internal.h"
 +#include "kvm.h"
 +
 +static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
 +                                   int reg_id, uint32_t val)
 +{
 +    *((uint32_t *)(kapic->regs + (reg_id << 4))) = val;
 +}
 +
 +static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
 +                                       int reg_id)
 +{
 +    return *((uint32_t *)(kapic->regs + (reg_id << 4)));
 +}
 +
 +int kvm_put_apic(CPUState *env)
 +{
 +    APICState *s = DO_UPCAST(APICState, busdev.qdev, env->apic_state);

 Please pass APICState instead of CPUState.

 DeviceState, I suppose. Yes, makes more sense, update will follow.

 On second look: no, I'll keep it as is. All kvm_get/put_* helpers have
 this kind of signature, i.e. are working against env.

There's kvm_get_supported_msrs for example.

 kvm_get/put_apic
 just happens to be implemented outside of target-i386/kvm.c. And they
 require both APIC and CPUState anyway, so it makes no difference.

It does, passing CPUState violates layering. Please split the
functions so that the ioctl calls which need CPUState go to kvm.c. For
example, the functions in kvm/apic.c could just perform copying from
kvm_lapic_state fields to APICstate fields and vice versa.

The KVM interface by the way does not look so clever. Why isn't there
just an array of 32 bit fields so the casts can be avoided? Perhaps
APICState should be (later) changed to match KVM version so that the
structure can be passed directly without copying.
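
Editorial sketch, not from the thread: without changing the KVM interface,
the pointer casts in the accessors could be avoided by copying through
memcpy. The struct below is a local stand-in for struct kvm_lapic_state (a
byte array with one register every 16 bytes), and the lapic_* names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define LAPIC_REG_SPACE 0x400	/* size of the register page in the KVM struct */

/* Local stand-in for struct kvm_lapic_state. */
struct lapic_regs {
	char regs[LAPIC_REG_SPACE];
};

/* Store a 32-bit register; registers are spaced 16 bytes apart. */
static void lapic_set_reg(struct lapic_regs *k, int reg_id, uint32_t val)
{
	memcpy(k->regs + (reg_id << 4), &val, sizeof(val));
}

/* Load a 32-bit register without casting the char array to uint32_t *. */
static uint32_t lapic_get_reg(const struct lapic_regs *k, int reg_id)
{
	uint32_t val;

	memcpy(&val, k->regs + (reg_id << 4), sizeof(val));
	return val;
}
```

Compilers typically turn the fixed-size memcpy into a plain load or store,
so this should cost nothing compared to the cast.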


Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 12:28, Jan Kiszka jan.kis...@web.de wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 When the HPET enters legacy mode, the IRQ output of the PIT is
 suppressed and replaced by the HPET timer 0. But the current code to
 emulate this was broken in many ways. It reset the PIT state after
 re-enabling, it worked against a stale static PIT structure, and it did
 not properly save/restore the IRQ output mask in the PIT vmstate.

 This patch solves the PIT IRQ control in a different way. On x86, it
 redirects the PIT IRQ to the HPET, just like the RTC, and it also
 keeps the control line from the HPET to the PIT. This allows disabling
 the PIT QEMU timer when it is not needed. The PIT's view on the control
 line state is now saved in the same format that qemu-kvm is already
 using.

 Note that, in contrast to the suppressed RTC IRQ line, we do not need to
 save/restore the PIT line state in the HPET. As we trigger a PIT IRQ
 update via the control line, the line state is reconstructed on mode
 switch.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/alpha_dp264.c   |    2 +-
  hw/hpet.c          |   38 +---
  hw/hpet_emul.h     |    3 ++
  hw/i8254.c         |   60 +--
  hw/mips_fulong2e.c |    2 +-
  hw/mips_jazz.c     |    2 +-
  hw/mips_malta.c    |    2 +-
  hw/mips_r4k.c      |    2 +-
  hw/pc.c            |   13 --
  hw/pc.h            |   13 +--
  hw/ppc_prep.c      |    2 +-
  11 files changed, 74 insertions(+), 65 deletions(-)

 diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
 index fcc20e9..412ccf0 100644
 --- a/hw/alpha_dp264.c
 +++ b/hw/alpha_dp264.c
 @@ -70,7 +70,7 @@ static void clipper_init(ram_addr_t ram_size,
     pci_bus = typhoon_init(ram_size, rtc_irq, cpus, clipper_pci_map_irq);

     rtc_init(1980, rtc_irq);
 -    pit_init(0x40, 0);
 +    pit_init(0x40, isa_get_irq(0));
     isa_create_simple(i8042);

     /* VGA setup.  Don't bother loading the bios.  */
 diff --git a/hw/hpet.c b/hw/hpet.c
 index 1b64e6a..ace0b1d 100644
 --- a/hw/hpet.c
 +++ b/hw/hpet.c
 @@ -64,6 +64,7 @@ typedef struct HPETState {
     qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
     uint32_t flags;
     uint8_t rtc_irq_level;
 +    qemu_irq pit_enabled;
     uint8_t num_timers;
     HPETTimer timer[HPET_MAX_TIMERS];

 @@ -572,12 +573,15 @@ static void hpet_ram_write(void *opaque, target_phys_addr_t addr,
                     hpet_del_timer(&s->timer[i]);
                 }
             }
 -            /* i8254 and RTC are disabled when HPET is in legacy mode */
 +            /* i8254 and RTC output pins are disabled
 +             * when HPET is in legacy mode */
             if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 -                hpet_pit_disable();
 +                qemu_set_irq(s->pit_enabled, 0);
 +                qemu_irq_lower(s->irqs[0]);
                 qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
             } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 -                hpet_pit_enable();
 +                qemu_irq_lower(s->irqs[0]);
 +                qemu_set_irq(s->pit_enabled, 1);
                 qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
             }
             break;
 @@ -631,7 +635,6 @@ static void hpet_reset(DeviceState *d)
  {
     HPETState *s = FROM_SYSBUS(HPETState, sysbus_from_qdev(d));
     int i;
 -    static int count = 0;

     for (i = 0; i < s->num_timers; i++) {
         HPETTimer *timer = &s->timer[i];
 @@ -648,29 +651,27 @@ static void hpet_reset(DeviceState *d)
         timer->wrap_flag = 0;
     }

 +    qemu_set_irq(s->pit_enabled, 1);
     s->hpet_counter = 0ULL;
     s->hpet_offset = 0ULL;
     s->config = 0ULL;
 -    if (count > 0) {
 -        /* we don't enable pit when hpet_reset is first called (by hpet_init)
 -         * because hpet is taking over for pit here. On subsequent invocations,
 -         * hpet_reset is called due to system reset. At this point control must
 -         * be returned to pit until SW reenables hpet.
 -         */
 -        hpet_pit_enable();
 -    }
     hpet_cfg.hpet[s->hpet_id].event_timer_block_id = (uint32_t)s->capability;
     hpet_cfg.hpet[s->hpet_id].address = sysbus_from_qdev(d)->mmio[0].addr;
 -    count = 1;
  }

 -static void hpet_handle_rtc_irq(void *opaque, int n, int level)
 +static void hpet_handle_legacy_irq(void *opaque, int n, int level)
  {
     HPETState *s = FROM_SYSBUS(HPETState, opaque);

 -    s->rtc_irq_level = level;
 -    if (!hpet_in_legacy_mode(s)) {
 -        qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
 +    if (n == HPET_LEGACY_PIT_INT) {
 +        if (!hpet_in_legacy_mode(s)) {
 +            qemu_set_irq(s->irqs[0], level);
 +        }
 +    } else {
 +        s->rtc_irq_level = level;
 +        if (!hpet_in_legacy_mode(s)) {
 +            qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
 +        }
     }
  }

 @@ -713,7 +714,8 @@ static int hpet_init(SysBusDevice *dev)
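
Editorial sketch, not part of the quoted patch: the routing logic in the
hunks above can be modelled with plain integers instead of qemu_irq lines.
All names here are hypothetical, the numeric values of the two legacy inputs
are assumed, and the PIT re-driving its output via the pit_enabled control
line after legacy mode is left is deliberately not modelled.

```c
#include <stdbool.h>

enum { LEGACY_PIT_INT = 0, LEGACY_RTC_INT = 1 };	/* assumed numbering */

struct hpet_model {
	bool legacy_mode;
	int rtc_irq_level;	/* cached RTC level, kept even while suppressed */
	int pit_out;		/* what the PIT currently drives onto IRQ 0 */
	int rtc_out;		/* what the RTC currently drives onto IRQ 8 */
};

/* Mirrors hpet_handle_legacy_irq(): forward only outside legacy mode,
 * but always remember the RTC level so it can be restored later. */
static void hpet_model_legacy_irq(struct hpet_model *s, int n, int level)
{
	if (n == LEGACY_PIT_INT) {
		if (!s->legacy_mode)
			s->pit_out = level;
	} else {
		s->rtc_irq_level = level;
		if (!s->legacy_mode)
			s->rtc_out = level;
	}
}

/* Mirrors the HPET_CFG_LEGACY handling in hpet_ram_write(). */
static void hpet_model_set_legacy(struct hpet_model *s, bool enable)
{
	if (enable && !s->legacy_mode) {
		s->pit_out = 0;
		s->rtc_out = 0;
	} else if (!enable && s->legacy_mode) {
		s->pit_out = 0;
		s->rtc_out = s->rtc_irq_level;
	}
	s->legacy_mode = enable;
}
```

The asymmetry matches the commit message: the RTC level has to be cached,
while the PIT level is reconstructed by the PIT itself on a mode switch.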
     

Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Jan Kiszka
On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)
 
 Please retain this function in pc.h, or even better, introduce i8254.h.

No concerns about i8254.h, but this function does not qualify for static
inline.

 
 +{
 +ISADevice *dev;
 +
 +dev = isa_create("isa-pit");
 +qdev_prop_set_uint32(&dev->qdev, "iobase", base);
 +qdev_init_nofail(&dev->qdev);
 +qdev_connect_gpio_out(&dev->qdev, 0, irq);
 +
 +return dev;
 +}
 +

Jan





Re: [PATCH 0/2] pit/hpet: Fix legacy mode switching

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 12:28, Jan Kiszka jan.kis...@web.de wrote:
 This is a small preparatory series to allow the introduction of the KVM
 in-kernel PIT. Of course, it is also a fix for the various bugs in the
 related PIT/HPET code. See patches for details.

 Jan Kiszka (2):
  hpet: Save/restore cached RTC IRQ level
   i8254: Rework & fix interaction with HPET in legacy mode

I had one comment to this patch.

Otherwise nice cleanups, I think this logic matches real PIT/HPET
routing better.

  hw/alpha_dp264.c   |    2 +-
  hw/hpet.c          |   64 +--
  hw/hpet_emul.h     |    3 ++
  hw/i8254.c         |   60 +++-
  hw/mips_fulong2e.c |    2 +-
  hw/mips_jazz.c     |    2 +-
  hw/mips_malta.c    |    2 +-
  hw/mips_r4k.c      |    2 +-
  hw/pc.c            |   13 --
  hw/pc.h            |   13 +-
  hw/ppc_prep.c      |    2 +-
  11 files changed, 100 insertions(+), 65 deletions(-)

 --
 1.7.3.4



Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 15:51, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)

 Please retain this function in pc.h, or even better, introduce i8254.h.

 No concerns about i8254.h, but this function does not qualify for static
 inline.

The function is static inline in a header file not for performance
reasons, but to keep the instantiation separate from device internals.


 +{
 +    ISADevice *dev;
 +
 +    dev = isa_create("isa-pit");
 +    qdev_prop_set_uint32(&dev->qdev, "iobase", base);
 +    qdev_init_nofail(&dev->qdev);
 +    qdev_connect_gpio_out(&dev->qdev, 0, irq);
 +
 +    return dev;
 +}
 +

 Jan



Re: [PATCH v4 12/15] kvm: x86: Add user space part for in-kernel APIC

2011-12-10 Thread Jan Kiszka
On 2011-12-10 16:40, Blue Swirl wrote:
 On Fri, Dec 9, 2011 at 07:52, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2011-12-09 08:45, Jan Kiszka wrote:
 On 2011-12-08 22:16, Blue Swirl wrote:
 On Thu, Dec 8, 2011 at 11:52, Jan Kiszka jan.kis...@siemens.com wrote:
 [...]
 +int kvm_put_apic(CPUState *env)
 +{
 +APICState *s = DO_UPCAST(APICState, busdev.qdev, env->apic_state);

 Please pass APICState instead of CPUState.

 DeviceState, I suppose. Yes, makes more sense, update will follow.

 On second look: no, I'll keep it as is. All kvm_get/put_* helpers have
 this kind of signature, i.e. are working against env.
 
 There's kvm_get_supported_msrs for example.
 
 kvm_get/put_apic
 just happens to be implemented outside of target-i386/kvm.c. And they
 require both APIC and CPUState anyway, so it makes no difference.
 
 It does, passing CPUState violates layering. Please split the
 functions so that the ioctl calls which need CPUState go to kvm.c. For
 example, the functions in kvm/apic.c could just perform copying from
 kvm_lapic_state fields to APICstate fields and vice versa.

That's a good idea.

 
 The KVM interface by the way does not look so clever. Why isn't there
 just an array of 32 bit fields so the casts can be avoided? Perhaps
 APICState should be (later) changed to match KVM version so that the
 structure can be passed directly without copying.

Wouldn't that complicate the use in the user space model again? At least
for registers that are used with both backends.

Jan





Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Jan Kiszka
On 2011-12-10 16:54, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 15:51, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)

 Please retain this function in pc.h, or even better, introduce i8254.h.

 No concerns about i8254.h, but this function does not qualify for static
 inline.
 
 The function is static inline in a header file not for performance
 reasons, but to keep the instantiation separate from device internals.

Not performance, footprint and header dependencies. You need to pull in
all the stuff the inline function needs for everyone including the
header that contains this function. That's messy.

Even if the instantiation helper should not poke into the device model
internals (and I don't want this to change either), it belongs to the
module that implements the device. We do the same with other factory
functions.

Jan





Re: [PATCH v4 12/15] kvm: x86: Add user space part for in-kernel APIC

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 15:58, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:40, Blue Swirl wrote:
 On Fri, Dec 9, 2011 at 07:52, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2011-12-09 08:45, Jan Kiszka wrote:
 On 2011-12-08 22:16, Blue Swirl wrote:
 On Thu, Dec 8, 2011 at 11:52, Jan Kiszka jan.kis...@siemens.com wrote:
 [...]
 +int kvm_put_apic(CPUState *env)
 +{
 +    APICState *s = DO_UPCAST(APICState, busdev.qdev, env->apic_state);

 Please pass APICState instead of CPUState.

 DeviceState, I suppose. Yes, makes more sense, update will follow.

 On second look: no, I'll keep it as is. All kvm_get/put_* helpers have
 this kind of signature, i.e. are working against env.

 There's kvm_get_supported_msrs for example.

 kvm_get/put_apic
 just happens to be implemented outside of target-i386/kvm.c. And they
 require both APIC and CPUState anyway, so it makes no difference.

 It does, passing CPUState violates layering. Please split the
 functions so that the ioctl calls which need CPUState go to kvm.c. For
 example, the functions in kvm/apic.c could just perform copying from
 kvm_lapic_state fields to APICstate fields and vice versa.

 That's a good idea.


 The KVM interface by the way does not look so clever. Why isn't there
 just an array of 32 bit fields so the casts can be avoided? Perhaps
 APICState should be (later) changed to match KVM version so that the
 structure can be passed directly without copying.

 Wouldn't that complicate the use in the user space model again? At least
 for registers that are used with both backends.

Well, we have (at least) two styles how to model devices.

In the first one, the device state structure contains an array of
registers, so the functions which use them may need for example to
perform some bit field extraction to get what they need.

In the model used by APIC and others, the structure contains cooked
values, for example divide_count and count_shift in APICState. This
means that the CPU accesses get slightly slower since the fields need
to be packed and unpacked but the other functions may be faster.

Which one is better depends on frequency and importance of register
accesses by CPU vs. other accesses. But it shouldn't complicate that
much either way. Actually design choices like this may have been taken
without too much consideration.

Alternatively, KVM interface could be changed to take QEMU structure
directly, but I don't suppose that would be a good idea. It would be
easier for everyone if QEMU changed instead.
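
Editorial sketch, not from the thread: the two modelling styles described
above, applied to the APIC timer divide configuration. All struct and
function names are hypothetical. The decode follows the divide configuration
register encoding (bits 0, 1 and 3 select the divisor), which is what the
cooked count_shift value is derived from.

```c
#include <stdint.h>

/* Decode the divide configuration bits into a shift count:
 * 0 means divide by 1, 1 means divide by 2, ... 7 means divide by 128. */
static int decode_count_shift(uint32_t divide_conf)
{
	int v = (divide_conf & 3) | ((divide_conf >> 1) & 4);

	return (v + 1) & 7;
}

/* Style 1: keep the raw register and decode it every time it is used. */
struct timer_raw {
	uint32_t divide_conf;
};

static int raw_count_shift(const struct timer_raw *t)
{
	return decode_count_shift(t->divide_conf);
}

/* Style 2: decode once when the guest writes the register and store the
 * cooked value next to it. Register writes get slower, timer code faster. */
struct timer_cooked {
	uint32_t divide_conf;
	int count_shift;
};

static void cooked_write_divide(struct timer_cooked *t, uint32_t val)
{
	t->divide_conf = val & 0xb;	/* only bits 0, 1 and 3 are defined */
	t->count_shift = decode_count_shift(t->divide_conf);
}
```

With style 1 the structure could be handed to an interface that expects raw
registers without a copy step. With style 2 a pack/unpack step is needed at
that boundary.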


Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 16:03, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:54, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 15:51, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)

 Please retain this function in pc.h, or even better, introduce i8254.h.

 No concerns about i8254.h, but this function does not qualify for static
 inline.

 The function is static inline in a header file not for performance
 reasons, but to keep the instantiation separate from device internals.

 Not performance, footprint and header dependencies. You need to pull in
 all the stuff the inline function needs for everyone including the
 header that contains this function. That's messy.

There's only ISA and qdev stuff, that's not messy since both are
needed in any case.

 Even if the instantiation helper should not poke into the device model
 internals (and I don't want this to change as well), it belongs to the
 module that implements the device. We do the same with other fabric
 functions.

In this case, the callers have the same needs and there are several of
them. In general this need not be true at all, if for example some
part of instantiation would have to be skipped, the functions may need
to be manually inlined to the board level anyway. The instantiation
definitely does not belong to the implementer but to the creator.
Ideally the file implementing the device contains only static functions
and instantiation is either in a header file or at the board. This is
true for example for several Sparc32 devices.


Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Jan Kiszka
On 2011-12-10 17:26, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 16:03, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:54, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 15:51, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)

 Please retain this function in pc.h, or even better, introduce i8254.h.

 No concerns about i8254.h, but this function does not qualify for static
 inline.

 The function is static inline in a header file not for performance
 reasons, but to keep the instantiation separate from device internals.

 Not performance, footprint and header dependencies. You need to pull in
 all the stuff the inline function needs for everyone including the
 header that contains this function. That's messy.
 
 There's only ISA and qdev stuff, that's not messy since both are
 needed in any case.
 
 Even if the instantiation helper should not poke into the device model
 internals (and I don't want this to change either), it belongs to the
 module that implements the device. We do the same with other factory
 functions.
 
 In this case, the callers have the same needs and there are several of
 them. In general this need not be true at all. If, for example, some
 part of the instantiation had to be skipped, the function might need
 to be manually inlined at the board level anyway. The instantiation
 definitely does not belong to the implementer but to the creator.
 Ideally, the file implementing the device contains only static functions
 and the instantiation is either in a header file or at the board. This is
 true, for example, of several Sparc32 devices.

The helper is wrapping the property-based API into a proper function call
- nothing that is board-specific.

Jan





Re: [PATCH 2/2] i8254: Rework & fix interaction with HPET in legacy mode

2011-12-10 Thread Blue Swirl
On Sat, Dec 10, 2011 at 16:29, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 17:26, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 16:03, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:54, Blue Swirl wrote:
 On Sat, Dec 10, 2011 at 15:51, Jan Kiszka jan.kis...@web.de wrote:
 On 2011-12-10 16:49, Blue Swirl wrote:

 +ISADevice *pit_init(int base, qemu_irq irq)

 Please retain this function in pc.h, or even better, introduce i8254.h.

 No concerns about i8254.h, but this function does not qualify for static
 inline.

 The function is static inline in a header file not for performance
 reasons, but to keep the instantiation separate from device internals.

 Not performance, footprint and header dependencies. You need to pull in
 all the stuff the inline function needs for everyone including the
 header that contains this function. That's messy.

 There's only ISA and qdev stuff, that's not messy since both are
 needed in any case.

 Even if the instantiation helper should not poke into the device model
 internals (and I don't want this to change either), it belongs to the
 module that implements the device. We do the same with other factory
 functions.

 In this case, the callers have the same needs and there are several of
 them. In general this need not be true at all. If, for example, some
 part of the instantiation had to be skipped, the function might need
 to be manually inlined at the board level anyway. The instantiation
 definitely does not belong to the implementer but to the creator.
 Ideally, the file implementing the device contains only static functions
 and the instantiation is either in a header file or at the board. This is
 true, for example, of several Sparc32 devices.

 The helper is wrapping the property-based API into a proper function call
 - nothing that is board-specific.

Not in this case, but in general boards could need to pass different
sets of properties or avoid passing something at all.
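
To make the two positions concrete, here is a minimal sketch of a shared
instantiation helper next to a board that sets the properties itself.
Everything in it (struct toy_dev, toy_prop_set_uint32(), toy_dev_init(),
toy_pit_init(), toy_board_custom_pit()) is a hypothetical stand-in made up
for illustration; it is not the qdev API and not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a device with two properties; NOT a QEMU type. */
struct toy_dev {
    const char *name;
    uint32_t    iobase;
    uint32_t    irq;
    int         realized;
};

/* Toy property setter, loosely modelled on a "set property by name" API. */
static void toy_prop_set_uint32(struct toy_dev *dev, const char *prop,
                                uint32_t val)
{
    if (strcmp(prop, "iobase") == 0)
        dev->iobase = val;
    else if (strcmp(prop, "irq") == 0)
        dev->irq = val;
    else
        assert(!"unknown property");
}

/* Toy "finish construction" step. */
static void toy_dev_init(struct toy_dev *dev)
{
    dev->realized = 1;
}

/*
 * Shared helper: wraps the property calls into one function call.
 * It touches no device internals, only the public property interface.
 */
static inline void toy_pit_init(struct toy_dev *dev, uint32_t base,
                                uint32_t irq)
{
    dev->name = "toy-pit";
    toy_prop_set_uint32(dev, "iobase", base);
    toy_prop_set_uint32(dev, "irq", irq);
    toy_dev_init(dev);
}

/*
 * A board with different needs: it skips the helper and passes only
 * the properties it wants (here it leaves "irq" at its default).
 */
static inline void toy_board_custom_pit(struct toy_dev *dev, uint32_t base)
{
    dev->name = "toy-pit";
    toy_prop_set_uint32(dev, "iobase", base);
    toy_dev_init(dev);
}
```

Several boards with identical needs would call toy_pit_init(); a board
that must skip a property ends up open-coding the sequence, which is the
case the helper cannot cover.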


Re: [libvirt] (no subject)

2011-12-10 Thread Pekka Enberg
On Fri, 2011-12-09 at 20:30 +0800, Osier Yang wrote:
 By the way, is nobody interested in kvmtool providing a way for
 external apps to get the capabilities? Do we still want to suffer
 from parsing the capabilities ourselves in the future, just like what
 we do for qemu? :-)

I think the feature makes sense especially if it simplifies libvirt.

Pekka



[PATCH 2/2] kvm tools: Free up the MSI-X PBA BAR

2011-12-10 Thread Sasha Levin
Free up the BAR to make space for the new virtio BARs. It isn't required
to have the PBA and the table in separate BARs, and uniting them will
just give us extra BARs to play with.

Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 tools/kvm/include/kvm/virtio-pci.h |1 -
 tools/kvm/virtio/pci.c |   35 +++
 2 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio-pci.h b/tools/kvm/include/kvm/virtio-pci.h
index 2bbb271..73f7486 100644
--- a/tools/kvm/include/kvm/virtio-pci.h
+++ b/tools/kvm/include/kvm/virtio-pci.h
@@ -30,7 +30,6 @@ struct virtio_pci {
 	u32	vq_vector[VIRTIO_PCI_MAX_VQ];
 	u32	gsis[VIRTIO_PCI_MAX_VQ];
 	u32	msix_io_block;
-	u32	msix_pba_block;
 	u64	msix_pba;
 	struct msix_table	msix_table[VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG];
 
diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index e2159d9..a7da8e8 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -220,23 +220,21 @@ static struct ioport_operations virtio_pci__io_ops = {
 static void callback_mmio_table(u64 addr, u8 *data, u32 len, u8 is_write, void *ptr)
 {
 	struct virtio_pci *vpci = ptr;
-	void *table = &vpci->msix_table;
+	void *table;
+	u32 offset;
 
-	if (is_write)
-		memcpy(table + addr - vpci->msix_io_block, data, len);
-	else
-		memcpy(data, table + addr - vpci->msix_io_block, len);
-}
-
-static void callback_mmio_pba(u64 addr, u8 *data, u32 len, u8 is_write, void *ptr)
-{
-	struct virtio_pci *vpci = ptr;
-	void *pba = &vpci->msix_pba;
+	if (addr > vpci->msix_io_block + PCI_IO_SIZE) {
+		table	= &vpci->msix_pba;
+		offset	= vpci->msix_io_block + PCI_IO_SIZE;
+	} else {
+		table	= &vpci->msix_table;
+		offset	= vpci->msix_io_block;
+	}
 
 	if (is_write)
-		memcpy(pba + addr - vpci->msix_pba_block, data, len);
+		memcpy(table + addr - offset, data, len);
 	else
-		memcpy(data, pba + addr - vpci->msix_pba_block, len);
+		memcpy(data, table + addr - offset, len);
 }
 
 int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_trans *vtrans, u32 vq)
@@ -289,12 +287,10 @@ int virtio_pci__init(struct kvm *kvm, struct virtio_trans *vtrans, void *dev,
 	u8 pin, line, ndev;
 
 	vpci->dev = dev;
-	vpci->msix_io_block = pci_get_io_space_block(PCI_IO_SIZE);
-	vpci->msix_pba_block = pci_get_io_space_block(PCI_IO_SIZE);
+	vpci->msix_io_block = pci_get_io_space_block(PCI_IO_SIZE * 2);
 
 	vpci->base_addr = ioport__register(IOPORT_EMPTY, &virtio_pci__io_ops, IOPORT_SIZE, vtrans);
 	kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE, callback_mmio_table, vpci);
-	kvm__register_mmio(kvm, vpci->msix_pba_block, PCI_IO_SIZE, callback_mmio_pba, vpci);
 
 	vpci->pci_hdr = (struct pci_device_header) {
 		.vendor_id	= cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
@@ -306,11 +302,10 @@ int virtio_pci__init(struct kvm *kvm, struct virtio_trans *vtrans, void *dev,
 		.class[2]	= (class >> 16) & 0xff,
 		.subsys_vendor_id	= cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),
 		.subsys_id	= cpu_to_le16(subsys_id),
-		.bar[0]	= cpu_to_le32(vpci->base_addr | PCI_BASE_ADDRESS_SPACE_IO),
+		.bar[0]	= cpu_to_le32(vpci->base_addr
+					| PCI_BASE_ADDRESS_SPACE_IO),
 		.bar[1]	= cpu_to_le32(vpci->msix_io_block
 					| PCI_BASE_ADDRESS_SPACE_MEMORY),
-		.bar[3]	= cpu_to_le32(vpci->msix_pba_block
-					| PCI_BASE_ADDRESS_SPACE_MEMORY),
 		.status	= cpu_to_le16(PCI_STATUS_CAP_LIST),
 		.capabilities	= (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr,
 		.bar_size[0]	= IOPORT_SIZE,
@@ -338,7 +333,7 @@ int virtio_pci__init(struct kvm *kvm, struct virtio_trans *vtrans, void *dev,
 	 * we're not in short of BARs
 	 */
 	vpci->pci_hdr.msix.table_offset = cpu_to_le32(1); /* Use BAR 1 */
-	vpci->pci_hdr.msix.pba_offset = cpu_to_le32(3); /* Use BAR 3 */
+	vpci->pci_hdr.msix.pba_offset = cpu_to_le32(1 | PCI_IO_SIZE); /* Use BAR 3 */
 	vpci->config_vector = 0;
 
 	if (irq__register_device(subsys_id, ndev, pin, line) < 0)
-- 
1.7.8
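
For readers following along, here is a minimal sketch of the layout this
patch moves to: the MSI-X table and the PBA share one BAR, with the PBA
starting one block after the table. It shows how a BAR index and an offset
are packed into a single MSI-X offset register (BAR index in the low three
bits, 8-byte-aligned offset in the rest) and how an access into the shared
region can be routed. TOY_IO_SIZE, msix_pack() and toy_decode() are
hypothetical names, and the value of TOY_IO_SIZE is an assumption. The
sketch uses >= at the boundary, which is my choice for the illustration
and not necessarily what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IO_SIZE   0x100u  /* stands in for PCI_IO_SIZE; value assumed */
#define MSIX_BIR_MASK 0x7u    /* low 3 bits of an offset register: BAR index */

/*
 * Pack a BAR index and an 8-byte-aligned offset into the format used by
 * the MSI-X Table Offset/BIR and PBA Offset/BIR registers.
 */
static uint32_t msix_pack(uint32_t bir, uint32_t offset)
{
    assert(bir <= 5 && (offset & MSIX_BIR_MASK) == 0);
    return offset | bir;
}

enum toy_region { TOY_TABLE, TOY_PBA };

/*
 * Route an MMIO address inside the shared BAR: the first TOY_IO_SIZE
 * bytes belong to the table, the second TOY_IO_SIZE bytes to the PBA.
 * *off receives the byte offset into the selected structure.
 */
static enum toy_region toy_decode(uint64_t base, uint64_t addr, uint32_t *off)
{
    assert(addr >= base && addr < base + 2 * TOY_IO_SIZE);

    if (addr >= base + TOY_IO_SIZE) {
        *off = (uint32_t)(addr - base - TOY_IO_SIZE);
        return TOY_PBA;
    }

    *off = (uint32_t)(addr - base);
    return TOY_TABLE;
}
```

With these definitions, msix_pack(1, 0) describes a table at the start of
BAR 1 and msix_pack(1, TOY_IO_SIZE) a PBA one block further into the same
BAR, which mirrors the table_offset and pba_offset values in the hunk above.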


[PATCH 1/2] kvm tools: Don't use 64bit BARs

2011-12-10 Thread Sasha Levin
We don't really support that, so no point in using 64bit BARs.

Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 tools/kvm/virtio/pci.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index 0b44a19..e2159d9 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -307,10 +307,10 @@ int virtio_pci__init(struct kvm *kvm, struct virtio_trans *vtrans, void *dev,
 		.subsys_vendor_id	= cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),
 		.subsys_id	= cpu_to_le16(subsys_id),
 		.bar[0]	= cpu_to_le32(vpci->base_addr | PCI_BASE_ADDRESS_SPACE_IO),
-		.bar[1]	= cpu_to_le32(vpci->msix_io_block | PCI_BASE_ADDRESS_SPACE_MEMORY
-					| PCI_BASE_ADDRESS_MEM_TYPE_64),
-		.bar[3]	= cpu_to_le32(vpci->msix_pba_block | PCI_BASE_ADDRESS_SPACE_MEMORY
-					| PCI_BASE_ADDRESS_MEM_TYPE_64),
+		.bar[1]	= cpu_to_le32(vpci->msix_io_block
+					| PCI_BASE_ADDRESS_SPACE_MEMORY),
+		.bar[3]	= cpu_to_le32(vpci->msix_pba_block
+					| PCI_BASE_ADDRESS_SPACE_MEMORY),
 		.status	= cpu_to_le16(PCI_STATUS_CAP_LIST),
 		.capabilities	= (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr,
 		.bar_size[0]	= IOPORT_SIZE,
-- 
1.7.8
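
As background for the flag being dropped, here is a minimal sketch of how
the low bits of a BAR value encode its type. A memory BAR marked 64-bit
also uses the following BAR slot for the upper 32 address bits, so it
occupies two of the six slots. The BAR_* constants mirror the values of
the corresponding PCI_BASE_ADDRESS_* definitions in the kernel's
pci_regs.h; bar_make_mem(), bar_is_mem64() and bar_slots() are
hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the PCI_BASE_ADDRESS_* constants in pci_regs.h. */
#define BAR_SPACE_IO      0x01u  /* bit 0 set: I/O port BAR */
#define BAR_SPACE_MEMORY  0x00u  /* bit 0 clear: memory BAR */
#define BAR_MEM_TYPE_MASK 0x06u  /* bits 2:1: memory BAR type */
#define BAR_MEM_TYPE_64   0x04u  /* 64-bit memory BAR */

/* Build a memory BAR value from a 16-byte-aligned 32-bit address. */
static uint32_t bar_make_mem(uint32_t addr, int is_64bit)
{
    /* The low four bits of a memory BAR hold flags, not address bits. */
    assert((addr & 0xfu) == 0);
    return addr | BAR_SPACE_MEMORY | (is_64bit ? BAR_MEM_TYPE_64 : 0u);
}

/* Return 1 if the BAR value describes a 64-bit memory BAR. */
static int bar_is_mem64(uint32_t bar)
{
    if (bar & BAR_SPACE_IO)
        return 0;
    return (bar & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/* Number of BAR slots the value occupies: two for 64-bit, else one. */
static int bar_slots(uint32_t bar)
{
    return bar_is_mem64(bar) ? 2 : 1;
}
```

Under this encoding, two 64-bit memory BARs starting at slots 1 and 3
cover slots 1 to 4, while their 32-bit equivalents need one slot each.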



Re: [PATCH 0/11] RFC: PCI using capabilitities

2011-12-10 Thread Sasha Levin
On Fri, 2011-12-09 at 16:47 +1030, Rusty Russell wrote:
 On Thu, 08 Dec 2011 17:37:37 +0200, Sasha Levin levinsasha...@gmail.com 
 wrote:
  Which leads me to the question: Are MMIO vs MMIO reads/writes not
  ordered?
 
 That seems really odd, especially being repeatable.

Happens every single time. Can't be a coincidence.

I even went into paranoia mode and made sure that both IO requests come
from the same vcpu.

Another weird thing I've noticed is that mb() doesn't fix it, while if I
replace the mb() with a printk() it works well.

 BTW, that's an address, not a pfn now.

Fixed :)

-- 

Sasha.
