Re: [PATCH v7 00/18] uq/master: Introduce basic irqchip support

2012-01-18 Thread Marcelo Tosatti
On Mon, Jan 16, 2012 at 04:55:34PM +0100, Jan Kiszka wrote:
> Changes in v7:
> - introduce {apic,pic,ioapic}_qdev_register and use
>   {APIC,PIC,IOAPIC}CommonInfo to move more code into the common modules
> - clean up forgotten fragments of backend/frontend approach
> - rephrased potentially misleading title of last patch ;)
>
> CC: Lai Jiangshan la...@cn.fujitsu.com
>
> Jan Kiszka (18):
>   msi: Generalize msix_supported to msi_supported
>   kvm: Move kvmclock into hw/kvm folder
>   apic: Stop timer on reset
>   apic: Inject external NMI events via LINT1
>   apic: Introduce apic_report_irq_delivered
>   apic: Factor out base class for KVM reuse
>   apic: Open-code timer save/restore
>   i8259: Completely privatize PicState
>   i8259: Factor out base class for KVM reuse
>   ioapic: Drop post-load irr initialization
>   ioapic: Factor out base class for KVM reuse
>   memory: Introduce memory_region_init_reservation
>   kvm: Introduce core services for in-kernel irqchip support
>   kvm: x86: Establish IRQ0 override control
>   kvm: x86: Add user space part for in-kernel APIC
>   kvm: x86: Add user space part for in-kernel i8259
>   kvm: x86: Add user space part for in-kernel IOAPIC
>   kvm: Activate in-kernel irqchip support

Patchset does not apply, please regenerate (patch 2 is missing actual
file move), thanks.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3 kvm-unit-tests] Dirty logging performance test

2012-01-18 Thread Marcelo Tosatti
On Sun, Jan 15, 2012 at 12:41:31PM +0900, Takuya Yoshikawa wrote:
> My 32-bit host running on an Intel Core i3 box said:
>
>   $ ./api/dirty-log-perf
>   dirty-log-perf: 262144 slot pages / 262144 mem pages
>   rip 804a74a
>   rip 804a74a
>   get dirty log:   51571 ns for      1 dirty pages
>   rip 804a74a
>   get dirty log:   81190 ns for      2 dirty pages
>   rip 804a74a
>   get dirty log:   66606 ns for      4 dirty pages
>   rip 804a74a
>   get dirty log:   60408 ns for      8 dirty pages
>   rip 804a74a
>   get dirty log:   46711 ns for     16 dirty pages
>   rip 804a74a
>   get dirty log:   83563 ns for     32 dirty pages
>   rip 804a74a
>   get dirty log:   74367 ns for     64 dirty pages
>   rip 804a74a
>   get dirty log:   87240 ns for    128 dirty pages
>   rip 804a74a
>   get dirty log:  140161 ns for    256 dirty pages
>   rip 804a74a
>   get dirty log:  191288 ns for    512 dirty pages
>   rip 804a74a
>   get dirty log:  981045 ns for   1024 dirty pages
>   rip 804a74a
>   get dirty log: 1000755 ns for   2048 dirty pages
>   rip 804a74a
>   get dirty log: 1122837 ns for   4096 dirty pages
>   rip 804a74a
>   get dirty log: 1362598 ns for   8192 dirty pages
>   rip 804a74a
>   get dirty log: 1202789 ns for  16384 dirty pages
>   rip 804a74a
>   get dirty log: 1598484 ns for  32768 dirty pages
>   rip 804a74a
>   get dirty log: 2456946 ns for  65536 dirty pages
>   rip 804a74a
>   get dirty log: 3366358 ns for 131072 dirty pages
>   rip 804a74a
>   get dirty log: 5634134 ns for 262144 dirty pages
>
> Takuya Yoshikawa (3):
>   Add dirty logging performance test
>   dirty-log-perf: Split guest memory into two slots
>   dirty-log-perf: Take slot size from command line
>
>  api/dirty-log-perf.cc |  144 +
>  config-x86-common.mak |    5 ++-
>  2 files changed, 148 insertions(+), 1 deletions(-)
>  create mode 100644 api/dirty-log-perf.cc

Applied, thanks.
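
As a rough reading of the figures quoted above, the average per-page cost of fetching the dirty log can be worked out from three of the reported samples. This is only a back-of-the-envelope sketch: the helper name `ns_per_page` is made up for illustration, and the numbers are copied from the quoted test output.

```python
# Selected samples from the quoted dirty-log-perf output:
# number of dirty pages -> total time of one "get dirty log" call in ns.
SAMPLES = {
    1: 51571,
    1024: 981045,
    262144: 5634134,
}


def ns_per_page(total_ns, dirty_pages):
    """Average cost per dirty page for one call (illustrative helper)."""
    return total_ns / dirty_pages


for pages, total_ns in sorted(SAMPLES.items()):
    print("%7d pages: %10.2f ns/page" % (pages, ns_per_page(total_ns, pages)))
```

With a single dirty page the fixed overhead of the call dominates; with all 262144 pages dirty the average falls to roughly 21.5 ns per page.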



Re: [PATCH] pci-assign: Fix multifunction support

2012-01-18 Thread Marcelo Tosatti
On Mon, Jan 16, 2012 at 10:11:51AM -0700, Alex Williamson wrote:
> The core PCI code sets the multifunction bit in the header before
> calling the device initfn.  For device assignment, we're blasting
> that value with the actual hardware value, so nobody sees the
> additional functions if the device isn't physically multifunction.
> Switch the HEADER_TYPE to a fully emulated field (all read-only
> anyway) and add setting and clearing of the multifunction bit to
> match the qemu directive.
>
> Signed-off-by: Alex Williamson alex.william...@redhat.com

Applied, thanks.



[Bug 42600] New: Live migration of very large vm gets stuck

2012-01-18 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=42600

   Summary: Live migration of very large vm gets stuck
   Product: Virtualization
   Version: unspecified
Kernel Version: 2.6.32-131.6.1.el6.x86_64
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: high
  Priority: P1
 Component: kvm
AssignedTo: virtualization_...@kernel-bugs.osdl.org
ReportedBy: florian.rust...@unitedexperts.de
Regression: No


We have several high-load Tomcat servers virtualized on Scientific Linux via
KVM.

Each VM is configured with between 10 and 30 GB of RAM and between 8 and 12
cores.

What I can see so far is that after roughly 20% of the migration, progress
gets slower and slower until the migration finally runs into a timeout.
Sometimes the source VM is also left broken: it is stuck and needs to be
rebooted :(

Both hypervisor servers are connected via a 1 GB interface, so bandwidth should
be fine.
The hypervisors are Intel modular server blades with 2x6 cores and 96 GB of
RAM, connected to shared storage.

In theory I would assume two possible explanations:
1. Dirty RAM is changing too frequently, and the migration transfers pages
more slowly than the guest dirties them.
2. Too many CPUs, and therefore too many running threads, break down migration
performance.

If one of these explanations is true, this is not a bug, but a problem ;)
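
The first explanation can be illustrated with a toy model of iterative pre-copy: each round resends the memory dirtied while the previous round was in flight. This is a deliberately simplified sketch, not how QEMU computes anything. The function name, the 50 MB stop threshold, the 110 MB/s link throughput and both dirty rates are assumptions picked only to show the effect.

```python
def precopy_rounds(ram_mb, bandwidth_mb_s, dirty_mb_s,
                   stop_threshold_mb=50, max_rounds=30):
    """Return how many pre-copy rounds are needed until the remaining dirty
    memory drops to stop_threshold_mb, or None if that never happens within
    max_rounds (toy model with assumed, constant rates)."""
    remaining = float(ram_mb)
    for round_no in range(1, max_rounds + 1):
        # Time needed to send everything that is currently dirty.
        transfer_time = remaining / bandwidth_mb_s
        # Memory the guest dirties while that transfer is running.
        remaining = min(float(ram_mb), dirty_mb_s * transfer_time)
        if remaining <= stop_threshold_mb:
            return round_no
    return None


# 30 GB guest, assumed ~110 MB/s of usable link throughput.
print(precopy_rounds(30 * 1024, 110, 20))   # guest dirties 20 MB/s
print(precopy_rounds(30 * 1024, 110, 120))  # guest dirties faster than the link
```

In this model the remaining data shrinks by the factor dirty rate / bandwidth each round, so it only converges while the guest dirties memory more slowly than the link can carry it.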

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[PATCH] kvm: fix error handling for out of range irq

2012-01-18 Thread Michael S. Tsirkin
find_index_from_host_irq returns 0 on error
but callers assume < 0 on error. This should
not matter much: an out of range irq should never happen since
irq handler was registered with this irq #,
and even if it does we get a spurious msix irq in guest
and typically nothing terrible happens.

Still, better to make it consistent.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 virt/kvm/assigned-dev.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 73bb001..0cbd8a1 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -49,10 +49,8 @@ static int find_index_from_host_irq(struct kvm_assigned_dev_kernel
 			index = i;
 			break;
 		}
-	if (index < 0) {
+	if (index < 0)
 		printk(KERN_WARNING "Fail to find correlated MSI-X entry!\n");
-		return 0;
-	}
 
 	return index;
 }
-- 
1.7.8.2.325.g247f9
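
The return-value mismatch this patch removes can be modelled in a few lines outside the kernel. The following is only an illustrative Python sketch, not kernel code: `find_index` and `find_index_old` are hypothetical names, and the vector list is made-up data.

```python
def find_index(vectors, irq):
    """Post-patch convention: return the position of irq, or -1 on a miss."""
    for i, vector in enumerate(vectors):
        if vector == irq:
            return i
    return -1


def find_index_old(vectors, irq):
    """Pre-patch convention: a miss is reported as 0, which is also a valid index."""
    index = find_index(vectors, irq)
    return 0 if index < 0 else index


HOST_VECTORS = [40, 41, 42]  # made-up host MSI-X vectors

# A caller that checks "index >= 0" cannot tell these two cases apart
# with the old convention: both yield 0.
print(find_index_old(HOST_VECTORS, 40), find_index_old(HOST_VECTORS, 99))
# With the new convention the miss is visible as a negative value.
print(find_index(HOST_VECTORS, 40), find_index(HOST_VECTORS, 99))
```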


[PATCH RFC] kvm: deliver msix interrupts from irq handler

2012-01-18 Thread Michael S. Tsirkin
We can deliver certain interrupts, notably MSI-X,
from atomic context.  Add a new API, kvm_set_irq_inatomic,
that does exactly that, and use it to implement
an irq handler for MSI.

This reduces the pressure on the scheduler in the case
where host and guest irqs share a host cpu.

Signed-off-by: Michael S. Tsirkin m...@redhat.com

Untested.
Note: this is on top of my host irq patch.
Probably needs to be rebased to be independent
and split up to new API + usage.
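
The control flow proposed here (try delivery in the hard irq handler, fall back to the irq thread only when that would have to sleep) can be sketched as a small Python model. This is an illustration under stated assumptions, not the kernel implementation: `set_irq_inatomic`, `hard_irq_handler` and the string route types are hypothetical stand-ins for kvm_set_irq_inatomic, the new handlers and the routing table lookup.

```python
import errno

IRQ_HANDLED = "IRQ_HANDLED"
IRQ_WAKE_THREAD = "IRQ_WAKE_THREAD"


def set_irq_inatomic(route_type):
    """Hypothetical stand-in for the atomic delivery attempt.

    Only MSI routes are delivered without sleeping; any other route asks
    the caller to retry from process context; a missing route is invalid.
    """
    if route_type is None:
        return -errno.EINVAL
    if route_type == "msi":
        return 0
    return -errno.EWOULDBLOCK


def hard_irq_handler(route_type):
    """Model of the hard irq handler: wake the thread only on -EWOULDBLOCK."""
    ret = set_irq_inatomic(route_type)
    return IRQ_WAKE_THREAD if ret == -errno.EWOULDBLOCK else IRQ_HANDLED


for route in ("msi", "irqchip", None):
    print(route, "->", hard_irq_handler(route))
```

As in the patch, every result other than -EWOULDBLOCK, including other errors, ends in IRQ_HANDLED, so the threaded handler only runs when atomic delivery was not possible.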

---
 include/linux/kvm_host.h |2 +
 virt/kvm/assigned-dev.c  |   31 +-
 virt/kvm/irq_comm.c  |   52 ++
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f0361bc..e2b89ea 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -548,6 +548,8 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
 #endif
 int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
 		int host_irq);
+int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level,
+			 int host_irq);
 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm *kvm,
 		int irq_source_id, int level, int host_irq);
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index cc4bb7a..73bb001 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -57,6 +57,14 @@ static int find_index_from_host_irq(struct kvm_assigned_dev_kernel
 	return index;
 }
 
+static irqreturn_t kvm_assigned_dev_msi(int irq, void *dev_id)
+{
+	int ret = kvm_set_irq_inatomic(assigned_dev->kvm,
+				       assigned_dev->irq_source_id,
+				       assigned_dev->guest_irq, 1, irq);
+	return unlikely(ret == -EWOULDBLOCK) ? IRQ_WAKE_THREAD : IRQ_HANDLED;
+}
+
 static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
 {
 	struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
@@ -75,6 +83,23 @@ static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
 }
 
 #ifdef __KVM_HAVE_MSIX
+static irqreturn_t kvm_assigned_dev_msix(int irq, void *dev_id)
+{
+	struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
+	int index = find_index_from_host_irq(assigned_dev, irq);
+	u32 vector;
+	int ret = 0;
+
+	if (index >= 0) {
+		vector = assigned_dev->guest_msix_entries[index].vector;
+		ret = kvm_set_irq_inatomic(assigned_dev->kvm,
+					   assigned_dev->irq_source_id,
+					   vector, 1, irq);
+	}
+
+	return unlikely(ret == -EWOULDBLOCK) ? IRQ_WAKE_THREAD : IRQ_HANDLED;
+}
+
 static irqreturn_t kvm_assigned_dev_thread_msix(int irq, void *dev_id)
 {
 	struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
@@ -266,7 +291,8 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
 	}
 
 	dev->host_irq = dev->dev->irq;
-	if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
+	if (request_threaded_irq(dev->host_irq, kvm_assigned_dev_msi,
+				 kvm_assigned_dev_thread,
 				 0, dev->irq_name, dev)) {
 		pci_disable_msi(dev->dev);
 		return -EIO;
@@ -293,7 +319,8 @@ static int assigned_device_enable_host_msix(struct kvm *kvm,
 
 	for (i = 0; i < dev->entries_nr; i++) {
 		r = request_threaded_irq(dev->host_msix_entries[i].vector,
-					 NULL, kvm_assigned_dev_thread_msix,
+					 kvm_assigned_dev_msix,
+					 kvm_assigned_dev_thread_msix,
 					 0, dev->irq_name, dev);
 		if (r)
 			goto err;
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index ba892df..68cd127 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -201,6 +201,58 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
 	return ret;
 }
 
+static inline struct kvm_kernel_irq_routing_entry *
+kvm_get_entry(struct kvm *kvm, struct kvm_irq_routing_table *irq_rq, u32 irq)
+{
+	struct kvm_kernel_irq_routing_entry *e;
+	if (likely(irq < irq_rt->nr_rt_entries))
+		hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
+			if (e->type == KVM_IRQ_ROUTING_MSI)
+				return e;
+			else
+				return ERR_PTR(-EWOULDBLOCK);
+	return ERR_PTR(-EINVAL);
+}
+
+/*
+ * Deliver an IRQ in an atomic context if we can, or return a failure,
+ * user can retry in a process context.
+ * Return value:
+ *  -EWOULDBLOCK   Can't deliver in atomic context
+ *   0

Re: [PATCH RFC v3 1/2] hyper-v: introduce Hyper-V support infrastructure.

2012-01-18 Thread Jan Kiszka
On 2011-12-18 21:48, Vadim Rozenfeld wrote:
 ---
  Makefile.target  |2 +
  target-i386/cpuid.c  |   14 ++
  target-i386/hyperv.c |   65 
 ++
  target-i386/hyperv.h |   37 
  4 files changed, 118 insertions(+), 0 deletions(-)
  create mode 100644 target-i386/hyperv.c
  create mode 100644 target-i386/hyperv.h
 
 diff --git a/Makefile.target b/Makefile.target
 index 6e742c2..6245796 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -209,6 +209,8 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
  obj-y += memory.o
  LIBS+=-lz
  
 +obj-i386-y +=hyperv.o
 +
  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
  QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
 diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
 index 1e8bcff..4193df1 100644
 --- a/target-i386/cpuid.c
 +++ b/target-i386/cpuid.c
 @@ -27,6 +27,8 @@
  #include "qemu-option.h"
  #include "qemu-config.h"
  
 +#include "hyperv.h"
 +
  /* feature flags taken from Intel Processor Identification and the CPUID
   * Instruction and AMD's CPUID Specification.  In cases of disagreement
   * between feature naming conventions, aliases may be added.
 @@ -716,6 +718,14 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
  goto error;
  }
  x86_cpu_def->tsc_khz = tsc_freq / 1000;
 +} else if (!strcmp(featurestr, "hv_spinlocks")) {
 +char *err;
 +numvalue = strtoul(val, &err, 0);
 +if (!*val || *err) {
 +fprintf(stderr, "bad numerical value %s\n", val);
 +goto error;
 +}
 +hyperv_set_spinlock_retries(numvalue);
  } else {
  fprintf(stderr, "unrecognized feature %s\n", featurestr);
  goto error;
 @@ -724,6 +734,10 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
  check_cpuid = 1;
  } else if (!strcmp(featurestr, "enforce")) {
  check_cpuid = enforce_cpuid = 1;
 +} else if (!strcmp(featurestr, "hv_relaxed")) {
 +hyperv_enable_relaxed_timing(true);
 +} else if (!strcmp(featurestr, "hv_vapic")) {
 +hyperv_enable_vapic_recommended(true);
  } else {
  fprintf(stderr, "feature string `%s' not in format (+feature|-feature|feature=xyz)\n", featurestr);
  goto error;
 diff --git a/target-i386/hyperv.c b/target-i386/hyperv.c
 new file mode 100644
 index 0000000..b2e57ad
 --- /dev/null
 +++ b/target-i386/hyperv.c
 @@ -0,0 +1,65 @@
 +/*
 + * QEMU Hyper-V support
 + *
 + * Copyright Red Hat, Inc. 2011
 + *
 + * Author: Vadim Rozenfeld vroze...@redhat.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include "hyperv.h"
 +
 +static bool hyperv_vapic;
 +static bool hyperv_relaxed_timing;
 +static int hyperv_spinlock_attempts = HYPERV_SPINLOCK_NEVER_RETRY;
 +
 +void hyperv_enable_vapic_recommended(bool val)
 +{
 +hyperv_vapic = val;
 +}
 +
 +void hyperv_enable_relaxed_timing(bool val)
 +{
 +hyperv_relaxed_timing = val;
 +}
 +
 +void hyperv_set_spinlock_retries(int val)
 +{
 +hyperv_spinlock_attempts = val;
 +if (hyperv_spinlock_attempts < 0xFFF) {
 +hyperv_spinlock_attempts = 0xFFF;
 +}
 +}
 +
 +bool hyperv_enabled(void)
 +{
 +return hyperv_hypercall_available() || hyperv_relaxed_timing_enabled();
 +}
 +
 +bool hyperv_hypercall_available(void)
 +{
 +if (hyperv_vapic ||
 +(hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY)) {
 +  return true;
 +}
 +return false;
 +}
 +
 +bool hyperv_vapic_recommended(void)
 +{
 +return hyperv_vapic;
 +}
 +
 +bool hyperv_relaxed_timing_enabled(void)
 +{
 +return hyperv_relaxed_timing;
 +}
 +
 +int hyperv_get_spinlock_retries(void)
 +{
 +return hyperv_spinlock_attempts;
 +}
 +
 diff --git a/target-i386/hyperv.h b/target-i386/hyperv.h
 new file mode 100644
 index 0000000..0d742f8
 --- /dev/null
 +++ b/target-i386/hyperv.h
 @@ -0,0 +1,37 @@
 +/*
 + * QEMU Hyper-V support
 + *
 + * Copyright Red Hat, Inc. 2011
 + *
 + * Author: Vadim Rozenfeld vroze...@redhat.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#ifndef QEMU_HW_HYPERV_H
 +#define QEMU_HW_HYPERV_H 1
 +
 +#include "qemu-common.h"
 +#include <asm/hyperv.h>
 +
 +#ifndef HYPERV_SPINLOCK_NEVER_RETRY
 +#define HYPERV_SPINLOCK_NEVER_RETRY 0xFFFFFFFF
 +#endif
 +
 +#ifndef KVM_CPUID_SIGNATURE_NEXT
 +#define KVM_CPUID_SIGNATURE_NEXT 0x40000100
 +#endif
 +
 +void hyperv_enable_vapic_recommended(bool val);
 +void hyperv_enable_relaxed_timing(bool val);
 +void hyperv_set_spinlock_retries(int val);
 +
 +bool 

Re: [PATCH v7 00/18] uq/master: Introduce basic irqchip support

2012-01-18 Thread Jan Kiszka
On 2012-01-18 10:48, Marcelo Tosatti wrote:
> On Mon, Jan 16, 2012 at 04:55:34PM +0100, Jan Kiszka wrote:
>> Changes in v7:
>> - introduce {apic,pic,ioapic}_qdev_register and use
>>   {APIC,PIC,IOAPIC}CommonInfo to move more code into the common modules
>> - clean up forgotten fragments of backend/frontend approach
>> - rephrased potentially misleading title of last patch ;)
>>
>> CC: Lai Jiangshan la...@cn.fujitsu.com
>>
>> Jan Kiszka (18):
>>   msi: Generalize msix_supported to msi_supported
>>   kvm: Move kvmclock into hw/kvm folder
>>   apic: Stop timer on reset
>>   apic: Inject external NMI events via LINT1
>>   apic: Introduce apic_report_irq_delivered
>>   apic: Factor out base class for KVM reuse
>>   apic: Open-code timer save/restore
>>   i8259: Completely privatize PicState
>>   i8259: Factor out base class for KVM reuse
>>   ioapic: Drop post-load irr initialization
>>   ioapic: Factor out base class for KVM reuse
>>   memory: Introduce memory_region_init_reservation
>>   kvm: Introduce core services for in-kernel irqchip support
>>   kvm: x86: Establish IRQ0 override control
>>   kvm: x86: Add user space part for in-kernel APIC
>>   kvm: x86: Add user space part for in-kernel i8259
>>   kvm: x86: Add user space part for in-kernel IOAPIC
>>   kvm: Activate in-kernel irqchip support
>
> Patchset does not apply, please regenerate

OK, working on it. I think it had some build issue with !CONFIG_KVM anyway.

> (patch 2 is missing actual
> file move), thanks.

Hmm, possibly requires a fairly recent diff. Are you fine with pulling
from my tree? Then I will attach the url, otherwise expand this.

Jan





Re: [PATCH v7 00/18] uq/master: Introduce basic irqchip support

2012-01-18 Thread Marcelo Tosatti
On Wed, Jan 18, 2012 at 09:09:22PM +0100, Jan Kiszka wrote:
> > Patchset does not apply, please regenerate
>
> OK, working on it. I think it had some build issue with !CONFIG_KVM anyway.
>
> > (patch 2 is missing actual
> > file move), thanks.
>
> Hmm, possibly requires a fairly recent diff. Are you fine with pulling
> from my tree? Then I will attach the url, otherwise expand this.

Pull is OK.



Re: [PATCH RFC v3 1/2] hyper-v: introduce Hyper-V support infrastructure.

2012-01-18 Thread Jan Kiszka
On 2012-01-18 21:05, Jan Kiszka wrote:
 On 2011-12-18 21:48, Vadim Rozenfeld wrote:
 ---
  Makefile.target  |2 +
  target-i386/cpuid.c  |   14 ++
  target-i386/hyperv.c |   65 
 ++
  target-i386/hyperv.h |   37 
  4 files changed, 118 insertions(+), 0 deletions(-)
  create mode 100644 target-i386/hyperv.c
  create mode 100644 target-i386/hyperv.h

 diff --git a/Makefile.target b/Makefile.target
 index 6e742c2..6245796 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -209,6 +209,8 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
  obj-y += memory.o
  LIBS+=-lz
  
 +obj-i386-y +=hyperv.o
 +
  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
  QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
 diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
 index 1e8bcff..4193df1 100644
 --- a/target-i386/cpuid.c
 +++ b/target-i386/cpuid.c
 @@ -27,6 +27,8 @@
  #include "qemu-option.h"
  #include "qemu-config.h"
  
 +#include "hyperv.h"
 +
  /* feature flags taken from Intel Processor Identification and the CPUID
   * Instruction and AMD's CPUID Specification.  In cases of disagreement
   * between feature naming conventions, aliases may be added.
 @@ -716,6 +718,14 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
  goto error;
  }
  x86_cpu_def->tsc_khz = tsc_freq / 1000;
 +} else if (!strcmp(featurestr, "hv_spinlocks")) {
 +char *err;
 +numvalue = strtoul(val, &err, 0);
 +if (!*val || *err) {
 +fprintf(stderr, "bad numerical value %s\n", val);
 +goto error;
 +}
 +hyperv_set_spinlock_retries(numvalue);
  } else {
  fprintf(stderr, "unrecognized feature %s\n", featurestr);
  goto error;
 @@ -724,6 +734,10 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
  check_cpuid = 1;
  } else if (!strcmp(featurestr, "enforce")) {
  check_cpuid = enforce_cpuid = 1;
 +} else if (!strcmp(featurestr, "hv_relaxed")) {
 +hyperv_enable_relaxed_timing(true);
 +} else if (!strcmp(featurestr, "hv_vapic")) {
 +hyperv_enable_vapic_recommended(true);
  } else {
  fprintf(stderr, "feature string `%s' not in format (+feature|-feature|feature=xyz)\n", featurestr);
  goto error;
 diff --git a/target-i386/hyperv.c b/target-i386/hyperv.c
 new file mode 100644
 index 0000000..b2e57ad
 --- /dev/null
 +++ b/target-i386/hyperv.c
 @@ -0,0 +1,65 @@
 +/*
 + * QEMU Hyper-V support
 + *
 + * Copyright Red Hat, Inc. 2011
 + *
 + * Author: Vadim Rozenfeld vroze...@redhat.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include "hyperv.h"
 +
 +static bool hyperv_vapic;
 +static bool hyperv_relaxed_timing;
 +static int hyperv_spinlock_attempts = HYPERV_SPINLOCK_NEVER_RETRY;
 +
 +void hyperv_enable_vapic_recommended(bool val)
 +{
 +hyperv_vapic = val;
 +}
 +
 +void hyperv_enable_relaxed_timing(bool val)
 +{
 +hyperv_relaxed_timing = val;
 +}
 +
 +void hyperv_set_spinlock_retries(int val)
 +{
 +hyperv_spinlock_attempts = val;
 +if (hyperv_spinlock_attempts < 0xFFF) {
 +hyperv_spinlock_attempts = 0xFFF;
 +}
 +}
 +
 +bool hyperv_enabled(void)
 +{
 +return hyperv_hypercall_available() || hyperv_relaxed_timing_enabled();
 +}
 +
 +bool hyperv_hypercall_available(void)
 +{
 +if (hyperv_vapic ||
 +(hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY)) {
 +  return true;
 +}
 +return false;
 +}
 +
 +bool hyperv_vapic_recommended(void)
 +{
 +return hyperv_vapic;
 +}
 +
 +bool hyperv_relaxed_timing_enabled(void)
 +{
 +return hyperv_relaxed_timing;
 +}
 +
 +int hyperv_get_spinlock_retries(void)
 +{
 +return hyperv_spinlock_attempts;
 +}
 +
 diff --git a/target-i386/hyperv.h b/target-i386/hyperv.h
 new file mode 100644
 index 0000000..0d742f8
 --- /dev/null
 +++ b/target-i386/hyperv.h
 @@ -0,0 +1,37 @@
 +/*
 + * QEMU Hyper-V support
 + *
 + * Copyright Red Hat, Inc. 2011
 + *
 + * Author: Vadim Rozenfeld vroze...@redhat.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#ifndef QEMU_HW_HYPERV_H
 +#define QEMU_HW_HYPERV_H 1
 +
 +#include "qemu-common.h"
 +#include <asm/hyperv.h>
 +
 +#ifndef HYPERV_SPINLOCK_NEVER_RETRY
 +#define HYPERV_SPINLOCK_NEVER_RETRY 0xFFFFFFFF
 +#endif
 +
 +#ifndef KVM_CPUID_SIGNATURE_NEXT
 +#define KVM_CPUID_SIGNATURE_NEXT 0x40000100
 +#endif
 +
 +void hyperv_enable_vapic_recommended(bool val);
 +void hyperv_enable_relaxed_timing(bool val);
 +void 

[PATCH] virt: Fix libvirt vm incompatibility with RHEL 5

2012-01-18 Thread Lucas Meneghel Rodrigues
There's no vnclisten param in the virt-install command
shipped in RHEL 5, so let's add it to the command line
only if virt-install supports this option.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/virt/libvirt_vm.py |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/client/virt/libvirt_vm.py b/client/virt/libvirt_vm.py
index c825661..73894c6 100644
--- a/client/virt/libvirt_vm.py
+++ b/client/virt/libvirt_vm.py
@@ -536,7 +536,10 @@ class VM(virt_vm.BaseVM):
             return " --vnc --vncport=%d" % (vnc_port)
 
         def add_vnclisten(help, vnclisten):
-            return " --vnclisten=%s " % (vnclisten)
+            if has_option(help, "vnclisten"):
+                return " --vnclisten=%s" % (vnclisten)
+            else:
+                return ""
 
         def add_sdl(help):
             if has_option(help, "sdl"):
-- 
1.7.7.5
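
The pattern used by this patch (only emit a flag when the tool's help text advertises it) can be tried in isolation. The snippet below is a self-contained sketch: the `has_option` shown here is a hypothetical re-implementation for illustration, not necessarily the helper in libvirt_vm.py, and both help excerpts are made-up sample data rather than real virt-install output.

```python
import re


def has_option(help_text, option):
    """Return True if '--<option>' appears in the help text (hypothetical helper)."""
    return bool(re.search(r"--%s\b" % re.escape(option), help_text))


def add_vnclisten(help_text, vnclisten):
    """Return the --vnclisten argument only if the tool supports it."""
    if has_option(help_text, "vnclisten"):
        return " --vnclisten=%s" % vnclisten
    return ""


# Made-up help excerpts standing in for an older and a newer virt-install.
OLD_HELP = "  --vnc\n  --vncport=VNCPORT\n  --sdl\n"
NEW_HELP = OLD_HELP + "  --vnclisten=VNCLISTEN\n"

print(repr(add_vnclisten(OLD_HELP, "0.0.0.0")))
print(repr(add_vnclisten(NEW_HELP, "0.0.0.0")))
```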



[PATCH V2 1/7] i8254: Do not raise IRQ level on reset

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

Avoid changing the IRQ level to high on reset as it may trigger spurious
events. Instead, open-code the effects of pit_load_count(0) in the reset
handler.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index cf9ed2f..df42c07 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -481,7 +481,13 @@ static void pit_reset(DeviceState *dev)
         s = &pit->channels[i];
         s->mode = 3;
         s->gate = (i != 2);
-        pit_load_count(s, 0);
+        s->count_load_time = qemu_get_clock_ns(vm_clock);
+        s->count = 0x10000;
+        if (i == 0 && !s->irq_disabled) {
+            s->next_transition_time =
+                pit_get_next_transition_time(s, s->count_load_time);
+            qemu_mod_timer(s->irq_timer, s->next_transition_time);
+        }
     }
 }
 
-- 
1.7.3.4
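
For background on the value the reset handler now sets directly: on the 8254 a programmed count of 0 stands for the maximum count of 0x10000 (65536) ticks. A minimal Python sketch of that rule follows; the helper names are made up, and PIT_FREQ is the 1193182 Hz constant that hw/i8254.h defines in this series.

```python
PIT_FREQ = 1193182  # Hz, PIT input clock


def effective_count(programmed):
    """A programmed count of 0 means the maximum period of 0x10000 ticks."""
    return 0x10000 if programmed == 0 else programmed


def period_ns(programmed):
    """Length of one full count period in nanoseconds (integer arithmetic)."""
    return effective_count(programmed) * 1000000000 // PIT_FREQ


print(hex(effective_count(0)))
print(period_ns(0))
```

The default period after reset therefore comes out at roughly 55 ms.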



[PATCH V2 0/7] pit, hpet, pcspk: fixes & preparation for KVM

2012-01-18 Thread Jan Kiszka
This is a preparatory series to allow the introduction of the KVM
in-kernel PIT. A working and fairly clean version for that is ready. It
is just waiting for the irqchip baseline and this series to be merged.

This series also fixes various bugs in the PIT and HPET code, see
patches for details.

Changes in V2:
 - do not raise i8254 IRQ on reset
 - introduce i8254.h
 - pass irq output object on i8254 initialization
 - convert PC speaker to qdev
 - factor out pit_get_channel_info

Jan Kiszka (7):
  i8254: Do not raise IRQ level on reset
  hpet: Save/restore cached RTC IRQ level
  i8254: Factor out interface header
  i8254: Pass irq output object on initialization
  i8254: Rework & fix interaction with HPET in legacy mode
  pcspk: Convert to qdev
  i8254: Factor out pit_get_channel_info

 arch_init.c|1 +
 hw/alpha_dp264.c   |3 +-
 hw/hpet.c  |   65 ++--
 hw/hpet_emul.h |3 ++
 hw/i8254.c |   92 ++-
 hw/i8254.h |   55 +++
 hw/mips_fulong2e.c |3 +-
 hw/mips_jazz.c |6 ++-
 hw/mips_malta.c|3 +-
 hw/mips_r4k.c  |3 +-
 hw/pc.c|   17 +++--
 hw/pc.h|   29 
 hw/pcspk.c |   73 
 hw/pcspk.h |   45 +
 hw/ppc_prep.c  |2 +-
 15 files changed, 275 insertions(+), 125 deletions(-)
 create mode 100644 hw/i8254.h
 create mode 100644 hw/pcspk.h

-- 
1.7.3.4



[PATCH V2 7/7] i8254: Factor out pit_get_channel_info

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

Instead of providing 4 individual query functions for mode, gate, output
and initial counter state, introduce a service that queries all
information at once. This comes with tiny additional costs for
pcspk_callback but with a much cleaner interface. Also, it will simplify
the implementation of the KVM in-kernel PIT model.
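
The interface change described above, one call that fills in a record instead of four separate getters, can be sketched as follows. This is a toy Python model for illustration only, not QEMU code: `ChannelInfo`, `PitModel` and `speaker_count` are hypothetical names, and the stored `out` value is a placeholder for what the real code derives from the current time.

```python
from dataclasses import dataclass


@dataclass
class ChannelInfo:
    """Snapshot of one PIT channel, mirroring the fields of PITChannelInfo."""
    gate: int
    mode: int
    initial_count: int
    out: int


class PitModel:
    """Toy stand-in for the PIT device state (three channels)."""

    def __init__(self):
        self.channels = [
            {"gate": int(i != 2), "mode": 3, "count": 0x10000, "out": 1}
            for i in range(3)
        ]

    def get_channel_info(self, channel):
        """Return all queryable state of a channel in one call."""
        s = self.channels[channel]
        return ChannelInfo(gate=s["gate"], mode=s["mode"],
                           initial_count=s["count"], out=s["out"])


def speaker_count(pit):
    """Caller in the style of pcspk_callback: one query, then plain field reads."""
    info = pit.get_channel_info(2)
    if info.mode != 3:
        return None
    return info.initial_count


print(speaker_count(PitModel()))
```

A caller that needs only one field pays for copying the other three, which is the "tiny additional cost" the commit message mentions.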

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c |   35 ++-
 hw/i8254.h |   12 
 hw/pcspk.c |   16 +++-
 3 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index f5be0e5..2f1f370 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -90,7 +90,7 @@ static int pit_get_count(PITChannelState *s)
 }
 
 /* get pit output bit */
-static int pit_get_out1(PITChannelState *s, int64_t current_time)
+static int pit_get_out(PITChannelState *s, int64_t current_time)
 {
     uint64_t d;
     int out;
@@ -122,13 +122,6 @@ static int pit_get_out1(PITChannelState *s, int64_t current_time)
     return out;
 }
 
-int pit_get_out(ISADevice *dev, int channel, int64_t current_time)
-{
-    PITState *pit = DO_UPCAST(PITState, dev, dev);
-    PITChannelState *s = &pit->channels[channel];
-    return pit_get_out1(s, current_time);
-}
-
 /* return -1 if no transition will occur.  */
 static int64_t pit_get_next_transition_time(PITChannelState *s,
                                             int64_t current_time)
@@ -215,25 +208,15 @@ void pit_set_gate(ISADevice *dev, int channel, int val)
     s->gate = val;
 }
 
-int pit_get_gate(ISADevice *dev, int channel)
-{
-    PITState *pit = DO_UPCAST(PITState, dev, dev);
-    PITChannelState *s = &pit->channels[channel];
-    return s->gate;
-}
-
-int pit_get_initial_count(ISADevice *dev, int channel)
+void pit_get_channel_info(ISADevice *dev, int channel, PITChannelInfo *info)
 {
     PITState *pit = DO_UPCAST(PITState, dev, dev);
     PITChannelState *s = &pit->channels[channel];
-    return s->count;
-}
 
-int pit_get_mode(ISADevice *dev, int channel)
-{
-    PITState *pit = DO_UPCAST(PITState, dev, dev);
-    PITChannelState *s = &pit->channels[channel];
-    return s->mode;
+    info->gate = s->gate;
+    info->mode = s->mode;
+    info->initial_count = s->count;
+    info->out = pit_get_out(s, qemu_get_clock_ns(vm_clock));
 }
 
 static inline void pit_load_count(PITChannelState *s, int val)
@@ -274,7 +257,9 @@ static void pit_ioport_write(void *opaque, uint32_t addr, uint32_t val)
                     if (!(val & 0x10) && !s->status_latched) {
                         /* status latch */
                         /* XXX: add BCD and null count */
-                        s->status =  (pit_get_out1(s, qemu_get_clock_ns(vm_clock)) << 7) |
+                        s->status =
+                            (pit_get_out(s,
+                                         qemu_get_clock_ns(vm_clock)) << 7) |
                             (s->rw_mode << 4) |
                             (s->mode << 1) |
                             s->bcd;
@@ -381,7 +366,7 @@ static void pit_irq_timer_update(PITChannelState *s, int64_t current_time)
         return;
     }
     expire_time = pit_get_next_transition_time(s, current_time);
-    irq_level = pit_get_out1(s, current_time);
+    irq_level = pit_get_out(s, current_time);
     qemu_set_irq(s->irq, irq_level);
 #ifdef DEBUG_PIT
     printf("irq_level=%d next_delay=%f\n",
diff --git a/hw/i8254.h b/hw/i8254.h
index 2dc1008..bf05d58 100644
--- a/hw/i8254.h
+++ b/hw/i8254.h
@@ -30,6 +30,13 @@
 
 #define PIT_FREQ 1193182
 
+typedef struct PITChannelInfo {
+    int gate;
+    int mode;
+    int initial_count;
+    int out;
+} PITChannelInfo;
+
 static inline ISADevice *pit_init(ISABus *bus, int base, qemu_irq irq)
 {
     ISADevice *dev;
@@ -43,9 +50,6 @@ static inline ISADevice *pit_init(ISABus *bus, int base, qemu_irq irq)
 }
 
 void pit_set_gate(ISADevice *dev, int channel, int val);
-int pit_get_gate(ISADevice *dev, int channel);
-int pit_get_initial_count(ISADevice *dev, int channel);
-int pit_get_mode(ISADevice *dev, int channel);
-int pit_get_out(ISADevice *dev, int channel, int64_t current_time);
+void pit_get_channel_info(ISADevice *dev, int channel, PITChannelInfo *info);
 
 #endif /* !HW_I8254_H */
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 223e22a..951b5cd 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -75,12 +75,16 @@ static inline void generate_samples(PCSpkState *s)
 static void pcspk_callback(void *opaque, int free)
 {
     PCSpkState *s = opaque;
+    PITChannelInfo ch;
     unsigned int n;
 
-    if (pit_get_mode(s->pit, 2) != 3)
+    pit_get_channel_info(s->pit, 2, &ch);
+
+    if (ch.mode != 3) {
         return;
+    }
 
-    n = pit_get_initial_count(s->pit, 2);
+    n = ch.initial_count;
     /* avoid frequencies that are not reproducible with sample rate */
     if (n < PCSPK_MIN_COUNT)
         n = 0;
@@ -121,12 +125,14 @@ static uint64_t pcspk_io_read(void *opaque, target_phys_addr_t addr,
   

[PATCH V2 6/7] pcspk: Convert to qdev

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

Convert the PC speaker device to a qdev ISA model. While at it, move the
public interface to a dedicated header file.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 arch_init.c|1 +
 hw/mips_jazz.c |3 ++-
 hw/pc.c|3 ++-
 hw/pc.h|4 
 hw/pcspk.c |   56 ++--
 hw/pcspk.h |   45 +
 6 files changed, 96 insertions(+), 16 deletions(-)
 create mode 100644 hw/pcspk.h

diff --git a/arch_init.c b/arch_init.c
index 95ac682..f39b979 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -42,6 +42,7 @@
 #include "gdbstub.h"
 #include "hw/smbios.h"
 #include "exec-memory.h"
+#include "hw/pcspk.h"
 
 #ifdef TARGET_SPARC
 int graphic_width = 1024;
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 9878b78..5398003 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -37,6 +37,7 @@
 #include "loader.h"
 #include "mc146818rtc.h"
 #include "i8254.h"
+#include "pcspk.h"
 #include "blockdev.h"
 #include "sysbus.h"
 #include "exec-memory.h"
@@ -193,7 +194,7 @@ static void mips_jazz_init(MemoryRegion *address_space,
     cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
     DMA_init(0, cpu_exit_irq);
     pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
-    pcspk_init(pit);
+    pcspk_init(isa_bus, pit);
 
 /* ISA IO space at 0x9000 */
 isa_mmio_init(0x9000, 0x0100);
diff --git a/hw/pc.c b/hw/pc.c
index b199fa4..50e2643 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -37,6 +37,7 @@
 #include "multiboot.h"
 #include "mc146818rtc.h"
 #include "i8254.h"
+#include "pcspk.h"
 #include "msix.h"
 #include "sysbus.h"
 #include "sysemu.h"
@@ -1160,7 +1161,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
         /* connect PIT to output control line of the HPET */
         qdev_connect_gpio_out(hpet, 0, qdev_get_gpio_in(&pit->qdev, 0));
     }
-    pcspk_init(pit);
+    pcspk_init(isa_bus, pit);
 
     for(i = 0; i < MAX_SERIAL_PORTS; i++) {
         if (serial_hds[i]) {
diff --git a/hw/pc.h b/hw/pc.h
index 367e750..cefdf0f 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -149,10 +149,6 @@ void piix4_smbus_register_device(SMBusDevice *dev, uint8_t addr);
 /* hpet.c */
 extern int no_hpet;
 
-/* pcspk.c */
-void pcspk_init(ISADevice *pit);
-int pcspk_audio_init(ISABus *bus);
-
 /* piix_pci.c */
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 43df818..223e22a 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -28,6 +28,7 @@
 #include "audio/audio.h"
 #include "qemu-timer.h"
 #include "i8254.h"
+#include "pcspk.h"
 
 #define PCSPK_BUF_LEN 1792
 #define PCSPK_SAMPLE_RATE 32000
@@ -35,10 +36,13 @@
 #define PCSPK_MIN_COUNT ((PIT_FREQ + PCSPK_MAX_FREQ - 1) / PCSPK_MAX_FREQ)
 
 typedef struct {
+    ISADevice dev;
+    MemoryRegion ioport;
+    uint32_t iobase;
     uint8_t sample_buf[PCSPK_BUF_LEN];
     QEMUSoundCard card;
     SWVoiceOut *voice;
-    ISADevice *pit;
+    void *pit;
     unsigned int pit_count;
     unsigned int samples;
     unsigned int play_pos;
@@ -47,7 +51,7 @@ typedef struct {
 } PCSpkState;
 
 static const char *s_spk = "pcspk";
-static PCSpkState pcspk_state;
+static PCSpkState *pcspk_state;
 
 static inline void generate_samples(PCSpkState *s)
 {
@@ -99,7 +103,7 @@ static void pcspk_callback(void *opaque, int free)
 
 int pcspk_audio_init(ISABus *bus)
 {
-    PCSpkState *s = &pcspk_state;
+    PCSpkState *s = pcspk_state;
     struct audsettings as = {PCSPK_SAMPLE_RATE, 1, AUD_FMT_U8, 0};
 
     AUD_register_card(s_spk, &s->card);
@@ -113,7 +117,8 @@ int pcspk_audio_init(ISABus *bus)
     return 0;
 }
 
-static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
+static uint64_t pcspk_io_read(void *opaque, target_phys_addr_t addr,
+                              unsigned size)
 {
     PCSpkState *s = opaque;
     int out;
@@ -124,7 +129,8 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
     return pit_get_gate(s->pit, 2) | (s->data_on << 1) | s->dummy_refresh_clock | out;
 }
 
-static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
+static void pcspk_io_write(void *opaque, target_phys_addr_t addr, uint64_t val,
+                           unsigned size)
 {
     PCSpkState *s = opaque;
     const int gate = val & 1;
@@ -138,11 +144,41 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
     }
 }
 
-void pcspk_init(ISADevice *pit)
+static const MemoryRegionOps pcspk_io_ops = {
+    .read = pcspk_io_read,
+    .write = pcspk_io_write,
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 1,
+    },
+};
+
+static int pcspk_initfn(ISADevice *dev)
 {
-    PCSpkState *s = &pcspk_state;
+    PCSpkState *s = DO_UPCAST(PCSpkState, dev, dev);
+
+    memory_region_init_io(&s->ioport, &pcspk_io_ops, s, "elcr", 1);
+    isa_register_ioport(NULL, &s->ioport, s->iobase);
+
+    pcspk_state = s;
 
-    s->pit = pit;
-

[PATCH V2 4/7] i8254: Pass irq output object on initialization

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

Instead of retrieving the IRQ object from the ISA bus, let the creator
of the PIT pick it. pit_init can then connect it to a generic GPIO
output pin.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
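Note (not part of the patch): a toy model of the wiring change described
above, in case it helps review. The device no longer looks up its
interrupt by index; the board code hands it a ready-made line and the
device only drives its output pin. All Toy*/toy_* names are made up for
illustration and are not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an IRQ line: a callback plus its context. */
typedef struct {
    void (*handler)(void *opaque, int level);
    void *opaque;
} ToyIrq;

/* Hypothetical timer device with a single output pin. */
typedef struct {
    ToyIrq out;
} ToyPit;

/* Board code decides where the output pin is routed. */
static void toy_pit_connect(ToyPit *pit, ToyIrq irq)
{
    assert(irq.handler != NULL);
    pit->out = irq;
}

/* The device drives its pin without knowing what sits behind it. */
static void toy_pit_set_out(ToyPit *pit, int level)
{
    if (pit->out.handler) {
        pit->out.handler(pit->out.opaque, level);
    }
}

/* Example sink: records the last level it was given. */
static void toy_record_level(void *opaque, int level)
{
    *(int *)opaque = level;
}
```

An unconnected pin is simply ignored in this sketch; whether that matches
the real GPIO behaviour is an assumption.
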
 hw/alpha_dp264.c   |    2 +-
 hw/i8254.c         |    4 +---
 hw/i8254.h         |    4 ++--
 hw/mips_fulong2e.c |    2 +-
 hw/mips_jazz.c     |    2 +-
 hw/mips_malta.c    |    2 +-
 hw/mips_r4k.c      |    2 +-
 hw/pc.c            |    2 +-
 hw/ppc_prep.c      |    2 +-
 9 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index 4c0efd3..5b49b90 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -73,7 +73,7 @@ static void clipper_init(ram_addr_t ram_size,
                            clipper_pci_map_irq);
 
     rtc_init(isa_bus, 1980, rtc_irq);
-    pit_init(isa_bus, 0x40, 0);
+    pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
     isa_create_simple(isa_bus, "i8042");
 
 /* VGA setup.  Don't bother loading the bios.  */
diff --git a/hw/i8254.c b/hw/i8254.c
index 7d5ca3a..dd49552 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -57,7 +57,6 @@ typedef struct PITChannelState {
 typedef struct PITState {
 ISADevice dev;
 MemoryRegion ioports;
-uint32_t irq;
 uint32_t iobase;
 PITChannelState channels[3];
 } PITState;
@@ -532,7 +531,7 @@ static int pit_initfn(ISADevice *dev)
     s = &pit->channels[0];
     /* the timer 0 is connected to an IRQ */
     s->irq_timer = qemu_new_timer_ns(vm_clock, pit_irq_timer, s);
-    s->irq = isa_get_irq(dev, pit->irq);
+    qdev_init_gpio_out(&dev->qdev, &s->irq, 1);
 
     memory_region_init_io(&pit->ioports, &pit_ioport_ops, pit, "pit", 4);
     isa_register_ioport(dev, &pit->ioports, pit->iobase);
@@ -550,7 +549,6 @@ static ISADeviceInfo pit_info = {
     .qdev.no_user  = 1,
     .init          = pit_initfn,
     .qdev.props = (Property[]) {
-        DEFINE_PROP_UINT32("irq", PITState, irq,  -1),
         DEFINE_PROP_HEX32("iobase", PITState, iobase,  -1),
         DEFINE_PROP_END_OF_LIST(),
     },
diff --git a/hw/i8254.h b/hw/i8254.h
index cd3111c..4821fb4 100644
--- a/hw/i8254.h
+++ b/hw/i8254.h
@@ -30,14 +30,14 @@
 
 #define PIT_FREQ 1193182
 
-static inline ISADevice *pit_init(ISABus *bus, int base, int irq)
+static inline ISADevice *pit_init(ISABus *bus, int base, qemu_irq irq)
 {
     ISADevice *dev;
 
     dev = isa_create(bus, "isa-pit");
     qdev_prop_set_uint32(&dev->qdev, "iobase", base);
-    qdev_prop_set_uint32(&dev->qdev, "irq", irq);
     qdev_init_nofail(&dev->qdev);
+    qdev_connect_gpio_out(&dev->qdev, 0, irq);
 
     return dev;
 }
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index ead72ae..fedc929 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -364,7 +364,7 @@ static void mips_fulong2e_init(ram_addr_t ram_size, const char *boot_device,
 smbus_eeprom_init(smbus, 1, eeprom_spd, sizeof(eeprom_spd));
 
 /* init other devices */
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 61dee4d..9878b78 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -192,7 +192,7 @@ static void mips_jazz_init(MemoryRegion *address_space,
 isa_bus_irqs(isa_bus, i8259);
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
 pcspk_init(pit);
 
 /* ISA IO space at 0x9000 */
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index 7ddfc3a..506244b 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -970,7 +970,7 @@ void mips_malta_init (ram_addr_t ram_size,
   isa_get_irq(NULL, 9), NULL, NULL, 0);
 /* TODO: Populate SPD eeprom data.  */
 smbus_eeprom_init(smbus, 8, NULL, 0);
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index 1b3ec2d..6ff56e9 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -270,7 +270,7 @@ void mips_r4k_init (ram_addr_t ram_size,
 isa_mmio_init(0x1400, 0x0001);
 isa_mem_base = 0x1000;
 
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
 
     for(i = 0; i < MAX_SERIAL_PORTS; i++) {
 if (serial_hds[i]) {
diff --git a/hw/pc.c b/hw/pc.c
index ea60a7c..29a4187 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1152,7 +1152,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
 
 qemu_register_boot_set(pc_boot_set, *rtc_state);
 
-pit = pit_init(isa_bus, 0x40, 0);
+pit = pit_init(isa_bus, 0x40, isa_get_irq(NULL, 0));
 pcspk_init(pit);
 
     for(i = 0; i < MAX_SERIAL_PORTS; i++) {
diff --git 

[PATCH V2 2/7] hpet: Save/restore cached RTC IRQ level

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

In legacy mode, the HPET suppresses the RTC interrupt delivery via IRQ
8 but keeps track of the RTC output level and applies it when legacy
mode is turned off again. This value has to be preserved across save/
restore as it cannot be reconstructed otherwise.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
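Note (not part of the patch): a minimal, self-contained sketch of the
"only migrate the field when it is needed" idea behind the subsection
below. The byte layout here is invented for illustration and is not the
VMState wire format; all Toy*/toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state; only the cached level matters. */
typedef struct {
    uint8_t rtc_irq_level;
} ToyHpet;

/* Same test as the "needed" callback: skip the field while it is at its default. */
static bool toy_level_needed(const ToyHpet *s)
{
    return s->rtc_irq_level != 0;
}

/* Writes a presence byte, then the level if needed. Returns bytes written. */
static size_t toy_save(const ToyHpet *s, uint8_t *buf, size_t len)
{
    assert(len >= 2);
    if (!toy_level_needed(s)) {
        buf[0] = 0;
        return 1;
    }
    buf[0] = 1;
    buf[1] = s->rtc_irq_level;
    return 2;
}

/* Restores the level, falling back to the default when it was not sent. */
static void toy_load(ToyHpet *s, const uint8_t *buf, size_t len)
{
    assert(len >= 1);
    s->rtc_irq_level = 0;
    if (buf[0] == 1) {
        assert(len >= 2);
        s->rtc_irq_level = buf[1];
    }
}
```

The point of the sketch is that a stream without the optional part still
loads cleanly, with the level at its default of 0.
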
 hw/hpet.c |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 5312df7..1b64e6a 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -240,6 +240,24 @@ static int hpet_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static bool hpet_rtc_irq_level_needed(void *opaque)
+{
+    HPETState *s = opaque;
+
+    return s->rtc_irq_level != 0;
+}
+
+static const VMStateDescription vmstate_hpet_rtc_irq_level = {
+    .name = "hpet/rtc_irq_level",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_UINT8(rtc_irq_level, HPETState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_hpet_timer = {
     .name = "hpet_timer",
     .version_id = 1,
@@ -273,6 +291,14 @@ static const VMStateDescription vmstate_hpet = {
         VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
                                     vmstate_hpet_timer, HPETTimer),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (VMStateSubsection[]) {
+        {
+            .vmsd = &vmstate_hpet_rtc_irq_level,
+            .needed = hpet_rtc_irq_level_needed,
+        }, {
+            /* empty */
+        }
     }
 };
 
-- 
1.7.3.4



[PATCH V2 5/7] i8254: Rework and fix interaction with HPET in legacy mode

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

When the HPET enters legacy mode, the IRQ output of the PIT is
suppressed and replaced by the HPET timer 0. But the current code to
emulate this was broken in many ways: it reset the PIT state after
re-enabling, it worked against a stale static PIT structure, and it did
not properly save/restore the IRQ output mask in the PIT vmstate.

This patch solves the PIT IRQ control in a different way. On x86, it
redirects the PIT IRQ to the HPET, just like the RTC, but it also
keeps the control line from the HPET to the PIT. This allows disabling
the PIT QEMU timer when it is not needed. The PIT's view of the control
line state is now saved in the same format that qemu-kvm is already
using.

Note that, in contrast to the suppressed RTC IRQ line, we do not need to
save/restore the PIT line state in the HPET. As we trigger a PIT IRQ
update via the control line, the line state is reconstructed on mode
switch.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
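Note (not part of the patch): a self-contained toy model of the routing
described above, assuming two legacy inputs (PIT and RTC), a cached RTC
level and a control line towards the PIT. The PIT re-raising its own
output after being re-enabled is deliberately not modelled. All
Toy*/toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_PIT_INT = 0, TOY_RTC_INT = 1 };

typedef struct {
    bool legacy;        /* legacy replacement mode active */
    int rtc_irq_level;  /* RTC level remembered while suppressed */
    int pit_enabled;    /* control line from the HPET to the PIT */
    int irq0;           /* level seen by the interrupt controller on IRQ 0 */
    int irq8;           /* level seen by the interrupt controller on IRQ 8 */
} ToyRouter;

static void toy_reset(ToyRouter *r)
{
    r->legacy = false;
    r->rtc_irq_level = 0;
    r->pit_enabled = 1;
    r->irq0 = 0;
    r->irq8 = 0;
}

/* Input side: forward PIT/RTC levels only while legacy mode is off. */
static void toy_legacy_irq(ToyRouter *r, int n, int level)
{
    assert(n == TOY_PIT_INT || n == TOY_RTC_INT);
    if (n == TOY_PIT_INT) {
        if (!r->legacy) {
            r->irq0 = level;
        }
    } else {
        r->rtc_irq_level = level;   /* always cached, even when suppressed */
        if (!r->legacy) {
            r->irq8 = level;
        }
    }
}

/* Mode switch: mask both sources on entry, replay the RTC level on exit. */
static void toy_set_legacy(ToyRouter *r, bool on)
{
    if (on && !r->legacy) {
        r->pit_enabled = 0;
        r->irq0 = 0;
        r->irq8 = 0;
    } else if (!on && r->legacy) {
        r->irq0 = 0;
        r->pit_enabled = 1;
        r->irq8 = r->rtc_irq_level;
    }
    r->legacy = on;
}
```

Only the RTC level needs caching in this model, since the PIT is expected
to recompute its output once the control line goes high again.
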
 hw/hpet.c      |   38 --
 hw/hpet_emul.h |    3 +++
 hw/i8254.c     |   44 +---
 hw/i8254.h     |    3 ---
 hw/pc.c        |   13 ++---
 5 files changed, 54 insertions(+), 47 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 42e88fd..fd12bd6 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -65,6 +65,7 @@ typedef struct HPETState {
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 uint32_t flags;
 uint8_t rtc_irq_level;
+qemu_irq pit_enabled;
 uint8_t num_timers;
 HPETTimer timer[HPET_MAX_TIMERS];
 
@@ -573,12 +574,15 @@ static void hpet_ram_write(void *opaque, target_phys_addr_t addr,
                     hpet_del_timer(&s->timer[i]);
                 }
             }
-            /* i8254 and RTC are disabled when HPET is in legacy mode */
+            /* i8254 and RTC output pins are disabled
+             * when HPET is in legacy mode */
             if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-                hpet_pit_disable();
+                qemu_set_irq(s->pit_enabled, 0);
+                qemu_irq_lower(s->irqs[0]);
                 qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
             } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-                hpet_pit_enable();
+                qemu_irq_lower(s->irqs[0]);
+                qemu_set_irq(s->pit_enabled, 1);
                 qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
             }
             break;
@@ -632,7 +636,6 @@ static void hpet_reset(DeviceState *d)
 {
     HPETState *s = FROM_SYSBUS(HPETState, sysbus_from_qdev(d));
     int i;
-    static int count = 0;
 
     for (i = 0; i < s->num_timers; i++) {
         HPETTimer *timer = &s->timer[i];
@@ -649,29 +652,27 @@ static void hpet_reset(DeviceState *d)
         timer->wrap_flag = 0;
     }
 
+    qemu_set_irq(s->pit_enabled, 1);
     s->hpet_counter = 0ULL;
     s->hpet_offset = 0ULL;
     s->config = 0ULL;
-    if (count > 0) {
-        /* we don't enable pit when hpet_reset is first called (by hpet_init)
-         * because hpet is taking over for pit here. On subsequent invocations,
-         * hpet_reset is called due to system reset. At this point control must
-         * be returned to pit until SW reenables hpet.
-         */
-        hpet_pit_enable();
-    }
     hpet_cfg.hpet[s->hpet_id].event_timer_block_id = (uint32_t)s->capability;
     hpet_cfg.hpet[s->hpet_id].address = sysbus_from_qdev(d)->mmio[0].addr;
-    count = 1;
 }
 
-static void hpet_handle_rtc_irq(void *opaque, int n, int level)
+static void hpet_handle_legacy_irq(void *opaque, int n, int level)
 {
     HPETState *s = FROM_SYSBUS(HPETState, opaque);
 
-    s->rtc_irq_level = level;
-    if (!hpet_in_legacy_mode(s)) {
-        qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
+    if (n == HPET_LEGACY_PIT_INT) {
+        if (!hpet_in_legacy_mode(s)) {
+            qemu_set_irq(s->irqs[0], level);
+        }
+    } else {
+        s->rtc_irq_level = level;
+        if (!hpet_in_legacy_mode(s)) {
+            qemu_set_irq(s->irqs[RTC_ISA_IRQ], level);
+        }
     }
 }
 
@@ -714,7 +715,8 @@ static int hpet_init(SysBusDevice *dev)
     s->capability |= (s->num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
     s->capability |= ((HPET_CLK_PERIOD) << 32);
 
-    qdev_init_gpio_in(&dev->qdev, hpet_handle_rtc_irq, 1);
+    qdev_init_gpio_in(&dev->qdev, hpet_handle_legacy_irq, 2);
+    qdev_init_gpio_out(&dev->qdev, &s->pit_enabled, 1);
 
     /* HPET Area */
     memory_region_init_io(&s->iomem, &hpet_ram_ops, s, "hpet", 0x400);
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index 6128702..757f79f 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -22,6 +22,9 @@
 
 #define HPET_NUM_IRQ_ROUTES 32
 
+#define HPET_LEGACY_PIT_INT 0
+#define HPET_LEGACY_RTC_INT 1
+
 #define HPET_CFG_ENABLE 0x001
 #define HPET_CFG_LEGACY 0x002
 
diff --git a/hw/i8254.c b/hw/i8254.c
index dd49552..f5be0e5 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -52,6 +52,7 @@ 

[PATCH V2 3/7] i8254: Factor out interface header

2012-01-18 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

Move the public interface of the PIT into its own header file and update
all users.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/alpha_dp264.c   |    1 +
 hw/hpet.c          |    1 +
 hw/i8254.c         |    1 +
 hw/i8254.h         |   54
 hw/mips_fulong2e.c |    1 +
 hw/mips_jazz.c     |    1 +
 hw/mips_malta.c    |    1 +
 hw/mips_r4k.c      |    1 +
 hw/pc.c            |    1 +
 hw/pc.h            |   25
 hw/pcspk.c         |    1 +
 11 files changed, 63 insertions(+), 25 deletions(-)
 create mode 100644 hw/i8254.h

diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index 876335a..4c0efd3 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -14,6 +14,7 @@
 #include "sysemu.h"
 #include "mc146818rtc.h"
 #include "ide.h"
+#include "i8254.h"
 
 #define MAX_IDE_BUS 2
 
diff --git a/hw/hpet.c b/hw/hpet.c
index 1b64e6a..42e88fd 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -31,6 +31,7 @@
 #include "hpet_emul.h"
 #include "sysbus.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 
 //#define HPET_DEBUG
 #ifdef HPET_DEBUG
diff --git a/hw/i8254.c b/hw/i8254.c
index df42c07..7d5ca3a 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -25,6 +25,7 @@
 #include "pc.h"
 #include "isa.h"
 #include "qemu-timer.h"
+#include "i8254.h"
 
 //#define DEBUG_PIT
 
diff --git a/hw/i8254.h b/hw/i8254.h
new file mode 100644
index 0000000..cd3111c
--- /dev/null
+++ b/hw/i8254.h
@@ -0,0 +1,54 @@
+/*
+ * QEMU 8253/8254 interval timer emulation
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef HW_I8254_H
+#define HW_I8254_H
+
+#include "hw.h"
+#include "isa.h"
+
+#define PIT_FREQ 1193182
+
+static inline ISADevice *pit_init(ISABus *bus, int base, int irq)
+{
+    ISADevice *dev;
+
+    dev = isa_create(bus, "isa-pit");
+    qdev_prop_set_uint32(&dev->qdev, "iobase", base);
+    qdev_prop_set_uint32(&dev->qdev, "irq", irq);
+    qdev_init_nofail(&dev->qdev);
+
+    return dev;
+}
+
+void pit_set_gate(ISADevice *dev, int channel, int val);
+int pit_get_gate(ISADevice *dev, int channel);
+int pit_get_initial_count(ISADevice *dev, int channel);
+int pit_get_mode(ISADevice *dev, int channel);
+int pit_get_out(ISADevice *dev, int channel, int64_t current_time);
+
+void hpet_pit_disable(void);
+void hpet_pit_enable(void);
+
+#endif /* !HW_I8254_H */
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index 163a668..ead72ae 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -40,6 +40,7 @@
 #include "elf.h"
 #include "vt82c686.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 #include "blockdev.h"
 #include "exec-memory.h"
 
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 63165b9..61dee4d 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -36,6 +36,7 @@
 #include "mips-bios.h"
 #include "loader.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 #include "blockdev.h"
 #include "sysbus.h"
 #include "exec-memory.h"
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index e625ec3..7ddfc3a 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -45,6 +45,7 @@
 #include "loader.h"
 #include "elf.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 #include "blockdev.h"
 #include "exec-memory.h"
 #include "sysbus.h" /* SysBusDevice */
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index 1c0615c..1b3ec2d 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -22,6 +22,7 @@
 #include "loader.h"
 #include "elf.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 #include "blockdev.h"
 #include "exec-memory.h"
 
diff --git a/hw/pc.c b/hw/pc.c
index 85304cf..ea60a7c 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -36,6 +36,7 @@
 #include "elf.h"
 #include "multiboot.h"
 #include "mc146818rtc.h"
+#include "i8254.h"
 #include "msix.h"
 #include "sysbus.h"
 #include "sysemu.h"
diff --git a/hw/pc.h b/hw/pc.h
index 13e41f1..367e750 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -81,31 +81,6 @@ typedef struct GSIState {
 
 void gsi_handler(void *opaque, int n, int level);
 

[PATCH 3/3] KVM: PPC: Add HPT preallocator

2012-01-18 Thread Alexander Graf
We're currently allocating 16MB of linear memory on demand when creating
a guest. That does work sometimes, but finding 16MB of linear memory
available in the system at runtime is definitely not a given.

So let's add another command line option, similar to the RMA preallocator,
that we can use to keep a pool of page tables around. Now, when a guest
gets created, it has a pretty low chance of receiving an OOM.

Signed-off-by: Alexander Graf ag...@suse.de
---
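Note (not part of the patch): a userspace sketch of the allocation policy
described above, i.e. try the preallocated pool first and fall back to a
dynamic allocation only when the pool is empty. Pool size, region size
and all Toy*/toy_* names are made up; the real code reserves its regions
at boot and uses kernel allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_POOL_SLOTS 2
#define TOY_HPT_BYTES  4096  /* stand-in for the 16MB region in the text */

/* One preallocated region plus its bookkeeping. */
typedef struct {
    unsigned char mem[TOY_HPT_BYTES];
    bool in_use;
} ToySlot;

static ToySlot toy_pool[TOY_POOL_SLOTS];

/* What the caller keeps: the memory and, if pooled, where it came from. */
typedef struct {
    void *base;
    ToySlot *slot;  /* NULL when the memory was allocated dynamically */
} ToyHpt;

static bool toy_alloc_hpt(ToyHpt *h)
{
    size_t i;

    /* Prefer a preallocated slot; hand it out zeroed. */
    for (i = 0; i < TOY_POOL_SLOTS; i++) {
        if (!toy_pool[i].in_use) {
            toy_pool[i].in_use = true;
            memset(toy_pool[i].mem, 0, sizeof(toy_pool[i].mem));
            h->slot = &toy_pool[i];
            h->base = toy_pool[i].mem;
            return true;
        }
    }

    /* Pool exhausted: fall back to a dynamic, zeroed allocation. */
    h->slot = NULL;
    h->base = calloc(1, TOY_HPT_BYTES);
    return h->base != NULL;
}

/* Release must mirror the path that was taken on allocation. */
static void toy_free_hpt(ToyHpt *h)
{
    if (h->slot) {
        assert(h->slot->in_use);
        h->slot->in_use = false;
    } else {
        free(h->base);
    }
    h->base = NULL;
    h->slot = NULL;
}
```

Remembering the origin in the handle is what lets the free path choose
between returning a slot to the pool and freeing dynamic memory, which is
the role kvm->arch.hpt_li plays in the diff below.
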
 arch/powerpc/include/asm/kvm_host.h  |    1 +
 arch/powerpc/include/asm/kvm_ppc.h   |    2 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |   18 ++-
 arch/powerpc/kvm/book3s_hv_builtin.c |   39 +-
 4 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8221e71..1843d5d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -235,6 +235,7 @@ struct kvm_arch {
 	int slot_npages[KVM_MEM_SLOTS_NUM];
 	unsigned short last_vcpu[NR_CPUS];
 	struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
+	struct kvmppc_linear_info *hpt_li;
 #endif /* CONFIG_KVM_BOOK3S_64_HV */
 };
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 1c37a2f..9d6dee0 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -130,6 +130,8 @@ extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
 				struct kvm_allocate_rma *rma);
 extern struct kvmppc_linear_info *kvm_alloc_rma(void);
 extern void kvm_release_rma(struct kvmppc_linear_info *ri);
+extern struct kvmppc_linear_info *kvm_alloc_hpt(void);
+extern void kvm_release_hpt(struct kvmppc_linear_info *li);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 783cd35..b3207c7 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -44,10 +44,23 @@ long kvmppc_alloc_hpt(struct kvm *kvm)
 	unsigned long hpt;
 	unsigned long lpid;
 	struct revmap_entry *rev;
+	struct kvmppc_linear_info *li;
 
 	/* Allocate guest's hashed page table */
+
+	/* using preallocated memory */
+	li = kvm_alloc_hpt();
+	if (li) {
+		hpt = (ulong)li->base_virt;
+		kvm->arch.hpt_li = li;
+		goto has_hpt;
+	}
+
+	/* using dynamic memory */
 	hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN,
 			       HPT_ORDER - PAGE_SHIFT);
+
+has_hpt:
 	if (!hpt) {
 		pr_err("kvm_alloc_hpt: Couldn't alloc HPT\n");
 		return -ENOMEM;
@@ -88,7 +101,10 @@ void kvmppc_free_hpt(struct kvm *kvm)
 {
 	clear_bit(kvm->arch.lpid, lpid_inuse);
 	vfree(kvm->arch.revmap);
-	free_pages(kvm->arch.hpt_virt, HPT_ORDER - PAGE_SHIFT);
+	if (kvm->arch.hpt_li)
+		kvm_release_hpt(kvm->arch.hpt_li);
+	else
+		free_pages(kvm->arch.hpt_virt, HPT_ORDER - PAGE_SHIFT);
 }
 
 /* Bits in first HPTE dword for pagesize 4k, 64k or 16M */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 7caed1d..bed1279 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -19,6 +19,7 @@
 #include <asm/kvm_book3s.h>
 
 #define KVM_LINEAR_RMA		0
+#define KVM_LINEAR_HPT		1
 
 static void __init kvm_linear_init_one(ulong size, int count, int type);
 static struct kvmppc_linear_info *kvm_alloc_linear(int type);
@@ -97,6 +98,39 @@ void kvm_release_rma(struct kvmppc_linear_info *ri)
 }
 EXPORT_SYMBOL_GPL(kvm_release_rma);
 
+/*** HPT */
+
+/*
+ * This maintains a list of big linear HPT tables that contain the GVA->HPA
+ * memory mappings. If we don't reserve those early on, we might not be able
+ * to get a big (usually 16MB) linear memory region from the kernel anymore.
+ */
+
+static unsigned long kvm_hpt_count;
+
+static int __init early_parse_hpt_count(char *p)
+{
+	if (!p)
+		return 1;
+
+	kvm_hpt_count = simple_strtoul(p, NULL, 0);
+
+	return 0;
+}
+early_param("kvm_hpt_count", early_parse_hpt_count);
+
+struct kvmppc_linear_info *kvm_alloc_hpt(void)
+{
+	return kvm_alloc_linear(KVM_LINEAR_HPT);
+}
+EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
+
+void kvm_release_hpt(struct kvmppc_linear_info *li)
+{
+	kvm_release_linear(li);
+}
+EXPORT_SYMBOL_GPL(kvm_release_hpt);
+
 /*** generic */
 
 static LIST_HEAD(free_linears);
@@ -114,7 +148,7 @@ static void __init kvm_linear_init_one(ulong size, int count, int type)
 	if (!count)
 		return;
 
-	typestr = (type == KVM_LINEAR_RMA) ? "RMA" : "";
+   

[PATCH 0/3] KVM: PPC: Book3s: HV: Add HPT preallocation

2012-01-18 Thread Alexander Graf
While using the book3s hv code on a 970 system, we quickly ran into situations
where we didn't have enough contiguous memory available to allocate a 16MB
region for the page table we need to manage the guest's memory.

So I went ahead, cleaned up the code we currently use to preallocate RMAs and
made it usable for HPT allocation. This patch set is the result of it. As a
nice side effect, it also solves a potential security issue on 970 where we
reused the RMA of previous guests without clearing them.

These patches are in use in our internal build service testbed and work
smoothly, giving us a working setup based on 970.

Alexander Graf (3):
  KVM: PPC: Convert RMA allocation into generic code
  KVM: PPC: Initialize linears with zeros
  KVM: PPC: Add HPT preallocator

 arch/powerpc/include/asm/kvm_host.h  |    8 +-
 arch/powerpc/include/asm/kvm_ppc.h   |   10 +-
 arch/powerpc/kernel/setup_64.c       |    2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |   18 +++-
 arch/powerpc/kvm/book3s_hv.c         |    8 +-
 arch/powerpc/kvm/book3s_hv_builtin.c |  209 +++---
 6 files changed, 174 insertions(+), 81 deletions(-)



[PATCH 2/3] KVM: PPC: Initialize linears with zeros

2012-01-18 Thread Alexander Graf
RMAs and HPT preallocated spaces should be zeroed, so we don't accidentally
leak information from previous VM executions.

Signed-off-by: Alexander Graf ag...@suse.de
---
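Note (not part of the patch): a small sketch of the zeroing step, mostly
to spell out the size arithmetic. A region is described by a page count,
so the byte length is npages shifted left by the page shift. The 4K page
size and all Toy*/toy_* names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SHIFT 12  /* assumes 4K pages for the example */

/* Hypothetical descriptor of a preallocated linear region. */
typedef struct {
    void *base_virt;
    unsigned long npages;
} ToyLinear;

/* Page count to byte count. */
static size_t toy_linear_bytes(const ToyLinear *li)
{
    return (size_t)li->npages << TOY_PAGE_SHIFT;
}

/* Scrub a region before handing it to a new user. */
static void toy_hand_out(ToyLinear *li)
{
    assert(li != NULL && li->base_virt != NULL);
    memset(li->base_virt, 0, toy_linear_bytes(li));
}
```

The sketch checks for a missing region before touching it; whether the
caller can ever pass none is outside what this example models.
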
 arch/powerpc/kvm/book3s_hv_builtin.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1c7e6ab..7caed1d 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -152,6 +152,7 @@ static struct kvmppc_linear_info *kvm_alloc_linear(int type)
 		break;
 	}
 	spin_unlock(&linear_lock);
+	memset(ri->base_virt, 0, ri->npages << PAGE_SHIFT);
 	return ri;
 }
 
-- 
1.6.0.2



[PATCH 1/3] KVM: PPC: Convert RMA allocation into generic code

2012-01-18 Thread Alexander Graf
We have code to allocate big chunks of linear memory on bootup for later use.
This code is currently used for RMA allocation, but can be useful beyond
that.

Make it generic so we can reuse it for other stuff later.

Signed-off-by: Alexander Graf ag...@suse.de
---
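Note (not part of the patch): a userspace sketch of what "generic" means
here, assuming a single free list whose entries carry a type tag so that
several consumers can share the same allocator. Locking is left out and
all Toy*/toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_LINEAR_RMA = 0, TOY_LINEAR_HPT = 1 };

/* Hypothetical region descriptor: a type tag, a use count, a list link. */
typedef struct ToyRegion {
    int type;
    int use_count;
    struct ToyRegion *next;
} ToyRegion;

static ToyRegion *toy_free_list;

/* Register a region of the given type as available. */
static void toy_add(ToyRegion *r, int type)
{
    r->type = type;
    r->use_count = 0;
    r->next = toy_free_list;
    toy_free_list = r;
}

/* Take the first free region of the requested type, or NULL if none. */
static ToyRegion *toy_alloc(int type)
{
    ToyRegion **pp;

    for (pp = &toy_free_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->type == type) {
            ToyRegion *r = *pp;

            *pp = r->next;      /* unlink from the free list */
            r->next = NULL;
            r->use_count++;
            return r;
        }
    }
    return NULL;
}

/* Drop a reference; the last user puts the region back on the list. */
static void toy_release(ToyRegion *r)
{
    assert(r->use_count > 0);
    if (--r->use_count == 0) {
        r->next = toy_free_list;
        toy_free_list = r;
    }
}
```

With the type carried in the descriptor, adding a new kind of region only
needs a new tag and a pair of thin wrappers around toy_alloc() and
toy_release().
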
 arch/powerpc/include/asm/kvm_host.h  |    7 +-
 arch/powerpc/include/asm/kvm_ppc.h   |    8 +-
 arch/powerpc/kernel/setup_64.c       |    2 +-
 arch/powerpc/kvm/book3s_hv.c         |    8 +-
 arch/powerpc/kvm/book3s_hv_builtin.c |  175 --
 5 files changed, 118 insertions(+), 82 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index af438b1..8221e71 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -173,12 +173,13 @@ struct kvmppc_spapr_tce_table {
 	struct page *pages[0];
 };
 
-struct kvmppc_rma_info {
+struct kvmppc_linear_info {
 	void		*base_virt;
 	unsigned long	 base_pfn;
 	unsigned long	 npages;
 	struct list_head list;
-	atomic_t use_count;
+	atomic_t	 use_count;
+	int		 type;
 };
 
 /*
@@ -224,7 +225,7 @@ struct kvm_arch {
 	int tlbie_lock;
 	unsigned long lpcr;
 	unsigned long rmor;
-	struct kvmppc_rma_info *rma;
+	struct kvmppc_linear_info *rma;
 	unsigned long vrma_slb_v;
 	int rma_setup_done;
 	int using_mmu_notifiers;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index a61b5b5..1c37a2f 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -128,8 +128,8 @@ extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce *args);
 extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
struct kvm_allocate_rma *rma);
-extern struct kvmppc_rma_info *kvm_alloc_rma(void);
-extern void kvm_release_rma(struct kvmppc_rma_info *ri);
+extern struct kvmppc_linear_info *kvm_alloc_rma(void);
+extern void kvm_release_rma(struct kvmppc_linear_info *ri);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
@@ -187,13 +187,13 @@ static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 	paca[cpu].kvm_hstate.xics_phys = addr;
 }
 
-extern void kvm_rma_init(void);
+extern void kvm_linear_init(void);
 
 #else
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {}
 
-static inline void kvm_rma_init(void)
+static inline void kvm_linear_init(void)
 {}
 #endif
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cb8f1e..4721b0c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -598,7 +598,7 @@ void __init setup_arch(char **cmdline_p)
/* Initialize the MMU context management stuff */
mmu_context_init();
 
-   kvm_rma_init();
+   kvm_linear_init();
 
 	ppc64_boot_msg(0x15, "Setup Done");
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3580db8..ce1cac7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1055,7 +1055,7 @@ static inline int lpcr_rmls(unsigned long rma_size)
 
 static int kvm_rma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
-	struct kvmppc_rma_info *ri = vma->vm_file->private_data;
+	struct kvmppc_linear_info *ri = vma->vm_file->private_data;
 	struct page *page;
 
 	if (vmf->pgoff >= ri->npages)
@@ -1080,7 +1080,7 @@ static int kvm_rma_mmap(struct file *file, struct vm_area_struct *vma)
 
 static int kvm_rma_release(struct inode *inode, struct file *filp)
 {
-	struct kvmppc_rma_info *ri = filp->private_data;
+	struct kvmppc_linear_info *ri = filp->private_data;
 
 	kvm_release_rma(ri);
 	return 0;
@@ -1093,7 +1093,7 @@ static struct file_operations kvm_rma_fops = {
 
 long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
 {
-	struct kvmppc_rma_info *ri;
+	struct kvmppc_linear_info *ri;
 	long fd;
 
 	ri = kvm_alloc_rma();
@@ -1212,7 +1212,7 @@ static int kvmppc_hv_setup_rma(struct kvm_vcpu *vcpu)
 {
 	int err = 0;
 	struct kvm *kvm = vcpu->kvm;
-	struct kvmppc_rma_info *ri = NULL;
+	struct kvmppc_linear_info *ri = NULL;
 	unsigned long hva;
 	struct kvm_memory_slot *memslot;
 	struct vm_area_struct *vma;
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index a795a13..1c7e6ab 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -18,6 +18,14 @@
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
 
+#define KVM_LINEAR_RMA		0
+
+static void __init kvm_linear_init_one(ulong size, int count, 

[PATCH] KVM: PPC: E500: Fail init when not on e500v2

2012-01-18 Thread Alexander Graf
When enabling the current KVM code on e500mc, I get the following oops:

Oops: Exception in kernel mode, sig: 4 [#1]
SMP NR_CPUS=8 P2041 RDB
Modules linked in:
NIP: c067df4c LR: c067df44 CTR: 
REGS: ee055ed0 TRAP: 0700   Not tainted  (3.2.0-10391-g36c5afe)
MSR: 00029002 <CE,EE,ME>  CR: 24042022  XER: 
TASK = ee0429b0[1] 'swapper/0' THREAD: ee054000 CPU: 2
GPR00: c067df44 ee055f80 ee0429b0  0058 003f ee211600 60c6b864
GPR08: 7cc903a6 002c  0001 44042082 2d180088  
GPR16: ca00 0014 3fff 03fe9000 0015 7ff3be68 c06e 
GPR24:   1720 c067df1c c06e  ee054000 c06ab51c
NIP [c067df4c] kvmppc_e500_init+0x30/0xf8
LR [c067df44] kvmppc_e500_init+0x28/0xf8
Call Trace:
[ee055f80] [c067df44] kvmppc_e500_init+0x28/0xf8 (unreliable)
[ee055fb0] [c0001d30] do_one_initcall+0x50/0x1f0
[ee055fe0] [c06721dc] kernel_init+0xa4/0x14c
[ee055ff0] [c000e910] kernel_thread+0x4c/0x68
Instruction dump:
9421ffd0 7c0802a6 93410018 9361001c 90010034 93810020 93a10024 93c10028
93e1002c 4bfffe7d 2c03 408200a4 7c1082a6 90010008 7c1182a6 9001000c
---[ end trace b8ef4903fcbf9dd3 ]---

Since it doesn't make sense to run the init function on any non-supported
platform, we can just call our "is this platform supported?" function and
bail out of init() if it's not.

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/kvm/e500.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 709d82f..afe070c 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -229,6 +229,9 @@ static int __init kvmppc_e500_init(void)
    unsigned long ivor[3];
    unsigned long max_ivor = 0;
 
+   if (r = kvmppc_core_check_processor_compat())
+       return r;
+
    r = kvmppc_booke_init();
    if (r)
        return r;
-- 
1.6.0.2
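
The guard added by this patch can be illustrated outside the kernel. The following is only a user-space sketch: check_processor_compat(), subsystem_init() and the hw_touched flag are hypothetical stand-ins, and the error code is an assumption, not what the kernel function returns. It shows the ordering the patch relies on: ask whether the platform is supported first, and return before any setup code runs if it is not.

```c
#include <errno.h>

/* Hypothetical flag: set once "hardware" setup code has run. */
static int hw_touched;

/* Hypothetical stand-in for the platform check: 0 if supported,
 * a negative errno value otherwise (error code chosen for the sketch). */
static int check_processor_compat(int cpu_is_supported)
{
    return cpu_is_supported ? 0 : -EOPNOTSUPP;
}

/* Hypothetical stand-in for setup that would fault on the wrong core. */
static int subsystem_init(void)
{
    hw_touched = 1;
    return 0;
}

/* Init path with the guard in front of everything else. */
static int module_init_sketch(int cpu_is_supported)
{
    int r;

    r = check_processor_compat(cpu_is_supported);
    if (r)
        return r;   /* unsupported core: bail out before any setup */

    r = subsystem_init();
    if (r)
        return r;

    return 0;
}
```

On an unsupported core the sketch returns the error and leaves hw_touched at zero, which is the behaviour the oops above shows to be missing.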



Re: [PATCH RFC] kvm: deliver msix interrupts from irq handler

2012-01-18 Thread Gleb Natapov
On Wed, Jan 18, 2012 at 08:10:24PM +0200, Michael S. Tsirkin wrote:
> We can deliver certain interrupts, notably MSIX,
> from atomic context.  Add a new API kvm_set_irq_inatomic,
> that does exactly that, and use it to implement
> an irq handler for msi.
>
> This reduces the pressure on scheduler in case
> where host and guest irq share a host cpu.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>
> Untested.
> Note: this is on top of my host irq patch.
> Probably needs to be rebased to be independent
> and split up to new API + usage.
>
> ---
>  include/linux/kvm_host.h |    2 +
>  virt/kvm/assigned-dev.c  |   31 +-
>  virt/kvm/irq_comm.c      |   52 ++
>  3 files changed, 83 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f0361bc..e2b89ea 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -548,6 +548,8 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
>  #endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
>                  int host_irq);
> +int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level,
> +                         int host_irq);
>  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm *kvm,
>                  int irq_source_id, int level, int host_irq);
>  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index cc4bb7a..73bb001 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -57,6 +57,14 @@ static int find_index_from_host_irq(struct kvm_assigned_dev_kernel
>     return index;
>  }
>  
> +static irqreturn_t kvm_assigned_dev_msi(int irq, void *dev_id)
> +{
> +   int ret = kvm_set_irq_inatomic(assigned_dev->kvm,
> +                                  assigned_dev->irq_source_id,
> +                                  assigned_dev->guest_irq, 1, irq);
> +   return unlikely(ret == -EWOULDBLOCK) ? IRQ_WAKE_THREAD : IRQ_HANDLED;
> +}
> +
>  static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
>  {
>     struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
> @@ -75,6 +83,23 @@ static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
>  }
>  
>  #ifdef __KVM_HAVE_MSIX
> +static irqreturn_t kvm_assigned_dev_msix(int irq, void *dev_id)
> +{
> +   struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
> +   int index = find_index_from_host_irq(assigned_dev, irq);
> +   u32 vector;
> +   int ret = 0;
> +
> +   if (index >= 0) {
> +       vector = assigned_dev->guest_msix_entries[index].vector;
> +       ret = kvm_set_irq_inatomic(assigned_dev->kvm,
> +                                  assigned_dev->irq_source_id,
> +                                  vector, 1, irq);
> +   }
> +
> +   return unlikely(ret == -EWOULDBLOCK) ? IRQ_WAKE_THREAD : IRQ_HANDLED;
> +}
> +
>  static irqreturn_t kvm_assigned_dev_thread_msix(int irq, void *dev_id)
>  {
>     struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
> @@ -266,7 +291,8 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
>     }
>  
>     dev->host_irq = dev->dev->irq;
> -   if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
> +   if (request_threaded_irq(dev->host_irq, kvm_assigned_dev_msi,
> +                            kvm_assigned_dev_thread,
>                              0, dev->irq_name, dev)) {
>         pci_disable_msi(dev->dev);
>         return -EIO;
> @@ -293,7 +319,8 @@ static int assigned_device_enable_host_msix(struct kvm *kvm,
>  
>     for (i = 0; i < dev->entries_nr; i++) {
>         r = request_threaded_irq(dev->host_msix_entries[i].vector,
> -                                NULL, kvm_assigned_dev_thread_msix,
> +                                kvm_assigned_dev_msix,
> +                                kvm_assigned_dev_thread_msix,
>                                  0, dev->irq_name, dev);
>         if (r)
>             goto err;
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index ba892df..68cd127 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -201,6 +201,58 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
>     return ret;
>  }
>  
> +static inline struct kvm_kernel_irq_routing_entry *
> +kvm_get_entry(struct kvm *kvm, struct kvm_irq_routing_table *irq_rq, u32 irq)
> +{
> +   struct kvm_kernel_irq_routing_entry *e;
> +   if (likely(irq < irq_rt->nr_rt_entries))
> +       hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
> +           if (e->type == KVM_IRQ_ROUTING_MSI)
> +               return e;
> +           else
> +               return ERR_PTR(-EWOULDBLOCK);
> +   return ERR_PTR(-EINVAL);
> +}
Unused?

> +
> +/*
> + * Deliver an IRQ in an atomic context if we can, or return a failure,
> + * user can retry 
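
The control flow the quoted patch proposes (try delivery in the hard IRQ handler, defer to the IRQ thread only when atomic delivery is not possible) can be sketched in user space. Everything below is a simplified model with hypothetical names: route_type, handler_result and set_irq_inatomic() stand in for the kernel's routing entry type, irqreturn_t and kvm_set_irq_inatomic(), and the delivered counter replaces actual injection into the guest.

```c
#include <errno.h>

/* Hypothetical model of a routing entry type. */
enum route_type { ROUTE_MSI, ROUTE_IRQCHIP };

/* Hypothetical model of IRQ_HANDLED / IRQ_WAKE_THREAD. */
enum handler_result { HANDLED, WAKE_THREAD };

/* Model of atomic-context delivery: only MSI routes can be served
 * without sleeping; anything else reports -EWOULDBLOCK so the caller
 * can retry from a context that may block. */
static int set_irq_inatomic(enum route_type type, int *delivered)
{
    if (type != ROUTE_MSI)
        return -EWOULDBLOCK;

    (*delivered)++;
    return 0;
}

/* Model of the hard IRQ handler: deliver directly when possible,
 * otherwise ask for the threaded handler to be woken. */
static enum handler_result hard_irq_handler(enum route_type type,
                                            int *delivered)
{
    int ret = set_irq_inatomic(type, delivered);

    return ret == -EWOULDBLOCK ? WAKE_THREAD : HANDLED;
}
```

In this model the threaded handler is only needed for the non-MSI route, which matches the stated goal of reducing scheduler pressure for MSI and MSI-X.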

[PATCH 2/3] KVM: PPC: Initialize linears with zeros

2012-01-18 Thread Alexander Graf
RMAs and HPT preallocated spaces should be zeroed, so we don't accidentally
leak information from previous VM executions.

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/kvm/book3s_hv_builtin.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1c7e6ab..7caed1d 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -152,6 +152,7 @@ static struct kvmppc_linear_info *kvm_alloc_linear(int type)
        break;
    }
    spin_unlock(&linear_lock);
+   memset(ri->base_virt, 0, ri->npages << PAGE_SHIFT);
    return ri;
 }
 
-- 
1.6.0.2
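
The idea of scrubbing a reused region before handing it out can be shown with a small user-space pool. This is a sketch under assumptions, not the kernel code: struct region, region_alloc() and region_release() are hypothetical names, the pool is a fixed array rather than a list of boot-time reservations, and the sketch zeroes only when a free region was actually found.

```c
#include <stddef.h>
#include <string.h>

#define REGION_SIZE 64   /* hypothetical size; the real regions are far larger */

struct region {
    unsigned char mem[REGION_SIZE];
    int in_use;
};

/* Hypothetical fixed pool standing in for the preallocated regions. */
static struct region pool[2];

/* Hand out a free region, zeroed so that nothing written by the
 * previous owner is visible to the next one. */
static struct region *region_alloc(void)
{
    size_t i;

    for (i = 0; i < sizeof(pool) / sizeof(pool[0]); i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            memset(pool[i].mem, 0, REGION_SIZE);
            return &pool[i];
        }
    }
    return NULL;   /* pool exhausted */
}

/* Return a region to the pool; contents are left as they are and
 * get cleared on the next allocation. */
static void region_release(struct region *r)
{
    r->in_use = 0;
}
```

Zeroing at allocation time rather than at release time means a region is clean even if its previous owner never released it in an orderly way.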



[PATCH 3/3] KVM: PPC: Add HPT preallocator

2012-01-18 Thread Alexander Graf
We're currently allocating 16MB of linear memory on demand when creating
a guest. That does work sometimes, but finding 16MB of linear memory
available in the system at runtime is definitely not a given.

So let's add another command line option similar to the RMA preallocator,
that we can use to keep a pool of page tables around. Now, when a guest
gets created it has a pretty low chance of receiving an OOM.

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h  |1 +
 arch/powerpc/include/asm/kvm_ppc.h   |2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |   18 ++-
 arch/powerpc/kvm/book3s_hv_builtin.c |   39 +-
 4 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8221e71..1843d5d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -235,6 +235,7 @@ struct kvm_arch {
    int slot_npages[KVM_MEM_SLOTS_NUM];
    unsigned short last_vcpu[NR_CPUS];
    struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
+   struct kvmppc_linear_info *hpt_li;
 #endif /* CONFIG_KVM_BOOK3S_64_HV */
 };
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 1c37a2f..9d6dee0 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -130,6 +130,8 @@ extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
                struct kvm_allocate_rma *rma);
 extern struct kvmppc_linear_info *kvm_alloc_rma(void);
 extern void kvm_release_rma(struct kvmppc_linear_info *ri);
+extern struct kvmppc_linear_info *kvm_alloc_hpt(void);
+extern void kvm_release_hpt(struct kvmppc_linear_info *li);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 783cd35..b3207c7 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -44,10 +44,23 @@ long kvmppc_alloc_hpt(struct kvm *kvm)
    unsigned long hpt;
    unsigned long lpid;
    struct revmap_entry *rev;
+   struct kvmppc_linear_info *li;
 
    /* Allocate guest's hashed page table */
+
+   /* using preallocated memory */
+   li = kvm_alloc_hpt();
+   if (li) {
+       hpt = (ulong)li->base_virt;
+       kvm->arch.hpt_li = li;
+       goto has_hpt;
+   }
+
+   /* using dynamic memory */
    hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN,
                   HPT_ORDER - PAGE_SHIFT);
+
+has_hpt:
    if (!hpt) {
        pr_err("kvm_alloc_hpt: Couldn't alloc HPT\n");
        return -ENOMEM;
@@ -88,7 +101,10 @@ void kvmppc_free_hpt(struct kvm *kvm)
 {
    clear_bit(kvm->arch.lpid, lpid_inuse);
    vfree(kvm->arch.revmap);
-   free_pages(kvm->arch.hpt_virt, HPT_ORDER - PAGE_SHIFT);
+   if (kvm->arch.hpt_li)
+       kvm_release_hpt(kvm->arch.hpt_li);
+   else
+       free_pages(kvm->arch.hpt_virt, HPT_ORDER - PAGE_SHIFT);
 }
 
 /* Bits in first HPTE dword for pagesize 4k, 64k or 16M */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 7caed1d..bed1279 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -19,6 +19,7 @@
 #include <asm/kvm_book3s.h>
 
 #define KVM_LINEAR_RMA 0
+#define KVM_LINEAR_HPT 1
 
 static void __init kvm_linear_init_one(ulong size, int count, int type);
 static struct kvmppc_linear_info *kvm_alloc_linear(int type);
@@ -97,6 +98,39 @@ void kvm_release_rma(struct kvmppc_linear_info *ri)
 }
 EXPORT_SYMBOL_GPL(kvm_release_rma);
 
+/*** HPT */
+
+/*
+ * This maintains a list of big linear HPT tables that contain the GVA->HPA
+ * memory mappings. If we don't reserve those early on, we might not be able
+ * to get a big (usually 16MB) linear memory region from the kernel anymore.
+ */
+
+static unsigned long kvm_hpt_count;
+
+static int __init early_parse_hpt_count(char *p)
+{
+   if (!p)
+       return 1;
+
+   kvm_hpt_count = simple_strtoul(p, NULL, 0);
+
+   return 0;
+}
+early_param("kvm_hpt_count", early_parse_hpt_count);
+
+struct kvmppc_linear_info *kvm_alloc_hpt(void)
+{
+   return kvm_alloc_linear(KVM_LINEAR_HPT);
+}
+EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
+
+void kvm_release_hpt(struct kvmppc_linear_info *li)
+{
+   kvm_release_linear(li);
+}
+EXPORT_SYMBOL_GPL(kvm_release_hpt);
+
 /*** generic */
 
 static LIST_HEAD(free_linears);
@@ -114,7 +148,7 @@ static void __init kvm_linear_init_one(ulong size, int count, int type)
    if (!count)
        return;
 
-   typestr = (type == KVM_LINEAR_RMA) ? "RMA" : "";
+   
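
The allocation order this patch introduces (preallocated pool first, dynamic allocation as a fallback, and a free path that remembers which of the two was used) can be modelled in user space. The sketch below is illustrative only: struct guest, guest_alloc_hpt(), guest_free_hpt() and the single static reserved region are hypothetical, and calloc()/free() stand in for the kernel page allocator.

```c
#include <errno.h>
#include <stdlib.h>

#define HPT_BYTES 256   /* hypothetical size; the real table is 16MB */

struct linear_info {
    void *base;
    int in_use;
};

/* One region reserved up front, standing in for the boot-time pool. */
static unsigned char reserved_mem[HPT_BYTES];
static struct linear_info reserved = { reserved_mem, 0 };

struct guest {
    void *hpt;
    struct linear_info *hpt_li;   /* non-NULL only if the pool was used */
};

static struct linear_info *pool_alloc(void)
{
    if (reserved.in_use)
        return NULL;
    reserved.in_use = 1;
    return &reserved;
}

/* Prefer preallocated memory, fall back to the dynamic allocator. */
static int guest_alloc_hpt(struct guest *g)
{
    struct linear_info *li = pool_alloc();

    if (li) {
        g->hpt = li->base;
        g->hpt_li = li;
        return 0;
    }

    g->hpt_li = NULL;
    g->hpt = calloc(1, HPT_BYTES);
    return g->hpt ? 0 : -ENOMEM;
}

/* Release through whichever path the table came from. */
static void guest_free_hpt(struct guest *g)
{
    if (g->hpt_li)
        g->hpt_li->in_use = 0;
    else
        free(g->hpt);

    g->hpt = NULL;
    g->hpt_li = NULL;
}
```

Recording the pool entry in the guest structure is what lets the free path avoid handing pool memory back to the general allocator.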

[PATCH 0/3] KVM: PPC: Book3s: HV: Add HPT preallocation

2012-01-18 Thread Alexander Graf
While using the book3s hv code on a 970 system, we quickly ran into situations
where we didn't have enough contiguous memory available to allocate a 16MB
region for the page table we need to manage the guest's memory.

So I went ahead, cleaned up the code we currently use to preallocate RMAs and
made it usable for HPT allocation. This patch set is the result of it. As a
nice side effect, it also solves a potential security issue on 970 where we
reused the RMA of previous guests without clearing them.

These patches are in use in our internal build service testbed and work
smoothly, giving us a working setup based on 970.

Alexander Graf (3):
  KVM: PPC: Convert RMA allocation into generic code
  KVM: PPC: Initialize linears with zeros
  KVM: PPC: Add HPT preallocator

 arch/powerpc/include/asm/kvm_host.h  |8 +-
 arch/powerpc/include/asm/kvm_ppc.h   |   10 +-
 arch/powerpc/kernel/setup_64.c   |2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |   18 +++-
 arch/powerpc/kvm/book3s_hv.c |8 +-
 arch/powerpc/kvm/book3s_hv_builtin.c |  209 +++---
 6 files changed, 174 insertions(+), 81 deletions(-)



[PATCH 1/3] KVM: PPC: Convert RMA allocation into generic code

2012-01-18 Thread Alexander Graf
We have code to allocate big chunks of linear memory on bootup for later use.
This code is currently used for RMA allocation, but can be useful beyond that
extent.

Make it generic so we can reuse it for other stuff later.

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h  |7 +-
 arch/powerpc/include/asm/kvm_ppc.h   |8 +-
 arch/powerpc/kernel/setup_64.c   |2 +-
 arch/powerpc/kvm/book3s_hv.c |8 +-
 arch/powerpc/kvm/book3s_hv_builtin.c |  175 --
 5 files changed, 118 insertions(+), 82 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index af438b1..8221e71 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -173,12 +173,13 @@ struct kvmppc_spapr_tce_table {
    struct page *pages[0];
 };
 
-struct kvmppc_rma_info {
+struct kvmppc_linear_info {
    void            *base_virt;
    unsigned long    base_pfn;
    unsigned long    npages;
    struct list_head list;
-   atomic_t use_count;
+   atomic_t         use_count;
+   int              type;
 };
 
 /*
@@ -224,7 +225,7 @@ struct kvm_arch {
    int tlbie_lock;
    unsigned long lpcr;
    unsigned long rmor;
-   struct kvmppc_rma_info *rma;
+   struct kvmppc_linear_info *rma;
    unsigned long vrma_slb_v;
    int rma_setup_done;
    int using_mmu_notifiers;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index a61b5b5..1c37a2f 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -128,8 +128,8 @@ extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
                struct kvm_create_spapr_tce *args);
 extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
                struct kvm_allocate_rma *rma);
-extern struct kvmppc_rma_info *kvm_alloc_rma(void);
-extern void kvm_release_rma(struct kvmppc_rma_info *ri);
+extern struct kvmppc_linear_info *kvm_alloc_rma(void);
+extern void kvm_release_rma(struct kvmppc_linear_info *ri);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
@@ -187,13 +187,13 @@ static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
    paca[cpu].kvm_hstate.xics_phys = addr;
 }
 
-extern void kvm_rma_init(void);
+extern void kvm_linear_init(void);
 
 #else
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {}
 
-static inline void kvm_rma_init(void)
+static inline void kvm_linear_init(void)
 {}
 #endif
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cb8f1e..4721b0c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -598,7 +598,7 @@ void __init setup_arch(char **cmdline_p)
    /* Initialize the MMU context management stuff */
    mmu_context_init();
 
-   kvm_rma_init();
+   kvm_linear_init();
 
    ppc64_boot_msg(0x15, "Setup Done");
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3580db8..ce1cac7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1055,7 +1055,7 @@ static inline int lpcr_rmls(unsigned long rma_size)
 
 static int kvm_rma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
-   struct kvmppc_rma_info *ri = vma->vm_file->private_data;
+   struct kvmppc_linear_info *ri = vma->vm_file->private_data;
    struct page *page;
 
    if (vmf->pgoff >= ri->npages)
@@ -1080,7 +1080,7 @@ static int kvm_rma_mmap(struct file *file, struct vm_area_struct *vma)
 
 static int kvm_rma_release(struct inode *inode, struct file *filp)
 {
-   struct kvmppc_rma_info *ri = filp->private_data;
+   struct kvmppc_linear_info *ri = filp->private_data;
 
    kvm_release_rma(ri);
    return 0;
@@ -1093,7 +1093,7 @@ static struct file_operations kvm_rma_fops = {
 
 long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
 {
-   struct kvmppc_rma_info *ri;
+   struct kvmppc_linear_info *ri;
    long fd;
 
    ri = kvm_alloc_rma();
@@ -1212,7 +1212,7 @@ static int kvmppc_hv_setup_rma(struct kvm_vcpu *vcpu)
 {
    int err = 0;
    struct kvm *kvm = vcpu->kvm;
-   struct kvmppc_rma_info *ri = NULL;
+   struct kvmppc_linear_info *ri = NULL;
    unsigned long hva;
    struct kvm_memory_slot *memslot;
    struct vm_area_struct *vma;
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index a795a13..1c7e6ab 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -18,6 +18,14 @@
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
 
+#define KVM_LINEAR_RMA 0
+
+static void __init kvm_linear_init_one(ulong size, int count, 
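
The generalisation described in this patch, one pool of regions tagged with a type so that different users can share the allocator, can be sketched as follows. This is a user-space model under assumptions: the array-based pool, linear_alloc() and linear_release() are hypothetical, a plain int replaces the atomic use count, and the second type value is included purely for illustration since this patch itself only defines the RMA type.

```c
#include <stddef.h>

/* Region types; only the RMA type exists in this patch, the second
 * value is a hypothetical extra user added for the sketch. */
enum linear_type { LINEAR_RMA = 0, LINEAR_OTHER = 1 };

struct linear_info {
    enum linear_type type;
    int use_count;   /* plain int here; the kernel struct uses atomic_t */
};

/* Find a free region of the requested type and mark it as used. */
static struct linear_info *linear_alloc(struct linear_info *pool, size_t n,
                                        enum linear_type type)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pool[i].type == type && pool[i].use_count == 0) {
            pool[i].use_count = 1;
            return &pool[i];
        }
    }
    return NULL;   /* nothing of this type is free */
}

/* Drop the reference so the region can be handed out again. */
static void linear_release(struct linear_info *li)
{
    if (li->use_count > 0)
        li->use_count--;
}
```

Keeping the type in the region descriptor lets thin per-type wrappers, such as the RMA allocate and release functions in the patch, sit on top of one shared implementation.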
