Re: [PATCH RFC] virtio-net: remove useless disable on freeze

2012-05-31 Thread Rusty Russell
On Mon, 28 May 2012 15:53:25 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
  disable_cb is just an optimization: it
  can not guarantee that there are no callbacks.
  
  I didn't yet figure out whether a callback
  in freeze will trigger a bug, but disable_cb
  won't address it in any case. So let's remove
  the useless calls as a first step.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 Looks like this isn't in the 3.5 pull request -
 just lost in the shuffle?
 disable_cb is advisory so can't be relied upon.

I always (try to?) reply as I accept patches.

This one did slip by, but it's harmless so no need to push AFAICT.

Applied.

Thanks!
Rusty.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation/kvm : Add documentation on Hypercalls

2012-05-31 Thread Raghavendra K T
From: Raghavendra K T raghavendra...@linux.vnet.ibm.com 

Thanks Alex for KVM_HC_FEATURES inputs and Jan for VAPIC_POLL_IRQ 

Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
---
diff --git a/Documentation/virtual/kvm/hypercalls.txt 
b/Documentation/virtual/kvm/hypercalls.txt
new file mode 100644
index 000..c79335a
--- /dev/null
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -0,0 +1,46 @@
+KVM Hypercalls Documentation
+===
+The template for each hypercall is:
+1. Hypercall name, value.
+2. Architecture(s)
+3. Status (deprecated, obsolete, active)
+4. Purpose
+
+1. KVM_HC_VAPIC_POLL_IRQ
+
+Value: 1
+Architecture: x86
+Purpose: Trigger guest exit so that the host can check for pending
+interrupts on reentry.
+
+2. KVM_HC_MMU_OP
+
+Value: 2
+Architecture: x86
+Status: deprecated.
+Purpose: Support MMU operations such as writing to PTE,
+flushing TLB, release PT.
+
+3. KVM_HC_FEATURES
+
+Value: 3
+Architecture: PPC
+Status: active
+Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
+used to enumerate which hypercalls are available. On PPC, either device tree
+based lookup ( which is also what EPAPR dictates) OR KVM specific enumeration
+mechanism (which is this hypercall) can be used.
+
+4. KVM_HC_PPC_MAP_MAGIC_PAGE
+
+Value: 4
+Architecture: PPC
+Status: active
+Purpose: To enable communication between the hypervisor and guest there is a
+shared page that contains parts of supervisor visible register state.
+The guest can map this shared page to access its supervisor register through
+memory using this hypercall.
+
+TODO:
+1. more information on input and output needed?
+2. Add more detail to purpose of hypercalls.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] virtio-net: remove useless disable on freeze

2012-05-31 Thread Stephen Rothwell
Hi all,

On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell ru...@rustcorp.com.au wrote:

 On Mon, 28 May 2012 15:53:25 +0300, Michael S. Tsirkin m...@redhat.com 
 wrote:
  On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
   disable_cb is just an optimization: it
   can not guarantee that there are no callbacks.
   
   I didn't yet figure out whether a callback
   in freeze will trigger a bug, but disable_cb
   won't address it in any case. So let's remove
   the useless calls as a first step.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
  
  Looks like this isn't in the 3.5 pull request -
  just lost in the shuffle?
  disable_cb is advisory so can't be relied upon.
 
 I always (try to?) reply as I accept patches.
 
 This one did slip by, but it's harmless so no need to push AFAICT.
 
 Applied.

This patch exists in two trees in linux-next already ... Davem's net tree
(so presumably he will send it to Linus shortly) and Michael's vhost tree
(is that tree needed any more?).  Presumably it is now also in the rr
tree?

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au


pgpsAKJ5kzVVL.pgp
Description: PGP signature


Re: [PATCH RFC] virtio-net: remove useless disable on freeze

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 06:35:08PM +1000, Stephen Rothwell wrote:
 Hi all,
 
 On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell ru...@rustcorp.com.au 
 wrote:
 
  On Mon, 28 May 2012 15:53:25 +0300, Michael S. Tsirkin m...@redhat.com 
  wrote:
   On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
disable_cb is just an optimization: it
can not guarantee that there are no callbacks.

I didn't yet figure out whether a callback
in freeze will trigger a bug, but disable_cb
won't address it in any case. So let's remove
the useless calls as a first step.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
   
   Looks like this isn't in the 3.5 pull request -
   just lost in the shuffle?
   disable_cb is advisory so can't be relied upon.
  
  I always (try to?) reply as I accept patches.
  
  This one did slip by, but it's harmless so no need to push AFAICT.
  
  Applied.
 
 This patch exists in two trees in linux-next already ... Davem's net tree
 (so presumably he will send it to Linus shortly) and Michael's vhost tree
 (is that tree needed any more?).

Yes and I dropped the patch from there, just did not push yet.

  Presumably it is now also in the rr
 tree?
 
 -- 
 Cheers,
 Stephen Rothwells...@canb.auug.org.au


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] virtio-net: remove useless disable on freeze

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 11:47:17AM +0300, Michael S. Tsirkin wrote:
 On Thu, May 31, 2012 at 06:35:08PM +1000, Stephen Rothwell wrote:
  Hi all,
  
  On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell ru...@rustcorp.com.au 
  wrote:
  
   On Mon, 28 May 2012 15:53:25 +0300, Michael S. Tsirkin 
   m...@redhat.com wrote:
On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
 disable_cb is just an optimization: it
 can not guarantee that there are no callbacks.
 
 I didn't yet figure out whether a callback
 in freeze will trigger a bug, but disable_cb
 won't address it in any case. So let's remove
 the useless calls as a first step.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com

Looks like this isn't in the 3.5 pull request -
just lost in the shuffle?
disable_cb is advisory so can't be relied upon.
   
   I always (try to?) reply as I accept patches.
   
   This one did slip by, but it's harmless so no need to push AFAICT.
   
   Applied.
  
  This patch exists in two trees in linux-next already ... Davem's net tree
  (so presumably he will send it to Linus shortly) and Michael's vhost tree
  (is that tree needed any more?).
 
 Yes and I dropped the patch from there, just did not push yet.

pushed.
There's usually not too much in my tree but it helps me a lot
that it's in linux-next.

   Presumably it is now also in the rr
  tree?
  
  -- 
  Cheers,
  Stephen Rothwells...@canb.auug.org.au
 
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/3] KVM: Optimize vcpu-requests slow path slightly

2012-05-31 Thread Avi Kivity
On 05/30/2012 11:03 PM, Marcelo Tosatti wrote:
 On Sun, May 20, 2012 at 04:49:27PM +0300, Avi Kivity wrote:
 Instead of using a atomic operation per active request, use just one
 to get all requests at once, then check them with local ops.  This
 probably isn't any faster, since simultaneous requests are rare, but
 it does reduce code size.
 
 Signed-off-by: Avi Kivity a...@redhat.com
 ---
  arch/x86/kvm/x86.c |   33 ++---
  1 file changed, 18 insertions(+), 15 deletions(-)
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 953e692..c0209eb 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -5232,55 +5232,58 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
  bool req_int_win = !irqchip_in_kernel(vcpu-kvm) 
  vcpu-run-request_interrupt_window;
  bool req_immediate_exit = 0;
 +ulong reqs;
  
  if (unlikely(req_int_win))
  kvm_make_request(KVM_REQ_EVENT, vcpu);
  
  if (vcpu-requests) {
 -if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
 +reqs = xchg(vcpu-requests, 0UL);
 +
 +if (test_bit(KVM_REQ_MMU_RELOAD, reqs))
  kvm_mmu_unload(vcpu);
 -if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
 +if (test_bit(KVM_REQ_MIGRATE_TIMER, reqs))
  __kvm_migrate_timers(vcpu);
 -if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu)) {
 +if (test_bit(KVM_REQ_CLOCK_UPDATE, reqs)) {
  r = kvm_guest_time_update(vcpu);
  if (unlikely(r))
  goto out;
  }
 
 Bailing out loses requests in reqs. 


Whoops, good catch.

 
 Caching the requests makes the following type of sequence behave strangely
 
 req = xchg(vcpu-requests);
 if request is set
 request handler
 ...
 set REQ_EVENT
 ...
 
 prepare for guest entry
 vcpu-requests set
 bail


I don't really mind that.  But I do want to reduce the overhead of a
request, they're not that rare in normal workloads.

How about


   for_each_set_bit(req, vcpu-requests, BITS_PER_LONG) {
   clear_bit(bit, vcpu-requests);
   r = request_handlers[bit](vcpu);
   if (r)
 goto out;
   }

? That makes for O(1) handling since usually we only have one request
set (KVM_REQ_EVENT).  We'll make that the last one to avoid the scenario
above.

-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Autotest] [PATCH] virt: Add Fedora 17 to the list of guests

2012-05-31 Thread Miroslav Rezanina
Hi Lucas,
F17 is not working on xen hypervisor without some changes. I'm going to prepare 
fixes for xen handling.

Mirek

- Original Message -
 From: Lucas Meneghel Rodrigues l...@redhat.com
 To: autot...@test.kernel.org
 Cc: kvm@vger.kernel.org
 Sent: Wednesday, May 30, 2012 11:20:55 PM
 Subject: [Autotest] [PATCH] virt: Add Fedora 17 to the list of guests
 
 Also, make it the default guest for KVM autotest.
 
 Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
 ---
  client/tests/kvm/tests.cfg.sample |   18 -
  client/tests/libvirt/tests.cfg.sample |   21 ++--
  client/virt/guest-os.cfg.sample   |   28
  ++
  client/virt/unattended/Fedora-17.ks   |   35
  +
  client/virt/virt_utils.py |6 +++---
  5 files changed, 86 insertions(+), 22 deletions(-)
  create mode 100644 client/virt/unattended/Fedora-17.ks
 
 diff --git a/client/tests/kvm/tests.cfg.sample
 b/client/tests/kvm/tests.cfg.sample
 index 9329a05..912232e 100644
 --- a/client/tests/kvm/tests.cfg.sample
 +++ b/client/tests/kvm/tests.cfg.sample
 @@ -57,8 +57,8 @@ variants:
  # Subtest choice. You can modify that line to add more
  subtests
  only unattended_install.cdrom, boot, shutdown
  
 -# Runs qemu, f16 64 bit guest OS, install, boot, shutdown
 -- @qemu_f16_quick:
 +# Runs qemu, f17 64 bit guest OS, install, boot, shutdown
 +- @qemu_f17_quick:
  # We want qemu for this run
  qemu_binary = /usr/bin/qemu
  qemu_img_binary = /usr/bin/qemu-img
 @@ -72,13 +72,13 @@ variants:
  only no_9p_export
  only no_pci_assignable
  only smallpages
 -only Fedora.16.64
 +only Fedora.17.64
  only unattended_install.cdrom.extra_cdrom_ks, boot, shutdown
  # qemu needs -enable-kvm on the cmdline
  extra_params += ' -enable-kvm'
  
 -# Runs qemu-kvm, f16 64 bit guest OS, install, boot, shutdown
 -- @qemu_kvm_f16_quick:
 +# Runs qemu-kvm, f17 64 bit guest OS, install, boot, shutdown
 +- @qemu_kvm_f17_quick:
  # We want qemu-kvm for this run
  qemu_binary = /usr/bin/qemu-kvm
  qemu_img_binary = /usr/bin/qemu-img
 @@ -90,10 +90,10 @@ variants:
  only no_9p_export
  only no_pci_assignable
  only smallpages
 -only Fedora.16.64
 +only Fedora.17.64
  only unattended_install.cdrom.extra_cdrom_ks, boot, shutdown
  
 -# Runs qemu-kvm, f16 64 bit guest OS, install, starts qemu-kvm
 +# Runs qemu-kvm, f17 64 bit guest OS, install, starts qemu-kvm
  # with 9P support and runs 9P CI tests
  - @qemu_kvm_9p_export:
  qemu_binary = /usr/bin/qemu-kvm
 @@ -106,7 +106,7 @@ variants:
  only no_pci_assignable
  only smallpages
  only 9p_export
 -only Fedora.16.64
 +only Fedora.17.64
  only unattended_install.cdrom.extra_cdrom_ks, boot,
  9p.9p_ci, shutdown
  
  # Runs your own guest image (qcow2, can be adjusted), all
  migration tests
 @@ -129,4 +129,4 @@ variants:
  only migrate
  
  # Choose your test list from the testsets defined
 -only qemu_kvm_f16_quick
 +only qemu_kvm_f17_quick
 diff --git a/client/tests/libvirt/tests.cfg.sample
 b/client/tests/libvirt/tests.cfg.sample
 index 548caf8..2b17d50 100644
 --- a/client/tests/libvirt/tests.cfg.sample
 +++ b/client/tests/libvirt/tests.cfg.sample
 @@ -6,8 +6,8 @@
  include tests-shared.cfg
  
  variants:
 -# Runs virt-install, f16 64 bit guest OS, install, boot,
 shutdown
 -- @libvirt_f16_quick:
 +# Runs virt-install, f17 64 bit guest OS, install, boot,
 shutdown
 +- @libvirt_f17_quick:
  virt_install_binary = /usr/bin/virt-install
  qemu_img_binary = /usr/bin/qemu-img
  hvm_or_pv = hvm
 @@ -18,11 +18,12 @@ variants:
  only no_9p_export
  only no_pci_assignable
  only smallpages
 -only Fedora.16.64
 +only Fedora.17.64
  only unattended_install.cdrom.extra_cdrom_ks, boot, reboot,
  shutdown, remove_guest.with_disk
  
 -# Runs virt-install, f16 64 as a 64 bit PV guest OS, install,
 boot, shutdown
 -- @libvirt_xenpv_f16_quick:
 +
 +# Runs virt-install, f17 64 as a 64 bit PV guest OS, install,
 boot, shutdown
 +- @libvirt_xenpv_f17_quick:
  virt_install_binary = /usr/bin/virt-install
  qemu_img_binary = /usr/bin/qemu-img
  hvm_or_pv = pv
 @@ -34,12 +35,12 @@ variants:
  only no_9p_export
  only no_pci_assignable
  only smallpages
 -only Fedora.16.64
 +only Fedora.17.64
  only unattended_install.cdrom.http_ks, boot, reboot,
  shutdown, remove_guest.with_disk
  
 -# Runs virt-install, f16 64 as a 64 bit HVM (full virt) guest
 OS,
 +# Runs virt-install, f17 64 as a 64 bit HVM (full virt) guest
 OS,
  # install, boot, 

Re: [PATCHv5 5/5] kvm: host side for eoi optimization

2012-05-31 Thread Gleb Natapov
On Tue, May 22, 2012 at 05:05:47PM +0300, Michael S. Tsirkin wrote:
 Implementation of PV EOI using shared memory.
 This reduces the number of exits an interrupt
 causes as much as by half.
 
 The idea is simple: there's a bit, per APIC, in guest memory,
 that tells the guest that it does not need EOI.
 We set it before injecting an interrupt and clear
 before injecting a nested one. Guest tests it using
 a test and clear operation - this is necessary
 so that host can detect interrupt nesting -
 and if set, it can skip the EOI MSR.
 
 There's a new MSR to set the address of said register
 in guest memory. Otherwise not much changed:
 - Guest EOI is not required
 - Register is tested  ISR is automatically cleared on exit
 
 For testing results see description of previous patch
 'kvm_para: guest side for eoi avoidance'.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  arch/x86/include/asm/kvm_host.h |   12 
  arch/x86/kvm/cpuid.c|1 +
  arch/x86/kvm/lapic.c|  135 +-
  arch/x86/kvm/lapic.h|2 +
  arch/x86/kvm/trace.h|   34 ++
  arch/x86/kvm/x86.c  |5 ++
  6 files changed, 185 insertions(+), 4 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index 32297f2..3ded418 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -174,6 +174,13 @@ enum {
  
  /* apic attention bits */
  #define KVM_APIC_CHECK_VAPIC 0
 +/*
 + * The following bit is set with PV-EOI, unset on EOI.
 + * We detect PV-EOI changes by guest by comparing
 + * this bit with PV-EOI in guest memory.
 + * See the implementation in apic_update_pv_eoi.
 + */ 
 +#define KVM_APIC_PV_EOI_PENDING  1
  
  /*
   * We don't want allocation failures within the mmu code, so we preallocate
 @@ -485,6 +492,11 @@ struct kvm_vcpu_arch {
   u64 length;
   u64 status;
   } osvw;
 +
 + struct {
 + u64 msr_val;
 + struct gfn_to_hva_cache data;
 + } pv_eoi;
  };
  
  struct kvm_lpage_info {
 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
 index eba8345..a9f155a 100644
 --- a/arch/x86/kvm/cpuid.c
 +++ b/arch/x86/kvm/cpuid.c
 @@ -408,6 +408,7 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, 
 u32 function,
(1  KVM_FEATURE_NOP_IO_DELAY) |
(1  KVM_FEATURE_CLOCKSOURCE2) |
(1  KVM_FEATURE_ASYNC_PF) |
 +  (1  KVM_FEATURE_PV_EOI) |
(1  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT);
  
   if (sched_info_on())
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index 0d2985d..9d804e7 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -309,6 +309,54 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
 kvm_lapic_irq *irq)
   irq-level, irq-trig_mode);
  }
  
 +static int pv_eoi_put_user(struct kvm_vcpu *vcpu, u8 val)
 +{
 +
 + return kvm_write_guest_cached(vcpu-kvm, vcpu-arch.pv_eoi.data, val,
 +   sizeof(val));
 +}
 +
 +static int pv_eoi_get_user(struct kvm_vcpu *vcpu, u8 *val)
 +{
 +
 + return kvm_read_guest_cached(vcpu-kvm, vcpu-arch.pv_eoi.data, val,
 +   sizeof(*val));
 +}
 +
 +static inline bool pv_eoi_enabled(struct kvm_vcpu *vcpu)
 +{
 + return vcpu-arch.pv_eoi.msr_val  KVM_MSR_ENABLED;
 +}
 +
 +static bool pv_eoi_get_pending(struct kvm_vcpu *vcpu)
 +{
 + u8 val;
 + if (pv_eoi_get_user(vcpu, val)  0)
 + apic_debug(Can't read EOI MSR value: 0x%llx\n,
 +(unsigned long long)vcpi-arch.pv_eoi.msr_val);
 + return val  0x1;
 +}
 +
 +static void pv_eoi_set_pending(struct kvm_vcpu *vcpu)
 +{
 + if (pv_eoi_put_user(vcpu, KVM_PV_EOI_ENABLED)  0) {
 + apic_debug(Can't set EOI MSR value: 0x%llx\n,
 +(unsigned long long)vcpi-arch.pv_eoi.msr_val);
 + return;
 + }
 + __set_bit(KVM_APIC_PV_EOI_PENDING, vcpu-arch.apic_attention);
 +}
 +
 +static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
 +{
 + if (pv_eoi_put_user(vcpu, KVM_PV_EOI_DISABLED)  0) {
 + apic_debug(Can't clear EOI MSR value: 0x%llx\n,
 +(unsigned long long)vcpi-arch.pv_eoi.msr_val);
 + return;
 + }
 + __clear_bit(KVM_APIC_PV_EOI_PENDING, vcpu-arch.apic_attention);
 +}
 +
  static inline int apic_find_highest_isr(struct kvm_lapic *apic)
  {
   int result;
 @@ -525,15 +573,18 @@ int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, 
 struct kvm_vcpu *vcpu2)
   return vcpu1-arch.apic_arb_prio - vcpu2-arch.apic_arb_prio;
  }
  
 -static void apic_set_eoi(struct kvm_lapic *apic)
 +static int apic_set_eoi(struct kvm_lapic *apic)
  {
   int vector = apic_find_highest_isr(apic);
 +
 + trace_kvm_eoi(apic, vector);

Re: [PATCHv2] kvm: optimize ISR lookups

2012-05-31 Thread Gleb Natapov
On Wed, May 30, 2012 at 05:54:42PM -0300, Marcelo Tosatti wrote:
 On Tue, May 22, 2012 at 03:54:56PM +0300, Michael S. Tsirkin wrote:
  We perform ISR lookups twice: during interrupt
  injection and on EOI. Typical workloads only have
  a single bit set there. So we can avoid ISR scans by
  1. counting bits as we set/clear them in ISR
  2. if count is 1, caching the vector number
  3. if count != 1, invalidating the cache
  
  The real purpose of this is enabling PV EOI
  which needs to quickly validate the vector.
  But non PV guests might also benefit.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  ---
  
  I am well aware of Thomas and Peter's suggestion of reworking APIC
  register handling in kvm instead of adding a cache like this patch does.
  
  This revision does *not* address that comment yet: it only corrects a
  bug in the original patch.
  
  Posting in this form for ease of testing.
  
  Changes from v1:
  replace ASSERT by BUG_ON, correcting inverted logic
  
   arch/x86/kvm/lapic.c |   51 
  -
   arch/x86/kvm/lapic.h |2 +
   2 files changed, 51 insertions(+), 2 deletions(-)
  
  diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
  index 93c1574..0d2985d 100644
  --- a/arch/x86/kvm/lapic.c
  +++ b/arch/x86/kvm/lapic.c
  @@ -107,6 +107,16 @@ static inline void apic_clear_vector(int vec, void 
  *bitmap)
  clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
   }
   
  +static inline int __apic_test_and_set_vector(int vec, void *bitmap)
  +{
  +   return __test_and_set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
  +}
  +
  +static inline int __apic_test_and_clear_vector(int vec, void *bitmap)
  +{
  +   return __test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
  +}
  +
   static inline int apic_hw_enabled(struct kvm_lapic *apic)
   {
  return (apic)-vcpu-arch.apic_base  MSR_IA32_APICBASE_ENABLE;
  @@ -210,6 +220,16 @@ static int find_highest_vector(void *bitmap)
  return fls(word[word_offset  2]) - 1 + (word_offset  5);
   }
   
  +static u8 count_vectors(void *bitmap)
  +{
  +   u32 *word = bitmap;
  +   int word_offset;
  +   u8 count = 0;
  +   for (word_offset = 0; word_offset  MAX_APIC_VECTOR  5; ++word_offset)
  +   count += hweight32(word[word_offset  2]);
  +   return count;
  +}
  +
   static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
   {
  apic-irr_pending = true;
  @@ -242,6 +262,25 @@ static inline void apic_clear_irr(int vec, struct 
  kvm_lapic *apic)
  apic-irr_pending = true;
   }
   
  +static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
  +{
  +   if (!__apic_test_and_set_vector(vec, apic-regs + APIC_ISR))
  +   ++apic-isr_count;
  +   BUG_ON(apic-isr_count  MAX_APIC_VECTOR);
  +   if (likely(apic-isr_count == 1))
  +   apic-isr_cache = vec;
  +   else
  +   apic-isr_cache = -1;
  +}
  +
  +static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
  +{
  +   if (__apic_test_and_clear_vector(vec, apic-regs + APIC_ISR))
  +   --apic-isr_count;
  +   BUG_ON(apic-isr_count  0);
  +   apic-isr_cache = -1;
  +}
  +
   int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
   {
  struct kvm_lapic *apic = vcpu-arch.apic;
  @@ -273,6 +312,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
  kvm_lapic_irq *irq)
   static inline int apic_find_highest_isr(struct kvm_lapic *apic)
   {
  int result;
  +   if (!apic-isr_count)
  +   return -1;
  +   if (likely(apic-isr_cache != -1))
 
 assert(isr_count == 1).
 
 Looks fine otherwise. Gleb can you review please?
I am not convinced we need to keep track of isr_count. isr_cache should be 
enough.
Why setting isr_cache to -1 if isr_count  1? Just overwrite isr_cache
values with newly injected vector and use it on next EOI.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 5/5] kvm: host side for eoi optimization

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 12:57:10PM +0300, Gleb Natapov wrote:
  @@ -1345,17 +1437,40 @@ void kvm_lapic_sync_from_vapic(struct kvm_vcpu 
  *vcpu)
  apic_set_tpr(vcpu-arch.apic, data  0xff);
   }
   
  +/*
  + * apic_sync_pv_eoi_to_guest - called before vmentry
  + *
  + * Detect whether it's safe to enable PV EOI and
  + * if yes do so.
  + */
  +static void apic_sync_pv_eoi_to_guest(struct kvm_vcpu *vcpu,
  +   struct kvm_lapic *apic)
  +{
  +   if (!pv_eoi_enabled(vcpu) ||
  +   /* IRR set or many bits in ISR: could be nested. */
  +   unlikely(apic-irr_pending) ||
  +   unlikely(apic-isr_count != 1) ||
 Remind me why pv_eoi should not be set if there is more than one isr?

There's a comment below: it might be safe but
we do not bother: no easy way to know which interrupt
has higher priority.

In my testing more than one bit almost never happens in practice so not
worth optimizing for.


 
  +   /* Cache not set: safe but we don't bother. */
  +   unlikely(apic-isr_cache == -1) ||
  +   /* Need EOI to update ioapic. */
  +   unlikely(kvm_ioapic_handles_vector(vcpu-kvm, apic-isr_cache)))
  +   return;
  +
  +   pv_eoi_set_pending(apic-vcpu);
  +}
  +
 apic_sync_pv_eoi_to_guest() is not paired with
 apic_sync_pv_eoi_from_guest() if event injection is canceled.
 You can enter guest with stale pv_eoi bit.

Never. The pv_eoi bit is cleared on each exit.
It will stay cleared unless we set it here.
I will add a comment.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] kvm: optimize ISR lookups

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 01:06:57PM +0300, Gleb Natapov wrote:
 On Wed, May 30, 2012 at 05:54:42PM -0300, Marcelo Tosatti wrote:
  On Tue, May 22, 2012 at 03:54:56PM +0300, Michael S. Tsirkin wrote:
   We perform ISR lookups twice: during interrupt
   injection and on EOI. Typical workloads only have
   a single bit set there. So we can avoid ISR scans by
   1. counting bits as we set/clear them in ISR
   2. if count is 1, caching the vector number
   3. if count != 1, invalidating the cache
   
   The real purpose of this is enabling PV EOI
   which needs to quickly validate the vector.
   But non PV guests might also benefit.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
   ---
   
   I am well aware of Thomas and Peter's suggestion of reworking APIC
   register handling in kvm instead of adding a cache like this patch does.
   
   This revision does *not* address that comment yet: it only corrects a
   bug in the original patch.
   
   Posting in this form for ease of testing.
   
   Changes from v1:
 replace ASSERT by BUG_ON, correcting inverted logic
   
arch/x86/kvm/lapic.c |   51 
   -
arch/x86/kvm/lapic.h |2 +
2 files changed, 51 insertions(+), 2 deletions(-)
   
   diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
   index 93c1574..0d2985d 100644
   --- a/arch/x86/kvm/lapic.c
   +++ b/arch/x86/kvm/lapic.c
   @@ -107,6 +107,16 @@ static inline void apic_clear_vector(int vec, void 
   *bitmap)
 clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
}

   +static inline int __apic_test_and_set_vector(int vec, void *bitmap)
   +{
   + return __test_and_set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
   +}
   +
   +static inline int __apic_test_and_clear_vector(int vec, void *bitmap)
   +{
   + return __test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
   +}
   +
static inline int apic_hw_enabled(struct kvm_lapic *apic)
{
 return (apic)-vcpu-arch.apic_base  MSR_IA32_APICBASE_ENABLE;
   @@ -210,6 +220,16 @@ static int find_highest_vector(void *bitmap)
 return fls(word[word_offset  2]) - 1 + (word_offset  5);
}

   +static u8 count_vectors(void *bitmap)
   +{
   + u32 *word = bitmap;
   + int word_offset;
   + u8 count = 0;
   + for (word_offset = 0; word_offset  MAX_APIC_VECTOR  5; ++word_offset)
   + count += hweight32(word[word_offset  2]);
   + return count;
   +}
   +
static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
{
 apic-irr_pending = true;
   @@ -242,6 +262,25 @@ static inline void apic_clear_irr(int vec, struct 
   kvm_lapic *apic)
 apic-irr_pending = true;
}

   +static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
   +{
   + if (!__apic_test_and_set_vector(vec, apic-regs + APIC_ISR))
   + ++apic-isr_count;
   + BUG_ON(apic-isr_count  MAX_APIC_VECTOR);
   + if (likely(apic-isr_count == 1))
   + apic-isr_cache = vec;
   + else
   + apic-isr_cache = -1;
   +}
   +
   +static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
   +{
   + if (__apic_test_and_clear_vector(vec, apic-regs + APIC_ISR))
   + --apic-isr_count;
   + BUG_ON(apic-isr_count  0);
   + apic-isr_cache = -1;
   +}
   +
int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
{
 struct kvm_lapic *apic = vcpu-arch.apic;
   @@ -273,6 +312,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
   kvm_lapic_irq *irq)
static inline int apic_find_highest_isr(struct kvm_lapic *apic)
{
 int result;
   + if (!apic-isr_count)
   + return -1;
   + if (likely(apic-isr_cache != -1))
  
  assert(isr_count == 1).
  
  Looks fine otherwise. Gleb can you review please?
 I am not convinced we need to keep track of isr_count. isr_cache should be 
 enough.
 Why setting isr_cache to -1 if isr_count  1?

So that data path can check cache and if != -1 know it
is valid: we check the cache more than we set it.

I'll add a comment to explain this.

 Just overwrite isr_cache
 values with newly injected vector and use it on next EOI.
 --
   Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 5/5] kvm: host side for eoi optimization

2012-05-31 Thread Gleb Natapov
On Thu, May 31, 2012 at 01:11:11PM +0300, Michael S. Tsirkin wrote:
 On Thu, May 31, 2012 at 12:57:10PM +0300, Gleb Natapov wrote:
   @@ -1345,17 +1437,40 @@ void kvm_lapic_sync_from_vapic(struct kvm_vcpu 
   *vcpu)
 apic_set_tpr(vcpu-arch.apic, data  0xff);
}

   +/*
   + * apic_sync_pv_eoi_to_guest - called before vmentry
   + *
   + * Detect whether it's safe to enable PV EOI and
   + * if yes do so.
   + */
   +static void apic_sync_pv_eoi_to_guest(struct kvm_vcpu *vcpu,
   + struct kvm_lapic *apic)
   +{
   + if (!pv_eoi_enabled(vcpu) ||
   + /* IRR set or many bits in ISR: could be nested. */
   + unlikely(apic-irr_pending) ||
   + unlikely(apic-isr_count != 1) ||
  Remind me why pv_eoi should not be set if there is more than one isr?
 
 There's a comment below: it might be safe but
 we do not bother: no easy way to know which interrupt
 has higher priority.
Last injected interrupt has highest priority.

 
 In my testing more than one bit almost never happens in practice so not
 worth optimizing for.
 
 
  
   + /* Cache not set: safe but we don't bother. */
   + unlikely(apic-isr_cache == -1) ||
   + /* Need EOI to update ioapic. */
   + unlikely(kvm_ioapic_handles_vector(vcpu-kvm, apic-isr_cache)))
   + return;
   +
   + pv_eoi_set_pending(apic-vcpu);
   +}
   +
  apic_sync_pv_eoi_to_guest() is not paired with
  apic_sync_pv_eoi_from_guest() if event injection is canceled.
  You can enter guest with stale pv_eoi bit.
 
 Never. The pv_eoi bit is cleared on each exit.
There will be no exit since there will be no entry. Search for
goto out after kvm_lapic_sync_to_vapic().

 It will stay cleared unless we set it here.
 I will add a comment.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix migration of real mode guests from host with unrestricted guest to a host without it.

2012-05-31 Thread Orit Wasserman
For example migration between Westmere and Nehelem hosts.
The patch fixes the guest segments similar to enter_rmode function.

Signed-off-by: Orit Wasserman owass...@rehdat.com
---
 arch/x86/kvm/vmx.c |   38 ++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 32eb588..2eca18f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3229,6 +3229,44 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
 
vmcs_write32(sf-ar_bytes, ar);
__clear_bit(VCPU_EXREG_CPL, (ulong *)vcpu-arch.regs_avail);
+
+   /*
+* Fix segments for real mode guest in hosts that don't have
+* unrestricted_mode or it was disabled.
+* This is done to allow migration of the guests from hosts with
+* unrestricted guest like Westmere to older host that don't have
+* unrestricted guest like Nehelem.
+*/
+   if (!enable_unrestricted_guest  vmx-rmode.vm86_active) {
+   switch (seg) {
+   case VCPU_SREG_CS:
+   vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
+   vmcs_write32(GUEST_CS_LIMIT, 0x);
+   if (vmcs_readl(GUEST_CS_BASE) == 0x)
+   vmcs_writel(GUEST_CS_BASE, 0xf);
+   vmcs_write16(GUEST_CS_SELECTOR,
+vmcs_readl(GUEST_CS_BASE)  4);
+   break;
+   case VCPU_SREG_ES:
+   fix_rmode_seg(VCPU_SREG_ES, vmx-rmode.es);
+   break;
+   case VCPU_SREG_DS:
+   fix_rmode_seg(VCPU_SREG_DS, vmx-rmode.ds);
+   break;
+   case VCPU_SREG_GS:
+   fix_rmode_seg(VCPU_SREG_GS, vmx-rmode.gs);
+   break;
+   case VCPU_SREG_FS:
+   fix_rmode_seg(VCPU_SREG_FS, vmx-rmode.fs);
+   break;
+   case VCPU_SREG_SS:
+   vmcs_write16(GUEST_SS_SELECTOR,
+vmcs_readl(GUEST_SS_BASE)  4);
+   vmcs_write32(GUEST_SS_LIMIT, 0x);
+   vmcs_write32(GUEST_SS_AR_BYTES, 0xf3);
+   break;
+   }
+   }
 }
 
 static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Fix migration of real mode guests from host with unrestricted guest to a host without it.

2012-05-31 Thread Orit Wasserman
For example migration between Westmere and Nehelem hosts.

The code that fixes the segments for real mode guest was moved from enter_rmode
to vmx_set_segments. enter_rmode calls vmx_set_segments for each segment.

Signed-off-by: Orit Wasserman owass...@rehdat.com
---
 arch/x86/kvm/vmx.c |   70 +++-
 1 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 32eb588..548c9b5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -615,6 +615,10 @@ static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
 static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
+static void vmx_set_segment(struct kvm_vcpu *vcpu,
+   struct kvm_segment *var, int seg);
+static void vmx_get_segment(struct kvm_vcpu *vcpu,
+   struct kvm_segment *var, int seg);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -2770,6 +2774,7 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 {
unsigned long flags;
struct vcpu_vmx *vmx = to_vmx(vcpu);
+   struct kvm_segment var;
 
if (enable_unrestricted_guest)
return;
@@ -2813,20 +2818,23 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
if (emulate_invalid_guest_state)
goto continue_rmode;
 
-   vmcs_write16(GUEST_SS_SELECTOR, vmcs_readl(GUEST_SS_BASE)  4);
-   vmcs_write32(GUEST_SS_LIMIT, 0x);
-   vmcs_write32(GUEST_SS_AR_BYTES, 0xf3);
+   vmx_get_segment(vcpu, var, VCPU_SREG_SS);
+   vmx_set_segment(vcpu, var, VCPU_SREG_SS);
+
+   vmx_get_segment(vcpu, var, VCPU_SREG_CS);
+   vmx_set_segment(vcpu, var, VCPU_SREG_CS);
+
+   vmx_get_segment(vcpu, var, VCPU_SREG_ES);
+   vmx_set_segment(vcpu, var, VCPU_SREG_ES);
 
-   vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
-   vmcs_write32(GUEST_CS_LIMIT, 0x);
-   if (vmcs_readl(GUEST_CS_BASE) == 0x)
-   vmcs_writel(GUEST_CS_BASE, 0xf);
-   vmcs_write16(GUEST_CS_SELECTOR, vmcs_readl(GUEST_CS_BASE)  4);
+   vmx_get_segment(vcpu, var, VCPU_SREG_DS);
+   vmx_set_segment(vcpu, var, VCPU_SREG_DS);
 
-   fix_rmode_seg(VCPU_SREG_ES, vmx-rmode.es);
-   fix_rmode_seg(VCPU_SREG_DS, vmx-rmode.ds);
-   fix_rmode_seg(VCPU_SREG_GS, vmx-rmode.gs);
-   fix_rmode_seg(VCPU_SREG_FS, vmx-rmode.fs);
+   vmx_get_segment(vcpu, var, VCPU_SREG_GS);
+   vmx_set_segment(vcpu, var, VCPU_SREG_GS);
+
+   vmx_get_segment(vcpu, var, VCPU_SREG_FS);
+   vmx_set_segment(vcpu, var, VCPU_SREG_FS);
 
 continue_rmode:
kvm_mmu_reset_context(vcpu);
@@ -3229,6 +3237,44 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
 
vmcs_write32(sf-ar_bytes, ar);
__clear_bit(VCPU_EXREG_CPL, (ulong *)vcpu-arch.regs_avail);
+
+   /*
+* Fix segments for real mode guest in hosts that don't have
+* unrestricted_mode or it was disabled.
+* This is done to allow migration of the guests from hosts with
+* unrestricted guest like Westmere to older host that don't have
+* unrestricted guest like Nehelem.
+*/
+   if (!enable_unrestricted_guest  vmx-rmode.vm86_active) {
+   switch (seg) {
+   case VCPU_SREG_CS:
+   vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
+   vmcs_write32(GUEST_CS_LIMIT, 0x);
+   if (vmcs_readl(GUEST_CS_BASE) == 0x)
+   vmcs_writel(GUEST_CS_BASE, 0xf);
+   vmcs_write16(GUEST_CS_SELECTOR,
+vmcs_readl(GUEST_CS_BASE)  4);
+   break;
+   case VCPU_SREG_ES:
+   fix_rmode_seg(VCPU_SREG_ES, vmx-rmode.es);
+   break;
+   case VCPU_SREG_DS:
+   fix_rmode_seg(VCPU_SREG_DS, vmx-rmode.ds);
+   break;
+   case VCPU_SREG_GS:
+   fix_rmode_seg(VCPU_SREG_GS, vmx-rmode.gs);
+   break;
+   case VCPU_SREG_FS:
+   fix_rmode_seg(VCPU_SREG_FS, vmx-rmode.fs);
+   break;
+   case VCPU_SREG_SS:
+   vmcs_write16(GUEST_SS_SELECTOR,
+vmcs_readl(GUEST_SS_BASE)  4);
+   vmcs_write32(GUEST_SS_LIMIT, 0x);
+   vmcs_write32(GUEST_SS_AR_BYTES, 0xf3);
+   break;
+   }
+   }
 }
 
 static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

Re: [PATCH] Fix migration of real mode guests from host with unrestricted guest to a host without it.

2012-05-31 Thread Avi Kivity
On 05/31/2012 02:13 PM, Orit Wasserman wrote:
 For example migration between Westmere and Nehelem hosts.
 The patch fixes the guest segments similar to enter_rmode function.
 
 Signed-off-by: Orit Wasserman owass...@rehdat.com
 ---
  arch/x86/kvm/vmx.c |   38 ++
  1 files changed, 38 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 32eb588..2eca18f 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -3229,6 +3229,44 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
  
   vmcs_write32(sf-ar_bytes, ar);
   __clear_bit(VCPU_EXREG_CPL, (ulong *)vcpu-arch.regs_avail);
 +
 + /*
 +  * Fix segments for real mode guest in hosts that don't have
 +  * unrestricted_mode or it was disabled.
 +  * This is done to allow migration of the guests from hosts with
 +  * unrestricted guest like Westmere to older host that don't have
 +  * unrestricted guest like Nehelem.
 +  */
 + if (!enable_unrestricted_guest  vmx-rmode.vm86_active) {
 + switch (seg) {
 + case VCPU_SREG_CS:
 + vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
 + vmcs_write32(GUEST_CS_LIMIT, 0x);
 + if (vmcs_readl(GUEST_CS_BASE) == 0x)
 + vmcs_writel(GUEST_CS_BASE, 0xf);
 + vmcs_write16(GUEST_CS_SELECTOR,
 +  vmcs_readl(GUEST_CS_BASE)  4);
 + break;
 + case VCPU_SREG_ES:
 + fix_rmode_seg(VCPU_SREG_ES, vmx-rmode.es);
 + break;
 + case VCPU_SREG_DS:
 + fix_rmode_seg(VCPU_SREG_DS, vmx-rmode.ds);
 + break;
 + case VCPU_SREG_GS:
 + fix_rmode_seg(VCPU_SREG_GS, vmx-rmode.gs);
 + break;
 + case VCPU_SREG_FS:
 + fix_rmode_seg(VCPU_SREG_FS, vmx-rmode.fs);
 + break;
 + case VCPU_SREG_SS:
 + vmcs_write16(GUEST_SS_SELECTOR,
 +  vmcs_readl(GUEST_SS_BASE)  4);
 + vmcs_write32(GUEST_SS_LIMIT, 0x);
 + vmcs_write32(GUEST_SS_AR_BYTES, 0xf3);
 + break;
 + }
 + }
  }
  

This duplicates the code in enter_rmode().  Please change enter_rmode()
to vmx_set_segment(vmx_get_segment()) to avoid duplication.

-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] kvm: optimize ISR lookups

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 01:06:57PM +0300, Gleb Natapov wrote:
 On Wed, May 30, 2012 at 05:54:42PM -0300, Marcelo Tosatti wrote:
  On Tue, May 22, 2012 at 03:54:56PM +0300, Michael S. Tsirkin wrote:
   We perform ISR lookups twice: during interrupt
   injection and on EOI. Typical workloads only have
   a single bit set there. So we can avoid ISR scans by
   1. counting bits as we set/clear them in ISR
   2. if count is 1, caching the vector number
   3. if count != 1, invalidating the cache
   
   The real purpose of this is enabling PV EOI
   which needs to quickly validate the vector.
   But non PV guests might also benefit.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
   ---
   
   I am well aware of Thomas and Peter's suggestion of reworking APIC
   register handling in kvm instead of adding a cache like this patch does.
   
   This revision does *not* address that comment yet: it only corrects a
   bug in the original patch.
   
   Posting in this form for ease of testing.
   
   Changes from v1:
 replace ASSERT by BUG_ON, correcting inverted logic
   
arch/x86/kvm/lapic.c |   51 
   -
arch/x86/kvm/lapic.h |2 +
2 files changed, 51 insertions(+), 2 deletions(-)
   
   diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
   index 93c1574..0d2985d 100644
   --- a/arch/x86/kvm/lapic.c
   +++ b/arch/x86/kvm/lapic.c
   @@ -107,6 +107,16 @@ static inline void apic_clear_vector(int vec, void 
   *bitmap)
 clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
}

   +static inline int __apic_test_and_set_vector(int vec, void *bitmap)
   +{
   + return __test_and_set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
   +}
   +
   +static inline int __apic_test_and_clear_vector(int vec, void *bitmap)
   +{
   + return __test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
   +}
   +
static inline int apic_hw_enabled(struct kvm_lapic *apic)
{
 return (apic)-vcpu-arch.apic_base  MSR_IA32_APICBASE_ENABLE;
   @@ -210,6 +220,16 @@ static int find_highest_vector(void *bitmap)
 return fls(word[word_offset  2]) - 1 + (word_offset  5);
}

   +static u8 count_vectors(void *bitmap)
   +{
   + u32 *word = bitmap;
   + int word_offset;
   + u8 count = 0;
   + for (word_offset = 0; word_offset  MAX_APIC_VECTOR  5; ++word_offset)
   + count += hweight32(word[word_offset  2]);
   + return count;
   +}
   +
static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
{
 apic-irr_pending = true;
   @@ -242,6 +262,25 @@ static inline void apic_clear_irr(int vec, struct 
   kvm_lapic *apic)
 apic-irr_pending = true;
}

   +static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
   +{
   + if (!__apic_test_and_set_vector(vec, apic-regs + APIC_ISR))
   + ++apic-isr_count;
   + BUG_ON(apic-isr_count  MAX_APIC_VECTOR);
   + if (likely(apic-isr_count == 1))
   + apic-isr_cache = vec;
   + else
   + apic-isr_cache = -1;
   +}
   +
   +static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
   +{
   + if (__apic_test_and_clear_vector(vec, apic-regs + APIC_ISR))
   + --apic-isr_count;
   + BUG_ON(apic-isr_count  0);
   + apic-isr_cache = -1;
   +}
   +
int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
{
 struct kvm_lapic *apic = vcpu-arch.apic;
   @@ -273,6 +312,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
   kvm_lapic_irq *irq)
static inline int apic_find_highest_isr(struct kvm_lapic *apic)
{
 int result;
   + if (!apic-isr_count)
   + return -1;
   + if (likely(apic-isr_cache != -1))
  
  assert(isr_count == 1).
  
  Looks fine otherwise. Gleb can you review please?
 I am not convinced we need to keep track of isr_count. isr_cache should be 
 enough.
 Why setting isr_cache to -1 if isr_count  1? Just overwrite isr_cache
 values with newly injected vector and use it on next EOI.

I thought about this some more. This counter is a win because
apic_find_highest_isr is called twice:

1. apic_set_eoi calls apic_find_highest_isr before clearing

2. apic_update_ppr calls apic_find_highest_isr after clearing

So with isr_count you never scan ISR unless there is nesting.

So we do need isr_count. I agree we can make this work for
more cases because the latest ISR set is always the
highest one. Ronen suggested such an
optimization offline too, and it makes
code simpler so I'll implement this optimization.

 --
   Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 5/5] kvm: host side for eoi optimization

2012-05-31 Thread Michael S. Tsirkin
On Thu, May 31, 2012 at 01:15:13PM +0300, Gleb Natapov wrote:
+   /* Cache not set: safe but we don't bother. */
+   unlikely(apic-isr_cache == -1) ||
+   /* Need EOI to update ioapic. */
+   unlikely(kvm_ioapic_handles_vector(vcpu-kvm, 
apic-isr_cache)))
+   return;
+
+   pv_eoi_set_pending(apic-vcpu);
+}
+
   apic_sync_pv_eoi_to_guest() is not paired with
   apic_sync_pv_eoi_from_guest() if event injection is canceled.
   You can enter guest with stale pv_eoi bit.
  
  Never. The pv_eoi bit is cleared on each exit.
 There will be no exit since there will be no entry. Search for
 goto out after kvm_lapic_sync_to_vapic().

I think you've found a bug, thanks a bunch.
I have fixed it but the x86 guys asked me not to
post more patches until merge window closes :(
So I'll sit on a fix for several days.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/kvm : Add documentation on Hypercalls

2012-05-31 Thread Randy Dunlap
On 05/31/2012 01:01 AM, Raghavendra K T wrote:

 From: Raghavendra K T raghavendra...@linux.vnet.ibm.com 
 
 Thanks Alex for KVM_HC_FEATURES inputs and Jan for VAPIC_POLL_IRQ 
 
 Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
 ---
 diff --git a/Documentation/virtual/kvm/hypercalls.txt 
 b/Documentation/virtual/kvm/hypercalls.txt
 new file mode 100644
 index 000..c79335a
 --- /dev/null
 +++ b/Documentation/virtual/kvm/hypercalls.txt
 @@ -0,0 +1,46 @@
 +KVM Hypercalls Documentation
 +===
 +The template for each hypercall is:
 +1. Hypercall name, value.
 +2. Architecture(s)
 +3. Status (deprecated, obsolete, active)
 +4. Purpose
 +
 +1. KVM_HC_VAPIC_POLL_IRQ
 +
 +Value: 1
 +Architecture: x86


No Status, does that imply Active ?

 +Purpose: Trigger guest exit so that the host can check for pending
 +interrupts on reentry.
 +
 +2. KVM_HC_MMU_OP
 +
 +Value: 2
 +Architecture: x86
 +Status: deprecated.
 +Purpose: Support MMU operations such as writing to PTE,
 +flushing TLB, release PT.
 +
 +3. KVM_HC_FEATURES
 +
 +Value: 3
 +Architecture: PPC
 +Status: active
 +Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
 +used to enumerate which hypercalls are available. On PPC, either device tree
 +based lookup ( which is also what EPAPR dictates) OR KVM specific enumeration
 +mechanism (which is this hypercall) can be used.
 +
 +4. KVM_HC_PPC_MAP_MAGIC_PAGE
 +
 +Value: 4
 +Architecture: PPC
 +Status: active
 +Purpose: To enable communication between the hypervisor and guest there is a
 +shared page that contains parts of supervisor visible register state.
 +The guest can map this shared page to access its supervisor register through
 +memory using this hypercall.
 +
 +TODO:
 +1. more information on input and output needed?
 +2. Add more detail to purpose of hypercalls.
 
 --


-- 
~Randy
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/kvm : Add documentation on Hypercalls

2012-05-31 Thread H. Peter Anvin
On 05/31/2012 01:01 AM, Raghavendra K T wrote:
 +
 +TODO:
 +1. more information on input and output needed?
 +2. Add more detail to purpose of hypercalls.
 

1. definitely, including the hypercall ABI.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/kvm : Add documentation on Hypercalls

2012-05-31 Thread Jan Kiszka
On 2012-05-31 19:44, Randy Dunlap wrote:
 On 05/31/2012 01:01 AM, Raghavendra K T wrote:
 
 From: Raghavendra K T raghavendra...@linux.vnet.ibm.com 

 Thanks Alex for KVM_HC_FEATURES inputs and Jan for VAPIC_POLL_IRQ 

 Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
 ---
 diff --git a/Documentation/virtual/kvm/hypercalls.txt 
 b/Documentation/virtual/kvm/hypercalls.txt
 new file mode 100644
 index 000..c79335a
 --- /dev/null
 +++ b/Documentation/virtual/kvm/hypercalls.txt
 @@ -0,0 +1,46 @@
 +KVM Hypercalls Documentation
 +===
 +The template for each hypercall is:
 +1. Hypercall name, value.
 +2. Architecture(s)
 +3. Status (deprecated, obsolete, active)
 +4. Purpose
 +
 +1. KVM_HC_VAPIC_POLL_IRQ
 +
 +Value: 1
 +Architecture: x86
 
 
 No Status, does that imply Active ?

It should actually state that this call is in use.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/kvm : Add documentation on Hypercalls

2012-05-31 Thread H. Peter Anvin
On 05/31/2012 11:55 AM, Jan Kiszka wrote:

 No Status, does that imply Active ?
 
 It should actually state that this call is in use.
 

It might be nice to point out when a call was introduced, and for
deprecated calls, when it was deprecated.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Add PCID/INVPCID test

2012-05-31 Thread Marcelo Tosatti
On Fri, May 18, 2012 at 06:27:27AM +, Mao, Junjie wrote:
 Add unit test for basic functionality of PCID/INVPCID feature exposing in kvm.
 
 Changes from v1:
   Hard code 'invpcid' instruction
 
 Signed-off-by: Junjie Mao junjie@intel.com
 ---
  config-x86-common.mak |2 +
  config-x86_64.mak |3 +-
  x86/README|1 +
  x86/pcid.c|  185 
 +
  x86/unittests.cfg |4 +
  5 files changed, 194 insertions(+), 1 deletions(-)
  create mode 100644 x86/pcid.c

Applied, thanks.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: ia64: Mark ia64 KVM as BROKEN

2012-05-31 Thread Marcelo Tosatti
On Thu, May 17, 2012 at 01:14:08PM +0300, Avi Kivity wrote:
 Practically all patches to ia64 KVM are build fixes; numerous warnings remain;
 the last patch from the maintainer was committed more than three years ago.  
 It
 is clear that no one is using this thing.
 
 Mark as BROKEN to ensure people don't get hit by pointless build problems.
 
 Signed-off-by: Avi Kivity a...@redhat.com

Applied, thanks.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ANNOUNCE] qemu-kvm-1.1-rc4

2012-05-31 Thread Marcelo Tosatti


qemu-kvm-1.1-rc4 is now available. This release is based on the upstream
qemu 1.1-rc4, plus kvm-specific enhancements. Please see the
original QEMU 1.1-rc4 release announcement [1] for details.

This release can be used with the kvm kernel modules provided by your
distribution kernel, or by the modules in the kvm-kmod package, such
as kvm-kmod-3.1.

http://www.linux-kvm.org

[1] http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg04401.html

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] virtio_blk: unlock vblk-lock during kick

2012-05-31 Thread Asias He

Hello Stefan,

On 05/30/2012 09:19 PM, Stefan Hajnoczi wrote:

Holding the vblk-lock across kick causes poor scalability in SMP
guests.  If one CPU is doing virtqueue kick and another CPU touches the
vblk-lock it will have to spin until virtqueue kick completes.

This patch reduces system% CPU utilization in SMP guests that are
running multithreaded I/O-bound workloads.  The improvements are small
but show as iops and SMP are increased.

Khoa Huynhk...@us.ibm.com  provided initial performance data that
indicates this optimization is worthwhile at high iops.

Asias Heas...@redhat.com  reports the following fio results:

Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
Guest: same as host kernel

Average 3 runs:
with locked kick
-
readiops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
write   iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
readiops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
write   iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
clat (usec) max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
clat (usec) max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
clat (usec) max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
clat (usec) max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
cpu sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
cpu sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
cpu sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
cpu sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50

with unlocked kick
-
readiops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
write   iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
readiops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
write   iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
clat (usec) max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
clat (usec) max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
clat (usec) max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
clat (usec) max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
cpu sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
cpu sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
cpu sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
cpu sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00

FIO config file
-

[global]
exec_prerun=echo 3  /proc/sys/vm/drop_caches
group_reporting
norandommap
ioscheduler=noop
thread
bs=512
size=4MB
direct=1
filename=/dev/vdb
numjobs=256
ioengine=aio
iodepth=64
loops=3

Signed-off-by: Stefan Hajnoczistefa...@linux.vnet.ibm.com
---
  drivers/block/virtio_blk.c |   10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 693187d..1a50f41 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -201,8 +201,14 @@ static void do_virtblk_request(struct request_queue *q)
issued++;
}

-   if (issued)
-   virtqueue_kick(vblk-vq);
+   if (!issued)
+   return;
+
+   if (virtqueue_kick_prepare(vblk-vq)) {
+   spin_unlock(vblk-lock);
+   virtqueue_notify(vblk-vq);
+   spin_lock(vblk-lock);
+   }
  }


Could you use vblk-disk-queue-queue_lock to reference the lock so 
that this patch will work on top of this one:


   virtio-blk: Use block layer provided spinlock

BTW. Why the function name is changed to virtqueue_notify() from 
virtqueue_kick_notify()? Seems Rusty renamed it. I think the latter name 
is better because it is more consistent and easier to remember.


   virtqueue_kick()
   virtqueue_kick_prepare()
   virtqueue_kick_notify()

I believe you used virtqueue_kick_notify() in your original patch.
See:
   http://www.spinics.net/lists/linux-virtualization/msg14616.html

--
Asias
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html