Re: KVM: VMX: fix incorrect cached cpl value with real/v8086 modes

2013-01-03 Thread Gleb Natapov
On Tue, Jan 01, 2013 at 09:36:38PM -0200, Marcelo Tosatti wrote:
 On Wed, Dec 26, 2012 at 03:33:16PM +0200, Gleb Natapov wrote:
  On Wed, Dec 26, 2012 at 11:25:24AM -0200, Marcelo Tosatti wrote:
   On Wed, Dec 26, 2012 at 07:25:49AM +0200, Gleb Natapov wrote:
On Tue, Dec 25, 2012 at 07:37:10PM -0200, Marcelo Tosatti wrote:
 On Tue, Dec 25, 2012 at 02:48:08PM +0200, Gleb Natapov wrote:
  On Sat, Dec 22, 2012 at 02:31:10PM +0200, Avi Kivity wrote:
   On Wed, Dec 19, 2012 at 3:29 PM, Marcelo Tosatti 
   mtosa...@redhat.com wrote:
   
   
   
CPL is always 0 when in real mode, and always 3 when in virtual 
8086 mode.
   
Using values other than those can cause failures on operations 
that check
CPL.
   
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
   
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a4ecf7c..3abe433 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3215,13 +3215,6 @@ static u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg)
   
 static int __vmx_get_cpl(struct kvm_vcpu *vcpu)
 {
-   if (!is_protmode(vcpu))
-   return 0;
-
-   if (!is_long_mode(vcpu)
-    && (kvm_get_rflags(vcpu) & X86_EFLAGS_VM)) /* if virtual 8086 */
-   return 3;
-
return vmx_read_guest_seg_selector(to_vmx(vcpu), VCPU_SREG_CS) & 3;
 }
   
@@ -3229,6 +3222,13 @@ static int vmx_get_cpl(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
   
+   if (!is_protmode(vcpu))
+   return 0;
+
+   if (!is_long_mode(vcpu)
+    && (kvm_get_rflags(vcpu) & X86_EFLAGS_VM)) /* if virtual 8086 */
+   return 3;
+
/*
 * If we enter real mode with cs.sel & 3 != 0, the normal CPL
 * calculations fail; use the cache instead.
   
   
   
   This undoes the cache, now every vmx_get_cpl() in protected mode 
   has to
   VMREAD(GUEST_RFLAGS).
  True. Marcelo, what failure do you see without the patch?
  
  --
  Gleb.
 
 On transition _to_ real mode, linearize fails due to CPL checks
 (FreeBSD). I'll resend the patch with use of cache for
 VMREAD(GUEST_RFLAGS), which is already implemented.
I am curious, does it still fail with all my vmx patches applied too?
   
   Yes.
   
  Which FreeBSD exactly? Cannot reproduce with FreeBSD 9 64 bits here.
 
 FreeBSD 9.1 with -smp 2.
I cannot reproduce. I do see boot failure on the next branch with 9.[01]
64 bit -smp 2 here, but it is caused by segment registers being all
incorrect on a secondary vcpu and this patch does not help. Applying
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102593 fixes
it.

  
The question is how does it happen that we enter real mode while the cache
is set to 3. It should never be 3 during boot since the boot process never
enters userspace.
   
   It's the transition _to_ real mode (from protected).
  But in protected mode CPL should be 0 during boot.
 
 BTX (FreeBSD's bootloader) does:
 
 http://people.freebsd.org/~jhb/docs/loader.txt. 
Crazy. Regardless, I think your patch is correct, and the current code
that uses the cpl cache during emulation is wrong and remains wrong even
after your patch. What if the last time the cpl cache was updated the vcpu
ran in cpl 3? The commit message says that it tries to fix the case when
CS.selector & 3 != 0 during the transition to protected mode, but this can be
fixed by setting the cpl cache to 0 in vmx_set_cr0() instead of clearing it.

Two things about the patch itself. Get rid of __vmx_get_cpl() and call
vmx_read_guest_seg_selector() directly from vmx_get_cpl(), and drop
__clear_bit(VCPU_EXREG_CPL) from vmx_set_rflags() and vmx_set_cr0(),
since vmx->cpl no longer depends on rflags and cr0.
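The CPL derivation discussed in this thread (real mode is always CPL 0, virtual-8086 mode is always CPL 3, otherwise CPL is the low two bits of the CS selector) can be sketched standalone. The struct and predicate fields below are illustrative stand-ins for the real vcpu state, not the kernel's types:

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_VM (1u << 17)   /* virtual-8086 mode flag in RFLAGS */

/* Hypothetical stand-in for the guest state vmx_get_cpl() consults. */
struct vcpu_state {
    int protmode;                  /* CR0.PE set? */
    int long_mode;                 /* EFER.LMA set? */
    uint32_t rflags;
    uint16_t cs_selector;          /* as read from the VMCS */
};

/* Mirrors the decision order in the patch: real mode is CPL 0,
 * virtual-8086 mode is CPL 3, otherwise CPL = CS.selector & 3 (RPL bits). */
static int get_cpl(const struct vcpu_state *v)
{
    if (!v->protmode)
        return 0;
    if (!v->long_mode && (v->rflags & X86_EFLAGS_VM))
        return 3;
    return v->cs_selector & 3;
}
```

This is a model of the check order only; the real function also has to go through the CPL cache, which is the subject of the discussion above.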

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/5] virtio: add functions for piecewise addition of buffers

2013-01-03 Thread Wanlong Gao
On 01/02/2013 01:03 PM, Rusty Russell wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 The virtqueue_add_buf function has two limitations:

 1) it requires the caller to provide all the buffers in a single call;

 2) it does not support chained scatterlists: the buffers must be
 provided as an array of struct scatterlist;
 
 Chained scatterlists are a horrible interface, but that doesn't mean we
 shouldn't support them if there's a need.
 
 I think I once even had a patch which passed two chained sgs, rather
 than a combo sg and two length numbers.  It's very old, but I've pasted
 it below.
 
 Duplicating the implementation by having another interface is pretty
 nasty; I think I'd prefer the chained scatterlists, if that's optimal
 for you.

I rebased against virtio-next, used it in virtio-scsi, and tested it with 4
virtio-scsi target devices and host cpu idle=poll. Saw a small performance
regression here.

General:
Run status group 0 (all jobs):
   READ: io=34675MB, aggrb=248257KB/s, minb=248257KB/s, maxb=248257KB/s, 
mint=143025msec, maxt=143025msec
  WRITE: io=34625MB, aggrb=247902KB/s, minb=247902KB/s, maxb=247902KB/s, 
mint=143025msec, maxt=143025msec

Chained:
Run status group 0 (all jobs):
   READ: io=34863MB, aggrb=242320KB/s, minb=242320KB/s, maxb=242320KB/s, 
mint=147325msec, maxt=147325msec
  WRITE: io=34437MB, aggrb=239357KB/s, minb=239357KB/s, maxb=239357KB/s, 
mint=147325msec, maxt=147325msec

Thanks,
Wanlong Gao

From d3181b3f9bbdebbd3f2928b64821b406774757f8 Mon Sep 17 00:00:00 2001
From: Rusty Russell ru...@rustcorp.com.au
Date: Wed, 2 Jan 2013 16:43:49 +0800
Subject: [PATCH] virtio: use chained scatterlists

Rather than handing a scatterlist[] and out and in numbers to
virtqueue_add_buf(), hand two separate ones which can be chained.

I shall refrain from ranting about what a disgusting hack chained
scatterlists are.  I'll just note that this doesn't make things
simpler (see diff).

The scatterlists we use can be too large for the stack, so we put them
in our device struct and reuse them.  But in many cases we don't want
to pay the cost of sg_init_table() as we don't know how many elements
we'll have and we'd have to initialize the entire table.

This means we have two choices: carefully reset the end markers after
we call virtqueue_add_buf(), which we do in virtio_net for the xmit
path where it's easy and we want to be optimal.  Elsewhere we
implement a helper to unset the end markers after we've filled the
array.

Signed-off-by: Rusty Russell ru...@rustcorp.com.au
Signed-off-by: Wanlong Gao gaowanl...@cn.fujitsu.com
---
 drivers/block/virtio_blk.c  | 57 ++---
 drivers/char/hw_random/virtio-rng.c |  2 +-
 drivers/char/virtio_console.c   |  6 ++--
 drivers/net/virtio_net.c| 68 ++-
 drivers/scsi/virtio_scsi.c  | 18 ++
 drivers/virtio/virtio_balloon.c |  6 ++--
 drivers/virtio/virtio_ring.c| 71 +++--
 include/linux/virtio.h  | 14 ++--
 net/9p/trans_virtio.c   | 31 +---
 9 files changed, 172 insertions(+), 101 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0bdde8f..17cf0b7 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -102,8 +102,8 @@ static inline struct virtblk_req *virtblk_alloc_req(struct 
virtio_blk *vblk,
 
 static void virtblk_add_buf_wait(struct virtio_blk *vblk,
 struct virtblk_req *vbr,
-unsigned long out,
-unsigned long in)
+struct scatterlist *out,
+struct scatterlist *in)
 {
DEFINE_WAIT(wait);
 
@@ -112,7 +112,7 @@ static void virtblk_add_buf_wait(struct virtio_blk *vblk,
  TASK_UNINTERRUPTIBLE);
 
	spin_lock_irq(vblk->disk->queue->queue_lock);
-   if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+   if (virtqueue_add_buf(vblk->vq, out, in, vbr,
  GFP_ATOMIC) < 0) {
	spin_unlock_irq(vblk->disk->queue->queue_lock);
	io_schedule();
@@ -128,12 +128,13 @@ static void virtblk_add_buf_wait(struct virtio_blk *vblk,
 }
 
 static inline void virtblk_add_req(struct virtblk_req *vbr,
-  unsigned int out, unsigned int in)
+  struct scatterlist *out,
+  struct scatterlist *in)
 {
	struct virtio_blk *vblk = vbr->vblk;
 
	spin_lock_irq(vblk->disk->queue->queue_lock);
-   if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+   if (unlikely(virtqueue_add_buf(vblk->vq, out, in, vbr,
GFP_ATOMIC) < 0)) {
	spin_unlock_irq(vblk->disk->queue->queue_lock);
  

Re: [PATCH v2 1/5] virtio: add functions for piecewise addition of buffers

2013-01-03 Thread Paolo Bonzini
Il 02/01/2013 06:03, Rusty Russell ha scritto:
 Paolo Bonzini pbonz...@redhat.com writes:
 The virtqueue_add_buf function has two limitations:

 1) it requires the caller to provide all the buffers in a single call;

 2) it does not support chained scatterlists: the buffers must be
 provided as an array of struct scatterlist;
 
 Chained scatterlists are a horrible interface, but that doesn't mean we
 shouldn't support them if there's a need.
 
 I think I once even had a patch which passed two chained sgs, rather
 than a combo sg and two length numbers.  It's very old, but I've pasted
 it below.
 
 Duplicating the implementation by having another interface is pretty
 nasty; I think I'd prefer the chained scatterlists, if that's optimal
 for you.

Unfortunately, that cannot work because not all architectures support
chained scatterlists.  Having two different implementations on the
driver side, with different locking rules, depending on the support for
chained scatterlists is awful.

(Also, as you mention chained scatterlists are horrible.  They'd happen
to work for virtio-scsi, but not for virtio-blk where the response
status is part of the footer, not the header).

We can move all drivers to virtqueue_add_sg, I just didn't do it in this
series because I first wanted to get the API right.
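For reference, the chain and end markers under discussion live in the two low bits of each entry's page_link; a minimal userspace model of the traversal (a simplification for illustration, not the kernel's actual scatterlist implementation) looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a kernel scatterlist entry: the two low bits of
 * page_link mark "this entry chains to another list" (0x01) and
 * "end of list" (0x02) -- the same 0x02 bit the patch's
 * sg_unset_end_markers() helper clears. */
struct sg {
    uintptr_t page_link;
    unsigned int length;
};

#define SG_CHAIN 0x01UL
#define SG_END   0x02UL

/* Walk to the next entry, transparently following a chain link. */
static struct sg *sg_next(struct sg *s)
{
    if (s->page_link & SG_END)
        return NULL;                 /* end marker terminates the walk */
    s++;
    if (s->page_link & SG_CHAIN)     /* chain entry: jump to next array */
        s = (struct sg *)(s->page_link & ~(SG_CHAIN | SG_END));
    return s;
}
```

The chain entry itself carries no data, which is why drivers need one spare slot per chained array, and why a stale end marker left by sg_init_table() would silently truncate the list.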

Paolo

 Cheers,
 Rusty.
 
 From: Rusty Russell ru...@rustcorp.com.au
 Subject: virtio: use chained scatterlists.
 
 Rather than handing a scatterlist[] and out and in numbers to
 virtqueue_add_buf(), hand two separate ones which can be chained.
 
 I shall refrain from ranting about what a disgusting hack chained
 scatterlists are.  I'll just note that this doesn't make things
 simpler (see diff).
 
 The scatterlists we use can be too large for the stack, so we put them
 in our device struct and reuse them.  But in many cases we don't want
 to pay the cost of sg_init_table() as we don't know how many elements
 we'll have and we'd have to initialize the entire table.
 
 This means we have two choices: carefully reset the end markers after
 we call virtqueue_add_buf(), which we do in virtio_net for the xmit
 path where it's easy and we want to be optimal.  Elsewhere we
 implement a helper to unset the end markers after we've filled the
 array.
 
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au
 ---
  drivers/block/virtio_blk.c  |   37 +-
  drivers/char/hw_random/virtio-rng.c |2 -
  drivers/char/virtio_console.c   |6 +--
  drivers/net/virtio_net.c|   67 ++---
  drivers/virtio/virtio_balloon.c |6 +--
  drivers/virtio/virtio_ring.c|   71 
 ++--
  include/linux/virtio.h  |5 +-
  net/9p/trans_virtio.c   |   38 +--
  8 files changed, 151 insertions(+), 81 deletions(-)
 
 diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
 --- a/drivers/block/virtio_blk.c
 +++ b/drivers/block/virtio_blk.c
 @@ -100,6 +100,14 @@ static void blk_done(struct virtqueue *v
 	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
  }
  
 +static void sg_unset_end_markers(struct scatterlist *sg, unsigned int num)
 +{
 + unsigned int i;
 +
 + for (i = 0; i < num; i++)
 + sg[i].page_link &= ~0x02;
 +}
 +
  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
  struct request *req)
  {
 @@ -140,6 +148,7 @@ static bool do_req(struct request_queue 
   }
   }
  
 + /* We lay out our scatterlist in a single array, out then in. */
 	sg_set_buf(&vblk->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
  
   /*
 @@ -151,17 +160,8 @@ static bool do_req(struct request_queue 
 	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC)
 	sg_set_buf(&vblk->sg[out++], vbr->req->cmd, vbr->req->cmd_len);
 
 + /* This marks the end of the sg list at vblk->sg[out]. */
 	num = blk_rq_map_sg(q, vbr->req, vblk->sg + out);
 -
 - if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) {
 - sg_set_buf(&vblk->sg[num + out + in++], vbr->req->sense,
 -SCSI_SENSE_BUFFERSIZE);
 - sg_set_buf(&vblk->sg[num + out + in++], &vbr->in_hdr,
 -sizeof(vbr->in_hdr));
 - }
 -
 - sg_set_buf(&vblk->sg[num + out + in++], &vbr->status,
 -sizeof(vbr->status));
 -
   if (num) {
 	if (rq_data_dir(vbr->req) == WRITE) {
 	vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
 @@ -172,7 +172,22 @@ static bool do_req(struct request_queue 
   }
   }
  
 - if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC) < 0) {
 + if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) {
 + sg_set_buf(&vblk->sg[out + in++], vbr->req->sense,
 +SCSI_SENSE_BUFFERSIZE);
 + sg_set_buf(&vblk->sg[out + in++], &vbr->in_hdr,
 +sizeof(vbr->in_hdr));
 + }
 +
 + sg_set_buf(&vblk->sg[out + in++], &vbr->status,
 + 

Re: [PATCH qom-cpu 4/4] target-ppc: Error out for -cpu host on unknown PVR

2013-01-03 Thread Alexander Graf

On 18.12.2012, at 08:53, Andreas Färber wrote:

 Previously we silently exited, with subclasses we got an opcode warning.
 Instead explicitly tell the user what's wrong.
 
 An indication for this is '-cpu ?' showing 'host' with an all-zero PVR.
 
 Signed-off-by: Andreas Färber afaer...@suse.de
 ---
 target-ppc/kvm.c |7 +++
 1 file changed, 7 insertions(+)
 
 diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
 index f115892..8998d0f 100644
 --- a/target-ppc/kvm.c
 +++ b/target-ppc/kvm.c
 @@ -1186,7 +1186,14 @@ static void alter_insns(uint64_t *word, uint64_t 
 flags, bool on)
 
 static void kvmppc_host_cpu_initfn(Object *obj)
 {
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(obj);
 +
 assert(kvm_enabled());
 +
 +if (pcc->info->pvr != mfpvr()) {
 +fprintf(stderr, "Host PVR unsupported.\n");

This should probably rather say "Host CPU unsupported for -cpu host" or so :). 
Not everyone who invokes qemu-system-ppc knows what a PVR is.


Alex



Re: [PATCH v5 0/7] s390: Host support for channel I/O.

2013-01-03 Thread Alexander Graf

On 20.12.2012, at 15:32, Cornelia Huck wrote:

 Hi,
 
 here's the next iteration of the host patches to support channel
 I/O against kvm/next.
 
 Changes from v4 are on the style side; mainly using defines instead
 of magic numbers and using helper functions for decoding instructions.
 
 Patches 1 and 2 are new (and can be applied independently of the
 channel I/O patches); some things Alex pointed out in the patches
 apply to existing code as well.
 
 Please consider for kvm/next.

Reviewed-by: Alexander Graf ag...@suse.de


Alex

 
 Cornelia Huck (7):
  KVM: s390: Constify intercept handler tables.
  KVM: s390: Decoding helper functions.
  KVM: s390: Support for I/O interrupts.
  KVM: s390: Add support for machine checks.
  KVM: s390: In-kernel handling of I/O instructions.
  KVM: s390: Base infrastructure for enabling capabilities.
  KVM: s390: Add support for channel I/O instructions.
 
 Documentation/virtual/kvm/api.txt |  40 -
 arch/s390/include/asm/kvm_host.h  |  11 ++
 arch/s390/kvm/intercept.c |  45 +++---
 arch/s390/kvm/interrupt.c | 252 +-
 arch/s390/kvm/kvm-s390.c  |  38 +
 arch/s390/kvm/kvm-s390.h  |  43 ++
 arch/s390/kvm/priv.c  | 316 --
 arch/s390/kvm/sigp.c  |   6 +-
 arch/s390/kvm/trace-s390.h|  26 +++-
 include/trace/events/kvm.h|   2 +-
 include/uapi/linux/kvm.h  |  21 +++
 11 files changed, 721 insertions(+), 79 deletions(-)
 
 -- 
 1.7.12.4
 



Re: [PATCHv2 0/6] more VMX real mode emulation fixes

2013-01-03 Thread Marcelo Tosatti
On Thu, Dec 20, 2012 at 04:57:41PM +0200, Gleb Natapov wrote:
 This series goes on top of my previous one: Fix
 emulate_invalid_guest_state=0, part 2. It not only fixes bugs,
 but also does a nice cleanup of VMX real mode emulation. All real mode
 segment register mangling is now contained in the fix_rmode_seg() function.
 
 Changelog:
   v1 - v2:
 - emulate_invalid_guest_state=0 broke again. Fix it.
 - additional patch to handle IO during emulation caused by #GP
 
 Gleb Natapov (6):
   KVM: emulator: drop RPL check from linearize() function
   KVM: emulator: implement fninit, fnstsw, fnstcw
   KVM: VMX: make rmode_segment_valid() more strict.
   KVM: VMX: fix emulation of invalid guest state.
   KVM: VMX: Do not fix segment register during vcpu initialization.
   KVM: VMX: handle IO when emulation is due to #GP in real mode.
 
  arch/x86/kvm/emulate.c |  133 +++--
  arch/x86/kvm/vmx.c |  219 
 +---
  2 files changed, 241 insertions(+), 111 deletions(-)
 
 -- 
 1.7.10.4

Applied, thanks.



KVM: VMX: fix incorrect cached cpl value with real/v8086 modes (v2)

2013-01-03 Thread Marcelo Tosatti

CPL is always 0 when in real mode, and always 3 when in virtual 8086 mode.

Using values other than those can cause failures on operations that check CPL.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 55dfc37..849f5f2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1696,7 +1696,6 @@ static unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)
 static void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
 {
	__set_bit(VCPU_EXREG_RFLAGS, (ulong *)&vcpu->arch.regs_avail);
-   __clear_bit(VCPU_EXREG_CPL, (ulong *)&vcpu->arch.regs_avail);
	to_vmx(vcpu)->rflags = rflags;
	if (to_vmx(vcpu)->rmode.vm86_active) {
	to_vmx(vcpu)->rmode.save_rflags = rflags;
@@ -3220,8 +3219,10 @@ static u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg)
return vmx_read_guest_seg_base(to_vmx(vcpu), seg);
 }
 
-static int __vmx_get_cpl(struct kvm_vcpu *vcpu)
+static int vmx_get_cpl(struct kvm_vcpu *vcpu)
 {
+   struct vcpu_vmx *vmx = to_vmx(vcpu);
+
if (!is_protmode(vcpu))
return 0;
 
@@ -3229,13 +3230,6 @@ static int __vmx_get_cpl(struct kvm_vcpu *vcpu)
	    && (kvm_get_rflags(vcpu) & X86_EFLAGS_VM)) /* if virtual 8086 */
return 3;
 
-   return vmx_read_guest_seg_selector(to_vmx(vcpu), VCPU_SREG_CS) & 3;
-}
-
-static int vmx_get_cpl(struct kvm_vcpu *vcpu)
-{
-   struct vcpu_vmx *vmx = to_vmx(vcpu);
-
/*
 * If we enter real mode with cs.sel & 3 != 0, the normal CPL
 * calculations fail; use the cache instead.
@@ -3246,7 +3240,7 @@ static int vmx_get_cpl(struct kvm_vcpu *vcpu)
 
	if (!test_bit(VCPU_EXREG_CPL, (ulong *)&vcpu->arch.regs_avail)) {
	__set_bit(VCPU_EXREG_CPL, (ulong *)&vcpu->arch.regs_avail);
-   vmx->cpl = __vmx_get_cpl(vcpu);
+   vmx->cpl = vmx_read_guest_seg_selector(vmx, VCPU_SREG_CS) & 3;
	}
 
	return vmx->cpl;


Re: KVM: VMX: fix incorrect cached cpl value with real/v8086 modes

2013-01-03 Thread Marcelo Tosatti
On Thu, Jan 03, 2013 at 10:11:53AM +0200, Gleb Natapov wrote:
  FreeBSD 9.1 with -smp 2.
 I cannot reproduce. I do see boot failure on the next branch with 9.[01]
 64 bit -smp 2 here, but it is caused by segment registers being all
 incorrect on a secondary vcpu and this patch does not help. Applying
 http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102593 fixes
 it.
 
   
 The question is how does it happen that we enter real mode while the cache
 is set to 3. It should never be 3 during boot since the boot process never
 enters userspace.

It's the transition _to_ real mode (from protected).
   But in protected mode CPL should be 0 during boot.
  
  BTX (FreeBSD's bootloader) does:
  
  http://people.freebsd.org/~jhb/docs/loader.txt. 
 Crazy. Regardless, I think your patch is correct, and the current code
 that uses the cpl cache during emulation is wrong and remains wrong even
 after your patch. What if the last time the cpl cache was updated the vcpu
 ran in cpl 3? The commit message says that it tries to fix the case when
 CS.selector & 3 != 0 during the transition to protected mode, but this can be
 fixed by setting the cpl cache to 0 in vmx_set_cr0() instead of clearing it.

Yes, i suppose so, can be done by a separatch patch, though.

 Two things about the patch itself. Get rid of __vmx_get_cpl() and call
 vmx_read_guest_seg_selector() directly from vmx_get_cpl(), and drop
 __clear_bit(VCPU_EXREG_CPL) from vmx_set_rflags() and vmx_set_cr0(),
 since vmx->cpl no longer depends on rflags and cr0.

You still want to invalidate the vmx->cpl cache on cr0 writes:
think protected -> real -> protected transition.
And as for EFLAGS, agreed. Sending new patch.



KVM: x86: use dynamic percpu allocations for shared msrs area

2013-01-03 Thread Marcelo Tosatti

Andy, Mike, can you confirm whether this fixes the percpu allocation
failures when loading kvm.ko? TIA



Use dynamic percpu allocations for the shared msrs structure, 
to avoid using the limited reserved percpu space.
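Reduced to a userspace sketch, the pattern is: allocate one instance per CPU at init time instead of declaring a fixed static per-CPU slot, and look it up by CPU id. Here calloc stands in for alloc_percpu and plain indexing for per_cpu_ptr; these are analogies for illustration, not the kernel API:

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace analogy of the change: the struct is no longer carved out
 * of a small reserved static area, but sized and allocated at init. */
struct shared_msrs { int registered; long values[4]; };

static struct shared_msrs *shared_msrs;   /* one instance per CPU */

/* Analogous to the alloc_percpu() call added in kvm_arch_init(),
 * including its -ENOMEM-style failure path. */
static int msrs_init(unsigned int ncpus)
{
    shared_msrs = calloc(ncpus, sizeof(*shared_msrs));
    return shared_msrs ? 0 : -1;
}

/* Analogous to per_cpu_ptr(shared_msrs, cpu). */
static struct shared_msrs *msrs_on(unsigned int cpu)
{
    return &shared_msrs[cpu];
}
```

The kernel version additionally frees the allocation in kvm_arch_exit() and on the init error path, as the patch below shows.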

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1c9c834..5229a67 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -120,7 +120,7 @@ struct kvm_shared_msrs {
 };
 
 static struct kvm_shared_msrs_global __read_mostly shared_msrs_global;
-static DEFINE_PER_CPU(struct kvm_shared_msrs, shared_msrs);
+static struct kvm_shared_msrs __percpu *shared_msrs;
 
 struct kvm_stats_debugfs_item debugfs_entries[] = {
{ pf_fixed, VCPU_STAT(pf_fixed) },
@@ -191,10 +191,10 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 
 static void shared_msr_update(unsigned slot, u32 msr)
 {
-   struct kvm_shared_msrs *smsr;
u64 value;
+   unsigned int cpu = smp_processor_id();
+   struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
 
-   smsr = &__get_cpu_var(shared_msrs);
/* only read, and nobody should modify it at this time,
 * so don't need lock */
	if (slot >= shared_msrs_global.nr) {
@@ -226,7 +226,8 @@ static void kvm_shared_msr_cpu_online(void)
 
 void kvm_set_shared_msr(unsigned slot, u64 value, u64 mask)
 {
-   struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);
+   unsigned int cpu = smp_processor_id();
+   struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
 
	if (((value ^ smsr->values[slot].curr) & mask) == 0)
return;
@@ -242,7 +243,8 @@ EXPORT_SYMBOL_GPL(kvm_set_shared_msr);
 
 static void drop_user_return_notifiers(void *ignore)
 {
-   struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);
+   unsigned int cpu = smp_processor_id();
+   struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
 
	if (smsr->registered)
	kvm_on_user_return(&smsr->urn);
@@ -5233,9 +5235,16 @@ int kvm_arch_init(void *opaque)
goto out;
}
 
+   r = -ENOMEM;
+   shared_msrs = alloc_percpu(struct kvm_shared_msrs);
+   if (!shared_msrs) {
+   printk(KERN_ERR "kvm: failed to allocate percpu kvm_shared_msrs\n");
+   goto out;
+   }
+
r = kvm_mmu_module_init();
if (r)
-   goto out;
+   goto out_free_percpu;
 
kvm_set_mmio_spte_mask();
kvm_init_msr_list();
@@ -5258,6 +5267,8 @@ int kvm_arch_init(void *opaque)
 
return 0;
 
+out_free_percpu:
+   free_percpu(shared_msrs);
 out:
return r;
 }
@@ -5275,6 +5286,7 @@ void kvm_arch_exit(void)
 #endif
kvm_x86_ops = NULL;
kvm_mmu_module_exit();
+   free_percpu(shared_msrs);
 }
 
 int kvm_emulate_halt(struct kvm_vcpu *vcpu)


Re: [PATCH 0/7] Streamline arithmetic instruction emulation

2013-01-03 Thread Marcelo Tosatti
On Thu, Jan 03, 2013 at 09:39:14PM +0200, Avi Kivity wrote:
   Avi Kivity (7):
 KVM: x86 emulator: framework for streamlining arithmetic opcodes
 KVM: x86 emulator: Support for declaring single operand fastops
 KVM: x86 emulator: introduce NoWrite flag
 KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite
 KVM: x86 emulator: convert NOT, NEG to fastop
 KVM: x86 emulator: add macros for defining 2-operand fastop emulation
 KVM: x86 emulator: convert basic ALU ops to fastop
  
arch/x86/kvm/emulate.c | 215
 +++--
1 file changed, 120 insertions(+), 95 deletions(-)
  
  Looks good to me.
 
 Ping?

Please regenerate against 'queue' branch. 



Re: [Qemu-devel] [PATCH 10/12] virtio-net: multiqueue support

2013-01-03 Thread Jason Wang
On 12/29/2012 01:52 AM, Blue Swirl wrote:
 On Fri, Dec 28, 2012 at 10:32 AM, Jason Wang jasow...@redhat.com wrote:
 This patch implements both userspace and vhost support for multiple queue
 virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
 VirtIONetQueue to VirtIONet.

 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  hw/virtio-net.c |  318 
 ++-
  hw/virtio-net.h |   27 +-
  2 files changed, 271 insertions(+), 74 deletions(-)
[...]
  static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
  {
  VirtIONet *n = to_virtio_net(vdev);
 @@ -464,6 +578,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
 VirtQueue *vq)
  status = virtio_net_handle_mac(n, ctrl.cmd, elem);
  else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
  status = virtio_net_handle_vlan_table(n, ctrl.cmd, elem);
 +else if (ctrl.class == VIRTIO_NET_CTRL_MQ)
 Please add braces.

Sure.

 +status = virtio_net_handle_mq(n, ctrl.cmd, elem);

  stb_p(elem.in_sg[elem.in_num - 1].iov_base, status);

 @@ -477,19 +593,24 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
 VirtQueue *vq)
  static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
  {
  VirtIONet *n = to_virtio_net(vdev);
 +int queue_index = vq2q(virtio_get_queue_index(vq));

 -qemu_flush_queued_packets(qemu_get_queue(n->nic));
 +qemu_flush_queued_packets(qemu_get_subqueue(n->nic, queue_index));
  }

  
[...]

 +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int 
 ctrl)
 +{
 +VirtIODevice *vdev = &n->vdev;
 +int i;
 +
 +n->multiqueue = multiqueue;
 +
 +if (!multiqueue)
 +n->curr_queues = 1;
 Ditto. Didn't checkpatch.pl catch these or did you not check?

Sorry, will add braces here. I ran checkpatch.pl but found that
some (or lots) of the existing code (such as this file) does not obey the
rules. So I'm not sure whether I need to correct my own code, or leave
it as this file does and correct them all in the future.

[...]
  } QEMU_PACKED;

  /* This is the first element of the scatter-gather list.  If you don't
 @@ -168,6 +172,26 @@ struct virtio_net_ctrl_mac {
   #define VIRTIO_NET_CTRL_VLAN_ADD 0
   #define VIRTIO_NET_CTRL_VLAN_DEL 1

 +/*
 + * Control Multiqueue
 + *
 + * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
 + * enables multiqueue, specifying the number of the transmit and
 + * receive queues that will be used. After the command is consumed and 
 acked by
 + * the device, the device will not steer new packets on receive virtqueues
 + * other than specified nor read from transmit virtqueues other than 
 specified.
 + * Accordingly, driver should not transmit new packets  on virtqueues other 
 than
 + * specified.
 + */
 +struct virtio_net_ctrl_mq {
 VirtIONetCtrlMQ and please don't forget the typedef.

Sure, but the same question as above. (See other structures in this file).

 +uint16_t virtqueue_pairs;
 +};
 +
 +#define VIRTIO_NET_CTRL_MQ   4
 + #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
 + #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
 + #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
 +
  #define DEFINE_VIRTIO_NET_FEATURES(_state, _field) \
  DEFINE_VIRTIO_COMMON_FEATURES(_state, _field), \
  DEFINE_PROP_BIT("csum", _state, _field, VIRTIO_NET_F_CSUM, true), \
 @@ -186,5 +210,6 @@ struct virtio_net_ctrl_mac {
  DEFINE_PROP_BIT("ctrl_vq", _state, _field, VIRTIO_NET_F_CTRL_VQ, true), \
  DEFINE_PROP_BIT("ctrl_rx", _state, _field, VIRTIO_NET_F_CTRL_RX, true), \
  DEFINE_PROP_BIT("ctrl_vlan", _state, _field, VIRTIO_NET_F_CTRL_VLAN, true), \
 -DEFINE_PROP_BIT("ctrl_rx_extra", _state, _field, VIRTIO_NET_F_CTRL_RX_EXTRA, true)
 +DEFINE_PROP_BIT("ctrl_rx_extra", _state, _field, VIRTIO_NET_F_CTRL_RX_EXTRA, true), \
 +DEFINE_PROP_BIT("mq", _state, _field, VIRTIO_NET_F_MQ, true)
  #endif
 --
 1.7.1





Re: KVM: x86: use dynamic percpu allocations for shared msrs area

2013-01-03 Thread Mike Galbraith
On Thu, 2013-01-03 at 11:41 -0200, Marcelo Tosatti wrote: 
 Andy, Mike, can you confirm whether this fixes the percpu allocation
 failures when loading kvm.ko? TIA

monteverdi:~/:[1]# dmesg|grep PERCPU
[0.00] PERCPU: Embedded 27 pages/cpu @88047f80 s80704 r8192 
d21696 u262144
monteverdi:~/:[0]# lsmod|grep kvm
kvm_intel 132688  0 
kvm   423692  1 kvm_intel
monteverdi:~/:[0]# uname -r
3.6.0-bisect

Yup, kvm loaded itself, no gripeage.
   
 
 
 Use dynamic percpu allocations for the shared msrs structure, 
 to avoid using the limited reserved percpu space.
 
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 1c9c834..5229a67 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -120,7 +120,7 @@ struct kvm_shared_msrs {
  };
  
  static struct kvm_shared_msrs_global __read_mostly shared_msrs_global;
 -static DEFINE_PER_CPU(struct kvm_shared_msrs, shared_msrs);
 +static struct kvm_shared_msrs __percpu *shared_msrs;
  
  struct kvm_stats_debugfs_item debugfs_entries[] = {
   { pf_fixed, VCPU_STAT(pf_fixed) },
 @@ -191,10 +191,10 @@ static void kvm_on_user_return(struct 
 user_return_notifier *urn)
  
  static void shared_msr_update(unsigned slot, u32 msr)
  {
 - struct kvm_shared_msrs *smsr;
   u64 value;
 + unsigned int cpu = smp_processor_id();
 + struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
  
 - smsr = &__get_cpu_var(shared_msrs);
   /* only read, and nobody should modify it at this time,
* so don't need lock */
   if (slot >= shared_msrs_global.nr) {
 @@ -226,7 +226,8 @@ static void kvm_shared_msr_cpu_online(void)
  
  void kvm_set_shared_msr(unsigned slot, u64 value, u64 mask)
  {
 - struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);
 + unsigned int cpu = smp_processor_id();
 + struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
  
   if (((value ^ smsr->values[slot].curr) & mask) == 0)
   return;
 @@ -242,7 +243,8 @@ EXPORT_SYMBOL_GPL(kvm_set_shared_msr);
  
  static void drop_user_return_notifiers(void *ignore)
  {
 - struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);
 + unsigned int cpu = smp_processor_id();
 + struct kvm_shared_msrs *smsr = per_cpu_ptr(shared_msrs, cpu);
  
   if (smsr->registered)
   kvm_on_user_return(&smsr->urn);
 @@ -5233,9 +5235,16 @@ int kvm_arch_init(void *opaque)
   goto out;
   }
  
 + r = -ENOMEM;
 + shared_msrs = alloc_percpu(struct kvm_shared_msrs);
 + if (!shared_msrs) {
 + printk(KERN_ERR "kvm: failed to allocate percpu kvm_shared_msrs\n");
 + goto out;
 + }
 +
   r = kvm_mmu_module_init();
   if (r)
 - goto out;
 + goto out_free_percpu;
  
   kvm_set_mmio_spte_mask();
   kvm_init_msr_list();
 @@ -5258,6 +5267,8 @@ int kvm_arch_init(void *opaque)
  
   return 0;
  
 +out_free_percpu:
 + free_percpu(shared_msrs);
  out:
   return r;
  }
 @@ -5275,6 +5286,7 @@ void kvm_arch_exit(void)
  #endif
   kvm_x86_ops = NULL;
   kvm_mmu_module_exit();
 + free_percpu(shared_msrs);
  }
  
  int kvm_emulate_halt(struct kvm_vcpu *vcpu)


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 5/5] KVM: x86: improve reexecute_instruction

2013-01-03 Thread Xiao Guangrong
Hi Gleb,

Thanks for your review, and sorry for the delayed reply; I was on vacation.

On 12/23/2012 11:02 PM, Gleb Natapov wrote:
 On Sat, Dec 15, 2012 at 03:01:12PM +0800, Xiao Guangrong wrote:


 +is_self_change_mapping = FNAME(is_self_change_mapping)(vcpu, addr,
 +   &walker, user_fault);
 +
 is_self_change_mapping() has a subtle side-effect: it sets
 vcpu->arch.target_gfn_is_pt. From reading the page_fault() function
 you cannot guess why is_self_change_mapping() is not called inside if
 (walker.level >= PT_DIRECTORY_LEVEL), since that is the only place where
 its output is used. Maybe pass it a pointer to target_gfn_is_pt as a
 parameter, to make it clear that the return value is not the only output
 of the function.

Yes, it is clearer, will do it in the next version.

 
  if (walker.level >= PT_DIRECTORY_LEVEL)
  force_pt_level = mapping_level_dirty_bitmap(vcpu, walker.gfn)
 -   || FNAME(is_self_change_mapping)(vcpu, &walker, user_fault);
 +   || is_self_change_mapping;
  else
  force_pt_level = 1;
  if (!force_pt_level) {
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index bf66169..fc33563 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4756,29 +4756,25 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
  static bool reexecute_instruction(struct kvm_vcpu *vcpu, unsigned long cr2)
  {
  gpa_t gpa = cr2;
 +gfn_t gfn;
  pfn_t pfn;
 -unsigned int indirect_shadow_pages;
 -
 -spin_lock(&vcpu->kvm->mmu_lock);
 -indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages;
 -spin_unlock(&vcpu->kvm->mmu_lock);
 -
 -if (!indirect_shadow_pages)
 -return false;

  if (!vcpu->arch.mmu.direct_map) {
 -gpa = kvm_mmu_gva_to_gpa_read(vcpu, cr2, NULL);
 +/*
 + * Write permission should be allowed since only
 + * write access needs to be emulated.
 + */
 +gpa = kvm_mmu_gva_to_gpa_write(vcpu, cr2, NULL);
 +
 +/*
 + * If the mapping is invalid in the guest, let the CPU
 + * retry it to generate a fault.
 + */
  if (gpa == UNMAPPED_GVA)
 -return true; /* let cpu generate fault */
 +return true;
  }
 Why not fold this change to if (!vcpu->arch.mmu.direct_map) into
 the previous patch where it was introduced? It looks independent of
 what you are doing in this patch.

Fine to me.

 

 -/*
 - * if emulation was due to access to shadowed page table
 - * and it failed try to unshadow page and re-enter the
 - * guest to let CPU execute the instruction.
 - */
 -if (kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa)))
 -return true;
 +gfn = gpa_to_gfn(gpa);

  /*
   * Do not retry the unhandleable instruction if it faults on the
 @@ -4786,13 +4782,33 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, unsigned long cr2)
   * retry instruction - write #PF - emulation fail - retry
   * instruction - ...
   */
 -pfn = gfn_to_pfn(vcpu->kvm, gpa_to_gfn(gpa));
 -if (!is_error_noslot_pfn(pfn)) {
 -kvm_release_pfn_clean(pfn);
 +pfn = gfn_to_pfn(vcpu->kvm, gfn);
 +
 +/*
 + * If the instruction failed on the error pfn, it cannot be fixed;
 + * report the error to userspace.
 + */
 +if (is_error_noslot_pfn(pfn))
 +return false;
 +
 +kvm_release_pfn_clean(pfn);
 +
 +/* The instructions are well-emulated on direct mmu. */
 +if (vcpu->arch.mmu.direct_map) {
 +unsigned int indirect_shadow_pages;
 +
 +spin_lock(&vcpu->kvm->mmu_lock);
 +indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages;
 +spin_unlock(&vcpu->kvm->mmu_lock);
 +
 +if (indirect_shadow_pages)
 +kvm_mmu_unprotect_page(vcpu->kvm, gfn);
 +
  return true;
  }

 -return false;
 +kvm_mmu_unprotect_page(vcpu->kvm, gfn);
 +return !(vcpu->arch.fault_addr == cr2 && vcpu->arch.target_gfn_is_pt);
 Do you store fault_addr only to avoid using a stale target_gfn_is_pt? If
 so, why not reset target_gfn_is_pt to false at the beginning of a page
 fault and get rid of fault_addr?

Good suggestion, will do. :)

