Re: stand-alone kvmtool

2015-02-23 Thread Andre Przywara
Hi Will,

On 18/02/15 15:50, Will Deacon wrote:
 Hi Andre,
 
 Thanks for doing this. Since it looks unlikely that kvmtool will ever be
 merged back into the kernel tree, it makes sense to cut the dependency
 in my opinion.
 
 On Fri, Feb 13, 2015 at 10:39:33AM +, Andre Przywara wrote:
 as I found it increasingly inconvenient to use kvmtool[1] as part of a
 Linux repository, I decided to give it a go and make it a stand-alone
 project. So I filtered all the respective commits, adjusted the paths in
 there (while keeping authorship and commit date, of course) and then
 added the missing bits to let it compile without a kernel tree nearby.
 The result is now available on:

 git://linux-arm.org/kvmtool.git
 http://linux-arm.org/kvmtool.git

 You can simply check it out, type make and use ./lkvm run for a quick
 test. So far I briefly tested x86-64, arm and arm64, the latter two were
 also cross-compiled. For sure there are rough edges in there (for
 instance copying a few non-uapi header files into), but I deem it worthy
 enough to get some public comments.
 For me that also fixed some nasty warnings about libfdt, which now are
 gone due to it using your system library version of it.
 I also managed to get rid of the libc-i386-dev dependency when compiling
 for x86-64, but that still needs to be cleaned up and thus is not in the
 current HEAD.
 I haven't got around to compile-test the other supported architectures,
 but supporting them should be as easy as copying over the uapi kvm.h
 header file (see the respective ARM commit). Contributions (and tests!)
 are welcome.

 Please give it a go and tell me what you think. I don't want to fork the
 project, so I am happy if someone official picks it up.
 
 In which case, it's probably best to post the patches for review rather
 than just point me at your git repo!

Makes some sense, although part of the exercise was to get rid of the
huge, now unneeded Linux kernel code base.
So this approach required a fresh repository, and due to the different
paths there is no out-of-the-box patch compatibility between the two.
Also I wanted to provide an easy way for people to give it a test.

So what I could do is to send the top-most patches against Pekka's
github repository, which would eliminate the references to the kernel
directory (at the cost of duplicating some files).
Once this is settled, acked and applied, one could try to create a new
repository with the tools/kvm directory being the new root.

Let me know if that makes more sense and I will rework the patches to
apply against the current upstream kvmtool.

Cheers,
Andre.

P.S. Although both approaches still provide the kvmtool patch history,
they do not compile before the dependency cut patches. If that is an
issue, one could think about injecting those new patches back into the
repository time line. Admittedly that sounds scary, but would solve the
problem.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: stand-alone kvmtool

2015-02-23 Thread Andre Przywara
Hi Sasha,

thanks for taking a look!

On 19/02/15 10:56, Sasha Levin wrote:
 On 02/13/2015 05:39 AM, Andre Przywara wrote:
 Hi,

 as I found it increasingly inconvenient to use kvmtool[1] as part of a
 Linux repository, I decided to give it a go and make it a stand-alone
 project. So I filtered all the respective commits, adjusted the paths in
 there (while keeping authorship and commit date, of course) and then
 added the missing bits to let it compile without a kernel tree nearby.
 The result is now available on:

 git://linux-arm.org/kvmtool.git
 http://linux-arm.org/kvmtool.git
 
 Hi Andre,
 
 What inconvenience is caused by having it sit inside the kernel tree
 beyond an increased requirement in disk space?

Reduced disk space is admittedly one of the benefits of this exercise.
Also it makes cloning a lot easier and would allow easier packaging.

Many of the issues we face here come from the fact that kvmtool lives in
_a_ kernel repository, but it's not upstream. So we lose the benefit of
joint kernel-userland development. In fact we have to do regular merges
of mostly unrelated upstream kernel code into the branch to get it
compiled with a new feature.
Also having a pure userland tool in the kernel repository sounds just
wrong to me, especially as KVM has a nice API with compatibility
features. There is a clear interface between the KVM kernel and the
controlling userland, so they should not need to share code beyond the
API defining header files. Having a shared code base lures people into
breaking the interface.

 Moving it out will make us lose all the new features and bug fixes we
 gain from using the kernel code directly rather than copying it once
 in a while.

Which code exactly are you thinking of?
From the code I copied I don't see that rbtree or the Linux list
implementations for instance justify a common code base. If in dire
need, one could set up alerts on the few code files copied to spot
upstream bug fixes.
I see there is a slight drawback in this regard, but I think the
benefits outweigh it.

 With your suggestion we'll end up needing something that copies stuff
 from the kernel into that standalone tree, just like what qemu does.

While I see that copying is not the best solution, QEMU lives very well
with it, doesn't it? With KVM's feature compatibility API and the
kernel's "don't break userland" policy there should be no real problem.
Also with the current situation we just replace "copy uapi header files"
with "merge in upstream kernel code base", which is also manually
triggered and much more ugly IMHO.


I agree that the whole argument would be quite different if kvmtool
were upstream, but it is not and, as Will pointed out, probably never
will be. So to make its usage easier for users and distribution
package maintainers, I'd like to see it live on in a separate
(kernel.org) repository.
I could imagine that the easier accessibility would make it more
appealing to potential users (and packagers!)

Cheers,
Andre.


Re: FreeBSD 10.1 disk performance lower than identical Linux Guest

2015-02-23 Thread Stefan Hajnoczi
On Sat, Feb 21, 2015 at 06:59:23PM +, Greg Langford wrote:
 I am running CentOS 6.6 x86_64 as a KVM host and have a number of
 guests running.
 
 After some experimenting I have noticed something curious, FreeBSD
 disk performance seems to be just over half of that of a Linux guest
 with an identical configuration.
 
 From my understanding virtio is included in FreeBSD 10.0 onwards.
 
 My CentOS guest has approx 130MB/s when using dd to read /dev/zero and
 write it to a file on the guest file system. This is about the same
 when doing the same on the hypervisor itself. The stats are gained
 using iotop on the hypervisor while the test is performed.
 
 However the FreeBSD guest gets about 70MB/s maximum when performing
 the same test and is running FreeBSD 10.1
 
 Has anyone seen this before, is it a known issue or expected
 behaviour? I have been scratching my head about it for a number of
 days now.

Please post the dd command-line and the QEMU command-lines for launching
the Linux and FreeBSD guests.

Stefan




Re: stand-alone kvmtool

2015-02-23 Thread Russell King - ARM Linux
On Thu, Feb 19, 2015 at 05:56:45AM -0500, Sasha Levin wrote:
 What inconvenience is caused by having it sit inside the kernel tree
 beyond an increased requirement in disk space?

I've come across this problem with the perf tools - luckily, the perf
tools allow the source to be exported from the kernel tree, but it is
far from a good solution.

The problem is when you're primarily cross-building the kernel on a
system where you don't have the target libraries (because, eg, you're
running in a build environment for multiple different target systems.)
Having to build userspace tools in that scenario is a _major_ pita.

Yes, of course it's possible to pull the 1GB of kernel GIT repository
down onto the target just to build some silly userspace tool, but when
your rootfs lives on an 8GB SD card or a USB memory stick (as is the
case with the ARM Juno 64-bit platform), and when the userspace tool
somehow depends on the kernel source tree being configured, it really
starts getting painful.

TBH, I don't much care provided there is a way to export a source
tarball for the tool from the kernel (like perf does) which can then
be transferred to the target and built there.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


Re: stand-alone kvmtool

2015-02-23 Thread Christoffer Dall
Hi,

On Thu, Feb 19, 2015 at 05:56:45AM -0500, Sasha Levin wrote:

[...]

 
 What inconvenience is caused by having it sit inside the kernel tree
 beyond an increased requirement in disk space?
 
FWIW: I would prefer seeing this outside the kernel tree; I think it is
slightly confusing to keep it as part of a non-upstream kernel repo and
it is simpler to view git change logs etc. in gitweb for a stand-alone
repo.

-Christoffer


Re: [Qemu-devel] [RFC PATCH v2 04/15] cpu-model/s390: Introduce S390 CPU models

2015-02-23 Thread Christian Borntraeger
Am 23.02.2015 um 13:56 schrieb Christian Borntraeger:
 Am 20.02.2015 um 16:22 schrieb Alexander Graf:



 Am 20.02.2015 um 16:00 schrieb Michael Mueller m...@linux.vnet.ibm.com:

 On Fri, 20 Feb 2015 14:54:23 +0100
 Alexander Graf ag...@suse.de wrote:


 +/* machine related properties */
 +typedef struct S390CPUMachineProps {
 +uint16_t class;  /* machine class */
 +uint16_t ga; /* availability number of machine */
 +uint16_t order;  /* order of availability */
 +} S390CPUMachineProps;
 +
 +/* processor related properties */
 +typedef struct S390CPUProcessorProps {
 +uint16_t gen;/* S390 CMOS generation */
 +uint16_t ver;/* version of processor */
 +uint32_t id; /* processor identification*/
 +uint16_t type;   /* machine type */
 +uint16_t ibc;/* IBC value */
 +uint64_t *fac_list;  /* list of facilities */  

 Just make this uint64_t fac_list[2]. That way we don't have to track any
 messy allocations.

 It will be something like uint64_t 
 fac_list[S390_CPU_FAC_LIST_SIZE_UINT64] and in total 2KB not
 just 16 bytes but I will change it. 

 Why? Do we actually need that many? This is a qemu internal struct.
 
 The kernel already enabled the 3rd word for z13 support, 
 https://git.kernel.org/cgit/linux/kernel/git/s390/linux.git/commit/?id=f8b2dcbd9e6d1479b9b5a9e9e78bbaf783bde819
 
 so make it at least 3.


This should have been 

commit 8070361799ae1e3f4ef347bd10f0a508ac10acfb
Author: Martin Schwidefsky schwidef...@de.ibm.com
Date:   Mon Oct 6 17:53:53 2014 +0200

s390: add support for vector extension

which uses bit 129.
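
The arithmetic behind "at least 3" can be sketched as follows. This is only an illustration and not QEMU or kernel code: the helper names (fac_word_index, fac_words_needed, fac_mask, fac_set, fac_test) are made up for this example, and it assumes that facility bits are counted from the most significant bit of the first 64-bit word, the way the STFLE facility list numbers them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from QEMU or the kernel. */

/* Index of the 64-bit word that holds facility bit nr. */
static inline size_t fac_word_index(unsigned int nr)
{
    return nr / 64;
}

/* Number of 64-bit words an array needs so that bit nr fits into it. */
static inline size_t fac_words_needed(unsigned int nr)
{
    return fac_word_index(nr) + 1;
}

/* Assumption: bits are counted MSB-first, bit 0 is the leftmost bit of word 0. */
static inline uint64_t fac_mask(unsigned int nr)
{
    return UINT64_C(1) << (63 - (nr % 64));
}

static inline void fac_set(uint64_t *fac_list, unsigned int nr)
{
    fac_list[fac_word_index(nr)] |= fac_mask(nr);
}

static inline bool fac_test(const uint64_t *fac_list, unsigned int nr)
{
    return (fac_list[fac_word_index(nr)] & fac_mask(nr)) != 0;
}
```

Under these assumptions bit 129 lands in word index 2, so fac_words_needed(129) evaluates to 3 and a two-element fac_list would be too short for it.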



Re: [PATCH 0/3] vhost_net: support for cross endian guests

2015-02-23 Thread Greg Kurz
On Sun, 22 Feb 2015 10:53:51 +0100
Michael S. Tsirkin m...@redhat.com wrote:
 On Fri, Feb 20, 2015 at 11:07:24AM +0100, Greg Kurz wrote:
  Hi,
  
  This patchset allows vhost_net to be used with legacy virtio
  when guest and host have a different endianness. It is based
  on previous work by Cédric Le Goater:
  
  https://www.mail-archive.com/kvm-ppc@vger.kernel.org/msg09848.html
  
  As suggested by MST:
  - the API now asks for a specific format (big endian) instead of the hint
whether byteswap is needed or not (patch 1)
  - rebased on top of the virtio-1 accessors (patch 2)
  
  Patch 3 is a separate fix: I think it is also valid for virtio-1.
 
 I don't think so. See e.g. this code in tun:
 gso.csum_offset = cpu_to_tun16(tun, skb->csum_offset);
 looks like it has the correct endian-ness for virtio-1.
 
 

Indeed. I will fix tun/macvtap as you suggested.

Thanks for the review.

--
Greg

 
  Please comment.
  
  ---
  
  Greg Kurz (3):
vhost: add VHOST_VRING_F_LEGACY_BIG_ENDIAN flag
vhost: add support for legacy virtio
vhost_net: fix virtio_net header endianness
  
  
   drivers/vhost/net.c|   32 ++--
   drivers/vhost/vhost.c  |6 +-
   drivers/vhost/vhost.h  |   23 +--
   include/uapi/linux/vhost.h |2 ++
   4 files changed, 50 insertions(+), 13 deletions(-)
  
  --
  Greg
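
The difference between "a specific format" and "a byteswap hint" from the cover letter can be illustrated with a small sketch. It is not the vhost or tun API: the helpers legacy16_load and legacy16_store are hypothetical names for this example, and they work on raw bytes so that the result does not depend on the endianness of the machine running the code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of vhost, tun or virtio. */

/* Read a 16-bit legacy virtio field laid out in the guest's byte order. */
static inline uint16_t legacy16_load(const uint8_t *p, bool guest_big_endian)
{
    if (guest_big_endian)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Write a 16-bit value in the guest's byte order. */
static inline void legacy16_store(uint8_t *p, uint16_t v, bool guest_big_endian)
{
    if (guest_big_endian) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xff);
    } else {
        p[0] = (uint8_t)(v & 0xff);
        p[1] = (uint8_t)(v >> 8);
    }
}
```

Because the caller states the format of the data rather than whether a swap is needed, the same call gives the same result on a little endian and on a big endian host.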
 



Re: [nVMX] With 3.20.0-0.rc0.git5.1 on L0, booting L2 guest results in L1 *rebooting*

2015-02-23 Thread Radim Krčmář
2015-02-22 16:46+0100, Kashyap Chamarthy:
 Radim,
 
 I just tested with your patch[1] in this thread. I built a Fedora
 Kernel[2] with it, and installed (and booted into) it on both L0 and L1. 
 
 Result: I don't have good news, I'm afraid: L1 *still* reboots when an
 L2 guest is booted. And, L0 throws the stack trace that was
 previously noted on this thread:

Thanks, I'm puzzled though ... isn't it possible that a wrong kernel
sneaked into grub?

 . . .
 [   57.747345] [ cut here ]
 [0.004638] WARNING: CPU: 5 PID: 50206 at arch/x86/kvm/vmx.c:8962 
 nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]()
 [0.060404] CPU: 5 PID: 50206 Comm: qemu-system-x86 Not tainted 
 3.18.7-200.fc21.x86_64 #1

This looks like a new backtrace, but the kernel is not [2].

 [  +0.006055]  [810992ea] warn_slowpath_null+0x1a/0x20
 [  +0.005889]  [a02f00ee] nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]
 [  +0.007014]  [a02f05af] ? vmx_handle_exit+0x1bf/0xaa0 [kvm_intel]
 [  +0.007015]  [a02f039c] vmx_queue_exception+0xfc/0x150 [kvm_intel]
 [  +0.007130]  [a028cdfd] kvm_arch_vcpu_ioctl_run+0xd9d/0x1290 [kvm]

(There is only one execution path and unless there is a race, it would
 be prevented by [1].)

 [  +0.007111]  [a0288528] ? kvm_arch_vcpu_load+0x58/0x220 [kvm]
 [  +0.006670]  [a0274cbc] kvm_vcpu_ioctl+0x32c/0x5c0 [kvm]
[...]
   [1] http://article.gmane.org/gmane.comp.emulators.kvm.devel/132937
   [2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708


Re: [Qemu-devel] [RFC PATCH v2 04/15] cpu-model/s390: Introduce S390 CPU models

2015-02-23 Thread Christian Borntraeger
Am 20.02.2015 um 16:22 schrieb Alexander Graf:
 
 
 
 Am 20.02.2015 um 16:00 schrieb Michael Mueller m...@linux.vnet.ibm.com:

 On Fri, 20 Feb 2015 14:54:23 +0100
 Alexander Graf ag...@suse.de wrote:


 +/* machine related properties */
 +typedef struct S390CPUMachineProps {
 +uint16_t class;  /* machine class */
 +uint16_t ga; /* availability number of machine */
 +uint16_t order;  /* order of availability */
 +} S390CPUMachineProps;
 +
 +/* processor related properties */
 +typedef struct S390CPUProcessorProps {
 +uint16_t gen;/* S390 CMOS generation */
 +uint16_t ver;/* version of processor */
 +uint32_t id; /* processor identification*/
 +uint16_t type;   /* machine type */
 +uint16_t ibc;/* IBC value */
 +uint64_t *fac_list;  /* list of facilities */  

 Just make this uint64_t fac_list[2]. That way we don't have to track any
 messy allocations.

 It will be something like uint64_t fac_list[S390_CPU_FAC_LIST_SIZE_UINT64] 
 and in total 2KB not
 just 16 bytes but I will change it. 
 
 Why? Do we actually need that many? This is a qemu internal struct.

The kernel already enabled the 3rd word for z13 support, 
https://git.kernel.org/cgit/linux/kernel/git/s390/linux.git/commit/?id=f8b2dcbd9e6d1479b9b5a9e9e78bbaf783bde819

so make it at least 3.



Re: [PATCH] arch: mips: kvm: Enable after disabling interrupt

2015-02-23 Thread James Hogan
On Sun, Feb 22, 2015 at 09:48:21PM +0530, Tapasweni Pathak wrote:
 Enable disabled interrupt, on unsuccessful operation.
 
 Found by Coccinelle.
 
 Signed-off-by: Tapasweni Pathak tapaswenipat...@gmail.com
 Acked-by: Julia Lawall julia.law...@lip6.fr

Reviewed-by: James Hogan james.ho...@imgtec.com

Cheers
James

 ---
  arch/mips/kvm/tlb.c |1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/arch/mips/kvm/tlb.c b/arch/mips/kvm/tlb.c
 index bbcd822..b6beb0e 100644
 --- a/arch/mips/kvm/tlb.c
 +++ b/arch/mips/kvm/tlb.c
 @@ -216,6 +216,7 @@ int kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, 
 unsigned long entryhi,
   if (idx  current_cpu_data.tlbsize) {
   kvm_err("%s: Invalid Index: %d\n", __func__, idx);
   kvm_mips_dump_host_tlbs();
 + local_irq_restore(flags);
   return -1;
   }
 
 --
 1.7.9.5
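
The patch above adds the missing restore directly in front of the early return. A different way to get the same guarantee, shown here only as a userspace sketch and not as what the patch does, is to route every exit through one label. Everything in the sketch is a stand-in: mock_irq_save, mock_irq_restore, irqs_enabled and write_entry are hypothetical names, and the range check is an assumption made for the example.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel primitives; purely illustrative. */
static bool irqs_enabled = true;

static void mock_irq_save(unsigned long *flags)
{
    *flags = irqs_enabled ? 1UL : 0UL; /* remember the previous state */
    irqs_enabled = false;              /* then disable */
}

static void mock_irq_restore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/* Hypothetical function with the same shape as the patched code path. */
static int write_entry(int idx, int size)
{
    unsigned long flags;
    int ret = 0;

    mock_irq_save(&flags);

    if (idx < 0 || idx >= size) {      /* assumed validity check */
        ret = -1;
        goto out;                      /* error path shares the common exit */
    }

    /* The real work would happen here with interrupts disabled. */

out:
    mock_irq_restore(flags);           /* every exit path restores the state */
    return ret;
}
```

With a single exit label, a later change that adds another error return cannot forget the restore as easily.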
 
 




Re: [nVMX] With 3.20.0-0.rc0.git5.1 on L0, booting L2 guest results in L1 *rebooting*

2015-02-23 Thread Kashyap Chamarthy
On Mon, Feb 23, 2015 at 02:56:11PM +0100, Radim Krčmář wrote:
 2015-02-22 16:46+0100, Kashyap Chamarthy:
  Radim,
  
  I just tested with your patch[1] in this thread. I built a Fedora
  Kernel[2] with it, and installed (and booted into) it on both L0 and L1. 
  
  Result: I don't have good news, I'm afraid: L1 *still* reboots when an
  L2 guest is booted. And, L0 throws the stack trace that was
  previously noted on this thread:
 
 Thanks, I'm puzzled though ... isn't it possible that a wrong kernel
 sneaked into grub?

Hmm, unlikely - I just double-confirmed that I'm running the same
patched Kernel (3.20.0-0.rc0.git9.1.fc23.x86_64) on both L0 and L1.
 
  . . .
  [   57.747345] [ cut here ]
  [0.004638] WARNING: CPU: 5 PID: 50206 at arch/x86/kvm/vmx.c:8962 
  nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]()
  [0.060404] CPU: 5 PID: 50206 Comm: qemu-system-x86 Not tainted 
  3.18.7-200.fc21.x86_64 #1
 
 This looks like a new backtrace, but the kernel is not [2].

Err, looks like I pasted the wrong one, but here it is again. I just
tested with the patched Kernel (that I linked below) on both L0 and L1,
the same behavior (L1 reboot on L2 boot) manifests:

. . .
[0.058440] CPU: 8 PID: 1828 Comm: qemu-system-x86 Not tainted 
3.20.0-0.rc0.git9.1.fc23.x86_64 #1
[0.008856] Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 2.8.2 
10/25/2012
[0.007475]   97b7f39b 883f5acc3bf8 
818773cd
[0.007477]    883f5acc3c38 
810ab3ba
[0.007495]  883f5acc3c68 887f62678000  

[0.007489] Call Trace:
[0.002455]  [818773cd] dump_stack+0x4c/0x65
[0.005139]  [810ab3ba] warn_slowpath_common+0x8a/0xc0
[0.006001]  [810ab4ea] warn_slowpath_null+0x1a/0x20
[0.005831]  [a220cf8e] nested_vmx_vmexit+0xbde/0xd30 [kvm_intel]
[0.006957]  [a220fda3] ? vmx_handle_exit+0x213/0xd80 [kvm_intel]
[0.006956]  [a220d3fa] vmx_queue_exception+0x10a/0x150 
[kvm_intel]
[0.007160]  [a03c8cdb] kvm_arch_vcpu_ioctl_run+0x107b/0x1b60 
[kvm]
[0.007138]  [a03c833a] ? kvm_arch_vcpu_ioctl_run+0x6da/0x1b60 
[kvm]
[0.007219]  [8110725d] ? trace_hardirqs_on+0xd/0x10
[0.005837]  [a03b0666] ? vcpu_load+0x26/0x70 [kvm]
[0.005745]  [8110385f] ? lock_release_holdtime.part.29+0xf/0x200
[0.006966]  [a03c3a68] ? kvm_arch_vcpu_load+0x58/0x210 [kvm]
[0.006618]  [a03b0a73] kvm_vcpu_ioctl+0x383/0x7e0 [kvm]
[0.006175]  [81027b9d] ? native_sched_clock+0x2d/0xa0
[0.006000]  [810d5c56] ? creds_are_invalid.part.1+0x16/0x50
[0.006518]  [810d5cb1] ? creds_are_invalid+0x21/0x30
[0.005918]  [813a77fa] ? inode_has_perm.isra.48+0x2a/0xa0
[0.006350]  [8128c9a8] do_vfs_ioctl+0x2e8/0x530
[0.005514]  [8128cc71] SyS_ioctl+0x81/0xa0
[0.005051]  [81880969] system_call_fastpath+0x12/0x17
[0.005999] ---[ end trace 3e4dca7180cdddab ]---
[5.529564] kvm [1766]: vcpu0 unhandled rdmsr: 0x1c9
[0.005026] kvm [1766]: vcpu0 unhandled rdmsr: 0x1a6
[0.004998] kvm [1766]: vcpu0 unhandled rdmsr: 0x3f6
. . .
 
  [  +0.006055]  [810992ea] warn_slowpath_null+0x1a/0x20
  [  +0.005889]  [a02f00ee] nested_vmx_vmexit+0x7ee/0x880 
  [kvm_intel]
  [  +0.007014]  [a02f05af] ? vmx_handle_exit+0x1bf/0xaa0 
  [kvm_intel]
  [  +0.007015]  [a02f039c] vmx_queue_exception+0xfc/0x150 
  [kvm_intel]
  [  +0.007130]  [a028cdfd] kvm_arch_vcpu_ioctl_run+0xd9d/0x1290 
  [kvm]
 
 (There is only one execution path and unless there is a race, it would
  be prevented by [1].)
 
  [  +0.007111]  [a0288528] ? kvm_arch_vcpu_load+0x58/0x220 [kvm]
  [  +0.006670]  [a0274cbc] kvm_vcpu_ioctl+0x32c/0x5c0 [kvm]
 [...]
[1] http://article.gmane.org/gmane.comp.emulators.kvm.devel/132937
[2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708

-- 
/kashyap


Re: stand-alone kvmtool

2015-02-23 Thread Pekka Enberg

Hi,

On 2/18/15 5:50 PM, Will Deacon wrote:

Thanks for doing this. Since it looks unlikely that kvmtool will ever be
merged back into the kernel tree, it makes sense to cut the dependency
in my opinion.


I am certainly OK with a standalone repository which preserves the 
history. Will, would you like to take over the proposed new repository 
and put it somewhere on git.kernel.org, perhaps?


- Pekka


Re: [nVMX] With 3.20.0-0.rc0.git5.1 on L0, booting L2 guest results in L1 *rebooting*

2015-02-23 Thread Kashyap Chamarthy
On Mon, Feb 23, 2015 at 05:14:37PM +0100, Kashyap Chamarthy wrote:
 On Mon, Feb 23, 2015 at 02:56:11PM +0100, Radim Krčmář wrote:
  2015-02-22 16:46+0100, Kashyap Chamarthy:
   Radim,
   
   I just tested with your patch[1] in this thread. I built a Fedora
   Kernel[2] with it, and installed (and booted into) it on both L0 and L1. 
   
   Result: I don't have good news, I'm afraid: L1 *still* reboots when an
   L2 guest is booted. And, L0 throws the stack trace that was
   previously noted on this thread:
  
  Thanks, I'm puzzled though ... isn't it possible that a wrong kernel
  sneaked into grub?
 
 Hmm, unlikely - I just double-confirmed that I'm running the same
 patched Kernel (3.20.0-0.rc0.git9.1.fc23.x86_64) on both L0 and L1.

[Correcting myself here.]

Unfortunately, I was doubly wrong and your guess is right -- I seem to
have made _two_ Kernel builds (one with your patch, one without), and I
am now not sure _which_ one I used, as I didn't add a custom tag. To add
to the confusion, I previously pointed to the wrong build (without your
fix) in this thread, so I most likely used that one in my last test.

The correct build is here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=9006612

And, the build log does confirm the 'nvmx-fix.patch' that was applied

https://kojipkgs.fedoraproject.org//work/tasks/6612/9006612/build.log

As for the contents of the patch: I just generated it with `diff -u orig
new > nvmx-fix.patch`, forgetting that the Fedora Kernel handles git
formatted patches just fine.

$ cat nvmx-fix.patch 
--- vmx.c.orig  2015-02-20 19:09:49.850841320 +0100
+++ vmx.c   2015-02-20 19:11:12.153491715 +0100
@@ -2038,6 +2038,9 @@
 {
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
+	if (to_vmx(vcpu)->nested.nested_run_pending)
+		return 0;
+
 	if (!(vmcs12->exception_bitmap & (1u << nr)))
 		return 0;

So, my conclusion was wrong and I need to report back with the _proper_
Kernel build.

-- 
/kashyap


Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-23 Thread Radim Krčmář
2015-02-20 16:44-0600, Joel Schopp:
 From: David Kaplan david.kap...@amd.com
 
 Reduce the number of exits by avoiding exiting when the guest writes TS or MP
 bits of CR0.  INTERCEPT_CR0_WRITE intercepts all writes to CR0 including TS 
 and
 MP bits. It intercepts these even if INTERCEPT_SELECTIVE_CR0 is set.  What we
 should be doing is setting INTERCEPT_SELECTIVE_CR0 and not setting
 INTERCEPT_CR0_WRITE.
 
 Signed-off-by: David Kaplan david.kap...@amd.com
 [added remove of clr_cr_intercept in init_vmcb, fixed check in handle_exit,
 added emulation on interception back in, forward ported, tested]
 Signed-off-by: Joel Schopp joel.sch...@amd.com
 ---
  arch/x86/kvm/svm.c |   13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)
 
 diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
 index d319e0c..55822e5 100644
 --- a/arch/x86/kvm/svm.c
 +++ b/arch/x86/kvm/svm.c
 @@ -1539,10 +1538,8 @@ static void update_cr0_intercept(struct vcpu_svm *svm)
  
   if (gcr0 == *hcr0 && svm->vcpu.fpu_active) {
   clr_cr_intercept(svm, INTERCEPT_CR0_READ);


 - clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
   } else {
   set_cr_intercept(svm, INTERCEPT_CR0_READ);

(There is no point in checking fpu_active if cr0s are equal.)

 - set_cr_intercept(svm, INTERCEPT_CR0_WRITE);

KVM uses lazy FPU and the state is undefined before the first access.
We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
if we allow the guest to clear cr0.ts without exiting, it can access FPU
with undefined state.

Is this code failing to disable the intercept when FPU is active?

 @@ -2940,7 +2937,11 @@ static int cr_interception(struct vcpu_svm *svm)
 + if (svm->vmcb->control.exit_code == SVM_EXIT_CR0_SEL_WRITE)
 +cr = 16;
^^^

 + else
 +cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
^^^

Linux uses tabs for indentation.
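
What the quoted hunk computes can be sketched in isolation. This is a standalone illustration and not the KVM function: struct cr_access and decode_cr_exit are hypothetical names, the exit code values are the ones I believe the SVM headers define, and the split at 16 assumes the read and write exit codes for CR0 to CR15 are laid out back to back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Exit codes as believed to be defined for SVM; treat as assumptions. */
#define SVM_EXIT_READ_CR0       0x000
#define SVM_EXIT_WRITE_CR0      0x010
#define SVM_EXIT_CR0_SEL_WRITE  0x065

/* Hypothetical result type for this sketch. */
struct cr_access {
	int  reg;       /* control register number, 0..15 */
	bool is_write;  /* true for a write, false for a read */
};

/* Hypothetical decoder; only meaningful for CR read/write exit codes. */
static struct cr_access decode_cr_exit(uint32_t exit_code)
{
	struct cr_access a;
	int cr;

	if (exit_code == SVM_EXIT_CR0_SEL_WRITE)
		cr = 16;	/* handle a selective CR0 write like a CR0 write */
	else
		cr = (int)(exit_code - SVM_EXIT_READ_CR0);

	a.is_write = cr >= 16;
	a.reg = a.is_write ? cr - 16 : cr;
	return a;
}
```

The special case is needed because the selective CR0 write exit code does not sit in the contiguous read/write range, so the plain subtraction would not yield 16 for it.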

 @@ -3502,7 +3503,7 @@ static int handle_exit(struct kvm_vcpu *vcpu)
   struct kvm_run *kvm_run = vcpu->run;
   u32 exit_code = svm->vmcb->control.exit_code;
  
 - if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE))
 + if (!is_cr_intercept(svm, INTERCEPT_SELECTIVE_CR0))
   vcpu->arch.cr0 = svm->vmcb->save.cr0;

I think the purpose of this code is to get changes that happened while
we weren't monitoring CR0, and we introduce a bug here ... MP and TS can
change, but those changes are lost to arch.cr0 now.

(The original code was also suspicious -- we propagate changes of
 vmcb->save.cr0.  I don't think we want to clear guest's CD and NW /
 set MP and TS, which is the result of what we do in svm_set_cr0() /
 update_cr0_intercept() ...)


Re: [nVMX] With 3.20.0-0.rc0.git5.1 on L0, booting L2 guest results in L1 *rebooting*

2015-02-23 Thread Kashyap Chamarthy
Tested with the _correct_ Kernel[1] (that has Radim's patch) now --
applied it on both L0 and L1.

Result: Same as before -- Booting L2 causes L1 to reboot. However, the
stack trace from `dmesg` on L0 takes a slightly different path than
before -- it goes through MSR handling:

. . .
[Feb23 12:14] [ cut here ]
[  +0.004658] WARNING: CPU: 5 PID: 1785 at arch/x86/kvm/vmx.c:9973 
nested_vmx_vmexit+0xbde/0xd30 [kvm_intel]()
[  +0.009897] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge
stp llc ebtable_filter ebtables ip6table_filter ip6_tables iTCO_wdt
ipmi_devintf gpio_ich iTCO_vendor_support coretemp kvm_intel dcdbas kvm
crc32c_intel joydev ipmi_ssif serio_raw ipmi_si tpm_tis i7core_edac lpc_ich
ipmi_msghandler edac_core tpm mfd_core shpchp wmi acpi_power_meter
acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc mgag200
i2c_algo_bit drm_kms_helper ttm ata_generic drm pata_acpi megaraid_sas bnx2
[  +0.060790] CPU: 5 PID: 1785 Comm: qemu-system-x86 Not tainted 
3.20.0-0.rc0.git9.1.fc23.x86_64 #1
[  +0.008938] Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 2.8.2 
10/25/2012
[  +0.007476]   8ba15f99 88ff5d627b38 
818773cd
[  +0.007727]    88ff5d627b78 
810ab3ba
[  +0.007660]  88ff5d627b68 883f5fd2  

[  +0.007729] Call Trace:
[  +0.002543]  [818773cd] dump_stack+0x4c/0x65
[  +0.005205]  [810ab3ba] warn_slowpath_common+0x8a/0xc0
[  +0.006085]  [810ab4ea] warn_slowpath_null+0x1a/0x20
[  +0.005915]  [a0244f8e] nested_vmx_vmexit+0xbde/0xd30 [kvm_intel]
[  +0.007061]  [a0245976] vmx_set_msr+0x416/0x420 [kvm_intel]
[  +0.006549]  [a029f0c0] ? kvm_set_msr+0x70/0x70 [kvm]
[  +0.006018]  [a029f091] kvm_set_msr+0x41/0x70 [kvm]
[  +0.005840]  [a029f0f3] do_set_msr+0x33/0x50 [kvm]
[  +0.005692]  [a02a3a80] msr_io+0x100/0x1c0 [kvm]
[  +0.005567]  [a02a3a10] ? msr_io+0x90/0x1c0 [kvm]
[  +0.005657]  [a023de70] ? handle_task_switch+0x1f0/0x1f0 [kvm_intel]
[  +0.007321]  [a02ac799] kvm_arch_vcpu_ioctl+0xb79/0x11a0 [kvm]
[  +0.006788]  [a023f7fe] ? vmx_vcpu_load+0x15e/0x1e0 [kvm_intel]
[  +0.006878]  [a0298666] ? vcpu_load+0x26/0x70 [kvm]
[  +0.005825]  [a02abac3] ? kvm_arch_vcpu_load+0xb3/0x210 [kvm]
[  +0.006712]  [a02987da] kvm_vcpu_ioctl+0xea/0x7e0 [kvm]
[  +0.006140]  [81027b9d] ? native_sched_clock+0x2d/0xa0
[  +0.006063]  [810d5c56] ? creds_are_invalid.part.1+0x16/0x50
[  +0.006583]  [810d5cb1] ? creds_are_invalid+0x21/0x30
[  +0.005984]  [813a77fa] ? inode_has_perm.isra.48+0x2a/0xa0
[  +0.006436]  [8128c9a8] do_vfs_ioctl+0x2e8/0x530
[  +0.005559]  [8128cc71] SyS_ioctl+0x81/0xa0
[  +0.005135]  [81880969] system_call_fastpath+0x12/0x17
[  +0.006065] ---[ end trace a7f3bc31fb0ddbff ]---
. . .


[1] 
https://kashyapc.fedorapeople.org/kernel-3.20.0-0.rc0.git9.1.fc23.rpms-with-nvmx-test-fix-from-radim/
 - I uploaded the Fedora Koji scratch build for this Kernel to a
   more permanent location, as these types of builds are removed
   automatically after 3 weeks

-- 
/kashyap


Re: [PATCH v8 4/5] KVM: arm/arm64: remove coarse grain dist locking at kvm_vgic_sync_hwstate

2015-02-23 Thread Christoffer Dall
On Mon, Jan 19, 2015 at 05:43:12PM +0100, Eric Auger wrote:
 To prepare for irqfd addition, coarse grain locking is removed at
 kvm_vgic_sync_hwstate level and finer grain locking is introduced in
 vgic_process_maintenance only.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org

Acked-by: Christoffer Dall christoffer.d...@linaro.org


Re: [PATCH v8 5/5] KVM: arm/arm64: add irqfd support

2015-02-23 Thread Christoffer Dall
On Mon, Jan 19, 2015 at 05:43:13PM +0100, Eric Auger wrote:
 This patch enables irqfd on arm/arm64.
 
 Both irqfd and resamplefd are supported. Injection is implemented
 in vgic.c without routing.
 
 This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.
 
 KVM_CAP_IRQFD is now advertised. The KVM_CAP_IRQFD_RESAMPLE capability
 is automatically advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.
 
 Irqfd injection is restricted to SPI. The rationale behind not
 supporting PPI irqfd injection is that any device using a PPI would
 be a private-to-the-CPU device (timer for instance), so its state
 would have to be context-switched along with the VCPU and would
 require in-kernel wiring anyhow. It is not a relevant use case for
 irqfds.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 
 ---
 v7 -> v8:
 - remove kvm_irq_has_notifier call
 - part of dist locking changes now are part of previous patch file
 - remove gic_initialized() check in kvm_set_irq
 - remove Christoffer's Reviewed-by after this change
 
 v5 -> v6:
 - KVM_CAP_IRQFD support depends on vgic_present
 - add Christoffer's Reviewed-by
 
 v4 -> v5:
 - squash [PATCH v4 3/3] KVM: arm64: add irqfd support into this patch
 - some rewording in Documentation/virtual/kvm/api.txt and in vgic
   vgic_process_maintenance unlock comment.
 - move explanation of why not supporting PPI into commit message
 - in case of injection before gic readiness, -ENODEV is returned. It is
   up to the user space to avoid this situation.
 
 v3 -> v4:
 - reword commit message
 - explain why we unlock the distributor before calling kvm_notify_acked_irq
 - rename is_assigned_irq into has_notifier
 - change EOI and injection kvm_debug format string
 - remove error local variable in kvm_set_irq
 - Move HAVE_KVM_IRQCHIP unset in a separate patch
 - handle the case where the irqfd injection is attempted before the vgic is
   ready. In such a case the notifier, if any, is called immediately
 - use nr_irqs to test spi is within correct range
 
 v2 -> v3:
 - removal of irq.h from eventfd.c put in a separate patch to increase
   visibility
 - properly expose KVM_CAP_IRQFD capability in arm.c
 - remove CONFIG_HAVE_KVM_IRQCHIP, meaningful only if irq_comm.c is used
 
 v1 -> v2:
 - rebase on 3.17rc1
 - move of the dist unlock in process_maintenance
 - remove of dist lock in __kvm_vgic_sync_hwstate
 - rewording of the commit message (add resamplefd reference)
 - remove irq.h
 ---
  Documentation/virtual/kvm/api.txt |  6 +-
  arch/arm/include/uapi/asm/kvm.h   |  3 +++
  arch/arm/kvm/Kconfig              |  2 ++
  arch/arm/kvm/Makefile             |  2 +-
  arch/arm/kvm/arm.c                |  5 +
  arch/arm64/include/uapi/asm/kvm.h |  3 +++
  arch/arm64/kvm/Kconfig            |  2 ++
  arch/arm64/kvm/Makefile           |  2 +-
  virt/kvm/arm/vgic.c               | 45 +++
  9 files changed, 67 insertions(+), 3 deletions(-)
 
 diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
 index 0007fef..5ed8088 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2231,7 +2231,7 @@ into the hash PTE second double word).
  4.75 KVM_IRQFD
  
  Capability: KVM_CAP_IRQFD
 -Architectures: x86 s390
 +Architectures: x86 s390 arm arm64
  Type: vm ioctl
  Parameters: struct kvm_irqfd (in)
  Returns: 0 on success, -1 on error
 @@ -2257,6 +2257,10 @@ Note that closing the resamplefd is not sufficient to disable the
  irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
  and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
  
 +On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
 +Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
 +given by gsi + 32.
 +
  4.76 KVM_PPC_ALLOCATE_HTAB
  
  Capability: KVM_CAP_PPC_ALLOC_HTAB
 diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
 index 0db25bc..2499867 100644
 --- a/arch/arm/include/uapi/asm/kvm.h
 +++ b/arch/arm/include/uapi/asm/kvm.h
 @@ -198,6 +198,9 @@ struct kvm_arch_memory_slot {
  /* Highest supported SPI, from VGIC_NR_IRQS */
  #define KVM_ARM_IRQ_GIC_MAX  127
  
 +/* One single KVM irqchip, ie. the VGIC */
 +#define KVM_NR_IRQCHIPS  1
 +
  /* PSCI interface */
  #define KVM_PSCI_FN_BASE 0x95c1ba5e
  #define KVM_PSCI_FN(n)   (KVM_PSCI_FN_BASE + (n))
 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 9f581b1..e519a40 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -24,6 +24,7 @@ config KVM
   select KVM_MMIO
   select KVM_ARM_HOST
   depends on ARM_VIRT_EXT && ARM_LPAE
 + select HAVE_KVM_EVENTFD
   ---help---
 Support hosting virtualized guest machines. You will also
 need to select one or more of the processor modules below.
 @@ -55,6 +56,7 @@ config KVM_ARM_MAX_VCPUS
  config KVM_ARM_VGIC
   bool "KVM support for Virtual GIC"
   depends 
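
The api.txt hunk in this patch states that, on ARM/ARM64, the gsi field
of struct kvm_irqfd is an SPI index and that the GIC interrupt ID is
gsi + 32. A minimal sketch of that mapping follows. The helper name, the
nr_irqs parameter and the -EINVAL return value are assumptions made for
illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* SGIs (IDs 0-15) and PPIs (IDs 16-31) occupy the first 32 GIC interrupt IDs. */
#define GIC_NR_PRIVATE_IRQS 32u

/*
 * Hypothetical helper: map an irqfd gsi (an SPI index) to a GIC interrupt ID.
 * nr_irqs is assumed to be the total number of interrupt IDs the virtual GIC
 * models, private interrupts included.
 */
static int spi_gsi_to_gic_irq(uint32_t gsi, uint32_t nr_irqs, uint32_t *irq)
{
	assert(irq != NULL);

	/* Reject SPI indexes that fall outside the modelled range. */
	if (nr_irqs < GIC_NR_PRIVATE_IRQS ||
	    gsi >= nr_irqs - GIC_NR_PRIVATE_IRQS)
		return -EINVAL;

	*irq = gsi + GIC_NR_PRIVATE_IRQS;
	return 0;
}
```

With nr_irqs = 128, gsi 0 maps to interrupt ID 32 and gsi 95 maps to
interrupt ID 127, while gsi 96 is rejected.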

Re: [v3 21/26] x86, irq: Define a global vector for VT-d Posted-Interrupts

2015-02-23 Thread Marcelo Tosatti
On Fri, Dec 12, 2014 at 11:14:55PM +0800, Feng Wu wrote:
 Currently, we use a global vector as the Posted-Interrupts
 Notification Event for all the vCPUs in the system. We need
 to introduce another global vector for VT-d Posted-Interrupts,
 which will be used to wake up the sleeping vCPU when an external
 interrupt from a direct-assigned device happens for that vCPU.
 
 Signed-off-by: Feng Wu feng...@intel.com

Why is an additional vector necessary?

Can't you simply wake up the vcpu from kvm_posted_intr_ipi, the posted
interrupt vector handler?



Re: [v3 23/26] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted

2015-02-23 Thread Marcelo Tosatti
On Fri, Dec 12, 2014 at 11:14:57PM +0800, Feng Wu wrote:
 This patch updates the Posted-Interrupts Descriptor when vCPU
 is preempted.
 
 sched out:
 - Set 'SN' to suppress future non-urgent interrupts posted for
 the vCPU.

What wakes the vcpu in the case of a non-urgent interrupt, then?

I wonder how software is supposed to configure the urgent/non-urgent
flag. Can you give examples of (hypothetical) urgent and non-urgent
interrupts?

 sched in:
 - Clear 'SN'
 - Change NDST if vCPU is scheduled to a different CPU
 - Set 'NV' to POSTED_INTR_VECTOR

What about:

POSTED_INTR_VECTOR interrupt handler:
- Wakeup vcpu.
- Set 'SN' to suppress future interrupts.

HLT emulation entry:
- Clear 'SN' to receive VT-d interrupt notification.

 Signed-off-by: Feng Wu feng...@intel.com
 ---
  arch/x86/kvm/vmx.c | 44 
  1 file changed, 44 insertions(+)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index ee3b735..bf2e6cd 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -1916,10 +1916,54 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
  		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
  		vmx->loaded_vmcs->cpu = cpu;
  	}
 +
 +	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
 +		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
 +		struct pi_desc old, new;
 +		unsigned int dest;
 +
 +		memset(&old, 0, sizeof(old));
 +		memset(&new, 0, sizeof(new));
 +
 +		do {
 +			old.control = new.control = pi_desc->control;
 +			if (vcpu->cpu != cpu) {
 +				dest = cpu_physical_id(cpu);
 +
 +				if (x2apic_enabled())
 +					new.ndst = dest;
 +				else
 +					new.ndst = (dest << 8) & 0xFF00;
 +			}
 +
 +			pi_clear_sn(&new);
 +
 +			/* set 'NV' to 'notification vector' */
 +			new.nv = POSTED_INTR_VECTOR;
 +		} while (cmpxchg(&pi_desc->control, old.control,
 +				new.control) != old.control);
 +	}
  }
  
  static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
  {
 +	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
 +		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
 +		struct pi_desc old, new;
 +
 +		memset(&old, 0, sizeof(old));
 +		memset(&new, 0, sizeof(new));
 +
 +		/* Set SN when the vCPU is preempted */
 +		if (vcpu->preempted) {
 +			do {
 +				old.control = new.control = pi_desc->control;
 +				pi_set_sn(&new);
 +			} while (cmpxchg(&pi_desc->control, old.control,
 +					new.control) != old.control);
 +		}
 +	}
 +
  	__vmx_load_host_state(to_vmx(vcpu));
  	if (!vmm_exclusive) {
  		__loaded_vmcs_clear(to_vmx(vcpu)->loaded_vmcs);
 -- 
 1.9.1
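
The patch updates the descriptor's control word with a cmpxchg() retry
loop so that bits changed concurrently (for instance by hardware posting
an interrupt) are not lost. Below is a user-space sketch of the same
read-modify-write pattern, using C11 atomics in place of the kernel's
cmpxchg(). The function names and the bit positions chosen for SN and NV
inside the 64-bit word are assumptions for illustration only, and the
NDST update is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the 64-bit control word, for illustration only. */
#define PI_SN_BIT   (UINT64_C(1) << 1)          /* suppress notification */
#define PI_NV_SHIFT 16                          /* notification vector */
#define PI_NV_MASK  (UINT64_C(0xff) << PI_NV_SHIFT)

/* sched out: set SN without disturbing any other bit. */
static void pi_sched_out(_Atomic uint64_t *control)
{
	uint64_t old, new;

	assert(control != NULL);
	old = atomic_load(control);
	do {
		new = old | PI_SN_BIT;
		/* On failure, 'old' is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(control, &old, new));
}

/* sched in: clear SN and store the notification vector. */
static void pi_sched_in(_Atomic uint64_t *control, uint8_t vector)
{
	uint64_t old, new;

	assert(control != NULL);
	old = atomic_load(control);
	do {
		new = old & ~PI_SN_BIT;
		new = (new & ~PI_NV_MASK) | ((uint64_t)vector << PI_NV_SHIFT);
	} while (!atomic_compare_exchange_weak(control, &old, new));
}
```

Recomputing 'new' from the freshly observed 'old' on every iteration is
what lets a concurrent update to an unrelated bit survive the write.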
 


Re: [GIT PULL] locking fixes

2015-02-23 Thread Christian Borntraeger
Am 21.02.2015 um 02:51 schrieb Linus Torvalds:
 So here's my try at fixing READ_ONCE() so that it is happy with 'const' 
 sources.
 
 It is entirely untested. Comments/testing?
 
 Christian, I guess I could just have forced a cast instead of the
 union. I'd like you to take a look at this, because right now it's
 holding up me pulling from Ingo.

Sorry that this answer comes too late for rc1, but I was traveling for the
last 4 days.

Hmm, some autocasting feels better, but I could not come up with a proper
solution that works for all cases (e.g. I tried __auto_type __val = x
or typeof(x * 0) to make this lvalue and rvalue, but all variants failed
in one or the other way).
Unless I can come up with a better solution, your union patch is probably
the best way to go, and rc1 seems to work.
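
For reference, a stand-alone sketch of the union idea discussed in this
thread: the temporary is declared inside a union, so the copy can be
written through a plain char array even when typeof(x) is
const-qualified. It relies on the GNU C extensions __typeof__ and
statement expressions. The names READ_ONCE_SKETCH and
read_once_size_sketch are hypothetical, and this is not the patch itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read 'size' bytes from a possibly const source. Because the source is
 * taken as 'const volatile void *', passing a pointer to const data does
 * not discard qualifiers. The casts assume a build that tolerates this
 * kind of type punning (the kernel uses -fno-strict-aliasing).
 */
static inline void read_once_size_sketch(const volatile void *p, void *res,
					 size_t size)
{
	assert(p != NULL && res != NULL);

	switch (size) {
	case 1: { uint8_t v = *(const volatile uint8_t *)p; memcpy(res, &v, 1); break; }
	case 2: { uint16_t v = *(const volatile uint16_t *)p; memcpy(res, &v, 2); break; }
	case 4: { uint32_t v = *(const volatile uint32_t *)p; memcpy(res, &v, 4); break; }
	case 8: { uint64_t v = *(const volatile uint64_t *)p; memcpy(res, &v, 8); break; }
	default:
		/* Larger objects: plain copy, no single-access guarantee. */
		memcpy(res, (const void *)p, size);
	}
}

/* The union lets us fill 'bytes' even when 'val' is const-qualified. */
#define READ_ONCE_SKETCH(x)						\
({									\
	union { __typeof__(x) val; char bytes[sizeof(x)]; } u;		\
	read_once_size_sketch(&(x), u.bytes, sizeof(x));		\
	u.val;								\
})
```

With a 'const struct kvm_lock_waiting *w' style pointer, an expression
such as READ_ONCE_SKETCH(w->lock) then compiles without the "discards
'const' qualifier" warning quoted in this thread.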

 
 And Ingo, I think you need to add some kind of test for horrible new
 warnings. I think your pull request *worked*, but the tens of lines
 of new warnings it generates is unacceptable, and will just cause me
 to undo the pull if I notice in time (like I did this time).

I got several complaints from the linux-next buildbots about new
sparse warnings, compile warnings and so on when doing this rework, e.g.
commit c5b19946eb76c675 (kernel: Fix sparse warning for ACCESS_ONCE)
fixes two of those warnings.
So I am somewhat surprised that I never saw this one, as I am also following
the KVM list. It turns out that get_maintainer.pl does not CC the kvm list
for arch/x86/kernel/kvm.c.

Maybe I should push something like that to Paolo/Marcelo.


--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5574,6 +5574,7 @@ S:	Supported
 F: Documentation/*/kvm*.txt
 F: Documentation/virtual/kvm/
 F: arch/*/kvm/
+F: arch/x86/kernel/kvm.c
 F: arch/*/include/asm/kvm*
 F: include/linux/kvm*
 F: include/uapi/linux/kvm*

Christian




 
 Linus
 
 On Fri, Feb 20, 2015 at 4:03 PM, Linus Torvalds
 torva...@linux-foundation.org wrote:
 How does this work for you at all?

 On Fri, Feb 20, 2015 at 5:37 AM, Ingo Molnar mi...@kernel.org wrote:
 diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
 index 94f643484300..e354cc6446ab 100644
 --- a/arch/x86/kernel/kvm.c
 +++ b/arch/x86/kernel/kvm.c
 @@ -803,8 +808,8 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
         add_stats(RELEASED_SLOW, 1);
         for_each_cpu(cpu, &waiting_cpus) {
                 const struct kvm_lock_waiting *w = &per_cpu(klock_waiting, cpu);
 -               if (ACCESS_ONCE(w->lock) == lock &&
 -                   ACCESS_ONCE(w->want) == ticket) {
 +               if (READ_ONCE(w->lock) == lock &&
 +                   READ_ONCE(w->want) == ticket) {
                         add_stats(RELEASED_SLOW_KICKED, 1);
                         kvm_kick_cpu(cpu);
                         break;

 I get horrible compile warnings from this, because of how 'w' is a
 pointer to a 'const' structure, which then causes things like

 include/linux/compiler.h:262:39: warning: passing argument 1 of
 ‘__read_once_size’ discards ‘const’ qualifier from pointer target type
   ({ typeof(x) __val; __read_once_size(&x, &__val, sizeof(__val)); __val; })

 which is fairly hard to avoid (looks like it might need a union)

Linus



Re: FreeBSD 10.1 disk performance lower than identical Linux Guest

2015-02-23 Thread Ruben Kerkhof
On Sat, Feb 21, 2015 at 7:59 PM, Greg Langford g...@langford.me wrote:
 Good Evening,

Hi Greg,

 I am running CentOS 6.6 x86_64 as a KVM host and have a number of
 guests running.

 After some experimenting I have noticed something curious, FreeBSD
 disk performance seems to be just over half of that of a Linux guest
 with an identical configuration.

 From my understanding virtio is included in FreeBSD 10.0 onwards.

 My CentOS guest has approx 130MB/s when using dd to read /dev/zero and
 write it to a file on the guest file system. This is about the same
 when doing the same on the hypervisor itself. The stats are gathered
 using iotop on the hypervisor while the test is performed.

 However the FreeBSD guest gets about 70MB/s maximum when performing
 the same test and is running FreeBSD 10.1

 Has anyone seen this before, is it a known issue or expected
 behaviour? I have been scratching my head about it for a number of
 days now.

There have been some performance improvements to the FreeBSD
virtio-blk driver in -CURRENT:
https://github.com/freebsd/freebsd/commit/6e8ba9083acb

Maybe these help?

 Best Regards,
 Greg Langford

Kind regards,
Ruben Kerkhof


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-02-23 Thread Marcelo Tosatti
On Wed, Feb 04, 2015 at 08:34:04PM +0400, Andrey Korolyov wrote:
 Hi,
 
 I've seen the problem quite a few times.  Before spending more time on
 it, I'd like to have a quick check here to see if anyone ever saw the
 same problem?  Hope it is a relevant question with this mail list.
 
 
 Jul  2 11:08:21 arno-3 kernel: [ 2165.078623] BUG: unable to handle
 kernel NULL pointer dereference at 0008
 Jul  2 11:08:21 arno-3 kernel: [ 2165.078916] IP: [8118d0fa]
 copy_huge_page+0x8a/0x2a0
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079128] PGD 0
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079198] Oops:  [#1] SMP
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079319] Modules linked in:
 ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp
 iptable_filter ip_tables x_tables kvm_intel kvm bridge stp llc ast ttm
 drm_kms_helper drm sysimgblt sysfillrect syscopyarea lp mei_me ioatdma
 ext2 parport mei shpchp dcdbas joydev mac_hid lpc_ich acpi_pad wmi
 hid_generic usbhid hid ixgbe igb dca i2c_algo_bit ahci ptp libahci
 mdio pps_core
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081090] CPU: 19 PID: 3494 Comm:
 qemu-system-x86 Not tainted 3.11.0-15-generic #25~precise1-Ubuntu
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081424] Hardware name: Dell Inc.
 PowerEdge C6220 II/09N44V, BIOS 2.0.3 07/03/2013
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081705] task: 88102675
 ti: 881026056000 task.ti: 881026056000
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081973] RIP:
 0010:[8118d0fa]  [8118d0fa]
 copy_huge_page+0x8a/0x2a0
 
 
 Hello,
 
 sorry for possible top-posting, the same issue appears on at least
 3.10 LTS series. The original thread is at
 http://marc.info/?l=kvmm=14043742300901.

Andrey,

I am unable to access the URL above.

 The necessary components for failure to reappear are a single running
 kvm guest and mounted large thp: hugepagesz=1G (seemingly the same as
 in initial report). With default 2M pages everything is working well,
 the same for 3.18 with 1G THP. Are there any obvious clues for the
 issue?
 
 Thanks!


