KVM Test report, kernel 205befd9... qemu ca916d37...
Hi All,

This is KVM upstream test result against kvm.git next branch and qemu-kvm.git uq/master branch.
kvm.git next branch: 205befd9a5c701b56f569434045821f413f08f6d based on kernel 3.11.0-rc1
qemu-kvm.git uq/master branch: ca916d3729564d0eb3c2374a96903f7e8aced8a7

We found one new bug and one bug fixed in the past two weeks.

New issue (1):
1. Guest hang after live migration
   https://bugs.launchpad.net/qemu/+bug/1213797

Fixed issue (1):
1. L2 can't boot up when creating L1 with '-cpu host' qemu option
   https://bugzilla.kernel.org/show_bug.cgi?id=60679

Old issues (10):
1. guest panic with parameter -cpu host in qemu command line (about vPMU issue).
   https://bugs.launchpad.net/qemu/+bug/994378
2. Can't install or boot up 32bit win8 guest.
   https://bugs.launchpad.net/qemu/+bug/1007269
3. vCPU hot-add makes the guest abort.
   https://bugs.launchpad.net/qemu/+bug/1019179
4. Nested Virt: VMX can't be initialized in L1 Xen (Xen on KVM)
   https://bugzilla.kernel.org/show_bug.cgi?id=45931
5. Guest has no xsave feature with parameter -cpu qemu64,+xsave in qemu command line.
   https://bugs.launchpad.net/qemu/+bug/1042561
6. Guest hang when doing kernel build and writing data in guest.
   https://bugs.launchpad.net/qemu/+bug/1096814
7. with 'monitor pty', it needs to flush pts device after sending command to it
   https://bugs.launchpad.net/qemu/+bug/1185228
8. [nested virt] L2 Windows guest can't boot up ('-cpu host' to start L1)
   https://bugzilla.kernel.org/show_bug.cgi?id=58921
9. [nested virt] L2 has NMI error when creating L1 with -cpu host parameter
   https://bugzilla.kernel.org/show_bug.cgi?id=58941
10. 64bit RHEL6.4 guest crashes and reboots continuously
   https://bugs.launchpad.net/qemu-kvm/+bug/1207623

Test environment:
Platform      IvyBridge-EP   Sandybridge-EP
CPU Cores     32             32
Memory size   64GB           32GB

Regards
Zhou Chao
Re: [PATCH v2] vhost: Include linux/uio.h instead of linux/socket.h
On Mon, Aug 19, 2013 at 09:23:19AM +0800, Asias He wrote:
> memcpy_fromiovec is moved from net/core/iovec.c to lib/iovec.c.
> linux/uio.h provides the declaration for memcpy_fromiovec.
>
> Include linux/uio.h instead of linux/socket.h for it.
>
> Signed-off-by: Asias He as...@redhat.com

Acked-by: Michael S. Tsirkin m...@redhat.com

> ---
>  drivers/vhost/vhost.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index e58cf00..448efe0 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -13,7 +13,7 @@
>  #include <linux/eventfd.h>
>  #include <linux/vhost.h>
> -#include <linux/socket.h> /* memcpy_fromiovec */
> +#include <linux/uio.h>
>  #include <linux/mm.h>
>  #include <linux/mmu_context.h>
>  #include <linux/miscdevice.h>
> --
> 1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM Test report, kernel 205befd9... qemu ca916d37...
On 08/19/2013 03:06 PM, Zhou, Chao wrote:
> Hi All,
>
> This is KVM upstream test result against kvm.git next branch and qemu-kvm.git uq/master branch.
> kvm.git next branch: 205befd9a5c701b56f569434045821f413f08f6d based on kernel 3.11.0-rc1
> qemu-kvm.git uq/master branch: ca916d3729564d0eb3c2374a96903f7e8aced8a7
>
> We found one new bug and one bug fixed in the past two weeks.
>
> New issue (1):
> 1. Guest hang after live migration
>    https://bugs.launchpad.net/qemu/+bug/1213797

Could you please do bisect to find the bad commit out? :)
RE: KVM Test report, kernel 205befd9... qemu ca916d37...
> -----Original Message-----
> From: Xiao Guangrong [mailto:xiaoguangr...@linux.vnet.ibm.com]
> Sent: Monday, August 19, 2013 3:34 PM
> To: Zhou, Chao
> Cc: kvm@vger.kernel.org
> Subject: Re: KVM Test report, kernel 205befd9... qemu ca916d37...
>
> On 08/19/2013 03:06 PM, Zhou, Chao wrote:
> > Hi All,
> >
> > This is KVM upstream test result against kvm.git next branch and qemu-kvm.git uq/master branch.
> > kvm.git next branch: 205befd9a5c701b56f569434045821f413f08f6d based on kernel 3.11.0-rc1
> > qemu-kvm.git uq/master branch: ca916d3729564d0eb3c2374a96903f7e8aced8a7
> >
> > We found one new bug and one bug fixed in the past two weeks.
> >
> > New issue (1):
> > 1. Guest hang after live migration
> >    https://bugs.launchpad.net/qemu/+bug/1213797
>
> Could you please do bisect to find the bad commit out? :)

This commit causes the bug:

commit 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b
Author: Arthur Chunqi Li yzt...@gmail.com
Re: KVM Test report, kernel 205befd9... qemu ca916d37...
On 08/19/2013 03:44 PM, Zhou, Chao wrote:
> > -----Original Message-----
> > From: Xiao Guangrong [mailto:xiaoguangr...@linux.vnet.ibm.com]
> > Sent: Monday, August 19, 2013 3:34 PM
> > To: Zhou, Chao
> > Cc: kvm@vger.kernel.org
> > Subject: Re: KVM Test report, kernel 205befd9... qemu ca916d37...
> >
> > On 08/19/2013 03:06 PM, Zhou, Chao wrote:
> > > Hi All,
> > >
> > > This is KVM upstream test result against kvm.git next branch and qemu-kvm.git uq/master branch.
> > > kvm.git next branch: 205befd9a5c701b56f569434045821f413f08f6d based on kernel 3.11.0-rc1
> > > qemu-kvm.git uq/master branch: ca916d3729564d0eb3c2374a96903f7e8aced8a7
> > >
> > > We found one new bug and one bug fixed in the past two weeks.
> > >
> > > New issue (1):
> > > 1. Guest hang after live migration
> > >    https://bugs.launchpad.net/qemu/+bug/1213797
> >
> > Could you please do bisect to find the bad commit out? :)
>
> This commit cause the bug:
>
> commit 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b
> Author: Arthur Chunqi Li yzt...@gmail.com

I cannot find this commit on either the next or the queue branch. Could you please show the full log?
Re: [PATCH] target-ppc: Update slb array with correct index values.
On 19.08.2013, at 09:25, Aneesh Kumar K.V wrote:
> Alexander Graf ag...@suse.de writes:
>> On 11.08.2013, at 20:16, Aneesh Kumar K.V wrote:
>>> From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
>>>
>>> Without this, a value of rb=0 and rs=0, result in us replacing the 0th index
>>>
>>> Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
>>
>> Wrong mailing list again ;).
>
> Will post the series again with updated commit message to the qemu list.
>
>>> ---
>>>  target-ppc/kvm.c | 14 --
>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>> index 30a870e..5d4e613 100644
>>> --- a/target-ppc/kvm.c
>>> +++ b/target-ppc/kvm.c
>>> @@ -1034,8 +1034,18 @@ int kvm_arch_get_registers(CPUState *cs)
>>>          /* Sync SLB */
>>>  #ifdef TARGET_PPC64
>>>          for (i = 0; i < 64; i++) {
>>> -            ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
>>> -                          sregs.u.s.ppc64.slb[i].slbv);
>>> +            target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
>>> +            /*
>>> +             * KVM_GET_SREGS doesn't retun slb entry with slot information
>>> +             * same as index. So don't depend on the slot information in
>>> +             * the returned value.
>>
>> This is the generating code in book3s_pr.c:
>>
>>     if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
>>         for (i = 0; i < 64; i++) {
>>             sregs->u.s.ppc64.slb[i].slbe = vcpu->arch.slb[i].orige | i;
>>             sregs->u.s.ppc64.slb[i].slbv = vcpu->arch.slb[i].origv;
>>         }
>>
>> Where exactly did you see broken slbe entries?
>
> I noticed this when adding support for guest memory dumping via qemu gdb
> server. Now the array we get would look like below
>
> slbe0 slbv0
> slbe1 slbv1
> 0     0

Ok, so that's where the problem lies. Why are the entries 0 here? Either we try to fetch more entries than we should, we populate entries incorrectly or the kernel simply returns invalid SLB entry values for invalid entries.

Are you seeing this with PR KVM or HV KVM?

Alex

> Once we get an array like that when we hit the third value we will
> replace the 0th entry, that is [slbe0 slbv0]. That resulted in failed
> translation of the address by qemu gdb server.
>
> -aneesh

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
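The overwrite described in this thread can be illustrated with a small, self-contained model. This is only a sketch, not the QEMU or kernel code: toy_store_slb() and the two sync helpers are hypothetical names, and the sketch assumes that the store routine takes the slot number from the low 12 bits of the slbe value. Under that assumption, any all-zero entry returned after the valid ones decodes to slot 0 and clobbers the first real entry, whereas forcing the slot bits to the array index keeps each entry in its own slot.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLB_ENTRIES 64         /* same loop bound as in the quoted patch */
#define SLOT_MASK   0xfffULL   /* assumption: slot index lives in the low 12 bits */

struct slb_entry {
    uint64_t esid;
    uint64_t vsid;
};

/* Hypothetical stand-in for a store routine that decodes the slot from rb. */
static void toy_store_slb(struct slb_entry *slb, uint64_t rb, uint64_t rs)
{
    unsigned slot = (unsigned)(rb & SLOT_MASK);

    if (slot >= SLB_ENTRIES) {
        return;                /* out-of-range slot: ignore in this model */
    }
    slb[slot].esid = rb & ~SLOT_MASK;
    slb[slot].vsid = rs;
}

/* Sync that trusts whatever slot bits come back in slbe. */
static void sync_trusting_slot(struct slb_entry *dst,
                               const uint64_t *slbe, const uint64_t *slbv)
{
    memset(dst, 0, sizeof(*dst) * SLB_ENTRIES);
    for (int i = 0; i < SLB_ENTRIES; i++) {
        toy_store_slb(dst, slbe[i], slbv[i]);
    }
}

/* Sync that ignores the returned slot bits and uses the array index. */
static void sync_using_index(struct slb_entry *dst,
                             const uint64_t *slbe, const uint64_t *slbv)
{
    memset(dst, 0, sizeof(*dst) * SLB_ENTRIES);
    for (int i = 0; i < SLB_ENTRIES; i++) {
        toy_store_slb(dst, (slbe[i] & ~SLOT_MASK) | (uint64_t)i, slbv[i]);
    }
}
```

With two valid entries followed by zeros, sync_trusting_slot() ends with slot 0 wiped, while sync_using_index() preserves it. Whether the zero entries themselves are expected is the open question Alex raises above; the sketch does not settle that.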
RE: KVM Test report, kernel 205befd9... qemu ca916d37...
> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Xiao Guangrong
> Sent: Monday, August 19, 2013 4:18 PM
> To: Zhou, Chao
> Cc: kvm@vger.kernel.org
> Subject: Re: KVM Test report, kernel 205befd9... qemu ca916d37...
>
> On 08/19/2013 03:44 PM, Zhou, Chao wrote:
> > > -----Original Message-----
> > > From: Xiao Guangrong [mailto:xiaoguangr...@linux.vnet.ibm.com]
> > > Sent: Monday, August 19, 2013 3:34 PM
> > > To: Zhou, Chao
> > > Cc: kvm@vger.kernel.org
> > > Subject: Re: KVM Test report, kernel 205befd9... qemu ca916d37...
> > >
> > > On 08/19/2013 03:06 PM, Zhou, Chao wrote:
> > > > Hi All,
> > > >
> > > > This is KVM upstream test result against kvm.git next branch and qemu-kvm.git uq/master branch.
> > > > kvm.git next branch: 205befd9a5c701b56f569434045821f413f08f6d based on kernel 3.11.0-rc1
> > > > qemu-kvm.git uq/master branch: ca916d3729564d0eb3c2374a96903f7e8aced8a7
> > > >
> > > > We found one new bug and one bug fixed in the past two weeks.
> > > >
> > > > New issue (1):
> > > > 1. Guest hang after live migration
> > > >    https://bugs.launchpad.net/qemu/+bug/1213797
> > >
> > > Could you please do bisect to find the bad commit out? :)
> >
> > This commit cause the bug:
> >
> > commit 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b
> > Author: Arthur Chunqi Li yzt...@gmail.com
>
> Can not find this comment on neither next nor queue branch... could you
> please show the full log?

Jinsong Liu (@Intel)'s patch "qemu-kvm bugfix for IA32_FEATURE_CONTROL" will fix this bug.
That patch will also fix the following bug:

10. 64bit RHEL6.4 guest crashes and reboots continuously
    https://bugs.launchpad.net/qemu-kvm/+bug/1207623

Best Regards,
Yongjie (Jay)
Re: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
On 18/08/2013 20:23, Liu, Jinsong wrote:
> From 1273f8b2e5464ec987facf9942fd3ccc0b69087e Mon Sep 17 00:00:00 2001
> From: Liu Jinsong jinsong@intel.com
> Date: Mon, 19 Aug 2013 09:33:30 +0800
> Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
>
> This patch is to fix the bug
> https://bugs.launchpad.net/qemu-kvm/+bug/1207623
>
> IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to
> cpuid.1.ecx of vcpu. Current qemu-kvm will error return when
> kvm_put_msrs or kvm_get_msrs.
>
> Signed-off-by: Liu Jinsong jinsong@intel.com
> ---
>  target-i386/kvm.c | 16 ++--
>  1 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 84ac00a..7facbfe 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -65,6 +65,7 @@ static bool has_msr_star;
>  static bool has_msr_hsave_pa;
>  static bool has_msr_tsc_adjust;
>  static bool has_msr_tsc_deadline;
> +static bool has_msr_feature_control;
>  static bool has_msr_async_pf_en;
>  static bool has_msr_pv_eoi_en;
>  static bool has_msr_misc_enable;
> @@ -644,6 +645,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      qemu_add_vm_change_state_handler(cpu_update_state, env);
>
> +    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
> +    if (c)
> +        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) |
> +                                  !!(c->ecx & CPUID_EXT_SMX);
> +
>      cpuid_data.cpuid.padding = 0;
>      r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
>      if (r) {
> @@ -1121,7 +1127,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>          if (hyperv_vapic_recommended()) {
>              kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
>          }
> -        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
> +        if (has_msr_feature_control) {
> +            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
> +                              env->msr_ia32_feature_control);
> +        }
>      }
>      if (env->mcg_cap) {
>          int i;
> @@ -1346,7 +1355,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (has_msr_misc_enable) {
>          msrs[n++].index = MSR_IA32_MISC_ENABLE;
>      }
> -    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
> +    if (has_msr_feature_control) {
> +        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
> +    }
>
>      if (!env->tsc_valid) {
>          msrs[n++].index = MSR_IA32_TSC;
> @@ -1447,6 +1458,7 @@ static int kvm_get_msrs(X86CPU *cpu)
>              break;
>          case MSR_IA32_FEATURE_CONTROL:
>              env->msr_ia32_feature_control = msrs[i].data;
> +            break;
>          default:
>              if (msrs[i].index >= MSR_MC0_CTL &&
>                  msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {

The patch looks good. Please repost it with checkpatch.pl failures fixed.

Paolo
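The idea behind the patch, requesting IA32_FEATURE_CONTROL only when the guest's CPUID leaf 1 advertises VMX or SMX, can be sketched in isolation. The following is a simplified model rather than the QEMU code: guest_has_feature_control() and build_msr_list() are hypothetical helpers, and the constants are written out locally (VMX is bit 5 and SMX is bit 6 of CPUID.1:ECX; IA32_FEATURE_CONTROL is MSR 0x3a).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_1_ECX_VMX          (1u << 5)  /* CPUID.1:ECX bit 5 */
#define CPUID_1_ECX_SMX          (1u << 6)  /* CPUID.1:ECX bit 6 */
#define MSR_IA32_FEATURE_CONTROL 0x3au
#define MSR_IA32_MISC_ENABLE     0x1a0u

/* The MSR is only meaningful if the guest sees VMX or SMX. */
static bool guest_has_feature_control(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & (CPUID_1_ECX_VMX | CPUID_1_ECX_SMX)) != 0;
}

/*
 * Hypothetical helper: fill 'out' with the MSR indices to request for a
 * guest with the given CPUID.1:ECX value and return how many were added.
 */
static size_t build_msr_list(uint32_t cpuid_1_ecx, uint32_t *out, size_t cap)
{
    size_t n = 0;

    if (n < cap) {
        out[n++] = MSR_IA32_MISC_ENABLE;          /* always requested here */
    }
    if (guest_has_feature_control(cpuid_1_ecx) && n < cap) {
        out[n++] = MSR_IA32_FEATURE_CONTROL;      /* only with VMX or SMX */
    }
    return n;
}
```

A guest whose ECX has neither bit set gets a list without MSR 0x3a, which corresponds to the case where the unconditional request is reported to fail in the commit message above.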
Re: kvm/queue still ahead of kvm/next
On 19/08/2013 06:36, Kashyap Chamarthy wrote:
> On Sat, Aug 10, 2013 at 12:56 AM, Paolo Bonzini pbonz...@redhat.com wrote:
>> Hi all,
>>
>> I'm seeing some breakage of shadow-on-shadow and shadow-on-EPT nested
>> VMX. Until I can track more precisely whether it is a regression, and
>> on which hosts I can reproduce it, I'm going to leave the patches out of
>> kvm/next.
>>
>> The good news is that nested EPT works pretty well. :)
>
> Paolo/others,
>
> I'm trying to test nEPT, so I'm trying to start this way:
>
> $ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> $ git branch --all
> $ git checkout remotes/origin/queue
>
> Compile, and proceed.
>
> Or would you suggest to use
> http://git.kiszka.org/?p=kvm-kmod.git;a=blob;f=README to test latest
> stuff?

If you prefer not to build your own kernel, kvm-kmod works too. However, please build it with the latest 3.10.x kernel to make the environment as similar as possible to what you'd get with kvm.git.

Paolo
Re: Emulation failure
On 19/08/2013 03:14, Duy Nguyen TN wrote:
> I got this error with qemu-kvm-0.15.1 on kernel 3.1.0-1.2-desktop
> (OpenSUSE 12.1). I know I should rerun it with latest kernel/qemu but
> I hope maybe this rings a bell or something, because it'll take some
> time for me to prepare new kernel.
>
> KVM internal error. Suberror: 1
> emulation failure
> RAX=77ff9000 RBX=77e93608 RCX=75d4d81a RDX=0001
> RSI=1000 RDI= RBP=69a07700 RSP=77e934b0
> R8 =0008 R9 = R10=0002 R11=0246
> R12=69a07700 R13=77e937d8 R14=003000704c04 R15=003000704c04
> RIP=00b1dd44 RFL=00010202 [---] CPL=3 II=0 A20=1 SMM=0 HLT=0
> ES =
> CS =0033 00a0fb00 DPL=3 CS64 [-RA]
> SS =002b 00c0f300 DPL=3 DS [-WA]
> DS =
> FS = 77e94700
> GS =
> LDT=
> TR =0040 88003aa0df80 2087 8b00 DPL=0 TSS64-busy
> GDT= 88003aa04000 007f
> IDT= 816ad000 0fff
> CR0=80050033 CR2=75a68180 CR3=289ad000 CR4=06f0
> DR0= DR1= DR2= DR3=
> DR6=0ff0 DR7=0400
> EFER=0d01
> Code=00 85 c0 75 5d 48 8b 05 5c f5 e1 00 48 83 b8 f0 00 00 00 00 df a8 f0 00 00 00 0f 88 a0 00 00 00 8b 05 4a f5 e1 00 48 89 44 24 80 df 6c 24 80 de c9 d8
>
> The disassembled code is
>
> 0x1dd10: push %rbx
> 0x1dd11: mov $0x6e,%eax
> 0x1dd16: mov %rdi,%rbx
> 0x1dd19: sub $0x20,%rsp
> 0x1dd1d: test %rdi,%rdi
> 0x1dd20: je 0xb1dd92
> 0x1dd22: mov 0x4bf1e0(%rip),%eax
> 0x1dd28: cmp $0x,%eax
> 0x1dd2b: je 0xb1ddd0
> 0x1dd31: test %eax,%eax
> 0x1dd33: jne 0xb1dd92
> 0x1dd35: mov 0xe1f55c(%rip),%rax
> 0x1dd3c: cmpq $0x0,0xf0(%rax)
> 0x1dd44: fildll 0xf0(%rax)
> 0x1dd4a: js 0xb1ddf0
> 0x1dd50: mov 0xe1f54a(%rip),%eax
> 0x1dd56: mov %rax,-0x80(%rsp)
> 0x1dd5b: fildll -0x80(%rsp)
> 0x1dd5f: fmulp %st,%st(1)
>
> Not sure if it helps but rax after 0xb1dd35 contains the pointer to
> mmap'd memory of /dev/hpet

I think this wouldn't work even with the latest kernel. Emulation of x87 instructions is not supported yet.

Paolo
Re: kvm/queue still ahead of kvm/next
On Mon, Aug 19, 2013 at 2:56 PM, Paolo Bonzini pbonz...@redhat.com wrote:
> On 19/08/2013 06:36, Kashyap Chamarthy wrote:
>> On Sat, Aug 10, 2013 at 12:56 AM, Paolo Bonzini pbonz...@redhat.com wrote:
>>> Hi all,
>>>
>>> I'm seeing some breakage of shadow-on-shadow and shadow-on-EPT nested
>>> VMX. Until I can track more precisely whether it is a regression, and
>>> on which hosts I can reproduce it, I'm going to leave the patches out of
>>> kvm/next.
>>>
>>> The good news is that nested EPT works pretty well. :)
>>
>> Paolo/others,
>>
>> I'm trying to test nEPT, so I'm trying to start this way:
>>
>> $ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git
>> $ git branch --all
>> $ git checkout remotes/origin/queue
>>
>> Compile, and proceed.
>>
>> Or would you suggest to use
>> http://git.kiszka.org/?p=kvm-kmod.git;a=blob;f=README to test latest
>> stuff?
>
> If you prefer not to build your own kernel, kvm-kmod works too.
> However, please build it with the latest 3.10.x kernel to make the
> environment as similar as possible to what you'd get with kvm.git.

I'm fine building new kernels. I just started this way:

$ git remote -v
origin git://git.kernel.org/pub/scm/virt/kvm/kvm.git (fetch)
origin git://git.kernel.org/pub/scm/virt/kvm/kvm.git (push)
$ git checkout -b test_nept origin/queue
$ make defconfig
$ make -j8 && make modules
$ make install && make modules_install

And, booting into the just built 3.11.0-rc1+ kernel (this is on bare-metal), hangs at:

-
. . .
[5.437220] systemd[1]: Failed to mount /dev: No such device
[5.523997] usb 2-1: New USB device found, idVendor=8087, idProduct=8000
[5.531600] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[5.540188] hub 2-1:1.0: USB hub found
[5.544573] hub 2-1:1.0: 8 ports detected
[5.757452] Switched to clocksource tsc
[5.879287] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
-

(This is on a Haswell machine). Am I missing anything here?

/kashyap
Re: kvm/queue still ahead of kvm/next
On Mon, Aug 19, 2013 at 3:12 PM, Kashyap Chamarthy kashyap...@gmail.com wrote:
> On Mon, Aug 19, 2013 at 2:56 PM, Paolo Bonzini pbonz...@redhat.com wrote:
>> On 19/08/2013 06:36, Kashyap Chamarthy wrote:
>>> On Sat, Aug 10, 2013 at 12:56 AM, Paolo Bonzini pbonz...@redhat.com wrote:
>>>> Hi all,
>>>>
>>>> I'm seeing some breakage of shadow-on-shadow and shadow-on-EPT nested
>>>> VMX. Until I can track more precisely whether it is a regression, and
>>>> on which hosts I can reproduce it, I'm going to leave the patches out of
>>>> kvm/next.
>>>>
>>>> The good news is that nested EPT works pretty well. :)
>>>
>>> Paolo/others,
>>>
>>> I'm trying to test nEPT, so I'm trying to start this way:
>>>
>>> $ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git
>>> $ git branch --all
>>> $ git checkout remotes/origin/queue
>>>
>>> Compile, and proceed.
>>>
>>> Or would you suggest to use
>>> http://git.kiszka.org/?p=kvm-kmod.git;a=blob;f=README to test latest
>>> stuff?
>>
>> If you prefer not to build your own kernel, kvm-kmod works too.
>> However, please build it with the latest 3.10.x kernel to make the
>> environment as similar as possible to what you'd get with kvm.git.
>
> I'm fine building news Kernels. I just started this way:
>
> $ git remote -v
> origin git://git.kernel.org/pub/scm/virt/kvm/kvm.git (fetch)
> origin git://git.kernel.org/pub/scm/virt/kvm/kvm.git (push)
> $ git checkout -b test_nept origin/queue
> $ make defconfig
> $ make -j8 && make modules
> $ make install && make modules_install
>
> And, booting into the just built 3.11.0-rc1+ kernel (this is on
> bare-metal), hangs at:
>
> -
> . . .
> [5.437220] systemd[1]: Failed to mount /dev: No such device
> [5.523997] usb 2-1: New USB device found, idVendor=8087, idProduct=8000
> [5.531600] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> [5.540188] hub 2-1:1.0: USB hub found
> [5.544573] hub 2-1:1.0: 8 ports detected
> [5.757452] Switched to clocksource tsc
> [5.879287] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
> -

After that, it goes one step further and hangs at:

---
[ 305.591298] kworker/u16:1 (24) used greatest stack depth: 5488 bytes left
---

> (This is on a Haswell machine). Am I missing anything here?
>
> /kashyap
Re: Oracle RAC in libvirt+KVM environment
On 15/08/2013 12:01, Timon Wang wrote:
> Thanks.
>
> I have read the link you provide, there is another link which tells me
> to pass a NPIV discovery lun as a disk, this is seen as a local direct
> access disk in windows. RAC and Failure Cluster both consider this
> pass through disk as local disk, not a share disk, and the setup
> process failed.
>
> Hyper-v provides a virtual Fiber Channel implementation, so I
> wondering if kvm has the same solution like it.

Can you include the XML file you are using for the domain?

Paolo
Multi Queue KVM Support
Hello experts,

I am trying to use the multi queue support on a Linux guest running Kernel 3.9.7.

The host's virsh version command reports the following output:

Compiled against library: libvirt 0.10.2
Using library: libvirt 0.10.2
Using API: QEMU 0.10.2
Running hypervisor: QEMU 0.12.1

The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns FALSE and I don't know why.

I'll really appreciate your help.

Thanks,
Naor
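For context on why such a check can come back false, here is a minimal model of virtio feature negotiation. It is a sketch under stated assumptions, not the kernel's virtio code: negotiate() and toy_has_feature() are hypothetical helpers, and the only values taken from the virtio-net definitions are the bit numbers (VIRTIO_NET_F_MAC is bit 5, VIRTIO_NET_F_MQ is bit 22). In this model a feature is visible to the driver only if both the device (host side) offers it and the driver supports it, so a host that never offers the multiqueue bit makes the guest-side check fail regardless of the guest kernel. Whether that is the cause in this report is not established in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MAC 5    /* device has a given MAC address */
#define VIRTIO_NET_F_MQ  22   /* device supports multiqueue */

/* Turn a feature bit number into a mask. */
static uint64_t feature_bit(unsigned bit)
{
    return 1ULL << bit;
}

/* Only features offered by the device AND supported by the driver survive. */
static uint64_t negotiate(uint64_t device_offered, uint64_t driver_supported)
{
    return device_offered & driver_supported;
}

/* Hypothetical stand-in for a "has feature" check on the negotiated set. */
static bool toy_has_feature(uint64_t negotiated, unsigned bit)
{
    return (negotiated & feature_bit(bit)) != 0;
}
```

In this model, a driver that supports both MAC and MQ talking to a device that offers only MAC ends up with MQ cleared, so the first thing to check is what the host side actually offers.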
RE: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
Paolo Bonzini wrote:
> The patch looks good. Please repost it with checkpatch.pl failures fixed.
>
> Paolo

Thanks Stefan and Paolo! Updated patch attached.

Regards,
Jinsong

===

From a0ddf948d40e42de862543157a5668a1c12faae6 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 19 Aug 2013 09:33:30 +0800
Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL

This patch is to fix the bug
https://bugs.launchpad.net/qemu-kvm/+bug/1207623

IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to
cpuid.1.ecx of vcpu. Current qemu-kvm will error return when
kvm_put_msrs or kvm_get_msrs.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 target-i386/kvm.c | 17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 84ac00a..5adeb03 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -65,6 +65,7 @@ static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_tsc_adjust;
 static bool has_msr_tsc_deadline;
+static bool has_msr_feature_control;
 static bool has_msr_async_pf_en;
 static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
@@ -644,6 +645,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
     qemu_add_vm_change_state_handler(cpu_update_state, env);
 
+    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
+    if (c) {
+        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
+                                  !!(c->ecx & CPUID_EXT_SMX);
+    }
+
     cpuid_data.cpuid.padding = 0;
     r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
     if (r) {
@@ -1121,7 +1128,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         if (hyperv_vapic_recommended()) {
             kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
         }
-        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
+        if (has_msr_feature_control) {
+            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
+                              env->msr_ia32_feature_control);
+        }
     }
     if (env->mcg_cap) {
         int i;
@@ -1346,7 +1356,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_misc_enable) {
         msrs[n++].index = MSR_IA32_MISC_ENABLE;
     }
-    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    if (has_msr_feature_control) {
+        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    }
 
     if (!env->tsc_valid) {
         msrs[n++].index = MSR_IA32_TSC;
@@ -1447,6 +1459,7 @@ static int kvm_get_msrs(X86CPU *cpu)
             break;
         case MSR_IA32_FEATURE_CONTROL:
             env->msr_ia32_feature_control = msrs[i].data;
+            break;
         default:
             if (msrs[i].index >= MSR_MC0_CTL &&
                 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
--
1.7.1

Attachment: 0001-qemu-kvm-bugfix-for-IA32_FEATURE_CONTROL.patch
Re: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
On 19.08.2013 16:31, Liu, Jinsong wrote:
> Paolo Bonzini wrote:
>> The patch looks good. Please repost it with checkpatch.pl failures fixed.
>>
>> Paolo
>
> Thanks Stefan and Paolo! Updated patch attached.
>
> Regards,
> Jinsong
>
> ===
>
> From a0ddf948d40e42de862543157a5668a1c12faae6 Mon Sep 17 00:00:00 2001
> From: Liu Jinsong jinsong@intel.com
> Date: Mon, 19 Aug 2013 09:33:30 +0800
> Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
>
> This patch is to fix the bug
> https://bugs.launchpad.net/qemu-kvm/+bug/1207623
>
> IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to
> cpuid.1.ecx of vcpu. Current qemu-kvm will error return when
> kvm_put_msrs or kvm_get_msrs.
>
> Signed-off-by: Liu Jinsong jinsong@intel.com

Jinsong, if this is for upstream QEMU, then the commit message needs some small improvements:

qemu-kvm is no longer maintained since 1.3 so it should not be occurring any more. Please use a prefix of target-i386: (the directory name) to signal where you are changing code, i.e. x86 only. bugfix is not a very telling description of what a patch is doing.
(Up to Paolo and Gleb whether they'll fix it or whether they require a resend.)

Also please use git-send-email to submit patches and use PATCH v2 etc. for submission as top-level patch:
http://wiki.qemu.org/Contribute/SubmitAPatch

One question inline...

> ---
>  target-i386/kvm.c | 17 +++--
>  1 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 84ac00a..5adeb03 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -65,6 +65,7 @@ static bool has_msr_star;
>  static bool has_msr_hsave_pa;
>  static bool has_msr_tsc_adjust;
>  static bool has_msr_tsc_deadline;
> +static bool has_msr_feature_control;
>  static bool has_msr_async_pf_en;
>  static bool has_msr_pv_eoi_en;
>  static bool has_msr_misc_enable;
> @@ -644,6 +645,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      qemu_add_vm_change_state_handler(cpu_update_state, env);
>
> +    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
> +    if (c) {
> +        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
> +                                  !!(c->ecx & CPUID_EXT_SMX);
> +    }
> +
>      cpuid_data.cpuid.padding = 0;
>      r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
>      if (r) {
> @@ -1121,7 +1128,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>          if (hyperv_vapic_recommended()) {
>              kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
>          }
> -        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
> +        if (has_msr_feature_control) {
> +            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
> +                              env->msr_ia32_feature_control);
> +        }
>      }
>      if (env->mcg_cap) {
>          int i;
> @@ -1346,7 +1356,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (has_msr_misc_enable) {
>          msrs[n++].index = MSR_IA32_MISC_ENABLE;
>      }
> -    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
> +    if (has_msr_feature_control) {
> +        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
> +    }
>
>      if (!env->tsc_valid) {
>          msrs[n++].index = MSR_IA32_TSC;
> @@ -1447,6 +1459,7 @@ static int kvm_get_msrs(X86CPU *cpu)
>              break;
>          case MSR_IA32_FEATURE_CONTROL:
>              env->msr_ia32_feature_control = msrs[i].data;
> +            break;

Was the fallthrough previously intended? Or is this a second, unmentioned bugfix?

Regards,
Andreas

>          default:
>              if (msrs[i].index >= MSR_MC0_CTL &&
>                  msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
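The fallthrough being asked about can be shown with a reduced switch statement. This is a toy model with hypothetical names (struct toy_env, handle_msr_without_break, handle_msr_with_break), not the QEMU function: without the break, control continues from the IA32_FEATURE_CONTROL case into the default arm. In the quoted hunk the default arm appears to act only on indices in the MSR_MC0_CTL range, so the practical effect there may have been nil, but that reading is an inference from the visible lines and is not confirmed in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_MISC_ENABLE     0x1a0u
#define MSR_IA32_FEATURE_CONTROL 0x3au

struct toy_env {
    uint64_t misc_enable;
    uint64_t feature_control;
    unsigned default_hits;     /* how often the default arm ran */
};

/* Pre-patch shape: the FEATURE_CONTROL case runs on into default. */
static void handle_msr_without_break(struct toy_env *env,
                                     uint32_t index, uint64_t data)
{
    switch (index) {
    case MSR_IA32_MISC_ENABLE:
        env->misc_enable = data;
        break;
    case MSR_IA32_FEATURE_CONTROL:
        env->feature_control = data;
        /* fall through */
    default:
        env->default_hits++;
        break;
    }
}

/* Post-patch shape: each case ends with its own break. */
static void handle_msr_with_break(struct toy_env *env,
                                  uint32_t index, uint64_t data)
{
    switch (index) {
    case MSR_IA32_MISC_ENABLE:
        env->misc_enable = data;
        break;
    case MSR_IA32_FEATURE_CONTROL:
        env->feature_control = data;
        break;
    default:
        env->default_hits++;
        break;
    }
}
```

Both variants store the value; only the first one also executes the default arm for MSR 0x3a.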
Re: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
On 19/08/2013 16:59, Andreas Färber wrote:
> qemu-kvm is no longer maintained since 1.3 so it should not be
> occurring any more. Please use a prefix of target-i386: (the
> directory name) to signal where you are changing code, i.e. x86 only.
> bugfix is not a very telling description of what a patch is doing.
> (Up to Paolo and Gleb whether they'll fix it or whether they require a
> resend.)

No, not this time at least. :)

Paolo

> Also please use git-send-email to submit patches and use PATCH v2
> etc. for submission as top-level patch:
> http://wiki.qemu.org/Contribute/SubmitAPatch
[uq/master PATCH] kvm: i386: fix LAPIC TSC deadline timer save/restore
The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:

- APIC LVT Timer register.
- TSC value.

Change the order to respect the dependency.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 376fc70..d04c6ae 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1044,6 +1044,26 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
     entry->data = value;
 }
 
+static int kvm_put_tscdeadline_msr(X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+    struct {
+        struct kvm_msrs info;
+        struct kvm_msr_entry entries[1];
+    } msr_data;
+    struct kvm_msr_entry *msrs = msr_data.entries;
+
+    if (!has_msr_tsc_deadline) {
+        return 0;
+    }
+
+    kvm_msr_entry_set(&msrs[0], MSR_IA32_TSCDEADLINE, env->tsc_deadline);
+
+    msr_data.info.nmsrs = 1;
+
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, &msr_data);
+}
+
 static int kvm_put_msrs(X86CPU *cpu, int level)
 {
     CPUX86State *env = &cpu->env;
@@ -1067,9 +1087,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
     if (has_msr_tsc_adjust) {
         kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
     }
-    if (has_msr_tsc_deadline) {
-        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSCDEADLINE, env->tsc_deadline);
-    }
     if (has_msr_misc_enable) {
         kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE,
                           env->msr_ia32_misc_enable);
@@ -1708,6 +1725,12 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
             return ret;
         }
     }
+
+    ret = kvm_put_tscdeadline_msr(x86_cpu);
+    if (ret < 0) {
+        return ret;
+    }
+
     ret = kvm_put_vcpu_events(x86_cpu, level);
     if (ret < 0) {
         return ret;
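The ordering dependency named in the commit message can be modelled with a toy local APIC. This is a sketch, not KVM's emulation: struct toy_lapic and the helper functions are hypothetical, and the model captures only one architectural rule, namely that a write to the TSC-deadline MSR is ignored unless the LVT Timer register selects TSC-deadline mode (mode field in bits 18:17, value 10b). The TSC-value half of the dependency is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define LVT_TIMER_MODE_MASK        (3u << 17)  /* LVT Timer bits 18:17 */
#define LVT_TIMER_MODE_TSCDEADLINE (2u << 17)  /* 10b = TSC-deadline mode */

struct toy_lapic {
    uint32_t lvt_timer;
    uint64_t tsc_deadline;
};

static void toy_write_lvt_timer(struct toy_lapic *apic, uint32_t value)
{
    apic->lvt_timer = value;
}

/* The deadline write only sticks when TSC-deadline mode is selected. */
static void toy_write_tscdeadline(struct toy_lapic *apic, uint64_t value)
{
    if ((apic->lvt_timer & LVT_TIMER_MODE_MASK) == LVT_TIMER_MODE_TSCDEADLINE) {
        apic->tsc_deadline = value;
    }
}

/* Restore order before the change: deadline MSR first, LVT Timer second. */
static uint64_t restore_deadline_first(uint32_t saved_lvt, uint64_t saved_deadline)
{
    struct toy_lapic apic = {0, 0};

    toy_write_tscdeadline(&apic, saved_deadline);  /* ignored: mode not set yet */
    toy_write_lvt_timer(&apic, saved_lvt);
    return apic.tsc_deadline;
}

/* Restore order after the change: LVT Timer first, deadline MSR second. */
static uint64_t restore_lapic_first(uint32_t saved_lvt, uint64_t saved_deadline)
{
    struct toy_lapic apic = {0, 0};

    toy_write_lvt_timer(&apic, saved_lvt);
    toy_write_tscdeadline(&apic, saved_deadline);
    return apic.tsc_deadline;
}
```

In this model, restoring the deadline before the LVT Timer register loses the armed deadline, which is the kind of save/restore failure that moving the MSR write later in kvm_arch_put_registers() is meant to avoid.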
Re: [PATCH] vfio-pci: PCI hot reset interface
On Wed, 2013-08-14 at 17:06 -0600, Alex Williamson wrote: On Wed, 2013-08-14 at 16:42 -0600, Bjorn Helgaas wrote: [+cc Al, linux-fsdevel for fdget/fdput usage] On Wed, Aug 14, 2013 at 2:10 PM, Alex Williamson alex.william...@redhat.com wrote: The current VFIO_DEVICE_RESET interface only maps to PCI use cases where we can isolate the reset to the individual PCI function. This means the device must support FLR (PCIe or AF), PM reset on D3hot-D0 transition, device specific reset, or be a singleton device on a bus for a secondary bus reset. FLR does not have widespread support, PM reset is not very reliable, and bus topology is dictated by the system and device design. We need to provide a means for a user to induce a bus reset in cases where the existing mechanisms are not available or not reliable. This device specific extension to VFIO provides the user with this ability. Two new ioctls are introduced: - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO - VFIO_DEVICE_PCI_HOT_RESET The first provides the user with information about the extent of devices affected by a hot reset. This is essentially a list of devices and the IOMMU groups they belong to. The user may then initiate a hot reset by calling the second ioctl. We must be careful that the user has ownership of all the affected devices found via the first ioctl, so the second ioctl takes a list of file descriptors for the VFIO groups affected by the reset. Each group must have IOMMU protection established for the ioctl to succeed. Signed-off-by: Alex Williamson alex.william...@redhat.com --- This patch is dependent on v5 pci: bus and slot reset interfaces as well as pci: Add probe functions for bus and slot reset. 
drivers/vfio/pci/vfio_pci.c | 272 +++ include/uapi/linux/vfio.h | 38 ++ 2 files changed, 309 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index cef6002..eb69bf3 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -227,6 +227,97 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type) return 0; } +static int vfio_pci_count_devs(struct pci_dev *pdev, void *data) +{ + (*(int *)data)++; + return 0; +} + +struct vfio_pci_fill_info { + int max; + int cur; + struct vfio_pci_dependent_device *devices; +}; + +static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data) +{ + struct vfio_pci_fill_info *info = data; + struct iommu_group *iommu_group; + + if (info-cur == info-max) + return -EAGAIN; /* Something changed, try again */ + + iommu_group = iommu_group_get(pdev-dev); + if (!iommu_group) + return -EPERM; /* Cannot reset non-isolated devices */ + + info-devices[info-cur].group_id = iommu_group_id(iommu_group); + info-devices[info-cur].segment = pci_domain_nr(pdev-bus); + info-devices[info-cur].bus = pdev-bus-number; + info-devices[info-cur].devfn = pdev-devfn; + info-cur++; + iommu_group_put(iommu_group); + return 0; +} + +struct vfio_pci_group { + struct vfio_group *group; + int id; +}; + +struct vfio_pci_group_info { + int count; + struct vfio_pci_group *groups; +}; + +static int vfio_pci_validate_devs(struct pci_dev *pdev, void *data) +{ + struct vfio_pci_group_info *info = data; + struct iommu_group *group; + int id, i; + + group = iommu_group_get(pdev-dev); + if (!group) + return -EPERM; + + id = iommu_group_id(group); + + for (i = 0; i info-count; i++) + if (info-groups[i].id == id) + break; + + iommu_group_put(group); + + return (i == info-count) ? 
-EINVAL : 0; +} + +static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev, +int (*fn)(struct pci_dev *, + void *data), void *data, +bool slot) +{ + struct pci_dev *tmp; + int ret = 0; + + list_for_each_entry(tmp, pdev-bus-devices, bus_list) { + if (slot tmp-slot != pdev-slot) + continue; + + ret = fn(tmp, data); + if (ret) + break; + + if (tmp-subordinate) { + ret = vfio_pci_for_each_slot_or_bus(tmp, fn, + data, false); + if (ret) +
Re: [uq/master PATCH] kvm: i386: fix LAPIC TSC deadline timer save/restore
On 19/08/2013 19:13, Marcelo Tosatti wrote:
> The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:
>
> - APIC LVT Timer register.
> - TSC value.
>
> Change the order to respect the dependency.

Do you have a testcase?

Paolo

> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 376fc70..d04c6ae 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -1044,6 +1044,26 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
>      entry->data = value;
>  }
>
> +static int kvm_put_tscdeadline_msr(X86CPU *cpu)
> +{
> +    CPUX86State *env = &cpu->env;
> +    struct {
> +        struct kvm_msrs info;
> +        struct kvm_msr_entry entries[1];
> +    } msr_data;
> +    struct kvm_msr_entry *msrs = msr_data.entries;
> +
> +    if (!has_msr_tsc_deadline) {
> +        return 0;
> +    }
> +
> +    kvm_msr_entry_set(&msrs[0], MSR_IA32_TSCDEADLINE, env->tsc_deadline);
> +
> +    msr_data.info.nmsrs = 1;
> +
> +    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, &msr_data);
> +}
> +
>  static int kvm_put_msrs(X86CPU *cpu, int level)
>  {
>      CPUX86State *env = &cpu->env;
> @@ -1067,9 +1087,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>      if (has_msr_tsc_adjust) {
>          kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
>      }
> -    if (has_msr_tsc_deadline) {
> -        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSCDEADLINE, env->tsc_deadline);
> -    }
>      if (has_msr_misc_enable) {
>          kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE,
>                            env->msr_ia32_misc_enable);
> @@ -1708,6 +1725,12 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>              return ret;
>          }
>      }
> +
> +    ret = kvm_put_tscdeadline_msr(x86_cpu);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
>      ret = kvm_put_vcpu_events(x86_cpu, level);
>      if (ret < 0) {
>          return ret;
Re: [uq/master PATCH] kvm: i386: fix LAPIC TSC deadline timer save/restore
On Mon, Aug 19, 2013 at 08:57:58PM +0200, Paolo Bonzini wrote:
> On 19/08/2013 19:13, Marcelo Tosatti wrote:
> > The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:
> >
> > - APIC LVT Timer register.
> > - TSC value.
> >
> > Change the order to respect the dependency.
>
> Do you have a testcase?
>
> Paolo

Autotest:

python ConfigTest.py --guestname=RHEL.7 --driveformat=virtio_scsi --nicmodel=e1000 --mem=2048 --vcpu=4 --testcase=timedrift..ntp.with_migration --nrepeat=10
Re: [PATCH] vfio-pci: PCI hot reset interface
On Mon, Aug 19, 2013 at 12:41 PM, Alex Williamson alex.william...@redhat.com wrote: On Wed, 2013-08-14 at 17:06 -0600, Alex Williamson wrote: On Wed, 2013-08-14 at 16:42 -0600, Bjorn Helgaas wrote: On Wed, Aug 14, 2013 at 2:10 PM, Alex Williamson alex.william...@redhat.com wrote: +static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev, +int (*fn)(struct pci_dev *, + void *data), void *data, +bool slot) +{ + struct pci_dev *tmp; + int ret = 0; + + list_for_each_entry(tmp, pdev-bus-devices, bus_list) { + if (slot tmp-slot != pdev-slot) + continue; + + ret = fn(tmp, data); + if (ret) + break; + + if (tmp-subordinate) { + ret = vfio_pci_for_each_slot_or_bus(tmp, fn, + data, false); + if (ret) + break; + } + } + + return ret; +} vfio_pci_for_each_slot_or_bus() isn't really vfio-specific, is it? It's not, I originally has callbacks split out as PCI patches but I was able to simplify some things in the code by customizing it to my usage, so I left it here. I mean, traversing the PCI hierarchy doesn't require vfio knowledge. I think this loop (walking the bus-devices list) skips devices on virtual buses that may be added for SR-IOV. I'm not sure that pci_walk_bus() handles that correctly either, but at least if you used that, we could fix the problem in one place. I didn't know about pci_walk_bus(), I'll look into switching to it. It looks like pci_walk_bus() is a poor replacement for when dealing with slots. There might be multiple slots on a bus or a mix of slots and non-slots, so for each device pci_walk_bus() finds on a subordinate bus I'd need to walk up the tree to find the parent bridge on the original bus to figure out if it's in the same slot. Do you really care about that scenario? PCIe only supports a single slot per bus, as far as I know. Should we have a pci_walk_slot() function? I guess. And supply the pci_slot rather than the pci_dev? 
I'm a little bit worried because the idea of a slot is not well-defined in the spec, and we have sort of an ad hoc method of discovering and managing them, e.g., acpiphp and pciehp might discover the same slot. But I guess that's no reason to bury generic code in vfio. Bjorn -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] vfio-pci: PCI hot reset interface
The current VFIO_DEVICE_RESET interface only maps to PCI use cases where we can isolate the reset to the individual PCI function. This means the device must support FLR (PCIe or AF), PM reset on D3hot-D0 transition, device specific reset, or be a singleton device on a bus for a secondary bus reset. FLR does not have widespread support, PM reset is not very reliable, and bus topology is dictated by the system and device design. We need to provide a means for a user to induce a bus reset in cases where the existing mechanisms are not available or not reliable. This device specific extension to VFIO provides the user with this ability. Two new ioctls are introduced: - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO - VFIO_DEVICE_PCI_HOT_RESET The first provides the user with information about the extent of devices affected by a hot reset. This is essentially a list of devices and the IOMMU groups they belong to. The user may then initiate a hot reset by calling the second ioctl. We must be careful that the user has ownership of all the affected devices found via the first ioctl, so the second ioctl takes a list of file descriptors for the VFIO groups affected by the reset. Each group must have IOMMU protection established for the ioctl to succeed. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v2: Use PCI bus iterators. 
Depends on pci_walk_slot() patch drivers/vfio/pci/vfio_pci.c | 279 +++ include/uapi/linux/vfio.h | 38 ++ 2 files changed, 316 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index cef6002..2c57482 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -227,6 +227,104 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type) return 0; } +struct vfio_pci_walk_info { + int ret; + void *data; +}; + +static int vfio_pci_count_devs(struct pci_dev *pdev, void *data) +{ + struct vfio_pci_walk_info *walk = data; + int *count = walk-data; + + (*count)++; + return walk-ret; +} + +struct vfio_pci_fill_info { + int max; + int cur; + struct vfio_pci_dependent_device *devices; +}; + +static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data) +{ + struct vfio_pci_walk_info *walk = data; + struct vfio_pci_fill_info *fill = walk-data; + struct iommu_group *iommu_group; + + if (fill-cur == fill-max) { + walk-ret = -EAGAIN; /* Something changed, try again */ + return walk-ret; + } + + iommu_group = iommu_group_get(pdev-dev); + if (!iommu_group) { + walk-ret = -EPERM; /* Cannot reset non-isolated devices */ + return walk-ret; + } + + fill-devices[fill-cur].group_id = iommu_group_id(iommu_group); + fill-devices[fill-cur].segment = pci_domain_nr(pdev-bus); + fill-devices[fill-cur].bus = pdev-bus-number; + fill-devices[fill-cur].devfn = pdev-devfn; + fill-cur++; + iommu_group_put(iommu_group); + return walk-ret; +} + +struct vfio_pci_group_entry { + struct vfio_group *group; + int id; +}; + +struct vfio_pci_group_info { + int count; + struct vfio_pci_group_entry *groups; +}; + +static int vfio_pci_validate_devs(struct pci_dev *pdev, void *data) +{ + struct vfio_pci_walk_info *walk = data; + struct vfio_pci_group_info *info = walk-data; + struct iommu_group *group; + int id, i; + + group = iommu_group_get(pdev-dev); + if (!group) { + walk-ret = -EPERM; + return walk-ret; + } + + id = 
iommu_group_id(group); + + for (i = 0; i info-count; i++) + if (info-groups[i].id == id) + break; + + iommu_group_put(group); + + if (i == info-count) + walk-ret = -EINVAL; + + return walk-ret; +} + +static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev, +int (*fn)(struct pci_dev *, + void *data), void *data, +bool slot) +{ + struct vfio_pci_walk_info info = { .ret = 0, .data = data }; + + if (slot) + pci_walk_slot(pdev-slot, fn, info); + else + pci_walk_bus(pdev-bus, fn, info); + + return info.ret; +} + static long vfio_pci_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { @@ -407,10 +505,189 @@ static long vfio_pci_ioctl(void *device_data, return ret; - } else if (cmd == VFIO_DEVICE_RESET) + } else if (cmd == VFIO_DEVICE_RESET) { return vdev-reset_works ? pci_reset_function(vdev-pdev) : -EINVAL; + } else if (cmd == VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) { + struct
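The ownership rule from the commit message (every device affected by the reset must belong to an IOMMU group the user has handed in) can be sketched outside the kernel as follows. This is a simplified illustration loosely modelled on vfio_pci_validate_devs() from the patch, not the kernel code itself: `toy_dev`, `toy_validate_dev()` and `toy_validate_all()` are hypothetical names, and a negative group id stands in for "no IOMMU group".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; group_id < 0 means "no IOMMU group". */
struct toy_dev {
    int group_id;
};

/* Check one device against the list of group ids the user claims to own. */
int toy_validate_dev(const struct toy_dev *dev, const int *owned, size_t n_owned)
{
    size_t i;

    if (dev->group_id < 0) {
        return -EPERM;          /* cannot reset non-isolated devices */
    }

    for (i = 0; i < n_owned; i++) {
        if (owned[i] == dev->group_id) {
            return 0;           /* the user owns this device's group */
        }
    }

    return -EINVAL;             /* affected device outside the user's groups */
}

/* Walk all affected devices and stop at the first failure. */
int toy_validate_all(const struct toy_dev *devs, size_t n_devs,
                     const int *owned, size_t n_owned)
{
    size_t i;

    for (i = 0; i < n_devs; i++) {
        int ret = toy_validate_dev(&devs[i], owned, n_owned);

        if (ret) {
            return ret;
        }
    }

    return 0;
}
```

For example, three affected devices in groups 7, 7 and 9 pass when the user supplies groups {7, 9} and fail with -EINVAL when only group 7 is supplied.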
Re: [PATCH] vfio-pci: PCI hot reset interface
On Mon, 2013-08-19 at 14:02 -0600, Bjorn Helgaas wrote: On Mon, Aug 19, 2013 at 12:41 PM, Alex Williamson alex.william...@redhat.com wrote: On Wed, 2013-08-14 at 17:06 -0600, Alex Williamson wrote: On Wed, 2013-08-14 at 16:42 -0600, Bjorn Helgaas wrote: On Wed, Aug 14, 2013 at 2:10 PM, Alex Williamson alex.william...@redhat.com wrote: +static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev, +int (*fn)(struct pci_dev *, + void *data), void *data, +bool slot) +{ + struct pci_dev *tmp; + int ret = 0; + + list_for_each_entry(tmp, pdev-bus-devices, bus_list) { + if (slot tmp-slot != pdev-slot) + continue; + + ret = fn(tmp, data); + if (ret) + break; + + if (tmp-subordinate) { + ret = vfio_pci_for_each_slot_or_bus(tmp, fn, + data, false); + if (ret) + break; + } + } + + return ret; +} vfio_pci_for_each_slot_or_bus() isn't really vfio-specific, is it? It's not, I originally has callbacks split out as PCI patches but I was able to simplify some things in the code by customizing it to my usage, so I left it here. I mean, traversing the PCI hierarchy doesn't require vfio knowledge. I think this loop (walking the bus-devices list) skips devices on virtual buses that may be added for SR-IOV. I'm not sure that pci_walk_bus() handles that correctly either, but at least if you used that, we could fix the problem in one place. I didn't know about pci_walk_bus(), I'll look into switching to it. It looks like pci_walk_bus() is a poor replacement for when dealing with slots. There might be multiple slots on a bus or a mix of slots and non-slots, so for each device pci_walk_bus() finds on a subordinate bus I'd need to walk up the tree to find the parent bridge on the original bus to figure out if it's in the same slot. Do you really care about that scenario? PCIe only supports a single slot per bus, as far as I know. I believe that's true for pciehp, but I can easily imagine that it's not the case for other hotplug controllers. 
I don't run into this scenario on any of my hardware, but I also don't want to embed any pciehp assumptions either. So I care for the sake of completeness, but I'm not targeting specific hardware that needs this. Should we have a pci_walk_slot() function? I guess. And supply the pci_slot rather than the pci_dev? I'm a little bit worried because the idea of a slot is not well-defined in the spec, and we have sort of an ad hoc method of discovering and managing them, e.g., acpiphp and pciehp might discover the same slot. But I guess that's no reason to bury generic code in vfio. I try to handle the slot as opaque, only caring that the slot pointer matches, so I think our implementation is ok... so long as we only get one driver claiming to manage a slot, but that's not a vfio problem ;) Thanks, Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
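The v2 patch in this thread carries the callback's error code in a small wrapper struct (vfio_pci_walk_info) and returns info.ret after the walk. A plausible reason is that a bus walker such as pci_walk_bus() returns void, so the callback's return value only stops the walk and is otherwise lost. The sketch below shows that pattern in plain C; `toy_walk()`, `toy_count_cb()` and `toy_for_each()` are hypothetical names, and plain integers stand in for devices.

```c
#include <errno.h>
#include <stddef.h>

/* Wrapper that lets a callback report an error through a void walker. */
struct toy_walk_info {
    int ret;
    void *data;
};

/*
 * Hypothetical walker modelled on a void-returning bus iterator: it stops
 * at the first nonzero callback result but does not hand that result back.
 */
void toy_walk(const int *devs, size_t n,
              int (*fn)(int dev, void *data), void *data)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (fn(devs[i], data)) {
            break;
        }
    }
}

/* Callback: count devices; a negative id is treated as an error. */
int toy_count_cb(int dev, void *data)
{
    struct toy_walk_info *walk = data;
    int *count = walk->data;

    if (dev < 0) {
        walk->ret = -EPERM;
        return walk->ret;
    }

    (*count)++;
    return walk->ret;
}

/* Wrap the user's data, run the walk, and recover the error code. */
int toy_for_each(const int *devs, size_t n,
                 int (*fn)(int dev, void *data), void *data)
{
    struct toy_walk_info info = { .ret = 0, .data = data };

    toy_walk(devs, n, fn, &info);
    return info.ret;
}
```

The design choice is that every callback receives the wrapper rather than its own data pointer, so callbacks must unwrap `walk->data` first, as vfio_pci_count_devs() does in the patch.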
Re: Multi Queue KVM Support
On 19/08/2013 13:29, Naor Shlomo wrote:
> Hello experts,
>
> I am trying to use the multi queue support on a Linux guest running Kernel 3.9.7.
> The host's virsh version command reports the following output:
>
> Compiled against library: libvirt 0.10.2
> Using library: libvirt 0.10.2
> Using API: QEMU 0.10.2
> Running hypervisor: QEMU 0.12.1

Is it RHEL or CentOS or Scientific Linux, or something else? If RHEL/CentOS, what release?

> The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns FALSE and I don't know why.

This version of QEMU is too old. It's possible that 6.5 will have multiqueue, but I'm not entirely sure.

Paolo
Re: [GIT PULL] KVM/ARM Fixes for 3.11
On 12/08/2013 06:12, Christoffer Dall wrote:
> The following changes since commit e769ece3b129698d2b09811a6f6d304e4eaa8c29:
>
>   KVM: s390: fix pfmf non-quiescing control handling (2013-07-29 09:02:30 +0200)
>
> are available in the git repository at:
>
>   git://git.linaro.org/people/cdall/linux-kvm-arm.git tags/kvm-arm-fixes-3.11
>
> for you to fetch changes up to 2184a60de26b94bc5a88de3e5a960ef9ff54ba5a:
>
>   KVM: ARM: Squash len warning (2013-08-11 21:03:39 -0700)
>
> Christoffer Dall (3):
>       ARM: KVM: Fix 64-bit coprocessor handling
>       ARM: KVM: Fix unaligned unmap_range leak
>       KVM: ARM: Squash len warning
>
> Marc Zyngier (1):
>       arm64: KVM: fix 2-level page tables unmapping
>
>  arch/arm/kvm/coproc.c     | 26 +++---
>  arch/arm/kvm/coproc.h     |  3 +++
>  arch/arm/kvm/coproc_a15.c |  6 +-
>  arch/arm/kvm/mmio.c       |  3 ++-
>  arch/arm/kvm/mmu.c        | 36 +++-
>  5 files changed, 44 insertions(+), 30 deletions(-)

Thanks, pulled and sent to Linus.

Paolo
[GIT PULL] KVM fixes for 3.11-rc7
Linus,

the following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f:

  Linux 3.11-rc4 (2013-08-04 13:46:46 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

for you to fetch changes up to c566ccfcb30e236636085317a05cb3e8808e7f4a:

  Merge tag 'kvm-arm-fixes-3.11' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-master (2013-08-12 09:44:16 +0200)

This pull request is coming a bit later than I would have preferred, because Gleb and I happened to have holidays around the same weeks of August... sorry about that.

Paolo

Fixes for ARM and aarch64.

Chen Gang (1):
      arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h.

Christoffer Dall (3):
      ARM: KVM: Fix 64-bit coprocessor handling
      ARM: KVM: Fix unaligned unmap_range leak
      KVM: ARM: Squash len warning

Marc Zyngier (3):
      arm64: KVM: fix 2-level page tables unmapping
      arm64: KVM: perform save/restore of PAR_EL1
      arm64: KVM: add missing dsb before invalidating Stage-2 TLBs

Paolo Bonzini (2):
      Merge branch 'kvm-arm64/fixes-3.11-rc4' of git://git.kernel.org/.../maz/arm-platforms into kvm-master
      Merge tag 'kvm-arm-fixes-3.11' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-master

 arch/arm/kvm/coproc.c             | 26 +++---
 arch/arm/kvm/coproc.h             |  3 +++
 arch/arm/kvm/coproc_a15.c         |  6 +-
 arch/arm/kvm/mmio.c               |  3 ++-
 arch/arm/kvm/mmu.c                | 36 +++-
 arch/arm64/include/asm/kvm_asm.h  | 17 ++---
 arch/arm64/include/asm/kvm_host.h |  2 +-
 arch/arm64/kvm/hyp.S              | 13 +
 arch/arm64/kvm/sys_regs.c         |  3 +++
 9 files changed, 71 insertions(+), 38 deletions(-)
Re: [PULL] KVM/arm64 fixes for 3.11
On 09/08/2013 15:13, Marc Zyngier wrote:
> Paolo, Gleb,
>
> Please consider pulling the following to get new fixes for KVM/arm64.
>
> Thanks,
>
> M.
>
> The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f:
>
>   Linux 3.11-rc4 (2013-08-04 13:46:46 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/fixes-3.11-rc4
>
> for you to fetch changes up to 6c8c0c4dc0e98ee2191211d66e9f876e95787073:
>
>   arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h. (2013-08-09 13:42:43 +0100)
>
> Chen Gang (1):
>       arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h.
>
> Marc Zyngier (2):
>       arm64: KVM: perform save/restore of PAR_EL1
>       arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
>
>  arch/arm64/include/asm/kvm_asm.h  | 17 ++---
>  arch/arm64/include/asm/kvm_host.h |  2 +-
>  arch/arm64/kvm/hyp.S              | 13 +
>  arch/arm64/kvm/sys_regs.c         |  3 +++
>  4 files changed, 27 insertions(+), 8 deletions(-)

Thanks, pulled and sent to Linus.

Paolo
Re: [PATCH] vfio-pci: PCI hot reset interface
On Mon, 2013-08-19 at 14:02 -0600, Bjorn Helgaas wrote:
> I guess. And supply the pci_slot rather than the pci_dev? I'm a little bit worried because the idea of a slot is not well-defined in the spec, and we have sort of an ad hoc method of discovering and managing them, e.g., acpiphp and pciehp might discover the same slot. But I guess that's no reason to bury generic code in vfio.

And I don't have pci_slot's at all yet on powerpc powernv (the host platform for KVM) since at this stage we don't support physical hotplug on the target machines...

Alex, why are you specifically looking for slots here? I don't quite understand. It makes sense to be able to reset individual devices whether they are on the motherboard, behind an extension chassis or directly on slots...

Cheers,
Ben.
Re: [PATCH] vfio-pci: PCI hot reset interface
On Mon, 2013-08-19 at 14:20 -0600, Alex Williamson wrote:
> I try to handle the slot as opaque, only caring that the slot pointer matches, so I think our implementation is ok... so long as we only get one driver claiming to manage a slot, but that's not a vfio problem ;) Thanks,

But why bother with slots? Why do you even think about slots in that context? Slots are a badly defined thing in our current PCI stack, pretty much intertwined with hotplug. I don't see why the reset semantics would be tied to slots at all.

The only case where it *might* make some sense (and even then ...) is if you want to start exposing slot power control and PERST, but that would imply a pile of platform specific gunk anyway.

Ben.
Re: [PATCH] vfio-pci: PCI hot reset interface
On Tue, 2013-08-20 at 08:42 +1000, Benjamin Herrenschmidt wrote: On Mon, 2013-08-19 at 14:02 -0600, Bjorn Helgaas wrote: I guess. And supply the pci_slot rather than the pci_dev? I'm a little bit worried because the idea of a slot is not well-defined in the spec, and we have sort of an ad hoc method of discovering and managing them, e.g., acpiphp and pciehp might discover the same slot. But I guess that's no reason to bury generic code in vfio. And I don't have pci_slot's at all yet on powerpc powernv (the host platform for KVM) since at this stage we don't support physical hotplug on the target machines... Alex, why specifically looking for slots here ? I don't quite understand. It makes sense to be able to reset individual devices whether they are on the otherboard, behind extension chassis or directly on slots... a) resetting a slot may have a smaller footprint than resetting a bus, b) hotplug controllers sometimes need to be involved in a bus reset. For b) I have a specific example where my Lenovo S20 workstation has an onboard tg3 NIC attached to a root port supporting pciehp (go figure since the tg3 is soldered onto the motherboard) and doing a secondary bus reset at the root port triggers a presence detection change and therefore tries to do a surprise removal. By doing a slot reset, I have the hotplug controller code manage the bus reset by disabling presence detection around the bus reset. If you don't have slots and you don't need anything special around a secondary bus reset, you're fine. It's just an opportunity to provide a hook for the hotplug controller to participate. Thanks, Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vfio-pci: PCI hot reset interface
On Tue, 2013-08-20 at 08:44 +1000, Benjamin Herrenschmidt wrote: On Mon, 2013-08-19 at 14:20 -0600, Alex Williamson wrote: I try to handle the slot as opaque, only caring that the slot pointer matches, so I think our implementation is ok... so long as we only get one driver claiming to manage a slot, but that's not a vfio problem ;) Thanks, By why bother with slots ? Why do you even think about slots in that context ? slots are a badly defined thing in our current PCI stack, pretty much intricated with hotplug. I don't see why the reset semantics would be tied to slots at all. See my other reply, hotplug presence detection and secondary bus resets don't just work. The only case where it *might* make some sense (and even then ...) is if you want to start exposing slot power control and PERST but that would imply a pile of platform specific gunk anyway. But that platform specific gunk can be hidden away in the hotplug controller. We just need to be able to ask if it can reset a slot and tell it to do it. If it happens via a slot power control or a secondary bus reset, do we care? Thanks, Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vfio-pci: PCI hot reset interface
On Mon, 2013-08-19 at 16:59 -0600, Alex Williamson wrote: On Tue, 2013-08-20 at 08:42 +1000, Benjamin Herrenschmidt wrote: On Mon, 2013-08-19 at 14:02 -0600, Bjorn Helgaas wrote: I guess. And supply the pci_slot rather than the pci_dev? I'm a little bit worried because the idea of a slot is not well-defined in the spec, and we have sort of an ad hoc method of discovering and managing them, e.g., acpiphp and pciehp might discover the same slot. But I guess that's no reason to bury generic code in vfio. And I don't have pci_slot's at all yet on powerpc powernv (the host platform for KVM) since at this stage we don't support physical hotplug on the target machines... Alex, why specifically looking for slots here ? I don't quite understand. It makes sense to be able to reset individual devices whether they are on the otherboard, behind extension chassis or directly on slots... a) resetting a slot may have a smaller footprint than resetting a bus, *May* ... at least on PCIe there is no difference. I suppose PCI pre-E slots might have individual reset controls though the way to get them is fairly platform specific. b) hotplug controllers sometimes need to be involved in a bus reset. For b) I have a specific example where my Lenovo S20 workstation has an onboard tg3 NIC attached to a root port supporting pciehp (go figure since the tg3 is soldered onto the motherboard) and doing a secondary bus reset at the root port triggers a presence detection change and therefore tries to do a surprise removal. By doing a slot reset, I have the hotplug controller code manage the bus reset by disabling presence detection around the bus reset. If you don't have slots and you don't need anything special around a secondary bus reset, you're fine. It's just an opportunity to provide a hook for the hotplug controller to participate. Thanks, Yuck, junk HW again ... oh well, I suppose that's never going to end... 
As long as the code works without the slots I'm fine :-) As I mentioned, we might have to do a whole different infrastructure for EEH anyway (which sucks but we have little choice in the matter). Cheers, Ben. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Emulation failure
On Mon, 19 Aug 2013 at 11:27 +0200, Paolo Bonzini wrote:
> > The disassembled code is
> >
> > 0x1dd10: push   %rbx
> > 0x1dd11: mov    $0x6e,%eax
> > 0x1dd16: mov    %rdi,%rbx
> > 0x1dd19: sub    $0x20,%rsp
> > 0x1dd1d: test   %rdi,%rdi
> > 0x1dd20: je     0xb1dd92
> > 0x1dd22: mov    0x4bf1e0(%rip),%eax
> > 0x1dd28: cmp    $0x,%eax
> > 0x1dd2b: je     0xb1ddd0
> > 0x1dd31: test   %eax,%eax
> > 0x1dd33: jne    0xb1dd92
> > 0x1dd35: mov    0xe1f55c(%rip),%rax
> > 0x1dd3c: cmpq   $0x0,0xf0(%rax)
> > 0x1dd44: fildll 0xf0(%rax)
> > 0x1dd4a: js     0xb1ddf0
> > 0x1dd50: mov    0xe1f54a(%rip),%eax
> > 0x1dd56: mov    %rax,-0x80(%rsp)
> > 0x1dd5b: fildll -0x80(%rsp)
> > 0x1dd5f: fmulp  %st,%st(1)
> >
> > Not sure if it helps but rax after 0xb1dd35 contains the pointer to mmap'd memory of /dev/hpet
>
> I think this wouldn't work even with the latest kernel. Emulation of x87 instructions is not supported yet.

I'm confused. How could this program work? It produces a similar assembly listing:

-- 8< --
#include <stdio.h>
#include <stdint.h>

uint64_t s_rtcClockPeriod = 10;
uint64_t mc = 30;

int main(int ac, char **av)
{
    uint64_t value = (uint64_t)((long double)mc * (long double)s_rtcClockPeriod / 10.0L);
    printf("%lu\n", value);
    return 0;
}
-- 8< --

and the assembly I got is

-- 8< --
sub    $0x18,%rsp
cmpq   $0x0,0x200adc(%rip)
fildll 0x200ad6(%rip)
js     0x4005f8 <main+184>
cmpq   $0x0,0x200ac0(%rip)
fildll 0x200aba(%rip)
js     0x400612 <main+210>
fmulp  %st,%st(1)
fdivs  0x1ac(%rip)
flds   0x1aa(%rip)
fxch   %st(1)
fucomi %st(1),%st
jae    0x4005c0 <main+128>
fstp   %st(1)
fnstcw 0x16(%rsp)
...
-- 8< --
Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used()
On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
> > Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to vhost_add_used_and_signal(). With the patch we will call it at most 2 times (consider done_idx wrap around) compared to N times w/o this patch.
> >
> > Signed-off-by: Jason Wang <jasow...@redhat.com>
>
> So? Does this help performance then?

Looks like it can, especially when the guest supports event index. When the guest enables the tx interrupt, this can save us some unnecessary signals to the guest. I will do some tests.
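The "at most 2 times" bound follows from how a contiguous run of entries sits in a circular buffer: it either fits before the end of the ring or wraps once. The sketch below shows that split in isolation. It is not the vhost code: `TOY_RING_SIZE`, `toy_chunk` and `toy_split()` are hypothetical, and the ring size is an arbitrary small number chosen for illustration.

```c
#include <stddef.h>

#define TOY_RING_SIZE 8   /* arbitrary illustrative size, not a vhost constant */

struct toy_chunk {
    size_t start;   /* first ring index of the chunk */
    size_t len;     /* number of contiguous entries */
};

/*
 * Split `count` completed entries beginning at ring index `start` into
 * contiguous chunks. Returns the number of chunks written to `out`:
 * 0 for an empty run, 1 if the run does not wrap, 2 if it wraps.
 * Assumes start < TOY_RING_SIZE and count <= TOY_RING_SIZE.
 */
size_t toy_split(size_t start, size_t count, struct toy_chunk out[2])
{
    size_t first;

    if (count == 0) {
        return 0;
    }

    first = TOY_RING_SIZE - start;      /* entries left before the wrap */
    if (count <= first) {
        out[0].start = start;
        out[0].len = count;
        return 1;
    }

    out[0].start = start;
    out[0].len = first;
    out[1].start = 0;                   /* remainder continues at index 0 */
    out[1].len = count - first;
    return 2;
}
```

Each chunk would then correspond to one batched call, instead of one call per entry.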
Re: [PATCH 3/6] vhost: switch to use vhost_add_used_n()
On 08/16/2013 05:56 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:27PM +0800, Jason Wang wrote:
> > Let vhost_add_used() use vhost_add_used_n() to reduce the code duplication.
> >
> > Signed-off-by: Jason Wang <jasow...@redhat.com>
>
> Does compiler inline it then?
> Reason I ask, last time I checked put_user inside vhost_add_used was much cheaper than copy_to_user inside vhost_add_used_n, so I wouldn't be surprised if this hurt performance. Did you check?

I ran virtio_test but didn't see a difference.

Did you mean the might_fault() in __copy_to_user()? So how about switching to __put_user() if count is one in __vhost_add_used_n()?
Re: [PATCH 6/6] vhost_net: remove the max pending check
On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>> We used to limit the max pending DMAs to prevent the guest from
>> pinning too many pages. But this could be removed since:
>>
>> - We have the sk_wmem_alloc check in both tun/macvtap to do the same
>>   work
>> - This max pending check was almost useless since it was only done
>>   when there are no new buffers coming from the guest. The guest can
>>   easily exceed the limitation.
>> - We already check upend_idx != done_idx and switch to non-zerocopy
>>   then. So even if all vq->heads were used, we can still do the
>>   packet transmission.
>
> We can but performance will suffer.

The check was in fact only done when no new buffers were submitted by
the guest. So if the guest keeps sending, the check won't be done.

If we really want to do this, we should do it unconditionally. Anyway,
I will run a test to see the result.

>> So remove this check completely.
>>
>> Signed-off-by: Jason Wang <jasow...@redhat.com>
>> ---
>>  drivers/vhost/net.c | 13 -------------
>>  1 files changed, 0 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index a035a89..ed3f165 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -38,8 +38,6 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>>   * Using this limit prevents one virtqueue from starving others. */
>>  #define VHOST_NET_WEIGHT 0x8
>>
>> -/* MAX number of TX used buffers for outstanding zerocopy */
>> -#define VHOST_MAX_PEND 128
>>  #define VHOST_GOODCOPY_LEN 256
>>
>>  /*
>> @@ -372,17 +370,6 @@ static void handle_tx(struct vhost_net *net)
>>  			break;
>>  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
>>  		if (head == vq->num) {
>> -			int num_pends;
>> -
>> -			/* If more outstanding DMAs, queue the work.
>> -			 * Handle upend_idx wrap around
>> -			 */
>> -			num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
>> -				    (nvq->upend_idx - nvq->done_idx) :
>> -				    (nvq->upend_idx + UIO_MAXIOV -
>> -				     nvq->done_idx);
>> -			if (unlikely(num_pends > VHOST_MAX_PEND))
>> -				break;
>>  			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
>>  				vhost_disable_notify(&net->dev, vq);
>>  				continue;
>> --
>> 1.7.1
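For readers following the removed hunk, the wrap-around arithmetic behind num_pends can be tried out in isolation. The sketch below is a userspace illustration, not the kernel code: `num_pending` and `RING_SIZE` are names made up here, with `RING_SIZE` standing in for UIO_MAXIOV (1024 in the Linux headers).

```c
/* Stands in for UIO_MAXIOV so the sketch builds outside the kernel. */
#define RING_SIZE 1024

/*
 * Number of outstanding zerocopy buffers between done_idx and upend_idx,
 * where both indices wrap around at RING_SIZE.
 */
static int num_pending(int upend_idx, int done_idx)
{
    return upend_idx >= done_idx
        ? upend_idx - done_idx
        : upend_idx + RING_SIZE - done_idx;
}
```

For example, with upend_idx = 3 and done_idx = 1020 the producer has wrapped, and 3 + 1024 - 1020 = 7 buffers are still pending.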
Re: [PATCH] vfio-pci: PCI hot reset interface
On Wed, Aug 14, 2013 at 04:42:14PM -0600, Bjorn Helgaas wrote:
> [+cc Al, linux-fsdevel for fdget/fdput usage]

fdget/fdput use looks sane, the only thing is that I would rather
have an explicit include of linux/file.h instead of relying upon
linux/eventfd.h pulling it.

Incidentally, there are only 5 files that include the latter without
an explicit include of the former - drivers/vfio/pci/vfio_pci.c,
drivers/vhost/scsi.c, kernel/cgroup.c, mm/memcontrol.c and
mm/vmpressure.c. And only kernel/cgroup.c (and, with this patch,
vfio_pci.c) really wants anything from linux/file.h, so I'd rather
kill that indirect include in eventfd.h and slap an explicit
include of file.h into these two files...

BTW, most of the eventfd_fget() users might as well be using fget()
(or fdget(), for that matter). They tend to be immediately followed
by eventfd_ctx_fileget(), which repeats the "is that an eventfd file?"
check anyway.

Completely untested patch below does that to kernel/cgroup.c; Tejun,
Davide - do you have any objections against the following?

Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c

kernel/cgroup.c is the only place in the tree that relies on eventfd.h
pulling file.h; move that include there. Switch from
eventfd_fget()/fput() to fdget()/fdput(), while we are at it -
eventfd_ctx_fileget() will fail on non-eventfd descriptors just fine,
no need to do that check twice...

Signed-off-by: Al Viro <v...@zeniv.linux.org.uk>
---
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index cf5d2af..ff0b981 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -9,7 +9,6 @@
 #define _LINUX_EVENTFD_H
 
 #include <linux/fcntl.h>
-#include <linux/file.h>
 #include <linux/wait.h>
 
 /*
@@ -26,6 +25,8 @@
 #define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK)
 #define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE)
 
+struct file;
+
 #ifdef CONFIG_EVENTFD
 
 struct file *eventfd_file_create(unsigned int count, int flags);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 781845a..f88ecaf 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -60,6 +60,7 @@
 #include <linux/poll.h>
 #include <linux/flex_array.h> /* used in cgroup_attach_task */
 #include <linux/kthread.h>
+#include <linux/file.h>
 
 #include <linux/atomic.h>
 
@@ -3969,8 +3970,8 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	struct cgroup_event *event = NULL;
 	struct cgroup *cgrp_cfile;
 	unsigned int efd, cfd;
-	struct file *efile = NULL;
-	struct file *cfile = NULL;
+	struct fd efile;
+	struct fd cfile;
 	char *endp;
 	int ret;
 
@@ -3993,31 +3994,31 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	init_waitqueue_func_entry(&event->wait, cgroup_event_wake);
 	INIT_WORK(&event->remove, cgroup_event_remove);
 
-	efile = eventfd_fget(efd);
-	if (IS_ERR(efile)) {
-		ret = PTR_ERR(efile);
-		goto fail;
+	efile = fdget(efd);
+	if (!efile.file) {
+		ret = -EBADF;
+		goto fail1;
 	}
 
-	event->eventfd = eventfd_ctx_fileget(efile);
+	event->eventfd = eventfd_ctx_fileget(efile.file);
 	if (IS_ERR(event->eventfd)) {
 		ret = PTR_ERR(event->eventfd);
-		goto fail;
+		goto fail2;
 	}
 
-	cfile = fget(cfd);
-	if (!cfile) {
+	cfile = fdget(cfd);
+	if (!cfile.file) {
 		ret = -EBADF;
-		goto fail;
+		goto fail3;
 	}
 
 	/* the process need read permission on control file */
 	/* AV: shouldn't we check that it's been opened for read instead? */
-	ret = inode_permission(file_inode(cfile), MAY_READ);
+	ret = inode_permission(file_inode(cfile.file), MAY_READ);
 	if (ret < 0)
 		goto fail;
 
-	event->cft = __file_cft(cfile);
+	event->cft = __file_cft(cfile.file);
 	if (IS_ERR(event->cft)) {
 		ret = PTR_ERR(event->cft);
 		goto fail;
@@ -4027,7 +4028,7 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	 * The file to be monitored must be in the same cgroup as
 	 * cgroup.event_control is.
 	 */
-	cgrp_cfile = __d_cgrp(cfile->f_dentry->d_parent);
+	cgrp_cfile = __d_cgrp(cfile.file->f_dentry->d_parent);
 	if (cgrp_cfile != cgrp) {
 		ret = -EINVAL;
 		goto fail;
@@ -4043,7 +4044,7 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	if (ret)
 		goto fail;
 
-	efile->f_op->poll(efile, &event->pt);
+	efile.file->f_op->poll(efile.file, &event->pt);
 
 	/*
 	 * Events should be removed after rmdir of cgroup directory, but before
@@ -4056,21 +4057,18 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	list_add(&event->list,
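The patch replaces a single `fail` label with `fail1`/`fail2`/`fail3`, which suggests staged unwinding where each label releases only what had been acquired before the failing step. The handlers themselves are cut off in the text above, so the following is a generic userspace sketch of that idiom under that assumption, not a reconstruction of the cgroup code; `acquire`, `release`, `setup` and the `held` bitmask are hypothetical.

```c
/* Bitmask of the steps whose resource is currently held. */
static int held;

/* Pretend to acquire the resource for `step`; fail if step == fail_at. */
static int acquire(int step, int fail_at)
{
    if (step == fail_at)
        return -1;
    held |= 1 << step;
    return 0;
}

static void release(int step)
{
    held &= ~(1 << step);
}

/*
 * Acquire three resources in order. On failure, jump to the label that
 * undoes exactly the steps already completed. Returns 0 on success or
 * the negated number of the step that failed.
 */
static int setup(int fail_at)
{
    int ret;

    if (acquire(1, fail_at) < 0) {
        ret = -1;
        goto fail1;
    }
    if (acquire(2, fail_at) < 0) {
        ret = -2;
        goto fail2;
    }
    if (acquire(3, fail_at) < 0) {
        ret = -3;
        goto fail3;
    }

    /* Success: this sketch simply releases everything again. */
    release(3);
    release(2);
    release(1);
    return 0;

fail3:
    release(2);
fail2:
    release(1);
fail1:
    return ret;
}
```

The labels fall through on purpose, so a later failure undoes more steps than an earlier one.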
RE: Multi Queue KVM Support
Hi Paolo,

The host is running CentOS release 6.3 (Final).
I ran yum upgrade libvirt and yum upgrade qemu-kvm a couple of days ago and ended up with these versions.

What do you suggest regarding qemu? Compile 6.5 or later myself?

I appreciate your help,
Naor

-----Original Message-----
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
Sent: Monday, August 19, 2013 11:22 PM
To: Naor Shlomo
Cc: kvm@vger.kernel.org
Subject: Re: Multi Queue KVM Support

On 19/08/2013 13:29, Naor Shlomo wrote:
> Hello experts,
>
> I am trying to use the multi queue support on a Linux guest running
> Kernel 3.9.7.
>
> The host's virsh version command reports the following output:
>
> Compiled against library: libvirt 0.10.2
> Using library: libvirt 0.10.2
> Using API: QEMU 0.10.2
> Running hypervisor: QEMU 0.12.1

Is it RHEL or CentOS or Scientific Linux, or something else? If RHEL/CentOS, what release?

> The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns
> FALSE and I don't know why.

This version of QEMU is too old. It's possible that 6.5 will have multiqueue, but I'm not entirely sure.

Paolo
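As background for why the check returns FALSE: a virtio feature is a single bit in a feature word, and the guest only sees the bit if the host device model offered it. The sketch below illustrates that test in plain C. `has_feature` is a hypothetical helper, not the kernel's virtio_has_feature(); the bit number 22 for VIRTIO_NET_F_MQ is taken from the virtio-net feature list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number for multiqueue in virtio-net. */
#define VIRTIO_NET_F_MQ 22

/* Return true if feature bit `bit` is set in the `features` word. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return (features >> bit) & 1;
}
```

A host QEMU that predates multiqueue never sets bit 22 in the features it offers, so a check like this stays false no matter how new the guest kernel is.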
[PATCH] kvm tools: powerpc: Fix init order for xics
xics_init() assumes kvm->nrcpus is already set up. kvm->nrcpus is set
up in kvm_cpu_init().

Unfortunately xics_init() and kvm_cpu_init() both use base_init(). So
depending on the order randomly determined by the compiler, xics_init()
may see kvm->nrcpus as 0 and not set up any of the icp VCPU pointers.
This manifests itself later in boot when trying to raise an IRQ,
resulting in a null pointer dereference/segv.

This moves xics_init() to use dev_base_init() to ensure it happens
after kvm_cpu_init().

Signed-off-by: Michael Neuling <mi...@neuling.org>

diff --git a/tools/kvm/powerpc/xics.c b/tools/kvm/powerpc/xics.c
index cf64a08..c1ef35b 100644
--- a/tools/kvm/powerpc/xics.c
+++ b/tools/kvm/powerpc/xics.c
@@ -505,7 +505,7 @@ static int xics_init(struct kvm *kvm)
 
 	return 0;
 }
-base_init(xics_init);
+dev_base_init(xics_init);
 
 
 void kvm__irq_line(struct kvm *kvm, int irq, int level)
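The ordering problem described in this patch can be reproduced with a small userspace model of levelled init calls. Everything in it is hypothetical (`init_entry`, `run_inits`, the two-level enum and the stand-in functions); it only assumes that init calls within one level run in an unspecified order while a later level always runs after an earlier one.

```c
#include <stddef.h>

enum init_level { LEVEL_BASE, LEVEL_DEV_BASE, LEVEL_COUNT };

struct init_entry {
    enum init_level level;
    int (*fn)(void);
};

static int nrcpus;         /* stands in for kvm->nrcpus */
static int xics_seen_cpus; /* what the xics stand-in observed */

static int cpu_init(void)
{
    nrcpus = 4;
    return 0;
}

static int xics_init(void)
{
    xics_seen_cpus = nrcpus;
    return 0;
}

/*
 * Run all entries level by level. Within a level the table order is
 * used, which stands in for "whatever order the toolchain produced".
 */
static int run_inits(const struct init_entry *tab, size_t n)
{
    for (int level = 0; level < LEVEL_COUNT; level++)
        for (size_t i = 0; i < n; i++)
            if (tab[i].level == (enum init_level)level && tab[i].fn() < 0)
                return -1;
    return 0;
}
```

With both functions at LEVEL_BASE and the xics stand-in listed first, it observes zero CPUs; moving it to LEVEL_DEV_BASE makes the result independent of table order.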
RE: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
Andreas Färber wrote:
> On 19.08.2013 16:31, Liu, Jinsong wrote:
>> Paolo Bonzini wrote:
>>> The patch looks good. Please repost it with checkpatch.pl failures
>>> fixed.
>>>
>>> Paolo
>>
>> Thanks Stefan and Paolo!
>> Updated patch attached.
>>
>> Regards,
>> Jinsong
>>
>> ===
>>
>> From a0ddf948d40e42de862543157a5668a1c12faae6 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong <jinsong@intel.com>
>> Date: Mon, 19 Aug 2013 09:33:30 +0800
>> Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
>>
>> This patch is to fix the bug
>> https://bugs.launchpad.net/qemu-kvm/+bug/1207623
>>
>> IA32_FEATURE_CONTROL is pointless if the VMX or SMX bits are not
>> exposed in cpuid.1.ecx of the vcpu. Current qemu-kvm returns an error
>> from kvm_put_msrs or kvm_get_msrs.
>>
>> Signed-off-by: Liu Jinsong <jinsong@intel.com>
>
> Jinsong, if this is for upstream QEMU, then the commit message needs
> some small improvements: qemu-kvm is no longer maintained since 1.3 so
> it should not be occurring any more.

Thanks Andreas! This patch is for qemu-kvm.
As I understand it, some patches are first checked into the qemu-kvm uq/master branch.
This patch is to fix c/s 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the qemu-kvm uq/master branch (which is meant to work with kvm IA32_FEATURE_CONTROL, and is currently not yet in upstream qemu).

This patch fixes the bug introduced by 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the qemu-kvm uq/master branch. The bug is reported as
https://bugs.launchpad.net/qemu-kvm/+bug/1207623
https://bugs.launchpad.net/qemu/+bug/1213797

Am I misunderstanding anything about upstream qemu versus qemu-kvm?

> Please use a prefix of "target-i386: " (the directory name) to signal
> where you are changing code, i.e. x86 only. "bugfix" is not a very
> telling description of what a patch is doing. (Up to Paolo and Gleb
> whether they'll fix it or whether they require a resend.)
>
> Also please use git-send-email to submit patches and use "PATCH v2"
> etc. for submission as top-level patch:
> http://wiki.qemu.org/Contribute/SubmitAPatch

Thanks, will update per your comments :)

> One question inline...
>
>> ---
>>  target-i386/kvm.c | 17 +++++++++++++++--
>>  1 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 84ac00a..5adeb03 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -65,6 +65,7 @@ static bool has_msr_star;
>>  static bool has_msr_hsave_pa;
>>  static bool has_msr_tsc_adjust;
>>  static bool has_msr_tsc_deadline;
>> +static bool has_msr_feature_control;
>>  static bool has_msr_async_pf_en;
>>  static bool has_msr_pv_eoi_en;
>>  static bool has_msr_misc_enable;
>> @@ -644,6 +645,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>
>>      qemu_add_vm_change_state_handler(cpu_update_state, env);
>>
>> +    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
>> +    if (c) {
>> +        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
>> +                                  !!(c->ecx & CPUID_EXT_SMX);
>> +    }
>> +
>>      cpuid_data.cpuid.padding = 0;
>>      r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
>>      if (r) {
>> @@ -1121,7 +1128,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>          if (hyperv_vapic_recommended()) {
>>              kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
>>          }
>> -        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
>> +        if (has_msr_feature_control) {
>> +            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
>> +                              env->msr_ia32_feature_control);
>> +        }
>>      }
>>      if (env->mcg_cap) {
>>          int i;
>> @@ -1346,7 +1356,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>>      if (has_msr_misc_enable) {
>>          msrs[n++].index = MSR_IA32_MISC_ENABLE;
>>      }
>> -    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
>> +    if (has_msr_feature_control) {
>> +        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
>> +    }
>>
>>      if (!env->tsc_valid) {
>>          msrs[n++].index = MSR_IA32_TSC;
>> @@ -1447,6 +1459,7 @@ static int kvm_get_msrs(X86CPU *cpu)
>>              break;
>>          case MSR_IA32_FEATURE_CONTROL:
>>              env->msr_ia32_feature_control = msrs[i].data;
>> +            break;
>
> Was the fallthrough previously intended? Or is this a second,
> unmentioned bugfix?

Hmm, it just adds the 'break' that I think patch 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b forgot.

Thanks,
Jinsong

> Regards,
> Andreas
>
>>          default:
>>              if (msrs[i].index >= MSR_MC0_CTL &&
>>                  msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
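The gating logic in the patch can be exercised on its own: the MSR is only added to the request list when CPUID.1:ECX advertises VMX (bit 5) or SMX (bit 6). The sketch below is a standalone illustration, not QEMU code. `wants_feature_control` and `build_msr_list` are hypothetical helpers; the bit positions and MSR numbers (0x3a for IA32_FEATURE_CONTROL, 0x1a0 for IA32_MISC_ENABLE) follow the Intel SDM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_EXT_VMX            (1U << 5) /* CPUID.1:ECX bit 5 */
#define CPUID_EXT_SMX            (1U << 6) /* CPUID.1:ECX bit 6 */
#define MSR_IA32_FEATURE_CONTROL 0x3a
#define MSR_IA32_MISC_ENABLE     0x1a0

/* The MSR is only meaningful if the vcpu exposes VMX or SMX. */
static bool wants_feature_control(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & (CPUID_EXT_VMX | CPUID_EXT_SMX)) != 0;
}

/*
 * Fill `out` with the MSR indices to request and return how many were
 * written. `out` must have room for at least two entries.
 */
static size_t build_msr_list(uint32_t cpuid_1_ecx, uint32_t *out)
{
    size_t n = 0;

    out[n++] = MSR_IA32_MISC_ENABLE;
    if (wants_feature_control(cpuid_1_ecx))
        out[n++] = MSR_IA32_FEATURE_CONTROL;
    return n;
}
```

Skipping the entry when neither bit is set avoids asking KVM for an MSR it may refuse, which is the failure the bug report describes.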
Re: [PATCH] vfio-pci: PCI hot reset interface
On Tue, 2013-08-20 at 04:18 +0100, Al Viro wrote:
> On Wed, Aug 14, 2013 at 04:42:14PM -0600, Bjorn Helgaas wrote:
>> [+cc Al, linux-fsdevel for fdget/fdput usage]
>
> fdget/fdput use looks sane, the only thing is that I would rather
> have an explicit include of linux/file.h instead of relying upon
> linux/eventfd.h pulling it.

Thanks for reviewing, I'll add an explicit include.

> Incidentally, there are only 5 files that include the latter without
> an explicit include of the former - drivers/vfio/pci/vfio_pci.c,
> drivers/vhost/scsi.c, kernel/cgroup.c, mm/memcontrol.c and
> mm/vmpressure.c. And only kernel/cgroup.c (and, with this patch,
> vfio_pci.c) really wants anything from linux/file.h, so I'd rather
> kill that indirect include in eventfd.h and slapped an explicit
> include of file.h in these two files...
>
> BTW, most of the eventfd_fget() users might as well be using fget()
> (or fdget(), for that matter). They tend to be immediately followed
> by eventfd_ctx_fileget(), which repeats the "is that an eventfd file?"
> check anyway.

Hmm, I've got one of those elsewhere in vfio code too. Thanks for the
tip.

Alex

> Completely untested patch below does that to kernel/cgroup.c; Tejun,
> Davide - do you have any objections against the following?
>
> Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c
>
> kernel/cgroup.c is the only place in the tree that relies on eventfd.h
> pulling file.h; move that include there. Switch from
> eventfd_fget()/fput() to fdget()/fdput(), while we are at it -
> eventfd_ctx_fileget() will fail on non-eventfd descriptors just fine,
> no need to do that check twice...
Re: [PATCH] target-ppc: Update slb array with correct index values.
Alexander Graf <ag...@suse.de> writes:

> On 11.08.2013, at 20:16, Aneesh Kumar K.V wrote:
>
>> From: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>>
>> Without this, a value of rb=0 and rs=0, result in us replacing the
>> 0th index
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>
> Wrong mailing list again ;).

Will post the series again with updated commit message to the qemu list.

>> ---
>>  target-ppc/kvm.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 30a870e..5d4e613 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1034,8 +1034,18 @@ int kvm_arch_get_registers(CPUState *cs)
>>      /* Sync SLB */
>>  #ifdef TARGET_PPC64
>>      for (i = 0; i < 64; i++) {
>> -        ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
>> -                           sregs.u.s.ppc64.slb[i].slbv);
>> +        target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
>> +        /*
>> +         * KVM_GET_SREGS doesn't return slb entry with slot information
>> +         * same as index. So don't depend on the slot information in
>> +         * the returned value.
>
> This is the generating code in book3s_pr.c:
>
> 	if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
> 		for (i = 0; i < 64; i++) {
> 			sregs->u.s.ppc64.slb[i].slbe = vcpu->arch.slb[i].orige | i;
> 			sregs->u.s.ppc64.slb[i].slbv = vcpu->arch.slb[i].origv;
> 		}
>
> Where exactly did you see broken slbe entries?

I noticed this when adding support for guest memory dumping via qemu gdb
server. Now the array we get would look like below

slbe0 slbv0
slbe1 slbv1
0     0

Once we get an array like that when we hit the third value we will
replace the 0th entry, that is [slbe0 slbv0]. That resulted in failed
translation of the address by qemu gdb server.

-aneesh
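The failure mode described in the reply can be modelled in a few lines. The sketch assumes, as in QEMU's ppc_store_slb(), that the target slot is taken from the low 12 bits of rb, so an all-zero array entry is stored into slot 0. `store_slb`, `sync_naive` and `sync_indexed` are hypothetical names, and forcing the loop index into rb is only one possible remedy, since the body of the actual patch is cut off above.

```c
#include <stdint.h>

#define SLB_SLOTS 64

struct slb_entry {
    uint64_t esid;
    uint64_t vsid;
};

static struct slb_entry slb[SLB_SLOTS];

/* Store one entry; the slot number comes from the low 12 bits of rb. */
static void store_slb(uint64_t rb, uint64_t rs)
{
    unsigned int slot = rb & 0xfff;

    if (slot >= SLB_SLOTS)
        return;
    slb[slot].esid = rb & ~0xfffULL;
    slb[slot].vsid = rs;
}

/* Trust the slot bits found in each source entry. */
static void sync_naive(const struct slb_entry *src)
{
    for (int i = 0; i < SLB_SLOTS; i++)
        store_slb(src[i].esid, src[i].vsid);
}

/* Ignore the slot bits in the source and use the array index instead. */
static void sync_indexed(const struct slb_entry *src)
{
    for (int i = 0; i < SLB_SLOTS; i++)
        store_slb((src[i].esid & ~0xfffULL) | (uint64_t)i, src[i].vsid);
}
```

With two valid entries followed by zeroed ones, `sync_naive` ends with slot 0 overwritten by zeros, while `sync_indexed` keeps each entry in its own slot.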