Important courses

American Research Center (A.R.C.)
Tel: (00202) 25082727 - 2508884183
Fax: (00202) 25082727
E-mail: new.a...@ymail.com
Mobile: 0020-162494849

To: the University Vice-President for Library Affairs

The center is pleased to offer applied scientific studies on the following topics:

Engineering design: evolution of computer-based design systems; creative solving of engineering problems; virtual design and manufacturing.
Engineering manufacturing: design and manufacturing in support of development; use of computer-aided manufacturing in industry; manufacturing mechanisms and rapid-response solutions for product development.
Engineering sciences: engineering design of factories and machines to reduce noise by measuring it and identifying its causes; evolution of flexible machine design in factories; avoiding wear of machines and pumps in factories.
Medical engineering: improving the design of hospitals and their equipment and reducing carbon dioxide emissions; evolution of artificial joint manufacturing in medicine; biological mechanisms in joint implantation.
Nuclear engineering: assuring structural integrity and risk assessment; improving the efficiency of the petrochemical, gas, and oil industries; acquisition of specialized data and subsea oil exploration technology.
Energy industries: generating power from carbon waste by pyrolysis; clean energy technology from solar power; generating power from landfill gas.

For inquiries please contact us by telephone or e-mail.

Director General, Research Sector
Re: KVM Processor cache size
On 08/03/2010 02:36 AM, Anthony Liguori wrote:
On 08/02/2010 05:42 PM, Andre Przywara wrote:
Anthony Liguori wrote:
On 08/02/2010 08:49 AM, Ulrich Drepper wrote:

glibc uses the cache size information returned by cpuid to perform optimizations. For instance, copy operations which would pollute too much of the cache because they are large will use non-temporal instructions. There are real performance benefits.

I imagine that there would be real performance problems from doing live migration with -cpu host too if we don't guarantee these values remain stable across migration...

Again, -cpu host is not meant to be migrated.

Then it needs to prevent migration from happening. Otherwise, it's a bug waiting to happen.

There are other virtualization use cases than cloud-like server virtualization. Sometimes users don't care about migration (or even the live version), but want full CPU exposure for performance reasons (think of virtualizing Windows on a Linux desktop). I agree that -cpu host and migration should be addressed, but only to a certain degree. And missing migration support should not be a road blocker for -cpu host.

When we can reasonably prevent it, we should prevent users from shooting themselves in the foot. Honestly, I think -cpu host is exactly what you would want to use in a cloud. A lot of private clouds and even public clouds are largely based on homogeneous hardware.

There are two good solutions for that:
a. Keep adding newer -cpu definitions like Penryn, Nehalem, Opteron_gx, so newer models will be abstracted to match the physical properties.
b. Use a strict flag with -cpu host and pass the info in the live migration protocol. Our live migration protocol can do a better job of validating the cmdline and the current set of devices/hw on the src/dst, and failing migration if there is a diff. Today we rely on libvirt for that; another mechanism will surely help, especially for -cpu host.
The goodie is that there won't be a need to wait for the non-live migration part, and more cpu cycles will be saved.

I actually think the case where you want to migrate between heterogeneous hardware is grossly overstated.

Regards, Anthony Liguori

Regards, Andre.

--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
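Option (b) above, validating CPU capabilities in the migration protocol, essentially reduces to a subset check on the feature flags the guest has seen. A minimal sketch of that idea (hypothetical helper, not actual qemu/libvirt code):

```python
def can_migrate(src_features, dst_features):
    """Migration with -cpu host is only safe if the destination host
    exposes every CPUID feature the guest already saw on the source."""
    return set(src_features) <= set(dst_features)

# Example: migrating from an older source host to a newer destination
# is fine; the reverse may expose the guest to missing features.
src = {"sse2", "ssse3", "sse4_1", "sse4_2"}
dst = {"sse2", "ssse3", "sse4_1", "sse4_2", "avx"}
print(can_migrate(src, dst))  # True: dst is a superset
print(can_migrate(dst, src))  # False: the guest may already rely on avx
```

Note the check is deliberately asymmetric: migrating "up" to a more capable host is safe, migrating "down" is not, which is exactly why heterogeneous pools need the lowest-common-denominator -cpu models instead of -cpu host.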
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:
On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori <anth...@codemonkey.ws> wrote:
On 08/02/2010 12:15 PM, John Leach wrote:

Hi, I've come across a problem with read and write disk IO performance when using O_DIRECT from within a kvm guest. With O_DIRECT, reads and writes are much slower with smaller block sizes. Depending on the block size used, I've seen 10 times slower.

For example, with an 8k block size, reading directly from /dev/vdb without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s. As a comparison, reading in O_DIRECT mode in 8k blocks directly from the backend device on the host gives 2.3 GB/s. Reading in O_DIRECT mode from a xen guest on the same hardware manages 263 MB/s.

Stefan has a few fixes for this behavior that help a lot. One of them (avoiding memset) is already upstream but not in 0.12.x. The other two are not done yet but should be on the ML in the next couple weeks. They involve using ioeventfd for notification and unlocking the block queue lock while doing a kick notification.

Thanks for mentioning those patches. The ioeventfd patch will be sent this week; I'm checking that migration works correctly and then need to check that vhost-net still works.

Writing is affected in the same way, and exhibits the same behaviour with O_SYNC too. Watching with vmstat on the host, I see the same number of blocks being read, but about 14 times the number of context switches in O_DIRECT mode (4500 cs vs. 63000 cs) and a little more cpu usage.
The device I'm writing to is a device-mapper zero device that generates zeros on read and throws away writes. You can set it up at /dev/mapper/zero like this:

echo 0 21474836480 zero | dmsetup create zero

My libvirt config for the disk is:

<disk type='block' device='disk'>
  <driver cache='none'/>
  <source dev='/dev/mapper/zero'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/>
</disk>

which translates to the kvm arg:

-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none

aio=native and changing the io scheduler on the host to deadline should help as well.

I'm testing with dd:

dd if=/dev/vdb of=/dev/null bs=8k iflag=direct

As a side note, as you increase the block size, read performance in O_DIRECT mode starts to overtake non-O_DIRECT mode reads (from about a 150k block size). By a 550k block size I'm seeing 1 GB/s reads with O_DIRECT and 770 MB/s without.

Can you take QEMU out of the picture and run the same test on the host:

dd if=/dev/vdb of=/dev/null bs=8k iflag=direct

vs

dd if=/dev/vdb of=/dev/null bs=8k

This isn't quite the same because QEMU will use a helper thread doing preadv. I'm not sure what syscall dd will use. It should be close enough to determine whether QEMU and device emulation are involved at all though, or whether these differences are due to the host kernel code path down to the device mapper zero device being different for normal vs O_DIRECT.

Stefan
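The shape of these numbers — O_DIRECT collapsing at small block sizes and overtaking buffered reads at large ones — is what you would expect when every request pays a fixed notification/context-switch cost. An illustrative model (the ~100 µs per-request overhead below is an assumed figure chosen to show the effect, not a measurement from this thread):

```python
def throughput_mb_s(block_size, per_request_s, device_bw):
    """Throughput when each request pays a fixed overhead plus transfer time.

    block_size    -- bytes per request
    per_request_s -- assumed fixed cost per request (notification, context
                     switches), in seconds
    device_bw     -- raw device bandwidth in bytes/second
    """
    total_time = per_request_s + block_size / device_bw
    return block_size / total_time / 1e6

# Assumed: 100 us overhead per request, 2.3 GB/s raw device bandwidth.
small = throughput_mb_s(8 * 1024, 100e-6, 2.3e9)    # ~79 MB/s: overhead-bound
large = throughput_mb_s(550 * 1024, 100e-6, 2.3e9)  # approaches device bandwidth
print(small, large)
```

At 8k blocks the fixed cost dominates completely, while large blocks amortize it — which is also why the ioeventfd/lock-release fixes (which shrink the per-request cost) help small block sizes the most.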
Re: [PATCH] kvm cleanup: Introduce sibling_pte and do cleanup for reverse map and parent_pte
On 08/03/2010 05:30 AM, Lai Jiangshan wrote:

This patch is just a big cleanup. It reduces 220 lines of code. It introduces a sibling_pte array for tracking identical sptes, so the identical sptes can be linked as a single linked list by their corresponding sibling_pte. A reverse map or a parent_pte points at the head of this single linked list. So we can clean up the reverse map and parent_pte VERY LARGELY.

BAD: If most rmaps have only one entry or most sps have only one parent, this patch may use more memory than before.

That is the case with NPT and EPT. Each page has exactly one spte (except a few vga pages), and each sp has exactly one parent_pte (except the root pages).

GOOD:
1) Reduces a lot of code. The functions which are in the hot path become very simple and terrifically fast.
2) rmap_next(): O(N) -> O(1). Traversing an rmap: O(N*N) -> O(N).

The existing rmap_next() is not O(N), it's O(RMAP_EXT), which is 4. The data structure was chosen over a simple linked list to avoid extra cache misses.

3) Removes the ugly interlayer: struct kvm_rmap_desc, struct kvm_pte_chain.

kvm_rmap_desc and kvm_pte_chain are indeed ugly, but they do save a lot of memory and cache misses.

4) We don't need to allocate anything when we change the mappings, so we can avoid allocation while holding the kvm mmu spin lock (this feature is very helpful in future).
5) Better readability.

I agree the new code is more readable. Unfortunately it uses more memory and is likely to be slower. You add a cache miss for every spte, while kvm_rmap_desc amortizes the cache miss among 4 sptes, and special-cases 1 spte to have no cache misses (or extra memory requirements).

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [PATCH RFC 3/4] Paravirtualized spinlock implementation for KVM guests
On 08/02/2010 06:20 PM, Jeremy Fitzhardinge wrote:
On 08/02/2010 01:48 AM, Avi Kivity wrote:
On 07/26/2010 09:15 AM, Srivatsa Vaddagiri wrote:

Paravirtual spinlock implementation for KVM guests, based heavily on the Xen guest's spinlock implementation.

+
+static struct spinlock_stats
+{
+	u64 taken;
+	u32 taken_slow;
+
+	u64 released;
+
+#define HISTO_BUCKETS	30
+	u32 histo_spin_total[HISTO_BUCKETS+1];
+	u32 histo_spin_spinning[HISTO_BUCKETS+1];
+	u32 histo_spin_blocked[HISTO_BUCKETS+1];
+
+	u64 time_total;
+	u64 time_spinning;
+	u64 time_blocked;
+} spinlock_stats;

Could these be replaced by tracepoints when starting to spin/stopping spinning etc? Then userspace can reconstruct the histogram as well as see which locks are involved and what call paths.

Unfortunately not; the tracing code uses spinlocks. (TBH I haven't actually tried, but I did give the code an eyeball to this end.)

Hm. The tracing code already uses a specialized lock (arch_spinlock_t); perhaps we can make this lock avoid the tracing? It's really sad, btw: there's all those nice lockless ring buffers and then a spinlock for ftrace_vbprintk(), instead of a per-cpu buffer.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
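For context, histogram arrays like histo_spin_* are commonly filled with power-of-two time buckets, so ~31 counters cover a huge dynamic range of spin durations. A sketch of that kind of bucketing in Python (an assumed scheme for illustration; the exact indexing in the Xen-derived code may differ):

```python
HISTO_BUCKETS = 30

def histo_bucket(delta_ns):
    """Map a duration to a log2 bucket index, clamped to the last bucket."""
    if delta_ns <= 0:
        return 0
    # bit_length() - 1 is floor(log2(delta_ns)) for positive integers
    return min(delta_ns.bit_length() - 1, HISTO_BUCKETS)

# Accumulate a few sample spin times (in ns) into the histogram.
histogram = [0] * (HISTO_BUCKETS + 1)
for sample in (1, 5, 1000, 10**9):
    histogram[histo_bucket(sample)] += 1
```

This is also why Avi's tracepoint suggestion is attractive: with raw start/stop events, userspace can rebuild any bucketing it likes after the fact instead of baking one histogram shape into the kernel.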
Re: kvm IPC
On (Thu) Jul 29 2010 [16:17:48], Nirmal Guhan wrote:

Hi, I run Fedora 12 and the guest is also Fedora 12. I use br0/tap0 for networking and communicate between host and guest using a socket. I do see some references to virtio, pci-based ipc and inter-vm shared memory, but they are not current. My question is: is there a better IPC mechanism for host-guest and inter-VM communication, and if so could you provide me with pointers?

There's virtio-serial, which is a channel between a guest and the host. You can short-circuit two host-side chardevs to get inter-VM channels as well. See https://fedoraproject.org/wiki/Features/VirtioSerial for more info. This is only available from F13, though.

Amit
Re: [PATCH 1/2] KVM: SVM: Check for nested vmrun intercept before emulating vmrun
On 08/02/2010 11:33 PM, Joerg Roedel wrote:
On Mon, Aug 02, 2010 at 06:18:09PM +0300, Avi Kivity wrote:
On 08/02/2010 05:46 PM, Joerg Roedel wrote:

This patch lets the nested vmrun fail if the L1 hypervisor has not intercepted vmrun. This fixes the vmrun intercept check unit test.

+
 static bool nested_svm_vmrun(struct vcpu_svm *svm)
 {
 	struct vmcb *nested_vmcb;
@@ -2029,6 +2037,17 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (!nested_vmcb)
 		return false;

+	if (!nested_vmcb_checks(nested_vmcb)) {
+		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
+		nested_vmcb->control.exit_code_hi = 0;
+		nested_vmcb->control.exit_info_1  = 0;
+		nested_vmcb->control.exit_info_2  = 0;
+
+		nested_svm_unmap(page);
+
+		return false;
+	}
+

Don't you have to transfer an injected event to exitintinfo?

The APM2 seems to be quiet about this.

Well, my copy says: "The VMRUN instruction then checks the guest state just loaded. If an illegal state has been loaded, the processor exits back to the host (see '#VMEXIT' on page 374)." This matches "illegal state" and "#VMEXIT" but doesn't match "guest state".

I just tried it out, and event_inj still contains the event after a failed vmrun on real hardware. This makes sense because this is no real vmexit; the vm was never entered.

Okay; will apply the patches.

--
error compiling committee.c: too many arguments to function
[GIT PULL] KVM updates for the 2.6.36 merge window
Linus, please pull from git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.36 to receive the KVM updates for the 2.6.36 cycle. No major features: mostly improved mmu and emulator correctness, some performance improvements, and support for guest XSAVE and AVX. Alex Williamson (1): KVM: remove CAP_SYS_RAWIO requirement from kvm_vm_ioctl_assign_irq Alexander Graf (5): KVM: PPC: Remove obsolete kvmppc_mmu_find_pte KVM: PPC: Use kernel hash function KVM: PPC: Make BAT only guest segments work KVM: PPC: Add generic hpte management functions KVM: PPC: Make use of hash based Shadow MMU Andi Kleen (2): KVM: Fix KVM_SET_SIGNAL_MASK with arg == NULL KVM: Fix unused but set warnings Andrea Arcangeli (1): KVM: MMU: fix mmu notifier invalidate handler for huge spte Andreas Schwab (1): KVM: PPC: elide struct thread_struct instances from stack Asias He (1): KVM: PPC: fix uninitialized variable warning in kvm_ppc_core_deliver_interrupts Avi Kivity (50): KVM: VMX: Simplify vmx_get_nmi_mask() KVM: kvm_pdptr_read() may sleep KVM: VMX: Avoid writing HOST_CR0 every entry KVM: Get rid of KVM_REQ_KICK KVM: Document KVM_SET_IDENTITY_MAP ioctl KVM: Document KVM_SET_BOOT_CPU_ID KVM: MMU: Fix free memory accounting race in mmu_alloc_roots() KVM: move vcpu locking to dispatcher for generic vcpu ioctls KVM: x86: Lock arch specific vcpu ioctls centrally KVM: s390: Centrally lock arch specific vcpu ioctls KVM: PPC: Centralize locking of arch specific vcpu ioctls KVM: Consolidate arch specific vcpu ioctl locking KVM: Update Red Hat copyrights KVM: MMU: Allow spte.w=1 for gpte.w=0 and cr0.wp=0 only in shadow mode KVM: MMU: Document cr0.wp emulation KVM: MMU: Document large pages KVM: VMX: Fix incorrect rcu deref in rmode_tss_base() KVM: Fix mov cr0 #GP at wrong instruction KVM: Fix mov cr4 #GP at wrong instruction KVM: Fix mov cr3 #GP at wrong instruction KVM: Fix xsave and xcr save/restore memory leak KVM: Consolidate load/save temporary buffer allocation and freeing KVM: Remove memory 
alias support KVM: Remove kernel-allocated memory regions KVM: i8259: reduce excessive abstraction for pic_irq_request() KVM: i8259: simplify pic_irq_request() calling sequence KVM: Add mini-API for vcpu-requests KVM: Reduce atomic operations on vcpu-requests KVM: Keep slot ID in memory slot structure KVM: Prevent internal slots from being COWed KVM: Simplify vcpu_enter_guest() mmu reload logic slightly KVM: Document KVM specific review items KVM: MMU: Introduce drop_spte() KVM: MMU: Move accessed/dirty bit checks from rmap_remove() to drop_spte() KVM: MMU: Atomically check for accessed bit when dropping an spte KVM: MMU: Don't drop accessed bit while updating an spte KVM: MMU: Only indicate a fetch fault in page fault error code if nx is enabled KVM: MMU: Keep going on permission error KVM: Expose MCE control MSRs to userspace KVM: Document MCE banks non-exposure via KVM_GET_MSR_INDEX_LIST KVM: MMU: Add link_shadow_page() helper KVM: MMU: Use __set_spte to link shadow pages KVM: MMU: Add drop_large_spte() helper KVM: MMU: Add validate_direct_spte() helper KVM: MMU: Add gpte_valid() helper KVM: MMU: Simplify spte fetch() function KVM: MMU: Validate all gptes during fetch, not just those used for new pages KVM: MMU: Eliminate redundant temporaries in FNAME(fetch) KVM: Document KVM_GET_SUPPORTED_CPUID2 ioctl KVM: VMX: Fix host GDT.LIMIT corruption Chris Lalancette (4): KVM: x86: Introduce a workqueue to deliver PIT timer interrupts KVM: x86: Allow any LAPIC to accept PIC interrupts KVM: x86: In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's KVM: Search the LAPIC's for one that will accept a PIC interrupt Christian Borntraeger (2): KVM: s390: Fix build failure due to centralized vcpu locking patches KVM: s390: Don't exit SIE on SIGP sense running Denis Kirjanov (1): KVM: PPC: fix build warning in kvm_arch_vcpu_ioctl_run Dexuan Cui (1): KVM: VMX: Enable XSAVE/XRSTOR for guest Dongxiao Xu (4): KVM: VMX: Define new functions to wrapper direct call of 
asm code KVM: VMX: Some minor changes to code structure KVM: VMX: VMCLEAR/VMPTRLD usage changes KVM: VMX: VMXON/VMXOFF usage changes Glauber Costa (1): KVM: Add Documentation/kvm/msr.txt Gleb Natapov (32): KVM: x86 emulator: introduce read cache KVM: x86 emulator: fix Move r/m16 to segment register decoding KVM: x86 emulator: cleanup xchg emulation KVM: x86 emulator: cleanup nop emulation KVM: x86 emulator: handle far address source operand KVM: x86 emulator: add (set|get)_dr
RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
-----Original Message-----
From: Shirley Ma [mailto:mashi...@us.ibm.com]
Sent: Friday, July 30, 2010 6:31 AM
To: Xin, Xiaohui
Cc: net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org; m...@redhat.com; mi...@elte.hu; da...@davemloft.net; herb...@gondor.apana.org.au; jd...@linux.intel.com
Subject: Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.

Hello Xiaohui,

On Thu, 2010-07-29 at 19:14 +0800, xiaohui@intel.com wrote:

The idea is simple: just pin the guest VM user space and then let the host NIC driver have the chance to DMA directly to it. The patches are based on the vhost-net backend driver. We add a device which provides proto_ops such as sendmsg/recvmsg to vhost-net to send/recv directly to/from the NIC driver. A KVM guest who uses the vhost-net backend may bind any ethX interface in the host side to get copyless data transfer through the guest virtio-net frontend.

Since vhost-net already supports macvtap/tun backends, do you think it would be better to implement zero copy in macvtap/tun than introducing a new media passthrough device here?

Our goal is to improve the bandwidth and reduce the CPU usage. Exact performance data will be provided later.

I did some vhost performance measurements over 10Gb ixgbe, and found that in order to get consistent BW results, netperf/netserver, qemu and vhost thread smp affinities are required. Looking forward to these results for the small message size comparison. For large message sizes, 10Gb ixgbe BW is already reached by setting vhost smp affinity with offloading support; we will see how much CPU utilization can be reduced. Please provide latency results as well. I did some experiments on macvtap zero copy sendmsg, and what I found is that get_user_pages latency is pretty high.

Could you share your performance results (including BW and latency) on vhost-net and how you got them (your configuration, and especially the affinity settings)?
Thanks
Xiaohui

Thanks
Shirley
Re: [PATCH] kvm cleanup: Introduce sibling_pte and do cleanup for reverse map and parent_pte
On 08/03/2010 02:51 PM, Avi Kivity wrote:
On 08/03/2010 05:30 AM, Lai Jiangshan wrote:

This patch is just a big cleanup. It reduces 220 lines of code. It introduces a sibling_pte array for tracking identical sptes, so the identical sptes can be linked as a single linked list by their corresponding sibling_pte. A reverse map or a parent_pte points at the head of this single linked list. So we can clean up the reverse map and parent_pte VERY LARGELY.

BAD: If most rmaps have only one entry or most sps have only one parent, this patch may use more memory than before.

That is the case with NPT and EPT. Each page has exactly one spte (except a few vga pages), and each sp has exactly one parent_pte (except the root pages).

GOOD:
1) Reduces a lot of code. The functions which are in the hot path become very simple and terrifically fast.
2) rmap_next(): O(N) -> O(1). Traversing an rmap: O(N*N) -> O(N).

The existing rmap_next() is not O(N), it's O(RMAP_EXT), which is 4. The data structure was chosen over a simple linked list to avoid extra cache misses.

3) Removes the ugly interlayer: struct kvm_rmap_desc, struct kvm_pte_chain.

kvm_rmap_desc and kvm_pte_chain are indeed ugly, but they do save a lot of memory and cache misses.

4) We don't need to allocate anything when we change the mappings, so we can avoid allocation while holding the kvm mmu spin lock (this feature is very helpful in future).
5) Better readability.

I agree the new code is more readable. Unfortunately it uses more memory and is likely to be slower. You add a cache miss for every spte, while kvm_rmap_desc amortizes the cache miss among 4 sptes, and special-cases 1 spte to have no cache misses (or extra memory requirements).

You are right, please omit this patch.

Thanks,
Lai.
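Avi's objection can be made concrete: iterating a reverse map costs roughly one cache miss per node touched. A toy model comparing the two layouts (Python stand-in for the C structures; RMAP_EXT=4 as in the existing code, node counts used as a proxy for cache misses):

```python
import math

RMAP_EXT = 4  # sptes packed per kvm_rmap_desc in the existing code

def nodes_touched_desc(n_sptes):
    """Existing scheme: a single spte is stored inline in the rmap slot
    (no extra node at all); larger rmaps are chained in descriptors
    holding RMAP_EXT sptes each."""
    if n_sptes <= 1:
        return 0
    return math.ceil(n_sptes / RMAP_EXT)

def nodes_touched_list(n_sptes):
    """Proposed sibling_pte scheme: one list node per spte."""
    return n_sptes

# The common NPT/EPT case is n=1: the descriptor scheme needs no extra
# node or memory, while the per-spte list pays one node per page.
print(nodes_touched_desc(1), nodes_touched_list(1))  # 0 1
print(nodes_touched_desc(8), nodes_touched_list(8))  # 2 8
```

The 4x amortization plus the inlined single-entry case is exactly the trade-off that makes the "ugly" kvm_rmap_desc cheaper than the cleaner linked list on NPT/EPT workloads.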
RE: Alt SeaBIOS SSDT cpu hotplug
Zheng, Shaohui wrote:

In our experience, Windows 2008 Datacenter is the only version to support CPU hotplug, and we did not find any official announcement for other Windows versions, so we tested Windows 2008 Datacenter only. Thanks to Kevin for pointing it out; we will try the Windows 7 hotplug feature.

Thanks
Regards, Shaohui

-----Original Message-----
From: Kevin O'Connor [mailto:ke...@koconnor.net]
Sent: Tuesday, August 03, 2010 1:27 AM
To: Avi Kivity
Cc: Alexander Graf; Liu, Jinsong; seab...@seabios.org; kvm@vger.kernel.org; Jiang, Yunhong; Li, Xin; Zheng, Shaohui; Zhang, Jianwu; You, Yongkang
Subject: Re: Alt SeaBIOS SSDT cpu hotplug

On Mon, Aug 02, 2010 at 07:13:34PM +0300, Avi Kivity wrote:
On 08/02/2010 06:55 PM, Kevin O'Connor wrote:
On Mon, Aug 02, 2010 at 10:12:31AM +0200, Alexander Graf wrote:
On 02.08.2010, at 07:49, Kevin O'Connor wrote:
On Mon, Aug 02, 2010 at 10:41:39AM +0800, Liu, Jinsong wrote:

It seems the Windows acpi interpreter is significantly different from the Linux one. The only guess I have is that Windows doesn't like one of the ASL constructs, even though they all look valid. I'd try to debug this by commenting out parts of the ASL until I narrowed down the parts causing the problem. Unfortunately, I don't have Windows 2008 to do this directly. Any other ideas?

Just grab yourself a free copy of Hyper-V Server 2008: http://arstechnica.com/microsoft/news/2009/08/microsoft-hyper-v-server-2008-r2-arrives-for-free.ars

I downloaded and installed it, but I can't reproduce the crash. It seems like a really stripped-down version of Windows, so I can't tell if it actually worked or not either.

I thought only the Datacenter edition supported cpu hotplug.

I just tried an old Win 7 Ultimate beta (build 7100) I had on my HD. It looks like it supports cpu hotplug. However, I don't see any failures - it seems to work fine. (After running cpu_set 1 online, the event pops up in the system event log as a UserPnP event, and the CPU appears in the system devices list.)
-Kevin

Kevin, I just tested your new patch with Windows 2008 Datacenter on my platform, and it works OK! We can hot-add new cpus and they appear in Device Manager. (BTW, yesterday I tested your new patch with a Linux 2.6.32 hvm; it works fine, we can add-remove-add-remove... cpus.) Sorry for making you spend more time on this.

Thanks, Jinsong
[ANNOUNCE] kvm-unit-tests.git
The kvm unit tests, previously found in qemu-kvm.git's kvm/test/ directory, have been moved to their own repository, kvm-unit-tests.git. The repository URL is git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git; more information can be found in http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=summary. Due to file moves before the migration, history was not migrated. Please use qemu-kvm.git for historical information.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] KVM test: Subtest unittest: append extra_params to qemu cmdline
On 08/02/2010 08:29 PM, Lucas Meneghel Rodrigues wrote:

The extra_param config option in qemu-kvm's unittest config file wasn't being honored due to a silly mistake in the latest version of the unittest patchset (forgot to add the extra_params to the params dictionary). This patch fixes the problem.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/tests/unittest.py | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py b/client/tests/kvm/tests/unittest.py
index 8be1f27..ad95720 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -75,6 +75,7 @@ def run_unittest(test, params, env):
         extra_params = None
         if parser.has_option(t, 'extra_params'):
             extra_params = parser.get(t, 'extra_params')
+        params['extra_params'] += ' %s' % extra_params

Not quite:

08/03 13:57:04 DEBUG|kvm_vm:0637| Running qemu command: /root/autotest/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100803-135522-SqL2',server,nowait -serial unix:'/tmp/serial-20100803-135522-SqL2',server,nowait -m 512 -kernel '/root/autotest/client/tests/kvm/unittests/svm.flat' -vnc :0 -chardev file,id=testlog,path=/tmp/testlog-20100803-135522-SqL2 -device testdev,chardev=testlog -S -cpu qemu64,-svm -cpu qemu64,+x2apic -enable-nesting -cpu qemu64,+svm

Looks like the += is a little excessive.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] KVM test: Subtest unittest: append extra_params to qemu cmdline
On 08/03/2010 02:25 PM, Avi Kivity wrote:
On 08/02/2010 08:29 PM, Lucas Meneghel Rodrigues wrote:

The extra_param config option in qemu-kvm's unittest config file wasn't being honored due to a silly mistake in the latest version of the unittest patchset (forgot to add the extra_params to the params dictionary). This patch fixes the problem.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/tests/unittest.py | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py b/client/tests/kvm/tests/unittest.py
index 8be1f27..ad95720 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -75,6 +75,7 @@ def run_unittest(test, params, env):
         extra_params = None
         if parser.has_option(t, 'extra_params'):
             extra_params = parser.get(t, 'extra_params')
+        params['extra_params'] += ' %s' % extra_params

Not quite:

08/03 13:57:04 DEBUG|kvm_vm:0637| Running qemu command: /root/autotest/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100803-135522-SqL2',server,nowait -serial unix:'/tmp/serial-20100803-135522-SqL2',server,nowait -m 512 -kernel '/root/autotest/client/tests/kvm/unittests/svm.flat' -vnc :0 -chardev file,id=testlog,path=/tmp/testlog-20100803-135522-SqL2 -device testdev,chardev=testlog -S -cpu qemu64,-svm -cpu qemu64,+x2apic -enable-nesting -cpu qemu64,+svm

Looks like the += is a little excessive.

It also leaks into other tests, screwing them up. So I think you might need to keep the += (so you inherit global settings) but undo it after the unit test completes.

--
error compiling committee.c: too many arguments to function
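The leak Avi describes comes from += mutating the shared params dict in place, so flags appended for one unit test survive into the next. A minimal sketch of the save/restore fix he suggests (a simplified stand-in, not the actual autotest code):

```python
def run_unittest(params, extra_params):
    """Append per-test flags to the shared params, build the command line,
    and restore params so later tests are unaffected."""
    saved = params.get('extra_params', '')
    # += semantics: inherit the global settings, then add per-test flags
    params['extra_params'] = (saved + ' ' + extra_params).strip()
    try:
        cmdline = 'qemu %s' % params['extra_params']
        return cmdline
    finally:
        params['extra_params'] = saved  # undo after the unit test completes

params = {'extra_params': '-cpu qemu64'}
run_unittest(params, '-cpu qemu64,-svm')
run_unittest(params, '-cpu qemu64,+x2apic')
print(params['extra_params'])  # '-cpu qemu64': no accumulation across tests
```

Without the try/finally restore, the second call would see the first test's flags already baked into params, which is exactly the duplicated -cpu arguments visible in the debug log above.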
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote:
On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:

qemu compiled from today's git. Using the following command line:

$qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
 -drive file=/dev/null,if=virtio \
 -enable-kvm \
 -nodefaults \
 -nographic \
 -serial stdio \
 -m 500 \
 -no-reboot \
 -no-hpet \
 -net user,vlan=0,net=169.254.0.0/16 \
 -net nic,model=ne2k_pci,vlan=0 \
 -kernel /tmp/libguestfsEyAMut/kernel \
 -initrd /tmp/libguestfsEyAMut/initrd \
 -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 TERM=xterm-color '

With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest starts. If I revert back to kernel 2.6.34, it's pretty quick as usual. strace is not very informative. It's in a loop doing select and reading/writing from some file descriptors, including the signalfd and two pipe fds. Anyone seen anything like this?

I assume your initrd is huge.

It's ~110MB, yes.

In newer kernels ins/outs are much slower than they were. They are much more correct too. It shouldn't be 1 min 20 sec for a 100M initrd though, but it can take 20-30 sec. This belongs on the kvm list BTW.

I can't see anything about this in the kernel changelog. Can you point me to the commit or the key phrase to look for? Also, what's the point of making in/out more correct when we know we're talking to qemu (e.g. from the CPUID) and we know it already worked fine before with qemu?

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v
Re: [PATCH 9/24] Implement VMCLEAR
On Tue, Jul 06, 2010, Dong, Eddie wrote about RE: [PATCH 9/24] Implement VMCLEAR:

Nadav Har'El wrote:
This patch implements the VMCLEAR instruction.
...
The SDM implements an alignment check, range check and reserved bit check, and may generate VMfail(VMCLEAR with invalid physical address). As well as the addr != VMXON pointer check. Missed?

Right. I will add some of the missing checks - e.g., currently if the given address is not page-aligned, I chop off the last bits and pretend that it is, which can cause problems (although not for correctly-written hypervisors).

About the missing addr != VMXON pointer check: as I explained in a comment in the code (handle_vmon()), this was a deliberate omission. The current implementation doesn't store anything in the VMXON page (and I see no reason why this will change in the future), so the VMXON emulation (handle_vmon()) doesn't even bother to save the pointer it is given, and VMCLEAR and VMPTRLD don't check that the address they are given is different from this pointer, since there is no real cause for concern even if it is. I can quite easily add the missing code to save the vmxon pointer and check it on vmclear/vmptrld, but frankly, wouldn't it be rather pointless?

The SDM has a formal definition of VMsucceed. Clearing CF/ZF only is not sufficient, as SDM 2B 5.2 mentions. Any special concern here? BTW, should we define formal VMfail()/VMsucceed() APIs for easier understanding and mapping to the SDM?

This is a good idea, and I'll do that.

--
Nadav Har'El | Tuesday, Aug 3 2010, 23 Av 5770
n...@math.technion.ac.il | Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il | Sign in zoo: Do not feed the animals. If you have food give it to the guard on duty
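Eddie's point about VMsucceed is that the SDM's conventions pin down all six arithmetic flags, not just CF and ZF: VMsucceed clears CF, PF, AF, ZF, SF and OF; VMfailInvalid sets CF and clears the rest; VMfailValid sets ZF, clears the rest, and records an error number in the VM-instruction error field. A sketch of those three conventions as pure functions on an RFLAGS value (Python stand-in for what the emulation would do):

```python
# RFLAGS bit positions: CF=0, PF=2, AF=4, ZF=6, SF=7, OF=11
CF, PF, AF, ZF, SF, OF = (1 << b for b in (0, 2, 4, 6, 7, 11))
ARITH_FLAGS = CF | PF | AF | ZF | SF | OF

def vmsucceed(rflags):
    """VMsucceed: clear CF, PF, AF, ZF, SF and OF."""
    return rflags & ~ARITH_FLAGS

def vmfail_invalid(rflags):
    """VMfailInvalid: CF set, the other five flags cleared."""
    return (rflags & ~ARITH_FLAGS) | CF

def vmfail_valid(rflags):
    """VMfailValid: ZF set, the other five flags cleared.
    (The error number additionally goes in the VM-instruction
    error field of the current VMCS, not modeled here.)"""
    return (rflags & ~ARITH_FLAGS) | ZF
```

Wrapping these into named helpers, as suggested above, makes the emulation read line-for-line like the SDM's pseudocode instead of scattering flag twiddling across handlers.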
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote: On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote: qemu compiled from today's git. Using the following command line:

$qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
    -drive file=/dev/null,if=virtio \
    -enable-kvm \
    -nodefaults \
    -nographic \
    -serial stdio \
    -m 500 \
    -no-reboot \
    -no-hpet \
    -net user,vlan=0,net=169.254.0.0/16 \
    -net nic,model=ne2k_pci,vlan=0 \
    -kernel /tmp/libguestfsEyAMut/kernel \
    -initrd /tmp/libguestfsEyAMut/initrd \
    -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 TERM=xterm-color '

With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest starts. If I revert back to kernel 2.6.34, it's pretty quick as usual. strace is not very informative. It's in a loop doing select and reading/writing from some file descriptors, including the signalfd and two pipe fds. Anyone seen anything like this? I assume your initrd is huge. It's ~110MB, yes. In newer kernels ins/outs are much slower than they were. They are much more correct too. It shouldn't be 1 min 20 sec for a 100M initrd though, but it can take 20-30 sec. This belongs on the kvm list, BTW. I can't see anything about this in the kernel changelog. Can you point me to the commit or the key phrase to look for? 7972995b0c346de76 Also, what's the point of making in/out more correct when we know we're talking to qemu (e.g. from the CPUID) and we know it already worked fine before with qemu? Qemu has nothing to do with that. ins/outs didn't work correctly in some situations. They didn't work at all if the destination/source memory was MMIO (didn't work as in: hung the vcpu, IIRC, and this is a security risk). The direction flag wasn't handled at all (if it was set, the instruction injected #GP into the guest).
It didn't check that the memory it writes to is shadowed, in which case special action should be taken. It didn't deliver events during long string operations. Maybe more. Unfortunately, adding all of that makes emulation much slower. I already implemented some speedups, and more are possible, but we will not be able to get back to the previous string I/O speed, which was our upper limit. -- Gleb.
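Gleb's explanation above (correctness checks making string I/O much slower) can be sized with rough arithmetic. All per-byte costs below are assumptions chosen to illustrate the reported timings, not measurements from the thread.

```python
# Back-of-the-envelope estimate of how per-byte emulation cost translates
# into load time for the ~110 MB initrd reported in the thread.
INITRD_BYTES = 110 * 1024 * 1024   # ~110 MB

def load_time_seconds(ns_per_byte):
    """Total transfer time if every emulated byte costs ns_per_byte."""
    return INITRD_BYTES * ns_per_byte / 1e9

# Hypothetical per-byte costs for the old (fast, less correct) path and
# the new stricter path:
old_path = load_time_seconds(5)    # well under a second
new_path = load_time_seconds(200)  # lands in the 20-30 s range Gleb mentions
```

The point is only that a per-byte port-I/O interface multiplies any per-unit emulation cost by the full payload size, which is why a stricter emulator shows up so dramatically on a 110 MB transfer.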
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 03:37:14PM +0300, Gleb Natapov wrote: On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote: I can't see anything about this in the kernel changelog. Can you point me to the commit or the key phrase to look for? 7972995b0c346de76 Thanks - I see. Also, what's the point of making in/out more correct when we know we're talking to qemu (e.g. from the CPUID) and we know it already worked fine before with qemu? Qemu has nothing to do with that. ins/outs didn't work correctly in some situations. They didn't work at all if the destination/source memory was MMIO (didn't work as in: hung the vcpu, IIRC, and this is a security risk). The direction flag wasn't handled at all (if it was set, the instruction injected #GP into the guest). It didn't check that the memory it writes to is shadowed, in which case special action should be taken. It didn't deliver events during long string operations. Maybe more. Unfortunately, adding all of that makes emulation much slower. I already implemented some speedups, and more are possible, but we will not be able to get back to the previous string I/O speed, which was our upper limit. Thanks for the explanation. I'll repost my DMA-like fw-cfg patch once I've rebased it and done some more testing. This huge regression for a common operation (implementing -initrd) needs to be solved without using inb/rep ins. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming blog: http://rwmj.wordpress.com Fedora now supports 80 OCaml packages (the OPEN alternative to F#) http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora
Re: [Qemu-devel] KVM call agenda for August 3
On Tue, 03 Aug 2010 01:46:01 +0200 Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. - 0.13 Let's keep remembering Anthony ;-) thanks, Juan
Re: [Qemu-devel] KVM call agenda for August 3
On 08/03/2010 04:01 PM, Luiz Capitulino wrote: On Tue, 03 Aug 2010 01:46:01 +0200 Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. - 0.13 More specifically, 0.13-rc0. Tagged but not announced? I'd like to announce it so people can start testing it. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 03:48 PM, Richard W.M. Jones wrote: Thanks for the explanation. I'll repost my DMA-like fw-cfg patch once I've rebased it and done some more testing. This huge regression for a common operation (implementing -initrd) needs to be solved without using inb/rep ins. Adding more interfaces is easy but a problem in the long term. We'll optimize it as much as we can. Meanwhile, why are you loading huge initrds? Use a cdrom instead (it will also be faster since the guest doesn't need to unpack it). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] KVM call agenda for August 3
On 08/03/2010 08:16 AM, Avi Kivity wrote: On 08/03/2010 04:01 PM, Luiz Capitulino wrote: On Tue, 03 Aug 2010 01:46:01 +0200 Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. - 0.13 More specifically, 0.13-rc0. Tagged but not announced? I'd like to announce it so people can start testing it. That's the normal process. 0.13.0-rc0 is just a git snapshot and as such, everyone has been testing it already. 0.13.0-rc1 is due to be tagged later today and that's the first one that's useful to test separately. Regards, Anthony Liguori
[PATCH] KVM test: Unittest subtest: Avoid leak of extra_params
This is the sequel of the previous fix on the unittest subtest: as we're running in a loop through the unittest list, the original extra_params needs to be restored at the end of each test, so previously set extra_params don't leak to other unittests.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/tests/unittest.py |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py b/client/tests/kvm/tests/unittest.py
index ad95720..c52637a 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -46,6 +46,8 @@ def run_unittest(test, params, env):
     timeout = int(params.get('unittest_timeout', 600))

+    extra_params_original = params['extra_params']
+
     for t in test_list:
         logging.info('Running %s', t)
@@ -111,5 +113,8 @@ def run_unittest(test, params, env):
         except NameError, IOError:
             logging.error("Not possible to collect logs")

+    # Restore the extra params so other tests can run normally
+    params['extra_params'] = extra_params_original
+
     if nfail != 0:
         raise error.TestFail("Unit tests failed: %s" % " ".join(tests_failed))
--
1.7.1.1
Re: [Qemu-devel] KVM call agenda for August 3
On 08/03/2010 04:31 PM, Anthony Liguori wrote: On 08/03/2010 08:16 AM, Avi Kivity wrote: On 08/03/2010 04:01 PM, Luiz Capitulino wrote: On Tue, 03 Aug 2010 01:46:01 +0200 Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. - 0.13 More specifically, 0.13-rc0. Tagged but not announced? I'd like to announce it so people can start testing it. That's the normal process. 0.13.0-rc0 is just a git snapshot and as such, everyone has been testing it already. 0.13.0-rc1 is due to be tagged later today and that's the first one that's useful to test separately. I meant users. Many users avoid git and test tarballs which come from an announcement instead. Same for distros; things like rawhide can package an -rc0. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote: On 08/03/2010 03:48 PM, Richard W.M. Jones wrote: Thanks for the explanation. I'll repost my DMA-like fw-cfg patch once I've rebased it and done some more testing. This huge regression for a common operation (implementing -initrd) needs to be solved without using inb/rep ins. Adding more interfaces is easy but a problem in the long term. We'll optimize it as much as we can. Meanwhile, why are you loading huge initrds? Use a cdrom instead (it will also be faster since the guest doesn't need to unpack it). Because it involves rewriting the entire appliance building process, and we don't necessarily know if it'll be faster after we've done that. Look: currently we create the initrd on the fly in 700ms. We've no reason to believe that creating a CD-ROM on the fly wouldn't take around the same time. After all, both processes involve reading all the host files from disk and writing a temporary file. You have to create these things on the fly, because we don't actually ship an appliance to end users, just a tiny (< 1 MB) skeleton. You can't ship a massive statically linked appliance to end users because it's just unmanageable (think: security, updates, bandwidth). Loading the initrd currently takes 115ms (or could do, if a sensible 50-line patch were permitted). So the only possible saving would be the 115ms load time of the initrd. In theory the CD-ROM device could be detected in 0 time. Total saving: 115ms. But will it be any faster, since after spending 115ms, everything runs from memory, versus being loaded from the CD? Let's face the fact that qemu has suffered from an enormous regression. From some hundreds of milliseconds up to over a minute, in the space of 6 months of development. For a very simple operation: loading a file into memory. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-top is 'top' for virtual machines.
Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top
Re: [Qemu-devel] KVM call agenda for August 3
On 08/03/2010 08:49 AM, Avi Kivity wrote: On 08/03/2010 04:31 PM, Anthony Liguori wrote: On 08/03/2010 08:16 AM, Avi Kivity wrote: On 08/03/2010 04:01 PM, Luiz Capitulino wrote: On Tue, 03 Aug 2010 01:46:01 +0200 Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. - 0.13 More specifically, 0.13-rc0. Tagged but not announced? I'd like to announce it so people can start testing it. That's the normal process. 0.13.0-rc0 is just a git snapshot and as such, everyone has been testing it already. 0.13.0-rc1 is due to be tagged later today and that's the first one that's useful to test separately. I meant users. Many users avoid git and test tarballs which come from an announcement instead. Same for distros; things like rawhide can package an -rc0. -rc0 is available in rawhide FWIW. Regards, Anthony Liguori
Re: [Qemu-devel] KVM call agenda for August 3
On 08/03/2010 05:25 PM, Anthony Liguori wrote: I meant users. Many users avoid git and test tarballs which come from an announcement instead. Same for distros; things like rawhide can package an -rc0. -rc0 is available in rawhide FWIW. Cool. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 05:05 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote: On 08/03/2010 03:48 PM, Richard W.M. Jones wrote: Thanks for the explanation. I'll repost my DMA-like fw-cfg patch once I've rebased it and done some more testing. This huge regression for a common operation (implementing -initrd) needs to be solved without using inb/rep ins. Adding more interfaces is easy but a problem in the long term. We'll optimize it as much as we can. Meanwhile, why are you loading huge initrds? Use a cdrom instead (it will also be faster since the guest doesn't need to unpack it). Because it involves rewriting the entire appliance building process, and we don't necessarily know if it'll be faster after we've done that. Look: currently we create the initrd on the fly in 700ms. We've no reason to believe that creating a CD-ROM on the fly wouldn't take around the same time. After all, both processes involve reading all the host files from disk and writing a temporary file. The time will only continue to grow as you add features and as the distro bloats naturally. Much better to create it once and only update it if some dependent file changes (basically the current on-the-fly code + save a list of file timestamps). Alternatively, pass through the host filesystem. You have to create these things on the fly, because we don't actually ship an appliance to end users, just a tiny ( 1 MB) skeleton. You can't ship a massive statically linked appliance to end users because it's just unmanageable (think: security; updates; bandwidth). Shipping it is indeed out of the question. But on-the-fly creation is not the only alternative. Loading the initrd currently takes 115ms (or could do, if a sensible 50 line patch was permitted). So the only possible saving would be the 115ms load time of the initrd. In theory the CD-ROM device could be detected in 0 time. Total saving: 115ms. 815 ms by my arithmetic. 
You also save 3*N-2*P memory, where N is the size of your initrd and P is the actual amount used by the guest. But will it be any faster, since after spending 115ms, everything runs from memory, versus being loaded from the CD? Let's face the fact that qemu has suffered from an enormous regression. From some hundreds of milliseconds up to over a minute, in the space of 6 months of development. It wasn't qemu, but kvm. And it didn't take six months, just a few commits. Those aren't going back; they're a lot more important than some libguestfs problem which should have been coded differently in the first place. For a very simple operation: loading a file into memory. Loading a file into memory is plenty fast if you use the standard interfaces. -kernel -initrd is a specialized interface. -- error compiling committee.c: too many arguments to function
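One plausible reading of the 3*N-2*P figure above (an interpretation, not something spelled out in the thread): with -initrd the image can be resident roughly three times over (qemu's buffer, the loaded initrd image, and its unpacked contents), while a cdrom-backed appliance costs roughly two copies of only the data the guest actually touches (host page cache plus guest page cache).

```python
MB = 1024 * 1024

def memory_saved(n, p):
    """Memory saved by a cdrom appliance vs. a loaded-and-unpacked initrd,
    per Avi's 3*N - 2*P figure. n = initrd size in bytes, p = bytes the
    guest actually uses. The 3*n vs. 2*p split is an assumed reading of
    his arithmetic, not stated explicitly in the thread."""
    return 3 * n - 2 * p

# A hypothetical 110 MB initrd of which the guest only touches 40 MB:
saving = memory_saved(110 * MB, 40 * MB)   # 250 MB saved
```

Since P <= N, the saving is always at least N, which is why the cdrom route looks attractive for large appliances regardless of boot time.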
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On Mon, 2010-08-02 at 21:50 +0100, Stefan Hajnoczi wrote: On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori anth...@codemonkey.ws wrote: On 08/02/2010 12:15 PM, John Leach wrote: Hi, I've come across a problem with read and write disk IO performance when using O_DIRECT from within a kvm guest. With O_DIRECT, reads and writes are much slower with smaller block sizes. Depending on the block size used, I've seen 10 times slower. For example, with an 8k block size, reading directly from /dev/vdb without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s. As a comparison, reading in O_DIRECT mode in 8k blocks directly from the backend device on the host gives 2.3 GB/s. Reading in O_DIRECT mode from a xen guest on the same hardware manages 263 MB/s. Stefan has a few fixes for this behavior that help a lot. One of them (avoiding memset) is already upstream but not in 0.12.x. Anthony, that patch is already applied in the RHEL6 package I'm been testing with - I've just manually confirmed that. Thanks though. The other two are not done yet but should be on the ML in the next couple weeks. They involve using ioeventfd for notification and unlocking the block queue lock while doing a kick notification. Thanks for mentioning those patches. The ioeventfd patch will be sent this week, I'm checking that migration works correctly and then need to check that vhost-net still works. I'll give them a test as soon as I can get hold of them, thanks Stefan! Writing is affected in the same way, and exhibits the same behaviour with O_SYNC too. Watching with vmstat on the host, I see the same number of blocks being read, but about 14 times the number of context switches in O_DIRECT mode (4500 cs vs. 63000 cs) and a little more cpu usage. 
The device I'm writing to is a device-mapper zero device that generates zeros on read and throws away writes. You can set it up at /dev/mapper/zero like this:

echo "0 21474836480 zero" | dmsetup create zero

My libvirt config for the disk is:

<disk type='block' device='disk'>
  <driver cache='none'/>
  <source dev='/dev/mapper/zero'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/>
</disk>

which translates to the kvm args:

-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none

I'm testing with dd:

dd if=/dev/vdb of=/dev/null bs=8k iflag=direct

As a side note, as you increase the block size, read performance in O_DIRECT mode starts to overtake non-O_DIRECT mode reads (from about a 150k block size). By a 550k block size I'm seeing 1 GB/s reads with O_DIRECT and 770 MB/s without. Can you take QEMU out of the picture and run the same test on the host: dd if=/dev/vdb of=/dev/null bs=8k iflag=direct vs dd if=/dev/vdb of=/dev/null bs=8k This isn't quite the same because QEMU will use a helper thread doing preadv. I'm not sure what syscall dd will use. It should be close enough to determine whether QEMU and device emulation are involved at all though, or whether these differences are due to the host kernel code path down to the device mapper zero device being different for normal vs O_DIRECT.

dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s
dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s

dd is just using read. Thanks, John.
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On 08/03/2010 05:40 PM, John Leach wrote: dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct 8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s dd is just using read. What's /dev/mapper/zero? A real volume or a zero target? -- error compiling committee.c: too many arguments to function
Re: 2.6.35 hangs on early boot in KVM
On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. The latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. -- error compiling committee.c: too many arguments to function
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On Tue, 2010-08-03 at 09:35 +0300, Dor Laor wrote: On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote: On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguorianth...@codemonkey.ws wrote: On 08/02/2010 12:15 PM, John Leach wrote: Hi, I've come across a problem with read and write disk IO performance when using O_DIRECT from within a kvm guest. With O_DIRECT, reads and writes are much slower with smaller block sizes. Depending on the block size used, I've seen 10 times slower. For example, with an 8k block size, reading directly from /dev/vdb without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s. As a comparison, reading in O_DIRECT mode in 8k blocks directly from the backend device on the host gives 2.3 GB/s. Reading in O_DIRECT mode from a xen guest on the same hardware manages 263 MB/s. Stefan has a few fixes for this behavior that help a lot. One of them (avoiding memset) is already upstream but not in 0.12.x. The other two are not done yet but should be on the ML in the next couple weeks. They involve using ioeventfd for notification and unlocking the block queue lock while doing a kick notification. Thanks for mentioning those patches. The ioeventfd patch will be sent this week, I'm checking that migration works correctly and then need to check that vhost-net still works. Writing is affected in the same way, and exhibits the same behaviour with O_SYNC too. Watching with vmstat on the host, I see the same number of blocks being read, but about 14 times the number of context switches in O_DIRECT mode (4500 cs vs. 63000 cs) and a little more cpu usage. 
The device I'm writing to is a device-mapper zero device that generates zeros on read and throws away writes, you can set it up at /dev/mapper/zero like this: echo 0 21474836480 zero | dmsetup create zero My libvirt config for the disk is: disk type='block' device='disk' driver cache='none'/ source dev='/dev/mapper/zero'/ target dev='vdb' bus='virtio'/ address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/ /disk which translates to the kvm arg: -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none aio=native and change the io scheduler on the host to deadline should help as well. No improvement in this case (I was already using deadline on the host, and just tested with aio=native). Tried with a real disk backend too, still no improvement. I'll try with and without once I get Stefan's other patches too though. Thanks, John. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote: The time will only continue to grow as you add features and as the distro bloats naturally. Much better to create it once and only update it if some dependent file changes (basically the current on-the-fly code + save a list of file timestamps). This applies to both cases; the initrd could also be saved, so: Total saving: 115ms. 815 ms by my arithmetic. No, not true: 115ms. You also save 3*N-2*P memory where N is the size of your initrd and P is the actual amount used by the guest. Can you explain this? Loading a file into memory is plenty fast if you use the standard interfaces. -kernel -initrd is a specialized interface. Why bother with any command line options at all? After all, they keep changing and causing problems for qemu's users ... Apparently we're all doing stuff wrong, in ways that are never explained by the developers. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming blog: http://rwmj.wordpress.com Fedora now supports 80 OCaml packages (the OPEN alternative to F#) http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On Tue, 2010-08-03 at 17:44 +0300, Avi Kivity wrote: On 08/03/2010 05:40 PM, John Leach wrote: dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct 8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s dd is just using read. What's /dev/mapper/zero? A real volume or a zero target? zero target: echo "0 21474836480 zero" | dmsetup create zero The same performance penalty occurs when using real disks though; I just moved to a zero target to rule out the variables of spinning metal and raid controller caches. John.
Re: 2.6.35 hangs on early boot in KVM
On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. I am in the middle of a bisect run with five builds left to go, currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65 Tvrtko Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United Kingdom. Company Reg No 2096520. VAT Reg No GB 348 3873 20.
Re: 2.6.35 hangs on early boot in KVM
On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. The latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. I am in the middle of a bisect run with five builds left to go; currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65 Bisect is looking good; I have narrowed it to ten revisions, but I am not sure I'll make it to the end today: bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea good 41d59102e146a4423a490b8eca68a5860af4fe1c One interesting warning spotted: include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS
Re: 2.6.35 hangs on early boot in KVM
On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. The latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. I am in the middle of a bisect run with five builds left to go; currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65 Bisect is looking good; I have narrowed it to ten revisions, but I am not sure I'll make it to the end today: bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea good 41d59102e146a4423a490b8eca68a5860af4fe1c One interesting warning spotted: include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS Copying Peter and Borislav; guys, please look at the above warning. I am bisecting a non-bootable 2.6.35 under KVM and while I am not there yet, it is close to the hweight commit (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I spotted this warning. Tvrtko
[PATCH 10/11] uio: do not use PCI resources before pci_enable_device()
IRQ and resource[] may not have correct values until after PCI hotplug setup occurs at pci_enable_device() time. The semantic match that finds this problem is as follows:

// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*|pci_resource.*";
@@
(
* x->irq
|
* x->resource
|
* request(x, ...)
)
...
*pci_enable_device(x)
// </smpl>

Signed-off-by: Kulikov Vasiliy sego...@gmail.com
---
 drivers/uio/uio_pci_generic.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index 85c9884..fc22e1e 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
 	struct uio_pci_generic_dev *gdev;
 	int err;

-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		return -ENODEV;
-	}
-
 	err = pci_enable_device(pdev);
 	if (err) {
 		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
@@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
 		return err;
 	}

+	if (!pdev->irq) {
+		dev_warn(&pdev->dev, "No IRQ assigned to device: "
+			 "no support for interrupts?\n");
+		pci_disable_device(pdev);
+		return -ENODEV;
+	}
+
 	err = verify_pci_2_3(pdev);
 	if (err)
 		goto err_verify;
--
1.7.0.4
Re: [PATCH 10/11] uio: do not use PCI resources before pci_enable_device()
On Tue, Aug 03, 2010 at 07:44:23PM +0400, Kulikov Vasiliy wrote:

IRQ and resource[] may not have correct values until after PCI hotplug setup occurs at pci_enable_device() time. The semantic match that finds this problem is as follows:

// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*|pci_resource.*";
@@
(
* x->irq
|
* x->resource
|
* request(x, ...)
)
...
*pci_enable_device(x)
// </smpl>

Signed-off-by: Kulikov Vasiliy sego...@gmail.com

Looks sane. Acked-by: Michael S. Tsirkin m...@redhat.com

---
 drivers/uio/uio_pci_generic.c | 13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index 85c9884..fc22e1e 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
 	struct uio_pci_generic_dev *gdev;
 	int err;
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		return -ENODEV;
-	}
-
 	err = pci_enable_device(pdev);
 	if (err) {
 		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
@@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
 		return err;
 	}
 
+	if (!pdev->irq) {
+		dev_warn(&pdev->dev, "No IRQ assigned to device: "
+			 "no support for interrupts?\n");
+		pci_disable_device(pdev);
+		return -ENODEV;
+	}
+
 	err = verify_pci_2_3(pdev);
 	if (err)
 		goto err_verify;
-- 
1.7.0.4
Re: 2.6.35 hangs on early boot in KVM
From: Tvrtko Ursulin tvrtko.ursu...@sophos.com Date: Tue, Aug 03, 2010 at 11:31:02AM -0400

One interesting warning spotted: include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS. Copying Peter and Borislav; guys, please look at the above warning. I am bisecting a non-bootable 2.6.35 under KVM and while I am not there yet, it is close to the hweight commit (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I spotted this warning.

That's because you're at a bisection point before the hweight patch, but your .config already contains the ARCH_HWEIGHT_CFLAGS variable because of the previous bisection point, which contained the hweight patch. I think this can be safely ignored.

-- Regards/Gruss, Boris.

Advanced Micro Devices GmbH, Einsteinring 24, 85609 Dornach. General Managers: Alberto Bozzo, Andrew Bowd. Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen, Registergericht Muenchen, HRB Nr. 43632.
RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
Hello Xiaohui,

On Tue, 2010-08-03 at 16:48 +0800, Xin, Xiaohui wrote: Could you share your performance results (including BW and latency) on vhost-net, and how you got them (your configuration, and especially the affinity settings)?

My macvtap zero copy is incomplete; I am testing sendmsg only now. The initial performance is not good, especially for latency (zero copy vs. copy). I am still working on it to find out why and how to improve it. That's the reason I am eager to know your performance results and how much performance gain you have seen. Since your patch is complete, I will try it here for performance. If you have some performance results to share, that would be great.

Thanks Shirley
Re: 2.6.35 hangs on early boot in KVM
On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:

I have basically built 2.6.35 with make oldconfig from a working 2.6.34. The latter works fine in KVM while 2.6.35 hangs very early. I see nothing after grub (I have early printk and verbose bootup enabled), just a blinking VGA cursor and the CPU at 100%.

Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y: try disabling this as a workaround.

I am in the middle of a bisect run with five builds left to go; currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7, good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65.

The bisect is looking good, narrowed to ten revisions, but I am not sure I will make it to the end today: bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea, good 41d59102e146a4423a490b8eca68a5860af4fe1c.

Bisect points the finger at "x86, ioapic: In mpparse use mp_register_ioapic" (cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea whether this commit is solely to blame or whether it is a combined interaction with KVM, but I am sure you guys will know. If you want me to test something else, please shout.

Tvrtko
Re: 2.6.35 hangs on early boot in KVM
On Tuesday 03 Aug 2010 16:49:01 Borislav Petkov wrote: From: Tvrtko Ursulin tvrtko.ursu...@sophos.com Date: Tue, Aug 03, 2010 at 11:31:02AM -0400

One interesting warning spotted: include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS. Copying Peter and Borislav; guys, please look at the above warning. I am bisecting a non-bootable 2.6.35 under KVM and while I am not there yet, it is close to the hweight commit (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I spotted this warning.

That's because you're at a bisection point before the hweight patch, but your .config already contains the ARCH_HWEIGHT_CFLAGS variable because of the previous bisection point, which contained the hweight patch. I think this can be safely ignored.

Yep, bisect pointed to another commit, so I continued in another part of this thread. Thanks for the explanation!

Tvrtko
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 05:53 PM, Richard W.M. Jones wrote: Total saving: 115ms. 815 ms by my arithmetic. No, not true: 115ms. If you bypass creating the initrd/cdrom (700 ms) and loading it (115 ms) you save 815 ms. You also save 3*N-2*P memory, where N is the size of your initrd and P is the actual amount used by the guest. Can you explain this?

(assuming ahead-of-time image generation)

initrd:
  qemu reads image (host pagecache): N
  qemu stores image in RAM: N
  guest copies image to its RAM: N
  guest faults working set (no XIP): P
  total: 3N+P

initramfs:
  qemu reads image (host pagecache): N
  qemu stores image: N
  guest copies image: N
  guest extracts image (XIP): N
  total: 4N

cdrom:
  guest faults working set: P
  kernel faults working set: P
  total: 2P

difference: 3N-P or 4N-2P, depending on model.

Loading a file into memory is plenty fast if you use the standard interfaces. -kernel -initrd is a specialized interface. Why bother with any command line options at all? After all, they keep changing and causing problems for qemu's users ... Apparently we're all doing stuff wrong, in ways that are never explained by the developers. That's a real problem. It's hard to explain the intent behind something, especially when it's obvious to the author and not so obvious to the user. However, making everything do everything under all circumstances has its costs. -kernel and -initrd is a developer's interface intended to make life easier for users who use qemu to develop kernels. It was not intended as a high performance DMA engine. Neither was the firmware _configuration_ interface. That is what virtio and, to a lesser extent, IDE were written to perform. You'll get much better results from them.

-- error compiling committee.c: too many arguments to function
Re: bad O_DIRECT read and write performance with small block sizes with virtio
On 08/03/2010 05:57 PM, John Leach wrote: On Tue, 2010-08-03 at 17:44 +0300, Avi Kivity wrote: On 08/03/2010 05:40 PM, John Leach wrote:

dd if=/dev/mapper/zero of=/dev/null bs=8k count=100 iflag=direct
819200 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s
dd if=/dev/mapper/zero of=/dev/null bs=8k count=100
819200 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s

dd is just using read. What's /dev/mapper/zero? A real volume or a zero target?

A zero target: echo "0 21474836480 zero" | dmsetup create zero. The same performance penalty occurs when using real disks though; I just moved to a zero target to rule out the variables of spinning metal and raid controller caches.

Don't; it's confusing things. I'd expect dd to be slower with iflag=direct, since the kernel can't do readahead and instead must round-trip to the controller. With a zero target it's faster, since it doesn't have to round-trip and instead avoids a copy.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote: -kernel and -initrd is a developer's interface intended to make life easier for users that use qemu to develop kernels. It was not intended as a high performance DMA engine. Neither was the firmware _configuration_ interface. That is what virtio and to a lesser extent IDE was written to perform. You'll get much better results from them.

Firmware configuration replaced something which was already working really fast -- preloading the images into memory -- with something which worked slower, and has just recently got _way_ more slow. This is a regression. Plain and simple. I have posted a small patch which makes this 650x faster without appreciable complication.

Rich.

-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top
Re: [PATCH 00/27] KVM PPC PV framework v3
On Sun, 1 Aug 2010 22:21:37 +0200 Alexander Graf ag...@suse.de wrote: On 01.08.2010, at 16:02, Avi Kivity wrote: Looks reasonable. Since it's fair to say I understand nothing about powerpc, I'd like someone who does to review it and ack, please, with an emphasis on the interfaces. Sounds good. Preferably someone with access to the ePAPR spec :). The ePAPR-relevant stuff in patches 7, 16, and 17 looks reasonable. Did I miss any ePAPR-relevant stuff in the other patches? -Scott
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 09:53 AM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote: The time will only continue to grow as you add features and as the distro bloats naturally. Much better to create it once and only update it if some dependent file changes (basically the current on-the-fly code + save a list of file timestamps). This applies to both cases; the initrd could also be saved, so: Total saving: 115ms. 815 ms by my arithmetic. No, not true: 115ms. You also save 3*N-2*P memory, where N is the size of your initrd and P is the actual amount used by the guest. Can you explain this? Loading a file into memory is plenty fast if you use the standard interfaces. -kernel -initrd is a specialized interface. Why bother with any command line options at all? After all, they keep changing and causing problems for qemu's users ... Apparently we're all doing stuff wrong, in ways that are never explained by the developers.

Let's be fair. I think we've all agreed to adjust the fw_cfg interface to implement DMA. The only requirement was that the DMA operation not be triggered from a single port I/O but rather based on a polling operation, which better fits the way real hardware works. Is this a regression? Probably. But performance regressions that result from correctness fixes don't get reverted. We have to find an approach to improve performance without impacting correctness.

That said, the general view of -kernel/-append is that these are developer options, and we don't really look at them as a performance critical interface. We could do a better job of communicating this to users, but that's true of most of the features we support.

Regards, Anthony Liguori
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 07:28 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote: -kernel and -initrd is a developer's interface intended to make life easier for users that use qemu to develop kernels. It was not intended as a high performance DMA engine. Neither was the firmware _configuration_ interface. That is what virtio and to a lesser extent IDE was written to perform. You'll get much better results from them.

Firmware configuration replaced something which was already working really fast -- preloading the images into memory -- with something which worked slower, and has just recently got _way_ more slow. This is a regression. Plain and simple.

It's only a regression if there was any intent at making this a performant interface. Otherwise any change can be interpreted as a regression. Even "the binary doesn't hash to the exact same signature" is a regression.

I have posted a small patch which makes this 650x faster without appreciable complication.

It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 11:44 AM, Avi Kivity wrote: On 08/03/2010 07:28 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote: -kernel and -initrd is a developer's interface intended to make life easier for users that use qemu to develop kernels. It was not intended as a high performance DMA engine. Neither was the firmware _configuration_ interface. That is what virtio and to a lesser extent IDE was written to perform. You'll get much better results from them. Firmware configuration replaced something which was already working really fast -- preloading the images into memory -- with something which worked slower, and has just recently got _way_ more slow. This is a regression. Plain and simple. It's only a regression if there was any intent at making this a performant interface. Otherwise any change can be interpreted as a regression. Even "the binary doesn't hash to the exact same signature" is a regression. I have posted a small patch which makes this 650x faster without appreciable complication. It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use.

firmware is totally broken with respect to -M older FWIW.

Regards, Anthony Liguori
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 07:44 PM, Avi Kivity wrote: It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use.

Even better would be to use virtio-9p. You don't even need an image in this case.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 07:46 PM, Anthony Liguori wrote: It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use. firmware is totally broken with respect to -M older FWIW.

Well, then this is adding to the brokenness. fwcfg dma is going to have exactly one user, libguestfs. Much better to have libguestfs move to some other interface and improve our users-to-interfaces ratio.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 11:50 AM, Avi Kivity wrote: On 08/03/2010 07:46 PM, Anthony Liguori wrote: It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use. firmware is totally broken with respect to -M older FWIW. Well, then this is adding to the brokenness. fwcfg dma is going to have exactly one user, libguestfs. Much better to have libguestfs move to some other interface and improve our users-to-interfaces ratio.

You mean, only one class of users cares about the performance of loading an initrd. However, you've also argued in other threads how important it is not to break libvirt, even if it means we have to do silly things (like change help text). So... why is it that libguestfs has to change itself and yet we should bend over backwards so libvirt doesn't have to change itself?

Regards, Anthony Liguori
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 07:53 PM, Anthony Liguori wrote: On 08/03/2010 11:50 AM, Avi Kivity wrote: On 08/03/2010 07:46 PM, Anthony Liguori wrote: It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use. firmware is totally broken with respect to -M older FWIW. Well, then this is adding to the brokenness. fwcfg dma is going to have exactly one user, libguestfs. Much better to have libguestfs move to some other interface and improve our users-to-interfaces ratio. You mean, only one class of users cares about the performance of loading an initrd. However, you've also argued in other threads how important it is not to break libvirt, even if it means we have to do silly things (like change help text). So... why is it that libguestfs has to change itself and yet we should bend over backwards so libvirt doesn't have to change itself?

libvirt is a major user that is widely deployed, and would be completely broken if we change -help. Changing -help is of no consequence to us. libguestfs is a (pardon me) minor user that is not widely used, and would suffer a performance regression, not total breakage, unless we add a fw-dma interface. Adding the interface is of consequence to us: we have to implement live migration and backwards compatibility, and support this new interface for a long while.

In an ideal world we wouldn't tolerate any regression. The world is not ideal, so we prioritize. The -help change scores very high on benefit/cost; fw-dma, much lower. Note in both cases the long term solution is for the user to move to another interface (cap reporting, virtio), so adding an interface which would only be abandoned later by its only user drops the benefit/cost ratio even further.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 07:56 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote: On 08/03/2010 07:28 PM, Richard W.M. Jones wrote: I have posted a small patch which makes this 650x faster without appreciable complication. It doesn't appear to support live migration, or hiding the feature for -M older. AFAICT live migration should still work (even assuming someone live migrates a domain during early boot, which seems pretty unlikely ...)

Live migration is sometimes performed automatically by management tools, which have no idea (nor do they care) what the guest is doing.

Maybe you mean live migration of the dma_* global variables? I can fix that. Yes.

It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Not a very good straw man ... The patch would take ~300ms instead of ~115ms, versus something like 2 mins 40 seconds with the current method.

It's still 300ms extra time, with a 900MB footprint. btw, a DMA interface which blocks the guest and/or qemu for 115ms is not something we want to introduce to qemu. DMA is hard; doing something simple means it won't work very well.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 11:50 AM, Avi Kivity wrote: On 08/03/2010 07:46 PM, Anthony Liguori wrote: It doesn't appear to support live migration, or hiding the feature for -M older. It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again. Meanwhile the kernel and virtio support demand loading of any image size you'd want to use. firmware is totally broken with respect to -M older FWIW. Well, then this is adding to the brokenness. fwcfg dma is going to have exactly one user, libguestfs. Much better to have libguestfs move to some other interface and improve our users-to-interfaces ratio.

BTW, the brokenness is that regardless of -M older, we always use the newest firmware. Because we always use the newest firmware, fwcfg is not a backwards compatible interface. Migration totally screws this up. While we migrate roms (and correctly now, thanks to Alex's patches), we size the allocation based on the newest firmware size. That means if we ever decreased the size of a rom, we'd see total failure (even if we had a compatible fwcfg interface).

Regards, Anthony Liguori
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote: On 08/03/2010 07:28 PM, Richard W.M. Jones wrote: I have posted a small patch which makes this 650x faster without appreciable complication. It doesn't appear to support live migration, or hiding the feature for -M older.

AFAICT live migration should still work (even assuming someone live migrates a domain during early boot, which seems pretty unlikely ...) Maybe you mean live migration of the dma_* global variables? I can fix that.

It's not a good path to follow. Tomorrow we'll need to load 300MB initrds and we'll have to rework this yet again.

Not a very good straw man ... The patch would take ~300ms instead of ~115ms, versus something like 2 mins 40 seconds with the current method.

Rich.
Re: [PATCH 10/11] uio: do not use PCI resources before pci_enable_device()
On Tue, Aug 03, 2010 at 07:44:23PM +0400, Kulikov Vasiliy wrote:

IRQ and resource[] may not have correct values until after PCI hotplug setup occurs at pci_enable_device() time. The semantic match that finds this problem is as follows:

// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*|pci_resource.*";
@@
(
* x->irq
|
* x->resource
|
* request(x, ...)
)
...
*pci_enable_device(x)
// </smpl>

Signed-off-by: Kulikov Vasiliy sego...@gmail.com

Looks alright to me, thanks! Signed-off-by: Hans J. Koch h...@linutronix.de

---
 drivers/uio/uio_pci_generic.c | 13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index 85c9884..fc22e1e 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
 	struct uio_pci_generic_dev *gdev;
 	int err;
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		return -ENODEV;
-	}
-
 	err = pci_enable_device(pdev);
 	if (err) {
 		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
@@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
 		return err;
 	}
 
+	if (!pdev->irq) {
+		dev_warn(&pdev->dev, "No IRQ assigned to device: "
+			 "no support for interrupts?\n");
+		pci_disable_device(pdev);
+		return -ENODEV;
+	}
+
 	err = verify_pci_2_3(pdev);
 	if (err)
 		goto err_verify;
-- 
1.7.0.4
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 12:01 PM, Avi Kivity wrote: You mean, only one class of users cares about the performance of loading an initrd. However, you've also argued in other threads how important it is not to break libvirt, even if it means we have to do silly things (like change help text). So... why is it that libguestfs has to change itself and yet we should bend over backwards so libvirt doesn't have to change itself? libvirt is a major user that is widely deployed, and would be completely broken if we change -help. Changing -help is of no consequence to us. libguestfs is a (pardon me) minor user that is not widely used, and would suffer a performance regression, not total breakage, unless we add a fw-dma interface. Adding the interface is of consequence to us: we have to implement live migration and backwards compatibility, and support this new interface for a long while.

I certainly buy the argument about making changes of little consequence to us vs. ones that we have to be concerned about long term. However, I don't think we can objectively differentiate between a major and minor user. Generally speaking, I would rather that we not take the position of "you are a minor user, therefore we're not going to accommodate you."

Regards, Anthony Liguori

In an ideal world we wouldn't tolerate any regression. The world is not ideal, so we prioritize. The -help change scores very high on benefit/cost; fw-dma, much lower. Note in both cases the long term solution is for the user to move to another interface (cap reporting, virtio), so adding an interface which would only be abandoned later by its only user drops the benefit/cost ratio even further.
Re: [PATCH RFC 3/4] Paravirtualized spinlock implementation for KVM guests
On 08/02/2010 11:59 PM, Avi Kivity wrote: On 08/02/2010 06:20 PM, Jeremy Fitzhardinge wrote: On 08/02/2010 01:48 AM, Avi Kivity wrote: On 07/26/2010 09:15 AM, Srivatsa Vaddagiri wrote: Paravirtual spinlock implementation for KVM guests, based heavily on the Xen guest's spinlock implementation.

+static struct spinlock_stats
+{
+	u64 taken;
+	u32 taken_slow;
+
+	u64 released;
+
+#define HISTO_BUCKETS	30
+	u32 histo_spin_total[HISTO_BUCKETS+1];
+	u32 histo_spin_spinning[HISTO_BUCKETS+1];
+	u32 histo_spin_blocked[HISTO_BUCKETS+1];
+
+	u64 time_total;
+	u64 time_spinning;
+	u64 time_blocked;
+} spinlock_stats;

Could these be replaced by tracepoints when starting to spin/stopping spinning etc.? Then userspace can reconstruct the histogram as well as see which locks are involved and what call paths.

Unfortunately not; the tracing code uses spinlocks. (TBH I haven't actually tried, but I did give the code an eyeball to this end.)

Hm. The tracing code already uses a specialized lock (arch_spinlock_t); perhaps we can make this lock avoid the tracing?

That's not really a specialized lock; that's just the naked architecture-provided spinlock implementation, without all the lockdep, etc., stuff layered on top. All these changes are at a lower level, so giving tracing its own type of spinlock amounts to making the architectures provide two complete spinlock implementations. We could make tracing use, for example, an rwlock, so long as we promise not to put tracing in the rwlock implementation, but that's hardly elegant.

It's really sad, btw: there are all those nice lockless ring buffers, and then a spinlock for ftrace_vbprintk(), instead of a per-cpu buffer.

Sad indeed.

J
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 08:42 PM, Anthony Liguori wrote: However, I don't think we can objectively differentiate between a major and minor user. Generally speaking, I would rather that we not take the position of you are a minor user therefore we're not going to accommodate you. Again, it's a matter of practicalities. We have written virtio drivers for Windows and Linux, but not for FreeDOS or NetWare. To speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a gross breach of decency; would we go to the same lengths to speed up Haiku? I suggest that we would not. libvirt and Windows XP did not win major user status by making large anonymous donations to qemu developers. They did so by having lots of users. Those users are our end users, and we should be focusing our efforts in a way that maximizes the gain for as large a number of those end users as we can. Not breaking libvirt will be unknowingly appreciated by a large number of users, every day. Not slowing down libguestfs, by a much smaller number for a much shorter time. If it were just a matter of changing the help text I wouldn't mind at all, but introducing an undocumented migration-unsafe broken-dma interface isn't something I'm happy to do. btw, gaining back some of the speed that we lost _is_ something I want to do, since it doesn't break or add any interfaces, and would be a gain not just for libguestfs, but also for Windows installs (which use string pio extensively). Richard, can you test kvm.git master? It already contains one fix and we plan to add more. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 08:58:10PM +0300, Avi Kivity wrote: Richard, can you test kvm.git master? it already contains one fix and we plan to add more. Yup, I will ... Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming blog: http://rwmj.wordpress.com Fedora now supports 80 OCaml packages (the OPEN alternative to F#) http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 12:58 PM, Avi Kivity wrote: On 08/03/2010 08:42 PM, Anthony Liguori wrote: However, I don't think we can objectively differentiate between a major and minor user. Generally speaking, I would rather that we not take the position of you are a minor user therefore we're not going to accommodate you. Again, it's a matter of practicalities. We have written virtio drivers for Windows and Linux, but not for FreeDOS or NetWare. To speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a gross breach of decency; would we go to the same lengths to speed up Haiku? I suggest that we would not. tpr-opt optimizes a legitimate dependence on the x86 architecture that Windows has. While the implementation may be grossly indecent, it certainly fits the overall mission of what we're trying to do in qemu and kvm, which is to emulate an architecture. You've invested a lot of time and effort into it because it's important to you (or more specifically, your employer). That's because Windows is important to you. If someone as adept and committed as you were heavily invested in Haiku, willing to implement something equivalent to tpr-opt, and also willing to do all of the work of maintaining it, then rejecting such a patch would be a mistake. If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. As a side note, we ought to do a better job of removing features that have created a burden on other areas of qemu that aren't actively being maintained. That's a different discussion though. Regards, Anthony Liguori
Re: kvm IPC
Thanks. Will this work if the guest and host are a different combo - for instance ubuntu/debian or fedora/ubuntu? In other words, is there anything generic other than using the sockets? I'm OK to use PCI to communicate too if that can improve performance. Any pointers would be helpful. --Nirmal On Tue, Aug 3, 2010 at 1:05 AM, Amit Shah amit.s...@redhat.com wrote: On (Thu) Jul 29 2010 [16:17:48], Nirmal Guhan wrote: Hi, I run Fedora 12 and the guest is also Fedora 12. I use br0/tap0 for networking and communicate between host and guest using a socket. I do see some references to virtio, pci based ipc and inter-VM shared memory but they are not current. My question is: Is there a better IPC mechanism for host-guest and inter-VM communication, and if so could you provide me with pointers? There's virtio-serial, which is a channel between a guest and the host. You can short-circuit two host-side chardevs to get inter-VM channels as well. See https://fedoraproject.org/wiki/Features/VirtioSerial for more info. This is only available from F13, though. Amit
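To Nirmal's question: virtio-serial is guest-distro-agnostic - any kernel with the virtio-serial driver sees the port as an ordinary character device, regardless of the Ubuntu/Debian/Fedora pairing. A host-side configuration sketch of the kind of setup Amit describes; the socket path, chardev id, and port name below are purely illustrative, not canonical:

```shell
# Host side: back a virtio-serial port with a Unix socket
# (all names here are made up for illustration).
qemu-kvm \
  -device virtio-serial-pci \
  -chardev socket,path=/tmp/guest-ipc.sock,server,nowait,id=ipc0 \
  -device virtserialport,chardev=ipc0,name=org.example.ipc.0 \
  disk.img

# Guest side (F13-era kernels onward): the port appears as
# /dev/virtio-ports/org.example.ipc.0 and is read/written like any
# character device, independent of the guest distribution.
```

For inter-VM channels, as Amit notes, the two VMs' host-side chardevs can be short-circuited (for example, both pointed at the same host socket, one as server and one as client).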
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 09:26 PM, Anthony Liguori wrote: On 08/03/2010 12:58 PM, Avi Kivity wrote: On 08/03/2010 08:42 PM, Anthony Liguori wrote: However, I don't think we can objectively differentiate between a major and minor user. Generally speaking, I would rather that we not take the position of you are a minor user therefore we're not going to accommodate you. Again, it's a matter of practicalities. We have written virtio drivers for Windows and Linux, but not for FreeDOS or NetWare. To speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a gross breach of decency; would we go to the same lengths to speed up Haiku? I suggest that we would not. tpr-opt optimizes a legitimate dependence on the x86 architecture that Windows has. While the implementation may be grossly indecent, it certainly fits the overall mission of what we're trying to do in qemu and kvm, which is to emulate an architecture. You've invested a lot of time and effort into it because it's important to you (or more specifically, your employer). That's because Windows is important to you. Correct. If someone as adept and committed as you were heavily invested in Haiku, willing to implement something equivalent to tpr-opt, and also willing to do all of the work of maintaining it, then rejecting such a patch would be a mistake. libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well.
We have well-defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. A better (though still inaccurate) analogy would be if the developers of a guest OS came up with a virtual bus for devices and were willing to do the work to make this bus perform better. Would we accept this new work or would we point them at our existing bus (pci) instead? Really, the bar on new interfaces (both to guest and host) should be high, much higher than it is now. Interfaces should be well documented, future proof, migration safe, and orthogonal to existing interfaces. While the first three points could be improved with some effort, adding a new dma interface is not going to be orthogonal to virtio. And frankly, libguestfs is better off switching to one of the other interfaces. Slurping huge initrds isn't the right way to do this. As a side note, we ought to do a better job of removing features that have created a burden on other areas of qemu that aren't actively being maintained. That's a different discussion though. Sure, we need something like Linux' Documentation/feature-removal-schedule.txt for people to ignore. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 09:43 PM, Avi Kivity wrote: Really, the bar on new interfaces (both to guest and host) should be high, much higher than it is now. Interfaces should be well documented, future proof, migration safe, and orthogonal to existing interfaces. While the first three points could be improved with some effort, adding a new dma interface is not going to be orthogonal to virtio. And frankly, libguestfs is better off switching to one of the other interfaces. Slurping huge initrds isn't the right way to do this. btw, precedent should play no role here. Just because an older interface wasn't documented or migration safe or unit-tested doesn't mean new ones get off the hook. It does help to have a framework in place that we can point people at; for example, I added a skeleton Documentation/kvm/api.txt and some unit tests and then made contributors fill them in for new features. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 01:43 PM, Avi Kivity wrote: If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. On real hardware, there's an awful lot of interaction between the firmware and the platform. It's a pretty rich interface. On IBM systems, we actually extend that all the way down to userspace via a virtual USB RNDIS driver that you can use IPMI over. A better (though still inaccurate) analogy would be if the developers of a guest OS came up with a virtual bus for devices and were willing to do the work to make this bus perform better. Would we accept this new work or would we point them at our existing bus (pci) instead? Doesn't this precisely describe virtio-s390? Really, the bar on new interfaces (both to guest and host) should be high, much higher than it is now. Interfaces should be well documented, future proof, migration safe, and orthogonal to existing interfaces. Okay, but this is a bigger discussion that I'm very eager to have. But we shouldn't explicitly apply new policies to random patches without clearly stating the policy up front. Regards, Anthony Liguori While the first three points could be improved with some effort, adding a new dma interface is not going to be orthogonal to virtio. And frankly, libguestfs is better off switching to one of the other interfaces. Slurping huge initrds isn't the right way to do this.
As a side note, we ought to do a better job of removing features that have created a burden on other areas of qemu that aren't actively being maintained. That's a different discussion though. Sure, we need something like Linux' Documentation/feature-removal-schedule.txt for people to ignore.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 09:55 PM, Anthony Liguori wrote: On 08/03/2010 01:43 PM, Avi Kivity wrote: If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. On real hardware, there's an awful lot of interaction between the firmware and the platform. It's a pretty rich interface. On IBM systems, we actually extend that all the way down to userspace via a virtual USB RNDIS driver that you can use IPMI over. That is fine, and we'll do pv interfaces when we have to. That's fw_cfg, that's virtio. But let's not do more than we have to. A better (though still inaccurate) analogy would be if the developers of a guest OS came up with a virtual bus for devices and were willing to do the work to make this bus perform better. Would we accept this new work or would we point them at our existing bus (pci) instead? Doesn't this precisely describe virtio-s390? As I understood it, s390 had good reasons not to use their native interfaces. On x86 we have no good reason not to use pci and no good reason not to use virtio for dma. Really, the bar on new interfaces (both to guest and host) should be high, much higher than it is now. Interfaces should be well documented, future proof, migration safe, and orthogonal to existing interfaces. Okay, but this is a bigger discussion that I'm very eager to have. But we shouldn't explicitly apply new policies to random patches without clearly stating the policy up front.
Migration safety has been part of the criteria for a while. Future proofness less so. Documentation was usually completely missing but I see no reason not to insist on it now, better late than never. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. +1. I already proposed that. Nobody objects to a fast communication channel between guest and host. In fact we have one: virtio-serial. Of course it is much easier to hack dma semantics into the fw_cfg interface than to add virtio-serial to seabios, but that doesn't make it right. Does virtio-serial have to be exposed as PCI to a guest, or can we expose it as an ISA device too in case someone wants to use the -kernel option but does not want to see an additional PCI device in the guest? -- Gleb.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 10:05 PM, Gleb Natapov wrote: That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. +1. I already proposed that. Nobody objects to a fast communication channel between guest and host. In fact we have one: virtio-serial. Of course it is much easier to hack dma semantics into the fw_cfg interface than to add virtio-serial to seabios, but that doesn't make it right. Does virtio-serial have to be exposed as PCI to a guest, or can we expose it as an ISA device too in case someone wants to use the -kernel option but does not want to see an additional PCI device in the guest? No need for virtio-serial in firmware. We can have a small initrd slurp a larger filesystem via virtio-serial, or mount a virtio-blk or virtio-p9fs, or boot the whole thing from a virtio-blk image and avoid -kernel -initrd completely. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
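Avi's two alternative boot paths can be sketched as command lines; either one keeps the bulk appliance data off fw_cfg. All file names are illustrative, and the boot=on flag assumes the qemu-kvm extboot support of this era:

```shell
# 1. Keep a tiny -initrd whose only job is to mount the real appliance
#    root from a virtio-blk disk:
qemu-kvm -kernel vmlinuz -initrd tiny-initrd.img \
    -append "root=/dev/vda ro" \
    -drive file=appliance-root.img,if=virtio

# 2. Skip -kernel/-initrd entirely and boot a bootable virtio-blk image,
#    so the kernel and root filesystem come in via demand paging:
qemu-kvm -drive file=appliance.img,if=virtio,boot=on
```

In both variants only a small amount of data crosses the firmware interface; the large filesystem is paged in on demand through virtio-blk rather than slurped up front.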
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 02:05 PM, Gleb Natapov wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. +1. I already proposed that. Nobody objects to a fast communication channel between guest and host. In fact we have one: virtio-serial. Of course it is much easier to hack dma semantics into the fw_cfg interface than to add virtio-serial to seabios, but that doesn't make it right. Does virtio-serial have to be exposed as PCI to a guest, or can we expose it as an ISA device too in case someone wants to use the -kernel option but does not want to see an additional PCI device in the guest? fw_cfg has to be available pretty early on, so relying on a PCI device isn't reasonable. Having dual interfaces seems wasteful. We're already doing bulk data transfer over fw_cfg as we need to do it to transfer roms and potentially a boot splash. Even outside of loading an initrd, the performance is going to start to matter with a large number of devices. Regards, Anthony Liguori -- Gleb.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 08:13:46PM +0100, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... Maybe because you don't have to deal with the consequences of not doing so? -- Gleb.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 02:13 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... History has shown that when we deviate, we usually get it wrong and it becomes very painful to fix. Regards, Anthony Liguori Rich.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 10:13 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... This has two motivations: - documented interfaces: we suck at documentation. We seldom document. Even when we do document something, the documentation is often inaccurate, misleading, and incomplete. While an exact PC unfortunately doesn't exist, it's a lot closer to reality than, say, an exact Linux syscall interface. If we adopt an existing interface, we already have the documentation, and if there's a conflict between the documentation and our implementation, it's clear who wins (well, not always). - preexisting guests: if we design a new interface, we get to update all guests; and there are many of them. Whereas an exact PC will be seen by the guest vendors as well, who will then add whatever support is necessary. Obviously we break this when we have to, but when we don't have to, we shouldn't. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 10:15 PM, Anthony Liguori wrote: fw_cfg has to be available pretty early on so relying on a PCI device isn't reasonable. Having dual interfaces seems wasteful. Agree. We're already doing bulk data transfer over fw_cfg as we need to do it to transfer roms and potentially a boot splash. Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Boot splash should just be another rom as it would be on a real system. Even outside of loading an initrd, the performance is going to start to matter with a large number of devices. I don't really see why. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 02:15:05PM -0500, Anthony Liguori wrote: On 08/03/2010 02:05 PM, Gleb Natapov wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: If Richard is willing to do the work to make -kernel perform faster in such a way that it fits into the overall mission of what we're building, then I see no reason to reject it. The criteria for evaluating a patch should only depend on how it affects other areas of qemu and whether it impacts overall usability. That's true, but extending fw_cfg doesn't fit into the overall picture well. We have well defined interfaces for pushing data into a guest: virtio-serial (dma upload), virtio-blk (adds demand paging), and virtio-p9fs (no image needed). Adapting libguestfs to use one of these is a better move than adding yet another interface. +1. I already proposed that. Nobody objects to a fast communication channel between guest and host. In fact we have one: virtio-serial. Of course it is much easier to hack dma semantics into the fw_cfg interface than to add virtio-serial to seabios, but that doesn't make it right. Does virtio-serial have to be exposed as PCI to a guest, or can we expose it as an ISA device too in case someone wants to use the -kernel option but does not want to see an additional PCI device in the guest? fw_cfg has to be available pretty early on so relying on a PCI device isn't reasonable. Having dual interfaces seems wasteful. fw_cfg wasn't meant to be used for bulk transfers (seabios doesn't even use string pio to access it, which makes load time 50 times slower than what Richard reports). It was meant to be easy to use at very early stages of booting. Kernel/initrd are loaded at a very late stage of booting, at which point PCI is fully initialized. We're already doing bulk data transfer over fw_cfg as we need to do it to transfer roms and potentially a boot splash. Even outside of loading an initrd, the performance is going to start to matter with a large number of devices.
Most ROMs are loaded from PCI ROM BARs, so this leaves us with the boot splash; but the boot splash image should be relatively small, and a user who wants one does not care much about boot time anyway, since the BIOS needs to pause to show the splash. -- Gleb.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 02:24 PM, Avi Kivity wrote: On 08/03/2010 10:15 PM, Anthony Liguori wrote: fw_cfg has to be available pretty early on so relying on a PCI device isn't reasonable. Having dual interfaces seems wasteful. Agree. We're already doing bulk data transfer over fw_cfg as we need to do it to transfer roms and potentially a boot splash. Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Not quite. The BIOS owns the option ROM space. The way it works on bare metal is that the PCI ROM BAR gets mapped to some location in physical memory by the BIOS, the BIOS executes the initialization vector, and after initialization, the ROM will reorganize itself into something smaller. It's nice and clean. But ISA is not nearly as clean. Ultimately, to make this mix work in a reasonable way, we have to provide a side channel interface to SeaBIOS such that we can deliver ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized. It's additionally complicated by the fact that we didn't support PCI ROM BAR until recently so to maintain compatibility with -M older, we have to use a side channel to lay out option roms. Regards, Anthony Liguori Boot splash should just be another rom as it would be on a real system. Even outside of loading an initrd, the performance is going to start to matter with a large number of devices. I don't really see why.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 10:38 PM, Anthony Liguori wrote: Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Not quite. The BIOS owns the option ROM space. The way it works on bare metal is that the PCI ROM BAR gets mapped to some location in physical memory by the BIOS, the BIOS executes the initialization vector, and after initialization, the ROM will reorganize itself into something smaller. It's nice and clean. But ISA is not nearly as clean. So far so good. Ultimately, to make this mix work in a reasonable way, we have to provide a side channel interface to SeaBIOS such that we can deliver ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized. I don't follow. Why do we need this side channel? What would a real ISA machine do? Are there actually enough ISA devices for there to be a problem? It's additionally complicated by the fact that we didn't support PCI ROM BAR until recently so to maintain compatibility with -M older, we have to use a side channel to lay out option roms. Again I don't follow. We can just lay out the ROMs in memory like we did in the past? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 02:41 PM, Avi Kivity wrote: On 08/03/2010 10:38 PM, Anthony Liguori wrote: Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Not quite. The BIOS owns the option ROM space. The way it works on bare metal is that the PCI ROM BAR gets mapped to some location in physical memory by the BIOS, the BIOS executes the initialization vector, and after initialization, the ROM will reorganize itself into something smaller. It's nice and clean. But ISA is not nearly as clean. So far so good. Ultimately, to make this mix work in a reasonable way, we have to provide a side channel interface to SeaBIOS such that we can deliver ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized. I don't follow. Why do we need this side channel? What would a real ISA machine do? It depends on the ISA machine. In the worst case, there's a DIP switch on the card and if you've got a conflict between two cards, you start flipping DIP switches. It's pure awesomeness. No, I don't want to emulate DIP switches :-) Are there actually enough ISA devices for there to be a problem? No, but -M older has the same problem. It's additionally complicated by the fact that we didn't support PCI ROM BAR until recently so to maintain compatibility with -M older, we have to use a side channel to lay out option roms. Again I don't follow. We can just lay out the ROMs in memory like we did in the past? Because only one component can own the option ROM space. Either that's SeaBIOS and we need a side channel or it's QEMU and we can't use PMM. I guess that's the real issue here. Previously we used etherboot which was well under 32k. We only loaded roms we needed. Now we use gPXE which is much bigger and if you don't use PMM, then you run out of option rom space very quickly. Previously, we loaded option ROMs on demand when a user used -boot n but that was a giant hack and wasn't like bare metal at all. 
It involved x86-isms in vl.c. Now we always load ROMs so PMM is very important. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote: On 08/03/2010 10:13 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... This has two motivations: - documented interfaces: we suck at documentation. We seldom document. Even when we do document something, the documentation is often inaccurate, misleading, and incomplete. While an exact PC unfortunately doesn't exist, it's a lot closer to reality than, say, an exact Linux syscall interface. If we adopt an existing interface, we already have the documentation, and if there's a conflict between the documentation and our implementation, it's clear who wins (well, not always). - preexisting guests: if we design a new interface, we get to update all guests; and there are many of them. Whereas an exact PC will be seen by the guest vendors as well who will then add whatever support is necessary. On the other hand we end up with stuff like only being able to add 29 virtio-blk devices to a single guest. As best as I can tell, this comes from PCI, and this limit required a bunch of hacks when implementing virt-df. These are reasonable motivations, but I think they are partially about us: We could document things better and make things future-proof. I'm surprised by how lacking the doc requirements are for qemu (compared to, hmm, libguestfs for example). We could demand that OSes write device drivers for more qemu devices -- already OS vendors write thousands of device drivers for all sorts of obscure devices, so this isn't really much of a demand for them. In fact, they're already doing it. Rich. 
-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://et.redhat.com/~rjones/virt-df/
Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4)
On Tue, Aug 03, 2010 at 12:37:18AM +0400, malc wrote: There are whitespace issues in this patch. Thanks for looking at the patch. Here is an updated patch that should fix the whitespace issues: This is a block driver for the distributed file system Ceph (http://ceph.newdream.net/). The driver uses librados (which is part of the Ceph server) for direct access to the Ceph object store and runs entirely in userspace. It now has (read-only) snapshot support and passes all relevant qemu-iotests. To compile the driver you need at least ceph 0.21. Additional information is available on the Ceph wiki: http://ceph.newdream.net/wiki/Kvm-rbd The patch is based on git://repo.or.cz/qemu/kevin.git block

Signed-off-by: Christian Brunner c...@muc.de
---
 Makefile.objs     |    1 +
 block/rbd.c       |  907 +
 block/rbd_types.h |   71 +
 configure         |   31 ++
 4 files changed, 1010 insertions(+), 0 deletions(-)
 create mode 100644 block/rbd.c
 create mode 100644 block/rbd_types.h

diff --git a/Makefile.objs b/Makefile.objs
index 4a1eaa1..bf45142 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -18,6 +18,7 @@ block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
+block-nested-$(CONFIG_RBD) += rbd.o
 
 block-obj-y += $(addprefix block/, $(block-nested-y))
diff --git a/block/rbd.c b/block/rbd.c
new file mode 100644
index 000..0e6b2a5
--- /dev/null
+++ b/block/rbd.c
@@ -0,0 +1,907 @@
+/*
+ * QEMU Block driver for RADOS (Ceph)
+ *
+ * Copyright (C) 2010 Christian Brunner c...@muc.de
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include <sys/types.h>
+#include <stdbool.h>
+
+#include "rbd_types.h"
+#include "module.h"
+#include "block_int.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <rados/librados.h>
+
+#include <signal.h>
+
+
+int eventfd(unsigned int initval, int flags);
+
+
+/*
+ * When specifying the image filename use:
+ *
+ * rbd:poolname/devicename
+ *
+ * poolname must be the name of an existing rados pool
+ *
+ * devicename is the basename for all objects used to
+ * emulate the raw device.
+ *
+ * Metadata information (image size, ...) is stored in an
+ * object with the name devicename.rbd.
+ *
+ * The raw device is split into 4MB sized objects by default.
+ * The sequence number is encoded in a 12 byte long hex-string,
+ * and is attached to the devicename, separated by a dot.
+ * e.g. devicename.1234567890ab
+ *
+ */
+
+#define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
+
+typedef struct RBDAIOCB {
+    BlockDriverAIOCB common;
+    QEMUBH *bh;
+    int ret;
+    QEMUIOVector *qiov;
+    char *bounce;
+    int write;
+    int64_t sector_num;
+    int aiocnt;
+    int error;
+    struct BDRVRBDState *s;
+} RBDAIOCB;
+
+typedef struct RADOSCB {
+    int rcbid;
+    RBDAIOCB *acb;
+    int done;
+    int64_t segsize;
+    char *buf;
+} RADOSCB;
+
+typedef struct BDRVRBDState {
+    int efd;
+    rados_pool_t pool;
+    rados_pool_t header_pool;
+    char name[RBD_MAX_OBJ_NAME_SIZE];
+    char block_name[RBD_MAX_BLOCK_NAME_SIZE];
+    uint64_t size;
+    uint64_t objsize;
+    int qemu_aio_count;
+    int read_only;
+} BDRVRBDState;
+
+typedef struct rbd_obj_header_ondisk RbdHeader1;
+
+static int rbd_parsename(const char *filename, char *pool, char **snap,
+                         char *name)
+{
+    const char *rbdname;
+    char *p;
+    int l;
+
+    if (!strstart(filename, "rbd:", &rbdname)) {
+        return -EINVAL;
+    }
+
+    pstrcpy(pool, 2 * RBD_MAX_SEG_NAME_SIZE, rbdname);
+    p = strchr(pool, '/');
+    if (p == NULL) {
+        return -EINVAL;
+    }
+
+    *p = '\0';
+
+    l = strlen(pool);
+    if (l >= RBD_MAX_SEG_NAME_SIZE) {
+        error_report("pool name too long");
+        return -EINVAL;
+    } else if (l <= 0) {
+        error_report("pool name too short");
+        return -EINVAL;
+    }
+
+    l = strlen(++p);
+    if (l >= RBD_MAX_OBJ_NAME_SIZE) {
+        error_report("object name too long");
+        return -EINVAL;
+    } else if (l <= 0) {
+        error_report("object name too short");
+        return -EINVAL;
+    }
+
+    strcpy(name, p);
+
+    *snap = strchr(name, '@');
+    if (*snap) {
+        *(*snap) = '\0';
+        (*snap)++;
+        if (!*snap) {
+            *snap = NULL;
+        }
+    }
+
+    return l;
+}
+
+static int create_tmap_op(uint8_t op, const char *name, char **tmap_desc)
+{
+    uint32_t len = strlen(name);
+    /* total_len = encoding op + name + empty buffer */
+    uint32_t total_len = 1 + (sizeof(uint32_t) + len) + sizeof(uint32_t);
+    char *desc = NULL;
+
+    desc = qemu_malloc(total_len);
+
+    *tmap_desc = desc;
+
+    *desc = op;
+    desc++;
+    memcpy(desc, &len, sizeof(len));
+    desc += sizeof(len);
Re: 2.6.35 hangs on early boot in KVM
Tvrtko Ursulin tvrtko.ursu...@sophos.com writes: On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. I am in the middle of a bisect run with five builds left to go, currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65 Bisect is looking good, narrowed it to ten revisions, but I am not sure to make it to the end today: bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea good 41d59102e146a4423a490b8eca68a5860af4fe1c Bisect points the finger to x86, ioapic: In mpparse use mp_register_ioapic (cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea whether this commit is solely to blame or it is a combined interaction with KVM, but I am sure you guys will know. If you want me to test something else please shout. Interesting. This is the second report I have heard of no VGA output and a hang early in boot, that was bisected to this commit. Since I could not reproduce it I was hoping it was a fluke with a single piece of hardware, but it appears not. There was in fact an off by one bug in that commit, but if that had been the issue 2.6.35 would have booted ok. There was nothing in that commit that should have prevented early output, and in fact I can boot with a very similar configuration. So I am trying to figure out what pieces are interacting to cause this failure mode to happen. What version of kvm are you running on your host (in case that matters)? 
I want to reproduce this myself so I can start guessing what weird interactions are going on. Eric
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 03:00 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote: On 08/03/2010 10:13 PM, Richard W.M. Jones wrote: On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote: libguestfs does not depend on an x86 architectural feature. qemu-system-x86_64 emulates a PC, and PCs don't have -kernel. We should discourage people from depending on this interface for production use. I really don't get this whole thing where we must slavishly emulate an exact PC ... This has two motivations: - documented interfaces: we suck at documentation. We seldom document. Even when we do document something, the documentation is often inaccurate, misleading, and incomplete. While an exact PC unfortunately doesn't exist, it's a lot closer to reality than, say, an exact Linux syscall interface. If we adopt an existing interface, we already have the documentation, and if there's a conflict between the documentation and our implementation, it's clear who wins (well, not always). - preexisting guests: if we design a new interface, we get to update all guests; and there are many of them. Whereas an exact PC will be seen by the guest vendors as well who will then add whatever support is necessary. On the other hand we end up with stuff like only being able to add 29 virtio-blk devices to a single guest. As best as I can tell, this comes from PCI No, this comes from us being too clever for our own good and not following the way hardware does it. All modern systems keep disks on their own dedicated bus. In virtio-blk, we have a 1-1 relationship between disks and PCI devices. That's a perfect example of what happens when we try to improve things. , and this limit required a bunch of hacks when implementing virt-df. These are reasonable motivations, but I think they are partially about us: We could document things better and make things future-proof. 
I'm surprised by how lacking the doc requirements are for qemu (compared to, hmm, libguestfs for example). We enjoy complaining about our lack of documentation more than we like actually writing documentation. We could demand that OSes write device drivers for more qemu devices -- already OS vendors write thousands of device drivers for all sorts of obscure devices, so this isn't really much of a demand for them. In fact, they're already doing it. So far, MS hasn't quite gotten the clue yet that they should write device drivers for qemu :-) In fact, no one has. Regards, Anthony Liguori Rich.
Re: 2.6.35 hangs on early boot in KVM
On Tue, Aug 3, 2010 at 8:59 AM, Tvrtko Ursulin tvrtko.ursu...@sophos.com wrote: On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote: On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote: On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote: I have basically built 2.6.35 with make oldconfig from a working 2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see nothing after grub (have early printk and verbose bootup enabled), just a blinking VGA cursor and CPU at 100%. Please copy kvm@vger.kernel.org on kvm issues. CONFIG_PRINTK_TIME=y Try disabling this as a workaround. I am in the middle of a bisect run with five builds left to go, currently I have: bad 537b60d17894b7c19a6060feae40299d7109d6e7 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65 Bisect is looking good, narrowed it to ten revisions, but I am not sure to make it to the end today: bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea good 41d59102e146a4423a490b8eca68a5860af4fe1c Bisect points the finger to x86, ioapic: In mpparse use mp_register_ioapic (cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea whether this commit is solely to blame or it is a combined interaction with KVM, but I am sure you guys will know. If you want me to test something else please shout. please try attached patch, to see if it help. 
Yinghai

[PATCH] x86: check if apic/pin is shared with legacy one

Fix systems where an external device has an IO-APIC on apic0/pin(0-15), and also systems whose IO-APICs are out of order:

<6>ACPI: IOAPIC (id[0x10] address[0xfecff000] gsi_base[0])
<6>IOAPIC[0]: apic_id 16, version 0, address 0xfecff000, GSI 0-2
<6>ACPI: IOAPIC (id[0x0f] address[0xfec0] gsi_base[3])
<6>IOAPIC[1]: apic_id 15, version 0, address 0xfec0, GSI 3-38
<6>ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[39])
<6>IOAPIC[2]: apic_id 14, version 0, address 0xfec01000, GSI 39-74
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 1 global_irq 4 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 5 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 6 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 4 global_irq 7 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 6 global_irq 9 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 7 global_irq 10 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 11 low edge)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 12 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 12 global_irq 15 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 13 global_irq 16 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 17 low edge)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 18 dfl dfl)

After this patch we will get:

apic0, pin0,  GSI 0:  irq 0+75
apic0, pin1,  GSI 1:  irq 1+75
apic0, pin2,  GSI 2:  irq 2
apic1, pin0,  GSI 3:  irq 3+75
apic1, pin5,  GSI 8:  irq 8+75
apic1, pin10, GSI 13: irq 13+75
apic1, pin11, GSI 14: irq 14+75

because mp_config_acpi_legacy_irqs will put apic0, pin2, irq2 in mp_irqs... so pin_2_irq_legacy will report 2. irq_to_gsi will still report 2, so it is right. gsi_to_irq will report 2.
For 0, 1, 3, 8, 13, 14: still right.

Signed-off-by: Yinghai Lu ying...@kernel.org
---
 arch/x86/kernel/apic/io_apic.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
 	return MPBIOS_trigger(idx);
 }
 
+static int pin_2_irq_legacy(int apic, int pin)
+{
+	int i;
+
+	for (i = 0; i < mp_irq_entries; i++) {
+		int bus = mp_irqs[i].srcbus;
+
+		if (!test_bit(bus, mp_bus_not_pci))
+			continue;
+
+		if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
+			continue;
+
+		if (mp_irqs[i].dstirq != pin)
+			continue;
+
+		return mp_irqs[i].srcbusirq;
+	}
+
+	return -1;
+}
+
 static int pin_2_irq(int idx, int apic, int pin)
 {
 	int irq;
@@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
 	} else {
 		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
 
-		if (gsi >= NR_IRQS_LEGACY)
+		if (gsi >= NR_IRQS_LEGACY) {
 			irq = gsi;
-		else
-			irq = gsi_top + gsi;
+		} else {
+			irq = pin_2_irq_legacy(apic, pin);
+			if (irq < 0)
+				irq = gsi_top + gsi;
+		}
 	}
 
 #ifdef CONFIG_X86_32
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 10:49 PM, Anthony Liguori wrote: On the other hand we end up with stuff like only being able to add 29 virtio-blk devices to a single guest. As best as I can tell, this comes from PCI No, this comes from us being too clever for our own good and not following the way hardware does it. All modern systems keep disks on their own dedicated bus. In virtio-blk, we have a 1-1 relationship between disks and PCI devices. That's a perfect example of what happens when we try to improve things. Comparing (from personal experience) the complexity of the Windows drivers for Xen and virtio shows that it's not a bad idea at all. Paolo
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
Hi, We're already doing bulk data transfer over fw_cfg as we need to do it to transfer roms and potentially a boot splash. Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Indeed. We do that in most cases. The exceptions are: (1) -M somethingold. PCI devices don't have a pci rom bar then by default because they didn't have one in older qemu versions, so we need some other way to pass the option rom to seabios. (2) vgabios.bin. vgabios needs patches to make loading via pci rom bar work (vgabios-cirrus.bin works fine already). I have patches in the queue to do that. (3) roms not associated with a PCI device: multiboot, extboot, the -option-rom command line switch, vgabios for -M isapc. The default configuration (qemu $diskimage) loads two roms: vgabios-cirrus.bin and e1000.bin. Both are loaded via pci rom bar and not via fw_cfg. cheers, Gerd
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
Hi, Again I don't follow. We can just lay out the ROMs in memory like we did in the past? Well. We have some size issues then. PCI ROMS are loaded by the BIOS in a way that only a small fraction is actually resident in the small 0xd - 0xe area. That doesn't work if qemu tries to simply copy the whole thing there like old versions did. With the size of the gPXE roms this matters in real life. cheers, Gerd
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On 08/03/2010 04:13 PM, Paolo Bonzini wrote: On 08/03/2010 10:49 PM, Anthony Liguori wrote: On the other hand we end up with stuff like only being able to add 29 virtio-blk devices to a single guest. As best as I can tell, this comes from PCI No, this comes from us being too clever for our own good and not following the way hardware does it. All modern systems keep disks on their own dedicated bus. In virtio-blk, we have a 1-1 relationship between disks and PCI devices. That's a perfect example of what happens when we try to improve things. Comparing (from personal experience) the complexity of the Windows drivers for Xen and virtio shows that it's not a bad idea at all. Not quite sure what you're suggesting, but I could have been clearer. Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a PCI device, we probably should have just done virtio-scsi. Since most OSes have a SCSI-centric block layer, it would have resulted in much simpler drivers and we could support more than 1 disk per PCI slot. I had thought Christoph was working on such a device at some point in time... Regards, Anthony Liguori Paolo
performance with libvirt and kvm
Hi, I am seeing a performance degradation while using libvirt to start my vm (kvm). The vm is Fedora 12 and the host is also Fedora 12, both with 2.6.32.10-90.fc12.i686. Here are the statistics from iperf:

From VM:   [ 3] 0.0-30.0 sec  199 MBytes  55.7 Mbits/sec
From host: [ 3] 0.0-30.0 sec  331 MBytes  92.6 Mbits/sec

libvirt command as seen from ps output: /usr/bin/qemu-kvm -S -M pc-0.11 -enable-kvm -m 512 -smp 1 -name f12kvm1 -uuid 9300bfe2-2b9c-d9f0-3b03-9c7fe9934393 -monitor unix:/var/lib/libvirt/qemu/f12kvm1.monitor,server,nowait -boot c -drive file=/var/lib/libvirt/f12.img,if=ide,bus=0,unit=0,boot=on,format=raw -drive if=ide,media=cdrom,bus=1,unit=0,format=raw -net nic,macaddr=52:54:00:51:7c:39,vlan=0,model=virtio,name=net0 -net tap,fd=21,vlan=0,name=hostnet0 -serial pty -parallel none -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -balloon virtio

If I start a similar vm using qemu-kvm directly, the performance matches the host:

qemu-kvm:  [ 3] 0.0-30.0 sec  329 MBytes  91.9 Mbits/sec

TCP window size is 64K for all the cases. Command used: qemu-kvm vdisk.img -m 512 -net nic,model=virtio,macaddr=$macaddress -net tap,script=/etc/qemu-ifup Any clues? Thanks, Nirmal
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote: Why do we need to transfer roms? These are devices on the memory bus or pci bus, it just needs to be there at the right address. Boot splash should just be another rom as it would be on a real system. Just like the initrd? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://et.redhat.com/~rjones/libguestfs/ See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html
Re: Alt SeaBIOS SSDT cpu hotplug
On Tue, Aug 03, 2010 at 05:00:49PM +0800, Liu, Jinsong wrote: I just tested your new patch with Windows 2008 DataCenter on my platform, and it works OK! We can hot-add new cpus and they appear in Device Manager. (BTW, yesterday I tested your new patch with a linux 2.6.32 hvm guest, and it works fine; we can add-remove-add-remove... cpus) Sorry for making you spend more time. It's our fault. Thanks. I'll go ahead and commit it then. I have one incremental patch (see below) which I will also commit. -Kevin

--- ssdt-proc.dsl	2010-08-03 18:45:12.0 -0400
+++ src/ssdt-proc.dsl	2010-08-03 18:45:17.0 -0400
@@ -44,7 +44,7 @@
         Return(CPST(ID))
     }
     Method (_EJ0, 1, NotSerialized) {
-        Return(CPEJ(ID, Arg0))
+        CPEJ(ID, Arg0)
     }
 }
Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
Richard W.M. Jones wrote: We could demand that OSes write device drivers for more qemu devices -- already OS vendors write thousands of device drivers for all sorts of obscure devices, so this isn't really much of a demand for them. In fact, they're already doing it. Result: Most OSes not working with qemu? Actually we seem to be going that way. Recent qemus don't work with older versions of Windows any more, so we have to use different versions of qemu for different guests. -- Jamie
buildbot failure in qemu-kvm on disable_kvm_x86_64_debian_5_0
The Buildbot has detected a new failure of disable_kvm_x86_64_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_debian_5_0/builds/497 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this build Build Source Stamp: [branch master] HEAD Blamelist: BUILD FAILED: failed compile sincerely, -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_i386_debian_5_0
The Buildbot has detected a new failure of disable_kvm_i386_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_debian_5_0/builds/498 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this build Build Source Stamp: [branch master] HEAD Blamelist: BUILD FAILED: failed compile sincerely, -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_x86_64_out_of_tree
The Buildbot has detected a new failure of disable_kvm_x86_64_out_of_tree on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_out_of_tree/builds/446 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this build Build Source Stamp: [branch master] HEAD Blamelist: BUILD FAILED: failed compile sincerely, -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_i386_out_of_tree
The Buildbot has detected a new failure of disable_kvm_i386_out_of_tree on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_out_of_tree/builds/446 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this build Build Source Stamp: [branch master] HEAD Blamelist: BUILD FAILED: failed compile sincerely, -The Buildbot
RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
Arnd Bergmann wrote: On Friday 30 July 2010 17:51:52 Shirley Ma wrote: On Fri, 2010-07-30 at 16:53 +0800, Xin, Xiaohui wrote: Since vhost-net already supports macvtap/tun backends, do you think whether it's better to implement zero copy in macvtap/tun than inducing a new media passthrough device here? I'm not sure if there will be more duplicated code in the kernel. I think it should be less duplicated code in the kernel if we use macvtap to support what media passthrough driver here. Since macvtap has support virtio_net head and offloading already, the only missing func is zero copy. Also QEMU supports macvtap, we just need add a zero copy flag in option. Yes, I fully agree and that was one of the intended directions for macvtap to start with. Thank you so much for following up on that, I've long been planning to work on macvtap zero-copy myself but it's now lower on my priorities, so it's good to hear that you made progress on it, even if there are still performance issues. But zero-copy is a Linux generic feature that can be used by other VMMs as well if the BE service drivers want to incorporate. If we can make mp device VMM-agnostic (it may be not yet in current patch), that will help Linux more. Thx, Eddie
[PATCH 1/2] x86: Allow accessing IDT via emulator ops
The patch adds a new member get_idt() to x86_emulate_ops. It also adds a function to get the IDT in order to be used by the emulator. This is needed for real mode interrupt injection and the emulation of int instructions.

Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/include/asm/kvm_emulate.h |    1 +
 arch/x86/kvm/x86.c                 |    6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index cbebf1d..f22e5da 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -139,6 +139,7 @@ struct x86_emulate_ops {
 	void (*set_segment_selector)(u16 sel, int seg, struct kvm_vcpu *vcpu);
 	unsigned long (*get_cached_segment_base)(int seg, struct kvm_vcpu *vcpu);
 	void (*get_gdt)(struct desc_ptr *dt, struct kvm_vcpu *vcpu);
+	void (*get_idt)(struct desc_ptr *dt, struct kvm_vcpu *vcpu);
 	ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
 	int (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
 	int (*cpl)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e3b50..416aa0e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3790,6 +3790,11 @@ static void emulator_get_gdt(struct desc_ptr *dt, struct kvm_vcpu *vcpu)
 	kvm_x86_ops->get_gdt(vcpu, dt);
 }
 
+static void emulator_get_idt(struct desc_ptr *dt, struct kvm_vcpu *vcpu)
+{
+	kvm_x86_ops->get_idt(vcpu, dt);
+}
+
 static unsigned long emulator_get_cached_segment_base(int seg,
 						      struct kvm_vcpu *vcpu)
 {
@@ -3883,6 +3888,7 @@ static struct x86_emulate_ops emulate_ops = {
 	.set_segment_selector = emulator_set_segment_selector,
 	.get_cached_segment_base = emulator_get_cached_segment_base,
 	.get_gdt = emulator_get_gdt,
+	.get_idt = emulator_get_idt,
 	.get_cr = emulator_get_cr,
 	.set_cr = emulator_set_cr,
 	.cpl = emulator_get_cpl,
-- 
1.7.0.4
[PATCH v6 1/3] export __get_user_pages_fast() function
This function is used by KVM to pin a process's pages in atomic context. Define a 'weak' fallback so that architectures which do not support it still build (the fallback simply pins no pages).

Acked-by: Nick Piggin npig...@suse.de
Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
---
 mm/util.c | 13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index f5712e8..4f0d32b 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -250,6 +250,19 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 }
 #endif
 
+/*
+ * Like get_user_pages_fast() except it is IRQ-safe in that it won't fall
+ * back to the regular GUP.
+ * If the architecture does not support this function, simply return with no
+ * page pinned.
+ */
+int __attribute__((weak)) __get_user_pages_fast(unsigned long start,
+				int nr_pages, int write, struct page **pages)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(__get_user_pages_fast);
+
 /**
  * get_user_pages_fast() - pin user pages in memory
  * @start:	starting user address
-- 
1.6.1.2