Re: SeaBIOS cdrom regression with Vista
On 11/21/2009 12:36 AM, Kevin O'Connor wrote: It looks like I spoke too soon. It appears the SeaBIOS init can leave the ATA controller in an interrupts disabled state. This appears to confuse Vista. So, this is a SeaBIOS bug - I'll implement a fix. I've committed a fix to SeaBIOS - commit 42bc3940. Many thanks. Anthony, can you expedite this fix through qemu.git? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vhost: Fix warnings and bad type handling
From: Alan Cox a...@linux.intel.com Subject: [PATCH] vhost: fix warnings on 32 bit systems Fix compiler warning about discarding the top 32 bits of data on 32 bit systems, and document that discarded bits must be 0. Signed-off-by: Alan Cox a...@linux.intel.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- So I think the below slightly tweaked version of Alan's patch is a bit better. OK?

 drivers/vhost/vhost.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 97233d5..e7b4dea 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -322,6 +322,8 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
 			r = -EOPNOTSUPP;
 			break;
 		}
+		/* For 32bit, verify that the top 32bits of the user
+		   data are set to zero. */
 		if ((u64)(unsigned long)a.desc_user_addr != a.desc_user_addr ||
 		    (u64)(unsigned long)a.used_user_addr != a.used_user_addr ||
 		    (u64)(unsigned long)a.avail_user_addr != a.avail_user_addr) {
@@ -334,7 +336,8 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
 			r = -EINVAL;
 			break;
 		}
-		r = init_used(vq, (struct vring_used __user *)a.used_user_addr);
+		r = init_used(vq, (struct vring_used __user *)(unsigned long)
+			      a.used_user_addr);
 		if (r)
 			break;
 		vq->log_used = !!(a.flags & (0x1 << VHOST_VRING_F_LOG));
--
1.6.5.2.143.g8cc62
Re: [PATCHv9 3/3] vhost_net: a kernel-level virtio server
On Wed, Nov 18, 2009 at 01:42:43PM +0800, Xin, Xiaohui wrote: Michael, from http://www.linux-kvm.org/page/VhostNet we can see that netperf with TCP_STREAM can get more than 4Gb/s on the receive side, and more than 5Gb/s on the send side. Is that the result from the raw socket or through tap? I want to duplicate such performance with vhost on my side. I can only get a bit more than 1Gb/s under the following conditions: 1) the GRO feature disabled in the host 10G NIC driver, 2) vi->big_packets in the guest is false, 3) MTU is 1500, 4) raw socket, not tap, 5) using your vhost git tree. Is that a reasonable result under such conditions, or have I made some silly mistake somewhere I don't know about yet? Could you kindly describe in detail the test environment/conditions behind the much better performance on your website (I really need the performance)? Thanks, Xiaohui

Those results were sent by Shirley Ma (Cc'd). I think they were with tap, host-to-guest/guest-to-host.

"I have tested the tun support with vhost now; could you share your /home/mst/ifup script here?" These are usually pretty simple, mine looks like this:

#!/bin/sh -x
/sbin/ifconfig tap0 0.0.0.0 up
brctl addif br0 tap0

-- MST
Breakage due to commit c1699988 (v3: don't call reset functions on cpu initialization)
A qemu-kvm which merges this commit breaks badly (see the qemu-kvm.git 'next' branch). In the commit log for this commit, you write: "I tested it with qemu (with and without io-thread) and qemu-kvm, and it seems to be doing okay - although qemu-kvm uses a slightly different patch." Can you share the slightly different patch (against 'next') please? -- error compiling committee.c: too many arguments to function
Re: Can't get guests to recognize NUMA architecture as alluded to in Redhat marketing material
Is your question about kvm, libvirt or some Red Hat product? You're posting to the kvm list, but it sounds like a libvirt question. Because in KVM the virtual machine is a regular Linux process, you can leverage NUMA the same way you would for any other process, e.g. set up NUMA on your host, then launch kvm under numactl: numactl -m 0 --physcpubind=0,8 qemu-kvm ... That doesn't mean you're creating some NUMA structure inside the VM; it just means that the VM's large amount of memory is backed by a NUMA node, so you get improved memory performance. --- - Steve Brown stevebr...@teamholistic.com wrote: From: Steve Brown stevebr...@teamholistic.com To: kvm@vger.kernel.org Sent: Saturday, November 21, 2009 11:20:22 AM GMT -05:00 US/Canada Eastern Subject: Can't get guests to recognize NUMA architecture as alluded to in Redhat marketing material So, based on the following lines from the Redhat PDF on KVM: "support for large memory systems with NUMA and integrated memory controllers" and "NUMA support allows virtual machines to efficiently access large amounts of memory", I decided to try out KVM as an alternative to the Xen setup we have been using, where guests are pinned to nodes and limited (by choice) to only the RAM available at said node. This is a two socket, eight core, 72GB system. So I installed CentOS 5.4 and proceeded to use virt-install to create a guest, simply a CentOS 5.4 guest. I allocated it 40GB or so of RAM to be sure memory allocation would cross node boundaries. I tried using: vcpus=8; cpuset=auto; cpuset=1,2 vcpus=8 (that one caused all sorts of problems and CPU lockups); cpuset=1,2 vcpus=2; and cpuset=1,2. No matter what, I still see only one NUMA node in the guest from numastat. So what's going on here? Is the PDF misleading? Does a guest not need to know about NUMA, with all scheduling/NUMAness handled by KVM? Am I missing some magical configuration line in the XML so the guest understands its NUMAness?
When allocating memory to the guest, does the virsh wrapper make all the right backend calls to allocate exactly 50% of the requested memory from each physical socket's half of total system memory, in this case 20GB from one socket and 20GB from the other? Any useful comments appreciated. Thanks!
Re: [Autotest] [KVM-AUTOTEST] KSM-overcommit test v.2 (python version)
On 11/17/2009 04:49 PM, Jiri Zupka wrote: Hi, we found a little mistake in the ending of allocator.py, which is why I'm sending this patch today; I resend the whole repaired patch again. It sure is a big improvement over the previous one. There is still a lot of refactoring to be done to make it more readable. Comments embedded. - Original Message - From: Jiri Zupka jzu...@redhat.com To: autotest autot...@test.kernel.org, kvm kvm@vger.kernel.org Cc: u...@redhat.com Sent: Tuesday, November 17, 2009 12:52:28 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: [Autotest] [KVM-AUTOTEST] KSM-overcommit test v.2 (python version) Hi, based on your requirements we have created a new version of the KSM-overcommit patch (submitted in September). Description: it tests KSM (kernel shared memory) with overcommit of memory. Changelog: 1) Based only on python (removed the C code) 2) Added a new test (check last 96B) 3) Separated the test into variants (serial, parallel, both) 4) Improved logging and documentation 5) Added a perf constant to change the time limit for waiting (slow computer problem). Functionality: The KSM test starts guests, connects to them over ssh, and copies allocator.py to the guests and runs it. The host can run any python command through the allocator.py loop on the client side. run_ksm_overcommit starts; host and guest reserve variables are defined (host_reserver, guest_reserver); the number of virtual machines and their memory are calculated from the host_mem and overcommit variables; KSM status is checked; the virtual guests are created and started.
Test:
a] serial
 1) initialize, merge all mem to a single page
 2) separate the first guest's mem
 3) separate the rest of the guests' mem until all mem is filled
 4) kill all guests except the last
 5) check that the mem of the last guest is ok
 6) kill the guest
b] parallel
 1) initialize, merge all mem to a single page
 2) separate the mem of the guest
 3) verification of the guest's mem
 4) merge the mem back into one block
 5) verification of the guests' mem
 6) separate the mem of the guests by 96B
 7) check that the mem is all right
 8) kill the guest

allocator.py (client side script): after starting, it waits for commands, which it executes on the client side. The mem_fill class implements the commands to fill and check memory and return errors to the host. We need a client side script because we need to generate many GB of special data.

Future plans: we want to add information about the time spent on each task to the log, and use that information to automatically compute the perf constant. And add new tests.

___
Autotest mailing list
autot...@test.kernel.org
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest

ksm_overcommit.patch

diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index ac9ef66..90f62bb 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -118,6 +118,23 @@ variants:
         test_name = npb
         test_control_file = npb.control
+    - ksm_overcommit:
+        # Don't preprocess any vms as we need to change their params
+        vms = ''
+        image_snapshot = yes
+        kill_vm_gracefully = no
+        type = ksm_overcommit
+        ksm_swap = yes    # yes | no
+        no hugepages
+        # Overcommit of host memory
+        ksm_overcommit_ratio = 3
+        # Max parallel machine runs
+        ksm_paralel_ratio = 4
+        variants:
+            - serial:
+                ksm_test_size = serial
+            - paralel:
+                ksm_test_size = paralel
     - linux_s3: install setup unattended_install
         type = linux_s3
diff --git a/client/tests/kvm/tests/ksm_overcommit.py b/client/tests/kvm/tests/ksm_overcommit.py
new file mode 100644
index 000..408e711
--- /dev/null
+++ b/client/tests/kvm/tests/ksm_overcommit.py
@@ -0,0 +1,605 @@
+import logging, time
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+import kvm_preprocessing
+import random, string, math, os
+
+
+def run_ksm_overcommit(test, params, env):
+    """
+    Test how KSM (Kernel Shared Memory) acts when more than physical memory is
+    used. In the second part it is also tested how KVM handles the situation
+    when the host runs out of memory (the expected behaviour is to pause the
+    guest system, wait until some process returns memory and bring the guest
+    back to life).
+
+    @param test: kvm test object.
+    @param params: Dictionary with test parameters.
+    @param env: Dictionary with the test environment.
+    """
+
+    def parse_meminfo(rowName):
+        """
+        Get data from /proc/meminfo.
+
+        @param rowName: Name of line in meminfo.
+        """
+        for line in open('/proc/meminfo').readlines():
+            if line.startswith(rowName + ":"):
+                name, amt, unit = line.split()
+                return name, amt, unit
+
+    def parse_meminfo_value(rowName):
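The parse_meminfo helper in the patch scans /proc/meminfo for a named row. A self-contained sketch of the same idea, taking the file contents as a string so it runs anywhere (the string-input variant is my own; only the parsing logic mirrors the patch):

```python
def parse_meminfo(text, row_name):
    """Return (name, amount, unit) for the first row matching row_name,
    or None if the row is absent. Rows look like 'MemFree: 12345678 kB'."""
    for line in text.splitlines():
        if line.startswith(row_name + ":"):
            name, amt, unit = line.split()
            return name, amt, unit
    return None

sample = "MemTotal:       74030080 kB\nMemFree:        12345678 kB\n"
print(parse_meminfo(sample, "MemFree"))  # -> ('MemFree:', '12345678', 'kB')
```

Note that a few /proc/meminfo rows (e.g. HugePages_Total) have no unit column, so the three-way unpack would raise ValueError for them; the patch only queries kB-denominated rows.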
Re: kvm hangs w/o nolapic
On Fri, Nov 20, 2009 at 10:25:41PM +0100, Johannes Berg wrote: On Fri, 2009-11-20 at 21:18 +0300, Cyrill Gorcunov wrote: I've just booted the latest -tip with kvm without problems. Unfortunately, -tip is usually different enough that it tends not to matter to me while I'm running a mainline kernel :) I've been using:

QEMU PC emulator version 0.9.1 (kvm-84)
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88), Copyright (c) 2003-2008 Fabrice Bellard

Perhaps you could capture the console (booting with apic=debug) to gather more info? And your .config? ... Thanks Johannes! I suspect we need help from the KVM developers. -- Cyrill
Re: [PATCH 0/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net
On Fri, 20 Nov 2009 04:39:19 pm Shirley Ma wrote: The guest's virtio_net receives packets into its pre-allocated vring buffers, then delivers these packets to the upper layer protocols as skbs. So it's not necessary to pre-allocate an skb for each mergeable buffer only to free it when it's unused. This patch defers skb allocation when receiving packets, for both big packets and mergeable buffers. It reduces skb pre-allocations and skb frees. Based on Mickael Avi's suggestion. A destroy function has been created to push unused pages back to the vring as free bufs, and page private is used to maintain the page list. I didn't touch small packet skb allocation, to avoid extra copies for small packets. This patch has been tested and measured against 2.6.32-rc5 git; it is built against the 2.6.32-rc7 kernel. Tests have been done for small packets, big packets and mergeable buffers. Single-stream netperf TCP_STREAM performance improved for host to guest. It also reduces the UDP packet drop rate. The netperf laptop results were:

mtu=1500, netperf -H xxx -l 120
                w/o patch      w/ patch (two runs)
guest to host:  3336.84Mb/s    3730.14Mb/s ~ 3582.88Mb/s
host to guest:  3165.10Mb/s    3370.39Mb/s ~ 3407.96Mb/s

Nice! Is this using mergeable_rx_bufs? Or just big_packets? I'd like to drop big packet support from our driver, but I don't know how many kvm hosts will not offer mergeable rx bufs yet. Anthony? Thanks, Rusty.
Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net
On Sat, 21 Nov 2009 02:51:41 am Shirley Ma wrote: Signed-off-by: Shirley Ma (x...@us.ibm.com)

Hi Shirley, This patch seems like a good idea, but it needs some work. As this is the first time I've received a patch from you, I will go through it in extra detail. (As virtio maintainer, it's my responsibility to include this patch, or not).

-static void give_a_page(struct virtnet_info *vi, struct page *page)
+static void give_pages(struct virtnet_info *vi, struct page *page)
 {
-	page->private = (unsigned long)vi->pages;
+	struct page *npage = (struct page *)page->private;
+
+	if (!npage)
+		page->private = (unsigned long)vi->pages;
+	else {
+		/* give a page list */
+		while (npage) {
+			if (npage->private == (unsigned long)0) {
+				npage->private = (unsigned long)vi->pages;
+				break;
+			}
+			npage = (struct page *)npage->private;
+		}
+	}
 	vi->pages = page;

The loop should cover both cases of the if branch. Either way, we want to find the last page in the list so you can set that ->private pointer to vi->pages. Also, the == (unsigned long)0 is a little verbose. How about:

	struct page *end;

	/* Find end of list, sew whole thing into vi->pages. */
	for (end = page; end->private; end = (struct page *)end->private);
	end->private = (unsigned long)vi->pages;
	vi->pages = page;

+void virtio_free_pages(void *buf)

This is a terrible name; it's specific to virtio_net, plus it should be static. Just free_pages should be sufficient here I think.

+{
+	struct page *page = (struct page *)buf;
+	struct page *npage;
+
+	while (page) {
+		npage = (struct page *)page->private;
+		__free_pages(page, 0);
+		page = npage;
+	}

This smells a lot like a for loop to me?

	struct page *i, *next;

	for (i = buf; i; i = next) {
		next = (struct page *)i->private;
		__free_pages(i, 0);
	}

+static int set_skb_frags(struct sk_buff *skb, struct page *page,
+			 int offset, int len)

A better name might be skb_add_frag()?
+static void receive_skb(struct net_device *dev, void *buf, unsigned len)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
-	struct skb_vnet_hdr *hdr = skb_vnet_hdr(skb);
-	int err;
+	struct skb_vnet_hdr *hdr;
+	struct sk_buff *skb;
 	int i;

 	if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
@@ -132,39 +173,71 @@ static void receive_skb(struct net_device *dev, struct sk_buff *skb,
 		goto drop;
 	}
-	if (vi->mergeable_rx_bufs) {
-		unsigned int copy;
-		char *p = page_address(skb_shinfo(skb)->frags[0].page);
+	if (!vi->mergeable_rx_bufs && !vi->big_packets) {
+		skb = (struct sk_buff *)buf;

This cast is unnecessary, but a comment would be nice: /* Simple case: the pointer is the 1514-byte skb */

+	} else {

And a matching comment here: /* The pointer is a chain of pages. */

+		struct page *page = (struct page *)buf;
+		int copy, hdr_len, num_buf, offset;
+		char *p;
+
+		p = page_address(page);

-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		len -= sizeof(struct virtio_net_hdr_mrg_rxbuf);

+		skb = netdev_alloc_skb(vi->dev, GOOD_COPY_LEN + NET_IP_ALIGN);
+		if (unlikely(!skb)) {
+			dev->stats.rx_dropped++;

Does this mean we leak all those pages? Shouldn't we give_pages() here?

+			return;
+		}
+		skb_reserve(skb, NET_IP_ALIGN);
+		hdr = skb_vnet_hdr(skb);

-		memcpy(&hdr->mhdr, p, sizeof(hdr->mhdr));
-		p += sizeof(hdr->mhdr);

+		if (vi->mergeable_rx_bufs) {
+			hdr_len = sizeof(hdr->mhdr);
+			memcpy(&hdr->mhdr, p, hdr_len);
+			num_buf = hdr->mhdr.num_buffers;
+			offset = hdr_len;
+			if (len > PAGE_SIZE)
+				len = PAGE_SIZE;
+		} else {
+			/* big packets: 6 bytes alignment between virtio_net
+			 * header and data */
+			hdr_len = sizeof(hdr->hdr);
+			memcpy(&hdr->hdr, p, hdr_len);
+			offset = hdr_len + 6;

Really? I can't see where the current driver does this 6 byte offset. There should be a header, then the packet... Ah, I see below that you set things up this way. The better way to do this is to create a new structure to use in various places.

	struct padded_vnet_hdr {
		struct virtio_net_hdr hdr;
		/*
Re: [PATCH] vhost: Fix warnings and bad type handling
On Sun, 22 Nov 2009 08:58:01 pm Michael S. Tsirkin wrote: From: Alan Cox a...@linux.intel.com Subject: [PATCH] vhost: fix warnings on 32 bit systems Fix compiler warning about discarding the top 32 bits of data on 32 bit systems, and document that discarded bits must be 0. Signed-off-by: Alan Cox a...@linux.intel.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- So I think the below slightly tweaked version of Alan's patch is a bit better. OK?

Thanks, applied. Rusty.