[kvm-devel] [PATCH] virtio_blk: allow read-only disks
Hello Rusty, sometimes it is useful to share a disk (e.g. usr). To avoid file system corruption, the disk should be mounted read-only in that case. This patch adds a new feature flag, that allows the host to specify, if the disk should be considered read-only. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/block/virtio_blk.c |6 +- include/linux/virtio_blk.h |1 + 2 files changed, 6 insertions(+), 1 deletion(-) Index: kvm/drivers/block/virtio_blk.c === --- kvm.orig/drivers/block/virtio_blk.c +++ kvm/drivers/block/virtio_blk.c @@ -260,6 +260,10 @@ static int virtblk_probe(struct virtio_d if (virtio_has_feature(vdev, VIRTIO_BLK_F_BARRIER)) blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_TAG, NULL); + /* If disk is read-only in the host, the guest should obey */ + if (virtio_has_feature(vdev, VIRTIO_BLK_F_RO)) + set_disk_ro(vblk->disk, 1); + /* Host must always specify the capacity. */ vdev->config->get(vdev, offsetof(struct virtio_blk_config, capacity), &cap, sizeof(cap)); @@ -325,7 +329,7 @@ static struct virtio_device_id id_table[ static unsigned int features[] = { VIRTIO_BLK_F_BARRIER, VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, - VIRTIO_BLK_F_GEOMETRY, + VIRTIO_BLK_F_GEOMETRY, VIRTIO_BLK_F_RO, }; static struct virtio_driver virtio_blk = { Index: kvm/include/linux/virtio_blk.h === --- kvm.orig/include/linux/virtio_blk.h +++ kvm/include/linux/virtio_blk.h @@ -10,6 +10,7 @@ #define VIRTIO_BLK_F_SIZE_MAX 1 /* Indicates maximum segment size */ #define VIRTIO_BLK_F_SEG_MAX 2 /* Indicates maximum # of segments */ #define VIRTIO_BLK_F_GEOMETRY 4 /* Legacy geometry available */ +#define VIRTIO_BLK_F_RO5 /* Disk is read-only */ struct virtio_blk_config { - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] s390 kvm_virtio.c build error
Am Montag, 5. Mai 2008 schrieb Avi Kivity: > I can, but tell me which one. Also, the patch (Heiko's) needs a > changelog entry and a signoff. Avi, as this patch is now in your queue, can you push this change --- commit eee4646877b748afbfd34d8dbe6ea9b454a65745 Author: Heiko Carstens <[EMAIL PROTECTED]> Date: Tue May 6 17:38:30 2008 +0300 s390: KVM guest: fix compile error --- soon to Linus? kvm still does not compile on s390. Thank you Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC] stop_machine: make stop_machine_run more virtualization friendly
Am Donnerstag, 8. Mai 2008 schrieb Jeremy Fitzhardinge: > > Sorry, forgot to mention. Its kvm.git from 2 days ago on s390. > > > > And on s390 cpu_relax yields the vcpu? That's not common behaviour > across architectures. Yes, cpu_relax on s390 calls diagnose 44. Diagnose 44 translates to yield on z/VM and LPAR. Guessing from the number of the diagnose, I think it was used on z/VM for timeslice yielding long before Linux came to s390. - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC] stop_machine: make stop_machine_run more virtualization friendly
Am Donnerstag, 8. Mai 2008 schrieb Jeremy Fitzhardinge: > Christian Borntraeger wrote: > > On kvm I have seen some rare hangs in stop_machine when I used more guest > > cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the > > hang quite often. I could also reproduce the problem on a 4 way z/VM host with > > a 64 way guest. > > > > I think that's one of those "don't do that then" cases ;) I really like 64 guest cpus as a good testcase for all kind of things. > > I think x86 (at least) is now using ticket locks, which is fair. Which > kernel are you seeing this problem on? Sorry, forgot to mention. Its kvm.git from 2 days ago on s390. - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH/RFC] stop_machine: make stop_machine_run more virtualization friendly
On kvm I have seen some rare hangs in stop_machine when I used more guest cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the hang quite often. I could also reproduce the problem on a 4 way z/VM host with a 64 way guest. It turned out that the guest was consuming all available cpus mostly for spinning on scheduler locks like rq->lock. This is expected as the threads are calling yield all the time. The problem is now, that the host scheduling decisings together with the guest scheduling decisions and spinlocks not being fair managed to create an interesting scenario similar to a live lock. (Sometimes the hang resolved itself after some minutes) Changing stop_machine to yield the cpu to the hypervisor when yielding inside the guest fixed the problem for me. While I am not completely happy with this patch, I think it causes no harm and it really improves the situation for me. I used cpu_relax for yielding to the hypervisor, does that work on all architectures? p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use stop_machine_run and both triggered the problem after some retries. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> CC: Ingo Molnar <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]> --- kernel/stop_machine.c |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) Index: kvm/kernel/stop_machine.c === --- kvm.orig/kernel/stop_machine.c +++ kvm/kernel/stop_machine.c @@ -62,8 +62,7 @@ static int stopmachine(void *cpu) * help our sisters onto their CPUs. */ if (!prepared && !irqs_disabled) yield(); - else - cpu_relax(); + cpu_relax(); } /* Ack: we are exiting. */ @@ -106,8 +105,10 @@ static int stop_machine(void) } /* Wait for them all to come to life. */ - while (atomic_read(&stopmachine_thread_ack) != stopmachine_num_threads) + while (atomic_read(&stopmachine_thread_ack) != stopmachine_num_threads) { yield(); + cpu_relax(); + } /* If some failed, kill them all. */ if (ret < 0) { - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] s390 kvm_virtio.c build error
> Hmm... this should help: > > --- > drivers/s390/kvm/kvm_virtio.c | 40 +++- > 1 file changed, 23 insertions(+), 17 deletions(-) Thanks Heiko. I did a short test and it seems to work. Acked-by: Christian Borntraeger <[EMAIL PROTECTED]> This looks almost identical to Rusty's patch. Who is going to send this (or Rustys) patch to Linus? Christian - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] small author mixup (was: git pull KVM updates for 2.6.26rc)
Am Sonntag, 27. April 2008 schrieb Avi Kivity: > Carsten Otte (4): > s390: KVM preparation: provide hook to enable pgstes in user pagetable > KVM: s390: interrupt subsystem, cpu timer, waitpsw > KVM: s390: API documentation > s390: KVM guest: detect when running on kvm > > Christian Borntraeger (10): > KVM: kvm.h: __user requires compiler.h > s390: KVM preparation: host memory management changes for s390 kvm > s390: KVM preparation: address of the 64bit extint parm in lowcore > KVM: s390: sie intercept handling > KVM: s390: intercepts for privileged instructions > KVM: s390: interprocessor communication via sigp > KVM: s390: intercepts for diagnose instructions > KVM: s390: add kvm to kconfig on s390 > KVM: s390: update maintainers > s390: KVM guest: virtio device support, and kvm hypercalls Thats interesting, some of these patches should actually be credited to Carsten - and in fact on kvm.git master they are credited to Carsten. I think the problem is, that these patches contained multiple From lines. On kvm.git the first line (Carsten) was used. When you transferred these patches to the kvm.git-2.6.26-branch, git used the next From-line as the original one was already removed. While it is not a typical case, is there a better way of specifying multiple authors to avoid future confusion? Christian - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC PATCH] virtio: change config to guest endian.
Am Mittwoch, 23. April 2008 schrieb Avi Kivity: > >> #define VIRTIO_CONFIG_S_FAILED0x80 > >> > >> #ifdef __KERNEL__ > >> -struct virtio_device; > >> +#include > >> > > > > I just realized, that this breaks make headers_check as we dont export > > virtio.h (and we dont want to export it as it relies on scatterlist.h). > > > > > > It's guarded by a #ifdef __KERNEL__, so it should be alright. Yes, you are right. Thanks - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH/RFC] virtio: export more headers to userspace
Rusty, is there a reason why we dont export the virtio headers for 9p, balloon, console, pci, and virtio_ring? kvm uses make sync, but I think it is still useful to heave these headers exported as they might be useful for other userspace tools. I dont export virtio.h, because it does not seem to have useful information for userspace and it requires scatterlist.h which is also not exported. See also my other mail about your "virtio: change config to guest endian." patch. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- include/linux/Kbuild |5 + 1 file changed, 5 insertions(+) Index: kvm/include/linux/Kbuild === --- kvm.orig/include/linux/Kbuild +++ kvm/include/linux/Kbuild @@ -346,6 +346,11 @@ unifdef-y += videodev.h unifdef-y += virtio_config.h unifdef-y += virtio_blk.h unifdef-y += virtio_net.h +unifdef-y += virtio_9p.h +unifdef-y += virtio_balloon.h +unifdef-y += virtio_console.h +unifdef-y += virtio_pci.h +unifdef-y += virtio_ring.h unifdef-y += vt.h unifdef-y += wait.h unifdef-y += wanrouter.h - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC PATCH] virtio: change config to guest endian.
Am Dienstag, 22. April 2008 schrieb Rusty Russell: [...] > diff -r a098f19a6da5 include/linux/virtio_config.h > --- a/include/linux/virtio_config.h Sun Apr 20 14:41:02 2008 +1000 > +++ b/include/linux/virtio_config.h Sun Apr 20 15:07:45 2008 +1000 > @@ -16,7 +16,7 @@ > #define VIRTIO_CONFIG_S_FAILED 0x80 > > #ifdef __KERNEL__ > -struct virtio_device; > +#include I just realized, that this breaks make headers_check as we dont export virtio.h (and we dont want to export it as it relies on scatterlist.h). Christian - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC PATCH] virtio: change config to guest endian.
Am Dienstag, 22. April 2008 schrieb Rusty Russell: > [Christian, Hollis, how much is this ABI breakage going to hurt you?] It is ok for s390 at the moment. We are still working on making userspace ready and I plan to change the guest<->host for s390 anyway. I try to make these changes for drivers/s390/kvm/kvm_virtio.c before 2.6.26. The main reason is, that we are currently limited to around 80 devices. I am not sure, if I should change the allocation of the virtqueues and descriptors to guest memory as well. Back to your patch: I have still some ideas about virtio between little endian and big endian systems, but it requires more and different marshalling anyway - even on driver level. No idea yet how to solve that properly. Consider your change Acked-by: Christian Bornraeger <[EMAIL PROTECTED]> given that you fix the issue below: [...] > --- a/drivers/virtio/virtio_balloon.c Sun Apr 20 14:41:02 2008 +1000 > +++ b/drivers/virtio/virtio_balloon.c Sun Apr 20 15:07:45 2008 +1000 > @@ -155,9 +155,9 @@ static inline s64 towards_target(struct > static inline s64 towards_target(struct virtio_balloon *vb) > { > u32 v; > - __virtio_config_val(vb->vdev, > - offsetof(struct virtio_balloon_config, num_pages), > - &v); > + vb->vdev->config->get(vb->vdev, > + offsetof(struct virtio_balloon_config, num_pages), > + &v); this is missing a sizeof(v), no? Christian - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH/trivial] kvm: remove long -> void *user -> long cast
Avi, kvm_dev_ioctl casts the arg value to void __user *, just to recast it again to long. This seems unnecessary. According to objdump the binary code on x86 is unchanged by this patch. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- virt/kvm/kvm_main.c |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) Index: kvm/virt/kvm/kvm_main.c === --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1188,7 +1188,6 @@ static int kvm_dev_ioctl_create_vm(void) static long kvm_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { - void __user *argp = (void __user *)arg; long r = -EINVAL; switch (ioctl) { @@ -1205,7 +1204,7 @@ static long kvm_dev_ioctl(struct file *f r = kvm_dev_ioctl_create_vm(); break; case KVM_CHECK_EXTENSION: - r = kvm_dev_ioctl_check_extension((long)argp); + r = kvm_dev_ioctl_check_extension(arg); break; case KVM_GET_VCPU_MMAP_SIZE: r = -EINVAL; - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 2/2] virtio-s390: Change virtio interrupt definitions to follow architectu re
Am Donnerstag, 17. April 2008 schrieb Avi Kivity: > Don't you need to change KVM_S390_INT_VIRTIO as well? Good catch. It works without that change, but its cleaner to change that. We can also remove another fixme, as the host interrupt uses the same control register bit 9. Can you merge this into the first patch? Christian Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Carsten Otte <[EMAIL PROTECTED]> --- arch/s390/kvm/interrupt.c |2 +- include/linux/kvm.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Index: kvm/arch/s390/kvm/interrupt.c === --- kvm.orig/arch/s390/kvm/interrupt.c +++ kvm/arch/s390/kvm/interrupt.c @@ -50,7 +50,7 @@ static int __interrupt_is_deliverable(st if (psw_extint_disabled(vcpu)) return 0; if (vcpu->arch.sie_block->gcr[0] & 0x200ul) - return 1; /*FIXME virtio control register bit */ + return 1; return 0; case KVM_S390_PROGRAM_INT: case KVM_S390_SIGP_STOP: Index: kvm/include/linux/kvm.h === --- kvm.orig/include/linux/kvm.h +++ kvm/include/linux/kvm.h @@ -250,7 +250,7 @@ struct kvm_s390_psw { #define KVM_S390_PROGRAM_INT 0xfffe0001u #define KVM_S390_SIGP_SET_PREFIX 0xfffe0002u #define KVM_S390_RESTART 0xfffe0003u -#define KVM_S390_INT_VIRTIO0x1237u /*FIXME arch number */ +#define KVM_S390_INT_VIRTIO0x2603u #define KVM_S390_INT_SERVICE 0x2401u #define KVM_S390_INT_EMERGENCY 0x1201u - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC/PATCH 05/15 v3] kvm-s390: s390 arch backend for the kvm kernel module
Am Montag, 31. März 2008 schrieb Arnd Bergmann: Hello Arnd, thanks for the review. > On Tuesday 25 March 2008, Carsten Otte wrote: > > > + > > +static inline void __user *__guestaddr_to_user(struct kvm_vcpu *vcpu, > > + u64 guestaddr) > > +{ > > + u64 prefix = vcpu->arch.sie_block->prefix; > > + u64 origin = vcpu->kvm->arch.guest_origin; > > + u64 memsize = vcpu->kvm->arch.guest_memsize; > > + > > + if (guestaddr < 2 * PAGE_SIZE) > > + guestaddr += prefix; > > + else if ((guestaddr >= prefix) && (guestaddr < prefix + 2 * PAGE_SIZE)) > > + guestaddr -= prefix; > > What happens if prefix is set to 4096? Does this do the right thing > according to the architecture definition? The z/architecture and the instructions (spx + sigp set prefix) dont allow 4096 as prefix address. When setting a prefix, bits 1-18 (IBM numbering scheme) are used and appended with 13 zero to the right. That means prefix address is always a multiple of 8192. - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] virtio-blk: allow more than one in-flight request
Am Donnerstag, 27. März 2008 schrieb Marcelo Tosatti: > > Allow more than one in-flight request in the virtio ring. This allows > the host driver to submit requests in parallel. [...] > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > index 3b1a68d..5bb041f 100644 > --- a/drivers/block/virtio_blk.c > +++ b/drivers/block/virtio_blk.c [...] > @@ -202,7 +204,8 @@ static int virtblk_probe(struct virtio_device *vdev) > goto out_free_vblk; > } > > - vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req)); > + vblk->pool = mempool_create_kmalloc_pool(VIRTIO_BLK_POOL_SIZE, > +sizeof(struct virtblk_req)); > if (!vblk->pool) { > err = -ENOMEM; > goto out_free_vq; [...] Huh? I dont understand. You only change the number of pre-allocated pool entries. mempool_alloc should allow several allocations, even with 1 as initialization value. The pre-initialized value is only for oom situations. No? Christian - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm.h: __user requires compiler.h
Am Montag, 24. März 2008 schrieb Avi Kivity: > Christian Borntraeger wrote: > > Am Freitag, 21. März 2008 schrieb Anthony Liguori: > > > >> This patch breaks QEMU build when doing a 'make sync'. When you do a > >> top-level ./configure, libkvm is built with kerneldir pointing to > >> kvm-userspace/kernel/include. While linux/kvm.h is present there, [...] > Maybe we should generate the 'make sync' headers using 'make > headers_install'. headers_install works because there is # Eliminate the contents of (and inclusions of) compiler.h HDRSED := sed -e "s/ inline / __inline__ /g" \ -e "s/[[:space:]]__user[[:space:]]\{1,\}/ /g" \ -e "s/(__user[[:space:]]\{1,\}/ (/g" \ -e "s/[[:space:]]__force[[:space:]]\{1,\}/ /g" \ -e "s/(__force[[:space:]]\{1,\}/ (/g" \ -e "s/[[:space:]]__iomem[[:space:]]\{1,\}/ /g" \ -e "s/(__iomem[[:space:]]\{1,\}/ (/g" \ -e "s/[[:space:]]__attribute_const__[[:space:]]\{1,\}/\ /g" \ -e "s/[[:space:]]__attribute_const__$$//" \ -e "/^\#include /d" in scripts/Makefile.headersinst If you dont want to do something like for make sync, what about providing a dummy compiler.h, which has only this line? #define __user Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] + kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch added to -mm tree
Am Mittwoch, 12. März 2008 schrieben Sie: > > The patch titled > kvm: provide kvm.h for all architecture: fixes headers_install > has been added to the -mm tree. Its filename is > kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch > Hello Andrew, is there a chance to submit this patch before 2.6.25? headers_install of kvm.h worked with 2.6.24 but is still broken with 2.6.25-rc. Thanks Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm.h: __user requires compiler.h
Am Freitag, 21. März 2008 schrieb Anthony Liguori: > This patch breaks QEMU build when doing a 'make sync'. When you do a > top-level ./configure, libkvm is built with kerneldir pointing to > kvm-userspace/kernel/include. While linux/kvm.h is present there, there > isn't a linux/compiler.h. I checked with make headers_install, which strips out __user and compiler.h nicely. I never tried "make sync" since I was not aware of it. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC/PATCH 15/15] guest: virtio device support, and kvm hypercalls
Am Freitag, 21. März 2008 schrieb Rusty Russell: > I'd recommend a hypercall after set_status, as well as reset. The > reason lguest doesn't do this is that we don't do feature negotiation > (assuming guest kernel matches host kernel). In general, the host > needs to know when the VIRTIO_CONFIG_S_DRIVER_OK is set so it can see > what features the guest driver accepted. Right. Will have a look. > > Overloading the notify hypercall is kind of a hack too, but it works so > no real need to change that. > > > + * The root device for the kvm virtio devices. > > + * This makes them appear as /sys/devices/kvm/0,1,2 not /sys/devices/0,1,2. > > + */ > > +static struct device kvm_root = { > > + .parent = NULL, > > + .bus_id = "kvm_s390", > > +}; > > You mean /sys/devices/kvm_s390/0,1,2? Yes, thanks. > > > +static int __init kvm_devices_init(void) > > +{ > > + if (!MACHINE_IS_KVM) > > + return -ENODEV; > > + > > + if (device_register(&kvm_root) != 0) > > + panic("Could not register kvm root"); > > + > > + if (add_shared_memory((max_pfn) << PAGE_SHIFT, PAGE_SIZE)) { > > + device_unregister(&kvm_root); > > + return -ENOMEM; > > + } > > Hmm, panic on device_register fail, but -ENOMEM on add_shared_memory fail? > My theory was that since this is boot time, panic() is the right thing. Good spot, but I agree with Carsten. Drivers should not panic. I have module load/unload capability on my long term todo list, but I can change the panic now. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] kvm.h: __user requires compiler.h
include/linux/kvm.h defines struct kvm_dirty_log to [...] union { void __user *dirty_bitmap; /* one bit per page */ __u64 padding; }; __user requires compiler.h to compile. Currently, this works on x86 only coincidentally due to other include files. This patch makes kvm.h compile in all cases. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- include/linux/kvm.h |1 + 1 file changed, 1 insertion(+) Index: kvm/include/linux/kvm.h === --- kvm.orig/include/linux/kvm.h +++ kvm/include/linux/kvm.h @@ -8,6 +8,7 @@ */ #include +#include #include #include - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install
Am Dienstag, 11. März 2008 schrieb Andreas Schwab: > Christian Borntraeger <[EMAIL PROTECTED]> writes: > > > include/asm-alpha/kvm.h|6 ++ > > Can't you put it in asm-generic? I dont think so. The generic part is in include/linux/kvm.h. asm/kvm.h is populated by each architecture which gets kvm support with arch specfic definitions. x86 has support and s390, powerpc and ia64 are currently know to work on ports. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install
Am Dienstag, 11. März 2008 schrieb David Woodhouse: > > On Mon, 2008-03-10 at 23:20 -0700, Andrew Morton wrote: > > > > > Changes since v1: > > > o use asm-generic/Kbuild.asm (Arnd Bergmann) > > I don't see that in the diffstat of the patch you dropped (looking at > the immediately subsequent mail in my inbox). Which might explain... I cross tested my patch on all architectures and, yes, that part did not make it into mm. Maybe it was caused by my last patch being word wrapped. Here is a resend: [PATCH v2] kvm: provide kvm.h for all architecture: fixes headers_install Currently include/linux/kvm.h is not considered by make headers_install, because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h. This problem was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this an 2.6.25 regression. One way of solving the issue is to enhance Kbuild, but Avi and David conviced me, that changing headers_install is not the way to go. This patch changes the definition for linux/kvm.h to unifdef-y. If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all architectures without asm/kvm.h. Therefore, this patch also provides asm/kvm.h on all architectures. Changes since v1: o use asm-generic/Kbuild.asm (Arnd Bergmann) o fix comment in asm-frv (David Howells) Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Avi Kivity <[EMAIL PROTECTED]> CCed: Sam Ravnborg <[EMAIL PROTECTED] CCed: David Woodhouse <[EMAIL PROTECTED]> --- include/asm-alpha/kvm.h|6 ++ include/asm-arm/kvm.h |6 ++ include/asm-avr32/kvm.h|6 ++ include/asm-blackfin/kvm.h |6 ++ include/asm-cris/kvm.h |6 ++ include/asm-frv/kvm.h |6 ++ include/asm-generic/Kbuild.asm |2 ++ include/asm-h8300/kvm.h|6 ++ include/asm-ia64/kvm.h |6 ++ include/asm-m32r/kvm.h |6 ++ include/asm-m68k/kvm.h |6 ++ include/asm-m68knommu/kvm.h|6 ++ include/asm-mips/kvm.h |6 ++ include/asm-mn10300/kvm.h |6 ++ include/asm-parisc/kvm.h |6 ++ include/asm-powerpc/kvm.h |6 ++ include/asm-s390/kvm.h |6 ++ include/asm-sh/kvm.h |6 ++ include/asm-sparc/kvm.h|6 ++ include/asm-sparc64/kvm.h |6 ++ include/asm-um/kvm.h |6 ++ include/asm-v850/kvm.h |6 ++ include/asm-xtensa/kvm.h |6 ++ include/linux/Kbuild |2 +- 24 files changed, 135 insertions(+), 1 deletion(-) Index: linux-2.6.git/include/asm-alpha/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-alpha/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ALPHA_H +#define __LINUX_KVM_ALPHA_H + +/* alpha does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-arm/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-arm/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ARM_H +#define __LINUX_KVM_ARM_H + +/* arm does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-avr32/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-avr32/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_AVR32_H +#define __LINUX_KVM_AVR32_H + +/* avr32 does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-blackfin/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-blackfin/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_BLACKFIN_H +#define __LINUX_KVM_BLACKFIN_H + +/* blackfin does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-cris/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-cris/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_CRIS_H +#define __LINUX_KVM_CRIS_H + +/* cris does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-frv/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-frv/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_FRV_H +#define __LINUX_KVM_FRV_H + +/* frv does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-generic/Kbuild.asm === --- linux-2.6.git.orig/include/asm-generic/Kbuild.asm +++ linux-2.6.git/include/asm-generic/Kbuild.asm @@ -1,3 +1,5 @@ +header-y += kvm.h + ifeq ($(wildcard include/asm-$(SRCARCH)/a.out.h),include/asm-$(SRCARCH)/a.out.h) unifdef-y += a.out.h endif Index: linux-2.6.git/include/asm-h8300/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-h8300
[kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install
Ok, here goes the second version. Andrew, if there are no further complaints, please replace the old patch, with this version. Thank you. - [PATCH v2] kvm: provide kvm.h for all architecture: fixes headers_install Currently include/linux/kvm.h is not considered by make headers_install, because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h. This problem was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this an 2.6.25 regression. One way of solving the issue is to enhance Kbuild, but Avi Kivity and David Woodhouse conviced me, that changing headers_install is not the way to go. This patch changes the definition for linux/kvm.h to unifdef-y. If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all architectures without asm/kvm.h. Therefore, this patch also provides asm/kvm.h on all architectures. Changes since v1: o use asm-generic/Kbuild.asm (Arnd Bergmann) o fix comment in asm-frv (David Howells) Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Avi Kivity <[EMAIL PROTECTED]> --- include/asm-alpha/kvm.h|6 ++ include/asm-arm/kvm.h |6 ++ include/asm-avr32/kvm.h|6 ++ include/asm-blackfin/kvm.h |6 ++ include/asm-cris/kvm.h |6 ++ include/asm-frv/kvm.h |6 ++ include/asm-generic/Kbuild.asm |2 ++ include/asm-h8300/kvm.h|6 ++ include/asm-ia64/kvm.h |6 ++ include/asm-m32r/kvm.h |6 ++ include/asm-m68k/kvm.h |6 ++ include/asm-m68knommu/kvm.h|6 ++ include/asm-mips/kvm.h |6 ++ include/asm-mn10300/kvm.h |6 ++ include/asm-parisc/kvm.h |6 ++ include/asm-powerpc/kvm.h |6 ++ include/asm-s390/kvm.h |6 ++ include/asm-sh/kvm.h |6 ++ include/asm-sparc/kvm.h|6 ++ include/asm-sparc64/kvm.h |6 ++ include/asm-um/kvm.h |6 ++ include/asm-v850/kvm.h |6 ++ include/asm-xtensa/kvm.h |6 ++ include/linux/Kbuild |2 +- 24 files changed, 135 insertions(+), 1 deletion(-) Index: linux-2.6.git/include/asm-alpha/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-alpha/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ALPHA_H +#define __LINUX_KVM_ALPHA_H + +/* alpha does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-arm/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-arm/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ARM_H +#define __LINUX_KVM_ARM_H + +/* arm does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-avr32/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-avr32/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_AVR32_H +#define __LINUX_KVM_AVR32_H + +/* avr32 does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-blackfin/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-blackfin/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_BLACKFIN_H +#define __LINUX_KVM_BLACKFIN_H + +/* blackfin does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-cris/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-cris/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_CRIS_H +#define __LINUX_KVM_CRIS_H + +/* cris does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-frv/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-frv/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_FRV_H +#define __LINUX_KVM_FRV_H + +/* frv does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-generic/Kbuild.asm === --- linux-2.6.git.orig/include/asm-generic/Kbuild.asm +++ linux-2.6.git/include/asm-generic/Kbuild.asm @@ -1,3 +1,5 @@ +header-y += kvm.h + ifeq ($(wildcard include/asm-$(SRCARCH)/a.out.h),include/asm-$(SRCARCH)/a.out.h) unifdef-y += a.out.h endif Index: linux-2.6.git/include/asm-h8300/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-h8300/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_H8300_H +#define __LINUX_KVM_H8300_H + +/* h8300 does not support KVM */ + +#endif Index: linux-2.6.git/include/asm-ia64/kvm.h === --- /dev/null +++ linux-2.6.git/include/asm-ia64/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_IA64_H +#define __LINUX_KVM_IA64_H + +/* ia64 does not support KVM */ + +#endif Index: linux-2.6
[kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install
Andrew, Avi is away, can you apply this patch for 2.6.25? [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install Currently include/linux/kvm.h is not considered by make headers_install, because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h. This problem was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this an 2.6.25 regression. One way of solving the issue is to enhance Kbuild, but Avi and David conviced me, that changing headers_install is not the way to go. This patch changes the definition for linux/kvm.h to unifdef-y. If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all architectures without asm/kvm.h. Therefore, this patch also provides asm/kvm.h on all architectures. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Avi Kivity <[EMAIL PROTECTED]> CCed: Sam Ravnborg <[EMAIL PROTECTED] CCed: David Woodhouse <[EMAIL PROTECTED]> --- include/asm-alpha/Kbuild |1 + include/asm-alpha/kvm.h |6 ++ include/asm-arm/Kbuild |2 ++ include/asm-arm/kvm.h|6 ++ include/asm-avr32/Kbuild |1 + include/asm-avr32/kvm.h |6 ++ include/asm-blackfin/Kbuild |1 + include/asm-blackfin/kvm.h |6 ++ include/asm-cris/Kbuild |1 + include/asm-cris/kvm.h |6 ++ include/asm-frv/Kbuild |1 + include/asm-frv/kvm.h|6 ++ include/asm-h8300/Kbuild |2 ++ include/asm-h8300/kvm.h |6 ++ include/asm-ia64/Kbuild |1 + include/asm-ia64/kvm.h |6 ++ include/asm-m32r/Kbuild |2 ++ include/asm-m32r/kvm.h |6 ++ include/asm-m68k/Kbuild |1 + include/asm-m68k/kvm.h |6 ++ include/asm-m68knommu/Kbuild |2 ++ include/asm-m68knommu/kvm.h |7 +++ include/asm-mips/Kbuild |2 +- include/asm-mips/kvm.h |6 ++ include/asm-mn10300/Kbuild |2 ++ include/asm-mn10300/kvm.h|6 ++ include/asm-parisc/Kbuild|2 ++ include/asm-parisc/kvm.h |6 ++ include/asm-powerpc/Kbuild |1 + include/asm-powerpc/kvm.h|6 ++ include/asm-s390/Kbuild |1 + include/asm-s390/kvm.h |6 ++ include/asm-sh/Kbuild|1 + include/asm-sh/kvm.h |6 ++ include/asm-sparc/Kbuild |1 + include/asm-sparc/kvm.h |6 ++ include/asm-sparc64/Kbuild |1 + include/asm-sparc64/kvm.h|6 ++ include/asm-v850/Kbuild |2 ++ include/asm-v850/kvm.h |6 ++ include/asm-xtensa/Kbuild|2 ++ include/asm-xtensa/kvm.h |6 ++ include/linux/Kbuild |2 +- 43 files changed, 157 insertions(+), 2 deletions(-) Index: linux-2.6/include/asm-alpha/Kbuild === --- linux-2.6.orig/include/asm-alpha/Kbuild +++ linux-2.6/include/asm-alpha/Kbuild @@ -1,6 +1,7 @@ include include/asm-generic/Kbuild.asm header-y += gentrap.h +header-y += kvm.h header-y += regdef.h header-y += pal.h header-y += reg.h Index: linux-2.6/include/asm-alpha/kvm.h === --- /dev/null +++ linux-2.6/include/asm-alpha/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ALPHA_H +#define __LINUX_KVM_ALPHA_H + +/* alpha does not support KVM */ + +#endif Index: linux-2.6/include/asm-arm/Kbuild === --- linux-2.6.orig/include/asm-arm/Kbuild +++ linux-2.6/include/asm-arm/Kbuild @@ -1,3 +1,5 @@ include include/asm-generic/Kbuild.asm +header-y += kvm.h + unifdef-y += hwcap.h Index: linux-2.6/include/asm-arm/kvm.h === --- /dev/null +++ linux-2.6/include/asm-arm/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ARM_H +#define __LINUX_KVM_ARM_H + +/* arm does not support KVM */ + +#endif Index: linux-2.6/include/asm-avr32/Kbuild === --- linux-2.6.orig/include/asm-avr32/Kbuild +++ linux-2.6/include/asm-avr32/Kbuild @@ -1,3 +1,4 @@ include include/asm-generic/Kbuild.asm header-y += cachectl.h +header-y += kvm.h Index: linux-2.6/include/asm-avr32/kvm.h === --- /dev/null +++ linux-2.6/include/asm-avr32/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_AVR32_H +#define __LINUX_KVM_AVR32_H + +/* avr32 does not support KVM */ + +#endif Index: linux-2.6/include/asm-blackfin/Kbuild === --- linux-2.6.orig/include/asm-blackfin/Kbuild +++ linux-2.6/include/asm-blackfin/Kbuild @@ -1,3 +1,4 @@ include include/asm-generic/Kbuild.asm header-y += fixed_code.h +header-y += kvm.h Index:
Re: [kvm-devel] headersinstall of kvm.h does not work
Am Freitag, 7. März 2008 schrieb Avi Kivity: > As I'm about to disappear for a week, consider a patch to remove the > config dependency and add asm-*/kvm.h pre-acked for mainline. Maybe the > presence of those empty asm-*/kvm.h files will encourage further kvm > ports to *. Something like the following for all architectures? --- linux-2.6.orig/include/asm-alpha/Kbuild +++ linux-2.6/include/asm-alpha/Kbuild @@ -1,6 +1,7 @@ include include/asm-generic/Kbuild.asm header-y += gentrap.h +header-y += kvm.h header-y += regdef.h header-y += pal.h header-y += reg.h Index: linux-2.6/include/asm-alpha/kvm.h === --- /dev/null +++ linux-2.6/include/asm-alpha/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ALPHA_H +#define __LINUX_KVM_ALPHA_H + +/* alpha does not support KVM */ + +#endif - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] headersinstall of kvm.h does not work
Hello Avi, in commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6 you changed include/linux/Kbuild: snip KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM Currently, make headers_check barfs due to , which includes, not existing. Rather than add a zillion s, export kvm.h only if the arch actually supports it. [...] unifdef-y += keyboard.h -unifdef-y += kvm.h +unifdef-$(CONFIG_HAVE_KVM) += kvm.h unifdef-y += llc.h unifdef-y += loop.h snip-- This patch does not work. Kbuild (scripts/Makefile.headersinst) does not check the config file, so kvm.h is never installed. Sam is there an easy way to allow constructs like "unifdef-$(CONFIG_FOO)"? Thanks Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [patch 3/5] KVM: hypercall batching
Am Montag, 18. Februar 2008 schrieb Avi Kivity: > > AFAICS there is no guarantee about page-alignment here... > Kernel data is physically contiguous (true for per-cpu data as well?), > so no there's issue here. Modules are loaded into vmalloc space, no? I think, if kvm is built as module, static module variables are not guaranteed to be contiguous. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtio_net: Fix oops on early interrupts - introduced by virtio reset code
Am Montag, 11. Februar 2008 schrieb Anthony Liguori: > The reset support is in Linus's tree so we should try to push it for -rc2. You are right. My repository was borked. will push it to Jeff Garzik. Thanks Jeff can you schedule this fix into your network driver updates? Thanks --- With the latest virtio_reset patches I got the following oops: Unable to handle kernel pointer dereference at virtual kernel address Oops: 0004 [#1] PREEMPT SMP Modules linked in: CPU: 1 Not tainted 2.6.24zlive-guest-10577-g63f5307-dirty #168 Process swapper (pid: 0, task: 0f866040, ksp: 0f86fd78) Krnl PSW : 040410018000 0047598a (skb_recv_done+0x52/0x98) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3 Krnl GPRS: 0001 0efd0e60 0001 0f866040 008de4c8 1237 1237 0f977dd8 0020 001132bc 0f977e08 0f977dd8 Krnl Code: 0047597c: e3104034 lg %r1,48(%r4) 00475982: b9040001 lgr %r0,%r1 00475986: b9810003 ogr %r0,%r3 >0047598a: eb1040300030 csg %r1,%r0,48(%r4) 00475990: a744fff9 brc 4,475982 00475994: a7110001 tmll%r1,1 00475998: a7840009 brc 8,4759aa 0047599c: e340b0b80004 lg %r4,184(%r11) Call Trace: ([<01500f978000>] 0x1500f978000) [<004779a6>] vring_interrupt+0x72/0x88 [<00491d9c>] kvm_extint_handler+0x34/0x44 [<0010d2d4>] do_extint+0xc0/0xfc [<00113b5a>] ext_no_vtime+0x1c/0x20 [<0010a0b6>] cpu_idle+0x21a/0x230 ([<0010a096>] cpu_idle+0x1fa/0x230) [<0057dfe4>] start_secondary+0xa0/0xb4 We must initialize vdev->priv before we use the notify hypercall as vdev->priv is used in skb_recv_done. So lets move the assignment of vdev->priv before we call try_fill_recv. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Anthony Liguori <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -361,6 +361,7 @@ static int virtnet_probe(struct virtio_d netif_napi_add(dev, &vi->napi, virtnet_poll, napi_weight); vi->dev = dev; vi->vdev = vdev; + vdev->priv = vi; /* We expect two virtqueues, receive then send. */ vi->rvq = vdev->config->find_vq(vdev, 0, skb_recv_done); @@ -395,7 +396,6 @@ static int virtnet_probe(struct virtio_d } pr_debug("virtnet: registered device %s\n", dev->name); - vdev->priv = vi; return 0; unregister: - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtio_net: Fix oops on early interrupts - introduced by virtio reset code
Avi, this fixes a problem that was introduced by the virtio_reset patches. Can you apply that fix to kvm.git as a bugfix, as the virtio_reset infrastructure is not on Linus upstream yet? Anthony, Dor, are you ok with that change? -- With the latest virtio_reset patches I got the following oops: Unable to handle kernel pointer dereference at virtual kernel address Oops: 0004 [#1] PREEMPT SMP Modules linked in: CPU: 1 Not tainted 2.6.24zlive-guest-10577-g63f5307-dirty #168 Process swapper (pid: 0, task: 0f866040, ksp: 0f86fd78) Krnl PSW : 040410018000 0047598a (skb_recv_done+0x52/0x98) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3 Krnl GPRS: 0001 0efd0e60 0001 0f866040 008de4c8 1237 1237 0f977dd8 0020 001132bc 0f977e08 0f977dd8 Krnl Code: 0047597c: e3104034 lg %r1,48(%r4) 00475982: b9040001 lgr %r0,%r1 00475986: b9810003 ogr %r0,%r3 >0047598a: eb1040300030 csg %r1,%r0,48(%r4) 00475990: a744fff9 brc 4,475982 00475994: a7110001 tmll%r1,1 00475998: a7840009 brc 8,4759aa 0047599c: e340b0b80004 lg %r4,184(%r11) Call Trace: ([<01500f978000>] 0x1500f978000) [<004779a6>] vring_interrupt+0x72/0x88 [<00491d9c>] kvm_extint_handler+0x34/0x44 [<0010d2d4>] do_extint+0xc0/0xfc [<00113b5a>] ext_no_vtime+0x1c/0x20 [<0010a0b6>] cpu_idle+0x21a/0x230 We must initialize vdev->priv before we use the notify hypercall as vdev->priv is used in skb_recv_done. So lets move the assignment of vdev->priv before we call try_fill_recv. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -361,6 +361,7 @@ static int virtnet_probe(struct virtio_d netif_napi_add(dev, &vi->napi, virtnet_poll, napi_weight); vi->dev = dev; vi->vdev = vdev; + vdev->priv = vi; /* We expect two virtqueues, receive then send. */ vi->rvq = vdev->config->find_vq(vdev, 0, skb_recv_done); @@ -395,7 +396,6 @@ static int virtnet_probe(struct virtio_d } pr_debug("virtnet: registered device %s\n", dev->name); - vdev->priv = vi; return 0; unregister: - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] virtio_ring: make structure defines packed
Am Sonntag, 10. Februar 2008 schrieb ron minnich: > The problem for me is that gcc packed is not compatible with the plan > 9 C compiler. So, for a truly heterogeneous setup, you are going to > have > to have code to marshall the structures when transferring between > domains, and you are better off not trying to fake it with packed -- > it can actually make the > marshalling less efficient. Ok, I am conviced. The packed idea came up for mixed platform setups, but you are right - this reuqires much more. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtio_ring: make structure defines packed
Currently the virtio_ring structure are not declared packed, but they describe an hardware like interface. We should not allow compilers to make alignments and optimizations that can be different between the guest and host compiler. I propose to declare all structures that are in shared memory as packed. Does anybody see a problem with packed? Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- include/linux/virtio_ring.h |9 + 1 file changed, 5 insertions(+), 4 deletions(-) Index: kvm/include/linux/virtio_ring.h === --- kvm.orig/include/linux/virtio_ring.h +++ kvm/include/linux/virtio_ring.h @@ -35,14 +35,14 @@ struct vring_desc __u16 flags; /* We chain unused descriptors via this, too */ __u16 next; -}; +} __attribute__ ((packed)); struct vring_avail { __u16 flags; __u16 idx; __u16 ring[]; -}; +} __attribute__ ((packed)); /* u32 is used here for ids for padding reasons. */ struct vring_used_elem @@ -51,14 +51,15 @@ struct vring_used_elem __u32 id; /* Total length of the descriptor chain which was used (written to) */ __u32 len; -}; +} __attribute__ ((packed)); struct vring_used { __u16 flags; __u16 idx; + __u32 padding; struct vring_used_elem ring[]; -}; +} __attribute__ ((packed)); struct vring { unsigned int num; - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [Patch] virtio_blk: implement naming for vda-vdz, vdaa-vdzz, vdaaa-vdzzz
Am Freitag, 1. Februar 2008 schrieb Christian Borntraeger: > Right. I will fix that with an additional patch. This patch goes on top of the minor number patch. Please let me know if you want a merged patch: Currently virtio_blk creates the disk name combinging "vd" with 'a'++. This will give strange names after vdz. I have implemented names up to vdzzz - inspired by the sd.c code. That should be sufficient for now. There is one driver in the kernel (driver/s390/block/dasd_genhd.c) that implements names from dasda-dasd allowing even more disks. Maybe a janitor can come up with a common implementation usable for all kind of block device drivers. I have tested this patch with 100 disks - seems to work. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/block/virtio_blk.c | 29 ++--- 1 file changed, 22 insertions(+), 7 deletions(-) Index: kvm/drivers/block/virtio_blk.c === --- kvm.orig/drivers/block/virtio_blk.c +++ kvm/drivers/block/virtio_blk.c @@ -11,8 +11,7 @@ MODULE_LICENSE("GPL"); -static unsigned char virtblk_index = 'a'; -static int major, minor; +static int major, index; struct virtio_blk { @@ -173,6 +172,11 @@ static struct block_device_operations vi .getgeo = virtblk_getgeo, }; +static int index_to_minor(int index) +{ + return index << PART_BITS; +} + static int virtblk_probe(struct virtio_device *vdev) { struct virtio_blk *vblk; @@ -180,7 +184,7 @@ static int virtblk_probe(struct virtio_d u64 cap; u32 v; - if (minor >= 1 << MINORBITS) + if (index_to_minor(index) >= 1 << MINORBITS) return -ENOSPC; vdev->priv = vblk = kmalloc(sizeof(*vblk), GFP_KERNEL); @@ -219,13 +223,24 @@ static int virtblk_probe(struct virtio_d goto out_put_disk; } - sprintf(vblk->disk->disk_name, "vd%c", virtblk_index++); + if (index < 26) { + sprintf(vblk->disk->disk_name, "vd%c", 'a' + index % 26); + } else if (index < (26 + 1) * 26) { + sprintf(vblk->disk->disk_name, "vd%c%c", + 'a' + index / 26 - 1, 'a' + index % 26); + } else { + const unsigned int m1 = (index / 26 - 1) / 26 - 1; + const unsigned int m2 = (index / 26 - 1) % 26; + const unsigned int m3 = index % 26; + sprintf(vblk->disk->disk_name, "vd%c%c%c", + 'a' + m1, 'a' + m2, 'a' + m3); + } + vblk->disk->major = major; - vblk->disk->first_minor = minor; + vblk->disk->first_minor = index_to_minor(index); vblk->disk->private_data = vblk; vblk->disk->fops = &virtblk_fops; - - minor += 1 << PART_BITS; + index++; /* If barriers are supported, tell block layer that queue is ordered */ if (vdev->config->feature(vdev, VIRTIO_BLK_F_BARRIER)) - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] virtio_blk: Dont waste major numbers
Am Donnerstag, 31. Januar 2008 schrieb Anthony Liguori: > There's are some other limitations to the number of virtio block > devices. For instances... > > > sprintf(vblk->disk->disk_name, "vd%c", virtblk_index++); > > This gets bogus after 64 disks. Right. I will fix that with an additional patch. > We also have a hard limit for > virtio-pci based on the number of PCI slots available. One thing I was > considering was whether we should try to support multiple disks per > virtio device. What is the hard limit in the PCI name space? Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtio_blk: Dont waste major numbers
Rusty, currently virtio_blk uses one major number per device. While this works quite well on most systems it is wasteful and will exhaust major numbers on larger installations. This patch allocates a major number on init and will use 16 minor numbers for each disk. That will allow ~64k virtio_blk disks. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/block/virtio_blk.c | 28 1 file changed, 16 insertions(+), 12 deletions(-) Index: kvm/drivers/block/virtio_blk.c === --- kvm.orig/drivers/block/virtio_blk.c +++ kvm/drivers/block/virtio_blk.c @@ -7,10 +7,13 @@ #include #define VIRTIO_MAX_SG (3+MAX_PHYS_SEGMENTS) +#define PART_BITS 4 MODULE_LICENSE("GPL"); static unsigned char virtblk_index = 'a'; +static int major, minor; + struct virtio_blk { spinlock_t lock; @@ -173,10 +176,13 @@ static struct block_device_operations vi static int virtblk_probe(struct virtio_device *vdev) { struct virtio_blk *vblk; - int err, major; + int err; u64 cap; u32 v; + if (minor >= 1 << MINORBITS) + return -ENOSPC; + vdev->priv = vblk = kmalloc(sizeof(*vblk), GFP_KERNEL); if (!vblk) { err = -ENOMEM; @@ -200,17 +206,11 @@ static int virtblk_probe(struct virtio_d goto out_free_vq; } - major = register_blkdev(0, "virtblk"); - if (major < 0) { - err = major; - goto out_mempool; - } - /* FIXME: How many partitions? How long is a piece of string? */ - vblk->disk = alloc_disk(1 << 4); + vblk->disk = alloc_disk(1 << PART_BITS); if (!vblk->disk) { err = -ENOMEM; - goto out_unregister_blkdev; + goto out_mempool; } vblk->disk->queue = blk_init_queue(do_virtblk_request, &vblk->lock); @@ -221,10 +221,12 @@ static int virtblk_probe(struct virtio_d sprintf(vblk->disk->disk_name, "vd%c", virtblk_index++); vblk->disk->major = major; - vblk->disk->first_minor = 0; + vblk->disk->first_minor = minor; vblk->disk->private_data = vblk; vblk->disk->fops = &virtblk_fops; + minor += 1 << PART_BITS; + /* If barriers are supported, tell block layer that queue is ordered */ if (vdev->config->feature(vdev, VIRTIO_BLK_F_BARRIER)) blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_TAG, NULL); @@ -260,8 +262,6 @@ static int virtblk_probe(struct virtio_d out_put_disk: put_disk(vblk->disk); -out_unregister_blkdev: - unregister_blkdev(major, "virtblk"); out_mempool: mempool_destroy(vblk->pool); out_free_vq: @@ -302,11 +302,15 @@ static struct virtio_driver virtio_blk = static int __init init(void) { + major = register_blkdev(0, "virtblk"); + if (major < 0) + return major; return register_virtio_driver(&virtio_blk); } static void __exit fini(void) { + unregister_blkdev(major, "virtblk"); unregister_virtio_driver(&virtio_blk); } module_init(init); - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] KVM simplified virtio balloon driver
Am Donnerstag, 17. Januar 2008 schrieb Rusty Russell: > Since the balloon requires Guest cooperation anyway, there seems > little reason to force it to tell the Host when it wants to reuse a > page. It can simply fault it in. Yes, thats what we do in the s390 cmm driver. All in all the driver has similarities with cmm. I dont know, if we can consolidate some code. Besides the hypervisor, we have a additional user interface: /proc/sys/vm/. The root user can specify the amount of pages in the balloon via /proc/sys/vm/cmm_pages. Another idea: Martin added an oom notifier to the cmm driver. Before the oom-killer kicks in cmm will try to free 256 pages. I think your virtio balloon driver should do the same - it seems to be the correct tradeoff. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] virtio_net and SMP guests
Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > > Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > > > To me this points to doing interrupt suppression a different way. If we > > > have a ->disable_cb() virtio function, and call it before we call > > > netif_rx_schedule, does that fix it? > > > > The fix looks good and I agree with it. > > > > There is one problem that I try to find for some days, but the following > > BUG_ON triggers: > > > > static void vring_disable_cb(struct virtqueue *_vq) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > > > START_USE(vq); > > > BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > END_USE(vq); > > } > > Ok, I found it: > > static int virtnet_open(struct net_device *dev) > { > struct virtnet_info *vi = netdev_priv(dev); > try_fill_recv(vi); > /* If we didn't even get one input buffer, we're useless. */ > if (vi->num == 0) > return -ENOMEM; > ---> int for new packet > static void skb_recv_done(struct > virtqueue *rvq) > { > struct virtnet_info *vi = > rvq->vdev->priv; > /* Suppress further interrupts. > */ > rvq->vq_ops->disable_cb(rvq); > netif_rx_schedule(vi->dev, > &vi->napi); > } > - poll is not yet possible, no softirq > - return from interrupt > napi_enable(&vi->napi); > vi->rvq->vq_ops->disable_cb(vi->rvq); > ---> BUG: its already disabled Btw. this problem also happens on single processor guests. What about the following patch: --- drivers/net/virtio_net.c |9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -179,9 +179,12 @@ static void try_fill_recv(struct virtnet static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; - /* Suppress further interrupts. */ - rvq->vq_ops->disable_cb(rvq); - netif_rx_schedule(vi->dev, &vi->napi); + /* Schedule NAPI, Suppress further interrupts if successful. */ + + if (netif_rx_schedule_prep(vi->dev, &vi->napi)) { + rvq->vq_ops->disable_cb(rvq); + __netif_rx_schedule(vi->dev, &vi->napi); + } } static int virtnet_poll(struct napi_struct *napi, int budget) - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] virtio_net and SMP guests
Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > > To me this points to doing interrupt suppression a different way. If we > > have a ->disable_cb() virtio function, and call it before we call > > netif_rx_schedule, does that fix it? > > The fix looks good and I agree with it. > > There is one problem that I try to find for some days, but the following > BUG_ON triggers: > > static void vring_disable_cb(struct virtqueue *_vq) > { > struct vring_virtqueue *vq = to_vvq(_vq); > > START_USE(vq); > > BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > END_USE(vq); > } Ok, I found it: static int virtnet_open(struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); try_fill_recv(vi); /* If we didn't even get one input buffer, we're useless. */ if (vi->num == 0) return -ENOMEM; ---> int for new packet static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; /* Suppress further interrupts. */ rvq->vq_ops->disable_cb(rvq); netif_rx_schedule(vi->dev, &vi->napi); } - poll is not yet possible, no softirq - return from interrupt napi_enable(&vi->napi); vi->rvq->vq_ops->disable_cb(vi->rvq); ---> BUG: its already disabled Christian - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] virtio_net and SMP guests
Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > To me this points to doing interrupt suppression a different way. If we > have a ->disable_cb() virtio function, and call it before we call > netif_rx_schedule, does that fix it? The fix looks good and I agree with it. There is one problem that I try to find for some days, but the following BUG_ON triggers: static void vring_disable_cb(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); START_USE(vq); > BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; END_USE(vq); } The funny thing is, that this bit is correct during probe but changes later before open. It seems that there is still something fishy on s390 and virtio. I looked several times at the virtio_ring code and it really seems to be ok. Any ideas? Christian the oops that I cannot explain - kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:232! illegal operation: 0001 [#1] Modules linked in: CPU:0Not tainted Process ip (pid: 1583, task: 0eee7038, ksp: 0ec4beb8) Krnl PSW : 070430018000 0045dcd4 (vring_disable_cb+0x30/0x34) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3 Krnl GPRS: 03ff0002 0001 10001800 0045dca4 0f054800 00595a80 00897768 03ff1002 0f09c920 0ec4bbf8 0f09c900 00596b20 0045c9c6 0ec4bbf8 Krnl Code: 0045dcc8: e3b0b074 lg %r11,112(%r11) 0045dcce: 07fe bcr 15,%r14 0045dcd0: a7f40001 brc 15,45dcd2 >0045dcd4: a7f4fff6 brc 15,45dcc0 0045dcd8: eb7ff0500024 stmg%r7,%r15,80(%r15) 0045dcde: a7f13e00 tmll%r15,15872 0045dce2: b90400ef lgr %r14,%r15 0045dce6: a7840001 brc 8,45dce8 Call Trace: ([<0045c95e>] virtnet_open+0x32/0x114) [<0048cbac>] dev_open+0xb0/0xe8 [<0048b6a2>] dev_change_flags+0x156/0x1cc [<004ead02>] devinet_ioctl+0x5ae/0x728 [<004eb564>] inet_ioctl+0xa4/0xf0 [<004798dc>] sock_ioctl+0x90/0x2e4 [<001b8bd2>] do_ioctl+0x4a/0xd4 [<001b8cda>] vfs_ioctl+0x7e/0x3c8 [<001b90b6>] sys_ioctl+0x92/0xa4 [<00112e7c>] sysc_noemu+0x10/0x16 [<0212c7e6>] 0x212c7e6 - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC 0/2] CPU hotplug virtio driver
Am Mittwoch, 9. Januar 2008 schrieb Glauber de Oliveira Costa: > I'm sending a first draft of my proposed cpu hotplug driver for kvm/virtio > The first patch is the kernel module, while the second, the userspace pci device. I personally prefer to use non paravirtualized cpu hotplug and implement the existing interfaces instead in the hypervisor. (sigp/sclp for s390 and maybe acpi for x86 etc.). This is not a performance critical operation. Christian - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] virtio_net and SMP guests
Rusty, Anthony, Dor, I need your brain power :-) On smp guests I have seen a problem with virtio (the version in curent Avi's git) which do not occur on single processor guests: kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228! illegal operation: 0001 [#1] Modules linked in: ipv6 CPU:2Not tainted Process swapper (pid: 0, task: 0f83e038, ksp: 0f877d70) Krnl PSW : 070400018000 0045df2a (vring_restart+0x5a/0x70) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 Krnl GPRS: c0a80101 0eb35000 10005800 0045ded0 192f 0eb21000 0eb21000 000e 0eb21900 0eb21920 0f867cb8 07d9b058 0010 0045c06a 0f867cb8 Krnl Code: 0045df1e: e3b0b074 lg %r11,112(%r11) 0045df24: 07fe bcr 15,%r14 0045df26: a7f40001 brc 15,45df28 >0045df2a: a7f4ffe1 brc 15,45deec 0045df2e: e3102034 lg %r1,48(%r2) 0045df34: a748 lhi %r4,0 0045df38: 96011001 oi 1(%r1),1 0045df3c: a7f4ffef brc 15,45df1a Call Trace: ([<0045c016>] virtnet_poll+0x96/0x42c) [<0048cda2>] net_rx_action+0xca/0x150 [<00137f7a>] __do_softirq+0x9e/0x130 [<001105d6>] do_softirq+0xae/0xb4 [<00138182>] irq_exit+0x96/0x9c [<0010d710>] do_extint+0xcc/0xf8 [<001135d0>] ext_no_vtime+0x16/0x1a [<0010a57e>] cpu_idle+0x216/0x238 I think there is a valid code path, triggering this bug: CPU1CPU2 --- --- - virtnet_poll found no more packets on queue - netif_rx_complete allow poll to be called - vq_ops->restart is called - vq Interrupts are enabled . . - interrupt is delivered . - poll is called . - poll work is done . - netif_rx_complete . - vq_ops->restart is called . - check if vq interrupts are . enable --> BUG The first idea was to remove this check? (See patch below). I am not sure if the proper fix also requires to change vring.avail->flags to be only changed by atomic bitops. Any ideas, comments? Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> CC: Anthony Liguori <[EMAIL PROTECTED]> CC: Dor Laor <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]> --- drivers/virtio/virtio_ring.c |2 -- 1 file changed, 2 deletions(-) Index: kvm/drivers/virtio/virtio_ring.c === --- kvm.orig/drivers/virtio/virtio_ring.c +++ kvm/drivers/virtio/virtio_ring.c @@ -225,8 +225,6 @@ static bool vring_restart(struct virtque struct vring_virtqueue *vq = to_vvq(_vq); START_USE(vq); - BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)); - /* We optimistically turn back on interrupts, then check if there was * more to do. */ vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT; - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Am Donnerstag, 13. Dezember 2007 schrieb Dor Laor: > You're right I got confused somehow. > So in that case setting the driver status field on open in addition to > your enable will do the trick. > On DRIVER_OPEN the host will trigger an interrupt if the queue is not > empty.. > Thanks, > Dor After looking into some other drivers, I prefer my first patch (moving napi_enable) ;-) There are some drivers like xen-netfront, b44, which call napi_enable before the buffers are passed to the hardware. So it seems that moving napi is also a valid option. But maybe I can just wait until Rusty returns from vacation (I will leave next week) so everything might be wonderful when I return ;-) Rusty, if you decide to apply my patch, there is one downside: The debugging code in virtio_ring sometimes triggers with a false positive: try_fill_recv calls vring_kick. Here we do a notify to the host. This might cause an immediate interrupt, triggering the poll routine before vring_kick can run END_USE. This is no real problem, as we dont touch the vq after notify. Christian, facing 64 guest cpus unconvering all kind of races - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Am Mittwoch, 12. Dezember 2007 schrieb Dor Laor: > I think the change below handles the race. Otherwise please detail the > use case. [...] > > @@ -292,6 +292,9 @@ static int virtnet_open(struct net_devic > > return -ENOMEM; > > > > napi_enable(&vi->napi); > > + > > + vi->rvq->vq_ops->enable(vi->rvq); > > + vi->svq->vq_ops->enable(vi->svq); > > > If you change it to: > if (!vi->rvq->vq_ops->enable(vi->rvq)) > vi->rvq->vq_ops->kick(vi->rvq); > if (!vi->rvq->vq_ops->enable(vi->svq)) > vi->rvq->vq_ops->kick(vi->svq); > > You solve the race of packets already waiting in the queue without > triggering the irq. Hmm, I dont fully understand your point. I think this will work as long as the host has not consumed all inbound buffers. It will also require that the host sends an additional packet, no? If no additional packet comes the host has no reason to send an interrupt just because it got a notify hypercall. kick inside a guest also does not trigger the poll routine. It also wont work on the following scenario: in virtnet open we will allocate buffers and send them to the host using the kick callback. The host can now use _all_ buffers for incoming data while interrupts are still disabled and the guest is not running.( Lets say the host bridge has lots of multicast traffic and the guest gets not scheduled for a while). When the guest now continues and enables the interrupts nothing happens. Doing a kick does not help, as the host code will bail out with "no dma memory for transfer". Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [virtio-net][PATCH] Don't arm tx hrtimer with a constant 500us each transmit
Am Mittwoch, 12. Dezember 2007 schrieb Dor Laor: > Christian Borntraeger wrote: > > > > Am Mittwoch, 12. Dezember 2007 schrieb Dor Laor: > > > --- a/drivers/net/virtio_net.c > > > +++ b/drivers/net/virtio_net.c > > > @@ -406,10 +405,10 @@ again: > > > > Hmm, while I agree in general with the patch, I fail to find the proper > > version of virtio_net where this patch applies. I tried kvm.git and > > linux-2.6.git from kernel.org. Can you give me a pointer to the repository > > where you work on virtio? > > > Sorry for that, I added some debug prints of my one. > Here it is: *git clone > git*://kvm.*qumranet*.com/home/*dor*/src/linux-2.6-nv use branch 'virtio'. Ah, ok. I will look into that branch. > BTW: what git repository do you use? I use Avis git from kernel.org: git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [virtio-net][PATCH] Don't arm tx hrtimer with a constant 500us each transmit
Am Mittwoch, 12. Dezember 2007 schrieb Dor Laor: > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -406,10 +405,10 @@ again: Hmm, while I agree in general with the patch, I fail to find the proper version of virtio_net where this patch applies. I tried kvm.git and linux-2.6.git from kernel.org. Can you give me a pointer to the repository where you work on virtio? Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Am Mittwoch, 12. Dezember 2007 schrieb Rusty Russell: > On Wednesday 12 December 2007 00:16:12 Christian Borntraeger wrote: > > That would also work. We already have VRING_AVAIL_F_NO_INTERRUPT in > > virtio_ring.c - maybe we can use that. Its hidden in callback and > > restart handling, what about adding an explicit startup? > > Yes, I debated whether to make this a separate hook or not; the current > method reduces the number of function calls without having two ways of > disabling callbacks. > > In this case, simply starting devices with callbacks disabled and > renaming 'restart' to 'enable' (or something) and calling it at the > beginning is probably sufficient? So you suggest something like the following patch? It seems to work but there is still a theoretical race at startup. Therefore, I tend to agree with Dor that a separate hook seems prefereable, so I am not fully sure if the patch is the final solution: ps: Its ok to answer that after your vacation. --- drivers/block/virtio_blk.c |3 ++- drivers/net/virtio_net.c |5 - drivers/virtio/virtio_ring.c |9 - include/linux/virtio.h |4 ++-- 4 files changed, 12 insertions(+), 9 deletions(-) Index: kvm/drivers/virtio/virtio_ring.c === --- kvm.orig/drivers/virtio/virtio_ring.c +++ kvm/drivers/virtio/virtio_ring.c @@ -220,7 +220,7 @@ static void *vring_get_buf(struct virtqu return ret; } -static bool vring_restart(struct virtqueue *_vq) +static bool vring_enable(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); @@ -264,7 +264,7 @@ static struct virtqueue_ops vring_vq_ops .add_buf = vring_add_buf, .get_buf = vring_get_buf, .kick = vring_kick, - .restart = vring_restart, + .enable = vring_enable, .shutdown = vring_shutdown, }; @@ -299,9 +299,8 @@ struct virtqueue *vring_new_virtqueue(un vq->in_use = false; #endif - /* No callback? Tell other side not to bother us. */ - if (!callback) - vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + /* disable interrupts until we enable them */ + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; /* Put everything in free lists. */ vq->num_free = num; Index: kvm/include/linux/virtio.h === --- kvm.orig/include/linux/virtio.h +++ kvm/include/linux/virtio.h @@ -41,7 +41,7 @@ struct virtqueue * vq: the struct virtqueue we're talking about. * len: the length written into the buffer * Returns NULL or the "data" token handed to add_buf. - * @restart: restart callbacks after callback returned false. + * @enable: restart callbacks after callback returned false. * vq: the struct virtqueue we're talking about. * This returns "false" (and doesn't re-enable) if there are pending * buffers in the queue, to avoid a race. @@ -65,7 +65,7 @@ struct virtqueue_ops { void *(*get_buf)(struct virtqueue *vq, unsigned int *len); - bool (*restart)(struct virtqueue *vq); + bool (*enable)(struct virtqueue *vq); void (*shutdown)(struct virtqueue *vq); }; Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -201,7 +201,7 @@ again: /* Out of packets? */ if (received < budget) { netif_rx_complete(vi->dev, napi); - if (unlikely(!vi->rvq->vq_ops->restart(vi->rvq)) + if (unlikely(!vi->rvq->vq_ops->enable(vi->rvq)) && netif_rx_reschedule(vi->dev, napi)) goto again; } @@ -292,6 +292,9 @@ static int virtnet_open(struct net_devic return -ENOMEM; napi_enable(&vi->napi); + + vi->rvq->vq_ops->enable(vi->rvq); + vi->svq->vq_ops->enable(vi->svq); return 0; } Index: kvm/drivers/block/virtio_blk.c === --- kvm.orig/drivers/block/virtio_blk.c +++ kvm/drivers/block/virtio_blk.c @@ -183,7 +183,8 @@ static int virtblk_probe(struct virtio_d err = PTR_ERR(vblk->vq); goto out_free_vblk; } - + /* enable interrupts */ + vblk->vq->vq_ops->enable(vblk->vq); vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req)); if (!vblk->pool) { err = -ENOMEM; - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Am Mittwoch, 12. Dezember 2007 schrieb Dor Laor: > This is why initially I suggested another status code in order to split > the ring logic with driver status. > > but also not filling any buffers as long as VIRTIO_CONFIG_DEV_OPEN is not > > set. I will have a look but I think that add_status needs to be called > > > It can fill the buffers even without dev_open, when the dev is finally > opened the host will issue an interrupt if there are pending buffers. There is a problem associated with that scheme. Setting the value in the config space does not notify the hypervisor. That means the host will not send an interrupt. The interrupt is sent if the host tries to send the next packet. If somehow the host manages to use up all buffers before the device is finally opened, we have a problem. > (I'm not sure it's worth solving, maybe just drop them like you suggested). As said above, dropping seems to me the preferred method. Did I miss something? Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtio_net: remove double ether_setup
Am Mittwoch, 12. Dezember 2007 schrieb Rusty Russell: > Can you send straight to akpm or davem? I'm supposed to be on vacation at this is a small fix for virtio_net. virtnet_probe already calls alloc_etherdev, which calls ether_setup. There is no need to do that again. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Acked-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |1 - 1 file changed, 1 deletion(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -329,11 +329,10 @@ static int virtnet_probe(struct virtio_d dev = alloc_etherdev(sizeof(struct virtnet_info)); if (!dev) return -ENOMEM; /* Set up network device as normal. */ - ether_setup(dev); dev->open = virtnet_open; dev->stop = virtnet_close; dev->hard_start_xmit = start_xmit; dev->features = NETIF_F_HIGHDMA; SET_NETDEV_DEV(dev, &vdev->dev); - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Am Dienstag, 11. Dezember 2007 schrieb Christian Borntraeger: > > The way other physical NICs doing it is by dis/en/abling interrupt > > using registers (look at e1000). > > I suggest we can export add_status and use the original code but > > before enabling napi add a call to add_status(dev, > > VIRTIO_CONFIG_DEV_OPEN). > > The host won't trigger an irq until it sees the above. > > That would also work. We already have VRING_AVAIL_F_NO_INTERRUPT in > virtio_ring.c - maybe we can use that. Its hidden in callback and > restart handling, what about adding an explicit startup? Ok, just to give an example what I thought about: --- drivers/block/virtio_blk.c |3 ++- drivers/net/virtio_net.c |2 ++ drivers/virtio/virtio_ring.c | 16 +--- include/linux/virtio.h |5 + 4 files changed, 22 insertions(+), 4 deletions(-) Index: kvm/drivers/virtio/virtio_ring.c === --- kvm.orig/drivers/virtio/virtio_ring.c +++ kvm/drivers/virtio/virtio_ring.c @@ -241,6 +241,16 @@ static bool vring_restart(struct virtque return true; } +static void vring_startup(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + START_USE(vq); + if (vq->vq.callback) + vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT; + END_USE(vq); +} + + irqreturn_t vring_interrupt(int irq, void *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); @@ -265,6 +275,7 @@ static struct virtqueue_ops vring_vq_ops .get_buf = vring_get_buf, .kick = vring_kick, .restart = vring_restart, + .startup = vring_startup, .shutdown = vring_shutdown, }; @@ -299,9 +310,8 @@ struct virtqueue *vring_new_virtqueue(un vq->in_use = false; #endif - /* No callback? Tell other side not to bother us. */ - if (!callback) - vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + /* disable interrupts until we enable them */ + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; /* Put everything in free lists. */ vq->num_free = num; Index: kvm/include/linux/virtio.h === --- kvm.orig/include/linux/virtio.h +++ kvm/include/linux/virtio.h @@ -45,6 +45,9 @@ struct virtqueue * vq: the struct virtqueue we're talking about. * This returns "false" (and doesn't re-enable) if there are pending * buffers in the queue, to avoid a race. + * @startup: enable callbacks + * vq: the struct virtqueue we're talking abount + * Returns 0 or an error * @shutdown: "unadd" all buffers. * vq: the struct virtqueue we're talking about. * Remove everything from the queue. @@ -67,6 +70,8 @@ struct virtqueue_ops { bool (*restart)(struct virtqueue *vq); + void (*startup) (struct virtqueue *vq); + void (*shutdown)(struct virtqueue *vq); }; Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -292,6 +292,8 @@ static int virtnet_open(struct net_devic return -ENOMEM; napi_enable(&vi->napi); + + vi->rvq->vq_ops->startup(vi->rvq); return 0; } Index: kvm/drivers/block/virtio_blk.c === --- kvm.orig/drivers/block/virtio_blk.c +++ kvm/drivers/block/virtio_blk.c @@ -183,7 +183,8 @@ static int virtblk_probe(struct virtio_d err = PTR_ERR(vblk->vq); goto out_free_vblk; } - + /* enable interrupts */ + vblk->vq->vq_ops->startup(vblk->vq); vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req)); if (!vblk->pool) { err = -ENOMEM; There is still one small problem: what if the host fills up all host-to-guest buffers before we call startup? So I start to think that your solution is better, given that the host is not only not sending interrupts but also not filling any buffers as long as VIRTIO_CONFIG_DEV_OPEN is not set. I will have a look but I think that add_status needs to be called after napi_enable, otherwise we have the same race. Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC] Proposed new directory layout for kvm and virtualization
Am Dienstag, 11. Dezember 2007 schrieb Avi Kivity: > KVM is due to receive support for multiple architectures (ppc, ia64, and > s390, in addition to the existing x86), hopefully in time for the 2.6.25 > merge window. It is awkward to place the new arch support in > drivers/kvm/, so I'd like to propose the following new layout: > > virt/ top-level directory for hypervisors > virt/kvm/ kvm common code > virt/lguest/ the other hypervisor > arch/*/kvm/ arch dependent kvm code > include/linux/kvm.h arch independent interface > include/asm/kvm.h arch dependent interface > include/linux/kvm_para.h arch independent guest interface > include/asm/kvm_para.harch dependent guest interface > > The include/ hierarchy is already in place; I'm including it for > completeness. For completeness, where do we want to put the drivers? drivers/*, drivers/net/* etc. or drivers/kvm/* drivers/xen/* etc. Christian? - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
2nd try. I somehow enable html on the last post Dor Laor wrote: > Christian Borntraeger wrote: >> >> Hello Rusty, >> >> while implementing and testing virtio on s390 I found a problem in >> virtio_net: The current virtio_net driver has a startup race, which >> prevents any incoming traffic: >> >> If try_fill_recv submits buffers to the host system data might be >> filled in and an interrupt is sent, before napi_enable finishes. >> In that case the interrupt will kick skb_recv_done which will then >> call netif_rx_schedule. netif_rx_schedule checks, if NAPI_STATE_SCHED >> is set - which is not as we did not run napi_enable. No poll routine >> is scheduled. Furthermore, skb_recv_done returns false, we disables >> interrupts for this device. >> >> One solution is the enable napi before inbound buffer are available. >> > But then you might get recv interrupt without a buffer. If I look at the current implementation (lguest) no interrupt is sent if there is not buffer available, no? On the other hand, if the host does send an interrupt when no buffer is available, this is also no problem. Looking at virtnet_poll, it seems it can cope with an empty ring. > The way other physical NICs doing it is by dis/en/abling interrupt > using registers (look at e1000). > I suggest we can export add_status and use the original code but > before enabling napi add a call to add_status(dev, > VIRTIO_CONFIG_DEV_OPEN). > The host won't trigger an irq until it sees the above. That would also work. We already have VRING_AVAIL_F_NO_INTERRUPT in virtio_ring.c - maybe we can use that. Its hidden in callback and restart handling, what about adding an explicit startup? > > BTW: Rusty is on vacation and that's probably the reason he didn't > respond. > Regards, > Dor. Ok, didnt know that. At the moment I can live with my private patch while we work on a final solution. Meanwhile I will try to debug virtio on SMP guests - I still see some strange races on our test system. (But block and net is now working on s390 and can cope with medium load. ) Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound trafficon early packets
Dor Laor wrote: > Christian Borntraeger wrote: >> >> Hello Rusty, >> >> while implementing and testing virtio on s390 I found a problem in >> virtio_net: The current virtio_net driver has a startup race, which >> prevents any incoming traffic: >> >> If try_fill_recv submits buffers to the host system data might be >> filled in and an interrupt is sent, before napi_enable finishes. >> In that case the interrupt will kick skb_recv_done which will then >> call netif_rx_schedule. netif_rx_schedule checks, if NAPI_STATE_SCHED >> is set - which is not as we did not run napi_enable. No poll routine >> is scheduled. Furthermore, skb_recv_done returns false, we disables >> interrupts for this device. >> >> One solution is the enable napi before inbound buffer are available. >> > But then you might get recv interrupt without a buffer. If I look at the current implementation (lguest) no interrupt is sent if there is not buffer available, no? On the other hand, if the host does send an interrupt when no buffer is available, this is also no problem. Looking at virtnet_poll, it seems it can cope with an empty ring. > The way other physical NICs doing it is by dis/en/abling interrupt > using registers (look at e1000). > I suggest we can export add_status and use the original code but > before enabling napi add a call to add_status(dev, > VIRTIO_CONFIG_DEV_OPEN). > The host won't trigger an irq until it sees the above. That would also work. We already have VRING_AVAIL_F_NO_INTERRUPT in virtio_ring.c - maybe we can use that. Its hidden in callback and restart handling, what about adding an explicit startup? > > BTW: Rusty is on vacation and that's probably the reason he didn't > respond. > Regards, > Dor. Ok, didnt know that. At the moment I can live with my private patch while we work on a final solution. Meanwhile I will try to debug virtio on SMP guests - I still see some strange races on our test system. (But block and net is now working on s390 and can cope with medium load. ) Christian - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH resent] virtio_net: Fix stalled inbound traffic on early packets
Hello Rusty, while implementing and testing virtio on s390 I found a problem in virtio_net: The current virtio_net driver has a startup race, which prevents any incoming traffic: If try_fill_recv submits buffers to the host system data might be filled in and an interrupt is sent, before napi_enable finishes. In that case the interrupt will kick skb_recv_done which will then call netif_rx_schedule. netif_rx_schedule checks, if NAPI_STATE_SCHED is set - which is not as we did not run napi_enable. No poll routine is scheduled. Furthermore, skb_recv_done returns false, we disables interrupts for this device. One solution is the enable napi before inbound buffer are available. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -285,13 +285,15 @@ static int virtnet_open(struct net_devic { struct virtnet_info *vi = netdev_priv(dev); + napi_enable(&vi->napi); try_fill_recv(vi); /* If we didn't even get one input buffer, we're useless. */ - if (vi->num == 0) + if (vi->num == 0) { + napi_disable(&vi->napi); return -ENOMEM; + } - napi_enable(&vi->napi); return 0; } - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH resend] virtio_net: remove double ether_setup
Hello Rusty, this is a small fix for virtio_net. virtnet_probe already calls alloc_etherdev, which calls ether_setup. There is no need to do that again. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |1 - 1 file changed, 1 deletion(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -329,11 +329,10 @@ static int virtnet_probe(struct virtio_d dev = alloc_etherdev(sizeof(struct virtnet_info)); if (!dev) return -ENOMEM; /* Set up network device as normal. */ - ether_setup(dev); dev->open = virtnet_open; dev->stop = virtnet_close; dev->hard_start_xmit = start_xmit; dev->features = NETIF_F_HIGHDMA; SET_NETDEV_DEV(dev, &vdev->dev); - SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] virtnet: remove double ether_setup
Hello Rusty, virtnet_probe already calls alloc_etherdev, which calls ether_setup. There is no need to do that again. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/net/virtio_net.c |1 - 1 file changed, 1 deletion(-) Index: kvm/drivers/net/virtio_net.c === --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -329,11 +329,10 @@ static int virtnet_probe(struct virtio_d dev = alloc_etherdev(sizeof(struct virtnet_info)); if (!dev) return -ENOMEM; /* Set up network device as normal. */ - ether_setup(dev); dev->open = virtnet_open; dev->stop = virtnet_close; dev->hard_start_xmit = start_xmit; dev->features = NETIF_F_HIGHDMA; SET_NETDEV_DEV(dev, &vdev->dev); - SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Add the arg "module" for kvm_arch_init
Am Freitag, 23. November 2007 schrieb Zhang, Xiantao: > > From: Zhang Xiantao <[EMAIL PROTECTED]> > Date: Fri, 23 Nov 2007 07:28:35 +0800 > Subject: [PATCH] Add the arg module for kvm_arch_into > Add the arg module for kvm_arch_into, since some archs may > need module info. > Signed-off-by: Zhang Xiantao <[EMAIL PROTECTED]> Can you explain what you try to achieve? Would the macro THIS_MODULE be sufficient as well? Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Am Freitag, 9. November 2007 schrieb Dor Laor: > I believe that the network interface will quickly go to the kernel since > copy takes most of the > cpu time and qemu does not support scatter gather dma at the moment. > Nevertheless using pio seems good enough, Anthony's suggestion of > notifying the kernel using ioctls > is logical. If we'll run into troubles further on we can add a hypercall > capability and if exist use hypercalls > instead of pios. Sorry for being late in this thread. We (s390) will need a hypercall as we do not have port I/O. I think it should be possible to default to hypercall on s390 and use pio everywhere else. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH v2] kvm: Fix accounting of interrupts during guest execution on s390
Am Donnerstag, 15. November 2007 schrieb Laurent Vivier: > To be annoying, it should be clearer to write this like: > > if ( (p->flags & PF_VCPU) && > !(hardirq_count() - hardirq_offset) && > !softirq_count() ) > { > account_guest_time(p, cputime); > return; > } [...] > But I agree with your patch. > > Laurent Yes, that would have the same result. I think its a matter of taste, if we want to have a not-so-obvious one line if, or a multi-line-if. I dont mind which version use, here is your variant as an alternative: Ingo, you decide :-) From: Christian Borntraeger <[EMAIL PROTECTED]> From: Laurent Vivier <[EMAIL PROTECTED]> Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- kernel/sched.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: kvm/kernel/sched.c === --- kvm.orig/kernel/sched.c +++ kvm/kernel/sched.c @@ -3395,7 +3395,9 @@ void account_system_time(struct task_str struct rq *rq = this_rq(); cputime64_t tmp; - if (p->flags & PF_VCPU) { + if ( (p->flags & PF_VCPU) && +!(hardirq_count() - hardirq_offset) && +!softirq_count() ) { account_guest_time(p, cputime); return; } - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH v2] kvm: Fix accounting of interrupts during guest execution on s390
Am Donnerstag, 15. November 2007 schrieb Laurent Vivier: > If I remember correctly time accounting on s390 is more accurate than on > x86 ? Yes. its done during context switches and resolution is microsecond. > Because on x86, as we make the kvm_guest_exit() after local_irq_enable() > we can also have IRQ with PF_VCPU set... and we discussed a lot on > probability to know if it was good or not. And on x86 it seems good > because it is already working like that with system and user time (we > account time to the space where tick appears). > see http://lkml.org/lkml/2007/10/15/228 I am not sure I fully understand your point, can you try to explain? My patch deals with timer ticks (see hardirq_offset). So if a only a tick comes in after local_irq_enable the time is accounted to guest time as before. I made a test on x86_64 with burnP6 inside a kvm machine. top showed 100% guest time on an otherwise idle host. So the guest accounting itself did still work. I tried some disk and network stress on the host but I did not manage to bring hardirq+softirq time above 5%, guest time stayed above 90%. At least my simple testcase did not show a bias towards irq time. > Le jeudi 15 novembre 2007 à 15:10 +0100, Christian Borntraeger a écrit : > > Avi pointed out, that my first patch was broken, here is the 2nd try. > > I tested the patch on s390 with CONFIG_VIRT_CPU_ACCOUNTING and on x86_64. > > Seems to work. > > > > Currently the scheduler checks for PF_VCPU to decide if this > > timeslice has to be accounted as guest time. On s390 host > > interrupts are not disabled during guest execution. This causes > > theses interrupts to be accounted as guest time if > > CONFIG_VIRT_CPU_ACCOUNTING is set. > > Solution is to check if an interrupt triggered account_system_time. > > As the tick is timer interrupt based, we have to subtract > > hardirq_offset. > > > > Avi, Ingo, Laurent, feedback is welcome. > > > > CC: Ingo Molnar <[EMAIL PROTECTED]> > > CC: Avi Kivity <[EMAIL PROTECTED]> > > CC: Laurent Vivier <[EMAIL PROTECTED]> > > Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> > > --- > > kernel/sched.c |6 ++ > > 1 file changed, 2 insertions(+), 4 deletions(-) > > > > Index: kvm/kernel/sched.c > > === > > --- kvm.orig/kernel/sched.c > > +++ kvm/kernel/sched.c > > @@ -3395,10 +3395,8 @@ void account_system_time(struct task_str > > struct rq *rq = this_rq(); > > cputime64_t tmp; > > > > - if (p->flags & PF_VCPU) { > > - account_guest_time(p, cputime); > > - return; > > - } > > + if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0)) > > + return account_guest_time(p, cputime); > > > > p->stime = cputime_add(p->stime, cputime); - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH v2] kvm: Fix accounting of interrupts during guest execution on s390
Am Donnerstag, 15. November 2007 schrieb Avi Kivity: > > + if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0)) [...] > What's irq_count()? Is it defined to ignore the timer tick? Or is that > hardirq_offset? hardirq_offset is used to ignore the timer tick. irq_count is defined in hardirq.h: #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK)) So irq_count() - hardirq_offset is not zero if a hardirq or a softirq is running. By subtracting hardirq_offset (the timer tick sets it to HARDIRQ_OFFSET) it also makes sure, that the timer tick does not count as hardirq. This seems to do the trick. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH v2] kvm: Fix accounting of interrupts during guest execution on s390
Avi pointed out, that my first patch was broken, here is the 2nd try. I tested the patch on s390 with CONFIG_VIRT_CPU_ACCOUNTING and on x86_64. Seems to work. Currently the scheduler checks for PF_VCPU to decide if this timeslice has to be accounted as guest time. On s390 host interrupts are not disabled during guest execution. This causes theses interrupts to be accounted as guest time if CONFIG_VIRT_CPU_ACCOUNTING is set. Solution is to check if an interrupt triggered account_system_time. As the tick is timer interrupt based, we have to subtract hardirq_offset. Avi, Ingo, Laurent, feedback is welcome. CC: Ingo Molnar <[EMAIL PROTECTED]> CC: Avi Kivity <[EMAIL PROTECTED]> CC: Laurent Vivier <[EMAIL PROTECTED]> Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- kernel/sched.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) Index: kvm/kernel/sched.c === --- kvm.orig/kernel/sched.c +++ kvm/kernel/sched.c @@ -3395,10 +3395,8 @@ void account_system_time(struct task_str struct rq *rq = this_rq(); cputime64_t tmp; - if (p->flags & PF_VCPU) { - account_guest_time(p, cputime); - return; - } + if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0)) + return account_guest_time(p, cputime); p->stime = cputime_add(p->stime, cputime); - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm: Fix accounting of interrupts during guest execution on s390
Am Mittwoch, 14. November 2007 schrieb Avi Kivity: > But isn't account_system_time() called via the timer tick interrupt, and > never directly from kvm? So in_interrupt() would always be true. Ouch, yes. Please dont use that patch, I will make a new one. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] kvm: Fix accounting of interrupts during guest execution on s390
Am Mittwoch, 14. November 2007 schrieb Ingo Molnar: > > * Christian Borntraeger <[EMAIL PROTECTED]> wrote: > > > Avi, Ingo, Laurent, > > > > what do you think about the following patch? > > fine to me. I guess this should mostly be a NOP to KVM, right? Yes, currently kvm on x86 has interrupts disabled during vmx and svm. So in_interrupt should be always false on x86. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] kvm: Fix accounting of interrupts during guest execution on s390
Avi, Ingo, Laurent, what do you think about the following patch? Currently the scheduler checks for PF_VCPU to decide if this timeslice has to be accounted as guest time. On s390 host interrupts are not disabled during guest execution. This causes these interrupts to be accounted as guest time. Solution is to check for in_interrupt to let interrupt time account as soft or hardirq. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- kernel/sched.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: kvm/kernel/sched.c === --- kvm.orig/kernel/sched.c +++ kvm/kernel/sched.c @@ -3395,7 +3395,7 @@ void account_system_time(struct task_str struct rq *rq = this_rq(); cputime64_t tmp; - if (p->flags & PF_VCPU) { + if ((p->flags & PF_VCPU) && !in_interrupt()) { account_guest_time(p, cputime); return; } - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] A question about virtio and KVM
Am Dienstag, 6. November 2007 schrieb Dor Laor: > The is my latest (old) branch: > > -Kernel repo: git://kvm.qumranet.com/home/dor/src/virtio/kvm > -Userspace repo: git://kvm.qumranet.com/home/dor/src/virtio/kvm-userspace > Use virt-final branch. Yes I had a look at these. This code is virtio draft 4 based, right? > It is not yet updated to latest virtio & latest kvm. > This version uses a pci device for virtio. Since I'm think Rusty's > config space does not matches > pci config space I might consider changing it. (Preferably not, maybe > I'll start using it until we'll have the windows > implementation). > I'll start work on it on Sunday. I currently try to use Rusty's config space as we dont have a PCI bus. Carsten told me that HPA suggested to split CONFIG_PCI in a way to use the hardware independent data structures without providing the hardware. Carsten do you still remember the full details? > Can you also publish your tree? Currently I have no way to publish something via a web page. I could sent you patches or the code of our prototype code. Currently it has nothing to do with KVM but we try to move more and more into kvm direction. As the old code was virtio draft 4 based and I just started to work on it again I currently rebased everything and try to make it work again. Basically I currently tried to reuse as much as possible from the lguest code. Chrisian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Am Montag, 5. November 2007 schrieb Arnd Bergmann: > Read again what I wrote above. I'm suggesting to have just one external > interrupt for virtio and use the generic IRQ abstraction to handle > everything that comes below that. So you basically suggest to implement wrapper code around extint and lowcore memory to be able to use request_irq/free_irq? > > Plus I don't see a benefit from pretending to have an interrupt > > controller: virtio abstracts from this, and can well be implemented > > over extint and hypercall like Christian has done it. What's the > > problem you're trying to solve? > > Sorry, I can't find Christian's code right now, do you have a pointer > to the patches? The code was only used for our prototype hypervisor. I never posted these virtio patches as Rusty was quicker in changing virtio than I was able to re-add them to our prototype code. ;-) > I suspect that he has done exactly what I was trying to explain, except > that the implementation is not using the generic IRQ layer, which means > you're duplicating some of the code. I used one external interrupt and I reserved an area in lowcore for a 64bit extint parameter. (I use the same address as z/VM for the PFAULT token). I defined a hypercall in which the guest could specify this 64bit value for a given virtqueue. That allowed me to get the virtqueue pointer without looking it up in the list of (maybe many) virtqueues. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] A question about virtio and KVM
Am Donnerstag, 1. November 2007 schrieb Dor Laor: > Hi Ian, > The intention is there, I just didn't have the time to work on it. > I can promise [there is no way to backoff now] that it will be > working/merge-able state by December 1st. > Dor. Dor, that is good news. I also work on virtio and currently try to wire it to our prototype code to have something that works until our s390 kvm port is ready. Unfortunately Rusty is quite fast in changing virtio. I guess you noticed that as well ;-) I have to implement some s390 specific solutions, for example we dont have ioremap and, therefore, I have to implement the counterpart to lguest_map differently in our architecture header files. would it be possible to have your git tree up-to-date? I could then implement the s390 backend in a way that follows your implementation - and I could also send patches etc. if I see something that needs to be fixed. I really dont mind if the git tree doesnt compile or eats my disk content. Thanks Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] severe bug in 2.6.23+ kvm.git
Am Freitag, 19. Oktober 2007 schrieb Luca Tettamanti: > linus-git has at least one bug with SG chaining, but usually it just > hangs the machine. Patch is here: > > http://lkml.org/lkml/2007/10/17/269 Looks promising. I pulled this fix by pulling the latest Linus-git into the kvm.git. I also enabled some debug options in the kernel hacking section. This resulting kernel seems to be stable so far. We will see in the next days if the problem is really gone. Thanks to all for your ideas. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] severe bug in 2.6.23+ kvm.git
Am Freitag, 19. Oktober 2007 schrieb Laurent Vivier: > Did you patch kvm.git with patch-2.6.23.1.bz2 or did you download > linux-2.6.23.1.tar.bz2 ? > > 2.6.23.1 corrects nothing except sata_mv... Yes I know. The question we tried to answer was: is the bug in 2.6.23 or was it introduced after 2.6.23, as kvm.git already contains lots of post 2. 6.23 stuff. Current state is: kvm.git with tag 2.6.23-rc6 works for days without a problem. 2.6.23.1 vanilla has survived and is currently still under test. kvm.git tag master killed our filesystem at least three times.since monday. I will continue to bang on 2.6.23.1 to see if its really fine. After that, maybe I will try to bisect on kvm.git, but this will take quite a long time, given that we had to reinstall the system due to this error. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] severe bug in 2.6.23+ kvm.git
Am Freitag, 19. Oktober 2007 schrieb Jan Engelhardt: > > On Oct 19 2007 15:44, Carsten Otte wrote: > > Carsten Otte wrote: > >> First thing we do, is figure whether or not 2.6.23.1 as released breaks our > >> system too. This way, we can either focus on differences between Linus and > >> Avi, or turn on the big red warning sign saying "regression". > > > > Looks like 2.6.23.1 works fine on that box. We'll leave it running over > > the weekend with "while true; do make; make clean; done". > > Well, do you happen to use sata_mv? no, we have nvidia, so its sata_nv. - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] severe bug in 2.6.23+ kvm.git
Am Freitag, 19. Oktober 2007 schrieb Laurent Vivier: > Carsten Otte a écrit : > > Laurent Vivier wrote: > >> How do you know the problem has been introduced by kvm ? > > I don't. In fact I think it has not been introduced by kvm. All I > > stated, is that we experienced the problem when running the kvm.git > > kernel after the 2.6.23 update that has not been present in the > > kvm.git -rc8 as of last thursday. > > Perhaps 2.6.23.1 corrects this ? > > http://lkml.org/lkml/2007/10/12/302 No, we dont have an marvel chipset. kvm:~# lspci 00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a4) 00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev b1) 00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2) 00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2) 00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a4) 00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f3) 00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3) 00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3) 00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2) 00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map 00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control 00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration 00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map 00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control 01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)] 01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE] 02:00.0 PCI bridge: Intel Corporation 6702PXH PCI Express-to-PCI Bridge A (rev 09) 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21) 05:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21) Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] Fix guest time accounting going faster than user time accounting
Seems I overlooked this type while reviewing Laurents patch. cputime_add already adds, dont do it twice. Avi. This should go to Linus before 2.6.24. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- fs/proc/array.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6.23/fs/proc/array.c === --- linux-2.6.23.orig/fs/proc/array.c +++ linux-2.6.23/fs/proc/array.c @@ -446,7 +446,7 @@ static int do_task_stat(struct task_stru maj_flt += sig->maj_flt; utime = cputime_add(utime, sig->utime); stime = cputime_add(stime, sig->stime); - gtime += cputime_add(gtime, sig->gtime); + gtime = cputime_add(gtime, sig->gtime); } sid = signal_session(sig); - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] Use virtual cpu accounting if available for guest times.
Avi, ppc and s390 offer the possibility to track process times precisely by looking at cpu timer on every context switch, irq, softirq etc. We can use that infrastructure as well for guest time accounting. We need to account the used time before we change the state. This patch adds a call to account_system_vtime to kvm_guest_enter and kvm_guest exit. If CONFIG_VIRT_CPU_ACCOUNTING is not set, account_system_vtime is defined in hardirq.h as an empty function, which means this patch does not change the behaviour on other platforms. I compile tested this patch on x86 and function tested the patch on s390. Avi, please apply. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/kvm/kvm.h |3 +++ 1 file changed, 3 insertions(+) Index: kvm/drivers/kvm/kvm.h === --- kvm.orig/drivers/kvm/kvm.h +++ kvm/drivers/kvm/kvm.h @@ -7,6 +7,7 @@ */ #include +#include #include #include #include @@ -669,11 +670,13 @@ __init void kvm_arch_init(void); static inline void kvm_guest_enter(void) { + account_system_vtime(current); current->flags |= PF_VCPU; } static inline void kvm_guest_exit(void) { + account_system_vtime(current); current->flags &= ~PF_VCPU; } - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC] Per-architecture hypercall definitions
> > +/* Return values for hypercalls */ > > +#define KVM_ENOSYS 1000 > > > > errno's can probably be in common code, right? Yes, they can. Will move. > I originally had one of these in my paravirt_ops implementation but > quickly found that the code it produced was pretty ugly. I think: > > kvm_hypercall0(KVM_HYPERCALL_FOO); > > Is a lot more understandable than: > > kvm_hypercall(0, KVM_HYPERCALL_FOO); > > It's much harder to guess with the later what the significance of the > first parameter is. I agree. It seems I can simply remove the kvm_hypercall macro. Here is the updated patch (without the s390 parts. Will send them later when ready) and the changes mentioned above. Avi, Anthony, comments? [PATCH/RFC] Per-architecture hypercall definitions Currently kvm provides hypercalls only for x86* architectures. To provide hypercall infrastructure for other kvm architectures I split kvm_para.h into a generic header file and architecture specific definitions. Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]> Index: kvm/include/asm-x86/kvm_para.h === --- /dev/null 1970-01-01 00:00:00.0 + +++ kvm/include/asm-x86/kvm_para.h 2007-10-11 15:41:24.0 +0200 @@ -0,0 +1,105 @@ +#ifndef __X86_KVM_PARA_H +#define __X86_KVM_PARA_H + +/* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx. It + * should be used to determine that a VM is running under KVM. + */ +#define KVM_CPUID_SIGNATURE0x4000 + +/* This CPUID returns a feature bitmap in eax. Before enabling a particular + * paravirtualization, the appropriate feature bit should be checked. + */ +#define KVM_CPUID_FEATURES 0x4001 + +#ifdef __KERNEL__ +#include + +/* This instruction is vmcall. On non-VT architectures, it will generate a + * trap that we will then rewrite to the appropriate instruction. + */ +#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1" + +/* For KVM hypercalls, a three-byte sequence of either the vmrun or the vmmrun + * instruction. The hypervisor may replace it with something else but only the + * instructions are guaranteed to be supported. + * + * Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively. + * The hypercall number should be placed in rax and the return value will be + * placed in rax. No other registers will be clobbered unless explicited + * noted by the particular hypercall. + */ + +static inline long kvm_hypercall0(unsigned int nr) +{ + long ret; + asm volatile(KVM_HYPERCALL +: "=a"(ret) +: "a"(nr)); + return ret; +} + +static inline long kvm_hypercall1(unsigned int nr, unsigned long p1) +{ + long ret; + asm volatile(KVM_HYPERCALL +: "=a"(ret) +: "a"(nr), "b"(p1)); + return ret; +} + +static inline long kvm_hypercall2(unsigned int nr, unsigned long p1, + unsigned long p2) +{ + long ret; + asm volatile(KVM_HYPERCALL +: "=a"(ret) +: "a"(nr), "b"(p1), "c"(p2)); + return ret; +} + +static inline long kvm_hypercall3(unsigned int nr, unsigned long p1, + unsigned long p2, unsigned long p3) +{ + long ret; + asm volatile(KVM_HYPERCALL +: "=a"(ret) +: "a"(nr), "b"(p1), "c"(p2), "d"(p3)); + return ret; +} + +static inline long kvm_hypercall4(unsigned int nr, unsigned long p1, + unsigned long p2, unsigned long p3, + unsigned long p4) +{ + long ret; + asm volatile(KVM_HYPERCALL +: "=a"(ret) +: "a"(nr), "b"(p1), "c"(p2), "d"(p3), "S"(p4)); + return ret; +} + +static inline int kvm_para_available(void) +{ + unsigned int eax, ebx, ecx, edx; + char signature[13]; + + cpuid(KVM_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx); + memcpy(signature + 0, &ebx, 4); + memcpy(signature + 4, &ecx, 4); + memcpy(signature + 8, &edx, 4); + signature[12] = 0; + + if (strcmp(signature, "KVMKVMKVM") == 0) + return 1; + + return 0; +} + +static inline unsigned int kvm_arch_para_features(void) +{ + return cpuid_eax(KVM_CPUID_FEATURES); +} + +#endif + +#endif Index: kvm/include/linux/kvm_para.h === --- kvm.orig/include/linux/kvm_para.h 2007-10-11 13:28:46.0 +0200 +++ kvm/include/linux/kvm_para.h2007-10-11 15:41
[kvm-devel] [PATCH/RFC] Per-architecture hypercall definitions
:06:47.0 +0200 @@ -1,110 +1,36 @@ #ifndef __LINUX_KVM_PARA_H #define __LINUX_KVM_PARA_H -/* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx. It - * should be used to determine that a VM is running under KVM. +/* + * This header file provides a method for making a hypercall to the host + * Architectures should define: + * - kvm_hypercall0, kvm_hypercall1... + * - kvm_arch_para_features + * - kvm_para_available */ -#define KVM_CPUID_SIGNATURE0x4000 - -/* This CPUID returns a feature bitmap in eax. Before enabling a particular - * paravirtualization, the appropriate feature bit should be checked. - */ -#define KVM_CPUID_FEATURES 0x4001 - -/* Return values for hypercalls */ -#define KVM_ENOSYS 1000 #ifdef __KERNEL__ -#include - -/* This instruction is vmcall. On non-VT architectures, it will generate a - * trap that we will then rewrite to the appropriate instruction. +/* + * hypercalls use architecture specific */ -#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1" - -/* For KVM hypercalls, a three-byte sequence of either the vmrun or the vmmrun - * instruction. The hypervisor may replace it with something else but only the - * instructions are guaranteed to be supported. - * - * Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively. - * The hypercall number should be placed in rax and the return value will be - * placed in rax. No other registers will be clobbered unless explicited - * noted by the particular hypercall. - */ - -static inline long kvm_hypercall0(unsigned int nr) -{ - long ret; - asm volatile(KVM_HYPERCALL -: "=a"(ret) -: "a"(nr)); - return ret; -} - -static inline long kvm_hypercall1(unsigned int nr, unsigned long p1) -{ - long ret; - asm volatile(KVM_HYPERCALL -: "=a"(ret) -: "a"(nr), "b"(p1)); - return ret; -} - -static inline long kvm_hypercall2(unsigned int nr, unsigned long p1, - unsigned long p2) -{ - long ret; - asm volatile(KVM_HYPERCALL -: "=a"(ret) -: "a"(nr), "b"(p1), "c"(p2)); - return ret; -} - -static inline long kvm_hypercall3(unsigned int nr, unsigned long p1, - unsigned long p2, unsigned long p3) -{ - long ret; - asm volatile(KVM_HYPERCALL -: "=a"(ret) -: "a"(nr), "b"(p1), "c"(p2), "d"(p3)); - return ret; -} - -static inline long kvm_hypercall4(unsigned int nr, unsigned long p1, - unsigned long p2, unsigned long p3, - unsigned long p4) -{ - long ret; - asm volatile(KVM_HYPERCALL -: "=a"(ret) -: "a"(nr), "b"(p1), "c"(p2), "d"(p3), "S"(p4)); - return ret; -} - -static inline int kvm_para_available(void) -{ - unsigned int eax, ebx, ecx, edx; - char signature[13]; - - cpuid(KVM_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx); - memcpy(signature + 0, &ebx, 4); - memcpy(signature + 4, &ecx, 4); - memcpy(signature + 8, &edx, 4); - signature[12] = 0; - - if (strcmp(signature, "KVMKVMKVM") == 0) - return 1; - - return 0; -} +#include static inline int kvm_para_has_feature(unsigned int feature) { - if (cpuid_eax(KVM_CPUID_FEATURES) & (1UL << feature)) + if (kvm_arch_para_features() & (1UL << feature)) return 1; return 0; } -#endif +#define kvm_hypercall(nr_params, args...) \ +({ \ + long __ret; \ + \ + __ret = kvm_hypercall##nr_params(args); \ + \ + __ret; \ +}) + +#endif #endif Index: kvm/include/asm-s390/kvm_para.h === --- /dev/null 1970-01-01 00:00:00.0 + +++ kvm/include/asm-s390/kvm_para.h 2007-10-11 14:09:12.0 +0200 @@ -0,0 +1,146 @@ +#ifndef __S390_KVM_PARA_H +#define __S390_KVM_PARA_H + +/* + * This is subject to change: + * + * Hypercalls for KVM on s390. The calling convention is similar to the + * s390 ABI, so we use R2-R6 for parameters 1-5. In addition we use R1 + * as hypercall number and R7 as parameter 6. The return value is + * written to R2. We use the diagnose instruction as hypercall. To avoid + * conflicts with existing diagnoses for LP
Re: [kvm-devel] [PATCH 3/6] virtio net driver
Am Freitag, 21. September 2007 schrieb Herbert Xu: > Please don't use LLTX in new drivers. We're trying to get rid > of it since it's > > 1) unnecessary; > 2) causes problems with AF_PACKET seeing things twice. Ok, but then I cannot reuse the xmit lock in an interrupt handler. Otherwise deadlock can occur because hard_start_xmit has interrupts enabled. Rusty, seems we need an additional lock. Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 3/6] virtio net driver
Am Freitag, 21. September 2007 schrieb Rusty Russell: > Can't we just re-use the default xmit lock? yes we can. This patch is untested but is basically an refresh of an ealier version. I currently have no code testable with the latest virtio version from this mail thread, so if you could apply this (It still has the ndev name) and test the patch? Thanks Christian --- Use the sent interrupt for xmit reclaim. --- drivers/net/virtio_net.c | 43 ++- 1 file changed, 26 insertions(+), 17 deletions(-) Index: virtio/drivers/net/virtio_net.c === --- virtio.orig/drivers/net/virtio_net.c +++ virtio/drivers/net/virtio_net.c @@ -52,10 +52,29 @@ static inline void vnet_hdr_to_sg(struct sg_init_one(sg, skb_vnet_hdr(skb), sizeof(struct virtio_net_hdr)); } +static void free_old_xmit_skbs(struct net_device *dev) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct sk_buff *skb; + unsigned int len; + + netif_tx_lock(dev); + while ((skb = vi->vq_ops->get_buf(vi->vq_send, &len)) != NULL) { + pr_debug("Sent skb %p\n", skb); + __skb_unlink(skb, &vi->send); + dev->stats.tx_bytes += len; + dev->stats.tx_packets++; + dev_kfree_skb_irq(skb); + } + netif_tx_unlock(dev); +} + static bool skb_xmit_done(void *ndev) { /* In case we were waiting for output buffers. */ netif_wake_queue(ndev); + /* reclaim sent skbs */ + fre_old_xmit_skbs(ndev); return true; } @@ -209,20 +228,6 @@ again: return 0; } -static void free_old_xmit_skbs(struct virtnet_info *vi) -{ - struct sk_buff *skb; - unsigned int len; - - while ((skb = vi->vq_ops->get_buf(vi->vq_send, &len)) != NULL) { - pr_debug("Sent skb %p\n", skb); - __skb_unlink(skb, &vi->send); - vi->ndev->stats.tx_bytes += len; - vi->ndev->stats.tx_packets++; - kfree_skb(skb); - } -} - static int start_xmit(struct sk_buff *skb, struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -235,8 +240,6 @@ static int start_xmit(struct sk_buff *sk dev->name, skb, dest[0], dest[1], dest[2], dest[3], dest[4], dest[5]); - free_old_xmit_skbs(vi); - /* Encode metadata header at front. */ hdr = skb_vnet_hdr(skb); if (skb->ip_summed == CHECKSUM_PARTIAL) { @@ -267,15 +270,21 @@ static int start_xmit(struct sk_buff *sk vnet_hdr_to_sg(sg, skb); num = skb_to_sgvec(skb, sg+1, 0, skb->len) + 1; + local_irq_disable(); + netif_tx_lock(dev); __skb_queue_head(&vi->send, skb); err = vi->vq_ops->add_buf(vi->vq_send, sg, num, 0, skb); if (err) { pr_debug("%s: virtio not prepared to send\n", dev->name); skb_unlink(skb, &vi->send); netif_stop_queue(dev); + netif_tx_unlock(dev); + local_irq_enable(); return NETDEV_TX_BUSY; } vi->vq_ops->kick(vi->vq_send); + netif_tx_unlock(dev); + local_irq_enable(); return 0; } @@ -335,7 +344,7 @@ static void *virtnet_probe(struct device dev->poll = virtnet_poll; dev->hard_start_xmit = start_xmit; dev->weight = 16; - dev->features = NETIF_F_HIGHDMA; + dev->features = NETIF_F_HIGHDMA | NETIF_F_LLTX; SET_NETDEV_DEV(dev, device); /* Do we support "hardware" checksums? */ - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 3/6] virtio net driver
Am Donnerstag, 20. September 2007 schrieb Rusty Russell: > The network driver uses *two* virtqueues: one for input packets and > one for output packets. This has nice locking properties (ie. we > don't do any for recv vs send). [...] > 3) Resolve freeing of old xmit skbs (someone sent patch IIRC?) Yes, that was me. I am quite busy at the moment but I will send a refreshed patch soon. The most annoying fact of my current patch is, that I have to add locking. (Because of the only one operation per virtqueue at a time rule) [...] > +struct virtnet_info > +{ > + struct virtqueue_ops *vq_ops; > + struct virtqueue *vq_recv; > + struct virtqueue *vq_send; > + struct net_device *ndev; This is only a matter of taste, but I like netdev or dev more than ndev. [...] Everything else looks sane. 20-minutes-code-review-by: Christian Borntraeger <[EMAIL PROTECTED]> Christian - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] bug in virtio network driver?
Am Mittwoch, 29. August 2007 schrieb Rusty Russell: > On Tue, 2007-08-21 at 18:02 +0200, Christian Borntraeger wrote: > > Am Dienstag, 21. August 2007 schrieb Rusty Russell: > > > The only reason that we don't do it in skb_xmit_done() is because > > > kfree_skb() isn't supposed to be called from an interrupt. But there's > > > dev_kfree_skb_any() which can be used. > > > > Ok, I now hacked something that works but I really dont like the > > local_irq_disable bits. I sent this patch nevertheless, but I will look into > > Arnds suggestion. > > Hi Christian, > > What's the status of this? Should I apply this patch, or do you want > me to wait for a ->poll patch as per Arnd's suggestion? I looked at Arnds suggestions. It seems that even with a cleanup in the poll function, we have no guarantee that the outstanding skb is reclaimed because there might be no incoming packet. Other drivers (like spider_net) solve this by using an additional tx_timer. I start to think that my latest patch is actually not that bad: lets do the reclaim when the host tells us its done. Arnd, do you have an opinion about that? Christian for reference: --- drivers/net/virtio_net.c | 47 ++- 1 file changed, 26 insertions(+), 21 deletions(-) Index: linux-2.6.22/drivers/net/virtio_net.c === --- linux-2.6.22.orig/drivers/net/virtio_net.c +++ linux-2.6.22/drivers/net/virtio_net.c @@ -53,12 +52,31 @@ static void vnet_hdr_to_sg(struct scatte sg->length = sizeof(struct virtio_net_hdr); } +static void free_old_xmit_skbs(struct virtnet_info *vi) +{ + struct sk_buff *skb; + unsigned int len; + + netif_tx_lock(vi->ndev); + while ((skb = vi->vq_send->ops->get_buf(vi->vq_send, &len)) != NULL) { + /* They cannot have written to the packet. */ + BUG_ON(len != 0); + pr_debug("Sent skb %p\n", skb); + __skb_unlink(skb, &vi->send); + vi->ndev->stats.tx_bytes += skb->len; + vi->ndev->stats.tx_packets++; + dev_kfree_skb_irq(skb); + } + netif_tx_unlock(vi->ndev); +} + static bool skb_xmit_done(struct virtqueue *vq) { struct virtnet_info *vi = vq->priv; /* In case we were waiting for output buffers. */ netif_wake_queue(vi->ndev); + free_old_xmit_skbs(vi); return true; } @@ -214,22 +232,6 @@ again: return 0; } -static void free_old_xmit_skbs(struct virtnet_info *vi) -{ - struct sk_buff *skb; - unsigned int len; - - while ((skb = vi->vq_send->ops->get_buf(vi->vq_send, &len)) != NULL) { - /* They cannot have written to the packet. */ - BUG_ON(len != 0); - pr_debug("Sent skb %p\n", skb); - __skb_unlink(skb, &vi->send); - vi->ndev->stats.tx_bytes += skb->len; - vi->ndev->stats.tx_packets++; - kfree_skb(skb); - } -} - static int start_xmit(struct sk_buff *skb, struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -238,12 +240,12 @@ static int start_xmit(struct sk_buff *sk struct virtio_net_hdr *hdr; const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest; + local_irq_disable(); + netif_tx_lock(vi->ndev); pr_debug("%s: xmit %p %02x:%02x:%02x:%02x:%02x:%02x\n", dev->name, skb, dest[0], dest[1], dest[2], dest[3], dest[4], dest[5]); - free_old_xmit_skbs(vi); - /* Encode metadata header at front. */ hdr = skb_vnet_hdr(skb); if (skb->ip_summed == CHECKSUM_PARTIAL) { @@ -280,10 +282,13 @@ static int start_xmit(struct sk_buff *sk pr_debug("%s: virtio not prepared to send\n", dev->name); skb_unlink(skb, &vi->send); netif_stop_queue(dev); + netif_tx_unlock(vi->ndev); + local_irq_enable(); return NETDEV_TX_BUSY; } vi->vq_send->ops->sync(vi->vq_send); - + netif_tx_unlock(vi->ndev); + local_irq_enable(); return 0; } @@ -343,7 +348,7 @@ struct net_device *virtnet_probe(struct dev->poll = virtnet_poll; dev->hard_start_xmit = start_xmit; dev->weight = 16; - dev->features = features; + dev->features = features | NETIF_F_LLTX; SET_NETDEV_DEV(dev, device); vi = netdev_priv(dev); - This SF.net email is sponsored by: Splunk Inc. Still grepp
Re: [kvm-devel] [PATCH 1/4] [HYPERCALL] Add hypercalls functions
Am Samstag, 25. August 2007 schrieb Dor Laor: > +static inline int > +__hypercall2(unsigned int nr, unsigned long p1, unsigned long p2) > +{ > + int ret; > + asm (" call hypercall_addr\n" [...] > + return ret; Hello Dor, Linux system calls return long. I think hypercalls should behave in a similar manner and return long as well, no? Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 1/4] [HYPERCALL] Add hypercalls functions
Am Samstag, 25. August 2007 schrieb Dor Laor: > + asm (" call hypercall_addr\n" Hi Dor, This cannot work, because hypercall_addr is currently not defined in todays kvm.git: # grep -R hypercall_addr * drivers/kvm/kvm.h:unsigned char *hypercall_addr); # IIRC there was a definition of hypercall_addr in the older paravirt drivers from the rt-kernel. It seems it did not make it into this patch set. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC][PATCH 5/6][KVM VIRTIO] Add the network device code
Am Samstag, 25. August 2007 schrieb Dor Laor: > + * Permission is hereby granted, free of charge, to any person > obtaining a copy Hi Dor, some of your patch lines seem line-wrapped. Is there place where I can download the whole series? Thanks Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 00/10] PV-IO v3
Am Mittwoch, 22. August 2007 schrieb Dor Laor: > Actually while playing with virtio for kvm Avi saw that and recommended > to do the following: > struct desc_pages > { > /* Page of descriptors. */ > union { > struct virtio_desc desc[NUM_DESCS]; > char pad1[PAGE_SIZE]; > }; [...] Fine with me. Is anybody currently working on descriptor based transport for virtio for kvm? Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 00/10] PV-IO v3
Am Mittwoch, 22. August 2007 schrieb Rusty Russell: > +struct desc_pages > +{ > + /* Page of descriptors. */ > + struct lguest_desc desc[NUM_DESCS]; > + > + /* Next page: how we tell other side what buffers are available. */ > + unsigned int avail_idx; > + unsigned int available[NUM_DESCS]; > + char pad[PAGE_SIZE - (NUM_DESCS+1) * sizeof(unsigned int)]; > + > + /* Third page: how other side tells us what's used. */ > + unsigned int used_idx; > + struct lguest_used used[NUM_DESCS]; > +}; Please consider to add this patch to make this data structure work on 64 bit to make the second page, really page aligned. On 32 bit this should be a no-op. Signed-Off-by: Christian Borntraeger <[EMAIL PROTECTED]> --- drivers/lguest/lguest_virtio.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Index: lguest/drivers/lguest/lguest_virtio.c === --- lguest.orig/drivers/lguest/lguest_virtio.c +++ lguest/drivers/lguest/lguest_virtio.c @@ -33,11 +33,12 @@ struct desc_pages { /* Page of descriptors. */ struct lguest_desc desc[NUM_DESCS]; + char pad0[PAGE_SIZE - NUM_DESCS * sizeof(struct lguest_desc)]; /* Next page: how we tell other side what buffers are available. */ unsigned int avail_idx; unsigned int available[NUM_DESCS]; - char pad[PAGE_SIZE - (NUM_DESCS+1) * sizeof(unsigned int)]; + char pad1[PAGE_SIZE - (NUM_DESCS+1) * sizeof(unsigned int)]; /* Third page: how other side tells us what's used. */ unsigned int used_idx; -- IBM Deutschland Entwicklung GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Herbert Kircher Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] bug in virtio network driver?
Am Dienstag, 21. August 2007 schrieb Rusty Russell: > The only reason that we don't do it in skb_xmit_done() is because > kfree_skb() isn't supposed to be called from an interrupt. But there's > dev_kfree_skb_any() which can be used. Ok, I now hacked something that works but I really dont like the local_irq_disable bits. I sent this patch nevertheless, but I will look into Arnds suggestion. --- linux-2.6.22.orig/drivers/net/virtio_net.c +++ linux-2.6.22/drivers/net/virtio_net.c @@ -53,12 +52,31 @@ static void vnet_hdr_to_sg(struct scatte sg->length = sizeof(struct virtio_net_hdr); } +static void free_old_xmit_skbs(struct virtnet_info *vi) +{ + struct sk_buff *skb; + unsigned int len; + + netif_tx_lock(vi->ndev); + while ((skb = vi->vq_send->ops->get_buf(vi->vq_send, &len)) != NULL) { + /* They cannot have written to the packet. */ + BUG_ON(len != 0); + pr_debug("Sent skb %p\n", skb); + __skb_unlink(skb, &vi->send); + vi->ndev->stats.tx_bytes += skb->len; + vi->ndev->stats.tx_packets++; + dev_kfree_skb_irq(skb); + } + netif_tx_unlock(vi->ndev); +} + static bool skb_xmit_done(struct virtqueue *vq) { struct virtnet_info *vi = vq->priv; /* In case we were waiting for output buffers. */ netif_wake_queue(vi->ndev); + free_old_xmit_skbs(vi); return true; } @@ -214,22 +232,6 @@ again: return 0; } -static void free_old_xmit_skbs(struct virtnet_info *vi) -{ - struct sk_buff *skb; - unsigned int len; - - while ((skb = vi->vq_send->ops->get_buf(vi->vq_send, &len)) != NULL) { - /* They cannot have written to the packet. */ - BUG_ON(len != 0); - pr_debug("Sent skb %p\n", skb); - __skb_unlink(skb, &vi->send); - vi->ndev->stats.tx_bytes += skb->len; - vi->ndev->stats.tx_packets++; - kfree_skb(skb); - } -} - static int start_xmit(struct sk_buff *skb, struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -238,12 +240,12 @@ static int start_xmit(struct sk_buff *sk struct virtio_net_hdr *hdr; const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest; + local_irq_disable(); + netif_tx_lock(vi->ndev); pr_debug("%s: xmit %p %02x:%02x:%02x:%02x:%02x:%02x\n", dev->name, skb, dest[0], dest[1], dest[2], dest[3], dest[4], dest[5]); - free_old_xmit_skbs(vi); - /* Encode metadata header at front. */ hdr = skb_vnet_hdr(skb); if (skb->ip_summed == CHECKSUM_PARTIAL) { @@ -280,10 +282,13 @@ static int start_xmit(struct sk_buff *sk pr_debug("%s: virtio not prepared to send\n", dev->name); skb_unlink(skb, &vi->send); netif_stop_queue(dev); + netif_tx_unlock(vi->ndev); + local_irq_enable(); return NETDEV_TX_BUSY; } vi->vq_send->ops->sync(vi->vq_send); - + netif_tx_unlock(vi->ndev); + local_irq_enable(); return 0; } @@ -343,7 +348,7 @@ struct net_device *virtnet_probe(struct dev->poll = virtnet_poll; dev->hard_start_xmit = start_xmit; dev->weight = 16; - dev->features = features; + dev->features = features | NETIF_F_LLTX; SET_NETDEV_DEV(dev, device); vi = netdev_priv(dev); - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] bug in virtio network driver?
Hello Rusty, I think I have found a problem in the virtio network driver. virtio_net reclaims sent skbs on xmit. That means that there is always one skb outstanding and the netdev packet statistic is always one packet to low. Documentation/networking/drivers.txt says 3) Do not forget that once you return 0 from your hard_start_xmit method, it is your driver's responsibility to free up the SKB and in some finite amount of time. For example, this means that it is not allowed for your TX mitigation scheme to let TX packets "hang out" in the TX ring unreclaimed forever if no new TX packets are sent. This error can deadlock sockets waiting for send buffer room to be freed up. One solution would be to use the xmit_done interrupt. Unfortunately this would require additional locking as multiple interrupts can happen at two or more cpus. Do you have any better ideas? Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] Réf. : Re: [PATCH 0/4] Vir tual Machine Time Accounting
Am Montag, 20. August 2007 schrieb Glauber de Oliveira Costa: > Although I don't know KVM to a that deep level, I think it should be > possible to keep the virtual cpus in different process (or threads), > and take the accounting time from there. Perfectly possible to know > the time we spent running (user time), and the time the hypervisor > spent doing things on our behalf (system time). I disagree here. First thing, you dont want to have the virtual cpu in a different process than the hypervisor control code for that cpu. Otherwise communication has to be made via IPC. Secondly, Its not qemu/kvm that does the accouting. Its existing userspace code like top/snmp agents and clients! etc. that would require additional knowledge which thread is guest code. I personally like the approach Laurent has taken. Maybe it needs some polish and maybe we want an account_guest_time function, but in general I think he is doing the right thing. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 2/4] Introduce a new fields "gtime" and "cgtime" in task_struct and signal_struct
Am Montag, 20. August 2007 schrieb Laurent Vivier: > Index: kvm/fs/proc/array.c > === > --- kvm.orig/fs/proc/array.c2007-08-20 11:11:30.0 +0200 > +++ kvm/fs/proc/array.c 2007-08-20 13:04:03.0 +0200 Just a heads up, this patch collides with this fix in mm: http://marc.info/?l=linux-mm-commits&m=118737949222362&w=2 If Ingo accepts this fix, your patch should be adopted in array.c to use cputime_t for gtime as well. Lets see what Ingo thinks. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce "accou nt modifiers" mechanism
Am Freitag, 17. August 2007 schrieb Laurent Vivier: > > The normal user/system accounting has the same issue, no? Whereever we > > happen to land (kernel or user) gets the whole tick. > > Yes... but perhaps I should rewrite this too ;-) If you look further, you will see, that this was actually rewritten in 2.6.12 and thats why we have cputime_t. The infrastructure is currently only used by s390 and ppc64. On s390 we use our cpu timer to get the current time on each syscall/irq/context switch etc to get precise accounting data for system/user/irq/softirq/nice. We also get steal time (this is interesting for the guest: how much of my cpu was actually not available because the hypervisor scheduled me away). I dont know enough about other architectures to say which one could exploit this infrastructure as well. The current git tree is somewhat broken for CONFIG_VIRT_CPU_ACCOUTING due to the accouting changes introduced by CFS - we work on this. If you are interested in the cputime stuff, you can have a look at arch/s390/kernel/vtime.c (account_system_vtime, account_vtime) as well as: http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=6be7071fdd321c206b1ee7a3e534782f25572830 for the first introduction and http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=db59f519d37f80683f3611927f6b5c3bdfc0f9e6 for the s390 exploitation. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 0/2][KVM] guest time accounting
Am Montag, 13. August 2007 schrieb Laurent Vivier: > >> As guest accounting is hw dependent, I think we should add a hook in the > >> accounting functions. > >> > > > > Isn't PF_VM exactly such a hook? All the hypervisor needs to do is to > > set/unset it correctly? > > In fact, no. > > PF_VM is used to know we have entered a virtual CPU (the hypervisor set it, > the scheduler unset it on accounting) Why not do something like the following. (This patch does not work as it relies on the no-existing var cputime_since_last_update, but it shows the idea) --- drivers/kvm/kvm.h | 13 + drivers/kvm/svm.c |3 ++- drivers/kvm/vmx.c |3 ++- kernel/sched.c|1 - 4 files changed, 17 insertions(+), 3 deletions(-) Index: kvm/drivers/kvm/kvm.h === --- kvm.orig/drivers/kvm/kvm.h +++ kvm/drivers/kvm/kvm.h @@ -8,6 +8,7 @@ #include #include +#include #include #include #include @@ -726,6 +727,18 @@ static inline u32 get_rdx_init_val(void) return 0x600; /* P6 family */ } +static inline guest_enter(void) +{ + account_system_time(current, 0, cputime_since_last_update); + current->flags |= PF_VCPU; +} + +static inline guest_exit(void) +{ + current->flags &= ~PF_VCPU; + account_system_time(current, 0, cputime_since_last_update); +} + #define ASM_VMX_VMCLEAR_RAX ".byte 0x66, 0x0f, 0xc7, 0x30" #define ASM_VMX_VMLAUNCH ".byte 0x0f, 0x01, 0xc2" #define ASM_VMX_VMRESUME ".byte 0x0f, 0x01, 0xc3" Index: kvm/kernel/sched.c === --- kvm.orig/kernel/sched.c +++ kvm/kernel/sched.c @@ -3251,7 +3251,6 @@ void account_system_time(struct task_str p->utime = cputime_add(p->utime, cputime); cpustat->user = cputime64_add(cpustat->user, tmp); cpustat->guest = cputime64_add(cpustat->guest, tmp); - p->flags &= ~PF_VCPU; return; } Index: kvm/drivers/kvm/svm.c === --- kvm.orig/drivers/kvm/svm.c +++ kvm/drivers/kvm/svm.c @@ -1404,7 +1404,7 @@ again: clgi(); vcpu->guest_mode = 1; - current->flags |= PF_VCPU; + guest_enter(); if (vcpu->requests) if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests)) svm_flush_tlb(vcpu); @@ -1537,6 +1537,7 @@ again: #endif : "cc", "memory" ); + guest_exit(); vcpu->guest_mode = 0; if (vcpu->fpu_active) { Index: kvm/drivers/kvm/vmx.c === --- kvm.orig/drivers/kvm/vmx.c +++ kvm/drivers/kvm/vmx.c @@ -2078,7 +2078,7 @@ again: local_irq_disable(); vcpu->guest_mode = 1; - current->flags |= PF_VCPU; + guest_enter(); if (vcpu->requests) if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests)) vmx_flush_tlb(vcpu); @@ -2199,6 +2199,7 @@ again: [cr2]"i"(offsetof(struct kvm_vcpu, cr2)) : "cc", "memory" ); + guest_exit(); vcpu->guest_mode = 0; local_irq_enable(); - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 0/2][KVM] guest time accounting
Am Montag, 13. August 2007 schrieb Avi Kivity: > Christian Borntraeger wrote: > > Am Montag, 13. August 2007 schrieb Laurent Vivier: > > > >>> [copying Ingo and Rusty] > >>> > > > > @Avi, seems that sourceforge is mangling the cc list? > > > > > > It's not configured to do so. Can you be more specific? I added Carsten and he was removed from the CC list in my copy I got from SF. And you cced Ingo and Rusty but I dont have them on cc, either. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 0/2][KVM] guest time accounting
Am Montag, 13. August 2007 schrieb Avi Kivity: > Laurent's patch gives the best of both worlds: on old 'top', you get > guest time accounted as user time, while on new 'top' it is accounted > separately. This is done by reporting user time as the sum of the real > user time and guest time. A newer 'top' can subtract guest time from > user time to get the correct statistic. Yes that looks promising. If I recall correctly we had some strange top behaviours when we introduced the steal time. Old top added the steal time to idle. We should check that. > > > > My implementation uses a similar mechanism like hard and softirq. So I have an > > sie_enter an sie_exit and a task_is_in_sie function - like irq_enter and > > irq_exit. The main difference is based on the fact, that s390 has precise > > accouting for irq, steal, user and system time, and therefore my patch is > > based on architecture specifc code using CONFIG_VIRT_CPU_ACCOUNT. > > > > Okay, so the code should be under that config option, and kvm should > select it. No hurry..that was specific to our implementation, not KVM :-) Besided that, Ingo changed the accouting with his CFS scheduler, and I still have to figure out how CONFIG_VIRT_CPU_ACCOUNTING can be properly integrated in CFS. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 0/2][KVM] guest time accounting
Am Montag, 13. August 2007 schrieb Laurent Vivier: > > [copying Ingo and Rusty] @Avi, seems that sourceforge is mangling the cc list? > > > > The patches look good. A couple of comments: > > > > - perhaps the new fields should be guarded by a #ifdef CONFIG_HYPERVISOR > > (selected by CONFIG_KVM)? that way the (minor) additional overhead is > > only incurred if it can possibly be used. I imagine that our canine > > cousin will want to use this as well. > > There is also a CONFIG_VIRTUALIZATION and a CONFIG_VIRT_CPU_ACCOUNTING (from > s390 and powerpc) Which one to use ? CONFIG_VIRT_CPU_ACCOUNTING is used for the precise accouting of user,system, steal and irq time on these platforms and is not what you want for the on/off decision. > > I'm wondering if we can have a more accurate accounting: > > - For the moment we add all system time since the previous entering to the > VCPU to the guest time (and I guess there is some real system time in > it ???) > - Perhaps we can sum nanoseconds spent in the VCPU and add it to cpustat > when these ns are greater than 1 ms ? (I'm trying to make something in this way) If you look at the patch I have posted some minutes ago, I use a method similar to irq_enter and irq_exit to separate real system time from guest time. > > - I think that there is per-task accounting of user time and system > > time; that should be extended as well. > it should be easy to do too... We have to make sure that userspace doesnt break, but yes we should have a guest time for processes as well. Christian - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 0/2][KVM] guest time accounting
Am Freitag, 10. August 2007 schrieb Laurent Vivier: > The aim of these two patches is to measure the CPU time used by a virtual > machine. All comments are welcome... I'm not sure it's the good way to do that. I did something similar for or s390guest prototype, that Carsten posted in May. I decided to account guest time to the user process instead of adding a new field to avoid hazzle with old top. As you can read in the patch comment, I personally prefer a new field if we can get one. My implementation uses a similar mechanism like hard and softirq. So I have an sie_enter an sie_exit and a task_is_in_sie function - like irq_enter and irq_exit. The main difference is based on the fact, that s390 has precise accouting for irq, steal, user and system time, and therefore my patch is based on architecture specifc code using CONFIG_VIRT_CPU_ACCOUNT. In general my patch has the same idea as your patch, so I am going to review your patch and see if it would fit for s390. For reference this is the (never posted) old patch for our virtualisation prototype. It wont work with kvm but it gives you the idea what we had in mind on s390. --- snip old PATCH GPLv2 -- Subject: [PATCH/RFC] Fix system<->user misaccount of interpreted execution From: Christian Borntraeger <[EMAIL PROTECTED]> This patches fixes the accouting of guest cpu time. As sie is executed via a system call, all guest operations were accounted as system time. To fix this we define a per thread "sie context". Before issuing the sie instruction we enter this context and leave the context afterwards. sie_enter and sie_exit call account_system_vtime, which now checks for being in sie_context. We define the sie_context to be accounted as user time. Possible future enhancement: We could add an additional field: "interpretion time" to cpu stat and process time. Thus we could differentiate between user time in the host and host user time spent for guests. The main challenge is the necessary user space change. Therefore, we could export the interpretion time with a new interface. To be defined. Signed-off-By: Christian Borntraeger <[EMAIL PROTECTED]> Signed-off-By: Carsten Otte <[EMAIL PROTECTED]> --- arch/s390/Kconfig |1 + arch/s390/host/s390host.c | 15 +++ arch/s390/kernel/process.c |1 + arch/s390/kernel/vtime.c | 12 +++- include/asm-s390/thread_info.h |2 ++ 5 files changed, 30 insertions(+), 1 deletion(-) Index: linux-2.6.22/arch/s390/kernel/vtime.c === --- linux-2.6.22.orig/arch/s390/kernel/vtime.c +++ linux-2.6.22/arch/s390/kernel/vtime.c @@ -97,6 +97,11 @@ void account_vtime(struct task_struct *t account_system_time(tsk, 0, cputime); } +static inline int task_is_in_sie(struct thread_info *thread) +{ + return thread->in_sie; +} + /* * Update process times based on virtual cpu times stored by entry.S * to the lowcore fields user_timer, system_timer & steal_clock. @@ -114,7 +119,12 @@ void account_system_vtime(struct task_st cputime = S390_lowcore.system_timer >> 12; S390_lowcore.system_timer -= cputime << 12; S390_lowcore.steal_clock -= cputime << 12; - account_system_time(tsk, 0, cputime); + + if (task_is_in_sie(task_thread_info(tsk)) && !hardirq_count() && + !softirq_count()) + account_user_time(tsk, cputime); + else + account_system_time(tsk, 0, cputime); } static inline void set_vtimer(__u64 expires) Index: linux-2.6.22/arch/s390/host/s390host.c === --- linux-2.6.22.orig/arch/s390/host/s390host.c +++ linux-2.6.22/arch/s390/host/s390host.c @@ -27,6 +27,19 @@ static int s390host_do_action(unsigned l static DEFINE_MUTEX(s390host_init_mutex); +static void enter_sie(void) +{ + account_system_vtime(current); + current_thread_info()->in_sie = 1; +} + +static void exit_sie(void) +{ + account_system_vtime(current); + current_thread_info()->in_sie = 0; +} + + static void s390host_get_data(struct s390host_data *data) { atomic_inc(&data->count); @@ -297,7 +310,9 @@ again: schedule(); sie_kernel->sie_block.icptcode = 0; + enter_sie(); ret = sie64a(sie_kernel); + exit_sie(); if (ret) goto out; Index: linux-2.6.22/include/asm-s390/thread_info.h === --- linux-2.6.22.orig/include/asm-s390/thread_info.h +++ linux-2.6.22/include/asm-s390/thread_info.h @@ -55,6 +55,7 @@ struct thread_info { struct restart_blockrestart_block; struct s390host_data*s390host_data; /* s390hos
Re: [kvm-devel] [PATCH/RFC 7/9] Virtual network guest device driver
Anthony Liguori <[EMAIL PROTECTED]> wrote on 17.05.2007 16:22:23: > Is there anything that explains what the fields in sockaddr mean: > > sa_family_tsiucv_family; > unsigned shortsiucv_port;/* Reserved */ > unsigned intsiucv_addr;/* Reserved */ > charsiucv_nodeid[8];/* Reserved */ > charsiucv_user_id[8];/* Guest User Id */ > charsiucv_name[8];/* Application Name */ There is a small description in "Device Drivers, Features, and Commands SC33-8289-03" on page 211 (its page 235 if you use the pdf viewer page number) http://download.boulder.ibm.com/ibmdl/pub/software/dw/linux390/docu/l26cdd03.pdf (the file is 6.7 MB) More generic information about iucv can be found in in http://www-03.ibm.com/servers/eserver/zseries/zos/bkserv/zvmpdf/zvm52.html or to be precise: http://publibz.boulder.ibm.com/epubs/pdf/hcse5b11.pdf part 2. (11 MB) That said, AF_IUCV builds on top of iucv and therefore requires z/VM as hypervisor. I dont think that KVM should implement (af_)iucv. But (af_)iucv shows several aspects how to make things good and bad. (e.g. AF_IUCV as procotol on top of iucv was first defined in CMS several years ago and is, therefore, not very smp-friendly. On the other hand iucv itself offers modern features like scatter/gather). Back to the old question: If shared memory or socket is better - I dont know. z/VM has both, see dcss for its shared memory support. - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC 7/9] Virtual network guest device driver
> This is quite easy with KVM. I like the approach that vmchannel has > taken. A simple PCI device. That gives you a discovery mechanism for > shared memory and an interrupt and then you can just implement a ring > queue using those mechanisms (along with a PIO port for signaling from > the guest to the host). So given that underlying mechanism, the > question is how to expose that within the guest kernel/userspace and > within the host. Sorry for answering late, but I dont like PCI as a device bus for all platforms. s390 has no PCI and s390 has no PIO. I would prefer a new simple hypercall based virtual bus. I dont know much about windows driver programming, but I guess it it is not that hard to add a new bus. - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH/RFC 5/9] s390 virtual console for guests
On Monday 14 May 2007 18:23, Christian Bornträger wrote: > On Friday 11 May 2007 21:00, Anthony Liguori wrote: > > I think it would be better to use hvc_console as Xen now uses it too. > I just had a look at hvc_console, and indeed this driver looks appropriate As I started prototyping this frontend I realized that hvc_console requires some interfaces, which are not present on s390, e.g. we have no request_irq and free_irq. Dont know if hvc_console is still the right way to go for us. This needs more thinking. Christian - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] file-backed guest memory
On Thursday 12 April 2007 17:38, Avi Kivity wrote: > > But wouldn't it also be possible to create a file in > > userspace and pass its descriptor to kvm? [...] > swap, hugetlbfs, and maybe other nifty stuff. I think I know how to do > this for the current mmu, but I'm worried that it will have a > performance impact with the nested page tables mmu. Another thing to consider is the commit "mm: tracking shared dirty pages" (d08b3851da41d0ee60851f2c75b118e1f7a5fc89). We need a shared mapping to actually have guest memory written back to the file. The dirty page tracking change will cause the host kernel to write protect guest memory pages after writeback. This will often trigger guest exits due to host protection faults. If we want to have file backed guest memory we really should modify the VMA to be not eligible for dirty page tracking. Christian - Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ___ kvm-devel mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/kvm-devel