Re: [PATCH/RFC] kvm: fix refcounting race release vs. module unload
--- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode return 0; } -static const struct file_operations kvm_vcpu_fops = { +static struct file_operations kvm_vcpu_fops = { .release= kvm_vcpu_release, .unlocked_ioctl = kvm_vcpu_ioctl, .compat_ioctl = kvm_vcpu_ioctl, @@ -1318,6 +1318,7 @@ static int create_vcpu_fd(struct kvm_vcp int fd = anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, 0); if (fd 0) kvm_put_kvm(vcpu-kvm); + __module_get(kvm_vcpu_fops.owner); return fd; } @@ -2061,6 +2062,7 @@ int kvm_init(void *opaque, unsigned int } kvm_chardev_ops.owner = module; + kvm_vcpu_fops.owner = module; r = misc_register(kvm_dev); if (r) { Messing with module counts is slightly ugly. How about having a vm fd fget() the /dev/kvm fd() instead? I personally find fget (and fput) slightly more ugly than handling the module reference count. Especially if the problem is module unloading...the module refcount looks so natural. I am also a bit worried by fget/fput, since we would call fput in the release function - which is part of the module. Wouldnt that open another very small race? In addition, we would need variables containing the fd and the file pointer for /dev/kvm, since fget/fput need some parameters, no? (Is there an easy way to get the fd from the struct file *filp? Searching current-files, seems to be the only method I know) To me, the fget approach looks more complicated and less safe. Christian -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] kvm: fix refcounting race release vs. module unload
Christian Borntraeger wrote: The problem is that kvm_destroy_vm can run while the module count is 0. That means, you can remove the module while kvm_destroy_vm is running. But kvm_destroy_vm is part of the module text. This causes a kerneloops. The race exists without the msleep but is much harder to trigger. Nevertheless, the right solution is to call kvm_destroy_vm only with module_ref_count 0. This can be done by calling kvm_destroy_vm only via a release function, since the VFS will not allow module unload. This patch sets kvm_vcpu_fops.owner to the module and manually increases the module refcount after anon_inode_getfd, since anon_inode_getfd does not do it. Signed-off-by: Christian Borntraeger [EMAIL PROTECTED] --- virt/kvm/kvm_main.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: kvm/virt/kvm/kvm_main.c === --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode return 0; } -static const struct file_operations kvm_vcpu_fops = { +static struct file_operations kvm_vcpu_fops = { .release= kvm_vcpu_release, .unlocked_ioctl = kvm_vcpu_ioctl, .compat_ioctl = kvm_vcpu_ioctl, @@ -1318,6 +1318,7 @@ static int create_vcpu_fd(struct kvm_vcp int fd = anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, 0); if (fd 0) kvm_put_kvm(vcpu-kvm); + __module_get(kvm_vcpu_fops.owner); return fd; } @@ -2061,6 +2062,7 @@ int kvm_init(void *opaque, unsigned int } kvm_chardev_ops.owner = module; + kvm_vcpu_fops.owner = module; r = misc_register(kvm_dev); if (r) { Messing with module counts is slightly ugly. How about having a vm fd fget() the /dev/kvm fd() instead? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] kvm: fix refcounting race release vs. module unload
Hello Avi, while fixing a module reference counting problem for s390 I spotted another problem in the kvm main code. Any comments on the patch? From: Christian Borntraeger [EMAIL PROTECTED] There is a race between a close of the file descriptors and module unload in the kvm module. You can easily trigger this problem by applying this debug patch: --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -648,10 +648,14 @@ void kvm_free_physmem(struct kvm *kvm) kvm_free_physmem_slot(kvm-memslots[i], NULL); } +#include linux/delay.h static void kvm_destroy_vm(struct kvm *kvm) { struct mm_struct *mm = kvm-mm; + printk(off1\n); + msleep(5000); + printk(off2\n); spin_lock(kvm_lock); list_del(kvm-vm_list); spin_unlock(kvm_lock); and changing userspace to closing the vcpu file descriptors after the vm file descriptors. (killall qemu ; rmmod kvm_intel, rmmod kvm also crashes my system) The problem is that kvm_destroy_vm can run while the module count is 0. That means, you can remove the module while kvm_destroy_vm is running. But kvm_destroy_vm is part of the module text. This causes a kerneloops. The race exists without the msleep but is much harder to trigger. Nevertheless, the right solution is to call kvm_destroy_vm only with module_ref_count 0. This can be done by calling kvm_destroy_vm only via a release function, since the VFS will not allow module unload. This patch sets kvm_vcpu_fops.owner to the module and manually increases the module refcount after anon_inode_getfd, since anon_inode_getfd does not do it. Signed-off-by: Christian Borntraeger [EMAIL PROTECTED] --- virt/kvm/kvm_main.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: kvm/virt/kvm/kvm_main.c === --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode return 0; } -static const struct file_operations kvm_vcpu_fops = { +static struct file_operations kvm_vcpu_fops = { .release= kvm_vcpu_release, .unlocked_ioctl = kvm_vcpu_ioctl, .compat_ioctl = kvm_vcpu_ioctl, @@ -1318,6 +1318,7 @@ static int create_vcpu_fd(struct kvm_vcp int fd = anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, 0); if (fd 0) kvm_put_kvm(vcpu-kvm); + __module_get(kvm_vcpu_fops.owner); return fd; } @@ -2061,6 +2062,7 @@ int kvm_init(void *opaque, unsigned int } kvm_chardev_ops.owner = module; + kvm_vcpu_fops.owner = module; r = misc_register(kvm_dev); if (r) { -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html