Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
Le 26.08.2013 09:39, Paolo Bonzini a écrit : Il 25/08/2013 17:04, Alexander Graf ha scritto: On 24.08.2013, at 21:14, Yann Droneaud wrote: This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com Reviewed-by: Alexander Graf ag...@suse.de Would it make sense to simply inherit the O_CLOEXEC flag from the parent kvm fd instead? That would give user space the power to keep fds across exec() if it wants to. Does it make sense to use non-O_CLOEXEC file descriptors with KVM at all? Besides fork() not being supported by KVM, as described in Documentation/virtual/kvm/api.txt, the VMAs of the parent process go away as soon as you exec(). I'm not sure how you can use the inherited file descriptor in a sensible way after exec(). Sounds a lot like InfiniBand subsystem behavor: IB file descriptors are of no use accross exec() since memory mappings tied to those fds won't be available in the new process: https://lkml.org/lkml/2013/7/8/380 http://mid.gmane.org/f58540dc64fec1ac0e496dfcd3cc1...@meuh.org Regards. -- Yann Droneaud OPTEYA -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
Le 26.08.2013 09:39, Paolo Bonzini a écrit : Il 25/08/2013 17:04, Alexander Graf ha scritto: On 24.08.2013, at 21:14, Yann Droneaud wrote: This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com Reviewed-by: Alexander Graf ag...@suse.de Would it make sense to simply inherit the O_CLOEXEC flag from the parent kvm fd instead? That would give user space the power to keep fds across exec() if it wants to. Does it make sense to use non-O_CLOEXEC file descriptors with KVM at all? Besides fork() not being supported by KVM, as described in Documentation/virtual/kvm/api.txt, the VMAs of the parent process go away as soon as you exec(). I'm not sure how you can use the inherited file descriptor in a sensible way after exec(). Sounds a lot like InfiniBand subsystem behavor: IB file descriptors are of no use accross exec() since memory mappings tied to those fds won't be available in the new process: https://lkml.org/lkml/2013/7/8/380 http://mid.gmane.org/f58540dc64fec1ac0e496dfcd3cc1...@meuh.org Regards. -- Yann Droneaud OPTEYA -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- virt/kvm/kvm_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 89f74d1..d65cc0c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = { */ static int create_vcpu_fd(struct kvm_vcpu *vcpu) { - return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR); + return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC); } /* @@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm, return ret; } - ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR); + ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR | O_CLOEXEC); if (ret 0) { ops-destroy(dev); return ret; @@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type) return r; } #endif - r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR); + r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC); if (r 0) kvm_put_kvm(kvm); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 710d313..f7c9e8a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf) ctx-first_pass = 1; rwflag = (ghf-flags KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY; - ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag); + ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | O_CLOEXEC); if (ret 0) { kvm_put_kvm(kvm); return ret; diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index b2d3f3b..54cf9bc 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, mutex_unlock(kvm-lock); return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops, - stt, O_RDWR); + stt, O_RDWR | O_CLOEXEC); fail: if (stt) { diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index e8d51cb..3503829 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret) if (!ri) return -ENOMEM; - fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR); + fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC); if (fd 0) kvm_release_rma(ri); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
Hi, Following a patchset asking to change calls to get_unused_flag() [1] to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO to use the flag. Since it's a related subsystem to KVM, using O_CLOEXEC for file descriptors created by KVM might be applicable too. I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC as default flag. This patchset should be reviewed to not break existing userspace program. BTW, if it's not applicable, I would suggest that new ioctls be added to KVM subsystem, those ioctls would have a flag field added to their arguments. Such flag would let userspace choose the open flag to use. See for example other APIs using anon_inode_getfd() such as fanotify, inotify, signalfd and timerfd. You might be interested to read: - Secure File Descriptor Handling (Ulrich Drepper, 2008) http://udrepper.livejournal.com/20407.html - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) http://danwalsh.livejournal.com/53603.html Regards. [1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com [2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home [3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home Yann Droneaud (2): kvm: use anon_inode_getfd() with O_CLOEXEC flag ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- virt/kvm/kvm_main.c | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
Hi, Following a patchset asking to change calls to get_unused_flag() [1] to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO to use the flag. Since it's a related subsystem to KVM, using O_CLOEXEC for file descriptors created by KVM might be applicable too. I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC as default flag. This patchset should be reviewed to not break existing userspace program. BTW, if it's not applicable, I would suggest that new ioctls be added to KVM subsystem, those ioctls would have a flag field added to their arguments. Such flag would let userspace choose the open flag to use. See for example other APIs using anon_inode_getfd() such as fanotify, inotify, signalfd and timerfd. You might be interested to read: - Secure File Descriptor Handling (Ulrich Drepper, 2008) http://udrepper.livejournal.com/20407.html - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) http://danwalsh.livejournal.com/53603.html Regards. [1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com [2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home [3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home Yann Droneaud (2): kvm: use anon_inode_getfd() with O_CLOEXEC flag ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- virt/kvm/kvm_main.c | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 710d313..f7c9e8a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf) ctx-first_pass = 1; rwflag = (ghf-flags KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY; - ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag); + ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | O_CLOEXEC); if (ret 0) { kvm_put_kvm(kvm); return ret; diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index b2d3f3b..54cf9bc 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, mutex_unlock(kvm-lock); return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops, - stt, O_RDWR); + stt, O_RDWR | O_CLOEXEC); fail: if (stt) { diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index e8d51cb..3503829 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret) if (!ri) return -ENOMEM; - fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR); + fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC); if (fd 0) kvm_release_rma(ri); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 05/10] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be unsafe: O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com --- drivers/vfio/vfio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index d3cb342..75c16cc 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1109,7 +1109,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf) * We can't use anon_inode_getfd() because we need to modify * the f_mode flags directly to allow more than just ioctls */ - ret = get_unused_fd(); + ret = get_unused_fd_flags(0); if (ret 0) { device-ops-release(device-device_data); break; -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 00/10] Getting rid of get_unused_fd_flags()
Hi, Macro get_unused_fd() is a shortcut to call function get_unused_fd_flags(), to allocate a file descriptor. The macro use 0 as flags, so the file descriptor is created without O_CLOEXEC flag. This can be seen as an unsafe default eg. in most case O_CLOEXEC must be used to not leak file descriptor across exec(). Newer kernel code should use anon_inode_getfd() or get_unused_fd_flags() with flags provided by userspace. If flags cannot be given by userspace, O_CLOEXEC must be the default flag. Using O_CLOEXEC by default allows userspace to choose, without race, if the file descriptor is going to be inherited across exec(). They are two ways to achieve this: - makes get_unused_fd() use O_CLOEXEC by default It's difficult to get it right: every code using of get_unused_fd() must take this change into account and be fixed as soon as macro get_unused_fd() do the switch. Non updated code will have unexpected behavor and it's likely going to break API contract. - remove get_unused_fd() It's going to break some out of tree, not yet upstream kernel code, but it's easy to notice and fix. Anyway, newer code should use anon_inode_getfd() or get_unused_fd_flags(). The latter option was choosen to ensure no unexpected behavor for out of tree, not yet upstream code. Removing the macro is the safest choice: it's better to break build than trying to make get_unused_fd() use O_CLOEXEC by default and get all user of get_unused_fd() update. Additionnaly, removing the macro is not going to break modules ABI. In linux-next tag 20130815, they're currently: - 19 calls to get_unused_fd_flags() (+4) not counting get_unused_fd() and anon_inode_getfd() - 10 calls to get_unused_fd() (-4) - 11 calls to anon_inode_getfd()(0) The following patchset try to convert all calls to get_unused_fd() to get_unused_fd_flags(0) before removing get_unused_fd() macro. Without get_unused_fd() macro, more subsystems are likely to use anon_inode_getfd() and be teached to provide an API that let userspace choose the opening flags of the file descriptor. Changes from v1 http://lkml.kernel.org/r/cover.1372777600.git.ydrone...@opteya.com: - explicitly added subsystem maintainers as mail recepients. - infiniband: use get_unused_fd_flags(0) instead of get_unused_fd() DROPPED: subsystem maintainer applied another patch using get_unused_fd_flags(O_CLOEXEC) as suggested. - android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd() MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by http://lkml.kernel.org/r/cacsp8sjxgmk2_kx_+rgzqqqwqkernvf1wt3k5tw991w5dfa...@mail.gmail.com - android/sync: use get_unused_fd_flags(0) instead of get_unused_fd() MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w...@mail.gmail.com - xfs: use get_unused_fd_flags(0) instead of get_unused_fd() DROPPED: applied asis by subsystem maintainer. - sctp: use get_unused_fd_flags(0) instead of get_unused_fd() DROPPED: applied asis by subsystem maintainer. Yann Droneaud (10): ia64: use get_unused_fd_flags(0) instead of get_unused_fd() ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd() android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd() android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd() vfio: use get_unused_fd_flags(0) instead of get_unused_fd() binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd() file: use get_unused_fd_flags(0) instead of get_unused_fd() fanotify: use get_unused_fd_flags(0) instead of get_unused_fd() events: use get_unused_fd_flags(0) instead of get_unused_fd() file: remove get_unused_fd() arch/ia64/kernel/perfmon.c| 2 +- arch/powerpc/platforms/cell/spufs/inode.c | 4 ++-- drivers/staging/android/sw_sync.c | 2 +- drivers/staging/android/sync.c| 2 +- drivers/vfio/vfio.c | 2 +- fs/binfmt_misc.c | 2 +- fs/file.c | 2 +- fs/notify/fanotify/fanotify_user.c| 2 +- include/linux/file.h | 1 - kernel/events/core.c | 2 +- 10 files changed, 10 insertions(+), 11 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/13] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be unsafe: O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud ydrone...@opteya.com --- drivers/vfio/vfio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index c488da5..bb4e9fd 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1126,7 +1126,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf) * We can't use anon_inode_getfd() because we need to modify * the f_mode flags directly to allow more than just ioctls */ - ret = get_unused_fd(); + ret = get_unused_fd_flags(0); if (ret 0) { device-ops-release(device-device_data); break; -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/13] Getting rid of get_unused_fd()
Hi, Macro get_unused_fd() is a shortcut to call function get_unused_fd_flags(), to allocate a file descriptor. The macro use 0 as flags, so the file descriptor is created without O_CLOEXEC flag. This can be seen as an unsafe default eg. in most case O_CLOEXEC must be used to not leak file descriptor across exec(). Newer kernel code should use anon_inode_getfd() or get_unused_fd_flags() with flags provided by userspace. If flags cannot be given by userspace, O_CLOEXEC must be the default flag. Using O_CLOEXEC by default allows userspace to choose, without race, if the file descriptor is going to be inherited across exec(). They are two ways to achieve this: - makes get_unused_fd() use O_CLOEXEC by default It's difficult to get it right: every code using of get_unused_fd() must take this change into account and be fixed as soon as macro get_unused_fd() do the switch. Non updated code will have unexpected behavor and it's likely going to break API contract. - remove get_unused_fd() It's going to break some out of tree, not yet upstream kernel code, but it's easy to notice and fix. Anyway, newer code should use anon_inode_getfd() or get_unused_fd_flags(). The latter option was choosen to ensure no unexpected behavor for out of tree, not yet upstream code. Removing the macro is the safest choice: it's better to break build than trying to make get_unused_fd() use O_CLOEXEC by default and get all user of get_unused_fd() update. Additionnaly, removing the macro is not going to break modules ABI. In linux-next tag 20130702, they're currently: - 15 calls to get_unused_fd_flags() not counting get_unused_fd() and anon_inode_getfd() - 14 calls to get_unused_fd() - 11 calls to anon_inode_getfd() The following patchset try to convert all calls to get_unused_fd() to get_unused_fd_flags(0) before removing get_unused_fd() macro. Without get_unused_fd() macro, more subsystems are likely to use anon_inode_getfd() and be teached to provide an API that let userspace choose the opening flags of the file descriptor. Yann Droneaud (13): ia64: use get_unused_fd_flags(0) instead of get_unused_fd() ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd() infiniband: use get_unused_fd_flags(0) instead of get_unused_fd() android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd() android/sync: use get_unused_fd_flags(0) instead of get_unused_fd() vfio: use get_unused_fd_flags(0) instead of get_unused_fd() binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd() file: use get_unused_fd_flags(0) instead of get_unused_fd() fanotify: use get_unused_fd_flags(0) instead of get_unused_fd() xfs: use get_unused_fd_flags(0) instead of get_unused_fd() events: use get_unused_fd_flags(0) instead of get_unused_fd() sctp: use get_unused_fd_flags(0) instead of get_unused_fd() file: remove get_unused_fd() arch/ia64/kernel/perfmon.c| 2 +- arch/powerpc/platforms/cell/spufs/inode.c | 4 ++-- drivers/infiniband/core/uverbs_cmd.c | 4 ++-- drivers/staging/android/sw_sync.c | 2 +- drivers/staging/android/sync.c| 2 +- drivers/vfio/vfio.c | 2 +- fs/binfmt_misc.c | 2 +- fs/file.c | 2 +- fs/notify/fanotify/fanotify_user.c| 2 +- fs/xfs/xfs_ioctl.c| 2 +- include/linux/file.h | 1 - kernel/events/core.c | 2 +- net/sctp/socket.c | 2 +- 13 files changed, 14 insertions(+), 15 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html