Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-26 Thread Yann Droneaud

On 26.08.2013 09:39, Paolo Bonzini wrote:
> On 25/08/2013 17:04, Alexander Graf wrote:
>> On 24.08.2013, at 21:14, Yann Droneaud wrote:
>>
>>> This patch sets the O_CLOEXEC flag on all file descriptors created
>>> with anon_inode_getfd() so that they do not leak across exec().
>>>
>>> Signed-off-by: Yann Droneaud ydrone...@opteya.com
>>> Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com
>>
>> Reviewed-by: Alexander Graf ag...@suse.de
>>
>> Would it make sense to simply inherit the O_CLOEXEC flag from the
>> parent kvm fd instead? That would give user space the power to keep
>> fds across exec() if it wants to.
>
> Does it make sense to use non-O_CLOEXEC file descriptors with KVM at
> all?  Besides fork() not being supported by KVM, as described in
> Documentation/virtual/kvm/api.txt, the VMAs of the parent process go
> away as soon as you exec().  I'm not sure how you can use the inherited
> file descriptor in a sensible way after exec().



Sounds a lot like the InfiniBand subsystem's behavior: IB file descriptors
are of no use across exec() since the memory mappings tied to those fds
won't be available in the new process:

https://lkml.org/lkml/2013/7/8/380
http://mid.gmane.org/f58540dc64fec1ac0e496dfcd3cc1...@meuh.org

Regards.

--
Yann Droneaud
OPTEYA

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
KVM uses anon_inode_getfd() to allocate file descriptors as part
of some of its ioctls, but those ioctls lack a flags argument that
would let userspace choose options for the newly opened file descriptor.

In such cases it is advised to use O_CLOEXEC by default, so that
userspace can choose, without a race, whether the file descriptor
is going to be inherited across exec().

This patch sets the O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() so that they do not leak across exec().

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com
---
 virt/kvm/kvm_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 89f74d1..d65cc0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = {
  */
 static int create_vcpu_fd(struct kvm_vcpu *vcpu)
 {
-	return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR);
+	return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC);
 }
 
 /*
@@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 		return ret;
 	}
 
-	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
+	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
 	if (ret < 0) {
 		ops->destroy(dev);
 		return ret;
@@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
 		return r;
 	}
 #endif
-	r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR);
+	r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC);
 	if (r < 0)
 		kvm_put_kvm(kvm);
 
-- 
1.8.3.1



[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
KVM uses anon_inode_getfd() to allocate file descriptors as part
of some of its ioctls, but those ioctls lack a flags argument that
would let userspace choose options for the newly opened file descriptor.

In such cases it is advised to use O_CLOEXEC by default, so that
userspace can choose, without a race, whether the file descriptor
is going to be inherited across exec().

This patch sets the O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() so that they do not leak across exec().

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com

---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 710d313..f7c9e8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf)
 	ctx->first_pass = 1;
 
 	rwflag = (ghf->flags & KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
-	ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag);
+	ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag | O_CLOEXEC);
 	if (ret < 0) {
 		kvm_put_kvm(kvm);
 		return ret;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b2d3f3b..54cf9bc 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 	mutex_unlock(&kvm->lock);
 
 	return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
-				stt, O_RDWR);
+				stt, O_RDWR | O_CLOEXEC);
 
 fail:
 	if (stt) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e8d51cb..3503829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
 	if (!ri)
 		return -ENOMEM;
 
-	fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR);
+	fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR | O_CLOEXEC);
 	if (fd < 0)
 		kvm_release_rma(ri);
 
-- 
1.8.3.1



[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
Hi,

Following a patchset asking to change calls to get_unused_fd() [1]
to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO
to use the flag.

Since VFIO is a subsystem related to KVM, using O_CLOEXEC for
file descriptors created by KVM might be applicable too.

I suggest changing the calls to anon_inode_getfd() to use O_CLOEXEC
as the default flag.

This patchset should be reviewed to make sure it does not break
existing userspace programs.

BTW, if it's not applicable, I would suggest adding new ioctls to the
KVM subsystem, with a flags field added to their arguments.
Such a field would let userspace choose the open flags to use.
See for example other APIs using anon_inode_getfd() such as fanotify,
inotify, signalfd and timerfd.

You might be interested to read:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
  http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) 
  http://danwalsh.livejournal.com/53603.html

Regards.

[1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com
[2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home
[3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home

Yann Droneaud (2):
  kvm: use anon_inode_getfd() with O_CLOEXEC flag
  ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 virt/kvm/kvm_main.c | 6 +++---
 4 files changed, 6 insertions(+), 6 deletions(-)

-- 
1.8.3.1



[PATCH v2 05/10] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()

2013-08-15 Thread Yann Droneaud
The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be unsafe:
O_CLOEXEC must be used by default so that file descriptors do not leak
across exec().

Instead of the get_unused_fd() macro, anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a further patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com

---
 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index d3cb342..75c16cc 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1109,7 +1109,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 		 * We can't use anon_inode_getfd() because we need to modify
 		 * the f_mode flags directly to allow more than just ioctls
 		 */
-		ret = get_unused_fd();
+		ret = get_unused_fd_flags(0);
 		if (ret < 0) {
 			device->ops->release(device->device_data);
 			break;
-- 
1.8.3.1



[PATCH v2 00/10] Getting rid of get_unused_fd_flags()

2013-08-15 Thread Yann Droneaud
Hi,

The get_unused_fd() macro is a shortcut for calling get_unused_fd_flags()
to allocate a file descriptor.

The macro uses 0 as flags, so the file descriptor is created
without the O_CLOEXEC flag.

This can be seen as an unsafe default: in most cases O_CLOEXEC
must be used so that file descriptors do not leak across exec().

Newer kernel code should use anon_inode_getfd() or get_unused_fd_flags()
with flags provided by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.

Using O_CLOEXEC by default allows userspace to choose, without a race,
whether the file descriptor is going to be inherited across exec().

There are two ways to achieve this:

- make get_unused_fd() use O_CLOEXEC by default

  It's difficult to get right: every user of get_unused_fd()
  must take this change into account and be fixed as soon as
  the get_unused_fd() macro makes the switch. Code that is not updated
  will behave unexpectedly and is likely to break its API contract.

- remove get_unused_fd()

  It's going to break some out-of-tree, not-yet-upstream kernel code,
  but that is easy to notice and fix. Anyway, newer code should use
  anon_inode_getfd() or get_unused_fd_flags().

The latter option was chosen to ensure no unexpected behavior
for out-of-tree, not-yet-upstream code. Removing the macro is the safest
choice: it's better to break the build than to try to make get_unused_fd()
use O_CLOEXEC by default and get every user of get_unused_fd() updated.

Additionally, removing the macro is not going to break the module ABI.

In linux-next tag 20130815, there are currently:

- 19 calls to get_unused_fd_flags() (+4),
  not counting get_unused_fd() and anon_inode_getfd()
- 10 calls to get_unused_fd() (-4)
- 11 calls to anon_inode_getfd() (0)

The following patchset tries to convert all calls to get_unused_fd()
to get_unused_fd_flags(0) before removing the get_unused_fd() macro.

Without the get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be taught to provide an API that lets userspace
choose the opening flags of the file descriptor.

Changes from v1 
http://lkml.kernel.org/r/cover.1372777600.git.ydrone...@opteya.com:

- explicitly added subsystem maintainers as mail recipients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: subsystem maintainer applied another patch using
  get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by
  http://lkml.kernel.org/r/cacsp8sjxgmk2_kx_+rgzqqqwqkernvf1wt3k5tw991w5dfa...@mail.gmail.com

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by
  http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w...@mail.gmail.com

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied as is by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied as is by subsystem maintainer.

Yann Droneaud (10):
  ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of
get_unused_fd()
  android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of
get_unused_fd()
  vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: use get_unused_fd_flags(0) instead of get_unused_fd()
  fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
  events: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: remove get_unused_fd()

 arch/ia64/kernel/perfmon.c| 2 +-
 arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
 drivers/staging/android/sw_sync.c | 2 +-
 drivers/staging/android/sync.c| 2 +-
 drivers/vfio/vfio.c   | 2 +-
 fs/binfmt_misc.c  | 2 +-
 fs/file.c | 2 +-
 fs/notify/fanotify/fanotify_user.c| 2 +-
 include/linux/file.h  | 1 -
 kernel/events/core.c  | 2 +-
 10 files changed, 10 insertions(+), 11 deletions(-)

-- 
1.8.3.1



[PATCH 06/13] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()

2013-07-02 Thread Yann Droneaud
The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be unsafe:
O_CLOEXEC must be used by default so that file descriptors do not leak
across exec().

Instead of the get_unused_fd() macro, anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a further patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud ydrone...@opteya.com
---
 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index c488da5..bb4e9fd 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1126,7 +1126,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 		 * We can't use anon_inode_getfd() because we need to modify
 		 * the f_mode flags directly to allow more than just ioctls
 		 */
-		ret = get_unused_fd();
+		ret = get_unused_fd_flags(0);
 		if (ret < 0) {
 			device->ops->release(device->device_data);
 			break;
-- 
1.8.3.1



[PATCH 00/13] Getting rid of get_unused_fd()

2013-07-02 Thread Yann Droneaud
Hi,

The get_unused_fd() macro is a shortcut for calling get_unused_fd_flags()
to allocate a file descriptor.

The macro uses 0 as flags, so the file descriptor is created
without the O_CLOEXEC flag.

This can be seen as an unsafe default: in most cases O_CLOEXEC
must be used so that file descriptors do not leak across exec().

Newer kernel code should use anon_inode_getfd() or get_unused_fd_flags()
with flags provided by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.

Using O_CLOEXEC by default allows userspace to choose, without a race,
whether the file descriptor is going to be inherited across exec().

There are two ways to achieve this:

- make get_unused_fd() use O_CLOEXEC by default

  It's difficult to get right: every user of get_unused_fd()
  must take this change into account and be fixed as soon as
  the get_unused_fd() macro makes the switch. Code that is not updated
  will behave unexpectedly and is likely to break its API contract.

- remove get_unused_fd()

  It's going to break some out-of-tree, not-yet-upstream kernel code,
  but that is easy to notice and fix. Anyway, newer code should use
  anon_inode_getfd() or get_unused_fd_flags().

The latter option was chosen to ensure no unexpected behavior
for out-of-tree, not-yet-upstream code. Removing the macro is the safest
choice: it's better to break the build than to try to make get_unused_fd()
use O_CLOEXEC by default and get every user of get_unused_fd() updated.

Additionally, removing the macro is not going to break the module ABI.

In linux-next tag 20130702, there are currently:

- 15 calls to get_unused_fd_flags(),
  not counting get_unused_fd() and anon_inode_getfd()
- 14 calls to get_unused_fd()
- 11 calls to anon_inode_getfd()

The following patchset tries to convert all calls to get_unused_fd()
to get_unused_fd_flags(0) before removing the get_unused_fd() macro.

Without the get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be taught to provide an API that lets userspace
choose the opening flags of the file descriptor.

Yann Droneaud (13):
  ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
  android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: use get_unused_fd_flags(0) instead of get_unused_fd()
  fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
  xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  events: use get_unused_fd_flags(0) instead of get_unused_fd()
  sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: remove get_unused_fd()

 arch/ia64/kernel/perfmon.c| 2 +-
 arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
 drivers/infiniband/core/uverbs_cmd.c  | 4 ++--
 drivers/staging/android/sw_sync.c | 2 +-
 drivers/staging/android/sync.c| 2 +-
 drivers/vfio/vfio.c   | 2 +-
 fs/binfmt_misc.c  | 2 +-
 fs/file.c | 2 +-
 fs/notify/fanotify/fanotify_user.c| 2 +-
 fs/xfs/xfs_ioctl.c| 2 +-
 include/linux/file.h  | 1 -
 kernel/events/core.c  | 2 +-
 net/sctp/socket.c | 2 +-
 13 files changed, 14 insertions(+), 15 deletions(-)

-- 
1.8.3.1