Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-06-15 Thread Florian Weimer
On 05/29/2015 10:27 PM, Thiago Macieira wrote:

>> It has been suggested (e.g.,
>> https://sourceware.org/bugzilla/show_bug.cgi?id=15661#c3) that you can
>> use the existing clone(2) without specifying SIGCHLD to create a new
>> process.  The resulting child process is not supposed to show up in
>> wait(2), only in a waitpid(2) (or similar) explicitly specifying the
>> PID.  Is this not the case?
> 
> Hi Florian
> 
> That sounds orthogonal to what we're looking for. Our objective is to get 
> notification of when the child exited without resorting to SIGCHLD. If we use 
> the regular clone(2) without SIGCHLD and without CLONE_FD, we get no 
> notification. The only way to know of the child's termination is by a blocking
> waitpid(2), like you indicated, which is counterproductive to our needs.
> 
> We need something we can select(2)/poll(2) on.

Thanks for the clarification.  I agree that this is a separate and quite
sensible use case.

-- 
Florian Weimer / Red Hat Product Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-05-29 Thread Thiago Macieira
On Friday 29 May 2015 09:43:35 Florian Weimer wrote:
> On 03/15/2015 08:59 AM, Josh Triplett wrote:
> > This patch series introduces a new clone flag, CLONE_FD, which lets the
> > caller receive child process exit notification via a file descriptor
> > rather than SIGCHLD.  CLONE_FD makes it possible for libraries to safely
> > launch and manage child processes on behalf of their caller, *without*
> > taking over process-wide SIGCHLD handling (either via signal handler or
> > signalfd).
> > 
> > Note that signalfd for SIGCHLD does not suffice here, because that still
> > receives notification for all child processes, and interferes with
> > process-wide signal handling.
> 
> It has been suggested (e.g.,
> https://sourceware.org/bugzilla/show_bug.cgi?id=15661#c3) that you can
> use the existing clone(2) without specifying SIGCHLD to create a new
> process.  The resulting child process is not supposed to show up in
> wait(2), only in a waitpid(2) (or similar) explicitly specifying the
> PID.  Is this not the case?

Hi Florian

That sounds orthogonal to what we're looking for. Our objective is to get 
notification of when the child exited without resorting to SIGCHLD. If we use 
the regular clone(2) without SIGCHLD and without CLONE_FD, we get no 
notification. The only way to know of the child's termination is by a blocking 
waitpid(2), like you indicated, which is counterproductive to our needs.

We need something we can select(2)/poll(2) on.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
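
An illustration of the requirement above, not part of the patch series: since
CLONE_FD is only a proposal here, the sketch below uses a substitute
technique, a pipe whose write end is held only by the child, so that the read
end reports hang-up in poll(2) when the child exits. The helper name
poll_for_child_exit is hypothetical, and the trick assumes the child does not
leak the write end to its own descendants.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with exit_code, learn about
 * its exit through poll(2) on a pipe, then reap it.  Returns the child's
 * exit status, or -1 on error. */
int poll_for_child_exit(int exit_code)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        /* The child never writes; the kernel closes its copy of fds[1]
         * when it exits, and that is what wakes the parent. */
        _exit(exit_code);
    }

    /* The parent must drop its own write end, or hang-up never arrives. */
    close(fds[1]);

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int ready;
    do {
        ready = poll(&pfd, 1, -1);  /* this fd could sit in any event loop */
    } while (ready < 0 && errno == EINTR);
    close(fds[0]);

    /* Still required: the pipe only signals the exit, it does not reap. */
    int status;
    pid_t reaped = waitpid(pid, &status, 0);
    if (ready != 1 || reaped != pid)
        return -1;
    assert(pfd.revents & (POLLHUP | POLLIN));  /* hang-up or end-of-file */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling poll_for_child_exit(42) from a test program should return 42. Note
that this stand-in costs two descriptors per child during setup and needs
cooperation from the child, which is part of why a kernel-provided descriptor
is being asked for.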


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-05-29 Thread Florian Weimer
On 03/15/2015 08:59 AM, Josh Triplett wrote:
> This patch series introduces a new clone flag, CLONE_FD, which lets the caller
> receive child process exit notification via a file descriptor rather than
> SIGCHLD.  CLONE_FD makes it possible for libraries to safely launch and manage
> child processes on behalf of their caller, *without* taking over process-wide
> SIGCHLD handling (either via signal handler or signalfd).
> 
> Note that signalfd for SIGCHLD does not suffice here, because that still
> receives notification for all child processes, and interferes with
> process-wide signal handling.

It has been suggested (e.g.,
https://sourceware.org/bugzilla/show_bug.cgi?id=15661#c3) that you can
use the existing clone(2) without specifying SIGCHLD to create a new
process.  The resulting child process is not supposed to show up in
wait(2), only in a waitpid(2) (or similar) explicitly specifying the
PID.  Is this not the case?

-- 
Florian Weimer / Red Hat Product Security
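
A small sketch of the behaviour asked about above, assuming Linux semantics as
documented in clone(2) and wait(2): a child created with an exit signal other
than SIGCHLD is skipped by a plain wait, and the parent has to pass __WCLONE
(or __WALL) to collect it; naming the PID alone is not enough. The function
name is hypothetical, and the raw syscall with all-zero arguments is used only
to keep the example short (it behaves like fork(2) with no exit signal).

```c
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the child's exit status (7) if the child was
 * invisible to a plain wait and collectable with __WCLONE, else -1. */
int clone_child_hidden_from_wait(void)
{
    /* Raw clone with flags == 0: fork-like child, exit signal 0 (no
     * SIGCHLD).  All-zero arguments sidestep the per-architecture
     * argument order of the raw system call. */
    long pid = syscall(SYS_clone, 0UL, 0UL, 0UL, 0UL, 0UL);
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(7);

    int status;

    /* Equivalent to wait(2): only non-clone children are eligible, and
     * there are none, so this is expected to fail with ECHILD. */
    pid_t plain = waitpid(-1, &status, 0);
    int plain_errno = errno;

    /* With __WCLONE the child becomes eligible; this blocks until it exits. */
    pid_t reaped = waitpid((pid_t)pid, &status, __WCLONE);
    if (reaped != (pid_t)pid)
        return -1;
    assert(WIFEXITED(status));

    if (plain != -1 || plain_errno != ECHILD)
        return -1;
    return WEXITSTATUS(status);
}
```

The second waitpid() is still a blocking call (or a WNOHANG poll), which is
the limitation the replies in this thread point out.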


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-04-08 Thread Josh Triplett
On Wed, Apr 01, 2015 at 09:24:20AM +0200, Jonathan Corbet wrote:
> On Tue, 31 Mar 2015 15:02:24 -0700
> j...@joshtriplett.org wrote:
> 
> > > This would appear to assume that a clonefd_info structure is the only
> > > thing that will ever be read from this descriptor.  It seems to me that
> > > there is the potential for, someday, wanting to be able to read and write
> > > other things as well.  Should this structure be marked with type and
> > > length fields so that other structures could be added in the future?  
> > 
> > I don't think it makes sense for a caller to get an arbitrary structure
> > on read(), and have to figure out what they got and ignore something
> > they don't understand.  Instead, I think it makes more sense for the
> > caller to say "Hey, here's a flag saying I understand the new thing, go
> > ahead and give me the new thing".  So, for instance, if you want to
> > receive SIGSTOP/SIGCONT messages for child processes through this
> > descriptor, we could add a flag for that.
> 
> The flag is fine, but, once we have set that flag saying we want those
> messages, how do we know which type of structure we've gotten?  That's
> the piece of the puzzle I'm missing, sorry if I'm being overly slow.

If you pass a flag saying you can handle a new set of potential
structures, those structures can then include any necessary
disambiguating flags/IDs/etc.  No need for them to match the current
clonefd_info structure if userspace has opted into a new version.

- Josh Triplett


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-04-01 Thread Jonathan Corbet
On Tue, 31 Mar 2015 15:02:24 -0700
j...@joshtriplett.org wrote:

> > This would appear to assume that a clonefd_info structure is the only
> > thing that will ever be read from this descriptor.  It seems to me that
> > there is the potential for, someday, wanting to be able to read and write
> > other things as well.  Should this structure be marked with type and
> > length fields so that other structures could be added in the future?  
> 
> I don't think it makes sense for a caller to get an arbitrary structure
> on read(), and have to figure out what they got and ignore something
> they don't understand.  Instead, I think it makes more sense for the
> caller to say "Hey, here's a flag saying I understand the new thing, go
> ahead and give me the new thing".  So, for instance, if you want to
> receive SIGSTOP/SIGCONT messages for child processes through this
> descriptor, we could add a flag for that.

The flag is fine, but, once we have set that flag saying we want those
messages, how do we know which type of structure we've gotten?  That's
the piece of the puzzle I'm missing, sorry if I'm being overly slow.

Thanks,

jon


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-31 Thread josh
On Tue, Mar 31, 2015 at 10:08:07PM +0200, Jonathan Corbet wrote:
> So I finally got around to having a look at this, and one thing caught my
> eye:
> 
> >  read(2) (and similar)
> >      When the new process exits, reading from the file
> >      descriptor produces a single clonefd_info structure:
> > 
> >      struct clonefd_info {
> >          uint32_t code;   /* Signal code */
> >          uint32_t status; /* Exit status or signal */
> >          uint64_t utime;  /* User CPU time */
> >          uint64_t stime;  /* System CPU time */
> >      };
> 
> This would appear to assume that a clonefd_info structure is the only
> thing that will ever be read from this descriptor.  It seems to me that
> there is the potential for, someday, wanting to be able to read and write
> other things as well.  Should this structure be marked with type and
> length fields so that other structures could be added in the future?

I don't think it makes sense for a caller to get an arbitrary structure
on read(), and have to figure out what they got and ignore something
they don't understand.  Instead, I think it makes more sense for the
caller to say "Hey, here's a flag saying I understand the new thing, go
ahead and give me the new thing".  So, for instance, if you want to
receive SIGSTOP/SIGCONT messages for child processes through this
descriptor, we could add a flag for that.

- Josh Triplett


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-31 Thread Jonathan Corbet
So I finally got around to having a look at this, and one thing caught my
eye:

>  read(2) (and similar)
>      When the new process exits, reading from the file
>      descriptor produces a single clonefd_info structure:
> 
>      struct clonefd_info {
>          uint32_t code;   /* Signal code */
>          uint32_t status; /* Exit status or signal */
>          uint64_t utime;  /* User CPU time */
>          uint64_t stime;  /* System CPU time */
>      };

This would appear to assume that a clonefd_info structure is the only
thing that will ever be read from this descriptor.  It seems to me that
there is the potential for, someday, wanting to be able to read and write
other things as well.  Should this structure be marked with type and
length fields so that other structures could be added in the future?

(I suppose we could just use ioctl() for any other functionality in the
future, though...:)

jon
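
To make the type-and-length suggestion above concrete, here is a rough sketch
of what such framing might look like from the reader's side. Only struct
clonefd_info comes from the quoted man page; struct record_hdr,
RECORD_EXIT_INFO, append_record() and find_exit_info() are hypothetical names
invented for illustration and do not appear in the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout quoted from the proposed man page. */
struct clonefd_info {
    uint32_t code;   /* Signal code */
    uint32_t status; /* Exit status or signal */
    uint64_t utime;  /* User CPU time */
    uint64_t stime;  /* System CPU time */
};
static_assert(sizeof(struct clonefd_info) == 24, "no padding expected");

/* Hypothetical framing header placed in front of every record. */
struct record_hdr {
    uint32_t type;   /* what the payload is */
    uint32_t length; /* payload bytes following this header */
};
enum { RECORD_EXIT_INFO = 1 };  /* hypothetical type code */

/* Test helper: append one framed record at offset off, return new offset.
 * The caller guarantees that buf is large enough. */
size_t append_record(unsigned char *buf, size_t off, uint32_t type,
                     const void *payload, uint32_t length)
{
    struct record_hdr hdr = { .type = type, .length = length };
    memcpy(buf + off, &hdr, sizeof hdr);
    memcpy(buf + off + sizeof hdr, payload, length);
    return off + sizeof hdr + length;
}

/* Walk the records in buf, skipping types the reader does not know.
 * Returns 1 and fills *out on the first exit-info record, 0 if there is
 * none, and -1 if the buffer is truncated or malformed. */
int find_exit_info(const unsigned char *buf, size_t len,
                   struct clonefd_info *out)
{
    size_t off = 0;
    while (len - off >= sizeof(struct record_hdr)) {
        struct record_hdr hdr;
        memcpy(&hdr, buf + off, sizeof hdr);
        off += sizeof hdr;
        if (hdr.length > len - off)
            return -1;              /* payload runs past the buffer */
        if (hdr.type == RECORD_EXIT_INFO && hdr.length >= sizeof *out) {
            memcpy(out, buf + off, sizeof *out);
            return 1;
        }
        off += hdr.length;          /* unknown type: skip by length */
    }
    return off == len ? 0 : -1;     /* leftover bytes mean a torn header */
}
```

The length field is what lets an old reader step over records it does not
understand; the cost is that every caller has to carry this kind of parsing
loop.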


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-23 Thread josh
On Mon, Mar 23, 2015 at 02:12:34PM +, David Drysdale wrote:
> On Mon, Mar 16, 2015 at 11:29 PM,  j...@joshtriplett.org wrote:
> > On Mon, Mar 16, 2015 at 03:14:14PM -0700, Thiago Macieira wrote:
> >> On Monday 16 March 2015 14:44:20 Kees Cook wrote:
> >> > >   O_CLOEXEC
> >> > >       Set the close-on-exec flag on the new file descriptor.
> >> > >       See the description of the O_CLOEXEC flag in open(2) for
> >> > >       reasons why this may be useful.
> >> >
> >> > This begs the question: what happens when all CLONE_FD fds for a
> >> > process are closed? Will the parent get SIGCHLD instead, will it
> >> > auto-reap, or will it be un-wait-able (I assume not this...)
> >>
> >> Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can
> >> wait() on it and the process autoreaps itself.
> >
> > Minor nit: CLONE_AUTOREAP makes the process autoreap and nobody can wait
> > on it, but if you pass SIGCHLD or some other exit signal to clone then
> > you'll still get that signal.
> 
> Quick query: does CLONE_AUTOREAP also affect waiting for non-exit
> events (i.e. WUNTRACED / WCONTINUED), by original parent and/or ptracer?

It shouldn't, no.  You can't wait on the process to exit (you'll get
-ECHILD after it wakes up), but you can wait on it to continue or
similar; none of the autoreap changes should affect that.

- Josh Triplett


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-23 Thread David Drysdale
On Mon, Mar 16, 2015 at 11:29 PM,  j...@joshtriplett.org wrote:
> On Mon, Mar 16, 2015 at 03:14:14PM -0700, Thiago Macieira wrote:
>> On Monday 16 March 2015 14:44:20 Kees Cook wrote:
>> > >   O_CLOEXEC
>> > >       Set the close-on-exec flag on the new file descriptor.
>> > >       See the description of the O_CLOEXEC flag in open(2) for
>> > >       reasons why this may be useful.
>> >
>> > This begs the question: what happens when all CLONE_FD fds for a
>> > process are closed? Will the parent get SIGCHLD instead, will it
>> > auto-reap, or will it be un-wait-able (I assume not this...)
>>
>> Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can
>> wait() on it and the process autoreaps itself.
>
> Minor nit: CLONE_AUTOREAP makes the process autoreap and nobody can wait
> on it, but if you pass SIGCHLD or some other exit signal to clone then
> you'll still get that signal.

Quick query: does CLONE_AUTOREAP also affect waiting for non-exit
events (i.e. WUNTRACED / WCONTINUED), by original parent and/or ptracer?


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Thiago Macieira
On Monday 16 March 2015 16:29:49 j...@joshtriplett.org wrote:
> > A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed
> > before the clonefd is read, the clonefd() will return a 0 read. If it gets
> > read before wait, then wait() reaps another child or returns -ECHILD. That's
> > no different than two threads doing simultaneous wait() on the same child.
> Hrm?  That isn't the semantics we implemented; you'll *always* get an
> exit notification via the clonefd if you have it open, with or without
> autoreap and whether or not a wait has occurred yet.  And reading from
> the clonefd does not serve as a wait; if you don't pass CLONE_AUTOREAP,
> you'll still need to wait on the process.

Ah, I see what you're saying. Ok, I stand corrected: a child without 
CLONE_AUTOREAP must be wait()ed on and whoever waits on it will get 
information. In addition to that, the information is available on the clonefd 
and it can happen at any time, before or after the wait().

In the case of an orphaned child, the file descriptor will close, that's all. 
No modification is necessary to init.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread josh
On Mon, Mar 16, 2015 at 03:36:16PM -0700, Kees Cook wrote:
> On Mon, Mar 16, 2015 at 3:14 PM, Thiago Macieira
>  wrote:
> > On Monday 16 March 2015 14:44:20 Kees Cook wrote:
> >> >   O_CLOEXEC
> >> >       Set the close-on-exec flag on the new file descriptor.
> >> >       See the description of the O_CLOEXEC flag in open(2) for
> >> >       reasons why this may be useful.
> >>
> >> This begs the question: what happens when all CLONE_FD fds for a
> >> process are closed? Will the parent get SIGCHLD instead, will it
> >> auto-reap, or will it be un-wait-able (I assume not this...)
> >
> > Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can
> > wait() on it and the process autoreaps itself.
> >
> > If it's not active, then the old rules apply: parent gets SIGCHLD and can
> > wait(). If the parent exited first, then the child gets reparented to init,
> > which can do the wait().
> >
> > A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed
> > before the clonefd is read, the clonefd() will return a 0 read. If it gets
> > read before wait, then wait() reaps another child or returns -ECHILD. That's
> > no different than two threads doing simultaneous wait() on the same child.
> 
> Cool. I think detailing this in the manpage would be helpful.
> 
> And just so I understand the races here, what happens in CLONE_FD
> (without CLONE_AUTOREAP) case where the child dies, but the parent
> never reads from the CLONE_FD fd, and closes it (or dies)? Will the
> modes switch that late in the child's lifetime? (i.e. even though the
> details were written to the fd, they were never read, yet it'll still
> switch and generate a SIGCHLD, etc?)

This doesn't actually work like a pipe; the details aren't "written" to
the fd.  The data is generated at read time, and if you never read,
that's fine.  There's no semantic meaning attached to reading from the
clonefd; you still have to wait on the process if you don't pass
CLONE_AUTOREAP.  (Or you can block SIGCHLD or use SA_NOCLDWAIT, if you
control the calling process's signal handling; AUTOREAP just lets you
avoid interacting with the calling process's signal handling.)

See my previous response for the rest.

- Josh Triplett


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread josh
On Mon, Mar 16, 2015 at 03:14:14PM -0700, Thiago Macieira wrote:
> On Monday 16 March 2015 14:44:20 Kees Cook wrote:
> > >   O_CLOEXEC
> > >       Set the close-on-exec flag on the new file descriptor.
> > >       See the description of the O_CLOEXEC flag in open(2) for
> > >       reasons why this may be useful.
> > 
> > This begs the question: what happens when all CLONE_FD fds for a
> > process are closed? Will the parent get SIGCHLD instead, will it
> > auto-reap, or will it be un-wait-able (I assume not this...)
> 
> Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can 
> wait() on it and the process autoreaps itself.

Minor nit: CLONE_AUTOREAP makes the process autoreap and nobody can wait
on it, but if you pass SIGCHLD or some other exit signal to clone then
you'll still get that signal.

> If it's not active, then the old rules apply: parent gets SIGCHLD and can 
> wait(). If the parent exited first, then the child gets reparented to init, 
> which can do the wait().

Right.

> A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed 
> before the clonefd is read, the clonefd() will return a 0 read. If it gets 
> read before wait, then wait() reaps another child or returns -ECHILD. That's 
> no different than two threads doing simultaneous wait() on the same child.

Hrm?  That isn't the semantics we implemented; you'll *always* get an
exit notification via the clonefd if you have it open, with or without
autoreap and whether or not a wait has occurred yet.  And reading from
the clonefd does not serve as a wait; if you don't pass CLONE_AUTOREAP,
you'll still need to wait on the process.

- Josh Triplett
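
The waiting half of this can be checked with existing system calls. Below is a minimal sketch (reap_twice is a made-up name, and plain fork() stands in for a clone4() call without CLONE_AUTOREAP): the first waitpid() collects the exit status, and a second one fails with ECHILD because the zombie is already gone. Under the semantics described above, a clonefd would report the exit regardless of whether that wait has happened.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not from the patch series).
 * Forks a child that exits with exit_code, reaps it, then tries to reap
 * it a second time.  Stores the decoded exit status in *first_status and
 * returns the errno of the second waitpid() (ECHILD is expected), or -1
 * if anything unexpected happens.
 */
int reap_twice(int exit_code, int *first_status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);         /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *first_status = WEXITSTATUS(status);

    /* The child has been reaped; a second wait has nothing to collect. */
    if (waitpid(pid, &status, 0) != -1)
        return -1;
    return errno;
}
```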


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Kees Cook
On Mon, Mar 16, 2015 at 3:50 PM, Thiago Macieira wrote:
> On Monday 16 March 2015 15:36:16 Kees Cook wrote:
>> And just so I understand the races here, what happens in CLONE_FD
>> (without CLONE_AUTOREAP) case where the child dies, but the parent
>> never reads from the CLONE_FD fd, and closes it (or dies)? Will the
>> modes switch that late in the child's lifetime? (i.e. even though the
>> details were written to the fd, they were never read, yet it'll still
>> switch and generate a SIGCHLD, etc?)
>
> What happens to a child that dies during the parent's lifetime but the parent
> exits without reaping the child?
>
> The same should happen, whatever that behaviour is.

Okay, sounds like the waitpid internals are tied to the read, not the
child death. Perfect, that means it'll still be waitpid-able by
whatever process inherits the zombie. Sounds good! Thanks for
clarifying. :)

-Kees

-- 
Kees Cook
Chrome OS Security


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread josh
On Mon, Mar 16, 2015 at 02:44:20PM -0700, Kees Cook wrote:
> On Sun, Mar 15, 2015 at 12:59 AM, Josh Triplett wrote:
> > - Make poll on a CLONE_FD for an exited task also return POLLHUP, for
> >   compatibility with FreeBSD's pdfork.  Thanks to David Drysdale for calling
> >   attention to pdfork.
> 
> I think POLLHUP should be mentioned in the manpage (now it only
> mentions POLLIN).

Added for v3.
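
A clonefd needs the patched kernel, so the poll-side behaviour cannot be shown directly on a stock system. As a rough analogy only, the sketch below uses a pipe as a stand-in: the child holds the write end until it exits, and the parent polls the read end and sees POLLHUP. The function name poll_for_child_exit and the 5 second timeout are assumptions made for this example.

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper; a pipe stands in for the clonefd.
 * Forks a child that keeps the write end of a pipe open until it exits,
 * then polls the read end.  Returns the revents reported once the child
 * is gone (POLLHUP is expected on Linux), or -1 on error or timeout.
 */
int poll_for_child_exit(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        _exit(0);                 /* exiting closes the child's write end */
    }
    close(fds[1]);                /* parent keeps only the read end */

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int n = poll(&pfd, 1, 5000);  /* wait up to 5 s for the hang-up */

    close(fds[0]);
    waitpid(pid, NULL, 0);        /* the pipe only notifies; reaping is separate */
    return n == 1 ? pfd.revents : -1;
}
```

Unlike the clonefd described in this thread, the pipe carries no exit status, and any other process that inherits the write end keeps it open.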

> >   CLONE_FD
> >      Return a file descriptor associated with the new process, storing
> >      it in location clonefd in the parent's address space.  When the
> >      new process exits, the file descriptor will become available for
> >      reading.
> >
> >      Unlike using signalfd(2) for the SIGCHLD signal, the file
> >      descriptor returned by clone4() with the CLONE_FD flag works even
> >      with SIGCHLD unblocked in one or more threads of the parent
> >      process, allowing the process to have different handlers for
> >      different child processes, such as those created by a library,
> >      without introducing race conditions around process-wide signal
> >      handling.
> >
> >      clonefd_flags may contain the following additional flags for use
> >      with CLONE_FD:
> >
> >      O_CLOEXEC
> >         Set the close-on-exec flag on the new file descriptor.  See
> >         the description of the O_CLOEXEC flag in open(2) for reasons
> >         why this may be useful.
> 
> This begs the question: what happens when all CLONE_FD fds for a
> process are closed? Will the parent get SIGCHLD instead, will it
> auto-reap, or will it be un-wait-able (I assume not this...)

Whether the parent gets SIGCHLD is determined only by what signal you
request in clone; if you clone with CLONE_FD | SIGCHLD (or
CLONE_AUTOREAP | CLONE_FD | SIGCHLD), you'll get notification via both
clonefd (if you have one) and signal (if you have a handler).  If you
pass a 0 signal (just CLONE_FD or CLONE_AUTOREAP | CLONE_FD), you'll
receive no signal, only the notification via clonefd.  Independently, if
you have CLONE_AUTOREAP set, the process will autoreap.

Those are all orthogonal now.

If you close the clonefd, nothing special happens other than a
put_task_struct.  While this is conceptually somewhat like a pipe, the
data is actually generated at read time, so the task exit doesn't care
whether there's a live clonefd or not.  (Or, in the future, if there are
multiple live clonefds for the same process.)

> Looks promising!

Thanks!

And thanks for catching the manpage issue.  I'd definitely welcome any
comments you have on the implementation as well.

- Josh Triplett
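
The orthogonality described above can be written down as a small decision table. This is only a model of the stated semantics, not kernel code: DEMO_CLONE_FD and DEMO_CLONE_AUTOREAP are placeholder bit values chosen for the example (the real values are defined by the patches), and describe_exit is a made-up name. The one detail taken from clone(2) is that the low byte of the flags carries the exit signal.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

#define EXIT_SIGNAL_MASK    0xffULL        /* low byte of the clone flags */
#define DEMO_CLONE_FD       (1ULL << 32)   /* placeholder, NOT the patch's value */
#define DEMO_CLONE_AUTOREAP (1ULL << 33)   /* placeholder, NOT the patch's value */

struct exit_behavior {
    int  exit_signal;     /* signal sent to the parent on exit, 0 for none */
    bool clonefd_notify;  /* exit is reported through a clonefd */
    bool needs_wait;      /* someone must wait() to reap the child */
};

/* Each field depends on exactly one part of the flags: that is the point. */
struct exit_behavior describe_exit(uint64_t flags)
{
    struct exit_behavior b;

    b.exit_signal    = (int)(flags & EXIT_SIGNAL_MASK);
    b.clonefd_notify = (flags & DEMO_CLONE_FD) != 0;
    b.needs_wait     = (flags & DEMO_CLONE_AUTOREAP) == 0;
    return b;
}
```

For example, describe_exit(DEMO_CLONE_FD | SIGCHLD) reports both a signal and a clonefd notification and still requires a wait, while describe_exit(DEMO_CLONE_FD | DEMO_CLONE_AUTOREAP) reports only the clonefd notification and no wait.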


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Thiago Macieira
On Monday 16 March 2015 15:36:16 Kees Cook wrote:
> And just so I understand the races here, what happens in CLONE_FD
> (without CLONE_AUTOREAP) case where the child dies, but the parent
> never reads from the CLONE_FD fd, and closes it (or dies)? Will the
> modes switch that late in the child's lifetime? (i.e. even though the
> details were written to the fd, they were never read, yet it'll still
> switch and generate a SIGCHLD, etc?)

What happens to a child that dies during the parent's lifetime but the parent 
exits without reaping the child?

The same should happen, whatever that behaviour is.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Kees Cook
On Mon, Mar 16, 2015 at 3:14 PM, Thiago Macieira wrote:
> On Monday 16 March 2015 14:44:20 Kees Cook wrote:
>> >   O_CLOEXEC
>> >      Set the close-on-exec flag on the new file descriptor. See the
>> >      description of the O_CLOEXEC flag in open(2) for reasons why
>> >      this may be useful.
>>
>> This begs the question: what happens when all CLONE_FD fds for a
>> process are closed? Will the parent get SIGCHLD instead, will it
>> auto-reap, or will it be un-wait-able (I assume not this...)
>
> Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can
> wait() on it and the process autoreaps itself.
>
> If it's not active, then the old rules apply: parent gets SIGCHLD and can
> wait(). If the parent exited first, then the child gets reparented to init,
> which can do the wait().
>
> A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed
> before the clonefd is read, the clonefd() will return a 0 read. If it gets
> read before wait, then wait() reaps another child or returns -ECHILD. That's
> no different than two threads doing simultaneous wait() on the same child.

Cool. I think detailing this in the manpage would be helpful.

And just so I understand the races here, what happens in CLONE_FD
(without CLONE_AUTOREAP) case where the child dies, but the parent
never reads from the CLONE_FD fd, and closes it (or dies)? Will the
modes switch that late in the child's lifetime? (i.e. even though the
details were written to the fd, they were never read, yet it'll still
switch and generate a SIGCHLD, etc?)

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Thiago Macieira
On Monday 16 March 2015 14:44:20 Kees Cook wrote:
> >   O_CLOEXEC
> >      Set the close-on-exec flag on the new file descriptor. See the
> >      description of the O_CLOEXEC flag in open(2) for reasons why
> >      this may be useful.
> 
> This begs the question: what happens when all CLONE_FD fds for a
> process are closed? Will the parent get SIGCHLD instead, will it
> auto-reap, or will it be un-wait-able (I assume not this...)

Depends on CLONE_AUTOREAP. If it's on, then no one gets SIGCHLD, no one can 
wait() on it and the process autoreaps itself.

If it's not active, then the old rules apply: parent gets SIGCHLD and can 
wait(). If the parent exited first, then the child gets reparented to init, 
which can do the wait().

A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed 
before the clonefd is read, the clonefd() will return a 0 read. If it gets 
read before wait, then wait() reaps another child or returns -ECHILD. That's 
no different than two threads doing simultaneous wait() on the same child.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Kees Cook
On Sun, Mar 15, 2015 at 12:59 AM, Josh Triplett wrote:
> This patch series introduces a new clone flag, CLONE_FD, which lets the caller
> receive child process exit notification via a file descriptor rather than
> SIGCHLD.  CLONE_FD makes it possible for libraries to safely launch and manage
> child processes on behalf of their caller, *without* taking over process-wide
> SIGCHLD handling (either via signal handler or signalfd).
>
> Note that signalfd for SIGCHLD does not suffice here, because that still
> receives notification for all child processes, and interferes with
> process-wide signal handling.
>
> The CLONE_FD file descriptor uniquely identifies a process on the system in a
> race-free way, by holding a reference to the task_struct.  In the future, we
> may introduce APIs that support using process file descriptors instead of
> PIDs.
>
> This patch series also introduces a clone flag CLONE_AUTOREAP, which causes
> the kernel to automatically reap the child process when it exits, just as it
> does for processes using SIGCHLD when the parent has SIGCHLD ignored or marked
> as SA_NOCLDWAIT.
>
> Taken together, a library can launch a process with CLONE_FD, CLONE_AUTOREAP,
> and no exit signal, and completely avoid affecting either process-wide signal
> handling or an existing child wait loop.
>
> Introducing CLONE_FD and CLONE_AUTOREAP required two additional bits of yak
> shaving: Since clone has no more usable flags (with the three currently unused
> flags unusable because old kernels ignore them without EINVAL), also introduce
> a new clone4 system call with more flag bits and an extensible argument
> structure.  And since the magic pt_regs-based syscall argument processing for
> clone's tls argument would otherwise prevent introducing a sane clone4 system
> call, fix that too.
>
> I tested the CLONE_SETTLS changes with a thread-local storage test program
> (two threads independently reading and writing a __thread variable), on both
> 32-bit and 64-bit, and I observed no issues there.
>
> I tested clone4 and the new flags with several additional test programs,
> launching either a process or thread (in the former case using syscall(), in
> the latter case by calling clone4 via assembly and returning to C), sleeping
> in parent and child to test the case of either exiting first, and then
> printing the received clone4_info structure.
>
> Changes in v2:
> - Split out autoreaping into a separate CLONE_AUTOREAP.  CLONE_FD no longer
>   implies autoreaping and no exit signal, and CLONE_AUTOREAP does not affect
>   ptracers or signal handling.  Thanks to Oleg Nesterov for careful
>   investigation and discussion on v1.
> - Accept O_CLOEXEC and O_NONBLOCK via a clonefd_flags parameter in
>   clone4_args.  Stop overloading the low byte of the main clone flags, since
>   CLONE_FD now works with a non-zero signal.
> - Return the file descriptor via an out parameter in clone4_args.
> - Drop patch to export alloc_fd; CLONE_FD now uses the next available file
>   descriptor, even if that's 0-2, since clone4 no longer needs to avoid
>   ambiguity with the 0 return indicating the child process.
> - Make poll on a CLONE_FD for an exited task also return POLLHUP, for
>   compatibility with FreeBSD's pdfork.  Thanks to David Drysdale for calling
>   attention to pdfork.

I think POLLHUP should be mentioned in the manpage (now it only
mentions POLLIN).

> - Fix typo in squelch_clone_flags.
> - Pass arguments to _do_fork and copy_process as a structure.
> - Construct the 64-bit flags in a separate variable, rather than inline in the
>   call to do_fork.
> - Fix error return for copy_from_user faults.
> - Add the new syscall to asm-generic.
> - Add ack from Andy Lutomirski to patches 1 and 2.
>
> I've included the manpages patch at the end of this series.  (Note that the
> manpage documents the behavior of the future glibc wrapper as well as the raw
> syscall.)  Here's a formatted plain-text version of the manpage for reference:
>
> CLONE4(2)              Linux Programmer's Manual              CLONE4(2)
>
> NAME
>        clone4 - create a child process
>
> SYNOPSIS
>        /* Prototype for the glibc wrapper function */
>
>        #define _GNU_SOURCE
>        #include <sched.h>
>
>        int clone4(uint64_t flags,
>                   size_t args_size,
>                   struct clone4_args *args,
>                   int (*fn)(void *), void *arg);
>
>        /* Prototype for the raw system call */
>
>        int clone4(unsigned flags_high, unsigned flags_low,
>                   unsigned long args_size,
>                   struct clone4_args *args);
>
>        struct clone4_args {
>            pid_t *ptid;
>            pid_t *ctid;
>            unsigned long stack_start;
>            unsigned long stack_size;
>            unsigned long tls;
>            int *clonefd;
>            unsigned clonefd_flags;
>        };
>
> DESCRIPTION
>        clone4() creates a new process, similar to clone(2) and fork(2).
>        clone4() supports additional flags that clone(2) does not,
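
The two prototypes in the quoted synopsis differ in how the flags are passed: the wrapper takes one 64-bit value, the raw call takes two 32-bit halves. A minimal sketch of that split follows, assuming flags_high is simply the upper 32 bits and flags_low the lower 32 bits. The structure is copied from the synopsis; struct clone4_flag_halves and clone4_split_flags are hypothetical names introduced for the example.

```c
#include <stdint.h>
#include <sys/types.h>

/* Argument structure as given in the quoted SYNOPSIS. */
struct clone4_args {
    pid_t *ptid;
    pid_t *ctid;
    unsigned long stack_start;
    unsigned long stack_size;
    unsigned long tls;
    int *clonefd;
    unsigned clonefd_flags;
};

/* Hypothetical helper type: the two halves the raw call takes. */
struct clone4_flag_halves {
    unsigned flags_high;
    unsigned flags_low;
};

/* Assumption: high half = bits 63..32, low half = bits 31..0. */
struct clone4_flag_halves clone4_split_flags(uint64_t flags)
{
    struct clone4_flag_halves h;

    h.flags_high = (unsigned)(flags >> 32);
    h.flags_low  = (unsigned)(flags & 0xffffffffu);
    return h;
}
```

A caller of the raw system call would also pass sizeof(struct clone4_args) as args_size, which is what lets the structure grow in later versions.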

Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread Thiago Macieira
On Monday 16 March 2015 16:29:49 j...@joshtriplett.org wrote:
> > A child without CLONE_AUTOREAP should be wait()able. If it gets wait()ed
> > before the clonefd is read, the clonefd() will return a 0 read. If it
> > gets read before wait, then wait() reaps another child or returns -ECHILD.
> > That's no different than two threads doing simultaneous wait() on the
> > same child.
>
> Hrm?  That isn't the semantics we implemented; you'll *always* get an
> exit notification via the clonefd if you have it open, with or without
> autoreap and whether or not a wait has occurred yet.  And reading from
> the clonefd does not serve as a wait; if you don't pass CLONE_AUTOREAP,
> you'll still need to wait on the process.

Ah, I see what you're saying. Ok, I stand corrected: a child without 
CLONE_AUTOREAP must be wait()ed on and whoever waits on it will get 
information. In addition to that, the information is available on the clonefd 
and it can happen at any time, before or after the wait().

In the case of an orphaned child, the file descriptor will close, that's all. 
No modification is necessary to init.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-16 Thread josh
On Mon, Mar 16, 2015 at 02:44:20PM -0700, Kees Cook wrote:
> On Sun, Mar 15, 2015 at 12:59 AM, Josh Triplett <j...@joshtriplett.org> wrote:
>> - Make poll on a CLONE_FD for an exited task also return POLLHUP, for
>>   compatibility with FreeBSD's pdfork.  Thanks to David Drysdale for calling
>>   attention to pdfork.
>
> I think POLLHUP should be mentioned in the manpage (now it only
> mentions POLLIN).

Added for v3.
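
For readers following along, here is a minimal sketch of what a caller
checking for both POLLIN and POLLHUP might look like.  Since clone4() and
CLONE_FD exist only in this patch series, the sketch uses an ordinary pipe
as a stand-in descriptor (an assumption made purely so the code runs on a
stock system); poll_exit_events() and demo_pipe_hangup() are hypothetical
helper names, not part of the patches.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll one descriptor for readability or hangup.
 * Returns the revents bits (nonzero) when the descriptor is ready,
 * 0 on timeout, and -1 if poll() itself fails.
 */
int poll_exit_events(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return pfd.revents;   /* POLLHUP is reported even though not requested */
}

/*
 * Stand-in demo: closing the write end of a pipe plays the role of the
 * task exiting.  Returns 1 if the read end was idle before the close and
 * reported POLLIN and/or POLLHUP afterwards, 0 otherwise, -1 on error.
 */
int demo_pipe_hangup(void)
{
    int p[2];
    int before, after;

    if (pipe(p) != 0)
        return -1;

    before = poll_exit_events(p[0], 0);    /* writer still open: not ready */
    close(p[1]);                           /* stand-in for "task exits" */
    after = poll_exit_events(p[0], 1000);
    close(p[0]);

    return before == 0 && after > 0 && (after & (POLLIN | POLLHUP)) != 0;
}
```

A caller that wants to work with either notification style would test
revents against (POLLIN | POLLHUP) rather than POLLIN alone.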

>> CLONE_FD
>>        Return a file descriptor associated with the new process, storing
>>        it in location clonefd in the parent's address space.  When the
>>        new process exits, the file descriptor will become available for
>>        reading.
>>
>>        Unlike using signalfd(2) for the SIGCHLD signal, the file
>>        descriptor returned by clone4() with the CLONE_FD flag works even
>>        with SIGCHLD unblocked in one or more threads of the parent
>>        process, allowing the process to have different handlers for
>>        different child processes, such as those created by a library,
>>        without introducing race conditions around process-wide signal
>>        handling.
>>
>>        clonefd_flags may contain the following additional flags for use
>>        with CLONE_FD:
>>
>>        O_CLOEXEC
>>               Set the close-on-exec flag on the new file descriptor.
>>               See the description of the O_CLOEXEC flag in open(2) for
>>               reasons why this may be useful.
>
> This begs the question: what happens when all CLONE_FD fds for a
> process are closed? Will the parent get SIGCHLD instead, will it
> auto-reap, or will it be un-wait-able (I assume not this...)

Whether the parent gets SIGCHLD is determined only by what signal you
request in clone; if you clone with CLONE_FD | SIGCHLD (or
CLONE_AUTOREAP | CLONE_FD | SIGCHLD), you'll get notification via both
clonefd (if you have one) and signal (if you have a handler).  If you
pass a 0 signal (just CLONE_FD or CLONE_AUTOREAP | CLONE_FD), you'll
receive no signal, only the notification via clonefd.  Independently, if
you have CLONE_AUTOREAP set, the process will autoreap.

Those are all orthogonal now.
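
To make the independence of the three knobs concrete, here is a small
model of the rules described above.  It is only an illustration, not
kernel code: the EX_* bit values, struct exit_notify, and describe_exit()
are all hypothetical names invented for this sketch, and the real flag
values come from the patch series.

```c
#include <signal.h>    /* SIGCHLD, for callers of describe_exit() */
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define EX_CLONE_FD       0x1u
#define EX_CLONE_AUTOREAP 0x2u

/* What the parent would observe when the child exits. */
struct exit_notify {
    bool via_clonefd;   /* clonefd becomes readable */
    bool via_signal;    /* the requested exit signal is sent */
    bool autoreaped;    /* child is reaped without a wait call */
};

/*
 * Each field depends on exactly one input, which is the sense in which
 * the three mechanisms are orthogonal.
 */
struct exit_notify describe_exit(unsigned int flags, int exit_signal)
{
    struct exit_notify n;

    n.via_clonefd = (flags & EX_CLONE_FD) != 0;
    n.via_signal  = exit_signal != 0;
    n.autoreaped  = (flags & EX_CLONE_AUTOREAP) != 0;
    return n;
}
```

For example, describe_exit(EX_CLONE_FD, SIGCHLD) reports both the clonefd
and the signal with no autoreap, while
describe_exit(EX_CLONE_FD | EX_CLONE_AUTOREAP, 0) reports only the clonefd
plus autoreap.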

If you close the clonefd, nothing special happens other than a
put_task_struct.  While this is conceptually somewhat like a pipe, the
data is actually generated at read time, so the task exit doesn't care
whether there's a live clonefd or not.  (Or, in the future, if there are
multiple live clonefds for the same process.)

> Looks promising!

Thanks!

And thanks for catching the manpage issue.  I'd definitely welcome any
comments you have on the implementation as well.

- Josh Triplett


Re: [PATCH v2 0/7] CLONE_FD: Task exit notification via file descriptor

2015-03-15 Thread Josh Triplett
On Sun, Mar 15, 2015 at 12:59:17AM -0700, Josh Triplett wrote:
> This patch series also introduces a clone flag CLONE_AUTOREAP, which causes 
> the
> kernel to automatically reap the child process when it exits, just as it does
> for processes using SIGCHLD when the parent has SIGCHLD ignored or marked as
> SA_NOCLDSTOP.

Typo: s/SA_NOCLDSTOP/SA_NOCLDWAIT/.
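
For reference, the existing SA_NOCLDWAIT behaviour that CLONE_AUTOREAP is
compared to can be observed with a short test like the one below.  This
is a sketch using only standard sigaction(2), fork(2), and waitpid(2); the
function name child_is_autoreaped() is made up for the example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child with SA_NOCLDWAIT set on SIGCHLD and try to wait for it.
 * Returns 1 if the child was reaped automatically (waitpid fails with
 * ECHILD), 0 if waitpid collected it normally, -1 on any other error.
 */
int child_is_autoreaped(void)
{
    struct sigaction sa, old;
    pid_t pid, w;
    int status, saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = SA_NOCLDWAIT;     /* do not turn children into zombies */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);                   /* child: exit immediately */

    /* Blocks until the child is gone, then fails because it was reaped. */
    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);
    saved_errno = errno;

    sigaction(SIGCHLD, &old, NULL); /* restore the previous disposition */

    if (w == pid)
        return 0;
    return (w < 0 && saved_errno == ECHILD) ? 1 : -1;
}
```

SA_NOCLDSTOP, by contrast, only suppresses SIGCHLD for stopped or
continued children and has no effect on reaping, which is why the
correction matters.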

- Josh Triplett

