Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 7:08 PM Christian Brauner  wrote:
> As a sidenote I'm switching the name from procfd_send_signal() to
> taskfd_send_signal(). It seems to me the best way to handle Eric's
> request to reflect that we can eventually both signal tgids and tids.

Cool; sounds fine to me.

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 03:24:08PM -0800, Kees Cook wrote:
> On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  
> wrote:
> > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > > Christian Brauner  writes:
> > > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > > [2]:  
> > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > > [3]:  
> > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > > [4]:  
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [5]:  
> > > > https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > > [6]:  
> > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > > [7]:  
> > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > > [10]: 
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [11]: 
> > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/
> 
> I nominate this for 2018's most-well-documented syscall commit log award. ;)

Hahaha. If I win can I get my price in beer(s)? :)

> 
> > > > +   /*
> > > > +* Give userspace a way to detect whether /proc//task/ fds
> > > > +* are supported.
> > > > +*/
> > > > +   ret = -EOPNOTSUPP;
> > > > +   if (proc_is_tid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> >
> > This is done so that userspace has a way of figuring out that tid fds
> > are not yet supported. This has been discussed with Florian (see commit
> > message).
> 
> Right, we should keep this -EOPNOTSUPP.
> 
> > > > +   /* Is this a procfd? */
> > > > +   ret = -EINVAL;
> > > > +   if (!proc_is_tgid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> 
> Yeah, EINVAL tends to be used for bad flags... this is more about an
> improper fd.
> 
> > >
> > > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > > +   pid = proc_pid(file_inode(f.file));
> > > > +   if (!pid)
> > > > +   goto err;
> > >
> > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > > call.  That way proc_pid can stay private to proc.
> >
> > Hm, I guess we can do that for now. My intention was to have reuseable
> > helpers but I guess it would be fine for now.
> >
> > >
> > > > +   if (!may_signal_procfd(pid))
> > > > +   goto err;
> > > > +
> 
> Does the ns parent checking in may_signal_procfd need any locking or
> RCU? I know pid and current namespaces are "pinned", but I don't know
> how parent ns works here. I'm assuming the parents are stuck until all
> children go away?

Yeah, since they are hierarchical killing an ancestor means killing the
children. Also, in case you're interested, there's precedent for that:
kernel/pid_namespace.c:static struct ns_common *pidns_get_parent(struct 
ns_common *ns)
I'm not using this function because a) I would have to special case the
initial test-case and b) it takes a get() on the pid ns which would
force us to use another put which is unnecessary.

> 
> > > > +   ret = kill_pid_info(sig, &kinfo, pid);
> 
> Just double-checking for myself: this does not bypass
> security_task_kill(), so no problem there AFAIK.
> 
> Reviewed-by: Kees Cook 

Thanks! :)
As a sidenote I'm switching the name from procfd_send_signal() to
taskfd_send_signal(). It seems to me the best way to handle Eric's
request to reflect that we can eventually both signal tgids and tids.

> 
> -- 
> Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  wrote:
> On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > Christian Brauner  writes:
> > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > [2]:  
> > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > [3]:  
> > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > [4]:  
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [5]:  https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > [6]:  
> > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > [7]:  
> > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > [10]: 
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [11]: 
> > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/

I nominate this for 2018's most-well-documented syscall commit log award. ;)

> > > +   /*
> > > +* Give userspace a way to detect whether /proc//task/ fds
> > > +* are supported.
> > > +*/
> > > +   ret = -EOPNOTSUPP;
> > > +   if (proc_is_tid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.
>
> This is done so that userspace has a way of figuring out that tid fds
> are not yet supported. This has been discussed with Florian (see commit
> message).

Right, we should keep this -EOPNOTSUPP.

> > > +   /* Is this a procfd? */
> > > +   ret = -EINVAL;
> > > +   if (!proc_is_tgid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.

Yeah, EINVAL tends to be used for bad flags... this is more about an
improper fd.

> >
> > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > +   pid = proc_pid(file_inode(f.file));
> > > +   if (!pid)
> > > +   goto err;
> >
> > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > call.  That way proc_pid can stay private to proc.
>
> Hm, I guess we can do that for now. My intention was to have reuseable
> helpers but I guess it would be fine for now.
>
> >
> > > +   if (!may_signal_procfd(pid))
> > > +   goto err;
> > > +

Does the ns parent checking in may_signal_procfd need any locking or
RCU? I know pid and current namespaces are "pinned", but I don't know
how parent ns works here. I'm assuming the parents are stuck until all
children go away?

> > > +   ret = kill_pid_info(sig, &kinfo, pid);

Just double-checking for myself: this does not bypass
security_task_kill(), so no problem there AFAIK.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> Christian Brauner  writes:
> 
> > The kill() syscall operates on process identifiers (pid). After a process
> > has exited its pid can be reused by another process. If a caller sends a
> > signal to a reused pid it will end up signaling the wrong process. This
> > issue has often surfaced and there has been a push [1] to address this
> > problem.
> >
> > This patch uses file descriptors (fd) from proc/ as stable handles on
> > struct pid. Even if a pid is recycled the handle will not change. The fd
> > can be used to send signals to the process it refers to.
> > Thus, the new syscall procfd_send_signal() is introduced to solve this
> > problem. Instead of pids it operates on process fds (procfd).
> >
> > /* prototype and argument /*
> > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int 
> > flags);
> >
> > In addition to the procfd and signal argument it takes an additional
> > siginfo_t and flags argument. If the siginfo_t argument is NULL then
> > procfd_send_signal() behaves like kill(). If it is not NULL
> > procfd_send_signal() behaves like rt_sigqueueinfo().
> > The flags argument is added to allow for future extensions of this syscall.
> > It currently needs to be passed as 0. Failing to do so will cause EINVAL.
> >
> > /* procfd_send_signal() replaces multiple pid-based syscalls */
> > The procfd_send_signal() syscall currently takes on the job of the
> > following syscalls that operate on pids:
> > - kill(2)
> > - rt_sigqueueinfo(2)
> > The syscall is defined in such a way that it can also operate on thread fds
> > instead of process fds. In a future patchset I will extend it to operate on
> > procfds from /proc//task/ at which point it will additionally
> > take on the job of:
> > - tgkill(2)
> > - rt_tgsigqueueinfo(2)
> > Right now this is intentionally left out to keep this patchset as simple as
> > possible (cf. [4]). If a procfd of /proc//task/ is passed
> > EOPNOTSUPP will be returned to give userspace a way to detect when I add
> > support for such procfds {cf. [10]).
> >
> > /* naming */
> > The name procfd_send_signal() was chosen to reflect the fact that it takes
> > on the job of multiple syscalls. It is intentional that the name is not
> > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> > because it might imply that procfd_send_signal() is only a replacement for
> > kill(2) and not the latter because it is a hazzle to remember the correct
> > spelling (especially for non-native speakers) and because it is not
> > descriptive enough of what the syscall actually does. The name
> > "procfd_send_signal" makes it very clear that its job is to send signals.
> >
> > /* O_PATH file descriptors */
> > procfds opened as O_PATH fds cannot be used to send signals to a process
> > (cf. [2]). Signaling processes through procfds is the equivalent of writing
> > to a file. Thus, this is not an operation that operates "purely at the
> > file descriptor level" as required by the open(2) manpage.
> >
> > /* zombies */
> > Zombies can be signaled just as any other process. No special error will be
> > reported since a zombie state is an unreliable state (cf. [3]).
> >
> > /* cross-namespace signals */
> > The patch currently enforces that the signaler and signalee either are in
> > the same pid namespace or that the signaler's pid namespace is an ancestor
> > of the signalee's pid namespace. This is done for the sake of simplicity
> > and because it is unclear to what values certain members of struct
> > siginfo_t would need to be set to (cf. [5], [6]).
> >
> > /* compat syscalls */
> > It became clear that we would like to avoid adding compat syscalls (cf.
> > [7]). The compat syscall handling is now done in kernel/signal.c itself by
> > adding __copy_siginfo_from_user_generic() which lets us avoid compat
> > syscalls (cf. [8]).
> > With upcoming rework for syscall handling things might improve even further
> > (cf. [11]).
> > This patch was tested on x64 and x86.
> >
> > /* userspace usage */
> > An asciinema recording for the basic functionality can be found under [9].
> > With this patch a process can be killed via:
> >
> >  #define _GNU_SOURCE
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >
> >  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t 
> > *info,
> >  unsigned int flags)
> >  {
> >  #ifdef __NR_procfd_send_signal
> >  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
> >  #else
> >  return -ENOSYS;
> >  #endif
> >  }
> >
> >  int main(int argc, char *argv[])
> >  {
> >  int fd, ret, saved_errno, sig;
> >
> >  if (argc < 3)
> >  exit(EXIT_FAILURE);
> >
> >  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
> >  if (fd < 0) {
> >  

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread kbuild test robot
Hi Christian,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc5]
[cannot apply to next-20181205]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Christian-Brauner/signal-add-procfd_send_signal-syscall/20181205-174512
config: sh-titan_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sh 

All warnings (new ones prefixed by >>):

   :1317:2: warning: #warning syscall pkey_mprotect not implemented 
[-Wcpp]
   :1320:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp]
   :1323:2: warning: #warning syscall pkey_free not implemented [-Wcpp]
   :1326:2: warning: #warning syscall statx not implemented [-Wcpp]
   :1332:2: warning: #warning syscall io_pgetevents not implemented 
[-Wcpp]
   :1335:2: warning: #warning syscall rseq not implemented [-Wcpp]
>> :1338:2: warning: #warning syscall procfd_send_signal not implemented 
>> [-Wcpp]

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push [1] to address this
problem.

This patch uses file descriptors (fd) from proc/ as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall procfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (procfd).

/* prototype and argument /*
long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);

In addition to the procfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
procfd_send_signal() behaves like kill(). If it is not NULL
procfd_send_signal() behaves like rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.

/* procfd_send_signal() replaces multiple pid-based syscalls */
The procfd_send_signal() syscall currently takes on the job of the
following syscalls that operate on pids:
- kill(2)
- rt_sigqueueinfo(2)
The syscall is defined in such a way that it can also operate on thread fds
instead of process fds. In a future patchset I will extend it to operate on
procfds from /proc//task/ at which point it will additionally
take on the job of:
- tgkill(2)
- rt_tgsigqueueinfo(2)
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]). If a procfd of /proc//task/ is passed
EOPNOTSUPP will be returned to give userspace a way to detect when I add
support for such procfds {cf. [10]).

/* naming */
The name procfd_send_signal() was chosen to reflect the fact that it takes
on the job of multiple syscalls. It is intentional that the name is not
reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
because it might imply that procfd_send_signal() is only a replacement for
kill(2) and not the latter because it is a hazzle to remember the correct
spelling (especially for non-native speakers) and because it is not
descriptive enough of what the syscall actually does. The name
"procfd_send_signal" makes it very clear that its job is to send signals.

/* O_PATH file descriptors */
procfds opened as O_PATH fds cannot be used to send signals to a process
(cf. [2]). Signaling processes through procfds is the equivalent of writing
to a file. Thus, this is not an operation that operates "purely at the
file descriptor level" as required by the open(2) manpage.

/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]).

/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).

/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls (cf.
[7]). The compat syscall handling is now done in kernel/signal.c itself by
adding __copy_siginfo_from_user_generic() which lets us avoid compat
syscalls (cf. [8]).
With upcoming rework for syscall handling things might improve even further
(cf. [11]).
This patch was tested on x64 and x86.

/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:

 #define _GNU_SOURCE
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 

 static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
 unsigned int flags)
 {
 #ifdef __NR_procfd_send_signal
 return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
 #else
 return -ENOSYS;
 #endif
 }

 int main(int argc, char *argv[])
 {
 int fd, ret, saved_errno, sig;

 if (argc < 3)
 exit(EXIT_FAILURE);

 fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
 if (fd < 0) {
 printf("%s - Failed to open \"%s\"\n", strerror(errno), 
argv[1]);
 exit(EXIT_FAILURE);
 }

 sig = atoi(argv[2]);

 printf("Sending signal %d to process %s\n", sig, argv[1]);
 ret = do_procfd_send_signal(fd, sig, NULL, 0);

 saved_errno = errno;
 close(fd);
 errno = saved_errno;

 if (ret < 0) {
 printf("%s - Failed to send signal %d to process %s\n",
strerror(errno), sig, argv[1]);
 exit(EX